利用维对象来优化数据仓库的高级技巧

发表于:2007-06-13来源:作者:点击数: 标签:
在Oracle的数据仓库(OLAP)中,实体化视图(MVIEW),查询重写(Query Rewrite)和维(Dimension)是非常重要的优化手段,对于前两者我不想在这儿重复讲了,主要来体验一下维的作用。要发挥维的作用,还是需要用到前面两者,下面是我设计的只有一个维表的最简单的例

在Oracle的数据仓库(OLAP)中,实体化视图(MVIEW),查询重写(Query Rewrite)和维(Dimension)是非常重要的优化手段,对于前两者我不想在这儿重复讲了,主要来体验一下维的作用。要发挥维的作用,还是需要用到前面两者,下面是我设计的只有一个维表的最简单的例子。数据库用户除了connect, resource外, 还要给予Query Rewrite,Create Materialized View,Create Dimension权限。

1、创建一个维护表

CREATE TABLE TIME_DIM

AS

SELECT TO_CHAR(SYSDATE+ROWNUM,'YYYY') F_YEAR,

TO_CHAR(SYSDATE+ROWNUM,'YYYY-Q') F_QUATER,

TO_CHAR(SYSDATE+ROWNUM,'YYYY-MM') F_MONTH,

TRUNC(SYSDATE+ROWNUM,'DD') F_DAY

FROM DBA_OBJECTS

WHERE ROWNUM < 1000;

ALTER TABLE TIME_DIM MODIFY F_YEAR NOT NULL;

ALTER TABLE TIME_DIM MODIFY F_QUATER NOT NULL;

ALTER TABLE TIME_DIM MODIFY F_MONTH NOT NULL;

ALTER TABLE TIME_DIM MODIFY F_DAY NOT NULL;

ALTER TABLE TIME_DIM ADD PRIMARY KEY (F_DAY);

2、创建一个事实表

CREATE TABLE FACT_SALES

AS

SELECT TRUNC(SYSDATE+ROWNUM,'DD') F_DAY,

TRUNC(DBMS_RANDOM.VALUE * 10000,2)  M_AMOUNT1,

TRUNC(DBMS_RANDOM.VALUE * 10000,2)  M_AMOUNT2

FROM DBA_OBJECTS

WHERE ROWNUM < 1000;

-- Please execute the following insert multiple times

INSERT INTO FACT_SALES

SELECT F_DAY,

TRUNC(DBMS_RANDOM.VALUE * 10000,2)  M_AMOUNT1,

TRUNC(DBMS_RANDOM.VALUE * 10000,2)  M_AMOUNT2

FROM FACT_SALES

ALTER TABLE FACT_SALES MODIFY F_DAY NOT NULL;

ALTER TABLE FACT_SALES ADD FOREIGN KEY (F_DAY) REFERENCES TIME_DIM(F_DAY);

3、将数据按月份进行汇总生成中间表

CREATE MATERIALIZED VIEW MV_FACT_SALES

ENABLE QUERY REWRITE

AS

SELECT D.F_MONTH,

SUM(F.M_AMOUNT1) M_AMOUNT1,

SUM(F.M_AMOUNT2) M_AMOUNT2

FROM TIME_DIM D, FACT_SALES F

WHERE D.F_DAY = F.F_DAY

GROUP BY D.F_MONTH

/

4、分析表, 并在会话级启用查询重写

ANALYZE TABLE TIME_DIM COMPUTE STATISTICS;

ANALYZE TABLE FACT_SALES  COMPUTE STATISTICS;

ANALYZE TABLE MV_FACT_SALES COMPUTE STATISTICS;

ALTER SESSION SET QUERY_REWRITE_ENABLED=TRUE;

ALTER SESSION SET QUERY_REWRITE_INTEGRITY=TRUSTED;

5、先来看一下按季汇总的SQL语句的执行计划, 理论上, 可以从按月的汇总中进一步汇总生成, 但这时去执行的话, Oracle并不能这样执行

ASQL> SELECT D.F_QUATER,

2        SUM(F.M_AMOUNT1) M_AMMOUNT1,

3        SUM(F.M_AMOUNT2) M_AMMOUNT2

4 FROM TIME_DIM D, FACT_SALES F

5 WHERE D.F_DAY = F.F_DAY

6 GROUP BY D.F_QUATER

7 /

Execute Plan

SQLPLAN                                                  COST   CARD KBYTE PS PE

-------------------------------------------------------- ---- ------ ----- -- --

0     SELECT STATEMENT Optimizer=ALL_ROWS               626     12     1     

1   0   HASH (GROUP BY)                                 626     12     1     

2   1     HASH JOIN                                     609 287712 16858     

3   2       TABLE ACCESS (FULL) OF TIME_DIM (TABLE)       4    999    13     

4   2       TABLE ACCESS (FULL) OF FACT_SALES (TABLE)   602 287712 13206 

6、接下来来创建一个维对象, 用来告诉Oracle在TIME_DIM表的四个字段上存在的树状关系, 如果没有这个声明, Oracle会认为数据是不附合这个树状关系的

CREATE DIMENSION TIME_DIM

LEVEL YEAR IS (TIME_DIM.F_YEAR)

LEVEL QUATER IS (TIME_DIM.F_QUATER)

LEVEL MONTH IS (TIME_DIM.F_MONTH)

LEVEL DAY IS (TIME_DIM.F_DAY)

HIERARCHY Y_Q_M_D

(

DAY CHILD OF

MONTH CHILD OF

QUATER CHILD OF  YEAR

)

HIERARCHY Y_M_D

(

DAY CHILD OF

MONTH CHILD OF YEAR

)

7、建好维后, 重新跑按季度汇总或按年汇总的SQL, 看他们的执行计划有什么不同?

ASQL> SELECT D.F_QUATER,

2        SUM(F.M_AMOUNT1) M_AMMOUNT1,

3        SUM(F.M_AMOUNT2) M_AMMOUNT2

4 FROM TIME_DIM D, FACT_SALES F

5 WHERE D.F_DAY = F.F_DAY

6 GROUP BY D.F_QUATER

7 /

Execute Plan

SQLPLAN                                                     COST CARD KBYTE PS PE

----------------------------------------------------------- ---- ---- ----- -- --

0     SELECT STATEMENT Optimizer=ALL_ROWS                   10   12     1     

1   0   HASH (GROUP BY)                                     10   12     1     

2   1     HASH JOIN                                          9  289    17     

3   2       MAT_VIEW REWRITE ACCESS (FULL) OF MV_FACT_...    3   34     2     

4   2       VIEW OF                                          5  289     4     

5   4         HASH (UNIQUE)                                  5  289     4     

6   5           TABLE ACCESS (FULL) OF TIME_DIM (TABLE)      4  999    13     

ASQL> SELECT /*+ all_rows */ D.F_YEAR,

2        SUM(F.M_AMOUNT1) M_AMMOUNT1,

3        SUM(F.M_AMOUNT2) M_AMMOUNT2

4 FROM TIME_DIM D, FACT_SALES F

5 WHERE D.F_DAY = F.F_DAY

6 GROUP BY D.F_YEAR

7 /

Execute Plan

SQLPLAN                                                      COST CARD KBYTE PS PE

------------------------------------------------------------ ---- ---- ----- -- --

0     SELECT STATEMENT Optimizer=HINT: ALL_ROWS              10    4     0     

1   0   HASH (GROUP BY)                                      10    4     0     

2   1     HASH JOIN                                           9   97     6     

3   2       MAT_VIEW REWRITE ACCESS (FULL) OF MV_FACT_S...    3   34     2     

4   2       VIEW OF                                           5   97     1     

5   4         HASH (UNIQUE)                                   5   97     1     

6   5           TABLE ACCESS (FULL) OF TIME_DIM (TABLE)       4  999    11      

我不太会用语言表解说, 因此设计了这个实例来说明, 如果看不懂, 请多看几次吧!

(责任编辑 火凤凰 sunsj@51cto.com  QQ:34067741  TEL:(010)68476636-8007)



原文转自:http://www.ltesting.net

...