1.1测试数据准备
create table t1(
date_ord date constraint t1_dto_nn not null,
seq_ord number(6) constraint t1_sqo_nn not null,
small_vc varchar2(10)
)
pctfree 90
pctused 10
;
drop sequence t1_seq;
create sequence t1_seq;
create or replace procedure t1_load(i_tag varchar2) as
m_date date;
begin
dbms_output.put_line(
dbms_lock.request(
1,dbms_lock.s_mode, release_on_commit => true
)
);
commit;
for i in 0..25 loop
m_date := trunc(sysdate) + i;
for j in 1..200 loop
insert into t1 values(
m_date,
t1_seq.nextval,
i_tag || j
);
commit;
dbms_lock.sleep(0.01);
end loop;
end loop;
end;
/
begin
dbms_output.put_line(
dbms_lock.request(
1,dbms_lock.x_mode,
release_on_commit=>true
)
);
end;
/
prompt
prompt From five different sessions, execute a call to procedure
prompt t1_load with a different tag for each session, e.g.
prompt execute t1_LOAD('a')
prompt
prompt These will all suspend, waiting to acquire a share lock that
prompt this progam is holding in exclusive mode. When all five sessions
prompt have called the procedure, press return in this screen to allow them
prompt to continue
accept x
commit;
create index t1_i1 on t1(date_ord, seq_ord);
1.2统计信息收集
rem
rem Report how many times five processes have collided on a block
rem The first column is the number of different processes that
rem have insertde into a block, the second is the number of blocks
rem with that many hits.
rem
select ct, count(*)
from
(
select block, count(*) ct
from
(
select
distinct dbms_rowid.rowid_block_number(rowid) block,
substr(small_vc,1,1)
from t1
)
group by block
)
group by ct
;
CT COUNT(*)
---------- ----------
5 738 有738个数据块是5个session一起插入数据的
4 2 有2个数据块是4个session一起插入数据的
3 1 有2个数据块是3个session一起插入数据的
begin
dbms_stats.gather_table_stats(
user,
't1',
cascade => true,
estimate_percent => null,
method_opt => 'for all columns size 1'
);
end;
/
select
blocks,
num_rows
from
user_tables
where
table_name = 'T1';
BLOCKS NUM_ROWS
---------- ----------
744 26000
前面的查询结果来看,真正用于数据存储的使用了738+2+1=741,剩下的3个数据块用于甚么去了呢?待查?
select
index_name, blevel, leaf_blocks, clustering_factor
from
user_indexes
where
table_name = 'T1'
;
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ----------- -----------------
T1_I1 1 86 840
1.3测试
实例1
select
count(small_vc)
from
t1
where
date_ord = trunc(sysdate) + 7
;
Execution Plan
----------------------------------------------------------
Plan hash value: 269862921
----------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 38 |
| 1 | SORT AGGREGATE | | 1 | 13 | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 1000 | 13000 | 38 |
|* 3 | INDEX RANGE SCAN | T1_I1 | 1000 | | 5 |
----------------------------------------------------------------------
cost =
blevel +
ceil(effective index selectivity * leaf_blocks) +
ceil(effective table selectivity * clustering_factor)
In this case, we are after 1 day out of 26—a selectivity of 3.846% or 0.03846—and the two selectivities are identical. Putting these figures into the formula:
cost =
1 +
ceil(0.03846 * 86) +
ceil(0.03846 * 840)
= 1 + 4 + 33 = 38
select
/*+ full(t1) */
rowid, date_ord, seq_ord, small_vc
from
t1
where
rownum <= 10
;
ROWID DATE_ORD SEQ_ORD SMALL_VC
------------------ --------- ---------- ----------
AAAMJHAAJAAAAAKAAA 26-JUL-11 1 A1
AAAMJHAAJAAAAAKAAB 26-JUL-11 2 B1
AAAMJHAAJAAAAAKAAC 26-JUL-11 3 C1
AAAMJHAAJAAAAAKAAD 26-JUL-11 4 A2
AAAMJHAAJAAAAAKAAE 26-JUL-11 5 D1
AAAMJHAAJAAAAAKAAF 26-JUL-11 6 E1
AAAMJHAAJAAAAAKAAG 26-JUL-11 7 B2
AAAMJHAAJAAAAAKAAH 26-JUL-11 8 D2
AAAMJHAAJAAAAAKAAI 26-JUL-11 9 B3
AAAMJHAAJAAAAAKAAJ 26-JUL-11 10 E2
Remember that the extended rowid is made up of the following:
• object_id First six letters (AAAMJH)
• Relative file_id Next three letters (AAJ)
• Block within file Next six letters (AAAAAK)
• Row within block Last three letters (AAA, AAB, AAC ...)
All these rows are in the same block (AAAAAK). In my test run, I populated the column small_vc with a tag that could be used to identify the process that inserted the row. All five of our processes were busy hitting the same table block at the same time. In a very busy system (in particular, one with a high degree of concurrency), we might have seen lots of buffer busy waits for blocks of class data block for whichever one block was the current focus of all the inserts.
How do we address this issue? Simple: we read the advice in the Oracle Performance Tuning Guide and Reference and (for older versions of Oracle particularly) realize that we should have created the table with multiple freelists. In this case, because we expect the typical degree of concurrency to be 5, we might go to exactly that limit, and create the table with the extra clause:
storage (freelists 5)
二、Reducing Table Contention (Multiple Freelists)
把1.1测试数据准备部分的表结构定义改成如下定义。
create table t1(
date_ord date constraint t1_dto_nn not null,
seq_ord number(6) constraint t1_sqo_nn not null,
small_vc varchar2(10)
)
pctfree 90
pctused 10
storage (freelists 5)
;
CT COUNT(*)
---------- ----------
1 298
3 445
BLOCKS NUM_ROWS
---------- ----------
754 26000
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ----------- -----------------
T1_I1 1 86 16330
Execution Plan
----------------------------------------------------------
Plan hash value: 3724264953
-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 13 | 116 |
| 1 | SORT AGGREGATE | | 1 | 13 | |
|* 2 | TABLE ACCESS FULL| T1 | 1000 | 13000 | 116 |
-----------------------------------------------------------
全表查询cost=754/6.41=118
如果走index,按照上面的公式
cost =
1 +
ceil(0.03846 * 86) +
ceil(0.03846 * 16330)
= 1 + 4 + 629 = 634
所以Oracle选择使用全表扫描。
三、Reducing Leaf Block Contention (Reverse Key Indexes)
利用1.1测试数据准备部分创建数据,之后使用下面的index定义创建reverce index。
alter index t1_i1 rebuild reverse;
A reverse key index operates by reversing the byte order of each column of the index before inserting the resulting value into the index structure. The effect of this is to turn sequential values into index entries that are randomly scattered.
四、Column Order
This is just one important consideration when deciding the column order of an index. Another is the possibility for improving the compressibility of an index by putting the least selective (most repetitive) columns first. Another is the option for arranging the columns so that some very popular queries can perform an order by without doing a sort (an execution mechanism that typically appears as sort (order by) nosort in the execution plan).
Whatever your reason for deciding to change the order of columns in an index, remember that it might change the clustering_factor. The knock-on effect of this might be that the calculated cost of the index for a range scan becomes so high that Oracle ignores the index.
4.1测试数据准备
create table t1
pctfree 90
pctused 10
as
select
trunc((rownum-1)/ 100) clustered,
mod(rownum - 1, 100) scattered,
lpad(rownum,10) small_vc
from
all_objects
where
rownum <= 10000
;
create index t1_i1_good on t1(clustered, scattered);
create index t1_i2_bad on t1(scattered, clustered);
4.2统计信息收集
begin
dbms_stats.gather_table_stats(
user,
't1',
cascade => true,
estimate_percent => null,
method_opt => 'for all columns size 1'
);
end;
/
select
blocks,
num_rows
from
user_tables
where
table_name = 'T1'
;
BLOCKS NUM_ROWS
---------- ----------
292 10000
select
index_name, blevel, leaf_blocks, clustering_factor
from
user_indexes
where
table_name = 'T1'
;
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ----------- -----------------
T1_I1_GOOD 1 24 278
T1_I2_BAD 1 24 10000
4.3测试
select
/*+ index(t1 t1_i1_good) */
count(small_vc)
from
t1
where
scattered = 50
and clustered between 1 and 5
;
Execution Plan
----------------------------------------------------------
Plan hash value: 2833503776
---------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
---------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 17 | 4 |
| 1 | SORT AGGREGATE | | 1 | 17 | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 6 | 102 | 4 |
|* 3 | INDEX RANGE SCAN | T1_I1_GOOD | 6 | | 3 |
---------------------------------------------------------------------------
select
/*+ index(t1 t1_i2_bad) */
count(small_vc)
from
t1
where
scattered =50
and clustered between 1 and 5
;
Execution Plan
----------------------------------------------------------
Plan hash value: 2223336776
--------------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
--------------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 17 | 9 |
| 1 | SORT AGGREGATE | | 1 | 17 | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 6 | 102 | 9 |
|* 3 | INDEX RANGE SCAN | T1_I2_BAD | 6 | | 2 |
--------------------------------------------------------------------------
Irrespective of which index we use in this example, we will visit exactly the same number of table blocks—but the order in which we visit them will be different, and this has been enough to make a big difference to the optimizer’s calculations.
For completeness, let’s just run our statistics through the formula.
Selectivity of scattered = 50: 1 / 100 = 0.01
Selectivity of clustered between 1 and 5: (5 – 1) / ( 99 – 0 ) + 2/100 = 0.060404
Combined selectivity: 0.01 * 0.060404 = 0.00060404
cost (t1_i1_good) =
1 +
ceil(0.060404 * 24) + -- range on first column, invalidates second column
ceil(0.00060404 * 278) -- 2nd column can be used before visiting the table
= 1 + 2 + 1 = 4
cost (t1_i2_bad) =
1 +
ceil(0.00060404 * 24) + -- can use both columns for start/stop keys
ceil(0.00060404 * 10000) -- but the clustering_factor is overwhelming
= 1 + 1 + 7 = 9
五、Extra Columns
It’s not just a change in column order that could introduce a problem. It’s a fairly common (and often effective) practice to add a column or two to an existing index. By now I’m sure you won’t be surprised to discover that this, too, can make a dramatic difference to the clustering_factor, hence to the desirability of the index.Imagine a system that includes a table for tracking product movements. It has a fairly obvious index on the movement_date, but after some time, it might become apparent to the DBA that a number of commonly used queries would benefit from the addition of the product_id to this index (see script extra_col.sql in the online code suite).
5.1测试数据准备
create table t1
as
select
sysdate + trunc((rownum-1) / 500) movement_date,
trunc(dbms_random.value(1,60.999)) product_id,
trunc(dbms_random.value(1,10.000)) qty,
lpad(rownum,10) small_vc,
rpad('x',100) padding
from
all_objects
where
rownum <= 10000
;
5.2测试
5.2.1在movement_date上创建index进行测试
create index t1_i1 on t1(movement_date);
begin
dbms_stats.gather_table_stats(
user,
't1',
cascade => true,
estimate_percent => null,
method_opt => 'for all columns size 1'
);
end;
/
select
blocks,
num_rows
from
user_tables
where
table_name = 'T1';
BLOCKS NUM_ROWS
---------- ----------
194 10000
select
index_name, blevel, leaf_blocks, clustering_factor
from
user_indexes
where
table_name = 'T1'
;
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ----------- -----------------
T1_I1 1 27 184
测试
select
sum(qty)
from
t1
where
movement_date = trunc(sysdate) + 7
and product_id = 44
;
Execution Plan
----------------------------------------------------------
Plan hash value: 269862921
----------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 14 | 12 |
| 1 | SORT AGGREGATE | | 1 | 14 | |
|* 2 | TABLE ACCESS BY INDEX ROWID| T1 | 8 | 112 | 12 |
|* 3 | INDEX RANGE SCAN | T1_I1 | 500 | | 2 |
----------------------------------------------------------------------
select
product_id, max(small_vc)
from
t1
where
movement_date = trunc(sysdate) + 7
group by
product_id
;
Execution Plan
----------------------------------------------------------
Plan hash value: 2629508439
----------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 60 | 1320 | 40 |
| 1 | HASH GROUP BY | | 60 | 1320 | 40 |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 500 | 11000 | 12 |
|* 3 | INDEX RANGE SCAN | T1_I1 | 500 | | 2 |
----------------------------------------------------------------------
The first query is an example of the type of query that encouraged us to add the extra column to the index. The second query is an example of a query that will suffer as a conse-quence of the change. In both cases, Oracle will be visiting the same little clump of about ten blocks in the table—but the extra column changes the order in which the rows are visited (which is what the clustering_factor is about), so the cost changes, and in the second case the execution plan changes for the worse.
5.2.2在movement_date上创建index进行测试
drop index t1_i1;
create index t1_i1 on t1(movement_date, product_id);
begin
dbms_stats.gather_table_stats(
user,
't1',
cascade => true,
estimate_percent => null,
method_opt => 'for all columns size 1'
);
end;
/
select
index_name, blevel, leaf_blocks, clustering_factor
from
user_indexes
where
table_name = 'T1'
;
INDEX_NAME BLEVEL LEAF_BLOCKS CLUSTERING_FACTOR
-------------------- ---------- ----------- -----------------
T1_I1 1 31 6645
select
table_name,
blocks,
num_rows
from user_tables
where table_name = 'T1'
;
TABLE_NAME BLOCKS NUM_ROWS
-------------------- ---------- ----------
T1 194 10000
select
sum(qty)
from
t1
where
movement_date = trunc(sysdate) + 7
and product_id = 44
;
Execution Plan
----------------------------------------------------------
Plan hash value: 269862921
----------------------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
----------------------------------------------------------------------
| 0 | SELECT STATEMENT | | 1 | 14 | 7 |
| 1 | SORT AGGREGATE | | 1 | 14 | |
| 2 | TABLE ACCESS BY INDEX ROWID| T1 | 8 | 112 | 7 |
|* 3 | INDEX RANGE SCAN | T1_I1 | 8 | | 1 | ----------------------------------------------------------------------
select
product_id, max(small_vc)
from
t1
where
movement_date = trunc(sysdate) + 7
group by
product_id
;
Execution Plan
----------------------------------------------------------
Plan hash value: 136660032
-----------------------------------------------------------
| Id | Operation | Name | Rows | Bytes | Cost |
-----------------------------------------------------------
| 0 | SELECT STATEMENT | | 60 | 1320 | 59 |
| 1 | HASH GROUP BY | | 60 | 1320 | 59 |
|* 2 | TABLE ACCESS FULL| T1 | 500 | 11000 | 31 |
-----------------------------------------------------------
全表扫描cost = 194/6.41=31
index扫描cost = 1+0.05*31+0.05*6645=336
所以选择全表扫描。
没有评论:
发表评论