Wednesday, November 28, 2012

Table function

I have been thinking about using table functions to parse strings and use the results in joins to other tables. Table functions can be used to make PL/SQL collections behave like tables.

How efficient are they and what effect can they have on a plan?

I am using 11.2.0.3 for these tests.

The function is created as follows:
CREATE OR REPLACE TYPE t_string_table IS TABLE OF VARCHAR2(32000);
/
show errors

 CREATE OR REPLACE FUNCTION string_token(pi_input IN VARCHAR2,  
                     pi_del  IN VARCHAR2 DEFAULT ',')  
 RETURN t_string_table   
 PIPELINED  
 IS  
   l_idx  PLS_INTEGER;  
   l_list VARCHAR2(32767) := pi_input;  
   l_value VARCHAR2(32767);  
 BEGIN   
   LOOP  
     l_idx := INSTR(l_list,pi_del);  
     IF l_idx > 0 THEN  
       pipe ROW(LTRIM(RTRIM(SUBSTR(l_list,1,l_idx-1))));  
       l_list := SUBSTR(l_list,l_idx+LENGTH(pi_del));  
     ELSE  
       pipe ROW(LTRIM(RTRIM(l_list)));  
       EXIT;  
     END IF;  
   END LOOP;  
   RETURN;  
 END string_token;
 /
 show errors
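
Before looking at plans, a quick sanity check of the function is useful. This is a minimal sketch; the literal strings are illustrative values only and are not part of the tests in this post.

```sql
-- Default delimiter (','): whitespace around each token is trimmed,
-- so this should return three rows: a, b and c.
SELECT column_value
FROM   TABLE(string_token('a, b ,c'));

-- Multi-character delimiter: the function advances by LENGTH(pi_del),
-- so this should return two rows: x and y.
SELECT column_value
FROM   TABLE(string_token('x||y', '||'));
```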

Now we are good to go. The first thing to establish is the cardinality Oracle estimates for a call to the pipelined function. Here I am passing in 5 values:
 var str  VARCHAR2(4000);  
 var delim VARCHAR2(1);  
 exec :str := '10,11,12,13,14';  
 exec :delim := ',';  
 SELECT /*+ gather_plan_statistics monitor bind_aware */ *
 FROM  TABLE(string_token(:str,:delim));

Running the trace through TRCA outputs the following plan (edited to remove information about the reads):
 ID  PID  Card   Rows    Row Source Operation 
 --- ---- ------ ------- ----------------------------------------------- 
  1:  0   8168   5       COLLECTION ITERATOR PICKLER FETCH STRING_TOKEN
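The estimated cardinality (8168) almost matches the block size which I am using in my database (8192). Oracle has no statistics for the rows a pipelined function will return, so the estimate is a default rather than anything derived from the 5 values passed in.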
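
If TRCA is not to hand, the same estimated-versus-actual comparison can be sketched with DBMS_XPLAN, since the query already carries the gather_plan_statistics hint. This assumes it is run in the same session straight after the query, with serveroutput off in SQL*Plus so that the last cursor executed is the one reported, and with select access to the underlying V$ views. I have not reproduced its output here.

```sql
-- NULL, NULL = the last statement executed by this session.
-- 'ALLSTATS LAST' shows estimated rows (E-Rows) next to actual rows (A-Rows)
-- for the most recent execution.
SELECT *
FROM   TABLE(DBMS_XPLAN.DISPLAY_CURSOR(NULL, NULL, 'ALLSTATS LAST'));
```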

What about a more complex example?

I have created two tables which can be used for this test:

 exec dbms_random.seed(0);  
 create table from_table  
 pctfree 99  
 pctused 1  
 as  
 with generator as (  
   select --+ materialize  
       rownum id  
   from  all_objects  
   where  rownum <= 100  
 )  
 select  
   rownum       id,  
   CASE WHEN MOD(rownum,10) < 5 THEN 4 ELSE MOD(rownum,10) END num,  
   lpad(rownum,10,'0') small_vc,  
   trunc(100 * dbms_random.normal) val,  
   rpad('x',100)    padding  
 from  
   generator  v1,  
   generator  v2  
 where  
   rownum <= 1000;

 CREATE UNIQUE INDEX i_from_table_01 ON from_table(id);  
 -- Gather stats 

create table join_table
pctfree 99
pctused 1
as
with generator as (
    select  --+ materialize
            rownum  id
    from    all_objects
    where   rownum <= 1000
)
select
    rownum              id,
    CASE WHEN MOD(rownum,10) < 5 THEN 4 ELSE MOD(rownum,10) END num,
    lpad(rownum,10,'0') small_vc,
    trunc(100 * dbms_random.normal) val,
    rpad('x',100)       padding,
    (MOD(rownum, 1000) + 1) ft_id
from
    generator   v1,
    generator   v2
where
    rownum <= 10000
;

CREATE UNIQUE INDEX i_join_table_01 ON join_table(id);
CREATE INDEX i_join_table_02 ON join_table(ft_id);
-- Gather stats
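
The "-- Gather stats" steps are not spelled out above. A minimal sketch of what they could look like, assuming both tables live in the current schema and that the default sampling and histogram settings are acceptable (these options are an assumption, not necessarily the ones used for the tests):

```sql
BEGIN
  -- cascade => TRUE also gathers statistics on the indexes of each table.
  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'FROM_TABLE',
    cascade => TRUE);

  DBMS_STATS.GATHER_TABLE_STATS(
    ownname => USER,
    tabname => 'JOIN_TABLE',
    cascade => TRUE);
END;
/
```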

Now the table function is used to create a result set which joins to the from_table.
exec :str := '10,11,12,13,14';  
exec :delim := ',';  
SELECT /*+ gather_plan_statistics monitor bind_aware */ *
 FROM  from_table ft  
 JOIN  TABLE(string_token(:str,:delim)) ON column_value = ft.id  
 JOIN  join_table jt ON ft.id = jt.ft_id;

ID   PID   Card    Rows               Row Source Operation              
--- ---- ------ ------- ------------------------------------------------
 1:    0  81680      50 HASH JOIN
 2:    1   8168       5  COLLECTION ITERATOR PICKLER FETCH STRING_TOKEN
 3:    1  10000   10000  HASH JOIN
 4:    3   1000    1000 . TABLE ACCESS FULL FROM_TABLE
 5:    3  10000   10000 . TABLE ACCESS FULL JOIN_TABLE

In this case, Oracle decides that the best way to get the data from the tables is two full table scans. Note the discrepancy between the estimated Card (8168) and the actual rows returned (5) for the table function.

Now add a cardinality hint saying that a handful of rows are returned. Note that I have rewritten the query to use an inline view.


 exec :str := '10,11,12,13,14';  
 exec :delim := ',';  
 SELECT /*+ gather_plan_statistics monitor bind_aware */ *
 FROM  from_table ft  
 JOIN  (SELECT /*+ cardinality(t 5) */ * FROM TABLE(string_token(:str,:delim)) t) ON column_value = ft.id  
 JOIN  join_table jt ON ft.id = jt.ft_id;  

ID   PID   Card    Rows               Row Source Operation               
--- ---- ------ ------- -------------------------------------------------
 1:    0     50      50 NESTED LOOPS
 2:    1      5       5  NESTED LOOPS
 3:    2      5       5 . COLLECTION ITERATOR PICKLER FETCH STRING_TOKEN
 4:    2      1       5 . TABLE ACCESS BY INDEX ROWID FROM_TABLE
 5:    4      1       5 .. INDEX UNIQUE SCAN I_FROM_TABLE_01
 6:    1     10      50  TABLE ACCESS BY INDEX ROWID JOIN_TABLE
 7:    6     10      50 . INDEX RANGE SCAN I_JOIN_TABLE_02

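A far better plan: the estimates now match the actual row counts, and Oracle switches from two full table scans to nested loops driven by index access.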

To summarize, where possible use the CARDINALITY hint to tell Oracle how many rows a pipelined function is expected to return. It should lead to better plans for your queries.
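
The CARDINALITY hint is undocumented and needs a fixed number. One alternative worth testing is the DYNAMIC_SAMPLING hint, which as far as I know is honoured for pipelined functions from 11.1.0.7 onwards. The sketch below is untested here; both the level (2) and its effect on this plan are assumptions.

```sql
-- Assumption: dynamic sampling lets the optimizer sample the function's
-- output at parse time instead of using the block-size based default.
SELECT /*+ gather_plan_statistics */ *
FROM   from_table ft
JOIN   (SELECT /*+ dynamic_sampling(t 2) */ *
        FROM   TABLE(string_token(:str, :delim)) t)
       ON column_value = ft.id
JOIN   join_table jt ON ft.id = jt.ft_id;
```

The trade-off is that sampling executes the function during parsing, so it suits cheap functions better than expensive ones.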

Monday, November 19, 2012

Does an index work without a where clause? (part 1)

I was looking at some of the queries that the front end runs against one of the databases I support. One query currently does a full scan of one of the fact tables in order to generate a small set of static information - performance is acceptable at the moment but will gradually degrade as the fact table gets bigger.

Unfortunately amending the data model is not an option at the moment, so I started thinking about other ways to speed up the query. Could I use an index?

Tests are carried out on 11.2.0.3.

What is the performance like with a small table?

 CREATE TABLE small_tab
 AS
 SELECT *
 FROM  all_objects
 where rownum <= 70000;
 -- Gather statistics
SELECT DISTINCT object_type, object_type||' text'  
 FROM  small_tab;  
 --------------------------------------------------------------------------------
 | Id | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
 --------------------------------------------------------------------------------
 |  0 | SELECT STATEMENT  |           |       |       |  307 (100) |          |
 |  1 | HASH UNIQUE       |           |  39   |  351  |  307  (2)  | 00:00:04 |
 |  2 |  TABLE ACCESS FULL| SMALL_TAB | 76220 |  669K |  304  (1)  | 00:00:04 |
 --------------------------------------------------------------------------------

What if I add a normal index?

 CREATE INDEX i_small_tab ON small_tab(object_type, object_type||' text');
 -- gather statistics
 SELECT DISTINCT object_type, object_type||' text'
 FROM  small_tab;
 --------------------------------------------------------------------------------
 | Id | Operation         | Name      | Rows  | Bytes | Cost (%CPU)| Time     |
 --------------------------------------------------------------------------------
 |  0 | SELECT STATEMENT  |           |       |       |  307 (100) |          |
 |  1 | HASH UNIQUE       |           |  39   |  897  |  307  (2)  | 00:00:04 |
 |  2 |  TABLE ACCESS FULL| SMALL_TAB | 76220 | 1711K |  304  (1)  | 00:00:04 |
 --------------------------------------------------------------------------------

Still a full scan of the table. Note that I am also unable to hint Oracle to use the index. I might take a closer look at this in a future post.

What about a bitmap index? Logically this should provide the best performance because of the way a bitmap operates internally. But is this the case?

 CREATE BITMAP INDEX i_small_tab_bmp ON small_tab(object_type, object_type||' text');  
 -- Gather statistics  
 SELECT DISTINCT object_type, object_type||' text'  
 FROM  small_tab;
 ------------------------------------------------------------------------------------------------
 | Id | Operation                   | Name            | Rows  | Bytes | Cost (%CPU)| Time     |
 ------------------------------------------------------------------------------------------------
 |  0 | SELECT STATEMENT            |                 |       |       |   7 (100)  |          |
 |  1 | HASH UNIQUE                 |                 |  39   |  897  |   7 (29)   | 00:00:01 |
 |  2 |  BITMAP INDEX FAST FULL SCAN| I_SMALL_TAB_BMP | 76220 | 1711K |   5  (0)   | 00:00:01 |
 ------------------------------------------------------------------------------------------------

... and Oracle uses the bitmap index. The response on my system is also about 3 times faster - approximately 0.01s rather than 0.03s with the full scan.

My next post will summarize the same tests against a large table of approximately 2,000,000 rows.