Analyze a remote query

After we moved a production database from our US office to the office in Sweden (Gothenburg) I started to see a remote query having issues with SQL*Net message from dblink. I wanted to analyze that further and see if I could fix it.

dblink_before

This is the scenario, you see the second sql having high waits on sql*net message from dblink.

It is a bit tricky to analyze a remote query because its not easy to see what is excuted on the remote side. If I trace the execution on the local side it looks like this.

SELECT T1.CREATED, T1.CREATED_BY, T1.ROW_ID
FROM SIEBEL.V_MDM_MISSING_ACCOUNTS T1
WHERE (T1.ROW_ID = :v1)

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.05          0          0          1          0
Execute      1     0.00       0.00          0          0          0          0
Fetch        2     0.00       0.26          0         12          0          1 
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        4     0.00       0.31          0         12          1          1

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 1514 (APPDBA_SHARED)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
         1          1          1 NESTED LOOPS (cr=12 pr=0 pw=0 time=262378 us cost=4 size=243 card=1)
         1          1          1  NESTED LOOPS (cr=8 pr=0 pw=0 time=60 us cost=2 size=41 card=1)
         1          1          1   TABLE ACCESS BY INDEX ROWID S_ORG_EXT (cr=5 pr=0 pw=0 time=42 us cost=1 size=32 card=1)
         1          1          1    INDEX UNIQUE SCAN S_ORG_EXT_P1 (cr=3 pr=0 pw=0 time=25 us cost=1 size=0 card=1)(object id 27276)
         1          1          1   INDEX UNIQUE SCAN S_ADDR_PER_P1 (cr=3 pr=0 pw=0 time=15 us cost=1 size=9 card=1)(object id 19011)
         1          1          1  VIEW VW_NSO_1 (cr=4 pr=0 pw=0 time=262315 us cost=2 size=202 card=1)
         1          1          1   MINUS (cr=4 pr=0 pw=0 time=262313 us)
         1          1          1    TABLE ACCESS BY INDEX ROWID S_ORG_EXT (cr=4 pr=0 pw=0 time=16 us cost=1 size=17 card=1)
         1          1          1     INDEX UNIQUE SCAN S_ORG_EXT_P1 (cr=3 pr=0 pw=0 time=7 us cost=1 size=0 card=1)(object id 27276)
         0          0          0    SORT UNIQUE (cr=0 pr=0 pw=0 time=262289 us)
         0          0          0     REMOTE (cr=0 pr=0 pw=0 time=262283 us)

The execution takes 0.31 seconds to execute, of those 0,31 seconds we spend 0,26 seconds on the remote query. (262283 us is 0,26 seconds). With an execution time of 0,31 seconds is it worth the time to tune it ? In our case this sql is executed ~1600 times each hour so yes I think we will gain from tuning this.

The name of the table implies it can be a view and that is correct, the full sql looks like this.

SELECT a.row_id, 
 a.name, 
 a.loc, 
 a.bu_id, 
 a.created, 
 a.created_by 
FROM s_org_ext a, 
 s_addr_per b 
WHERE a.pr_addr_id = b.row_id 
 AND a.row_id IN (SELECT row_id 
                  FROM s_org_ext 
                  WHERE int_org_flg = 'N' 
                  AND prtnr_flg = 'N' 
                  AND accnt_flg = 'Y' 
                  AND cmpt_flg = 'N' 
 MINUS 
                  SELECT ext_org_id1 
                  FROM s_cif_org_map@mdm 
                  WHERE cif_ext_syst_id = (SELECT row_id 
                                           FROM s_cif_ext_syst@mdm 
                                           WHERE name = 'CRM')) 
 AND a.created > SYSDATE - 365;

Since most of the time is spent on the remote side it would be interesting to see what is happening on the remote side when the sql executes. How can we do that?
We need to create a remote procedure that enables tracing on the remote side.

These are the steps to do that:

GRANT ALTER SESSION TO siebel; 
GRANT EXECUTE ON sys.dbms_monitor TO siebel; 

CREATE OR replace PROCEDURE siebel.Set_trace 
AS 
 sess_user VARCHAR2 (64); 
BEGIN 
 SELECT Sys_context ('USERENV', 'SESSION_USER') 
 INTO sess_user 
 FROM dual; 
 EXECUTE IMMEDIATE 'alter session SET max_dump_file_size=UNLIMITED';  
 EXECUTE IMMEDIATE 'alter session set tracefile_identifier = '||sess_user; 
 dbms_monitor.Session_trace_enable(NULL, NULL, TRUE, TRUE); 
END; 
/

The steps above is done on the remote side, this should be done for the same user you are using in your db link configuration on the local side. This is how my db link MDM is created, (note that you have to have global_names=false to freely choose your db link name):

CREATE DATABASE LINK “MDM.GSO.VOLVO.NET”
CONNECT TO “SIEBEL” IDENTIFIED by “******” USING ‘GMDM1P02’

I will do the steps for the Siebel user. First we grant the right to do alter session, then we grant execute on the tracing package dbms_monitor. If you don’t have dbms_monitor installed you can run  OH$/rdbms/admin/dbmsmntr.sql (as sys).

Then we create the package set_trace. The name of the tracefile will have SIEBEL in the name since that is the username in the dblink I use. If you have another username in your dblink look for that name among the tracefiles.

Short about what we do in the procedure, we start by selecting the username into sess_user variable, then we set the limit for the dump file generated, I have unlimited but if you have limited of space you can set it to a specific size. Then we use the sess_user variable and assign that to the trace file identifier which means it will be assigned to the tracefilename. Last we enable the tracing, parameters specified are in order (sid, serial#, waits,binds). Fore more information about tracing see:
How To Collect 10046 Trace (SQL_TRACE) Diagnostics for Performance Issues (Doc ID 376442.1)

Now the remote side is ready, now it is time to start our trace from the local side.

Run the remote procedure by executing it

exec set_param@mdm

mdm is the database link, the same as we use in the full sql. When we run it we will enable tracing on the remote side for this session.

And then we run our sql:

variable v1 VARCHAR2(128);
EXECUTE :v1 := '1-J2XDAK';

SELECT T1.created, 
 T1.created_by, 
 T1.row_id 
FROM siebel.v_mdm_missing_accounts T1 
WHERE ( T1.row_id = :v1 )

Now we will look at the remote side for our tracefile, i find the file gmdm1p02_ora_32152_SIEBEL.trc and if I run tkprof on that file I can see:

SELECT "A1"."ext_org_id1" 
FROM "s_cif_org_map" "A1" 
WHERE "A1"."cif_ext_syst_id" = (SELECT "A2"."row_id" 
                                FROM "s_cif_ext_syst" "A2" 
                                WHERE "A2"."name" = 'CRM') 
 AND "A1"."ext_org_id1" = :1

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute      1     0.00       0.00          0          6          0          0
Fetch        1     0.24       0.24          0      28111          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        3     0.24       0.24          0      28117          0          0

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 45 (SIEBEL)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
         0          0          0 TABLE ACCESS BY INDEX ROWID S_CIF_ORG_MAP (cr=28117 pr=0 pw=0 time=246834 us cost=17 size=19 card=1)
    915923     915923     915923  INDEX RANGE SCAN S_CIF_ORG_MAP_F1 (cr=7401 pr=0 pw=0 time=95708 us cost=3 size=0 card=47015)(object id 28719)
         1          1          1   TABLE ACCESS FULL S_CIF_EXT_SYST (cr=6 pr=0 pw=0 time=38 us cost=4 size=18 card=1)

We can see the sql executed on the remote side and we can see statistics and execution plan. Execution time is 0,24 seconds which matches our timing from our local tracefile (see above) we do 28117 buffer gets. We start with a full table scan of S_CIF_EXT_SYST returning 1 row, then we scan the index S_CIF_ORG_MAP_F1 returning 915923 rows, the last step is a filtering on the table S_CIF_ORG_MAP returning 0 rows. Lets see if we can have more information from the execution. I run that sql again but now on the remote side (using autotrace). I only show the predicate info here, the plan is the same as the one above.

Predicate Information (identified by operation id):
---------------------------------------------------

 1 - filter("A1"."EXT_ORG_ID1"=:V1)
 2 - access("A1"."CIF_EXT_SYST_ID"= (SELECT "A2"."ROW_ID" FROM "S_CIF_EXT_SYST" "A2"
     WHERE "A2"."NAME"='CRM'))
 3 - filter("A2"."NAME"='CRM')

What improvment can we do here ? if we have a full table scan  that returns 1 row I believe an index will be useful. Even if the table is small, table  S_CIF_EXT_SYST  has just 44 rows I beleive an index will help us. I will create one on name to remove the full table scan.

CREATE INDEX SIEBEL.S_CIF_EXT_SYST_Q1 ON SIEBEL.S_CIF_EXT_SYST(NAME)

Next is the access of index S_CIF_ORG_MAP_F1 which relates to id 2 in the predicate list above. We return 915923 rows here but then we throws all of them away in the filtering part (id 1), which relates to EXT_ORG_ID1″=:V1. If we have an index on these two columns we should be able to do all the work on the index level. I will create an index like this:

CREATE INDEX SIEBEL.S_CIF_ORG_MAP_Q1 
 ON SIEBEL.S_CIF_ORG_MAP(CIF_EXT_SYST_ID,EXT_ORG_ID1)

 

Now lets see what happens with our query on the remote side with the new indexes.

SELECT "A1"."ext_org_id1" 
FROM "s_cif_org_map" "A1" 
WHERE "A1"."cif_ext_syst_id" = (SELECT "A2"."row_id" 
                                FROM "s_cif_ext_syst" "A2" 
                                WHERE "A2"."name" = 'CRM') 
 AND "A1"."ext_org_id1" = :1

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00      0.00           0          0          0          0
Execute      1     0.00      0.00           0          2          0          0
Fetch        1     0.00      0.00           0          3          0          0
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        3     0.00       0.00          0          5          0          0

Misses in library cache during parse: 0
Optimizer mode: ALL_ROWS
Parsing user id: 45 (SIEBEL)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
         0          0          0 INDEX RANGE SCAN S_CIF_ORG_MAP_Q1 (cr=5 pr=0 pw=0 time=29 us cost=1 size=19 card=1)(object id 171537)
         1          1          1  TABLE ACCESS BY INDEX ROWID S_CIF_EXT_SYST (cr=2 pr=0 pw=0 time=17 us cost=1 size=18 card=1)
         1          1          1   INDEX RANGE SCAN S_CIF_EXT_SYST_Q1 (cr=1 pr=0 pw=0 time=10 us cost=1 size=0 card=1)(object id 171534)

Now it looks really good, we reduced buffer gets from 28117 to 5, we use my new indexes and we removed access to the actual table. This will surely make a difference for my local sql over the db link. Especially with ~1600 executions per hour.

Last step is to compare my local execution. I enable the 10046 trace again and run my local sql. This is the trace information (you should compare this to the one on the top)

SELECT T1.CREATED, T1.CREATED_BY, T1.ROW_ID
FROM SIEBEL.V_MDM_MISSING_ACCOUNTS T1
WHERE (T1.ROW_ID = :v1)

call     count      cpu    elapsed       disk      query    current       rows
------- ------ -------- ---------- ---------- ---------- ---------- ----------
Parse        1     0.00       0.00          0          0          0          0
Execute      1     0.00       0.00          0          0          0          0
Fetch        2     0.00       0.00          0         12          0          1
------- ------ -------- ---------- ---------- ---------- ---------- ----------
total        4     0.00       0.00          0         12          0          1

Misses in library cache during parse: 1
Misses in library cache during execute: 1
Optimizer mode: ALL_ROWS
Parsing user id: 45 (SIEBEL)
Number of plan statistics captured: 1

Rows (1st) Rows (avg) Rows (max) Row Source Operation
---------- ---------- ---------- ---------------------------------------------------
         1          1          1 NESTED LOOPS (cr=12 pr=0 pw=0 time=395 us cost=4 size=243 card=1)
         1          1          1  NESTED LOOPS (cr=8 pr=0 pw=0 time=20 us cost=2 size=41 card=1)
         1          1          1   TABLE ACCESS BY INDEX ROWID S_ORG_EXT (cr=5 pr=0 pw=0 time=14 us cost=1 size=32 card=1)
         1          1          1    INDEX UNIQUE SCAN S_ORG_EXT_P1 (cr=3 pr=0 pw=0 time=9 us cost=1 size=0 card=1)(object id 27276)
         1          1          1   INDEX UNIQUE SCAN S_ADDR_PER_P1 (cr=3 pr=0 pw=0 time=5 us cost=1 size=9 card=1)(object id 19011)
         1          1          1  VIEW VW_NSO_1 (cr=4 pr=0 pw=0 time=372 us cost=2 size=202 card=1)
         1          1          1   MINUS (cr=4 pr=0 pw=0 time=371 us)
         1          1          1    TABLE ACCESS BY INDEX ROWID S_ORG_EXT (cr=4 pr=0 pw=0 time=9 us cost=1 size=17 card=1)
         1          1          1     INDEX UNIQUE SCAN S_ORG_EXT_P1 (cr=3 pr=0 pw=0 time=6 us cost=1 size=0 card=1)(object id 27276)
         0          0          0    SORT UNIQUE (cr=0 pr=0 pw=0 time=360 us)
         0          0          0     REMOTE (cr=0 pr=0 pw=0 time=357 us)

The interesting part is that the elapsed time for the whole execution is now 0.00 seconds (was 0,31 secionds before). The execution of the remote part went from 262283 us to 357 us, a great improvement .

 

If you find this (and my other posts) interesting, please follow me on twitter.
There is a link on the main page.

 

References:

How To Collect 10046 Trace (SQL_TRACE) Diagnostics for Performance Issues (Doc ID 376442.1)
Troubleshooting Distributed Query Performance Issues (Doc ID 1921358.1)
Determining the execution plan for a distributed query (9i and below) (Doc ID 33838.1)

 

Advertisements

Leave a Reply

Fill in your details below or click an icon to log in:

WordPress.com Logo

You are commenting using your WordPress.com account. Log Out / Change )

Twitter picture

You are commenting using your Twitter account. Log Out / Change )

Facebook photo

You are commenting using your Facebook account. Log Out / Change )

Google+ photo

You are commenting using your Google+ account. Log Out / Change )

Connecting to %s