We experienced an issue of spinning cpu which required us to apply a one of patch,
at the same time we applied the latest PSU patch April 2016.
3 days later when a batch was running we got performance problems,
the batch was taking much longer than before.
I was asked to have a look, I got the sid for the batch process.
Using grid control I had a look at the session and this is what I found:
As you can see we have 206 sql commands executing. Many of them have the
same plan hash value. My first reflection of that is that we have similar sqls
executing. I created an ASH report for the time and sid, I found:
Here we can see that we have 153 “# of sampled SQL versions” that has the same
force_matching_signature. It also give 2 examples of similar sqls.
Force_matching_signature will be the same for sqls that has been rewritten due to having cursor_Sharing=force.If you as we do, have cursor_sharing=exact the optimizer still create a force_matching_signature by removing white space and replacing literals with bind variables.
The purpose of this to be used with sql Profiles where you can set force_match=true
to force similar sqls to use the sql profile. It is also a good way to find similar sql using literals.
The indication is that we have many sqls that after you remove white space and turn literals into bind variables are the same. If we check those (also presented in the ash report) sqls listed in the ash report, we can see that they look the same, and only differs by literal values.
It seems the issue is related to executions of same sqls with different literal values.
But why do we see this now ? prior to the patching we never had any reports of
this batch performing poor.
My next step is to trace an execution and see what I can find..
I’m using one of the sqls listed in the ash report.
I start by enabling 10046 traces
-- Allow the tracefile to reach unlimited size, make sure you have space on disk alter session set MAX_DUMP_FILE_SIZE = 'unlimited'; -- Set a tracefile identifier to easier find the tracefile in diag directory alter session set tracefile_identifier = 'LITERALS'; -- Enable 10046 tracing for my current session, with waits and binds EXECUTE dbms_monitor.session_trace_enable (waits=>true, binds=>true);
Looking at the diag directory I can now se my tracefile
Now I run the sql
SELECT stdoco, stedbt, stedct, stedln, stedoc, stekco, stmath01, stqnty, stsgvl, stupmj FROM f5547018 WHERE stedbt LIKE '8108574 1' AND stekco LIKE '00100' AND stedoc LIKE '226138' AND stedct LIKE 'PO' AND stedln LIKE'2000' ORDER BY stmath01 ; no rows selected Elapsed: 00:00:08.34 SQL > EXECUTE dbms_monitor.session_trace_disable; -- disable tracing
It took 8 seconds to execute, looking at the plan after the execution we can see
that A-Time (actual time) is shown as 0.01 seconds. We can also see in the Note section
– dynamic statistics used: dynamic sampling (level=2)
– 2 Sql Plan Directives used for this statement
As I understand these two features they are integrated, if the optimizer find statistics (read extended statistics) to be insufficient it will trigger Automatic Dynamic Statistics (ADS). The result of ADS is stored as a Sql Plan Directive (SPD). The SPD statistics is not stored per sql_id but instead of objects in the sql, making it available for other sqls as well.
The note in the plan says we used two sql plan directives,we can list our spd on that table.
We can see that we have SPD for table and columns, some created in January and updated in June. Several new SPDs are created on columns after we patched.
What if I run it again ?
SELECT stdoco, stedbt, stedct, stedln, stedoc, stekco, stmath01, stqnty, stsgvl, stupmj FROM f5547018 WHERE stedbt LIKE '8108574 1' AND stekco LIKE '00100' AND stedoc LIKE '226138' AND stedct LIKE 'PO' AND stedln LIKE'2000' ORDER BY stmath01 ; no rows selected Elapsed: 00:00:00.00 SQL>
Really fast, this tells me something is going on during parse time…
First execution took 8 seconds the following 0.01 seconds.
What will the trace tell us ?
I use tkprof to format the tracefile
tkprof cdb4166_ora_115494_LITERALS.trc cdb4166_ora_115494_LITERALS.trc.out sys=no
I do not sort the trace nor do I get the explain plans. For now I’m not interested in them.
I just want to see what is executed and the timing.
Scanning thru the tracefile I can see some sqls I do not recognize, they looks like this:
SQL ID: 15d2qu9mdymgg Plan Hash: 1238190760 SELECT /* DS_SVC */ /*+ dynamic_sampling(0) no_sql_tune no_monitoring optimizer_features_enable(default) no_parallel result_cache(snapshot=3600) */ SUM(C1) FROM (SELECT /*+ qb_name("innerQuery") NO_INDEX_FFS( "F5547018") */ 1 AS C1 FROM "F5547018" SAMPLE BLOCK(0.084207, 8) SEED(1) "F5547018" WHERE ("F5547018"."STEDBT"=U'8108574 1') AND (TO_CHAR("F5547018"."STEDOC")= '226138') AND (TO_CHAR("F5547018"."STEDLN")='2000') AND ("F5547018"."STEKCO"=U'00100') AND ("F5547018"."STEDCT"=U'PO')) innerQuery
I have 18 similar sqls executed prior to my sql I executed in the trace. The PARSE section in the raw tracefile has dep=1 which are dependencies, the value 1 means it is a recursive sql.
Most of them take less than 0.00 seconds but I have 3 statements each taking around
2.5 seconds of elapsed time. If I summarize them I have my 8 seconds from the first execution. The hint DS_SVC is useful to search on. If I do that on My Oracle Support I can quickly come to the conclusion they come from Automatic Dynamic Statistics. Earlier known as Dynamic Sampling.
As the name implies this is automatic and note: Adaptive Query Optimization (Doc ID 2031605.1) says:
“Oracle uses dynamic statistics automatically when the optimizer deems it necessary”
This note als0 says that
- When OPTIMIZER_ADAPTIVE_FEATURES = TRUE and OPTIMIZER_DYNAMIC_SAMPLING = 2 (default), then Automatic Dynamic Statistics (ADS) can happen.
- When OPTIMIZER_ADAPTIVE_FEATURES = FALSE and OPTIMIZER_DYNAMIC_SAMPLING = 2 (default), then Automatic Dynamic Statistics (ADS) will not happen. However, the default dynamic sampling level is still honored.
- If OPTIMIZER_DYNAMIC_SAMPLING is set to 11, then Automatic Dynamic Statistics is enabled forever irrespective of setting of the parameter OPTIMIZER_ADAPTIVE_FEATURES.
You can disable ADS by setting OPTIMIZER_DYNAMIC_SAMPLING = 0
Still we have the performance problems seen after our patching, why is this suddenly causing issues we have used 12c for more than 3 month without experiencing this before.
I turn to My Oracle Support site and seach for Automatic Dynamic Statistics, I also enable the bug section.
I came across this info…
Very Long Parse Time for Queries in In-Memory Database (Doc ID 2102106.1), which relates to Unpublished Bug 19130972 – EXTREMELY LONG PARSE TIME FOR STAR QUERY.
When parameter inmemory_query is enabled this issue can be seen. But to my knowledge we do not use in-Memory features. Reading about the parameter in 12c database administration guide it shows that this parameter has a default value of ENABLED. But it requires parameter inmemory_size to have a value > 0 to enable in-memory features.
Ok so lets test it
SQL> alter session set MAX_DUMP_FILE_SIZE = 'unlimited'; SQL> alter session set tracefile_identifier = 'INMEMORY'; SQL> EXECUTE dbms_monitor.session_trace_enable (waits=>true, binds=>true); SQL> alter session set inmemory_query=DISABLE;
I add a comment to force a hard parse..
SQL >SELECT /* INMEMORY */stdoco, stedbt, stedct, stedln, stedoc, stekco, stmath01, stqnty, stsgvl, stupmj FROM f5547018 WHERE stedbt LIKE '8108574 1' AND stekco LIKE '00100' AND stedoc LIKE '226138' AND stedct LIKE 'PO' AND stedln LIKE'2000' ORDER BY stmath01 ; no rows selected Elapsed: 00:00:00.01 SQL > EXECUTE dbms_monitor.session_trace_disable; -- disable tracing
Okay, so it seems to be fixed…..
Looking at the tracefile generated I have no dynamic statistics queries anymore.
It seems that applying the latest PSU introduced this behavior to our system.
Changing the parameter inmemory_query to disable solved the issue, I’m (almost) sure we suffered from Very Long Parse Time for Queries in In-Memory Database (Doc ID 2102106.1). The Bug says “.. for star query” but it is hidden so I cannot get any information from the bug. The note does not say if it is restricted to star queries.
A strong contributor to this performance problem comes from the application, the usage of literals will cause a hard parse for each statement. It is the hard parse that introduces the Dynamic Statistics. If we used bind variables I believe the first executions would be slow but the coming executions would do a soft parse. Possibly can adaptive cursor sharing cause some more cursors to be hard parsed but the over all executions would be better.
If you use inMemory feature there is a patch available, see the note for details.
If you see similar behavior for parallel queries look at Bug 20636003 Slow Parsing caused by Dynamic Sampling (DS_SRV) queries.
Maris Elsins – THE EASY WAY OF FINDING SIMILAR SQL STATEMENTS
Oracle – Administrator’s Guide
Oracle – Very Long Parse Time for Queries in In-Memory Database (Doc ID 2102106.1)