Around 9 months ago we upgraded a Demantra system to Oracle 12c. Now they have started to complain that batch jobs on the weekend are taking a very long time. I was asked to have a look. These are my findings.
A colleague supplied an AWR report to me.
And if we look at the top 10 foreground events:
We can see that we spend 98.6% of the DB time on CPU. In general that is good, since we want to spend as much time as possible on the CPU. But it is also interesting to see what we are using the CPU for. What we want to be doing is processing user data. So the next step is to check where we are spending that CPU time. We look at the Time Model Statistics.
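Outside of AWR, the same figures can be pulled live from the time model views. A sketch (values since instance startup, unlike AWR's snapshot deltas):

```sql
-- Instance-wide time model statistics (values are in microseconds).
-- "DB CPU", "parse time elapsed" and "hard parse elapsed time" are the
-- statistics the AWR Time Model Statistics section is built from.
SELECT stat_name,
       ROUND(value / 1e6) AS seconds
FROM   v$sys_time_model
WHERE  stat_name IN ('DB time', 'DB CPU',
                     'parse time elapsed',
                     'hard parse elapsed time')
ORDER  BY value DESC;
```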
In total we spend 216568 seconds on CPU, and of those seconds we spend 60.3% hard parsing, 61.2% on total parse time. If your system has been up and running for some time and has become stable, I would not expect to see hard parsing use 60.3% of the DB time, unless your application is using literals. I would certainly look into that.
In 12c AWR reports we also have a snippet of the ADDM report, listed at the beginning. What can it tell us?
It clearly tells us that we do hard parsing because we use literals and also because we have invalidations. I will get back to the invalidations in another post.
AWR has it all: we can easily see which statements cause the most parsing in the section SQL ordered by Parse Calls.
Here we have one SQL really sticking out; this SQL accounts for 48% of the total parses.
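You can get a similar picture directly from the shared pool between AWR snapshots. A sketch against V$SQLAREA (cumulative counts, not a snapshot window):

```sql
-- Top statements by parse calls currently in the shared pool, roughly
-- what AWR's "SQL ordered by Parse Calls" section reports.
SELECT sql_id,
       parse_calls,
       executions,
       SUBSTR(sql_text, 1, 60) AS sql_text
FROM   v$sqlarea
ORDER  BY parse_calls DESC
FETCH FIRST 10 ROWS ONLY;
```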
This top SQL is executed in our PDB GDEMP01, but what puzzles me is the type of SQL.
I can see a select from col$, which tells me this is internal SQL.
Even after fixing this issue we would still have a lot of parse time in the system. I prefer to deal with one issue at a time, evaluating the result of fixing it. Don't make too many changes at the same time; it can make it difficult to understand what fixed the issue in the end. In my case the batch is slow, but so far it isn't a severe issue, so a methodical approach is possible. I do expect to find more than one issue causing the problems we have. Back to the top SQL; the full SQL looks like this.
47r1y8yn34jmj select default$ from col$ where rowid=:1
This SQL reads the extended statistics at parse time. When I see this type of SQL at the top of an AWR report I get a bit suspicious, and I always look it up on Oracle Support. When I did that in this case I got a hit on:
Higher CPU and slower performance with top sql as select default$ from col$ where rowid=:1. It cannot be more on the spot than that. This is what the note says:
"Whenever a SQL is hard parsed all the information about the objects is loaded to the row cache including column definitions (and by extension the column groups as well) and default values, except when the default values are longer than 32 bytes in length. If the text length of this default value is longer than 32 bytes, then it is not being loaded into the row cache. Instead, the default value is read individually by the recursive SQL and whatever necessary operation with it would be acted on during hard parse time and later discarded. If the workload includes a large number of literal SQLs and each requires a hard parse, then the recursive call will be executed for as many SQLs and as many column groups present."
In our case we had a lot of hard parsing, and the SQL above accounts for 48% of the hard parse time.
The rest should come from other SQLs using literals, which I expect call this SQL (select default$ from col$ where rowid=:1), and for some of them the length of default$ is longer than 32 bytes, causing a hard parse of this SQL.
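To see whether you are exposed, you can look for the hidden virtual columns that column-group statistics create and check their default expression length. A sketch; the SYS_ST% naming pattern for extension columns is an assumption that may not cover every release:

```sql
-- Hidden virtual columns created for column-group (extended) statistics
-- whose stored default expression exceeds 32 bytes; these are the ones
-- fetched one by one via "select default$ from col$" at hard parse time.
SELECT owner, table_name, column_name, default_length
FROM   dba_tab_cols
WHERE  hidden_column = 'YES'
AND    column_name LIKE 'SYS\_ST%' ESCAPE '\'
AND    default_length > 32
ORDER  BY owner, table_name;
```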
So what can you do to fix it?
We have three options:
Disable extended statistics by setting “_optimizer_enable_extended_stats”=false.
This means the optimizer will not use extended statistics.
Drop all column group stats, this is one of the options mentioned in the note.
You will use dbms_stats.drop_extended_stats to do that.
Set cursor_sharing to force. This will convert literals to bind variables.
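The three options sketched in SQL. The schema, table, and column-group names in the drop_extended_stats call are purely illustrative, and the underscore parameter should only be set after checking with Oracle Support:

```sql
-- Option 1: stop the optimizer from using extended statistics
-- (hidden parameter; confirm with Oracle Support before setting).
ALTER SYSTEM SET "_optimizer_enable_extended_stats" = FALSE;

-- Option 2: drop a column-group statistic, one call per affected
-- table. SCOTT / SALES_DATA / (ITEM_ID, LOC_ID) are hypothetical.
BEGIN
  DBMS_STATS.DROP_EXTENDED_STATS(
    ownname   => 'SCOTT',
    tabname   => 'SALES_DATA',
    extension => '(ITEM_ID, LOC_ID)');
END;
/

-- Option 3: have Oracle rewrite literals into system bind variables.
ALTER SYSTEM SET cursor_sharing = FORCE;
```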
I started by disabling extended statistics, but I think I will end up setting cursor_sharing=FORCE, since I seem to have application SQLs using literals.
But we should not jump ahead.
After disabling extended statistics we didn't see this SQL on the top list anymore, so it seemed to help. But even though we got rid of this SQL we still had issues with our batch, and we ran into more problems related to non-application activity. I believe we experienced Bug 14283239 – High CPU/IO for dictionary SQL against SYSAUTH$ (Doc ID 14283239.8).
I will dig a bit deeper into the problem with long (> 32 bytes) extended stats and explain in more detail how it works. I hope you want to read it.
We still have a slow-running system, but it seems we are running into new issues. Some of them are interesting and I will continue to document them in this blog.
If you find this and other posts interesting, feel free to follow me on Twitter!
There is a link on the home page.