I noticed a long running sql in one of our systems, looking at it I was puzzled that we had a huge amount of buffer gets for a small delete (~40000 rows). We had over 3 billion buffer gets. And it has been executed for more than 9 hours
This are the details from grid control…
Note also that it was executed for 32.651 seconds (9 hours) and most of the time on cpu. Deleting 43000 rows should be very fast in my opinion. One good way to understand what is going on is to trace the execution. We have 10046 traces that will give us a lot of useful information. See Note: How To Collect 10046 Trace (SQL_TRACE) Diagnostics for Performance Issues (Doc ID 376442.1) for details.
I choose to use the ALTER SYSTEM SET EVENTS ‘sql_trace [sql:&1] bind=&2, wait=&3’;
With this event set every time this sqlid is executed it will be traced, I supplied TRUE for binds and waits which means it will be printed in the tracefiles as well. To disable the event you run ALTER SYSTEM SET EVENTS ‘sql_trace [sql:&1] off’
I waited until next execution of the sql and looked for the tracefile generated. I found the trace and ran tkprof on that file, tkprof formats the raw tracefile so it is easier to read. The raw tracefile can sometimes be useful to read, but for my purpose I believe the formatted tracefile is good. If you just type tkprof you will get a list of commands accepted. I ran
tkprof cdb4482_ora_19399.trc /tmp/cdb4482_ora_19399.trc.tkprof sys=y
sys=y means that it will includeall sql executed, also what is called recursive sql. Recursive sql are sql executed on behalf of my executed sql. The base sql is:
DELETE FROM TIME_DIMENSION WHERE YEAR = :B3 AND MONTH = :B2 AND DAY_OF_MONTH <= :B1
By looking from the top you will see the sqls executed. You will see internal sql related to various checks that we do. I am looking for any sql sticking out, especially related to the high buffer gets. The first I found was our original sql.
One thing I noticed here was that the elapsed/cpu time and the buffer gets does not match the info from grid control. Here we have much less time spent and buffers read.
This can be related to delayed block clean-out. (readers comment)
Google it and you can find a lot of posts explaining delayed block clean-out.
My short version is that Oracle does not tidy all blocks after a dml, If I understand it correct it is the ITL entries that stay in the block and is then handled next time that block is accessed. This delay in cleaning a block can be for a long time, it depends when it is accessed next time. And it can be the reason for the differences we see.
Looking just below I found five similar sqls..
We are doing a lot of buffer gets and also executions are matching the amount of rows returned in our base sql.
Note also the recursive depth, for this statement it is 2 and for the base statement it is 1.
Recursive depths shows relation between sqls, normally a sql that you execute using sqlplus or application has depth 0. If executed thru a pl/sql package the sql gets depth 1, the execution of the pl/sql code has depth 0. So in this case we have depth 1 for our main sql and depth 2 for the lookup sql, which make sense. There are also sql with depth 3.
The other four statements are the same just for the tables mon_ux_metric, mon_ux_transaction, mon_ux_session and mon_bus_proc statistics are similar.
So why are these sqls executed so much ? The fk abbreviation in the name helps a bit, FK often relates to Foreign key. If I look at the table mon_ux_event on the section constraints I can see.
Here we can see that we have a foreign key constraint against table time_dimension, how does that contribute to the extra buffer gets listed. We have a parent child relation here, which means that if we insert a row into the child we use the fk constraint to check that that fk column(s) value(s) exists in the parent table. If it doesn’t it will not be allowed to be inserted and you will receive a ORA-02291: integrity constraint (“constraint name”) violated – parent key not found. When we delete from the parent we need to verify that we have no child rows effected by the delete. We do that by scanning the child table. The FK constraint can de be designed to have “cascade on delete” which means we will automatically delete the child rows as well. But we do not have that in our case, see the right most column on the picture above. This scenario is the same for all tables listed above. If we check the tables having Fk constraints to time_dimension we get the following picture.
We see many FK relations between time_dimension and other tables. So why do we not see the lookup sqls for the other tables also having a FK constraint to time_dimension ?
The reason is that here we have an index that works for the lookup. See details below.
Looking at the scan we perform we can see that it does a index fast full scan.
It is executed 43200 times which matches the rows deleted, so 1 row returned per execution.Which is expected since we execute a select count .. which always return 1 row. But we do not know what the result is, but with the function of FK I expect it to always be 0 rows returned for the count. If not we would have an error saying
ORA-02292: integrity constraint (“constraint_name”) violated – child record found.
So for each and every row deleted from TIME_DIMENSION table we scan index UXE_USR_FK on table MON_UX_EVENT. That seems not very efficient. We have the same scenario for the other four tables. This is one of the reasons why we have so much buffer gets. Checking the child tables is the biggest contributor. Oracle has written a lot about foreign keys and indexing and I can recommend to read Note: Foreign Keys, Indexes and Locking (Doc ID 1945574.1) and Richard Foote’s blog about the same. Reading these notes shows that apart from the issue we experiences you can also have severe locking issues. That is why we always should have an index on the foreign key column. There are several scripts that will list if you do have an index on a FK column or not. Oracle has one NOTE:1019527.6 – Script to Check for Foreign Key Locking Issues for a Specific User. If I run this script for the table time_dimension we get
The highlighted rows are the tables referenced in the delete of time_dimension, this info tells us that changing data in time_dimension will lock all rows of the child tables.
As the notes said, this can cause a severe locking issue for you and also what it does for us right now with high buffer gets for our delete. Let us build the index for our tables.
Now when we have the index in place, will that have any effect on our delete when it comes to elapsed time and the buffer gets ?
For this execution we did much fewer rows, just 1440 rows. What is also interesting is that the select count(1)… i noticed for the long execution is now gone. I actually expected to see them but being more efficient with the fk index in place. It seems Oracle manage to do the lookup of child records without running those sql, as I cannot see any evidence of any lookup sqls in the trace with the index in place.
Removing that lookup sql contributes to the much faster execution. Even if we don’t delete the same amount of rows I believe you can see the relation between deleting the 1440 rows and the 42300 rows previously.
Not being able to see any lookup sqls after adding the fk index, I get a bit puzzled how the existens of child records are checked. I had to read the raw 10046 trace file again and this time more meticulous. Now I can see the following.
If we check those obj# we get
So we have the table time_dimension and we have the tim_default_idx index which is used by the delete. Then we have the primary key index TIM_PK of the time_dimension table TIM_PK. Last we have the LAT_SRC_IDX which is an index for one of the tables having a fk relation to time_dimension. So as I understand it this is how we check and verify that we do not violate the fk constraint.
Looking further down the trace file we can see that the obj# switches to 86447.
obj# 86447 points to one of the other indexes related to a table having a fk constraint to time_dimension. 86447 is index PR_TIM_FK which is created on table mon_process. And this continues for each table having a fk to time_dimension table.
Clearly this seems more efficient then running another sql for each row returned from the table we delete rows. And don’t forget the locking issue it also solves.
If you have any comments or can add more information to this topic feel free to do so !
Note: Foreign Keys, Indexes and Locking
(Doc ID 1945574.1)
Note: How To Collect 10046 Trace (SQL_TRACE) Diagnostics for Performance Issues
(Doc ID 376442.1)
Richard Foote’s blog