enq: TX – row lock contention

I received a case where a developer had a session that hung waiting on enq: TX – row lock contention. Several things puzzled me: he claimed he was the only one running on this database, and the row lock was related to an insert.

insert into PRODUCT (ID, VIN, CHASSISSERIES, CHASSISNO, MODIFICATIONTIME, CREATIONTIME) values (:1 , :2 , :3 , :4 , :5 , :6 )

I found this a bit odd at first. I associate enq: TX – row lock contention with updates and deletes, not inserts, so I wanted to analyze it further.

This is how the issue looked in Grid Control:

[Screenshot lde1: Grid Control Performance page]

We can see one session waiting on Application; no other sessions are visible doing work. If we select and drill down on that sql_id, 6padky01y6s2k, we get the following picture.

[Screenshot lde1b: SQL details for sql_id 6padky01y6s2k]

We can see that it is an insert and that it is waiting on enq: TX – row lock contention, and still only one session is running the SQL.
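As a side note, if you do not have Grid Control at hand, you can look up the statement behind a sql_id directly from the dictionary (a minimal sketch):

    -- look up the statement text for the sql_id seen in the screenshot
    select sql_id, sql_text, executions
    from   v$sql
    where  sql_id = '6padky01y6s2k';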
Luckily we have good tools in Oracle to dig deeper. We can have a look at the Blocking Sessions screen in Grid Control; it is located under the Performance tab.

[Screenshot lde2: Blocking Sessions screen in Grid Control]

We see our session holding the lock: it is session id 105. Looking at that session, what is it waiting for? It is idle. A session that actively holds a row lock is not doing anything? Idle time for a session is not uncommon, but how can it cause a locking issue?
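If you prefer SQL over the Blocking Sessions screen, the same information is available from v$session (a minimal sketch):

    -- which sessions are blocked, and who is blocking them?
    select sid, serial#, username, event,
           blocking_session, seconds_in_wait
    from   v$session
    where  blocking_session is not null;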

Let's dig deeper and look at that session by selecting the SID.

[Screenshot lde3: session details for SID 105]

What can we see if we look at session 105? It is indeed idle, waiting for SQL*Net message from client. It has been logged on for 1 day, 20 hours, 15 minutes and 30 seconds. Now have a look at the row WAIT: wait duration, 1 day, 20 hours, 14 minutes and 14 seconds. This session has been idle for more than 99% of its time. I asked the developer what happened the day before. He replied that they had some issues with Java and he killed the Java app manually. This is most likely the reason for that idle session. I plan to explain how this and similar scenarios can happen in another blog post later on.
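A quick way to confirm this without Grid Control is LAST_CALL_ET in v$session, which counts the seconds since the session's last call (a sketch):

    -- how long has session 105 been connected, and how long idle?
    select sid, status, event, logon_time,
           last_call_et                 as seconds_idle,
           round(last_call_et/86400, 2) as days_idle
    from   v$session
    where  sid = 105;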

Still, we have to answer the question: how can an insert cause row lock contention?
If we look in the Database Performance Tuning Guide at the description of the wait event:

10.3.6.2.4 TX enqueue

These are acquired exclusive when a transaction initiates its first change and held until the transaction does a COMMIT or ROLLBACK.

  • Waits for TX in mode 6: occurs when a session is waiting for a row level lock that is held by another session. This occurs when one user is updating or deleting a row, which another session wants to update or delete. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention. The solution is to have the first session holding the lock perform a COMMIT or ROLLBACK.
  • Waits for TX in mode 4 can occur if the session is waiting for an ITL (interested transaction list) slot in a block. This happens when the session wants to lock a row in the block but one or more other sessions have rows locked in the same block, and there is no free ITL slot in the block. Usually, Oracle Database dynamically adds another ITL slot. This may not be possible if there is insufficient free space in the block to add an ITL. If so, the session waits for a slot with a TX enqueue in mode 4. This type of TX enqueue wait corresponds to the wait event enq: TX - allocate ITL entry. The solution is to increase the number of ITLs available, either by changing the INITRANS or MAXTRANS for the table (either by using an ALTER statement, or by re-creating the table with the higher values).
  • Waits for TX in mode 4 can also occur if a session is waiting due to potential duplicates in UNIQUE index. If two sessions try to insert the same key value the second session has to wait to see if an ORA-0001 should be raised or not. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention. The solution is to have the first session holding the lock perform a COMMIT or ROLLBACK.
  • Waits for TX in mode 4 is also possible if the session is waiting due to shared bitmap index fragment. Bitmap indexes index key values and a range of rowids. Each entry in a bitmap index can cover many rows in the actual table. If two sessions want to update rows covered by the same bitmap index fragment, then the second session waits for the first transaction to either COMMIT or ROLLBACK by waiting for the TX lock in mode 4. This type of TX enqueue wait corresponds to the wait event enq: TX - row lock contention.
  • Waits for TX in Mode 4 can also occur waiting for a PREPARED transaction.
  • Waits for TX in mode 4 also occur when a transaction inserting a row in an index has to wait for the end of an index block split being done by another transaction. This type of TX enqueue wait corresponds to the wait event enq: TX - index contention.

The most common scenario for enq: TX – row lock contention is the one the first bullet describes: a session updates or deletes a set of rows and holds the locks for those rows, and another session then tries to run another DML (update or delete) against one or more of the same rows. But our locking scenario is not related to updates or deletes.
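For comparison, that classic mode 6 case is easy to sketch (made-up column values against the PRODUCT table):

    -- Session 1: takes the row lock and does not commit
    update PRODUCT set CHASSISNO = '1001' where ID = 42;

    -- Session 2: hangs on enq: TX - row lock contention (mode 6)
    -- until session 1 commits or rolls back
    update PRODUCT set CHASSISNO = '1002' where ID = 42;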

Reading on, the bullet mentioning inserts and their relation to unique keys is more interesting. Let us look at the table PRODUCT.

[Screenshot lde6: constraints and indexes on the PRODUCT table]

We have one primary key constraint and two unique constraints, all supported by indexes.
Our blocked session waits on another insert that has not yet been committed or rolled back. Since the session that blocks ours has been idle for more than a day, we can assume it will not commit or roll back that transaction. I wanted to verify my findings: looking at session 105 as we did previously, under the label Application we have Open Cursors; selecting that takes us to a screen showing all open cursors for session 105.

[Screenshot lde4: open cursors for session 105]

Here we can see the same insert SQL that our blocked session is running.
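This matches the mode 4 unique-key bullet from the documentation, and it is easy to reproduce with two sessions (a sketch, assuming a simplified PRODUCT with a primary key on ID):

    -- Session 1 (yesterday's session 105): inserts, never commits
    insert into PRODUCT (ID, VIN) values (1, 'VIN-1');

    -- Session 2 (today): cannot know whether ORA-00001 applies until
    -- session 1 commits or rolls back, so it hangs on
    -- enq: TX - row lock contention in mode 4
    insert into PRODUCT (ID, VIN) values (1, 'VIN-1');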

Conclusion

This is what I think happened: yesterday the developer ran the application and performed the same activity he did today. The key issue is that he inserted exactly the same data today as he did yesterday; had he used other values for the primary/unique keys today, this issue would not have occurred. Having issues with Java, he killed the application, but the session stayed connected with an ongoing transaction that had not been committed or rolled back.

How to resolve

We killed session 105. Note that killing the session and rolling back the transaction can introduce logical corruption in your application; you might need to spend more time on clean-up activity to make sure your data is correct.
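For reference, this is the standard way to do it from SQL*Plus (the serial# comes from v$session):

    -- find the serial# for sid 105, then kill the session;
    -- the kill rolls back its open transaction
    select serial# from v$session where sid = 105;
    alter system kill session '105,&serial#' immediate;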

Our customer uses WebSphere as application server, and I learned yesterday that you can set a parameter in WebSphere called agedtimeout. You set this to a number of seconds, for example 600.
If a session has been idle for 600 seconds, the connection is terminated (active transactions are rolled back) and the connection is returned to the connection pool manager. You need to be careful not to set the value too low: it is normal for a session to have some idle time, and you do not want to terminate sessions that are active. Using this feature also makes it harder to handle the logical corruption explained above, so think carefully before implementing it. I do not know the name of the equivalent parameter for other application servers, but I am sure it can easily be found.

I am also looking into Dead Connection Detection (DCD), but I need to evaluate it more before I can say for sure that it is usable here; I would prefer to use the feature in the application server if available. If you are experiencing frequent scenarios like the one I explain here, you probably need to spend more time analyzing your application, since issues like these most often come from the application. More on this topic will come later.
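For reference, DCD itself is controlled by a single parameter in the server-side sqlnet.ora, SQLNET.EXPIRE_TIME (minutes between probes):

    # server-side sqlnet.ora: probe each client every 10 minutes;
    # connections that no longer respond are marked dead and cleaned up
    SQLNET.EXPIRE_TIME = 10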

ASH Report

ASH reports are often used when analyzing performance issues, and they are a very useful tool. I want to show you how my ASH report looks for the issue above.
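If you want to generate one yourself, the database ships a script for it (run from SQL*Plus as a privileged user; it prompts for the time window and report format):

    -- generate an ASH report interactively
    @?/rdbms/admin/ashrpt.sql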

[Screenshot lde7: ASH report – Top User Events]

As expected, our top wait event is enq: TX – row lock contention. It accounts for 97.4% of the sampled time.

[Screenshot lde8: ASH report – Top SQL]

We have one insert statement that accounts for 97.40% of the total activity; note how that matches the percentage of the top wait event.

[Screenshot lde9: ASH report – Top Sessions]

Session 173 is our top session; it spends 97.40% of its sampled time on the row lock wait. We can also see the user, LDE_DEV4, which is the same username reported by the developer.

[Screenshot lde10: ASH report – Top Blocking Sessions]

And as the top blocking session we have the session causing it all, session 105. Note the label Event Caused: it tells us that session 105 is the cause of the enq: TX – row lock contention that session 173 suffers from. Grid Control gives you more information, but the ASH report is all you need to fix the issue.

References:

Database Performance Tuning Guide 11.2.0
Dead Connection Detection
Oracle Net 12c: New Implementation of Dead Connection Detection 


6 thoughts on “enq: TX – row lock contention”

  1. Hi,
    When you search for ways to close idle sessions, you can investigate the PROFILE limit IDLE_TIME. Of course it works quite differently from DCD, as the profile limit disconnects healthy sessions as well, just for their inactivity.
    In case you opt for DCD you might want to read
    Oracle Net 12c: New Implementation of Dead Connection Detection (DCD) (Doc ID 1591874.1)
    Martin

  2. Hi,
    Interesting to see that the wait event interface has wrongly interpreted the situation by labeling the enqueue as ‘row lock contention’. It is a mode=4 TX, because p1=1415053316 = 0x54580004, and p1 is the ASCII of the lock type plus the lock mode.
    Regards,
    Franck.

    • Hello Franck, I am not sure why Oracle chose to list the scenario under enq: TX – row lock contention; I guess this scenario is not very common. Nor am I sure what a better wait event for this scenario would be. At least it is documented.

      /Magnus

      • Oh yes. I was sure that there was another wait event for this, but the doc says it is ‘row lock’. So looking at the mode is still very important.
