A colleague showed me an interesting scenario, after a restore and start up of a database we noticed that we had a lot of work being performed, but no users logged on.
We could see a lot of parallel processes running as SYS.
The processes was waiting on Other. When we selected wait event Other, we noticed that we where waiting on “Wait For a Undo Record”.Smon was waiting for “Wait for stopper event to be increased”
On Wednesday that week the customer requested a restore of the database, at that point we had a hugh delete running.. The delete was on a table with 400 Million rows
and it has been running since Monday the same week.
If we have long running transactions (insert,delete,update) aborted by killing the session or aborting the instance,we must clean up the aborted transaction.
We can expect the clean up to take a long time. SMON is handling the roll back and will use parallel roll back when appropriate.
(when recovery work reaches a certain limit)
When the limit is reached smon will start parallel processes handling the roll back.For some transactions PQ slaves can start competing with each other about resources,
when that happens parallel transaction recovery can be slower then serial transaction recovery. At times the database appear to hang when doing parallel transaction recovery.
Parallel Rollback is handled by parameter fast_start_parallel_rollback, valid values
Parallel rollback is disabled
DOP can be maximum 2 * CPU_COUNT
DOP can be maximum 4 * CPU_COUNT
Note Do not change the parameter dynamically on a system with many active transaction, it can cause severe problems.
Rollback strategy for active transactions may change. Do this change with a restart of the database.
How to solve
This type of clean up jobs must be allowed, so the best way is to wait for it to finish.We can monitor the progress by
accessing the view v$fast_start_transactions.
We have some options to influence the performance by adjusting the parameter fast_start_parallel_rollback.
We can also temporarily postpone the clean up by using an event, but remember to find the time to allow
the clean up to finish.
Please read the notes listed below for details how to monitor and handle problems related to fast parallel transaction rollback.
- Database Appears to Hang Waits for “Wait for a undo record” and “Wait for stopper event to be increased” Due to Parallel Transaction Recover [ID 464246.1]
- Database Hangs Because SMON Is Taking 100% CPU Doing Transaction Recovery [ID 414242.1]
- Oracle® Database Reference