I have been working on an application for some time now; the plan is to move it to Oracle 12c. We started with 12.1, but as soon as 12.2 was released we focused on 12.2, and we also applied the latest bundle patch. This is a test system that uses the same data over and over, so no new data is added.
A little background: The application has a job running every hour calculating prices, in 12.1 when I was first involved the execution time was around 90 minutes.
In 11.2 which is the current version each calculation took around 10 minutes.
My first step was to revert all parameter changes carried over from 11g, basically starting with default values.
This alone had a huge impact: execution times were down to around 20 minutes. Additional tuning, recollecting stats and checking other possibilities such as adding indexes, brought the execution time in 12.1 below the 10 minutes we had in 11g.
After the move to 12.2 we are back at an execution time of 17 minutes, but that is basically related to one SQL (which behaves oddly; I had a case open with Oracle Support for it). This blog is about another issue I noticed around 1 am every night.
At that time we start other jobs as well, not only the price calculation job, and I noticed that we had some issues.
We can see that the DBWR is on top in the top session list. In the top SQL list the top SQL has a brown colour, listed as Configuration; to dig deeper we can click on the Configuration label on the right-hand side and get more information.
Now we can see it is related to free buffer waits. Let us look at an AWR report for that time; I chose the interval 1 am – 4 am.
Free buffer waits is the top wait event, accounting for 17.8% of DB time.
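Outside of an AWR report, you can get an instance-wide view of the same event from v$system_event. A minimal sketch (cumulative since instance startup; AVERAGE_WAIT is in centiseconds, hence the conversion):

```sql
-- Instance-wide totals for free buffer waits since startup
SELECT event,
       total_waits,
       time_waited_micro / 1e6 AS seconds_waited,
       average_wait * 10       AS avg_wait_ms
FROM   v$system_event
WHERE  event = 'free buffer waits';
```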
What are we waiting for when we see free buffer waits?
To be able to do DML on a block we need to bring it into the buffer cache, so we need a free buffer to use. It is the server process that searches for a free buffer. A buffer that has been modified has to be written to disk before it can be reused; a changed buffer is called a dirty buffer. It is the database writer (DBWR) that is responsible for writing dirty buffers to disk.
We have two types of dirty buffers, hot and cold, and only cold dirty buffers will be reused after DBWR has written them to disk. Hot dirty buffers will be written to disk but remain with the same data in the buffer cache, since they are popular and frequently used; this reduces physical IO. There is an LRU (Least Recently Used) list to keep track of whether a buffer is hot or cold. If the server process finds a cold, dirty buffer while searching for a free buffer, it puts it on the write list; DBWR will then write that buffer to disk the next time it writes out dirty buffers, and the buffer becomes a free buffer.
The server process scans the list up to a threshold (in 12c I believe it is _db_block_max_scan_pct, set to 40%). If it has not found a free buffer by the time it reaches the threshold, it posts DBWR to write out dirty blocks to disk, and while DBWR writes them our server process reports "free buffer waits". The server process waits a few milliseconds and then repeats the search, hopefully finding a free buffer this time.
There is a note:
Resolving Issues Where ‘free buffer waits’ Seen Waiting for Buffers to be Freed (Doc ID 1476046.1)
The note says:
There are 2 main scenarios where high waits for ‘free buffer waits’ occur. Firstly make sure that DBWR is able to write out the blocks quickly enough, there is information on this here. If write throughput is within acceptable thresholds then the other alternative is that the buffer cache is too small.
Use the following to address each scenario:
DBWR is not fast enough clearing dirty blocks.
- Check that the CPU is not saturated. A saturated CPU can amplify wait events, since a background process does not get enough CPU to make progress.
- Check for slow IO (poor data file write performance). Some file systems have poor write performance (writes take too long), which impacts DB writer's ability to keep enough clean buffers in the buffer cache.
- The DBWriter will achieve optimal throughput when asynchronous I/O is available. DBWriter may not be able to keep up with buffer demands if asynch I/O is not available. If your platform doesn’t support it, then adding multiple DBWriters can help divide the workload.
- Tune checkpoints, so that we can flush dirty buffers fast enough.
- Tune the SQL which can eliminate excess IO
Buffer cache is too small
- If the buffer cache is too small and filled with hot blocks, then sessions will be starved for free buffers (clean, cold blocks) and will spend too much time looking for free buffers and/or posting DBWR to write dirty blocks and make them free. Increase the DB_CACHE_SIZE parameter and monitor the effect of the change.
- Tune the SQL which can eliminate excess IO
Let's go through each scenario.
- CPU is saturated: if we don't have enough CPU resources available, DBWR has to wait to write out dirty blocks to disk. This is easy to detect using the AWR report, where CPU usage is listed:
It seems we have plenty of CPU resources available, so I doubt a lack of CPU resources is the reason.
- Slow I/O performance: I believe we can use the AWR report to get the DBWR write time.
(IOStat by Function summary) Looking at the DBWR row, we do no reading, which is expected since DBWR only writes. For the report duration DBWR wrote 45.4 GB with an average write time of 0.688 ms. I have to admit that I am not sure how to interpret this: is an average wait time of 0.688 ms fast or not? My first thought is that we should not have any wait time at all for DBWR. I looked at a few other 12c databases; some have waits, some don't. So I will hold off on evaluating the write performance for now.
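The same figures the AWR section is built from can also be read live from v$iostat_function. A sketch (WAIT_TIME is in milliseconds, and the numbers are cumulative since instance startup, not per interval):

```sql
-- Cumulative DBWR I/O and waits since instance startup
SELECT function_name,
       small_write_megabytes + large_write_megabytes AS write_mb,
       number_of_waits,
       wait_time AS wait_time_ms
FROM   v$iostat_function
WHERE  function_name = 'DBWR';
```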
- Asynchronous I/O: is asynch IO available? Use the note How To Check if Asynchronous I/O is Working On Linux (Doc ID 237299.1) to check.
In my case I got:
Reading the note, it indicates that asynch I/O is working on the server.
But is the database using it? We can use the view v$iostat_file; it has a column named ASYNCH_IO which shows whether asynch IO is enabled or not. You can run this SQL:
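The exact query from the original screenshot is not preserved here, but something along these lines shows the status per file:

```sql
-- Asynch IO status per database file (ASYNC_ON / ASYNC_OFF)
SELECT file_no, filetype_name, asynch_io
FROM   v$iostat_file
WHERE  filetype_name IN ('Data File', 'Temp File', 'Log File')
ORDER  BY filetype_name, file_no;
```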
Looking at the data files and temp files we can see ASYNC_OFF, so we are not using asynch IO. With asynch IO off, I start to believe the issue is related to disabled asynch IO. The next question is why it isn't enabled. Reading more about it tells me this.
There are two parameters controlling asynch IO in the database: filesystemio_options and disk_asynch_io.
disk_asynch_io is like a main switch for asynch IO; it defaults to TRUE and we should keep that value. When disk_asynch_io is TRUE, we can control asynch IO using filesystemio_options, which can have four values: none, asynch, directio and setall.
Checking our current values we have:
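The screenshot of the values is not reproduced here, but checking them is a simple query against v$parameter (or SHOW PARAMETER in SQL*Plus):

```sql
-- Current asynch IO related parameter settings
SELECT name, value
FROM   v$parameter
WHERE  name IN ('disk_asynch_io', 'filesystemio_options');
```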
You can read:
Init.ora Parameter “FILESYSTEMIO_OPTIONS” Reference Note (Doc ID 120697.1)
for more details. Going back to our settings: directio means we bypass the OS file system cache and read straight from disk. This setting can cause performance problems, because if we can't find the buffer in Oracle's buffer cache we initiate a physical read from disk. With this setting it is therefore often recommended to increase the buffer cache to reduce disk reads. There is just one little thing about our environment: we use DNFS.
DNFS always uses asynch IO (if supported by the OS, and it is, as we confirmed above), so our setting filesystemio_options=directio is not honored. It is more of a fallback setting: what to use if asynch IO does not work.
So despite our filesystemio_options setting, asynch IO should be used: Linux supports it and DNFS supports it. Can the alert log tell us more? At startup I can see this in the alert log:
Direct NFS: channel id  path [segotx1007] to filer [segotx1007] via local  is UP
which tells us that DNFS is used. You can also use the column ACCESS_METHOD in v$iostat_file to verify DNFS usage: if the value is DNFS_LIB you are using DNFS.
Looking at the alert log after the restart I can see frequent messages saying:
Direct NFS: please check that oradism is setuid
Let us search MOS for that.
I start with the latest Note Database Startup Failed with “Direct NFS: please check that oradism is setuid” (Doc ID 1430654.1)
It says that the reason for the message is:
– The ownership of $ORACLE_HOME/bin/oradism has been changed from root to oracle. Checking our system, I can see that this matches; we have it set to
– The solution is to :
#chown root:dba oradism
#chmod 4750 oradism
After doing that and restarting the database, I can now see ASYNC_ON for the data files and temp files:
2 temp files and 64 data files using asynch IO.
I didn't have the privileges to make these changes to oradism myself, so while waiting for asynch IO to be enabled I followed the list of options for tuning free buffer waits.
Normally I would make this change first and wait for the result, but being far back in the priority queue it took a while to get it enabled, and since this is a test environment I wanted to evaluate the other steps and see what impact they had.
- Tune Checkpoints, look at note:
Checkpoint Tuning and Troubleshooting Guide (Doc ID 147468.1)
A checkpoint synchronizes modified data blocks (dirty buffers) with the files on disk. We do have log file switch (checkpoint incomplete) in our AWR top 10 list (13%). If the checkpoint during a log file switch is delayed due to too few or too small log files, it will have an impact on DBWR: during a checkpoint DBWR writes dirty blocks to disk, and any delay in the checkpoint process impacts DBWR write performance. As listed in:
Note:Resolving Issues Where ‘log file switch (checkpoint incomplete)’
Waits Seen Due to Insufficient Redolog Groups (Doc ID 1476448.1)
We do log switches too frequently, though not by a lot; I have seen much worse numbers here. Oracle recommends about three per hour, and we do fourteen per hour.
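Besides AWR, you can count log switches per hour yourself from v$log_history. A sketch:

```sql
-- Log switches per hour over the last 24 hours
SELECT TO_CHAR(first_time, 'YYYY-MM-DD HH24') AS hour,
       COUNT(*)                               AS switches
FROM   v$log_history
WHERE  first_time > SYSDATE - 1
GROUP  BY TO_CHAR(first_time, 'YYYY-MM-DD HH24')
ORDER  BY hour;
```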
AWR tells us that:
Our current redo logs look like this:
We have three groups of 1 GB each. I will make them bigger, 4 GB, and add one more group. I'm sure this is not the reason for our free buffer waits, but it is good to make the adjustment when you see an issue, and these changes will not make performance worse. I didn't see any improvement in free buffer waits after changing the redo logs.
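Redo logs cannot be resized in place; the usual approach is to add new, larger groups and drop the old ones once they are inactive. A sketch, with hypothetical file paths:

```sql
-- Add four new 4 GB groups (paths are examples only)
ALTER DATABASE ADD LOGFILE GROUP 4 ('/u01/oradata/TESTDB/redo04.log') SIZE 4G;
ALTER DATABASE ADD LOGFILE GROUP 5 ('/u01/oradata/TESTDB/redo05.log') SIZE 4G;
ALTER DATABASE ADD LOGFILE GROUP 6 ('/u01/oradata/TESTDB/redo06.log') SIZE 4G;
ALTER DATABASE ADD LOGFILE GROUP 7 ('/u01/oradata/TESTDB/redo07.log') SIZE 4G;

-- Cycle past the old groups, then drop them once INACTIVE (check v$log)
ALTER SYSTEM SWITCH LOGFILE;
ALTER SYSTEM CHECKPOINT;
ALTER DATABASE DROP LOGFILE GROUP 1;
ALTER DATABASE DROP LOGFILE GROUP 2;
ALTER DATABASE DROP LOGFILE GROUP 3;
```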
- I increased the SGA from 10G to 15G; the buffer cache was raised from 8.6G to 13G. But it didn't change much with respect to the free buffer waits, which surprised me; I had expected at least some reduction.
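Growing the SGA is a parameter change; if the new target exceeds sga_max_size, a restart is needed. A sketch:

```sql
-- Raise the SGA target (scope=spfile plus restart if it exceeds sga_max_size)
ALTER SYSTEM SET sga_target = 15G SCOPE = SPFILE;
-- After restart, verify in SQL*Plus:
SHOW PARAMETER sga_target
```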
I was a bit puzzled about the SGA increase. I checked the current production environment and it was also 10G; comparing the performance, I thought my test environment was doing more physical reads than current production. The answer lies in the asynch IO issue: since we are not using asynch IO we fall back to directio (remember that we had filesystemio_options set to directio, which is only honored if we do not use asynch IO). Directio bypasses the OS cache and increases physical IO, and when it is used the recommendation is to increase the buffer cache.
- Tune top SQLs: from what I can recall I didn't see free buffer waits in release 12.1 (or 11g), so it is possible that we have bad plans in 12.2 causing so much IO that we run out of free buffers. Looking at my AWR report I can see one SQL sticking out. I have old AWR reports showing execution data for 8ju6387hzaaw0; comparing execution data between 12.1 and 12.2,
gets have increased from an average of 4703 in 12.1 to 53359 in 12.2, a more than tenfold increase, and with a high number of executions (the differences in executions are related to the report timing) this can contribute to our increased free buffer waits. After tuning the SQL above and some other SQLs, including the updates running while we see the free buffer waits, I couldn't see any major reduction in free buffer waits.
So what have I done so far ?
- Increased Redo Logs to reduce log switches
- Increased SGA
- Tuned top SQLs (they run pretty stably now)
- Enabled asynch io
So what about asynch IO? Well, after I got help enabling it on the database, I immediately noticed a huge improvement in the free buffer waits.
They are gone!
This is the same time window as my first report, between 1 am and 4 am. As you can see, there is no evidence of free buffer waits left.
Checking the DBWR statistics now that asynch IO is enabled, we get:
No waits at all. DBWR also wrote more data in the same window, 45G before and 55G now, without any waits. I am still not sure whether waits on DBWR in this view indicate bad write performance for DBWR; if you want to share your thoughts, please feel free to do so.
Note also that the log file switch (checkpoint incomplete) waits are gone after increasing the redo logs and adding one more group. We are now down to 5 switches per hour.
Looking back on the steps I performed, adding redo logs, tuning SQL, adding memory to the SGA and lastly checking asynch IO, with the results available I would recommend doing them in another order:
- Check asynch IO and make sure it is enabled. It has the biggest impact on DBWR performance.
- Tune sql, it can also make a big difference if you have really poor sql.
- Add memory to your SGA and make sure most of it goes to the buffer cache.
- Add redo log group/change size.
After enabling asynch IO my price calculation job ran even better. I also fixed the oddly behaving SQL, which seems to be caused by:
Bug 26241759 – CBO BYPASSES OR EXPANSION LEADING TO SUB-OPTIMAL PLAN UTILIZING FTS
Thanks for reading !
Craig Shallahamer – Oracle Performance Firefighting – a very good book in my opinion
Resolving Issues Where ‘free buffer waits’ Seen Waiting for Buffers to be Freed
(Doc ID 1476046.1)
How To Check if Asynchronous I/O is Working On Linux (Doc ID 237299.1)
Init.ora Parameter “FILESYSTEMIO_OPTIONS” Reference Note (Doc ID 120697.1)
Direct NFS: please check that oradism is setuid (Doc ID 1430654.1)