OpenEdge Database Performance: Before and After Imaging, Background Writers, BI Clusters, and Checkpoints

This article is aimed at people running OpenEdge databases who want practical, low-risk, and fast performance improvements for their backend. The database is an essential part of backend performance, especially since OpenEdge makes it so easy to query your database.

With OpenEdge 12, database performance improved quite a bit out of the box thanks to new features such as server-side joins and changes to default parameter values, but there is still more performance to squeeze out with a few quick changes.


Before/After Imaging


These are two techniques that the OpenEdge database uses for data integrity and recovery. Before imaging (BI) is always enabled, while after imaging (AI) can be disabled.

Note: If at this point you notice you are not using after imaging, strongly consider enabling it. While there is some overhead associated with after imaging, it is minimal and the benefits far outweigh any slight loss in performance.
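As a rough illustration, enabling after imaging on a Unix system looks something like the sketch below. The database name (mydb), the paths, and the add_ai.st structure file are placeholders, so check the Progress documentation for the exact steps for your release:

  proshut mydb -by                 # shut the database down
  prostrct add mydb add_ai.st      # add the AI extents described in add_ai.st
  probkup mydb /backups/mydb.bak   # a full backup is required before AI can be enabled
  rfutil mydb -C aimage begin      # switch after imaging on
  proserve mydb                    # restart the broker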


Before Imaging


Before imaging is used for the following:

  1. Transaction rollbacks
  2. Undoing incomplete changes in the event of a crash
  3. Recovering the database to a consistent state after a crash


After Imaging


After imaging is used for the following:

  1. Database recovery after storage media failure
  2. Recovery from severe data corruption
  3. Recovering a database to a specific point in time (a recovery sketch follows this list)
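For illustration, recovering from a media failure with roll-forward might look roughly like the following. The database name, backup file, and AI extent names are placeholders, and the endtime timestamp format can vary between releases, so treat this as a sketch rather than a recipe:

  prorest mydb /backups/mydb.bak            # restore the most recent full backup
  rfutil mydb -C roll forward -a mydb.a1    # apply each archived AI extent in order
  rfutil mydb -C roll forward endtime 2024:05:01:12:00:00 -a mydb.a2
                                            # stop at a specific point in time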


Overall Transaction Process


To better explain how BI and AI work, here is a simplified list of steps that the database takes when changing a record. This is in no way complete, but it is more than enough to get a basic understanding of the process.


Finding the Record


At this stage we need to find the record we want to modify. It could already be loaded inside the database buffer pool (whose size is controlled by the -B parameter), so we first look for it in memory.

If it is not already loaded, the corresponding block is found on the disk, and the entire block is loaded into the database buffer.
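For reference, the buffer pool size is set when the broker starts. A minimal sketch with a placeholder database name and a value you would tune to your own workload:

  proserve mydb -B 500000    # 500,000 database blocks; about 3.8GB with 8KB blocks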


Writing the Before Image


Now we write the state of the data, as it was before any changes, to a BI buffer. This will allow us to undo the transaction or recover from a crash if needed. The BI buffer will then be written to the currently used BI cluster on disk.


Making the Changes


Now that the BI is written, we can make changes to the record. These changes are made in memory and will only be written to disk at a later point.


Writing the After Image


Now that the transaction is complete, we write the after image to an AI buffer. This contains the state of the record after the transaction finished. This AI buffer will then be written to the currently used AI extent.


Writing to Database Data Files


This is an asynchronous process, for performance reasons. Once a change is made, the modified data in the database buffer will eventually make it to the database data files on disk (the .d[N] extents). There are multiple ways this can happen, but in most situations it is either done during a checkpoint or when an asynchronous page writer writes the buffer to disk ahead of a checkpoint.


What Happens During a Crash


Because we now have both the BI and AI, we can easily recover from a crash. We can roll back uncommitted transactions, and in the case of media failure (i.e., a disk no longer working) we can roll forward a backup of our database using the AI extents.


Background Writers


Background writers are processes that take over some disk operations from the database server processes, allowing the servers to focus on handling database transactions, thereby improving performance.

There are three background writers that you should be using (startup commands are sketched after the list):

  1. AIW – after image writer
  2. BIW – before image writer
  3. APW – asynchronous page writer
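Starting them is a one-liner each. A minimal sketch with a placeholder database name; each command attaches one writer process to a running database:

  probiw mydb    # start the before-image writer (one per database)
  proaiw mydb    # start the after-image writer (one per database)
  proapw mydb    # start an asynchronous page writer (repeat for each APW you want)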


AIW & BIW


These writers handle the disk writes for before imaging and after imaging. You can have only one of each per database, and they should always be enabled. These processes take the BI and AI buffers and write them to disk.


APWs


Unlike the AIW and BIW, you can run multiple instances of this process. It is similar in spirit to the AIW and BIW, but instead of writing BI/AI data it writes the changed database buffers to the database data files on disk.

It does this by scanning the database buffer pool for "dirty" buffers and writing them to disk. Because the changes associated with a BI cluster can be written to disk ahead of the checkpoint, this massively speeds up checkpointing, and it also spreads a potential spike in disk IO across a longer time period. A side benefit is that the database always has a supply of clean buffers it can use when new data needs to be loaded from disk.


Checkpoints


Now that we have a basic understanding of what is involved in the transaction process, we need to understand how checkpointing works. Because dirty buffers are not written to the database synchronously, the currently used BI cluster on disk will eventually fill up as more changes are made. When this happens, the changed database buffers associated with that BI cluster need to be written to disk, into the database data files.

Once all of those changes are on disk, the BI cluster is ready to be reused. By default, a four-cluster chain is used: the database rotates through the BI clusters as they are filled with BI notes, have their associated changes written to disk, and become available for reuse.

Note: In old versions of Progress (pre 6.3), this would immediately stop all write operations until the changes associated with the BI cluster had been written to disk. Now we have until the next cluster in the chain needs to be reused, at which point the same stall of all write operations takes place.


BI Cluster Size


Why Does the Cluster Size Matter


Since the checkpoint process is a key part of writing changes to disk, the size of the BI cluster can play a major role in the database’s write performance. We previously talked about background writers and specifically mentioned the APW as being particularly important for the checkpoint process.

The one detail that needs to be highlighted is that APWs need time to write dirty buffers to disk. This is where the BI cluster size comes into play: it lets us make the BI clusters large enough that they do not fill up with BI notes before the APWs have had a chance to write the associated changes to the database data files.


Determining the Correct Cluster Size


To determine the correct size for your BI clusters, you will need to gather some data on your production database, specifically during times of high load. You can use promon to measure the current interval between checkpoints, then scale the BI cluster size up by comparing the current interval to your desired interval (ideally at least two minutes).

For example, if your checkpoints are 30 seconds apart with a 4MB BI cluster, you would need to increase the cluster size to 120s / 30s * 4MB = 16MB. Depending on how much you write and how your disks perform, you may end up with much larger numbers.

For more details on how to use promon to set the cluster size, check out this article from Progress: https://community.progress.com/s/article/20566
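If you have not used promon before, the checkpoint history typically lives under its R&D menu. A rough sketch, noting that the exact menu numbering can vary between OpenEdge releases:

  promon mydb
  # at the main menu, type: R&D
  # then choose: 3 (Other Displays), then 4 (Checkpoints)
  # the screen lists recent checkpoints, their times, and how many buffers were flushed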


Changing the Cluster Size


The first thing that needs to be mentioned is that this change can only be made while the database is shut down. Once it is down, the process is as simple as running a command that looks like this:

  proutil <dbname> -C truncate bi -biblocksize 16 -bi 262128

The above command changes the BI block size to 16KB and the BI cluster size to 262128KB (roughly 256MB). Please note that 262128KB is the maximum size for a BI cluster, while 16KB is the maximum size for a BI block.
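Putting it together, a typical change might look like the sketch below, using a placeholder database name and a more modest 16MB cluster size. Note that if after imaging is enabled, you may need to disable and re-enable it around the truncate, so check the proutil documentation for your release:

  proshut mydb -by                                        # shut the database down
  proutil mydb -C truncate bi -biblocksize 16 -bi 16384   # 16KB blocks, 16MB clusters
  proserve mydb                                           # restart the broker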


Separate Disk Storage


Since BI clusters can easily become a performance bottleneck, they would ideally live on their own disk drive. While this is not feasible in all environments, if you can move the BI extent to its own dedicated drive, you will see an immediate performance improvement.

Because BI clusters are written to with every transaction, moving this disk IO away from the disk that handles your data files will free up more disk IO for disk reads, and more importantly for your APWs so they can write the dirty buffers to disk faster.

Note: The same logic applies to your AI extents as well. In addition to freeing up disk IO, AI extents are crucial for database recovery, so there is even more reason for them to live on a separate disk.
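Relocating an extent is done with the database offline. A minimal sketch, assuming a Unix system, a placeholder database mydb, and a dedicated drive mounted at /fastdisk:

  proshut mydb -by                      # shut the database down
  mv /db/mydb.b1 /fastdisk/bi/mydb.b1   # physically move the BI extent
  # edit mydb.st so its 'b' line points at the new location, then:
  prostrct repair mydb mydb.st          # record the new path in the control area
  proserve mydb                         # restart the broker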


Conclusion


The Progress OpenEdge database is highly tunable as far as performance is concerned. A lot of performance issues can be solved by following some best practices and having a basic understanding of how things function under the hood.

For a database that sees a lot of write activity, you could easily improve performance by just doing the following:

  1. Moving the BI extent to a separate disk if possible
  2. Increasing the BI cluster size
  3. Making sure that all background writer processes are running


Author: Claudiu Spac, Tech Lead

He is a seasoned tech lead with a decade of experience at Wayfare, specializing in Progress OpenEdge. He has tackled diverse projects, from TV scheduling to warehouse logistics, with a strong emphasis on performance optimization and resolving intricate database challenges.
