OpenEdge Indexing Performance: ETIME, XREF & Query Optimization Explained

Complex database queries in many cases run for an extensive period of time if the query and database are not well optimized. Tracking query performance can be continually done using Progress OpenEdge’s functionalities. We’ll take a look over them in this article.

Database optimizations

In the building phase of the database, it’s mandatory to have a thorough understanding of business entities and database normalization forms in order to create a robust and redundancy free database.

Choosing the proper clustered index for each database table can be done following the next guidelines:

Choose a frequently searched index for query resolution, particularly by large queries, or as a foreign key on join relations (JOIN), or for sorting (GROUP BY, ORDER BY, etc.) and/or ROWID positioning inside the DataServer.
For composite clustered keys, the first component column should be the most searched column of the key.
Choose an index with a high degree of uniqueness.
Choose an index that can be utilized in range queries to produce bracketed results.
Choose an index that is often accessed sequentially. Rows are stored in the sorted order of this key, and the clustered index can act as a “covering” index, allowing data to be read out of the index itself rather than retrieving additional pages in a lookup operation to get the desired data values.
Choose an index that is monotonic, incremental, and unique, such as an IDENTITY column, particularly if it is also meaningful to the application table content.
The primary key is a good clustered index candidate.

Indexes and OE query optimizer index selection for queries

A database index operates just like the index at the back of a book: it tells you exactly where to find the specific information you are looking for without making you read through every single page. You can use a single column to define a simple index, or a combination of columns to define a composite or compound index. To decide which columns to use, you first need to determine how the data in the table is accessed. If users frequently look up customers by last name, then the last name is a good choice for an index. It is typical to base indexes on primary keys (columns that contain unique information). For example in sports2000 we have single component and multi component indexes:

CustNum index — Flags: Primary Unique — Fields: CustNum
CountryPost index — Flags: None — Fields: Country, PostalCode
Comments index — Flags: Word Index — Fields: Comments

An index has the following advantages:

Faster row search and retrieval. It is more efficient to locate a row by searching a sorted index table than by searching an unsorted table.
In an application written with OpenEdge ABL, records are ordered automatically to support your particular data access patterns. Regardless of how you change the table, when you browse or print it, the rows appear in indexed order instead of their stored physical order on disk.
When you define an index as unique, each row is unique. This ensures that duplicate rows do not occur. A unique index can contain nulls. However, a primary key, although unique, cannot contain nulls.
A combination of columns can be indexed together to allow you to sort a table in several different ways simultaneously (for example, sort the Projects table by a combined employee and date column).
Efficient access to data in multiple related tables.

There are also disadvantages when it comes to indexing:

Write overhead — every CREATE, UPDATE, and DELETE on an indexed field forces the index to be updated as well. The more indexes a table has, the slower writes become.
Storage cost — each index is stored as a separate sorted structure on disk. Word indexes are especially large since every individual word gets its own entry.
Index degeneration — as records are inserted and deleted over time, index blocks fragment and become partially empty, degrading read performance until the index is rebuilt.
False confidence — an index appearing in the XREF does not mean the query is efficient. A wide bracket or a missing leading component can make an indexed query nearly as costly as a full table scan.

Index bracketing

When ABL executes a query, it does not simply “use an index”; it sets a bracket on it. A bracket is the start point and end point ABL establishes on an index to define exactly which records to read. Think of an index as a sorted list. A bracket is ABL telling the database engine:

“Start here, stop here; read nothing outside this range.”

The quality of that bracket determines the real cost of a query:

Tight bracket — ABL jumps directly to the matching range and reads only what it needs. This is the goal.
No bracket — ABL starts at the beginning of the index and scans until the end, reading every entry regardless of whether it matches. This is what WHOLE-INDEX means in the XREF file.

Before diving into examples to understand the underlying rules the OE query optimizer uses to choose indexes, let’s introduce two OE functionalities that work well when trying to improve query performance:

XREF compile option

The XREF option of the COMPILE statement provides a variety of static data about an ABL procedure, including:

Indexes used for a particular database query
Database tables accessed
Other procedures that are RUN
Text strings that exist in the procedure
Code pages, include files, and shared variables used
Events published, subscribed, and unsubscribed to

Reading the XREF output

Each line follows this structure:

<source-file> <ref-file> <line> <reference-type> [details]

Take this procedure as an example:

PROCEDURE UseIndexOverride:

    DEFINE BUFFER bufCustomer FOR Customer.
    FIND FIRST bufCustomer NO-LOCK
         WHERE bufCustomer.Name = "Harrison"
         USE-INDEX SalesRep.
END PROCEDURE.

The XREF it produces:

exampleXREF.p  exampleXREF.p  3  REFERENCE  sports2000.Customer
exampleXREF.p  exampleXREF.p  5  ACCESS     sports2000.Customer Name
exampleXREF.p  exampleXREF.p  5  SEARCH     sports2000.Customer SalesRep WHOLE-INDEX
exampleXREF.p  exampleXREF.p  9  PROCEDURE  UseIndexOverride

ETIME

ETIME returns, as an INT64 value, the time (in milliseconds) elapsed since the ABL session began or since ETIME (elapsed time) was last set to 0. To set ETIME to 0, pass it a positive logical value, such as YES or TRUE.

Now, let’s look at index selection in action: how the OpenEdge optimizer chooses its path, and how you can track its performance.

1. FOR EACH with WHERE using AND and OR

DEFINE BUFFER bufCustomer FOR Customer.
FOR EACH bufCustomer NO-LOCK
    WHERE (bufCustomer.Comments CONTAINS "amount" AND bufCustomer.Name = "John")
    OR    (bufCustomer.Country = "USA" AND bufCustomer.PostalCode = "21000"):
END.

First subexpression: Comments CONTAINS “amount” AND Name = “John”

CONTAINS can use the word index on Comments → can bracket.
Name = “John” → equality match on the leading component of Name → can bracket.

Second subexpression: Country = “USA” AND PostalCode = “21000”

Country is the leading component of CountryPost index → can bracket.
PostalCode is the second component, adding depth to the bracket.

Both expressions match an index, so the query uses the indexes Comments, Name and CountryPost. If any of the expressions did not match an index, a full table scan (WHOLE-INDEX) is triggered.

The XREF file contains these lines that show what indexes are attached for this query at compile time:

SEARCH  sports2000.Customer Comments
SEARCH  sports2000.Customer Name
SEARCH  sports2000.Customer CountryPost

2. FIND FIRST with tie index selection

DEFINE BUFFER bufCustomer FOR Customer.

FIND FIRST bufCustomer NO-LOCK
     WHERE bufCustomer.Country  = "USA"
     AND   bufCustomer.SalesRep = "DOS"
NO-ERROR.

FIND can only use a single index.
Country is the leading component of CountryPost index (1 component, equality).
SalesRep is the leading component of the SalesRep index (1 component, equality).
Both tie at 1 bracketed component; neither is the primary index.
“CountryPost” precedes “SalesRep” alphabetically → CountryPost wins.

For single index selection, if there is a tie at any level between two or more indices, the tie-breakers used are:

If one of the indexes is the primary index for the table, use that.
Otherwise, use the index whose name comes first alphabetically.

SalesRep becomes a post-read filter.

The XREF file contains this line that shows what index is attached for this query at compile time:

SEARCH  sports2000.Customer CountryPost

3. Poor bracketing vs proper bracketing

DEFINE BUFFER bufCustomer for sports2000.Customer.

// Example 1: poor bracketing
ETIME(TRUE).
FOR EACH bufCustomer NO-LOCK
    WHERE bufCustomer.PostalCode = "29620":
END.
DISPLAY ETIME LABEL "First time".

// Example 2: proper bracketing
ETIME(TRUE).
FOR EACH bufCustomer NO-LOCK
    WHERE bufCustomer.Country    = "USA"
    AND   bufCustomer.PostalCode = "29620":
END.
DISPLAY ETIME LABEL "Proper bracketing".

The first example showcases poor bracketing: PostalCode is the second component of the CountryPost index, so the index is not used. This can be seen in the XREF file:

SEARCH sports2000.Customer CustNum WHOLE-INDEX

The second example is a proper bracketing example, where the bracket limits are set to:

Lower bracket: Country = “USA”, PostalCode = “29620”
Upper bracket: Country = “USA”, PostalCode = “29620”

And the index used is shown in the XREF file:

SEARCH sports2000.Customer CountryPost

This bracketing behaviour is visible at runtime using the display of ETIME:

The first example took ~310 ms because of the poor bracketing, and the second one took ~1 ms because of proper bracketing.

Conclusions

Tracking query performance in OpenEdge comes down to three tools used together: XREF tells you which index the compiler selected at compile time, ETIME measures the real cost at runtime, and understanding bracketing explains the gap between the two. A query that looks correct can silently scan thousands of records simply because one WHERE clause component is missing or misplaced. The earlier you catch that with XREF, the less expensive it is to fix.