9.4 Metadata Caching and the Cache Manager

This section looks specifically at the caching of metadata for RWDBTable and RWDBStoredProc. Three main classes are involved in metadata caching.

RWDBCacheManager. Encapsulates the interface of a cache manager that stores and retrieves metadata for tables and stored procedures.

RWDBTableEntry. Represents the cacheable metadata for an RWDBTable. Its methods allow you to set, retrieve, check the existence of, and clear the cacheable metadata. It also supplies operators for persisting and restoring the metadata.

RWDBStoredProcEntry. Represents the cacheable metadata for an RWDBStoredProc, and provides methods comparable to those of RWDBTableEntry.

Before getting started, we define two terms to avoid confusion.

Local cache. Some cacheable data is stored in local variables in the RWDBTable or RWDBStoredProc object it is associated with. We will refer to this data as being in the local cache.

Global cache. The data stored in the RWDBCacheManager-derived instance associated with a particular RWDBDatabase instance. We will refer to this data as being in the global cache. Note that your application could have multiple RWDBDatabase instances, each with its own “global” cache, so the cache is not truly global; the term always refers to the cache for one particular RWDBDatabase instance.

The remainder of this section covers the following topics:

Error Handling in the Cache Manager. How errors that may occur within the cache manager are handled by the DB Interface Module.

Example. An extended example demonstrating use of the Rogue Wave in-memory cache manager implementation and how it affects data retrieval times.

This section describes the types of data that are cacheable, how and when the data is obtained, and the methods that participate in metadata caching.

Here is a list of the types of data that are cached in the cache manager:

To simply use a cache manager, you do not need to know anything about the cache manager methods. The methods involved in metadata caching use the cache manager internally in the interest of efficiency. Here is what a method does when some operation requires data that is potentially cached:

Checks the in-object variable (local cache) that holds the data, if there is one. Not all cacheable data has a local cache variable.

If not found locally, checks the global cache. If found in the global cache, uses that data without querying the database, and sets the local cache variable if there is one.

If not found in the global cache, queries the database, and then sets the data in the global cache and the local cache variable, if there is one.
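
The lookup order above can be sketched as follows. This is a minimal illustration and not SourcePro code: Table, GlobalCache, and Source are hypothetical stand-ins, the "database query" is simulated by building a string, and a null GlobalCache pointer models the case where no cache manager is installed.

```cpp
#include <cassert>
#include <map>
#include <string>

// Hypothetical stand-ins for illustration; none of these are SourcePro types.
typedef std::map<std::string, std::string> GlobalCache;  // table name -> schema

enum class Source { Local, Global, Database };  // which tier answered the request

class Table {
public:
    // global may be null, which models "no cache manager installed".
    Table(const std::string& name, GlobalCache* global)
        : name_(name), global_(global), hasLocal_(false) {
        assert(!name.empty());  // the name doubles as the cache key
    }

    // Returns the schema and reports, through from, where it was found.
    const std::string& schema(Source& from) {
        if (hasLocal_) {                          // 1. local cache variable
            from = Source::Local;
            return local_;
        }
        if (global_ != nullptr) {                 // 2. global cache
            auto it = global_->find(name_);
            if (it != global_->end()) {
                from = Source::Global;
                return remember(it->second);
            }
        }
        from = Source::Database;                  // 3. database round trip
        const std::string fetched = "schema-of-" + name_;  // simulated query
        if (global_ != nullptr) (*global_)[name_] = fetched;
        return remember(fetched);
    }

private:
    // Stores the value in the local cache variable and returns it.
    const std::string& remember(const std::string& value) {
        local_ = value;
        hasLocal_ = true;
        return local_;
    }

    std::string name_;
    GlobalCache* global_;
    bool hasLocal_;
    std::string local_;
};
```

In this model, a second Table object for the same table name is answered from the global cache, while repeated calls on the same object are answered from its local variable.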

For complete information on the metadata caching behavior of methods in RWDBTable and RWDBStoredProc, see these class entries in the SourcePro C++ API Reference Guide.

The cache manager is represented by the base class RWDBCacheManager. Rogue Wave provides an implementation of the cache manager as an in-memory object. The Rogue Wave implementation is represented by the class RWDBInMemoryCacheManager. You can create your own cache manager by deriving from RWDBCacheManager.

There is no cache manager installed by default. You must explicitly set one on an RWDBDatabase instance. The methods that cache metadata check for a cache manager and, if none is found, make no attempt to use the cache.

To use the Rogue Wave in-memory cache manager, set the cache manager in an RWDBDatabase instance:

//3 Register the cache manager with the RWDBDatabase instance. The return value is a pointer to the previous cache manager, which we save in a variable so it can be restored if necessary. If there is no previous cache manager, NULL is returned.

//4 Restore the previous cache manager. At this point, it would be safe to destroy the cm cache manager (see the discussion of the lifetime requirement below).

To use your own cache manager implementation, simply instantiate it in place of cm in the above code.
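
Because installing a cache manager returns the previous one, the save-and-restore steps lend themselves to a scope guard. The sketch below is one possible way to package that pattern. It is an assumption-laden illustration: Database, CacheManager, InMemoryCacheManager, and ScopedCacheManager are hypothetical stand-ins rather than the SourcePro classes, and the no-argument cacheManager() getter exists only so the demo can inspect the state.

```cpp
#include <cassert>

// Hypothetical stand-ins for illustration; not the SourcePro classes.
struct CacheManager { virtual ~CacheManager() {} };
struct InMemoryCacheManager : CacheManager {};

class Database {
public:
    // Installs mgr and returns the previously installed manager, or nullptr.
    CacheManager* cacheManager(CacheManager* mgr) {
        CacheManager* previous = current_;
        current_ = mgr;
        return previous;
    }
    // Assumed getter, used here only to observe which manager is active.
    CacheManager* cacheManager() const { return current_; }
private:
    CacheManager* current_ = nullptr;
};

// Installs a manager for the lifetime of the guard, then restores the old one.
class ScopedCacheManager {
public:
    ScopedCacheManager(Database& db, CacheManager& mgr)
        : db_(db), previous_(db.cacheManager(&mgr)) {}
    ~ScopedCacheManager() { db_.cacheManager(previous_); }
    ScopedCacheManager(const ScopedCacheManager&) = delete;
    ScopedCacheManager& operator=(const ScopedCacheManager&) = delete;
private:
    Database& db_;
    CacheManager* previous_;
};

void demoScopedInstall() {
    Database db;
    InMemoryCacheManager cm;  // declared before the guard, so it outlives the guard
    {
        ScopedCacheManager guard(db, cm);
        assert(db.cacheManager() == &cm);      // active inside the scope
    }
    assert(db.cacheManager() == nullptr);      // previous state restored
}
```

Declaring the cache manager before the guard means the guard is destroyed first, so the manager is always uninstalled before it goes out of scope.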

The above code installs an operational cache manager as a global object on the RWDBDatabase instance. The methods that cache metadata obtain and use this cache manager for their caching behavior. If your application creates two RWDBDatabase instances and you want caching for both, you must create two cache manager instances, one for each RWDBDatabase instance. That is, cache managers must not be shared between RWDBDatabase instances.

The fact that the cache manager is associated with a given RWDBDatabase instance has a couple of implications.

You must ensure that the cache manager continues to exist as long as the RWDBDatabase instance, or any object produced by the RWDBDatabase instance, exists and continues to reference it. Otherwise your application may crash or show undefined behavior.

Due to its association with an RWDBDatabase instance, the cache manager is potentially accessible by multiple threads. Therefore, any function that changes the data in the cache must guarantee exclusive access to the data. The Rogue Wave methods involved in metadata caching obtain a lock on the RWDBDatabase instance before changing any data in its cache manager. If your application manipulates the cache directly, it must do the same:
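
As a rough illustration of the locking discipline, the sketch below takes a database-level lock before every change to cached data. It assumes a hypothetical Database struct in which a std::mutex stands in for the lock on the RWDBDatabase instance and a std::map stands in for the cache manager's data; it does not show the SourcePro locking calls themselves.

```cpp
#include <cassert>
#include <map>
#include <mutex>
#include <string>

// Hypothetical stand-in: a database object that owns both the lock and the cache.
struct Database {
    std::mutex lock;                           // stands in for the database-level lock
    std::map<std::string, std::string> cache;  // stands in for the cache manager's data
};

// Replaces the cached schema for one table. The guard holds the database lock
// for the whole update and releases it on every exit path, including exceptions.
void updateCachedSchema(Database& db, const std::string& table,
                        const std::string& schema) {
    assert(!table.empty());
    std::lock_guard<std::mutex> guard(db.lock);
    db.cache[table] = schema;
}

// Removes a table's cached data under the same lock.
// Returns true if an entry existed.
bool dropCachedSchema(Database& db, const std::string& table) {
    std::lock_guard<std::mutex> guard(db.lock);
    return db.cache.erase(table) == 1;
}
```

Using a guard object rather than paired lock and unlock calls keeps the lock from being left held if the update throws.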

If an error occurs during a caching operation, it is the responsibility of the cache manager to throw an exception that describes the error. If no exception is thrown, the caching operation is considered successful, even if an empty cache entry is returned.

In cache managers derived from RWDBCacheManager, if one of the member functions throws an exception, the following occurs:

A copy of the calling object’s RWDBStatus object is created. The calling object’s own RWDBStatus object is not changed.

If an exception of type RWxmsg or std::exception occurs, the message from the exception is set as the message in the RWDBStatus object.

If an error handler is installed, the above activity triggers the handler.
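
The sequence above can be modeled in a few lines. The sketch assumes a hypothetical Status struct with an optional handler callback and a helper named guardedCacheCall(); neither is part of SourcePro. It handles only std::exception, so exceptions of any other type propagate to the caller in this model.

```cpp
#include <cassert>
#include <functional>
#include <stdexcept>
#include <string>

// Hypothetical stand-in for a status object with an optional error handler.
struct Status {
    bool ok = true;
    std::string message;
    std::function<void(const Status&)> handler;  // empty when none is installed
};

// Runs a cache operation on behalf of an object whose status is callerStatus.
// On a std::exception, a copy of the status is populated with the exception
// message and passed to the handler; callerStatus itself is never modified.
template <class Op>
void guardedCacheCall(const Status& callerStatus, Op op) {
    try {
        op();
    } catch (const std::exception& e) {
        Status copy = callerStatus;
        copy.ok = false;
        copy.message = e.what();
        if (copy.handler) copy.handler(copy);
    }
}

// Returns the message seen by the handler when the cache operation throws.
std::string captureCacheError() {
    std::string seen;
    Status caller;
    caller.handler = [&seen](const Status& s) { seen = s.message; };
    guardedCacheCall(caller, [] { throw std::runtime_error("cache unavailable"); });
    assert(caller.ok && caller.message.empty());  // caller's own status is unchanged
    return seen;
}
```

Because only the copy carries the error, code that inspects the calling object's status afterward sees no failure; in this model the handler is the only place the error is visible.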

There is always a possibility that the data held in the cache may become inconsistent with the actual state of the database. In this case, Rogue Wave classes and methods that rely on the cache may execute with stale data.

The mechanism that Rogue Wave supplies for re-establishing consistency is through clearCache() methods in the classes that use the cache: RWDBTable and RWDBStoredProc. These methods take an enum value as a parameter: Local or All. The Local option clears only the local cache variables in the calling object. The All option clears the local cache variables and the cached data associated with the calling object in the global cache.

Calling clearCache() with All guarantees that the next time the data is needed, it is refreshed by a query to the database. Using the Local option refreshes the local cache the next time the data is needed, but the data may come from the global cache, not necessarily from the database.
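
The difference between the two options shows up in how many database queries follow a clear. The sketch below counts them in a toy model. Table, GlobalCache, and CacheType are hypothetical stand-ins (the real enum is scoped differently), and the query is simulated.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <utility>

enum class CacheType { Local, All };  // models the two options described above

typedef std::map<std::string, std::string> GlobalCache;  // table name -> schema

// Hypothetical table object with one local cache variable.
struct Table {
    std::string name;
    GlobalCache& global;
    bool hasLocal;
    std::string local;
    int dbQueries;  // counts simulated round trips to the database

    Table(const std::string& n, GlobalCache& g)
        : name(n), global(g), hasLocal(false), dbQueries(0) {}

    const std::string& schema() {
        if (!hasLocal) {
            auto it = global.find(name);
            if (it == global.end()) {
                ++dbQueries;  // not cached anywhere: query the database
                it = global.insert(std::make_pair(name, "schema-of-" + name)).first;
            }
            local = it->second;
            hasLocal = true;
        }
        return local;
    }

    void clearCache(CacheType type) {
        hasLocal = false;
        local.clear();
        if (type == CacheType::All) global.erase(name);
    }
};

// Returns the total number of database queries after an initial fetch,
// one clearCache(type), and a second fetch.
int queriesAfterClear(CacheType type) {
    GlobalCache cache;
    Table t("emp", cache);
    t.schema();
    assert(t.dbQueries == 1);  // the first fetch always reaches the database
    t.clearCache(type);
    t.schema();
    return t.dbQueries;
}
```

In this model, clearing with Local leaves the count at one because the second fetch is served from the global cache, while clearing with All forces a second query.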

Note that Rogue Wave classes and methods never call clearCache() on their own. It is up to you to decide when there might be a danger of the cache becoming out-of-date, and to call these methods to set it right. We recommend that you clear the cache any time you change the database in a way that affects cached metadata.

The forcedLookup parameters on some RWDBTable and RWDBStoredProc methods, the previous API for updating cached metadata, are deprecated but maintained for backward compatibility.

As the name implies, all the data in the in-memory cache is stored in primary memory. This means the data is deleted when the application is shut down, but there is a persistence mechanism for saving the data. A database-intensive application could end up using a large amount of memory, but there are provisions for clearing memory as well.

If an error occurs during an attempt to access the cache, the cache manager obtains a copy of the RWDBStatus object from the calling object and populates it as described in Section 1.4.3. With the in-memory cache manager, the only way to capture the error is by installing an error handler.

The Rogue Wave implementation of the cache manager extends the methods defined in the base class to include operators for persisting a cache to a stream or a file. These operators allow you to persist and restore a cache across runs of the application. These operators are:

To deal with the possible problem of the cache consuming too much memory, the Rogue Wave implementation provides a method for clearing the data in the cache:

By default, this method passes the enum value both, which means it removes all of the data for both tables and stored procedures. You can instead pass the values table or storedProc to remove all of the data for one or the other type of object.

You may want to clear only part of the cache rather than all of it. To clear cached data for just a single table or stored procedure, you can create an empty RWDBTableEntry or RWDBStoredProcEntry object and set it on the cache. To clear particular data for a table or stored procedure, obtain its entry from the cache, set empty data on the items you want to clear, and set the altered entry on the cache again. See the descriptions for RWDBTableEntry and RWDBStoredProcEntry in the SourcePro C++ API Reference Guide for information on the methods available to you.
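
Both selective techniques reduce to a get, an edit, and a set. The sketch below shows them against a hypothetical CacheManager and a simplified TableEntry with two items; the real entry classes expose more items and different method names, so treat the signatures here as assumptions.

```cpp
#include <cassert>
#include <map>
#include <string>
#include <vector>

// Hypothetical, simplified entry holding two independently cacheable items.
struct TableEntry {
    std::string schema;                   // empty means "not cached"
    std::vector<std::string> primaryKey;  // empty means "not cached"
};

// Hypothetical cache manager exposing only get and set, keyed by table name.
class CacheManager {
public:
    // Returns an empty entry when nothing is cached for the table.
    TableEntry getTable(const std::string& name) const {
        auto it = entries_.find(name);
        return it == entries_.end() ? TableEntry() : it->second;
    }
    void setTable(const std::string& name, const TableEntry& entry) {
        assert(!name.empty());  // entries are keyed by a non-empty table name
        entries_[name] = entry;
    }
private:
    std::map<std::string, TableEntry> entries_;
};

// Clears everything cached for one table by overwriting it with an empty entry.
void clearTable(CacheManager& cm, const std::string& name) {
    cm.setTable(name, TableEntry());
}

// Clears only the primary key: fetch the entry, blank one item, store it back.
void clearPrimaryKey(CacheManager& cm, const std::string& name) {
    TableEntry entry = cm.getTable(name);
    entry.primaryKey.clear();
    cm.setTable(name, entry);
}
```

If other threads can reach the same cache, the get-edit-set sequence in clearPrimaryKey() should run under the database lock so that no update is lost between the get and the set.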

The Rogue Wave implementation provides an easy way to use caching, but it may not always be practical to store metadata in memory, or you may have other caching requirements, such as sharing the cache between processes.

To implement your own cache manager, derive from RWDBCacheManager and implement the get() and set() methods defined in the base class. You can use the implementation of RWDBInMemoryCacheManager as a model for writing your implementation. Keep in mind these requirements:

Error handling. The only requirement here is to detect problems and throw exceptions. The Rogue Wave methods that use the cache manager respond to the exceptions as described in Section 9.4.3. Of course you are free to implement whatever additional error handling you wish.

Thread safety. The Rogue Wave methods that use the cache manager do not assume it is thread-safe and so obtain a lock on the RWDBDatabase object that holds the cache manager before changing any data. We do not recommend manipulating the cache any other way.
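
To make the two requirements concrete, here is a minimal sketch of a derived cache manager. The base class shown is a hypothetical, table-only interface written for this example; the real RWDBCacheManager signatures may differ and should be taken from the API reference. Treating a full cache as an error is a design choice made for the illustration.

```cpp
#include <cassert>
#include <cstddef>
#include <map>
#include <stdexcept>
#include <string>

struct TableEntry { std::string schema; };  // hypothetical, simplified entry

// Hypothetical base interface; not the actual RWDBCacheManager declaration.
class CacheManager {
public:
    virtual ~CacheManager() {}
    virtual TableEntry getTable(const std::string& name) const = 0;
    virtual void setTable(const std::string& name, const TableEntry& entry) = 0;
};

// Bounded cache: refuses to grow past maxEntries and reports that by throwing.
// It does no locking of its own and relies on callers to hold the database lock.
class BoundedCacheManager : public CacheManager {
public:
    explicit BoundedCacheManager(std::size_t maxEntries) : max_(maxEntries) {
        assert(maxEntries > 0);
    }

    // A miss is not an error: it returns an empty entry.
    TableEntry getTable(const std::string& name) const override {
        auto it = entries_.find(name);
        return it == entries_.end() ? TableEntry() : it->second;
    }

    // Overwriting an existing entry is always allowed; adding a new one
    // beyond the limit is reported with an exception.
    void setTable(const std::string& name, const TableEntry& entry) override {
        if (entries_.count(name) == 0 && entries_.size() >= max_)
            throw std::runtime_error("cache full: cannot store " + name);
        entries_[name] = entry;
    }

private:
    std::size_t max_;
    std::map<std::string, TableEntry> entries_;
};
```

Returning an empty entry on a miss and throwing only on a genuine failure matches the rule that a caching operation counts as successful unless an exception is thrown.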

One option available to you is to make your cache manager more fine-grained than the in-memory cache manager that Rogue Wave supplies. The in-memory cache manager stores and retrieves metadata only through the RWDBTableEntry and RWDBStoredProcEntry objects. However, in implementing your own cache manager, you have access to these objects and all of the methods for setting, getting, checking the existence of, and clearing the particular types of data the cache can store, such as primary keys and stored procedure parameters. So you are free to deal with different parts of the cacheable data in different ways.

Another option is to implement one or more different ways of storing and persisting the cached data. The in-memory cache manager is implemented entirely in primary memory. You might prefer to store the data using some different mechanism, or even two or more mechanisms for different parts of the data. Similarly, the in-memory cache manager provides methods to persist data to a file or a stream, but you may prefer other ways of persisting the data.

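This example demonstrates the retrieval of a schema for a table in three situations:

No cache manager is installed, so the schema comes directly from the database.

A cache manager is installed, but the schema is not yet in the cache.

The schema is already in the global cache.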

A timer method, timeGetSchema(), creates a temporary table and then retrieves the schema for the table while capturing the time interval needed for the retrieval. The time interval for the retrieval is written to the console.

The main program does some setup and then calls timeGetSchema() for each of the three situations.
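
For readers who want to take similar measurements in their own code, a generic timing helper might look like the sketch below. It is not the timeGetSchema() function from the shipped example: timeCall() is a hypothetical name, and it times any callable using the standard steady clock.

```cpp
#include <cassert>
#include <chrono>
#include <iostream>
#include <string>

// Runs fetch() once, prints the elapsed time in microseconds under the given
// label, and returns that time.
template <class Fetch>
long long timeCall(const std::string& label, Fetch fetch) {
    const auto start = std::chrono::steady_clock::now();
    fetch();
    const auto stop = std::chrono::steady_clock::now();
    const long long us =
        std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
    assert(us >= 0);  // steady_clock never runs backward
    std::cout << label << ": " << us << " us\n";
    return us;
}
```

Calling the helper once per situation, with a callable that retrieves the schema, produces one timing line for each case. A single measurement is noisy, so repeating each case several times gives a more reliable comparison.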

//1 At this call, there is no cache. The time interval represents the time needed to obtain the schema directly from the database.

//2 This line both sets the in-memory cache manager as the cache manager for the database and saves a pointer to the previous cache manager, which is returned by the cacheManager() method. If there was no previous cache manager, NULL is returned.

//3 At this call, a cache exists, but the schema data is not yet present. The time interval represents the time needed to retrieve the data from the database client and to put the schema data into the global cache.

//4 At this call, the schema data is in the cache. The time interval represents just the time needed to obtain the schema data from the global cache.

Here is output from running the example using the Microsoft SQL Server Native Client:

As the numbers indicate, caching can substantially reduce the time needed to retrieve metadata. Keep in mind, though, that the numbers from any given run of the example vary with database load and many other factors.

To examine the complete code, see the file <sourcepro_install>\examples\dbcore\memcache.cpp.

If you wish to run this example, be aware that the example follows the conventions of the DB Interface Module tutorials. Before you run the example, you need to run the executable tutinit, and when you are finished, you may want to run tutclean. For more information on the setup process, see Section 16.3, “Setting up the Tutorials.”