Microsoft ASP.NET 3.5 : Caching Application Data (part 3) - Practical Issues

1/12/2013 3:20:58 AM

3. Practical Issues

Caching is a critical factor for the success of a Web application. Caching mostly means getting quick access to prefetched data, saving you roundtrips, queries, and other heavy operations. Caching is also important for writing, especially in systems with a high volume of data to write. By posting write requests to a kind of intermediate memory structure, you decouple the main body of the application from the service in charge of writing. Some people call this a batch update, but in the end it is nothing more than a form of caching for data to be written.

The caching API provides you with the necessary tools to build a bullet-proof caching strategy. When it comes to this, though, a few practical issues arise.

Should I Cache or Should I Fetch?

There’s just one possible answer to this question—it depends. It depends on the characteristics of the application and the expected goals. For an application that must optimize throughput and serve requests in the shortest possible amount of time, caching is essential. The quantity of data you cache and the amount of time you cache it are the two parameters you need to play with to arrive at a good solution.

Caching is about reusing data, so data that is not often used in the lifetime of the application is not a good candidate for the cache. In addition to being frequently used, cacheable data is also general-purpose data rather than data that is specific to a request or a session. If your application manages data with these characteristics, cache it without fear.

Caching is about memory, and memory is relatively cheap. However, a bad application design can easily drive the application to unpleasant out-of-memory errors regardless of the cost of a memory chip. On the other hand, caching can boost the performance just enough to ease your pain and give you more time to devise a serious refactoring.

Sometimes you face users who claim an absolute need for live data. Sure, data parked in the cache is static, unaffected by concurrent events, and not fully participating in the life of the application. Can your users afford data that has not been updated for a few seconds? With a few exceptions, the answer is, “Sure, they can.” In a canonical Web application, there’s virtually no data that can’t be cached at least for a second or two. No matter what end users claim, caching can realistically be applied to the vast majority of scenarios. Real-time systems and systems with a high degree of concurrency (for example, a booking application) are certainly an exception, but most of the time a slight delay of one or two seconds can make the application run faster under stress conditions without affecting the quality of the service.

In the end, you should consider caching all the time and opt for direct data access only in very special situations. As a practical rule, when users claim they need live data, try to prove with a counterexample that a few seconds of delay are still acceptable and maximize the hardware and software investment.

Fetching to get the real data is an option, but it’s usually the most expensive one. If you choose that option, make sure you really need it. Accessing cached data is faster if the data you get in this way makes sense to the application. On the other hand, be aware that caching requires memory. If abused, it can lead to out-of-memory errors and performance hits.

Building a Wrapper Cache Object

As mentioned, no data stored in the ASP.NET cache is guaranteed to stay there when a piece of code attempts to read it. For the safety of the application, you should never rely on the value returned by the Get method or the Item property. The following pattern keeps you on the safe side:

object data = Cache["MyData"];
if (data != null)
{
   // The data is here; process it
}

The code snippet deliberately omits the else branch. What should you do if the requested item is null? You can abort the ongoing operation and display a friendly message to the user, or you can perhaps reload the data with a new fetch. Whatever approach you opt for, it will probably not work for every piece of data you can have in the cache. You'll most likely need to decide on a case-by-case basis how best to reload the cache.

When it comes to building a cache layer, you’re better off thinking in a domain-based way. You should avoid caching data as individual elements, with the key being the only clue to retrieve the element later. You can build a helper class with domain-specific properties bound to cache entries. Here’s an example.

public static class MyCache
{
    private static class MyCacheEntries
    {
        public const string Customers = "Customers";
    }

    public static CustomerCollection Customers
    {
        get
        {
            object o = HttpContext.Current.Cache[MyCacheEntries.Customers];
            if (o == null)
            {
                HttpContext.Current.Trace.Warn("Empty cache--reloading...");
                LoadCustomers();
                o = HttpContext.Current.Cache[MyCacheEntries.Customers];
            }
            return (CustomerCollection) o;
        }
    }

    private static void LoadCustomers()
    {
        // Get data
        CustomerCollection coll = ProAspNet20.DAL.Customers.LoadAll();

        // Set the item (5-second duration)
        HttpContext.Current.Cache.Insert(MyCacheEntries.Customers, coll,
            null, DateTime.Now.AddSeconds(5), Cache.NoSlidingExpiration);
    }
}

The MyCache class defines a property named Customers of type CustomerCollection. The contents of this property come from the sample Data Access Layer (DAL) and are stored in the cache for five seconds. The Customers property hides all the details of the cache management and ensures the availability of valid data to host pages. If the cached item is not there because it has expired (or it has been removed), the get accessor of the property takes care of reloading the data.


If you move the preceding code to a non-code-behind class, you can’t access the ASP.NET cache object using the plain Cache keyword. ASP.NET has no intrinsic objects like classic ASP, meaning that all objects you invoke must be public or reachable properties on the current class or its parent. Just as we did in the previous example of the MyCache class, you need to qualify the cache using the static property HttpContext.Current.

A caller page needs only the following code to populate a grid with the results in the cache:

CustomerCollection data = MyCache.Customers;
CustomerList.DataTextField = "CompanyName";
CustomerList.DataValueField = "ID";
CustomerList.DataSource = data;
CustomerList.DataBind();

By writing a wrapper class around the specific data you put into the cache, you can more easily implement a safe pattern for data access that prevents null references and treats each piece of data appropriately. In addition, the resulting code is more readable and easy to maintain.


This approach is potentially more powerful than using the built-in cache capabilities of data source controls. First and foremost, such a wrapper class encapsulates all the data you need to keep in the cache and not just the data bound to a control. Second, it gives you more control over the implementation—you can set the priority and removal callback, implement complex dependencies, and choose the name of the entry. Next, it works with any data and not just with ADO.NET objects, as is the case with SqlDataSource and ObjectDataSource. You can instead use this approach while building your own DAL, ending up with a set of cache-aware classes to bind to data source controls. If your pages are quite simple (for example, some data bound to a grid or other data-bound controls) and you're using only DataSet or DataTable, the caching infrastructure of data source controls will probably suit your needs.

Enumerating Items in the Cache

Although most of the time you simply access cached items by name, you might find it useful to know how to enumerate the contents of the cache to list all stored public items. As mentioned, the Cache class is a sort of collection that is instantiated during the application's startup. Being a collection, its contents can be easily enumerated using a foreach statement. The following code shows how to copy the current contents of the ASP.NET cache to a newly created DataTable object:

private DataTable CacheToDataTable()
{
    DataTable dt = CreateDataTable();
    foreach (DictionaryEntry elem in HttpContext.Current.Cache)
        AddItemToTable(dt, elem);
    return dt;
}

private DataTable CreateDataTable()
{
    DataTable dt = new DataTable();
    dt.Columns.Add("Key", typeof(string));
    dt.Columns.Add("Value", typeof(string));
    return dt;
}

private void AddItemToTable(DataTable dt, DictionaryEntry elem)
{
    DataRow row = dt.NewRow();
    row["Key"] = elem.Key.ToString();
    row["Value"] = elem.Value.ToString();
    dt.Rows.Add(row);
}

The DataTable contains two columns, one for the key and one for the value of the item stored. The value is rendered using the ToString method, meaning that strings and numbers will be rendered faithfully but objects will typically be rendered through their class name.


When you enumerate the items in the cache, only two pieces of information are available—the key and value. From a client page, there’s no way to read the priority of a given item or perhaps its expiration policy. When you enumerate the contents of the Cache object, a generic DictionaryEntry object is returned with no property or method pointing to more specific information. To get more information, you should consider using the .NET Reflection API.

Also note that because the Cache object stores data internally using a hashtable, the enumerator returns contained items in an apparently weird order, neither alphabetical nor time-based. The order in which items are returned, instead, is based on the internal hash code used to index items.

Clearing the Cache

The .NET Framework provides no method on the Cache class to programmatically clear all the content. The following code snippet shows how to build one:

public void Clear()
{
    foreach (DictionaryEntry elem in Cache)
    {
        string s = elem.Key.ToString();
        Cache.Remove(s);
    }
}


Even though the ASP.NET cache is implemented to maintain a neat separation between the application’s and system’s items, it is preferable that you delete items in the cache individually. If you have several items to maintain, you might want to build your own wrapper class and expose one single method to clear all the cached data.
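As a sketch of this idea, the earlier MyCache wrapper could track the names of the entries it owns and expose a single method that clears only those items. The Orders entry here is a hypothetical second entry, added purely for illustration:

public static class MyCache
{
    // Names of the cache entries this wrapper owns (illustrative list)
    private static readonly string[] OwnedEntries = { "Customers", "Orders" };

    public static void Clear()
    {
        // Remove only application-owned entries, leaving any
        // system or third-party cached items untouched
        foreach (string key in OwnedEntries)
            HttpContext.Current.Cache.Remove(key);
    }
}

Because the wrapper knows exactly which keys it created, it never has to enumerate the whole cache to clean up after itself.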

Cache Synchronization

Whenever you read or write an individual cache item, from a threading perspective you’re absolutely safe. The ASP.NET Cache object guarantees that no other concurrently running threads can ever interfere with what you’re doing. If you need to ensure that multiple operations on the Cache object occur atomically, that’s a different story. Consider the following code snippet:

int counter = -1;
object o = Cache["Counter"];
if (o == null)
{
    // Retrieve the last good known value from a database
    // or return a default value
    counter = RetrieveLastKnownValue();
}
else
{
    counter = (int) o;
}
counter++;
Cache["Counter"] = counter;

The Cache object is accessed repeatedly in the context of an atomic operation—incrementing a counter. Although individual accesses to Cache are thread-safe, there’s no guarantee that other threads won’t kick in between the various calls. If there’s potential contention on the cached value, you should consider using additional locking constructs, such as the C# lock statement (SyncLock in Visual Basic .NET).


Where should you put the lock? If you directly lock the Cache object, you might run into trouble. ASP.NET uses the Cache object extensively and directly locking the Cache object might have a serious impact on the overall performance of the application. However, most of the time ASP.NET doesn't access the cache via the Cache object; rather, it accesses the direct data container—that is, the CacheSingle or CacheMultiple class. In this regard, a lock on the Cache object probably won't affect many ASP.NET components; regardless, it's a risk that personally I wouldn't like to take. By locking the Cache object, you also risk blocking HTTP modules and handlers active in the pipeline, as well as other pages and sessions in the application that need to use cache entries different from the ones you want to serialize access to.

The best way out seems to be using a synchronizer—that is, an intermediate but global object that you lock before entering in a piece of code sensitive to concurrency:

lock (yourSynchronizer)
{
    // Access the Cache here. This pattern must be replicated for
    // each access to the cache that requires serialization.
}

The synchronizer object must be global to the application. For example, it can be a static member defined in the global.asax file.
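Putting the pieces together, the counter example could be serialized as follows. This is a sketch: GlobalState and its Synchronizer field are assumed names, and RetrieveLastKnownValue is the helper from the earlier snippet:

public static class GlobalState
{
    // Application-wide synchronizer object (assumed name);
    // could equally be a static member in global.asax
    public static readonly object Synchronizer = new object();
}

// Atomically read, increment, and write back the cached counter
lock (GlobalState.Synchronizer)
{
    object o = HttpContext.Current.Cache["Counter"];
    int counter = (o == null)
        ? RetrieveLastKnownValue()   // fallback from the earlier snippet
        : (int) o;
    counter++;
    HttpContext.Current.Cache["Counter"] = counter;
}

Because every piece of code that touches the counter locks the same static object, no thread can interleave between the read and the write.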

Per-Request Caching

Although you normally tend to cache only global data and data of general interest, to squeeze out every little bit of performance you can also cache per-request data that is long-lived but used only by a particular page. You place this information in the Cache object.

Another form of per-request caching is possible to improve performance. Working information shared by all controls and components participating in the processing of a request can be stored in a global container for the duration of the request. In this case, though, you might want to use the Items collection on the HttpContext class to park the data, because it is automatically freed up at the end of the request and doesn't involve the implicit or explicit locking that Cache does.
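As a minimal sketch of this second form, a component early in the request could park data that later controls in the same request pick up. The ThemeInfo type and the LoadThemeInfo helper are hypothetical, used only to illustrate the pattern:

// In a component that runs early in the request (for example, an HTTP module)
HttpContext.Current.Items["RequestTheme"] = LoadThemeInfo();

// Later, in any control participating in the same request
ThemeInfo theme = (ThemeInfo) HttpContext.Current.Items["RequestTheme"];

Unlike Cache, the Items collection is scoped to the current request, so there is no expiration policy to set and no contention with other requests to worry about.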
