NAME
    CHI -- Unified cache interface

SYNOPSIS
        use CHI;

        # Choose a standard driver
        #
        my $cache = CHI->new( driver => 'Memory' );
        my $cache = CHI->new( driver => 'File', cache_root => '/path/to/root' );
        my $cache = CHI->new(
            driver     => 'FastMmap',
            root_dir   => '/path/to/root',
            cache_size => '1k'
        );
        my $cache = CHI->new(
            driver   => 'Memcached',
            servers  => [ "10.0.0.15:11211", "10.0.0.15:11212" ],
            l1_cache => { driver => 'FastMmap', root_dir => '/path/to/root' }
        );
        my $cache = CHI->new( driver => 'DBI', dbh => $dbh );

        # (These drivers coming soon...)
        #
        my $cache = CHI->new( driver => 'BerkeleyDB', root_dir => '/path/to/root' );

        # Create your own driver
        #
        my $cache = CHI->new( driver_class => 'My::Special::Driver' );

        # Basic cache operations
        #
        my $customer = $cache->get($name);
        if ( !defined $customer ) {
            $customer = get_customer_from_db($name);
            $cache->set( $name, $customer, "10 minutes" );
        }
        $cache->remove($name);

DESCRIPTION
    CHI provides a unified caching API, designed to assist a developer in
    persisting data for a specified period of time.

    The CHI interface is implemented by driver classes that support
    fetching, storing and clearing of data. Driver classes exist or will
    exist for the gamut of storage backends available to Perl, such as
    memory, plain files, memory mapped files, memcached, and DBI.

    CHI is intended as an evolution of DeWitt Clinton's Cache::Cache
    package, adhering to the basic Cache API but adding new features and
    addressing limitations in the Cache::Cache implementation.

FEATURES
    * Easy to create new drivers

    * Uniform support for namespaces

    * Automatic [de]serialization of data

    * Multilevel caches

    * Probabilistic expiration and busy locks, to reduce cache miss
      stampedes

    * Optional logging of cache activity

SIZE AWARENESS
    If is_size_aware or max_size are passed to the constructor, the cache
    will be *size aware* - that is, it will keep track of its own size (in
    bytes) as items are added and removed. You can get a cache's size with
    get_size.

    Size aware caches generally keep track of their size in a separate
    meta-key, and have to do an extra store whenever the size changes (e.g.
    on each set and remove).

  Maximum size and discard policies
    If a cache's size rises above its max_size, items are discarded until
    the cache size is sufficiently below the max size. (See
    max_size_reduction_factor for how to fine-tune this.)

    The order in which items are discarded is controlled with
    discard_policy. The default discard policy is 'arbitrary', which
    discards items in an arbitrary order. The available policies and
    default policy can differ with each driver, e.g. the
    CHI::Driver::Memory driver provides and defaults to an 'LRU' policy.

  Appropriate drivers
    Size awareness was chiefly designed for, and works well with, the
    CHI::Driver::Memory driver: one often needs to enforce a maximum size
    on a memory cache, and the overhead of tracking size in memory is
    negligible. However, the capability may be useful with other drivers.

    Some drivers - for example, CHI::Driver::FastMmap and
    CHI::Driver::Memcached - inherently keep track of their size and
    enforce a maximum size, and it makes no sense to turn on CHI's size
    awareness for these.

    Also, for drivers that cannot atomically read and update a value - for
    example, CHI::Driver::File - there is a race condition in the updating
    of size that can cause the size to grow inaccurate over time.
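
    The size-aware behavior described above can be sketched as follows.
    This is a minimal illustration, not a recommended configuration: the
    1024-byte limit, the key names and the 100-byte values are arbitrary,
    and the datastore argument is an assumption about your installed
    CHI::Driver::Memory (recent releases expect either datastore or
    global to be given).

```perl
use strict;
use warnings;
use CHI;

# Size-aware in-memory cache capped at roughly 1 KB.
# Passing max_size turns on size awareness.
# 'datastore' is an assumption about the installed CHI::Driver::Memory.
my $cache = CHI->new(
    driver    => 'Memory',
    datastore => {},
    max_size  => 1024,
);

# Store far more data than fits; items are discarded as the limit is
# exceeded, in the order chosen by the driver's discard policy.
for my $i ( 1 .. 50 ) {
    $cache->set( "key$i", 'x' x 100 );
}

my $size = $cache->get_size;    # current size in bytes
my @keys = $cache->get_keys;    # keys that survived discarding

printf "size: %d bytes, items kept: %d of 50\n", $size, scalar @keys;
```

    How many items survive depends on the per-item metadata overhead and
    on max_size_reduction_factor, so the printed counts will vary.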

AVAILABILITY OF DRIVERS
    The following drivers are currently available as part of this
    distribution:

    * CHI::Driver::Memory - In-process memory based cache

    * CHI::Driver::File - File-based cache using one file per entry in a
      multi-level directory structure

    * CHI::Driver::FastMmap - Shared memory interprocess cache via mmap'ed
      files

    * CHI::Driver::Null - Dummy cache in which nothing is stored

    * CHI::Driver::CacheCache - CHI wrapper for Cache::Cache

    The following drivers are currently available as separate CPAN
    distributions:

    * CHI::Driver::Memcached - Distributed memory-based cache

    * CHI::Driver::DBI - Database-based cache

    This list is likely incomplete. A complete set of drivers can be found
    on CPAN by searching for "CHI::Driver".

RELATION TO OTHER MODULES
  Cache::Cache
    CHI is intended as an evolution of DeWitt Clinton's Cache::Cache
    package. It starts with the same basic API (which has proven durable
    over time) but addresses some implementation shortcomings that cannot
    be fixed in Cache::Cache due to backward compatibility concerns. In
    particular:

    Performance
        Some of Cache::Cache's subclasses (e.g. Cache::FileCache) have been
        justifiably criticized as inefficient. CHI has been designed from
        the ground up with performance in mind, both in terms of general
        overhead and in the built-in driver classes. Method calls are kept
        to a minimum, data is only serialized when necessary, and metadata
        such as expiration time is stored in packed binary format
        alongside the data.

        As an example, using Rob Mueller's cacheperl benchmarks, CHI's
        file driver runs 3 to 4 times faster than Cache::FileCache.

    Ease of subclassing
        New Cache::Cache subclasses can be tedious to create, due to a
        lack of code refactoring, the use of non-OO package subroutines,
        and the separation of "cache" and "backend" classes. With CHI, the
        goal is to make the creation of new drivers as easy as possible,
        roughly the same as writing a TIE interface to your data store.
        Concerns like serialization and expiration options are handled by
        the driver base class so that individual drivers don't have to
        worry about them.

    Increased compatibility with cache implementations
        Probably because of the reasons above, Cache::Cache subclasses
        were never created for some of the most popular caches available
        on CPAN, e.g. Cache::FastMmap and Cache::Memcached. CHI's goal is
        to be able to support these and other caches with a minimum of
        performance overhead and a minimum of glue code.

  Cache
    The Cache distribution is another redesign and implementation of
    Cache::Cache, created by Chris Leishman in 2003. Like CHI, it improves
    performance and reduces the barrier to implementing new cache drivers.
    However, it breaks with the Cache::Cache interface by dropping
    features that I consider non-negotiable - for example, get/set do not
    serialize data, and namespaces are an optional feature that drivers
    may decide not to implement.

  Cache::Memcached, Cache::FastMmap, etc.
    CPAN sports a variety of full-featured standalone cache modules
    representing particular backends. CHI does not reinvent these but
    simply wraps them with an appropriate driver. For example,
    CHI::Driver::Memcached and CHI::Driver::FastMmap are thin layers
    around Cache::Memcached and Cache::FastMmap.

    Of course, because these modules already work on their own, there will
    be some overlap. Cache::FastMmap, for example, already has code to
    serialize data and handle expiration times. Here's how CHI resolves
    these overlaps.

    Serialization
        CHI handles its own serialization, passing a flat binary string to
        the underlying cache backend.

    Expiration
        CHI packs expiration times (as well as other metadata) inside the
        binary string passed to the underlying cache backend. The backend
        is unaware of these values; from its point of view the item has no
        expiration time. Among other things, this means that you can use
        CHI to examine expired items (e.g. with $cache->get_object) even
        if this is not supported natively by the backend.
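
        Examining an expired item might look like the sketch below. It
        assumes a CHI release that provides the expire method and the
        is_expired and value accessors on the object returned by
        get_object; these come from CHI's method documentation rather
        than from this README. The datastore argument is likewise an
        assumption about your installed CHI::Driver::Memory, and the key
        and value are made up for illustration.

```perl
use strict;
use warnings;
use CHI;

# 'datastore' is an assumption about the installed CHI::Driver::Memory.
my $cache = CHI->new( driver => 'Memory', datastore => {} );

$cache->set( 'greeting', 'hello', '10 minutes' );

# Force the entry's expiration time into the past.
$cache->expire('greeting');

# The metadata object is still available for an expired entry...
my $object = $cache->get_object('greeting');

# ...while a plain get treats the expired entry as a miss.
my $value = $cache->get('greeting');

if ( defined $object ) {
    printf "expired: %s, stale value: %s\n",
      ( $object->is_expired ? 'yes' : 'no' ), $object->value;
}
```

        Because the expiration time lives inside the string that CHI
        stores, this works the same way regardless of what the backend
        itself knows about expiry.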

        At some point CHI will provide the option of explicitly notifying
        the backend of the expiration time as well. This might allow the
        backend to do better storage management, etc., but would prevent
        CHI from examining expired items.

    Naturally, using CHI's FastMmap or Memcached driver will never be as
    time- or storage-efficient as simply using Cache::FastMmap or
    Cache::Memcached. In terms of performance, we've attempted to make the
    overhead as small as possible, on the order of 5% per get or set
    (benchmarks coming soon). In terms of storage size, CHI adds about 16
    bytes of metadata overhead to each item. How much this matters
    depends on the typical size of items in your cache.

SUPPORT AND DOCUMENTATION
    Questions and feedback are welcome, and should be directed to the
    perl-cache mailing list:

        http://groups.google.com/group/perl-cache-discuss

    Bugs and feature requests will be tracked at RT:

        http://rt.cpan.org/NoAuth/Bugs.html?Dist=CHI

    The latest source code can be browsed and fetched at:

        http://github.com/jonswar/perl-chi/tree/master
        git clone git://github.com/jonswar/perl-chi.git

TODO
    * Perform cache benchmarks comparing both CHI and non-CHI cache
      implementations

    * Release BerkeleyDB drivers as separate CPAN distributions

    * Add docs comparing various strategies for reducing miss stampedes
      and cost of recomputes

    * Add expires_next syntax (e.g. expires_next => 'hour')

    * Support automatic serialization and escaping of keys

    * Create XS versions of main functions in Driver.pm (e.g. get, set)

ACKNOWLEDGMENTS
    Thanks to DeWitt Clinton for the original Cache::Cache, to Rob Mueller
    for the Perl cache benchmarks, and to Perrin Harkins for the
    discussions that got this going.

    CHI was originally designed and developed for the Digital Media group
    of the Hearst Corporation, a diversified media company based in New
    York City. Many thanks to Hearst management for agreeing to this open
    source release.

AUTHOR
    Jonathan Swartz

SEE ALSO
    Cache::Cache, Cache::Memcached, Cache::FastMmap

COPYRIGHT & LICENSE
    Copyright (C) 2007 Jonathan Swartz.

    CHI is provided "as is" and without any express or implied warranties,
    including, without limitation, the implied warranties of
    merchantability and fitness for a particular purpose.

    This program is free software; you can redistribute it and/or modify
    it under the same terms as Perl itself.