Caching
Git-Pandas supports pluggable cache backends to optimize performance for expensive, repetitive operations. This is particularly useful for large repositories or when running multiple analyses.
Overview
The caching system provides:
* In-memory caching for temporary results
* Redis-based caching for persistent storage
* Configurable cache durations
* Automatic cache invalidation
* Decorator-based caching for expensive operations
Available Cache Backends
In-Memory Cache (EphemeralCache)
The default in-memory cache is ephemeral and will be cleared when the process ends:
from gitpandas import Repository
from gitpandas.cache import EphemeralCache
# Create an in-memory cache with default settings
cache = EphemeralCache()
# Or customize the cache size
cache = EphemeralCache(max_keys=500)
# Use the cache with a repository
repo = Repository('/path/to/repo', cache_backend=cache)
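To illustrate what a max_keys limit implies, here is a minimal sketch of a size-bounded in-memory cache. It is not the EphemeralCache implementation: the BoundedCache class, its set/get methods, and the oldest-first eviction policy are assumptions made for illustration only.

```python
from collections import OrderedDict


class BoundedCache:
    """Hypothetical stand-in for a max_keys-bounded in-memory cache."""

    def __init__(self, max_keys=1000):
        self.max_keys = max_keys
        self._store = OrderedDict()

    def set(self, key, value):
        # Re-inserting an existing key moves it to the newest position.
        if key in self._store:
            self._store.pop(key)
        self._store[key] = value
        # Evict the oldest entries once the limit is exceeded (assumed policy).
        while len(self._store) > self.max_keys:
            self._store.popitem(last=False)

    def get(self, key, default=None):
        return self._store.get(key, default)

    def __len__(self):
        return len(self._store)


cache = BoundedCache(max_keys=2)
cache.set("a", 1)
cache.set("b", 2)
cache.set("c", 3)  # exceeds max_keys, so "a" (the oldest key) is evicted
```

Under this assumed policy, a smaller max_keys trades memory for more frequent recomputation, which may matter when many distinct analyses run in one process.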
Redis Cache (RedisDFCache)
For persistent caching across sessions, use Redis:
from gitpandas import Repository
from gitpandas.cache import RedisDFCache
# Create a Redis cache with default settings
cache = RedisDFCache()
# Or customize Redis connection and cache settings
cache = RedisDFCache(
    host='localhost',
    port=6379,
    db=12,
    max_keys=1000,
    ttl=3600  # Cache entries expire after 1 hour
)
# Use the cache with a repository
repo = Repository('/path/to/repo', cache_backend=cache)
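The effect of a ttl setting can be sketched without a Redis server. The TTLCache class below is hypothetical and exists only to show time-to-live semantics; Redis normally handles expiry on the server side, and nothing here reflects how RedisDFCache is implemented. The injectable clock is an assumption that lets expiry be simulated instead of waited for.

```python
import time


class TTLCache:
    """Hypothetical illustration of time-to-live expiry; not gitpandas code."""

    def __init__(self, ttl=None, clock=time.monotonic):
        self.ttl = ttl        # seconds, or None for no expiration
        self._clock = clock   # injectable so expiry can be simulated
        self._store = {}

    def set(self, key, value):
        expires_at = None if self.ttl is None else self._clock() + self.ttl
        self._store[key] = (value, expires_at)

    def get(self, key, default=None):
        if key not in self._store:
            return default
        value, expires_at = self._store[key]
        # Drop the entry once its lifetime has elapsed.
        if expires_at is not None and self._clock() >= expires_at:
            del self._store[key]
            return default
        return value


now = [0.0]  # fake clock, in seconds
cache = TTLCache(ttl=3600, clock=lambda: now[0])
cache.set("blame", "result")
fresh = cache.get("blame")     # still within the hour
now[0] = 3601.0                # simulate just over one hour passing
expired = cache.get("blame")   # entry is now treated as missing
```

With ttl=None the entry would never expire in this sketch, which mirrors the "no expiration" default described for the Redis backend.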
Using the Cache Decorator
The @multicache decorator can be used to cache method results:
from gitpandas.cache import multicache
@multicache(
    key_prefix="method_name",
    key_list=["param1", "param2"],
    skip_if=lambda x: x.get("param1") is None
)
def expensive_method(self, param1, param2):
    # Method implementation
    pass
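To make the roles of key_prefix, key_list, and skip_if concrete, the following is a rough sketch of how a decorator of this shape could work. It is not the gitpandas implementation: simple_multicache, the Analyzer class, the "||" key format, and the use of a plain dict as the cache are all assumptions for illustration.

```python
import functools
import inspect


def simple_multicache(key_prefix, key_list, skip_if=None):
    """Hypothetical sketch of a key-building cache decorator."""

    def decorator(func):
        signature = inspect.signature(func)

        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            # Resolve positional and keyword arguments to a name -> value map.
            bound = signature.bind(self, *args, **kwargs)
            bound.apply_defaults()
            params = dict(bound.arguments)

            # Bypass the cache entirely when skip_if returns True.
            if skip_if is not None and skip_if(params):
                return func(self, *args, **kwargs)

            # Build the key from the prefix and the selected parameters.
            key = key_prefix + "||" + "_".join(str(params[name]) for name in key_list)
            if key not in self.cache:
                self.cache[key] = func(self, *args, **kwargs)
            return self.cache[key]

        return wrapper

    return decorator


class Analyzer:
    def __init__(self):
        self.cache = {}   # a plain dict stands in for a cache backend
        self.calls = 0    # counts how often the real method body runs

    @simple_multicache(
        key_prefix="expensive_method",
        key_list=["param1", "param2"],
        skip_if=lambda params: params.get("param1") is None,
    )
    def expensive_method(self, param1, param2):
        self.calls += 1
        return (param1, param2)


analyzer = Analyzer()
analyzer.expensive_method("main", 10)
analyzer.expensive_method("main", 10)   # served from the cache
analyzer.expensive_method(None, 10)     # skip_if is true, so not cached
```

In this sketch only the parameters named in key_list contribute to the cache key, so arguments left out of key_list would not distinguish cached results.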
Configuration
Cache backends can be configured with various parameters:
EphemeralCache:
* max_keys: Maximum number of keys to store in memory (default: 1000)
RedisDFCache:
* host: Redis host (default: 'localhost')
* port: Redis port (default: 6379)
* db: Redis database number (default: 12)
* max_keys: Maximum number of keys to store (default: 1000)
* ttl: Time-to-live in seconds for cache entries (default: None, no expiration)
* Additional keyword arguments are passed to redis.StrictRedis