duck.utils.caching
Caching module built on top of the diskcache library.
Provides a hierarchy of cache backends ranging from pure in-memory LRU caches to persistent file-backed caches with dynamic sharding and key-as-folder storage strategies.
Every public method is safe to call from multiple threads or async tasks simultaneously. Sync callers use threading.RLock; async callers use asyncio.Lock via the async_* variants exposed on each class.
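The dual-lock arrangement described above can be pictured with a small stand-in class. This is a sketch under stated assumptions, not the module's implementation: `DualLockStore`, `async_increment`, and the lazily created `asyncio.Lock` are hypothetical names chosen for illustration.

```python
import asyncio
import threading


class DualLockStore:
    """Hypothetical dict-backed store with one lock per calling style."""

    def __init__(self):
        self._lock = threading.RLock()  # guards sync access
        self._async_lock = None         # created lazily, inside a running loop
        self._data = {}

    def _get_async_lock(self):
        if self._async_lock is None:
            self._async_lock = asyncio.Lock()
        return self._async_lock

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    async def async_increment(self, key):
        # The asyncio.Lock keeps other tasks out of the read-modify-write
        # sequence even though this task yields control in the middle.
        async with self._get_async_lock():
            current = self.get(key, 0)
            await asyncio.sleep(0)  # yield to other tasks while holding the lock
            self.set(key, current + 1)
            return current + 1


async def main():
    store = DualLockStore()
    await asyncio.gather(*(store.async_increment("n") for _ in range(50)))
    return store.get("n")


result = asyncio.run(main())
```

Without the `asyncio.Lock`, every task in this example would read the same starting value before any of them wrote back, and most increments would be lost.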
Package Contents
Classes
CacheBase | Abstract base class that all cache backends must implement.
CacheSpeedTest | Performs a speed test of the cache classes.
DynamicFileCache | Sharded persistent cache that automatically creates new shard directories when existing ones reach a configured size limit.
InMemoryCache | Thread-safe in-memory cache with LRU eviction.
KeyAsFolderCache | Persistent cache that stores each key’s data in a dedicated subdirectory named after the key itself.
PersistentFileCache | Persistent file-backed cache powered by the diskcache library.
Data
MISSING
API
- class duck.utils.caching.CacheBase
Abstract base class that all cache backends must implement.
Subclasses must override set, get, delete, pop, and clear. The save() hook is optional and is a no-op by default.
Locking strategy: Each subclass owns a threading.RLock (sync) and an asyncio.Lock (async). The RLock is reentrant so that methods which call other locking methods on the same instance do not deadlock.
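Why the sync lock needs to be reentrant can be shown with a minimal sketch. `CounterCache` and `increment` are hypothetical and not part of duck; the only claim is the general behaviour of `threading.RLock`.

```python
import threading


class CounterCache:
    """Hypothetical stand-in for a CacheBase subclass."""

    def __init__(self):
        self._lock = threading.RLock()  # the owning thread may re-acquire it
        self._data = {}

    def get(self, key, default=None):
        with self._lock:
            return self._data.get(key, default)

    def set(self, key, value):
        with self._lock:
            self._data[key] = value

    def increment(self, key):
        # Holds the lock, then calls get() and set(), which acquire it again.
        # With a plain threading.Lock this nested acquisition would deadlock.
        with self._lock:
            value = self.get(key, 0) + 1
            self.set(key, value)
            return value


cache = CounterCache()


def worker():
    for _ in range(100):
        cache.increment("hits")


threads = [threading.Thread(target=worker) for _ in range(4)]
for thread in threads:
    thread.start()
for thread in threads:
    thread.join()

total = cache.get("hits")
```

Because `increment` holds the lock across both the read and the write, no updates are lost when four threads run it concurrently.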
- abstractmethod delete(key: str) → None
Remove key from the cache. Silent if the key does not exist.
- Parameters:
key – Cache key to remove.
- abstractmethod get(key: str, default: Any = None) → Any
Retrieve the value stored under key.
- Parameters:
key – Cache key to look up.
default – Returned when the key is absent or expired.
- Returns:
The cached value or default.
- abstractmethod pop(key: str, default: Any = MISSING) → Any
Retrieve and atomically remove a value from the cache.
- Parameters:
key – Cache key to retrieve and delete.
default – Returned when the key is absent. Raises KeyError if omitted and the key does not exist.
- Returns:
The cached value, or default when the key is absent.
- Raises:
KeyError – When the key is missing and no default was given.
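The `pop` contract relies on a sentinel default so that `None` stays usable as an ordinary fallback value. The sketch below uses a local `MISSING` object and a plain dict as stand-ins; it illustrates the pattern only and is not the library's code.

```python
MISSING = object()  # local stand-in for duck.utils.caching.MISSING


def pop(store: dict, key: str, default=MISSING):
    """Remove and return store[key], following the contract described above."""
    try:
        return store.pop(key)
    except KeyError:
        if default is MISSING:
            raise           # no default supplied: propagate KeyError
        return default      # a default (even None) was supplied: return it


store = {"token": "abc"}
first = pop(store, "token")          # key present: value returned and removed
second = pop(store, "token", None)   # key absent, explicit default of None

try:
    pop(store, "token")              # key absent, no default
    raised = False
except KeyError:
    raised = True
```

Comparing with `is MISSING` rather than `is None` is what lets a caller distinguish "no default given" from "default is None".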
- class duck.utils.caching.CacheSpeedTest(repeat: int = 1)
Performs a speed test of the cache classes.
Initialization
- instances
None
- class duck.utils.caching.DynamicFileCache(cache_dir: str, cache_limit: int = DEFAULT_SHARD_SIZE, cached_objs_limit: int = 128)
Bases: duck.utils.caching.CacheBase
Sharded persistent cache that automatically creates new shard directories when existing ones reach a configured size limit.
Each shard is a PersistentFileCache instance. Shards are created lazily and are never deleted. Reads scan shards from newest to oldest so the most-recently-written value is returned first.
lru_cache is intentionally not used on get_cache_obj, because an LRU-cached lookup could hand back a PersistentFileCache that has already been closed. Shard instances are tracked in a plain dict instead.
- Parameters:
cache_dir – Root directory that will hold shard subdirectories.
cache_limit – Maximum size in bytes per shard. Defaults to DEFAULT_SHARD_SIZE (1,000,000,000 bytes, i.e. 1 GB).
cached_objs_limit – Maximum number of shard objects kept open simultaneously. When the limit is exceeded, the least-recently-used open shard is closed.
Initialization
- DEFAULT_SHARD_SIZE: int
1000000000
- async async_delete(key: str) → None
Async-safe version of delete.
- Parameters:
key – Cache key to remove.
- async async_get(key: str, default: Any = None) → Any
Async-safe version of get.
- Parameters:
key – Cache key to look up.
default – Returned when absent.
- Returns:
The cached value or default.
- async async_pop(key: str, default: Any = MISSING) → Any
Async-safe version of pop.
- Parameters:
key – Cache key to retrieve and delete.
default – Returned when absent. Raises KeyError if omitted.
- Returns:
The cached value or default.
- Raises:
KeyError – When the key is missing and no default was provided.
- async async_set(key: str, value: Any, expiry: int | float | None = None) → None
Async-safe version of set.
- Parameters:
key – Cache key.
value – Value to persist.
expiry – TTL in seconds.
- close() → None
Close all open shard handles in a background daemon thread.
Returns immediately; actual closing happens asynchronously so that long-running shard close operations do not block the caller.
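A non-blocking close of this kind can be sketched with a daemon thread. `SlowHandle` and `close_all_in_background` are hypothetical helpers; how the real method tracks its shard handles is not shown in this reference.

```python
import threading
import time


class SlowHandle:
    """Hypothetical handle whose close() takes a noticeable amount of time."""

    def __init__(self):
        self.closed = False

    def close(self):
        time.sleep(0.05)
        self.closed = True


def close_all_in_background(handles):
    # Snapshot the handles so the caller can clear its own registry at once.
    pending = list(handles)

    def _worker():
        for handle in pending:
            handle.close()

    thread = threading.Thread(target=_worker, daemon=True)
    thread.start()
    return thread  # returned only so this demo can wait for completion


handles = [SlowHandle() for _ in range(3)]
worker = close_all_in_background(handles)
worker.join()  # demonstration only; a real caller would not wait
all_closed = all(handle.closed for handle in handles)
```

One consequence of the daemon flag is worth keeping in mind: Python does not wait for daemon threads at interpreter exit, so closes still pending at that moment may never run.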
- create_new_shard() → str
Create a new uniquely named shard directory inside cache_dir.
- Returns:
Absolute path string of the newly created shard directory.
- delete(key: str) → None
Delete key from every shard that holds it.
- Parameters:
key – Cache key to remove.
- get(key: str, default: Any = None) → Any
Search all shards from newest to oldest and return the first hit.
- Parameters:
key – Cache key to look up.
default – Returned when the key is not found in any shard.
- Returns:
The cached value or default.
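The newest-to-oldest lookup can be illustrated with plain dicts standing in for shards. The helper names and the oldest-first list ordering are assumptions made for this sketch.

```python
MISSING = object()  # local sentinel, so a stored None still counts as a hit


def get_from_shards(shards, key, default=None):
    """Return the first hit, scanning from the newest shard to the oldest.

    `shards` is assumed to be ordered oldest first, newest last.
    """
    for shard in reversed(shards):
        value = shard.get(key, MISSING)
        if value is not MISSING:
            return value
    return default


def delete_from_shards(shards, key):
    """Remove the key from every shard that holds it."""
    for shard in shards:
        shard.pop(key, None)


shards = [{"k": "old"}, {"k": "new", "other": None}, {}]
hit = get_from_shards(shards, "k")
stored_none = get_from_shards(shards, "other", default="fallback")
miss = get_from_shards(shards, "absent", default="fallback")

delete_from_shards(shards, "k")
after_delete = get_from_shards(shards, "k")
```

Scanning newest first is what makes a rewritten key return its latest value even though an older copy may still sit in an earlier shard.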
- get_shard(path: str) → duck.utils.caching.PersistentFileCache
Return an open PersistentFileCache for the given shard path.
Evicts the least-recently-used open shard when the open-shard limit is exceeded to avoid unbounded file-handle accumulation.
- Parameters:
path – Absolute shard directory path.
- Returns:
An open PersistentFileCache instance for that path.
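Bounding the number of open handles with least-recently-used eviction might look like the sketch below. `ShardRegistry` and `FakeShard` are hypothetical, and `OrderedDict` is used here for its `move_to_end` and `popitem` helpers; the class itself is documented as tracking shards in a plain dict.

```python
from collections import OrderedDict


class FakeShard:
    """Hypothetical stand-in for an open PersistentFileCache."""

    def __init__(self, path):
        self.path = path
        self.closed = False

    def close(self):
        self.closed = True


class ShardRegistry:
    def __init__(self, limit, factory=FakeShard):
        self.limit = limit
        self.factory = factory
        self.open_shards = OrderedDict()  # least recently used entry first

    def get_shard(self, path):
        shard = self.open_shards.get(path)
        if shard is None:
            shard = self.factory(path)
            self.open_shards[path] = shard
        self.open_shards.move_to_end(path)  # mark as most recently used
        while len(self.open_shards) > self.limit:
            _, evicted = self.open_shards.popitem(last=False)
            evicted.close()  # release the handle before dropping it
        return shard


registry = ShardRegistry(limit=2)
a = registry.get_shard("a")
b = registry.get_shard("b")
registry.get_shard("a")       # touch "a" so that "b" becomes the LRU entry
c = registry.get_shard("c")   # exceeds the limit and evicts "b"
still_open = list(registry.open_shards)
```

Closing the evicted shard before discarding it is the step that keeps file handles from accumulating.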
- get_writable_shard_path() → str
Return the path of a shard that has not yet reached cache_limit.
Creates a new shard directory if all existing shards are full.
- Returns:
Absolute path string of the target shard directory.
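Shard selection by size can be sketched with the standard library alone. The `shard-<uuid>` naming scheme and the directory-walk size calculation are assumptions; the reference does not state how the real class names shards or measures their size.

```python
import os
import tempfile
import uuid


def dir_size(path):
    """Total size in bytes of all files below path."""
    total = 0
    for root, _dirs, files in os.walk(path):
        for name in files:
            total += os.path.getsize(os.path.join(root, name))
    return total


def create_new_shard(cache_dir):
    # Hypothetical naming scheme: a random hex suffix keeps names unique.
    shard = os.path.join(cache_dir, f"shard-{uuid.uuid4().hex}")
    os.makedirs(shard)
    return os.path.abspath(shard)


def get_writable_shard_path(cache_dir, cache_limit):
    with os.scandir(cache_dir) as entries:
        for entry in entries:
            if entry.is_dir() and dir_size(entry.path) < cache_limit:
                return os.path.abspath(entry.path)
    return create_new_shard(cache_dir)  # every existing shard is full


with tempfile.TemporaryDirectory() as root:
    first = get_writable_shard_path(root, cache_limit=10)   # no shards yet
    with open(os.path.join(first, "blob.bin"), "wb") as fh:
        fh.write(b"x" * 20)                                  # fill past the limit
    second = get_writable_shard_path(root, cache_limit=10)  # new shard created
    reused = get_writable_shard_path(root, cache_limit=10)  # second has room
```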
- pop(key: str, default: Any = MISSING) → Any
Retrieve the first occurrence of key (newest shard first) and delete it from all shards atomically under the instance lock.
- Parameters:
key – Cache key to retrieve and delete.
default – Returned when absent. Raises KeyError if omitted.
- Returns:
The cached value, or default when absent.
- Raises:
KeyError – When the key is missing and no default was provided.
- class duck.utils.caching.InMemoryCache(maxkeys: int | None = None)
Bases: duck.utils.caching.CacheBase
Thread-safe in-memory cache with LRU eviction.
Entries are stored in an OrderedDict so that the least-recently-used key can be evicted in O(1) when the cache is full. An optional expiry map records per-key TTLs and is checked on every read.
- Parameters:
maxkeys – Maximum number of keys before LRU eviction kicks in. None means unbounded.
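The combination of an `OrderedDict` for recency and a separate expiry map can be sketched as follows. `TinyLRU` is a hypothetical, lock-free simplification; the monotonic-clock deadlines are an assumption about how TTLs might be stored.

```python
import time
from collections import OrderedDict


class TinyLRU:
    """Hypothetical, single-threaded sketch of LRU eviction plus per-key TTL."""

    def __init__(self, maxkeys=None):
        self.maxkeys = maxkeys
        self.cache = OrderedDict()  # least recently used key first
        self.expiry = {}            # key -> absolute deadline (monotonic seconds)

    def set(self, key, value, expiry=None):
        self.cache[key] = value
        self.cache.move_to_end(key)
        if expiry is None:
            self.expiry.pop(key, None)
        else:
            self.expiry[key] = time.monotonic() + expiry
        if self.maxkeys is not None:
            while len(self.cache) > self.maxkeys:
                oldest, _ = self.cache.popitem(last=False)  # O(1) eviction
                self.expiry.pop(oldest, None)

    def get(self, key, default=None):
        if key not in self.cache:
            return default
        deadline = self.expiry.get(key)
        if deadline is not None and time.monotonic() >= deadline:
            del self.cache[key]      # expired: evict on read
            del self.expiry[key]
            return default
        self.cache.move_to_end(key)  # a read refreshes recency
        return self.cache[key]


lru = TinyLRU(maxkeys=2)
lru.set("a", 1)
lru.set("b", 2)
lru.get("a")        # "b" is now the least recently used key
lru.set("c", 3)     # evicts "b"
remaining = list(lru.cache)

ttl = TinyLRU()
ttl.set("t", "x", expiry=0.01)
time.sleep(0.05)
expired = ttl.get("t", "gone")
```

A real thread-safe version would wrap each method body in the instance lock; that is omitted here to keep the eviction logic visible.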
Initialization
- _evict(key: str) → None
Remove a key from both the cache and the expiry map.
- Parameters:
key – Cache key to evict.
- _is_expired(key: str) → bool
Check whether a key has passed its expiry without side effects.
- Parameters:
key – Cache key to inspect.
- Returns:
True if the key has an expiry that has already elapsed.
- async async_delete(key: str) → None
Async-safe version of delete.
- Parameters:
key – Cache key to remove.
- async async_get(key: str, default: Any = None) → Any
Async-safe version of get.
- Parameters:
key – Cache key to look up.
default – Returned when absent or expired.
- Returns:
The cached value or default.
- async async_pop(key: str, default: Any = MISSING) → Any
Async-safe version of pop.
- Parameters:
key – Cache key to retrieve and delete.
default – Returned when absent. Raises KeyError if omitted.
- Returns:
The cached value or default.
- Raises:
KeyError – When the key is missing and no default was provided.
- async async_set(key: str, value: Any, expiry: int | float | None = None) → None
Async-safe version of set.
- Parameters:
key – Cache key.
value – Value to store.
expiry – TTL in seconds.
- delete(key: str) → None
Remove key from the cache. Silent if absent.
- Parameters:
key – Cache key to remove.
- get(key: str, default: Any = None) → Any
Retrieve the value for key, evicting it first if expired.
- Parameters:
key – Cache key to look up.
default – Returned when the key is absent or expired.
- Returns:
The cached value or default.
- has(key: str) → bool
Return True if key exists and has not expired.
- Parameters:
key – Cache key to probe.
- Returns:
True when the key is present and live.
- pop(key: str, default: Any = MISSING) → Any
Retrieve and atomically remove a value.
The entire read-then-delete sequence is held under one lock acquisition so no concurrent caller can observe the key between the two operations.
- Parameters:
key – Cache key to retrieve and delete.
default – Returned when the key is absent. Raises KeyError if omitted and the key does not exist.
- Returns:
The cached value, or default when absent.
- Raises:
KeyError – When the key is missing and no default was provided.
- class duck.utils.caching.KeyAsFolderCache(cache_dir: str, cached_objs_limit: int = DEFAULT_CACHE_OBJ_LIMIT)
Bases: duck.utils.caching.CacheBase
Persistent cache that stores each key’s data in a dedicated subdirectory named after the key itself.
This makes it trivially easy to inspect, backup, or delete a single cache entry on disk. Each subdirectory is managed by its own PersistentFileCache instance.
Unlike DynamicFileCache, the per-key shard mapping is rebuilt from disk on every operation via a fresh os.scandir(), so new entries written by other processes are always visible.
- Parameters:
cache_dir – Root directory under which per-key subdirectories will be created.
cached_objs_limit – Maximum number of PersistentFileCache instances to keep open at once.
Initialization
- DEFAULT_CACHE_OBJ_LIMIT: int
128
- _remove_key_dir(key_dir: str) → None
Close the shard for key_dir, evict it from open_shards, and remove the directory tree from disk.
- Parameters:
key_dir – Absolute path of the per-key directory to remove.
- async async_delete(key: str) → None
Async-safe version of delete.
- Parameters:
key – Cache key to remove.
- async async_get(key: str, default: Any = None) → Any
Async-safe version of get.
- Parameters:
key – Cache key to look up.
default – Returned when absent or expired.
- Returns:
The cached value or default.
- async async_pop(key: str, default: Any = MISSING) → Any
Async-safe version of pop.
- Parameters:
key – Cache key to retrieve and delete.
default – Returned when absent. Raises KeyError if omitted.
- Returns:
The cached value or default.
- Raises:
KeyError – When the key is missing and no default was provided.
- async async_set(key: str, value: Any, expiry: int | float | None = None) → None
Async-safe version of set.
- Parameters:
key – Cache key.
value – Value to persist.
expiry – TTL in seconds.
- delete(key: str) → None
Remove key’s subdirectory from disk. Silent if absent.
- Parameters:
key – Cache key to remove.
- get(key: str, default: Any = None) → Any
Retrieve the value stored under key.
Removes the on-disk subdirectory if the key has expired so that stale directories do not accumulate.
- Parameters:
key – Cache key to look up.
default – Returned when the key is absent or expired.
- Returns:
The cached value or default.
- get_key_dir(key: str) → str
Return the absolute path of the subdirectory for key.
- Parameters:
key – Cache key.
- Returns:
Absolute directory path string.
- get_shard(path: str) → duck.utils.caching.PersistentFileCache
Return an open PersistentFileCache for the given directory path.
Evicts the least-recently-used shard when the open-shard limit is exceeded.
- Parameters:
path – Absolute directory path for the shard.
- Returns:
An open PersistentFileCache for that path.
- live_key_dirs() → list[pathlib.Path]
Return a snapshot of all current per-key subdirectories.
Performs a fresh os.scandir() on every call so newly created keys (including those from other processes) are always included.
- Returns:
List of Path objects, one per existing key subdirectory.
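A fresh directory scan of this kind takes only a few lines. The standalone function below is a sketch, and it assumes that each key maps directly to a subdirectory of the same name; the real mapping performed by `get_key_dir` is not shown in this reference.

```python
import os
import tempfile
from pathlib import Path


def live_key_dirs(cache_dir):
    """Return a snapshot of the subdirectories currently under cache_dir."""
    with os.scandir(cache_dir) as entries:
        return [Path(entry.path) for entry in entries if entry.is_dir()]


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "session-a"))
    open(os.path.join(root, "stray.txt"), "w").close()  # plain files are ignored
    before = sorted(p.name for p in live_key_dirs(root))

    # Simulate another process adding a key between two calls.
    os.mkdir(os.path.join(root, "session-b"))
    after = sorted(p.name for p in live_key_dirs(root))
```

Because nothing is cached between calls, the second scan picks up the new directory without any invalidation step, at the cost of one directory listing per operation.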
- pop(key: str, default: Any = MISSING) → Any
Retrieve and atomically remove a value and its on-disk folder.
- Parameters:
key – Cache key to retrieve and delete.
default – Returned when absent. Raises KeyError if omitted.
- Returns:
The cached value or default.
- Raises:
KeyError – When the key is missing and no default was provided.
- duck.utils.caching.MISSING
‘object(…)’
- class duck.utils.caching.PersistentFileCache(path: str, cache_size: int | None = None)
Bases: duck.utils.caching.CacheBase
Persistent file-backed cache powered by the diskcache library.
All operations are serialised behind a threading.RLock so the underlying SQLite connection is never accessed from two threads simultaneously. diskcache itself is thread-safe, but the RLock also makes our own pop() atomic at the Python level.
- Parameters:
path – Directory path used as the cache store.
cache_size – Maximum size in bytes. None means unlimited.
Initialization
- async async_delete(key: str) → None
Async-safe version of delete.
- Parameters:
key – Cache key to remove.
- async async_get(key: str, default: Any = None) → Any
Async-safe version of get.
- Parameters:
key – Cache key to look up.
default – Returned when absent or expired.
- Returns:
The cached value or default.
- async async_pop(key: str, default: Any = MISSING) → Any
Async-safe version of pop.
- Parameters:
key – Cache key to retrieve and delete.
default – Returned when absent. Raises KeyError if omitted.
- Returns:
The cached value or default.
- Raises:
KeyError – When the key is missing and no default was provided.
- async async_set(key: str, value: Any, expiry: int | float | None = None) → None
Async-safe version of set.
- Parameters:
key – Cache key.
value – Value to persist.
expiry – TTL in seconds.
- clear() → None
Evict all entries from the cache.
- Raises:
RuntimeError – When the cache is closed.
- delete(key: str) → None
Remove key from the cache. Silent if absent.
- Parameters:
key – Cache key to remove.
- Raises:
RuntimeError – When the cache is closed.
- get(key: str, default: Any = None) → Any
Retrieve the value stored under key.
- Parameters:
key – Cache key (must be a str).
default – Returned when the key is absent or expired.
- Returns:
The cached value or default.
- Raises:
KeyError – When key is not a string.
RuntimeError – When the cache is closed.
- pop(key: str, default: Any = MISSING) → Any
Retrieve and atomically remove a value.
Uses diskcache’s own pop(), which is atomic at the SQLite level, then applies the KeyError / default handling described below.
- Parameters:
key – Cache key to retrieve and delete.
default – Returned when absent. Raises KeyError if omitted.
- Returns:
The cached value or default.
- Raises:
KeyError – When the key is missing and no default was provided.
RuntimeError – When the cache is closed.
- set(key: str, value: Any, expiry: int | float | None = None) → None
Store a value under key with an optional TTL.
- Parameters:
key – Cache key (must be a str).
value – Value to persist.
expiry – Seconds until the entry expires.
- Raises:
KeyError – When key is not a string.
RuntimeError – When the cache is closed.
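The two documented failure modes, a non-string key and a closed cache, can be sketched as guard checks run under the lock. `GuardedStore` is a hypothetical dict-backed stand-in; the order of the checks and the error messages are assumptions, and the TTL handling is left out.

```python
import threading


class GuardedStore:
    """Hypothetical stand-in; the documented class delegates to diskcache."""

    def __init__(self):
        self._lock = threading.RLock()
        self._data = {}
        self._closed = False

    def _check(self, key):
        if self._closed:
            raise RuntimeError("cache is closed")
        if not isinstance(key, str):
            # KeyError mirrors the exception type listed in the reference above.
            raise KeyError(f"cache keys must be str, got {type(key).__name__}")

    def set(self, key, value):
        with self._lock:
            self._check(key)
            self._data[key] = value

    def close(self):
        with self._lock:
            self._closed = True


store = GuardedStore()
store.set("a", 1)

try:
    store.set(42, "x")
    bad_key_error = None
except KeyError as exc:
    bad_key_error = exc

store.close()
try:
    store.set("b", 2)
    closed_error = None
except RuntimeError as exc:
    closed_error = exc
```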