duck.utils.caching

Caching module built on top of the diskcache library.

Provides a hierarchy of cache backends ranging from pure in-memory LRU caches to persistent file-backed caches with dynamic sharding and key-as-folder storage strategies.

Every public method is safe to call from multiple threads or async tasks simultaneously. Sync callers use threading.RLock; async callers use asyncio.Lock via the async_* variants exposed on each class.

Submodules

Package Contents

Classes

CacheBase

Abstract base class that all cache backends must implement.

CacheSpeedTest

This class performs speed test of Cache classes.

DynamicFileCache

Sharded persistent cache that automatically creates new shard directories when existing ones reach a configured size limit.

InMemoryCache

Thread-safe in-memory cache with LRU eviction.

KeyAsFolderCache

Persistent cache that stores each key’s data in a dedicated subdirectory named after the key itself.

PersistentFileCache

Persistent file-backed cache powered by the diskcache library.

Data

MISSING

API

class duck.utils.caching.CacheBase[source]

Abstract base class that all cache backends must implement.

Subclasses must override set, get, delete, pop, and clear. The save() hook is optional and is a no-op by default.

Locking strategy: Each subclass owns a threading.RLock (sync) and an asyncio.Lock (async). The RLock is reentrant so that methods which call other locking methods on the same instance do not deadlock.

abstractmethod clear() None[source]

Evict all entries from the cache.

abstractmethod delete(key: str) None[source]

Remove key from the cache. Silent if the key does not exist.

Parameters:

key – Cache key to remove.

abstractmethod get(key: str, default: Any = None) Any[source]

Retrieve the value stored under key.

Parameters:
  • key – Cache key to look up.

  • default – Returned when the key is absent or expired.

Returns:

The cached value or default.

abstractmethod pop(key: str, default: Any = MISSING) Any[source]

Retrieve and atomically remove a value from the cache.

Parameters:
  • key – Cache key to retrieve and delete.

  • default – Returned when the key is absent. Raises KeyError if omitted and the key does not exist.

Returns:

The cached value, or default when the key is absent.

Raises:

KeyError – When the key is missing and no default was given.

save()[source]

Optional persistence hook. No-op unless overridden.

abstractmethod set(key: str, value: Any, expiry: int | float | None = None) None[source]

Store a value under key with an optional TTL in seconds.

Parameters:
  • key – Cache key.

  • value – Value to store.

  • expiry – Seconds until the entry expires. None means no expiry.

class duck.utils.caching.CacheSpeedTest(repeat: int = 1)[source]

This class performs speed test of Cache classes.

Initialization

compare_performance()[source]
execute_all()[source]
static generate_random_string(length)[source]
instances

None

print_summary()[source]
run_test(instance)[source]
test_clear(instance)[source]
test_create(instance)[source]
test_del(instance)[source]
test_get(instance)[source]
test_set(instance)[source]
class duck.utils.caching.DynamicFileCache(cache_dir: str, cache_limit: int = DEFAULT_SHARD_SIZE, cached_objs_limit: int = 128)[source]

Bases: duck.utils.caching.CacheBase

Sharded persistent cache that automatically creates new shard directories when existing ones reach a configured size limit.

Each shard is a PersistentFileCache instance. Shards are created lazily and are never deleted. Reads scan shards from newest to oldest so the most-recently-written value is returned first.

The lru_cache on get_cache_obj is intentionally not used here because closed PersistentFileCache objects must not be returned from a cache. Shard instances are tracked in a plain dict instead.

Parameters:
  • cache_dir – Root directory that will hold shard subdirectories.

  • cache_limit – Maximum size in bytes per shard. Defaults to 1 GB.

  • cached_objs_limit – Maximum number of shard objects kept open simultaneously. Oldest is closed when the limit is exceeded.

Initialization

DEFAULT_SHARD_SIZE: int

1000000000

async async_clear() None[source]

Async-safe version of clear.

async async_delete(key: str) None[source]

Async-safe version of delete.

Parameters:

key – Cache key to remove.

async async_get(key: str, default: Any = None) Any[source]

Async-safe version of get.

Parameters:
  • key – Cache key to look up.

  • default – Returned when absent.

Returns:

The cached value or default.

async async_pop(key: str, default: Any = MISSING) Any[source]

Async-safe version of pop.

Parameters:
  • key – Cache key to retrieve and delete.

  • default – Returned when absent. Raises KeyError if omitted.

Returns:

The cached value or default.

Raises:

KeyError – When the key is missing and no default was provided.

async async_set(key: str, value: Any, expiry: int | float | None = None) None[source]

Async-safe version of set.

Parameters:
  • key – Cache key.

  • value – Value to persist.

  • expiry – TTL in seconds.

clear() None[source]

Evict all entries from every shard.

close() None[source]

Close all open shard handles in a background daemon thread.

Returns immediately; actual closing happens asynchronously so that long-running shard close operations do not block the caller.

create_new_shard() str[source]

Create a new uniquely named shard directory inside cache_dir.

Returns:

Absolute path string of the newly created shard directory.

delete(key: str) None[source]

Delete key from every shard that holds it.

Parameters:

key – Cache key to remove.

get(key: str, default: Any = None) Any[source]

Search all shards from newest to oldest and return the first hit.

Parameters:
  • key – Cache key to look up.

  • default – Returned when the key is not found in any shard.

Returns:

The cached value or default.

get_shard(path: str) duck.utils.caching.PersistentFileCache[source]

Return an open PersistentFileCache for the given shard path.

Evicts the least-recently-used open shard when the open-shard limit is exceeded to avoid unbounded file-handle accumulation.

Parameters:

path – Absolute shard directory path.

Returns:

An open PersistentFileCache instance for that path.

get_writable_shard_path() str[source]

Return the path of a shard that has not yet reached cache_limit.

Creates a new shard directory if all existing shards are full.

Returns:

Absolute path string of the target shard directory.

pop(key: str, default: Any = MISSING) Any[source]

Retrieve the first occurrence of key (newest shard first) and delete it from all shards atomically under the instance lock.

Parameters:
  • key – Cache key to retrieve and delete.

  • default – Returned when absent. Raises KeyError if omitted.

Returns:

The cached value, or default when absent.

Raises:

KeyError – When the key is missing and no default was provided.

reload_shard_paths() None[source]

Scan cache_dir and refresh the ordered list of shard paths.

Called at construction time and whenever a new shard is created so the shard list is always current.

set(key: str, value: Any, expiry: int | float | None = None) None[source]

Write key to the current writable shard.

Parameters:
  • key – Cache key.

  • value – Value to persist.

  • expiry – TTL in seconds.

class duck.utils.caching.InMemoryCache(maxkeys: int | None = None)[source]

Bases: duck.utils.caching.CacheBase

Thread-safe in-memory cache with LRU eviction.

Entries are stored in an OrderedDict so that the least-recently-used key can be evicted in O(1) when the cache is full. An optional expiry map records per-key TTLs and is checked on every read.

Parameters:

maxkeys – Maximum number of keys before LRU eviction kicks in. None means unbounded.

Initialization

_evict(key: str) None[source]

Remove a key from both the cache and the expiry map.

Parameters:

key – Cache key to evict.

_is_expired(key: str) bool[source]

Check whether a key has passed its expiry without side effects.

Parameters:

key – Cache key to inspect.

Returns:

True if the key has an expiry that has already elapsed.

async async_clear() None[source]

Async-safe version of clear.

async async_delete(key: str) None[source]

Async-safe version of delete.

Parameters:

key – Cache key to remove.

async async_get(key: str, default: Any = None) Any[source]

Async-safe version of get.

Parameters:
  • key – Cache key to look up.

  • default – Returned when absent or expired.

Returns:

The cached value or default.

async async_pop(key: str, default: Any = MISSING) Any[source]

Async-safe version of pop.

Parameters:
  • key – Cache key to retrieve and delete.

  • default – Returned when absent. Raises KeyError if omitted.

Returns:

The cached value or default.

Raises:

KeyError – When the key is missing and no default was provided.

async async_set(key: str, value: Any, expiry: int | float | None = None) None[source]

Async-safe version of set.

Parameters:
  • key – Cache key.

  • value – Value to store.

  • expiry – TTL in seconds.

clear() None[source]

Evict all entries from the cache.

close() None[source]

Release all resources held by this cache instance.

delete(key: str) None[source]

Remove key from the cache. Silent if absent.

Parameters:

key – Cache key to remove.

get(key: str, default: Any = None) Any[source]

Retrieve the value for key, evicting it first if expired.

Parameters:
  • key – Cache key to look up.

  • default – Returned when the key is absent or expired.

Returns:

The cached value or default.

has(key: str) bool[source]

Return True if key exists and has not expired.

Parameters:

key – Cache key to probe.

Returns:

True when the key is present and live.

pop(key: str, default: Any = MISSING) Any[source]

Retrieve and atomically remove a value.

The entire read-then-delete sequence is held under one lock acquisition so no concurrent caller can observe the key between the two operations.

Parameters:
  • key – Cache key to retrieve and delete.

  • default – Returned when the key is absent. Raises KeyError if omitted and the key does not exist.

Returns:

The cached value, or default when absent.

Raises:

KeyError – When the key is missing and no default was provided.

set(key: str, value: Any, expiry: int | float | None = None) None[source]

Insert or update key with an optional TTL.

Parameters:
  • key – Cache key.

  • value – Value to store.

  • expiry – Seconds until expiry. None means the entry lives forever.

class duck.utils.caching.KeyAsFolderCache(cache_dir: str, cached_objs_limit: int = DEFAULT_CACHE_OBJ_LIMIT)[source]

Bases: duck.utils.caching.CacheBase

Persistent cache that stores each key’s data in a dedicated subdirectory named after the key itself.

This makes it trivially easy to inspect, backup, or delete a single cache entry on disk. Each subdirectory is managed by its own PersistentFileCache instance.

Unlike DynamicFileCache, the per-key shard mapping is rebuilt from disk on every operation via a fresh os.scandir(), so new entries written by other processes are always visible.

Parameters:
  • cache_dir – Root directory under which per-key subdirectories will be created.

  • cached_objs_limit – Maximum number of PersistentFileCache instances to keep open at once.

Initialization

DEFAULT_CACHE_OBJ_LIMIT: int

128

_remove_key_dir(key_dir: str) None[source]

Close the shard for key_dir, evict it from open_shards, and remove the directory tree from disk.

Parameters:

key_dir – Absolute path of the per-key directory to remove.

async async_clear() None[source]

Async-safe version of clear.

async async_delete(key: str) None[source]

Async-safe version of delete.

Parameters:

key – Cache key to remove.

async async_get(key: str, default: Any = None) Any[source]

Async-safe version of get.

Parameters:
  • key – Cache key to look up.

  • default – Returned when absent or expired.

Returns:

The cached value or default.

async async_pop(key: str, default: Any = MISSING) Any[source]

Async-safe version of pop.

Parameters:
  • key – Cache key to retrieve and delete.

  • default – Returned when absent. Raises KeyError if omitted.

Returns:

The cached value or default.

Raises:

KeyError – When the key is missing and no default was provided.

async async_set(key: str, value: Any, expiry: int | float | None = None) None[source]

Async-safe version of set.

Parameters:
  • key – Cache key.

  • value – Value to persist.

  • expiry – TTL in seconds.

clear() None[source]

Evict all entries by clearing every per-key shard.

close() None[source]

Close all open shard handles in a background daemon thread.

delete(key: str) None[source]

Remove key’s subdirectory from disk. Silent if absent.

Parameters:

key – Cache key to remove.

get(key: str, default: Any = None) Any[source]

Retrieve the value stored under key.

Removes the on-disk subdirectory if the key has expired so that stale directories do not accumulate.

Parameters:
  • key – Cache key to look up.

  • default – Returned when the key is absent or expired.

Returns:

The cached value or default.

get_key_dir(key: str) str[source]

Return the absolute path of the subdirectory for key.

Parameters:

key – Cache key.

Returns:

Absolute directory path string.

get_shard(path: str) duck.utils.caching.PersistentFileCache[source]

Return an open PersistentFileCache for the given directory path.

Evicts the least-recently-used shard when the open-shard limit is exceeded.

Parameters:

path – Absolute directory path for the shard.

Returns:

An open PersistentFileCache for that path.

live_key_dirs() list[pathlib.Path][source]

Return a snapshot of all current per-key subdirectories.

Performs a fresh os.scandir() on every call so newly created keys (including those from other processes) are always included.

Returns:

List of Path objects, one per existing key subdirectory.

pop(key: str, default: Any = MISSING) Any[source]

Retrieve and atomically remove a value and its on-disk folder.

Parameters:
  • key – Cache key to retrieve and delete.

  • default – Returned when absent. Raises KeyError if omitted.

Returns:

The cached value or default.

Raises:

KeyError – When the key is missing and no default was provided.

set(key: str, value: Any, expiry: int | float | None = None) None[source]

Store value in a subdirectory named after key.

Creates the subdirectory if it does not already exist.

Parameters:
  • key – Cache key (used as the subdirectory name).

  • value – Value to persist.

  • expiry – TTL in seconds.

duck.utils.caching.MISSING

‘object(…)’

class duck.utils.caching.PersistentFileCache(path: str, cache_size: int | None = None)[source]

Bases: duck.utils.caching.CacheBase

Persistent file-backed cache powered by the diskcache library.

All operations are serialised behind a threading.RLock so the underlying SQLite connection is never accessed from two threads simultaneously. diskcache itself is thread-safe, but the RLock also makes our own pop() atomic at the Python level.

Parameters:
  • path – Directory path used as the cache store.

  • cache_size – Maximum size in bytes. None means unlimited.

Initialization

_require_open() None[source]

Guard that raises RuntimeError when the cache has been closed.

async async_clear() None[source]

Async-safe version of clear.

async async_delete(key: str) None[source]

Async-safe version of delete.

Parameters:

key – Cache key to remove.

async async_get(key: str, default: Any = None) Any[source]

Async-safe version of get.

Parameters:
  • key – Cache key to look up.

  • default – Returned when absent or expired.

Returns:

The cached value or default.

async async_pop(key: str, default: Any = MISSING) Any[source]

Async-safe version of pop.

Parameters:
  • key – Cache key to retrieve and delete.

  • default – Returned when absent. Raises KeyError if omitted.

Returns:

The cached value or default.

Raises:

KeyError – When the key is missing and no default was provided.

async async_set(key: str, value: Any, expiry: int | float | None = None) None[source]

Async-safe version of set.

Parameters:
  • key – Cache key.

  • value – Value to persist.

  • expiry – TTL in seconds.

clear() None[source]

Evict all entries from the cache.

Raises:

RuntimeError – When the cache is closed.

close() None[source]

Flush pending writes and close the underlying diskcache store.

delete(key: str) None[source]

Remove key from the cache. Silent if absent.

Parameters:

key – Cache key to remove.

Raises:

RuntimeError – When the cache is closed.

get(key: str, default: Any = None) Any[source]

Retrieve the value stored under key.

Parameters:
  • key – Cache key (must be a str).

  • default – Returned when the key is absent or expired.

Returns:

The cached value or default.

Raises:
  • KeyError – When key is not a string.

  • RuntimeError – When the cache is closed.

pop(key: str, default: Any = MISSING) Any[source]

Retrieve and atomically remove a value.

Uses diskcache’s own pop() which is atomic at the SQLite level, then falls back to KeyError / default handling.

Parameters:
  • key – Cache key to retrieve and delete.

  • default – Returned when absent. Raises KeyError if omitted.

Returns:

The cached value or default.

Raises:
  • KeyError – When the key is missing and no default was provided.

  • RuntimeError – When the cache is closed.

set(key: str, value: Any, expiry: int | float | None = None) None[source]

Store a value under key with an optional TTL.

Parameters:
  • key – Cache key (must be a str).

  • value – Value to persist.

  • expiry – Seconds until the entry expires.

Raises:
  • KeyError – When key is not a string.

  • RuntimeError – When the cache is closed.