Welcome to caching’s documentation!

caching (Caching Module)

Author:

Description:

This module provides functions and classes for caching, e.g. for caching the properties of other instances.

Submodules:

Unittest:

See also the unittest documentation.

class caching.property_cache_json(source_instance, cache_filename, load_all_on_init=False, callback_on_data_storage=None, max_age=None, store_on_get=True)

See the parent class property_cache_pickle for detailed information.

Important

  • This class uses json. You should only use keys of type string!

  • Unicode types are converted to strings.

See limitations of json.
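
The restriction to string keys follows from JSON itself: object keys are always strings, so a non-string key does not survive a round trip through a JSON file. A minimal illustration using only the standard-library json module (the dictionary is made up for this example and is not taken from the caching module):

```python
import json

# A dictionary with an integer key, as a source instance might provide it.
original = {1: 'one', 'two': 2}

# json.dumps() writes the integer key as the string "1",
# so json.loads() returns a dictionary with the key '1'.
restored = json.loads(json.dumps(original))

print(restored)         # {'1': 'one', 'two': 2}
print(1 in restored)    # False
print('1' in restored)  # True
```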

Example:

#!/usr/bin/env python
# -*- coding: UTF-8 -*-

import caching
import logging
import sys
import time


class test_slow_data(object):
    DATA_VERSION = 0.1
    KEY_ONE = '1'
    KEY_TWO = '2'
    KEY_THREE = 'three'
    KEY_FOUR = 'four'
    KEY_FIVE = 'five'
    KEYS = [KEY_ONE, KEY_TWO, KEY_THREE, KEY_FOUR, KEY_FIVE]

    def data_version(self):
        return self.DATA_VERSION

    def get(self, key, default=None):
        try:
            return getattr(self, f'__{key}__')()
        except AttributeError:
            return default

    def keys(self):
        return self.KEYS

    def uid(self):
        return None

    def print_n_sleep(self, k):
        sys.stdout.write('slow get executed for %s\n' % k)
        time.sleep(3)

    def __1__(self):
        self.print_n_sleep("__1__")
        return 'one'

    def __2__(self):
        self.print_n_sleep("__2__")
        return 'two'

    def __three__(self):
        self.print_n_sleep("__three__")
        return 'three'

    def __four__(self):
        self.print_n_sleep("__four__")
        return 'four'

    def __five__(self):
        self.print_n_sleep("__five__")
        return 'five'

class test_slow_data_cached(test_slow_data):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self._cache = caching.property_cache_json(test_slow_data(), 'cache.json')
        self._cache.add_source_get_keys(self.KEY_THREE)

    def get(self, key, default=None):
        return self._cache.get(key, default)


if __name__ == "__main__":
    # Logging configuration
    logging.basicConfig(
        format="%(asctime)s: %(levelname)8s - %(name)s - %(message)s",
        level=logging.DEBUG,
        stream=sys.stdout
    )
    # Example
    tm = time.time()
    data = test_slow_data_cached()
    print('Testing property_cache (json):\n--------------------------------')
    for key in data.keys():
        print(data.get(key))
    print('--------------------------------\nThe execution time was %.1fs' % (time.time()-tm))

The first execution produces the following output (with a long execution time):

Testing property_cache (json):
--------------------------------
2024-09-22 11:01:08,845:    DEBUG - caching - Cache file does not exists (yet).
2024-09-22 11:01:08,846:    DEBUG - caching - cache-file stored (cache.json)
2024-09-22 11:01:08,846:    DEBUG - caching - Loading property for key='1' from source instance
slow get executed for __1__
2024-09-22 11:01:11,847:    DEBUG - caching - Adding key=1, value=one with timestamp=1726995671 to chache
2024-09-22 11:01:11,847:    DEBUG - caching - cache-file stored (cache.json)
one
2024-09-22 11:01:11,848:    DEBUG - caching - Loading property for key='2' from source instance
slow get executed for __2__
2024-09-22 11:01:14,849:    DEBUG - caching - Adding key=2, value=two with timestamp=1726995674 to chache
2024-09-22 11:01:14,850:    DEBUG - caching - cache-file stored (cache.json)
two
2024-09-22 11:01:14,850:    DEBUG - caching - Key 'three' is excluded by .add_source_get_keys(). Uncached data will be returned.
slow get executed for __three__
three
2024-09-22 11:01:17,851:    DEBUG - caching - Loading property for key='four' from source instance
slow get executed for __four__
2024-09-22 11:01:20,851:    DEBUG - caching - Adding key=four, value=four with timestamp=1726995680 to chache
2024-09-22 11:01:20,853:    DEBUG - caching - cache-file stored (cache.json)
four
2024-09-22 11:01:20,854:    DEBUG - caching - Loading property for key='five' from source instance
slow get executed for __five__
2024-09-22 11:01:23,854:    DEBUG - caching - Adding key=five, value=five with timestamp=1726995683 to chache
2024-09-22 11:01:23,855:    DEBUG - caching - cache-file stored (cache.json)
five
--------------------------------
The execution time was 15.0s

On every subsequent execution, the time consumption may be much smaller:

Testing property_cache (json):
--------------------------------
2024-09-22 11:01:23,983:    DEBUG - caching - Loading properties from cache (cache.json)
2024-09-22 11:01:23,984:    DEBUG - caching - Providing property for '1' from cache
one
2024-09-22 11:01:23,984:    DEBUG - caching - Providing property for '2' from cache
two
2024-09-22 11:01:23,984:    DEBUG - caching - Key 'three' is excluded by .add_source_get_keys(). Uncached data will be returned.
slow get executed for __three__
three
2024-09-22 11:01:26,984:    DEBUG - caching - Providing property for 'four' from cache
four
2024-09-22 11:01:26,985:    DEBUG - caching - Providing property for 'five' from cache
five
--------------------------------
The execution time was 3.0s

class caching.property_cache_pickle(source_instance, cache_filename, load_all_on_init=False, callback_on_data_storage=None, max_age=None, store_on_get=True)

This class caches the data from a given source_instance. If the conditions for cache usage are met, it takes the data from the cache instead of generating it from the source_instance.

Required properties for the source_instance

  • uid(): returns the unique id of the source, or None if you don’t want to use a unique id.

  • keys(): returns a list of all available keys.

  • data_version(): returns a version number of the current data (it should be increased if the get method of the source instance returns improved values or if the data structure has changed).

  • get(key, default): returns the property for a key. If key does not exist, default will be returned.
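
A source instance that satisfies these four requirements could look like the following sketch. The class name file_source, its keys, and the way uid() is built from the path, size and modification time of a file are assumptions made for illustration; none of them is part of the caching module.

```python
import hashlib
import os
import tempfile


class file_source(object):
    """Hypothetical source instance exposing properties of a single file."""
    DATA_VERSION = 1
    KEYS = ['size', 'sha256']

    def __init__(self, filename):
        self._filename = filename

    def uid(self):
        # Describes the current state of the file; it changes when the file is modified.
        st = os.stat(self._filename)
        return '%s|%d|%d' % (os.path.abspath(self._filename), st.st_size, st.st_mtime_ns)

    def keys(self):
        return self.KEYS

    def data_version(self):
        return self.DATA_VERSION

    def get(self, key, default=None):
        if key == 'size':
            return os.path.getsize(self._filename)
        if key == 'sha256':
            with open(self._filename, 'rb') as fh:
                return hashlib.sha256(fh.read()).hexdigest()
        # Unknown keys fall back to the given default.
        return default


# Usage with a temporary file containing five bytes.
with tempfile.NamedTemporaryFile(delete=False) as fh:
    fh.write(b'hello')
src = file_source(fh.name)
size = src.get('size')
digest = src.get('sha256')
missing = src.get('unknown', 'fallback')
os.remove(fh.name)
print(size, missing)  # 5 fallback
```

Such an instance could then be passed as source_instance to one of the cache classes described here.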

Parameters:
  • source_instance (instance) – The source instance holding the data

  • cache_filename (str) – File name, where the properties are stored as cache

  • load_all_on_init (bool) – True loads all data from the source instance when the cache is initialised for the first time.

  • callback_on_data_storage (method) – The callback is executed every time the cache file is stored. It is called with the instance of this class as its first argument.

  • max_age (int or None) – The maximum age of the cached data in seconds or None for no maximum age.

  • store_on_get (bool) – False prevents the cache from being stored when the .get(key, default) method is executed. In that case the cache has to be stored by other means, e.g. by calling full_update().

The cache will be used if all of the following conditions are met:

  • The key is in the list returned by .keys() method of the source_instance

  • The key is not in the list of keys added by the .add_source_get_keys() method.

  • The cache age is less than the given max_age parameter or the given max_age is None.

  • The uid of the source instance (e.g. a checksum or unique id of the source) is identical to the uid stored in the cache.

  • The data version of the source_instance is <= the data version stored in the cache.

  • The value is available in the previously stored information.
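
The conditions above can be read as a single predicate. The following is a sketch only, not the implementation used by the caching module: the function name cache_is_usable, its parameters and the dictionary layout of cached are all hypothetical and chosen to mirror the list.

```python
import time


def cache_is_usable(key, source_keys, source_get_keys, cached,
                    source_uid, source_version, max_age=None, now=None):
    """Return True if all listed conditions hold for key.

    cached uses an assumed layout:
    {'uid': ..., 'data_version': ..., 'data': {key: (timestamp, value)}}
    """
    now = time.time() if now is None else now
    # The key must be known to the source instance.
    if key not in source_keys:
        return False
    # Keys added via add_source_get_keys() always bypass the cache.
    if key in source_get_keys:
        return False
    # The uid of the source must match the uid stored in the cache.
    if source_uid != cached.get('uid'):
        return False
    # The source data version must be <= the cached data version.
    cached_version = cached.get('data_version')
    if cached_version is None or source_version > cached_version:
        return False
    # A value must have been stored previously.
    entry = cached.get('data', {}).get(key)
    if entry is None:
        return False
    # The cached value must be younger than max_age (if max_age is given).
    timestamp, _value = entry
    if max_age is not None and now - timestamp >= max_age:
        return False
    return True


cached = {'uid': None, 'data_version': 0.1, 'data': {'1': (1000, 'one')}}
keys = ['1', '2', 'three']
print(cache_is_usable('1', keys, ['three'], cached, None, 0.1, now=1010))             # True
print(cache_is_usable('1', keys, ['three'], cached, None, 0.1, max_age=5, now=1010))  # False (too old)
print(cache_is_usable('three', keys, ['three'], cached, None, 0.1, now=1010))         # False (excluded key)
print(cache_is_usable('2', keys, ['three'], cached, None, 0.1, now=1010))             # False (no stored value)
print(cache_is_usable('1', keys, ['three'], cached, None, 0.2, now=1010))             # False (newer data version)
```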

Example:

#!/usr/bin/env python
# -*- coding: UTF-8 -*-

import caching
import logging
import sys
import time


class test_slow_data(object):
    DATA_VERSION = 0.1
    KEY_ONE = '1'
    KEY_TWO = '2'
    KEY_THREE = 'three'
    KEY_FOUR = 'four'
    KEY_FIVE = 'five'
    KEYS = [KEY_ONE, KEY_TWO, KEY_THREE, KEY_FOUR, KEY_FIVE]

    def data_version(self):
        return self.DATA_VERSION

    def get(self, key, default=None):
        try:
            return getattr(self, f'__{key}__')()
        except AttributeError:
            return default

    def keys(self):
        return self.KEYS

    def uid(self):
        return None

    def print_n_sleep(self, k):
        sys.stdout.write('slow get executed for %s\n' % k)
        time.sleep(3)

    def __1__(self):
        self.print_n_sleep("__1__")
        return 'one'

    def __2__(self):
        self.print_n_sleep("__2__")
        return 'two'

    def __three__(self):
        self.print_n_sleep("__three__")
        return 'three'

    def __four__(self):
        self.print_n_sleep("__four__")
        return 'four'

    def __five__(self):
        self.print_n_sleep("__five__")
        return 'five'

class test_slow_data_cached(test_slow_data):
    def __init__(self, *args, **kwargs) -> None:
        super().__init__(*args, **kwargs)
        self._cache = caching.property_cache_pickle(test_slow_data(), 'cache.pickle')
        self._cache.add_source_get_keys(self.KEY_THREE)

    def get(self, key, default=None):
        return self._cache.get(key, default)


if __name__ == "__main__":
    # Logging configuration
    logging.basicConfig(
        format="%(asctime)s: %(levelname)8s - %(name)s - %(message)s",
        level=logging.DEBUG,
        stream=sys.stdout
    )
    # Example
    tm = time.time()
    data = test_slow_data_cached()
    print('Testing property_cache (pickle):\n--------------------------------')
    for key in data.keys():
        print(data.get(key))
    print('--------------------------------\nThe execution time was %.1fs' % (time.time()-tm))

The first execution produces the following output (with a long execution time):

Testing property_cache (pickle):
--------------------------------
2024-09-22 11:01:27,126:    DEBUG - caching - Cache file does not exists (yet).
2024-09-22 11:01:27,127:    DEBUG - caching - cache-file stored (cache.pickle)
2024-09-22 11:01:27,127:    DEBUG - caching - Loading property for key='1' from source instance
slow get executed for __1__
2024-09-22 11:01:30,128:    DEBUG - caching - Adding key=1, value=one with timestamp=1726995690 to chache
2024-09-22 11:01:30,129:    DEBUG - caching - cache-file stored (cache.pickle)
one
2024-09-22 11:01:30,129:    DEBUG - caching - Loading property for key='2' from source instance
slow get executed for __2__
2024-09-22 11:01:33,130:    DEBUG - caching - Adding key=2, value=two with timestamp=1726995693 to chache
2024-09-22 11:01:33,131:    DEBUG - caching - cache-file stored (cache.pickle)
two
2024-09-22 11:01:33,132:    DEBUG - caching - Key 'three' is excluded by .add_source_get_keys(). Uncached data will be returned.
slow get executed for __three__
three
2024-09-22 11:01:36,133:    DEBUG - caching - Loading property for key='four' from source instance
slow get executed for __four__
2024-09-22 11:01:39,134:    DEBUG - caching - Adding key=four, value=four with timestamp=1726995699 to chache
2024-09-22 11:01:39,135:    DEBUG - caching - cache-file stored (cache.pickle)
four
2024-09-22 11:01:39,136:    DEBUG - caching - Loading property for key='five' from source instance
slow get executed for __five__
2024-09-22 11:01:42,136:    DEBUG - caching - Adding key=five, value=five with timestamp=1726995702 to chache
2024-09-22 11:01:42,137:    DEBUG - caching - cache-file stored (cache.pickle)
five
--------------------------------
The execution time was 15.0s

On every subsequent execution, the time consumption may be much smaller:

Testing property_cache (pickle):
--------------------------------
2024-09-22 11:01:42,204:    DEBUG - caching - Loading properties from cache (cache.pickle)
2024-09-22 11:01:42,204:    DEBUG - caching - Providing property for '1' from cache
one
2024-09-22 11:01:42,205:    DEBUG - caching - Providing property for '2' from cache
two
2024-09-22 11:01:42,205:    DEBUG - caching - Key 'three' is excluded by .add_source_get_keys(). Uncached data will be returned.
slow get executed for __three__
three
2024-09-22 11:01:45,205:    DEBUG - caching - Providing property for 'four' from cache
four
2024-09-22 11:01:45,206:    DEBUG - caching - Providing property for 'five' from cache
five
--------------------------------
The execution time was 3.0s

add_source_get_keys(keys)

This will add one or more keys to a list of keys which will always be provided by the source_instance instead of the cache.

Parameters:

keys (list, tuple, str) – The key or keys to be added
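
Because the parameter accepts a single string as well as a list or tuple, an implementation has to distinguish the two cases; a string is itself iterable, so a naive list('three') would split it into characters. The helper below is a hypothetical sketch of such a normalisation and is not taken from the caching module.

```python
def normalise_keys(keys):
    """Return keys as a list, accepting a single key as well as a list or tuple."""
    if isinstance(keys, (list, tuple)):
        return list(keys)
    # Anything else (e.g. a str) is treated as one single key.
    return [keys]


print(normalise_keys('three'))     # ['three']
print(normalise_keys(('1', '2')))  # ['1', '2']
print(list('three'))               # ['t', 'h', 'r', 'e', 'e'] (what the check avoids)
```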

full_update(sleep_between_keys=0)

This method reads all source data that needs to be cached from the source instance and stores the resulting cache in the given file.

Parameters:

sleep_between_keys (float, int) – Time to sleep between each source data generation

Hint

Use this method if you initialised the class with store_on_get=False.
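
Conceptually, a full update walks over all keys of the source instance and pauses between the individual reads. The sketch below illustrates only that idea: the function full_update_sketch, the class demo_source and the assumption that excluded keys are skipped are all hypothetical, and writing the cache file is left out.

```python
import time


class demo_source(object):
    """Hypothetical source instance with two keys."""

    def keys(self):
        return ['a', 'b']

    def get(self, key, default=None):
        return {'a': 'A', 'b': 'B'}.get(key, default)


def full_update_sketch(source_instance, excluded_keys=(), sleep_between_keys=0):
    """Read every key that should be cached and return the collected data."""
    data = {}
    for key in source_instance.keys():
        if key in excluded_keys:
            # Assumption: keys that are always read from the source are not cached.
            continue
        data[key] = source_instance.get(key)
        # Pause between reads, e.g. to limit the load on a slow source.
        time.sleep(sleep_between_keys)
    return data


result = full_update_sketch(demo_source(), sleep_between_keys=0)
print(result)  # {'a': 'A', 'b': 'B'}
```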

get(key, default=None)

Method to get the cached property. If the key does not exist in the cache or the source_instance, default will be returned.

Parameters:
  • key – key for value to get.

  • default – value to be returned, if key does not exist.

Returns:

value for a given key or default value.
