biocantor.util.hashing
Perform MD5 digests of arbitrary objects in memory in python. All objects are unpacked and their string representations are digested.
- TODO: The current implementation of this has not been profiled and likely has room for optimization that could improve
performance on larger datasets.
Module Contents
Functions
|
Lexicographically orders a set and converts all values to strings to enable order comparison. |
|
Recursively search a dictionary for sets and order them. |
|
Inner function for |
|
MD5 digest of any arbitrary set of python objects. Must be utf-8 encodeable. |
- biocantor.util.hashing._order_set(set_of_hashables: Set[Hashable]) List[str]
Lexicographically orders a set and converts all values to strings to enable order comparison.
- Parameters
set_of_hashables – A set of hashable items
- Returns
An lexicographically ordered list of strings
- biocantor.util.hashing._order_dict_of_possible_sets(dict_of_possible_sets: Dict[Hashable, Any]) Iterable[str]
Recursively search a dictionary for sets and order them.
- Parameters
dict_of_possible_sets – Any dictionary. The values must have stable string representations that are uniquely
hashable –
sets. (unless these values are a set or a dictionary of) –
- Returns
Ordered list of tuples
- biocantor.util.hashing._encode_object_for_digest(*args, **kwargs) Iterable[str]
Inner function for
digest_object()
that produces the string representations. This helps with debugging.
- biocantor.util.hashing.digest_object(*args, **kwargs) uuid.UUID
MD5 digest of any arbitrary set of python objects. Must be utf-8 encodeable.
Because the string representation of sets are not guaranteed to be ordered across different python interpreters, we recursively hunt for sets in args and kwargs and then sort them. This sorting requires conversion to string representation for set members. All items in args and kwargs are also utf-8 encoded before hashing.
Args can be any set of objects with stable string representations, as well as sets. Kwargs can be any set of objects with stable string representations, including nested dicts of sets. The argument name in the kwargs dictionary are part of the hash produced.