Module multiformats.multihash

Implementation of the multihash spec.

Core functionality is provided by the digest(), wrap(), unwrap() functions, or the correspondingly-named methods of the Multihash class. The digest() function and Multihash.digest() method can be used to create a multihash digest directly from data:

>>> data = b"Hello world!"
>>> digest = multihash.digest(data, "sha2-256")
>>> digest.hex()
'1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'
>>> sha2_256 = multihash.get("sha2-256")
>>> digest = sha2_256.digest(data)
>>> digest.hex()
'1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'

By default, the full digest produced by the hash function is used. Optionally, a smaller digest size can be specified to produce truncated hashes:

>>> digest = multihash.digest(data, "sha2-256", size=20)
#        optional truncated hash size, in bytes ^^^^^^^
>>> digest.hex()
'1214c0535e4be2b79ffd93291305436bf889314e4a3f' # 20-bytes truncated hash
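
To make the byte layout concrete, the following is a minimal standard-library sketch of how a truncated sha2-256 multihash digest could be assembled by hand. The helper name `sketch_digest` is hypothetical and is not part of this module; the sketch also assumes that the code and the size are both below 128, so that each varint fits in a single byte.

```python
import hashlib

SHA2_256_CODE = 0x12  # multicodec code for "sha2-256", as listed on this page


def sketch_digest(data: bytes, size: int = 32) -> bytes:
    """Hypothetical helper: builds <code><size><raw digest> for sha2-256."""
    # Compute the full raw digest, then truncate it to the requested size.
    raw_digest = hashlib.sha256(data).digest()[:size]
    # Assumption: both values are below 128, so each varint is one byte.
    return bytes([SHA2_256_CODE, len(raw_digest)]) + raw_digest


print(sketch_digest(b"Hello world!", size=20).hex())
```

The printed hex string should match the truncated multihash digest shown above.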

The unwrap() function can be used to extract the raw digest from a multihash digest:

>>> digest.hex()
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'
>>> raw_digest = multihash.unwrap(digest)
>>> raw_digest.hex()
'c0535e4be2b79ffd93291305436bf889314e4a3f'

The Multihash.unwrap() method performs the same functionality, but additionally checks that the multihash digest is valid for the multihash:

>>> raw_digest = sha2_256.unwrap(digest)
>>> raw_digest.hex()
'c0535e4be2b79ffd93291305436bf889314e4a3f'
>>> sha1 = multihash.get("sha1")
>>> (sha2_256.code, sha1.code)
(18, 17)
>>> sha1.unwrap(digest)
err.ValueError: Decoded code 18 differs from multihash code 17.
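
The two checks described above (code match and listed size) can be illustrated with a small standalone sketch. The function `sketch_unwrap` is a hypothetical stand-in, not the library's implementation, and it assumes single-byte varints for both the code and the size.

```python
def sketch_unwrap(digest: bytes, expected_code: int) -> bytes:
    """Hypothetical helper: assumes the code and size are single-byte varints."""
    code, size, raw_digest = digest[0], digest[1], digest[2:]
    # Reject digests produced with a different hash function.
    if code != expected_code:
        raise ValueError(f"Decoded code {code} differs from multihash code {expected_code}.")
    # Reject digests whose listed size disagrees with the actual payload.
    if size != len(raw_digest):
        raise ValueError(f"Listed size {size}, but the raw digest has size {len(raw_digest)}.")
    return raw_digest


digest = bytes.fromhex("1214c0535e4be2b79ffd93291305436bf889314e4a3f")
print(sketch_unwrap(digest, 0x12).hex())  # code matches: raw digest is returned
try:
    sketch_unwrap(digest, 0x11)  # sha1 code: mismatch is reported
except ValueError as e:
    print(e)
```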

The wrap() function and Multihash.wrap() method can be used to wrap a raw digest into a multihash digest:

>>> raw_digest.hex()
'c0535e4be2b79ffd93291305436bf889314e4a3f'
>>> multihash.wrap(raw_digest, "sha2-256").hex()
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'
>>> sha2_256.wrap(raw_digest).hex()
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'
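
Wrapping is the inverse of unwrapping: the code and the size are prepended to the raw digest. The sketch below is a rough standalone illustration, including a guard against raw digests larger than the hash function can produce. The name `sketch_wrap` and its default arguments are assumptions made for this example, and single-byte varints are assumed.

```python
def sketch_wrap(raw_digest: bytes, code: int = 0x12, max_digest_size: int = 32) -> bytes:
    """Hypothetical helper: <raw digest> -> <code><size><raw digest>."""
    size = len(raw_digest)
    # A raw digest longer than the hash function's output cannot be valid.
    if size > max_digest_size:
        raise ValueError(f"Max digest size is {max_digest_size}, but a digest of size {size} was given.")
    # Assumption: code and size are below 128, so each varint is one byte.
    return bytes([code, size]) + raw_digest


raw_digest = bytes.fromhex("c0535e4be2b79ffd93291305436bf889314e4a3f")
print(sketch_wrap(raw_digest).hex())
```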

Note that both the multihash code and the digest length are encoded as varints (see the multiformats.varint module) and can span multiple bytes:

>>> skein1024_1024 = multihash.get("skein1024-1024")
>>> skein1024_1024.codec
Multicodec(name='skein1024-1024', tag='multihash', code='0xb3e0',
           status='draft', description='')
>>> skein1024_1024.digest(data).hex()
'e0e702800192e08f5143...' # 3+2+128 = 133 bytes in total
#^^^^^^     3-bytes varint for hash function code 0xb3e0
#      ^^^^ 2-bytes varint for hash digest length 128
>>> from multiformats import varint
>>> hex(varint.decode(bytes.fromhex("e0e702")))
'0xb3e0'
>>> varint.decode(bytes.fromhex("8001"))
128
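
For readers unfamiliar with varints, here is a minimal sketch of the unsigned varint scheme (7 payload bits per byte, least-significant group first, high bit set on every byte except the last). The functions `encode_varint` and `decode_varint` are illustrative stand-ins written for this example; they are not the `multiformats.varint` implementation and omit its additional validation.

```python
from typing import Tuple


def encode_varint(n: int) -> bytes:
    """Encode a non-negative integer as an unsigned varint."""
    if n < 0:
        raise ValueError("varints encode non-negative integers only")
    out = bytearray()
    while True:
        byte = n & 0x7F  # take the lowest 7 bits
        n >>= 7
        if n:
            out.append(byte | 0x80)  # more bytes follow: set continuation bit
        else:
            out.append(byte)  # last byte: continuation bit clear
            return bytes(out)


def decode_varint(data: bytes) -> Tuple[int, int]:
    """Decode a leading varint, returning (value, number of bytes read)."""
    value = 0
    for i, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * i)
        if not byte & 0x80:
            return value, i + 1
    raise ValueError("truncated varint")


print(encode_varint(0xB3E0).hex())  # hash function code of skein1024-1024
print(encode_varint(128).hex())     # digest length of skein1024-1024
```

The two printed values correspond to the `e0e702` and `8001` prefixes annotated in the example above.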

Also note that data and digests are all bytes objects, represented here as hex strings for clarity:

>>> raw_digest
b'\xc0S^K\xe2\xb7\x9f\xfd\x93)\x13\x05Ck\xf8\x891NJ?'
>>> digest
b'\x12\x14\xc0S^K\xe2\xb7\x9f\xfd\x93)\x13\x05Ck\xf8\x891NJ?'
# ^^^^     0x12 -> multihash multicodec "sha2-256"
#     ^^^^ 0x14 -> truncated hash length of 20 bytes

The multihash specified by a given multihash digest is accessible using the from_digest() function:

>>> multihash.from_digest(digest)
Multihash(codec='sha2-256')
>>> multihash.from_digest(digest).codec
Multicodec(name='sha2-256', tag='multihash', code='0x12',
           status='permanent', description='')

Additional multihash management functionality is provided by the exists() and get() functions, which can be used to check whether a multihash multicodec with given name or code is known, and if so to get the corresponding object:

>>> multihash.exists("sha1")
True
>>> multihash.get("sha1")
Multihash(codec='sha1')
>>> multihash.exists(code=0x11)
True
>>> multihash.get(code=0x11)
Multihash(codec='sha1')
"""
    Implementation of the [multihash spec](https://github.com/multiformats/multihash).

    Core functionality is provided by the `digest`, `wrap`, `unwrap` functions,
    or the correspondingly-named methods of the `Multihash` class.
    The `digest` function and `Multihash.digest` method can be used to create a multihash digest directly from data:

    ```py
    >>> data = b"Hello world!"
    >>> digest = multihash.digest(data, "sha2-256")
    >>> digest.hex()
    '1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'
    ```

    ```py
    >>> sha2_256 = multihash.get("sha2-256")
    >>> digest = sha2_256.digest(data)
    >>> digest.hex()
    '1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'
    ```

    By default, the full digest produced by the hash function is used.
    Optionally, a smaller digest size can be specified to produce truncated hashes:

    ```py
    >>> digest = multihash.digest(data, "sha2-256", size=20)
    #        optional truncated hash size, in bytes ^^^^^^^
    >>> digest.hex()
    '1214c0535e4be2b79ffd93291305436bf889314e4a3f' # 20-bytes truncated hash
    ```

    The `unwrap` function can be used to extract the raw digest from a multihash digest:

    ```py
    >>> digest.hex()
    '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
    >>> raw_digest = multihash.unwrap(digest)
    >>> raw_digest.hex()
    'c0535e4be2b79ffd93291305436bf889314e4a3f'
    ```

    The `Multihash.unwrap` method performs the same functionality, but additionally checks
    that the multihash digest is valid for the multihash:

    ```py
    >>> raw_digest = sha2_256.unwrap(digest)
    >>> raw_digest.hex()
    'c0535e4be2b79ffd93291305436bf889314e4a3f'
    ```

    ```py
    >>> sha1 = multihash.get("sha1")
    >>> (sha2_256.code, sha1.code)
    (18, 17)
    >>> sha1.unwrap(digest)
    err.ValueError: Decoded code 18 differs from multihash code 17.
    ```

    The `wrap` function and `Multihash.wrap` method can be used to wrap a raw digest into a multihash digest:

    ```py
    >>> raw_digest.hex()
    'c0535e4be2b79ffd93291305436bf889314e4a3f'
    >>> multihash.wrap(raw_digest, "sha2-256").hex()
    '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
    ```

    ```py
    >>> sha2_256.wrap(raw_digest).hex()
    '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
    ```

    Note that both the multihash code and the digest length are encoded as varints
    (see the `multiformats.varint` module) and can span multiple bytes:

    ```py
    >>> skein1024_1024 = multihash.get("skein1024-1024")
    >>> skein1024_1024.codec
    Multicodec(name='skein1024-1024', tag='multihash', code='0xb3e0',
               status='draft', description='')
    >>> skein1024_1024.digest(data).hex()
    'e0e702800192e08f5143...' # 3+2+128 = 133 bytes in total
    #^^^^^^     3-bytes varint for hash function code 0xb3e0
    #      ^^^^ 2-bytes varint for hash digest length 128
    >>> from multiformats import varint
    >>> hex(varint.decode(bytes.fromhex("e0e702")))
    '0xb3e0'
    >>> varint.decode(bytes.fromhex("8001"))
    128
    ```

    Also note that data and digests are all `bytes` objects, represented here as hex strings for clarity:

    ```py
    >>> raw_digest
    b'\\xc0S^K\\xe2\\xb7\\x9f\\xfd\\x93)\\x13\\x05Ck\\xf8\\x891NJ?'
    >>> digest
    b'\\x12\\x14\\xc0S^K\\xe2\\xb7\\x9f\\xfd\\x93)\\x13\\x05Ck\\xf8\\x891NJ?'
    # ^^^^^      0x12 -> multihash multicodec "sha2-256"
    #      ^^^^^ 0x14 -> truncated hash length of 20 bytes
    ```

    The multihash specified by a given multihash digest is accessible using the `from_digest` function:

    ```py
    >>> multihash.from_digest(digest)
    Multihash(codec='sha2-256')
    >>> multihash.from_digest(digest).codec
    Multicodec(name='sha2-256', tag='multihash', code='0x12',
               status='permanent', description='')
    ```

    Additional multihash management functionality is provided by the `exists` and `get` functions,
    which can be used to check whether a multihash multicodec with given name or code is known,
    and if so to get the corresponding object:

    ```py
    >>> multihash.exists("sha1")
    True
    >>> multihash.get("sha1")
    Multihash(codec='sha1')
    >>> multihash.exists(code=0x11)
    True
    >>> multihash.get(code=0x11)
    Multihash(codec='sha1')
    ```

"""

from io import BytesIO, BufferedIOBase
from typing import AbstractSet, Any, cast, ClassVar, Dict, Iterator, Mapping, Optional, overload, Union, Sequence, Tuple, Type, TypeVar
from weakref import WeakValueDictionary
import sys
from typing_extensions import Literal
from typing_validation import validate

from multiformats import multicodec, varint
from multiformats.multicodec import Multicodec, _hexcode
from multiformats.varint import BytesLike

from . import raw, err
from .raw import Hashfun, MultihashImpl

class Multihash:
    """
        Container class for a multihash.

        Example usage:

        ```py
        >>> sha2_256 = multihash.get("sha2-256")
        >>> sha2_256
        Multihash(codec='sha2-256')
        ```
    """

    # WeakValueDictionary[str, Multihash]
    _cache: ClassVar[WeakValueDictionary] = WeakValueDictionary() # type: ignore

    _codec: Multicodec
    _implementation: MultihashImpl

    __slots__ = ("__weakref__", "_codec", "_implementation")

    def __new__(cls, *, codec: Union[str, int, Multicodec]) -> "Multihash":
        # check that the codec exists:
        if isinstance(codec, str):
            codec = multicodec.get(codec)
        elif isinstance(codec, int):
            codec = multicodec.get(code=codec)
        else:
            validate(codec, Multicodec)
            existing_codec = multicodec.get(codec.name)
            if existing_codec != codec:
                raise err.ValueError(f"Multicodec named {repr(codec.name)} exists, but is not the one given.")
            codec = existing_codec
        # check that the codec is a multihash multicodec:
        if codec.tag != "multihash":
            raise err.ValueError(f"Multicodec named {repr(codec.name)} exists, but is not a multihash.")
        implementation: MultihashImpl = raw.get(codec.name)
        _cache = Multihash._cache
        if codec.name in _cache:
            # if a multihash instance with this name is already registered
            instance: Multihash = _cache[codec.name]
            if instance.codec == codec and instance._implementation == implementation:
                # nothing changed, can use the existing instance
                return instance
            # otherwise remove the existing instance
            del _cache[codec.name]
        # create a fresh instance, register it and return it
        instance = super().__new__(cls)
        instance._codec = codec
        instance._implementation = implementation
        _cache[codec.name] = instance
        return instance

    @property
    def name(self) -> str:
        """
            Multihash multicodec name.

            Example usage:

            ```py
            >>> sha2_256.name
            'sha2-256'
            ```
        """
        return self.codec.name

    @property
    def code(self) -> int:
        """
            Multihash multicodec code.

            Example usage:

            ```py
            >>> sha2_256.code
            18
            # 18 = 0x12
            ```
        """
        return self.codec.code

    @property
    def codec(self) -> Multicodec:
        """
            The multicodec for this multihash.

            Example usage:

            ```py
            >>> sha2_256.codec
            Multicodec(name='sha2-256', tag='multihash', code='0x12',
                       status='permanent', description='')
            ```
        """
        return self._codec

    @property
    def max_digest_size(self) -> Optional[int]:
        """
            The maximum size (in bytes) for raw digests of this multihash,
            or `None` if there is no maximum size.
            Used to sense-check the wrapped/unwrapped raw digests.

            Example usage:

            ```py
            >>> sha2_256.max_digest_size
            32
            # 32 bytes = 256 bits
            ```
        """
        _, max_digest_size = self.implementation
        return max_digest_size

    @property
    def implementation(self) -> MultihashImpl:
        """
            Returns the implementation of this multihash, as a pair:

            ```py
            hash_function, max_digest_size = sha2_256.implementation
            ```

            Above, `hash_function` is the function `bytes->bytes` computing the raw hashes,
            and `max_digest_size` is the max size of the digests produced by `hash_function` (or `None` if
            there is no max size, such as in the case of the 'identity' multihash multicodec).

            Example usage:

            ```py
            >>> sha2_256.implementation
            (<function _hashlib_sha.<locals>.hashfun at 0x0000029396E22280>, 32)
            ```
        """
        return self._implementation
        # hash_function, max_digest_size = raw.get(self.name)
        # return hash_function, max_digest_size

    def wrap(self, raw_digest: BytesLike) -> bytes:
        """
            Wraps a raw digest into a multihash digest:

            ```
            <raw digest> -> <code><size><raw digest>
            ```

            Example usage:

            ```py
            >>> sha2_256 = multihash.get("sha2-256")
            >>> raw_digest = bytes.fromhex(
            ... "c0535e4be2b79ffd93291305436bf889314e4a3f")
            >>> sha2_256.wrap(raw_digest).hex()
            '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
            ```

            See `wrap` for more information.
        """
        validate(raw_digest, BytesLike)
        _, max_digest_size = self.implementation
        size = len(raw_digest)
        if max_digest_size is not None and size > max_digest_size:
            raise err.ValueError(f"Digest size {max_digest_size} is listed for {self.name}, "
                             f"but a digest of larger size {size} was given to be wrapped.")
        return self.codec.wrap(varint.encode(size)+raw_digest)

    def digest(self, data: BytesLike, *, size: Optional[int] = None) -> bytes:
        """
            Computes the raw digest of the given data and wraps it into a multihash digest.
            The optional keyword argument `size` can be used to truncate the
            raw digest to be of the given size (or less) before encoding.

            Example usage:

            ```py
            >>> sha2_256 = multihash.get("sha2-256")
            >>> data = b"Hello world!"
            >>> data.hex()
            '48656c6c6f20776f726c6421'
            >>> sha2_256.digest(data).hex() # full 32-bytes hash
            '1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'
            >>> sha2_256.digest(data, size=20).hex() # truncated hash
            '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
            ```

            See `digest` for more information.
        """
        hf, _ = self.implementation
        raw_digest = hf(data)
        if size is not None:
            raw_digest = raw_digest[:size] # truncate digest
        size = len(raw_digest)
        return self.codec.wrap(varint.encode(size)+raw_digest)

    def unwrap(self, digest: Union[BytesLike, BufferedIOBase]) -> bytes:
        """
            Unwraps a multihash digest into a hash digest:

            ```
            <code><size><raw digest> -> <raw digest>
            ```

            If `digest` is one of bytes, bytearray or memoryview, the method also checks
            that the actual hash digest size matches the size listed by the multihash digest.

            Example usage:

            ```py
            >>> sha2_256 = multihash.get("sha2-256")
            >>> digest = bytes.fromhex(
            ... "1214c0535e4be2b79ffd93291305436bf889314e4a3f")
            >>> sha2_256.unwrap(digest).hex()
            'c0535e4be2b79ffd93291305436bf889314e4a3f'
            ```

        """
        code, raw_digest = unwrap_raw(digest)
        if code != self.code:
            raise err.ValueError(f"Decoded code {code} differs from multihash code {self.code}.")
        _validate_raw_digest_size(self.name, raw_digest, self.max_digest_size)
        return raw_digest

    def __str__(self) -> str:
        return f"multihash.get({repr(self.name)})"

    def __repr__(self) -> str:
        return f"Multihash(codec={repr(self.name)})"

    @property
    def _as_tuple(self) -> Tuple[Type["Multihash"], Multicodec]:
        return (Multihash, self.codec)

    def __hash__(self) -> int:
        return hash(self._as_tuple)

    def __eq__(self, other: Any) -> bool:
        if self is other:
            return True
        if not isinstance(other, Multihash):
            return NotImplemented
        return self._as_tuple == other._as_tuple


def get(name: Optional[str] = None, *, code: Optional[int] = None) -> Multihash:
    """
        Gets the multihash multicodec with given name or code.
        Raises `err.KeyError` if the multihash does not exist or is not implemented.
        Exactly one of `name` and `code` must be specified.

        Example usage:

        ```py
        >>> multihash.get("sha1")
        Multihash(codec='sha1')
        >>> multihash.get(code=0x11)
        Multihash(codec='sha1')
        ```

    """
    if name is not None and code is not None:
        raise err.ValueError("Must specify at most one between 'name' and 'code'.")
    if name is not None:
        return Multihash(codec=name)
    if code is not None:
        return Multihash(codec=code)
    raise err.ValueError("Must specify at least one between 'name' and 'code'.")


def exists(name: Optional[str] = None, *, code: Optional[int] = None) -> bool:
    """
        Checks whether a multihash multicodec with the given name or code exists.
        Exactly one of `name` and `code` must be specified.
        This function returns `False` if a multicodec by given name or code exists,
        but is not tagged 'multihash'.

        Example usage:

        ```py
        >>> multihash.exists("sha1")
        True
        >>> multihash.exists(code=0x11)
        True
        >>> from multiformats import multicodec
        >>> multicodec.get("cidv1")
        Multicodec(name='cidv1', tag='cid', code='0x01',
                   status='permanent', description='CIDv1')
        >>> multihash.exists("cidv1")
        False
        ```

    """
    if not multicodec.exists(name, code=code):
        return False
    multihash = multicodec.get(name, code=code)
    return multihash.tag == "multihash"


def is_implemented(name: Optional[str] = None, *, code: Optional[int] = None) -> bool:
    """
        Checks whether a multihash with the given name or code exists and is implemented.
        Exactly one of `name` and `code` must be specified.

        Example usage:

        ```py
        >>> multihash.is_implemented("sha1")
        True
        >>> multihash.is_implemented(code=0x11)
        True
        ```
    """
    if not exists(name, code=code):
        return False
    multihash = multicodec.get(name, code=code)
    return raw.exists(multihash.name)


def from_digest(multihash_digest: BytesLike) -> Multihash:
    """
        Returns the multihash multicodec for the given digest,
        according to the code specified by its prefix.
        Raises `err.KeyError` if no multihash exists with that code.

        Example usage:

        ```py
        >>> multihash_digest = bytes.fromhex("140a9a7a8207a57d03e9c524")
        >>> multihash.from_digest(multihash_digest)
        Multihash(codec='sha3-512')
        ```

    """
    code, _, _ = multicodec.unwrap_raw(multihash_digest)
    return get(code=code)


def wrap(raw_digest: BytesLike, multihash: Union[str, int, Multihash]) -> bytes:
    """
        Wraps a raw digest into a multihash digest using the given multihash:

        ```
        <raw digest> -> <code><size><raw digest>
        ```

        If the multihash is passed by name or code, the `get` function is used to retrieve it.

        Example usage:

        ```py
        >>> multihash.get("sha2-256").codec
        Multicodec(name='sha2-256', tag='multihash', code='0x12',
                   status='permanent', description='')
        >>> raw_digest = bytes.fromhex("c0535e4be2b79ffd93291305436bf889314e4a3f")
        >>> len(raw_digest)
        20
        >>> multihash.wrap(raw_digest, "sha2-256").hex()
        '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
        #^^   code 0x12 for multihash multicodec "sha2-256"
        #  ^^ truncated hash length 0x14 = 20 bytes
        ```

        Note that all digests are `bytes` objects, represented here as hex strings for clarity:

        ```py
        >>> raw_digest
        b'\\xc0S^K\\xe2\\xb7\\x9f\\xfd\\x93)\\x13\\x05Ck\\xf8\\x891NJ?'
        >>> multihash.wrap(raw_digest, "sha2-256")
        b'\\x12\\x14\\xc0S^K\\xe2\\xb7\\x9f\\xfd\\x93)\\x13\\x05Ck\\xf8\\x891NJ?'
        # ^^^^     0x12 -> multihash multicodec "sha2-256"
        #     ^^^^ 0x14 -> truncated hash length of 20 bytes
        ```

    """
    if not isinstance(multihash, Multihash):
        multihash = Multihash(codec=multihash)
    return multihash.wrap(raw_digest)


def digest(data: BytesLike, multihash: Union[str, int, Multihash], *, size: Optional[int] = None) -> bytes:
    """
        Computes the raw digest of the given data and wraps it into a multihash digest.
        The optional keyword argument `size` can be used to truncate the
        raw digest to be of the given size (or less) before encoding.

        Example usage:

        ```py
        >>> data = b"Hello world!"
        >>> data.hex()
        '48656c6c6f20776f726c6421'
        >>> multihash.digest(data, "sha2-256").hex() # full 32-bytes hash
        '1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'
        >>> multihash.digest(data, "sha2-256", size=20).hex()
        #         optional truncated hash size ^^^^^^^
        '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
        #^^   code 0x12 for multihash multicodec "sha2-256"
        #  ^^ truncated hash length 0x14 = 20 bytes
        ```

        Note both multihash code and digest length are wrapped as varints
        (see the `multiformats.varint` module) and can span multiple bytes:

        ```py
        >>> multihash.get("skein1024-1024").codec
        Multicodec(name='skein1024-1024', tag='multihash', code='0xb3e0',
                   status='draft', description='')
        >>> multihash.digest(data, "skein1024-1024").hex()
        'e0e702800192e08f5143...' # 3+2+128 = 133 bytes in total
        #^^^^^^     3-bytes varint for hash function code 0xb3e0
        #      ^^^^ 2-bytes varint for hash digest length 128
        >>> from multiformats import varint
        >>> hex(varint.decode(bytes.fromhex("e0e702")))
        '0xb3e0'
        >>> varint.decode(bytes.fromhex("8001"))
        128
        ```

    """
    if not isinstance(multihash, Multihash):
        multihash = Multihash(codec=multihash)
    return multihash.digest(data, size=size)


def _validate_raw_digest_size(name: str, raw_digest: bytes, max_digest_size: Optional[int]) -> None:
    if max_digest_size is not None and len(raw_digest) > max_digest_size:
        raise err.ValueError(f"Multihash {name} has max digest size {max_digest_size}, "
                         f"but a digest of larger size {len(raw_digest)} was unwrapped instead.")


def unwrap(digest: Union[BytesLike, BufferedIOBase],
           multihash: Union[None, str, int, Multihash]=None) -> bytes:
    """
        Unwraps a multihash digest into a raw digest:

        ```
        <code><size><raw digest> -> <raw digest>
        ```

        If `digest` is one of `bytes`, `bytearray` or `memoryview`, the function also checks
        that the actual raw digest size matches the size listed in the multihash digest.
        If `digest` is a stream (an instance of `BufferedIOBase`, specifically), then the
        number of bytes consumed to produce the raw digest matches the size listed in the multihash digest,
        and no further bytes are consumed from the stream.

        If `multihash` is not `None`, the function additionally enforces that the code from the
        multihash digest matches the code of the multihash (calls `Multihash.unwrap` under the hood to do so).
        Regardless, the function checks that the multihash with code specified by the multihash digest exists
        and is implemented.

        Example usage:

        ```py
        >>> digest = bytes.fromhex(
        ... "1214c0535e4be2b79ffd93291305436bf889314e4a3f")
        >>> multihash.unwrap(digest, "sha2-256").hex()
        'c0535e4be2b79ffd93291305436bf889314e4a3f'
        ```
    """
    if multihash is not None:
        if not isinstance(multihash, Multihash):
            multihash = Multihash(codec=multihash)
        return multihash.unwrap(digest)
    code, raw_digest = unwrap_raw(digest)
    multihash = Multihash(codec=code)
    _validate_raw_digest_size(multihash.name, raw_digest, multihash.max_digest_size)
    return raw_digest


_BufferedIOT = TypeVar("_BufferedIOT", bound=BufferedIOBase)

@overload
def unwrap_raw(multihash_digest: BufferedIOBase) -> Tuple[int, bytes]:
    ...

@overload
def unwrap_raw(multihash_digest: BytesLike) -> Tuple[int, memoryview]:
    ...

def unwrap_raw(multihash_digest: Union[BytesLike, BufferedIOBase]) -> Tuple[int, Union[bytes, memoryview]]:
    """
        Unwraps a multihash digest into a code and raw digest pair:

        ```
        <code><size><hash digest> -> (<code>, <hash digest>)
        ```

        The function checks that the multihash codec with code specified by the multihash digest exists,
        but does not check whether it is implemented or not.

        Example usage:

        ```py
        >>> multihash_digest = bytes.fromhex(
        ... "1214c0535e4be2b79ffd93291305436bf889314e4a3f")
        >>> code, digest = multihash.unwrap_raw(multihash_digest)
        >>> code
        18 # the code 0x12 of 'sha2-256'
        >>> digest.hex()
        'c0535e4be2b79ffd93291305436bf889314e4a3f'
        ```

    """
    # switch between memoryview mode and stream mode
    if isinstance(multihash_digest, BufferedIOBase):
        stream_mode = True
        validate(multihash_digest, BufferedIOBase)
        stream: Union[memoryview, BufferedIOBase] = multihash_digest
    else:
        stream_mode = False
        stream = memoryview(multihash_digest)
    # extract multihash code
    multihash_code, n, stream = multicodec.unwrap_raw(multihash_digest)
    if not exists(code=multihash_code):
        n_bytes_read = f" ({n} bytes read)" if stream_mode else ""
        raise err.KeyError(f"Multicodec {_hexcode(multihash_code)} is not a multihash{n_bytes_read}.")
    # extract hash digest size
    digest_size, _, stream = varint.decode_raw(stream)
    # extract hash digest
    if stream_mode:
        # use only the number of bytes specified by the multihash
        hash_digest = cast(BufferedIOBase, stream).read(digest_size)
    else:
        # use all remaining bytes
        hash_digest = cast(memoryview, stream)
    # check that the hash digest size is valid
    if digest_size != len(hash_digest):
        raise err.ValueError(f"Multihash digest lists size {digest_size}, but the hash digest has size {len(hash_digest)} instead.")
    return multihash_code, hash_digest

Sub-modules

multiformats.multihash.err

Errors for the multiformats.multihash module.

multiformats.multihash.raw

Implementation of raw hash functions used by multihash multicodecs …

Functions

def digest(data: Union[bytes, bytearray, memoryview], multihash: Union[str, int, Multihash], *, size: Optional[int] = None) ‑> bytes

Computes the raw digest of the given data and wraps it into a multihash digest. The optional keyword argument size can be used to truncate the raw digest to be of the given size (or less) before encoding.

Example usage:

>>> data = b"Hello world!"
>>> data.hex()
'48656c6c6f20776f726c6421'
>>> multihash.digest(data, "sha2-256").hex() # full 32-bytes hash
'1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'
>>> multihash.digest(data, "sha2-256", size=20).hex()
#         optional truncated hash size ^^^^^^^
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'
#^^   code 0x12 for multihash multicodec "sha2-256"
#  ^^ truncated hash length 0x14 = 20 bytes

Note both multihash code and digest length are wrapped as varints (see the multiformats.varint module) and can span multiple bytes:

>>> multihash.get("skein1024-1024").codec
Multicodec(name='skein1024-1024', tag='multihash', code='0xb3e0',
           status='draft', description='')
>>> multihash.digest(data, "skein1024-1024").hex()
'e0e702800192e08f5143...' # 3+2+128 = 133 bytes in total
#^^^^^^     3-bytes varint for hash function code 0xb3e0
#      ^^^^ 2-bytes varint for hash digest length 128
>>> from multiformats import varint
>>> hex(varint.decode(bytes.fromhex("e0e702")))
'0xb3e0'
>>> varint.decode(bytes.fromhex("8001"))
128
def digest(data: BytesLike, multihash: Union[str, int, Multihash], *, size: Optional[int] = None) -> bytes:
    """
        Computes the raw digest of the given data and wraps it into a multihash digest.
        The optional keyword argument `size` can be used to truncate the
        raw digest to be of the given size (or less) before encoding.

        Example usage:

        ```py
        >>> data = b"Hello world!"
        >>> data.hex()
        '48656c6c6f20776f726c6421'
        >>> multihash.digest(data, "sha2-256").hex() # full 32-bytes hash
        '1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'
        >>> multihash.digest(data, "sha2-256", size=20).hex()
        #         optional truncated hash size ^^^^^^^
        '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
        #^^   code 0x12 for multihash multicodec "sha2-256"
        #  ^^ truncated hash length 0x14 = 20 bytes
        ```

        Note both multihash code and digest length are wrapped as varints
        (see the `multiformats.varint` module) and can span multiple bytes:

        ```py
        >>> multihash.get("skein1024-1024").codec
        Multicodec(name='skein1024-1024', tag='multihash', code='0xb3e0',
                   status='draft', description='')
        >>> multihash.digest(data, "skein1024-1024").hex()
        'e0e702800192e08f5143...' # 3+2+128 = 133 bytes in total
        #^^^^^^     3-bytes varint for hash function code 0xb3e0
        #      ^^^^ 2-bytes varint for hash digest length 128
        >>> from multiformats import varint
        >>> hex(varint.decode(bytes.fromhex("e0e702")))
        '0xb3e0'
        >>> varint.decode(bytes.fromhex("8001"))
        128
        ```

    """
    if not isinstance(multihash, Multihash):
        multihash = Multihash(codec=multihash)
    return multihash.digest(data, size=size)
def exists(name: Optional[str] = None, *, code: Optional[int] = None) ‑> bool

Checks whether a multihash multicodec with the given name or code exists. Exactly one of name and code must be specified. This function returns False if a multicodec by given name or code exists, but is not tagged 'multihash'.

Example usage:

>>> multihash.exists("sha1")
True
>>> multihash.exists(code=0x11)
True
>>> from multiformats import multicodec
>>> multicodec.get("cidv1")
Multicodec(name='cidv1', tag='cid', code='0x01',
           status='permanent', description='CIDv1')
>>> multihash.exists("cidv1")
False
def exists(name: Optional[str] = None, *, code: Optional[int] = None) -> bool:
    """
        Checks whether a multihash multicodec with the given name or code exists.
        Exactly one of `name` and `code` must be specified.
        This function returns `False` if a multicodec by given name or code exists,
        but is not tagged 'multihash'.

        Example usage:

        ```py
        >>> multihash.exists("sha1")
        True
        >>> multihash.exists(code=0x11)
        True
        >>> from multiformats import multicodec
        >>> multicodec.get("cidv1")
        Multicodec(name='cidv1', tag='cid', code='0x01',
                   status='permanent', description='CIDv1')
        >>> multihash.exists("cidv1")
        False
        ```

    """
    if not multicodec.exists(name, code=code):
        return False
    multihash = multicodec.get(name, code=code)
    return multihash.tag == "multihash"
def from_digest(multihash_digest: Union[bytes, bytearray, memoryview]) ‑> Multihash

Returns the multihash multicodec for the given digest, according to the code specified by its prefix. Raises KeyError if no multihash exists with that code.

Example usage:

>>> multihash_digest = bytes.fromhex("140a9a7a8207a57d03e9c524")
>>> multihash.from_digest(multihash_digest)
Multihash(codec='sha3-512')
def from_digest(multihash_digest: BytesLike) -> Multihash:
    """
        Returns the multihash multicodec for the given digest,
        according to the code specified by its prefix.
        Raises `err.KeyError` if no multihash exists with that code.

        Example usage:

        ```py
        >>> multihash_digest = bytes.fromhex("140a9a7a8207a57d03e9c524")
        >>> multihash.from_digest(multihash_digest)
        Multihash(codec='sha3-512')
        ```

    """
    code, _, _ = multicodec.unwrap_raw(multihash_digest)
    return get(code=code)
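
The prefix lookup described above can be illustrated without the library. The sketch below reads the leading varint of a multihash digest and maps it to a name through a small table. `CODE_TO_NAME`, `read_varint` and `name_from_digest` are hypothetical names, and the table holds only three entries of the multicodec table, so this is an illustration rather than the library's implementation.

```python
from typing import Tuple

# Hypothetical lookup table with three entries taken from the multicodec table.
CODE_TO_NAME = {0x11: "sha1", 0x12: "sha2-256", 0x14: "sha3-512"}


def read_varint(data: bytes) -> Tuple[int, int]:
    """Return (value, number of bytes consumed) for the varint at the start of data."""
    value = 0
    for index, byte in enumerate(data):
        value |= (byte & 0x7F) << (7 * index)
        if byte < 0x80:  # continuation bit clear: last byte of the varint
            return value, index + 1
    raise ValueError("truncated varint")


def name_from_digest(multihash_digest: bytes) -> str:
    """Look up the hash function name from the code prefix of the digest."""
    code, _ = read_varint(multihash_digest)
    return CODE_TO_NAME[code]  # KeyError if the code is not in the table


print(name_from_digest(bytes.fromhex("140a9a7a8207a57d03e9c524")))  # sha3-512
```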
def get(name: Optional[str] = None, *, code: Optional[int] = None) ‑> Multihash

Gets the multihash multicodec with given name or code. Raises KeyError if the multihash does not exist or is not implemented. Exactly one of name and code must be specified.

Example usage:

>>> multihash.get("sha1")
Multihash(codec='sha1')
>>> multihash.get(code=0x11)
Multihash(codec='sha1')
def get(name: Optional[str] = None, *, code: Optional[int] = None) -> Multihash:
    """
        Gets the multihash multicodec with given name or code.
        Raises `err.KeyError` if the multihash does not exist or is not implemented.
        Exactly one of `name` and `code` must be specified.

        Example usage:

        ```py
        >>> multihash.get("sha1")
        Multihash(codec='sha1')
        >>> multihash.get(code=0x11)
        Multihash(codec='sha1')
        ```

    """
    if name is not None and code is not None:
        raise err.ValueError("Must specify at most one between 'name' and 'code'.")
    if name is not None:
        return Multihash(codec=name)
    if code is not None:
        return Multihash(codec=code)
    raise err.ValueError("Must specify at least one between 'name' and 'code'.")
def is_implemented(name: Optional[str] = None, *, code: Optional[int] = None) ‑> bool

Checks whether a multihash with the given name or code exists and is implemented. Exactly one of name and code must be specified.

Example usage:

>>> multihash.is_implemented("sha1")
True
>>> multihash.is_implemented(code=0x11)
True
def is_implemented(name: Optional[str] = None, *, code: Optional[int] = None) -> bool:
    """
        Checks whether a multihash with the given name or code exists and is implemented.
        Exactly one of `name` and `code` must be specified.

        Example usage:

        ```py
        >>> multihash.is_implemented("sha1")
        True
        >>> multihash.is_implemented(code=0x11)
        True
        ```
    """
    if not exists(name, code=code):
        return False
    multihash = multicodec.get(name, code=code)
    return raw.exists(multihash.name)
def unwrap(digest: Union[bytes, bytearray, memoryview, io.BufferedIOBase], multihash: Union[None, str, int, Multihash] = None) ‑> bytes

Unwraps a multihash digest into a raw digest:

<code><size><raw digest> -> <raw digest>

If digest is one of bytes, bytearray or memoryview, the function also checks that the actual raw digest size matches the size listed in the multihash digest. If digest is a stream (an instance of BufferedIOBase, specifically), then exactly as many bytes as the size listed in the multihash digest are consumed to produce the raw digest, and no further bytes are consumed from the stream.

If multihash is not None, the function additionally enforces that the code from the multihash digest matches the code of the multihash (calls Multihash.unwrap() under the hood to do so). Regardless, the function checks that the multihash with code specified by the multihash digest exists and is implemented.

Example usage:

>>> digest = bytes.fromhex(
... "1214c0535e4be2b79ffd93291305436bf889314e4a3f")
>>> multihash.unwrap(digest, "sha2-256").hex()
'c0535e4be2b79ffd93291305436bf889314e4a3f'
def unwrap(digest: Union[BytesLike, BufferedIOBase],
           multihash: Union[None, str, int, Multihash]=None) -> bytes:
    """
        Unwraps a multihash digest into a raw digest:

        ```
        <code><size><raw digest> -> <raw digest>
        ```

        If `digest` is one of `bytes`, `bytearray` or `memoryview`, the function also checks
        that the actual raw digest size matches the size listed in the multihash digest.
        If `digest` is a stream (an instance of `BufferedIOBase`, specifically), then exactly as many
        bytes as the size listed in the multihash digest are consumed to produce the raw digest,
        and no further bytes are consumed from the stream.

        If `multihash` is not `None`, the function additionally enforces that the code from the
        multihash digest matches the code of the multihash (calls `Multihash.unwrap` under the hood to do so).
        Regardless, the function checks that the multihash with code specified by the multihash digest exists
        and is implemented.

        Example usage:

        ```py
        >>> digest = bytes.fromhex(
        ... "1214c0535e4be2b79ffd93291305436bf889314e4a3f")
        >>> multihash.unwrap(digest, "sha2-256").hex()
        'c0535e4be2b79ffd93291305436bf889314e4a3f'
        ```
    """
    if multihash is not None:
        if not isinstance(multihash, Multihash):
            multihash = Multihash(codec=multihash)
        return multihash.unwrap(digest)
    code, raw_digest = unwrap_raw(digest)
    multihash = Multihash(codec=code)
    _validate_raw_digest_size(multihash.name, raw_digest, multihash.max_digest_size)
    return raw_digest
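
The stream behaviour described above (only the declared number of digest bytes is consumed) can be sketched with `io.BytesIO`, which is a `BufferedIOBase`. The helpers `read_varint_from_stream` and `unwrap_from_stream` are hypothetical and skip the code validation that the library performs; they only illustrate how the size prefix bounds the read.

```python
import io


def read_varint_from_stream(stream: io.BufferedIOBase) -> int:
    """Read one varint from the stream, one byte at a time."""
    value, shift = 0, 0
    while True:
        chunk = stream.read(1)
        if not chunk:
            raise ValueError("stream ended inside a varint")
        value |= (chunk[0] & 0x7F) << shift
        if chunk[0] < 0x80:  # continuation bit clear: varint complete
            return value
        shift += 7


def unwrap_from_stream(stream: io.BufferedIOBase) -> bytes:
    read_varint_from_stream(stream)         # hash function code (not validated here)
    size = read_varint_from_stream(stream)  # declared raw digest size
    raw_digest = stream.read(size)          # read at most `size` bytes
    if len(raw_digest) != size:
        raise ValueError(f"expected {size} digest bytes, got {len(raw_digest)}")
    return raw_digest


stream = io.BytesIO(
    bytes.fromhex("1214c0535e4be2b79ffd93291305436bf889314e4a3f") + b"tail"
)
print(unwrap_from_stream(stream).hex())  # c0535e4be2b79ffd93291305436bf889314e4a3f
print(stream.read())                     # b'tail'
```

The trailing bytes remain available to the caller, which is what makes it possible to read several length-prefixed values from one stream.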
def unwrap_raw(multihash_digest: Union[bytes, bytearray, memoryview, io.BufferedIOBase]) ‑> Tuple[int, Union[bytes, memoryview]]

Unwraps a multihash digest into a code and raw digest pair:

<code><size><hash digest> -> (<code>, <hash digest>)

The function checks that the multihash codec with code specified by the multihash digest exists, but does not check whether it is implemented or not.

Example usage:

>>> multihash_digest = bytes.fromhex(
... "1214c0535e4be2b79ffd93291305436bf889314e4a3f")
>>> code, digest = multihash.unwrap_raw(multihash_digest)
>>> code
18 # the code 0x12 of 'sha2-256'
>>> digest.hex()
'c0535e4be2b79ffd93291305436bf889314e4a3f'
def unwrap_raw(multihash_digest: Union[BytesLike, BufferedIOBase]) -> Tuple[int, Union[bytes, memoryview]]:
    """
        Unwraps a multihash digest into a code and raw digest pair:

        ```
        <code><size><hash digest> -> (<code>, <hash digest>)
        ```

        The function checks that the multihash codec with code specified by the multihash digest exists,
        but does not check whether it is implemented or not.

        Example usage:

        ```py
        >>> multihash_digest = bytes.fromhex(
        ... "1214c0535e4be2b79ffd93291305436bf889314e4a3f")
        >>> code, digest = multihash.unwrap_raw(multihash_digest)
        >>> code
        18 # the code 0x12 of 'sha2-256'
        >>> digest.hex()
        'c0535e4be2b79ffd93291305436bf889314e4a3f'
        ```

    """
    # switch between memoryview mode and stream mode
    if isinstance(multihash_digest, BufferedIOBase):
        stream_mode = True
        validate(multihash_digest, BufferedIOBase)
        stream: Union[memoryview, BufferedIOBase] = multihash_digest
    else:
        stream_mode = False
        stream = memoryview(multihash_digest)
    # extract multihash code
    multihash_code, n, stream = multicodec.unwrap_raw(multihash_digest)
    if not exists(code=multihash_code):
        n_bytes_read = f" ({n} bytes read)" if stream_mode else ""
        raise err.KeyError(f"Multicodec {_hexcode(multihash_code)} is not a multihash{n_bytes_read}.")
    # extract hash digest size
    digest_size, _, stream = varint.decode_raw(stream)
    # extract hash digest
    if stream_mode:
        # use only the number of bytes specified by the multihash
        hash_digest = cast(BufferedIOBase, stream).read(digest_size)
    else:
        # use all remaining bytes
        hash_digest = cast(memoryview, stream)
    # check that the hash digest size is valid
    if digest_size != len(hash_digest):
        raise err.ValueError(f"Multihash digest lists size {digest_size}, but the hash digest has size {len(hash_digest)} instead.")
    return multihash_code, hash_digest
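
For bytes-like input, the size check above compares the declared size with all the bytes that remain after the two varint prefixes. The following is a rough standalone sketch of that logic; `split_varint` and `unwrap_raw_sketch` are hypothetical names, and the check that the code belongs to a known multihash is omitted.

```python
from typing import Tuple


def split_varint(view: memoryview) -> Tuple[int, memoryview]:
    """Return the varint at the start of `view` and the remaining bytes."""
    value = 0
    for index, byte in enumerate(view):
        value |= (byte & 0x7F) << (7 * index)
        if byte < 0x80:  # continuation bit clear: varint complete
            return value, view[index + 1:]
    raise ValueError("truncated varint")


def unwrap_raw_sketch(multihash_digest: bytes) -> Tuple[int, bytes]:
    code, rest = split_varint(memoryview(multihash_digest))
    size, rest = split_varint(rest)
    if size != len(rest):  # bytes mode: every remaining byte must belong to the digest
        raise ValueError(f"digest lists size {size}, but {len(rest)} bytes follow")
    return code, bytes(rest)


code, raw = unwrap_raw_sketch(
    bytes.fromhex("1214c0535e4be2b79ffd93291305436bf889314e4a3f")
)
print(code, raw.hex())  # 18 c0535e4be2b79ffd93291305436bf889314e4a3f
```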
def wrap(raw_digest: Union[bytes, bytearray, memoryview], multihash: Union[str, int, Multihash]) ‑> bytes

Wraps a raw digest into a multihash digest using the given multihash:

<raw digest> -> <code><size><raw digest>

If the multihash is passed by name or code, the get() function is used to retrieve it.

Example usage:

>>> multihash.get("sha2-256").codec
Multicodec(name='sha2-256', tag='multihash', code='0x12',
           status='permanent', description='')
>>> raw_digest = bytes.fromhex("c0535e4be2b79ffd93291305436bf889314e4a3f")
>>> len(raw_digest)
20
>>> multihash.wrap(raw_digest, "sha2-256").hex()
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'
#^^   code 0x12 for multihash multicodec "sha2-256"
#  ^^ truncated hash length 0x14 = 20 bytes

Note that all digests are bytes objects, represented here as hex strings for clarity:

>>> raw_digest
b'\xc0S^K\xe2\xb7\x9f\xfd\x93)\x13\x05Ck\xf8\x891NJ?'
>>> multihash.wrap(raw_digest, "sha2-256")
b'\x12\x14\xc0S^K\xe2\xb7\x9f\xfd\x93)\x13\x05Ck\xf8\x891NJ?'
# ^^^^     0x12 -> multihash multicodec "sha2-256"
#     ^^^^ 0x14 -> truncated hash length of 20 bytes
def wrap(raw_digest: BytesLike, multihash: Union[str, int, Multihash]) -> bytes:
    """
        Wraps a raw digest into a multihash digest using the given multihash:

        ```
        <raw digest> -> <code><size><raw digest>
        ```

        If the multihash is passed by name or code, the `get` function is used to retrieve it.

        Example usage:

        ```py
        >>> multihash.get("sha2-256").codec
        Multicodec(name='sha2-256', tag='multihash', code='0x12',
                   status='permanent', description='')
        >>> raw_digest = bytes.fromhex("c0535e4be2b79ffd93291305436bf889314e4a3f")
        >>> len(raw_digest)
        20
        >>> multihash.wrap(raw_digest, "sha2-256").hex()
        '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
        #^^   code 0x12 for multihash multicodec "sha2-256"
        #  ^^ truncated hash length 0x14 = 20 bytes
        ```

        Note that all digests are `bytes` objects, represented here as hex strings for clarity:

        ```py
        >>> raw_digest
        b'\\xc0S^K\\xe2\\xb7\\x9f\\xfd\\x93)\\x13\\x05Ck\\xf8\\x891NJ?'
        >>> multihash.wrap(raw_digest, "sha2-256")
        b'\\x12\\x14\\xc0S^K\\xe2\\xb7\\x9f\\xfd\\x93)\\x13\\x05Ck\\xf8\\x891NJ?'
        # ^^^^     0x12 -> multihash multicodec "sha2-256"
        #     ^^^^ 0x14 -> truncated hash length of 20 bytes
        ```

    """
    if not isinstance(multihash, Multihash):
        multihash = Multihash(codec=multihash)
    return multihash.wrap(raw_digest)
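
The `<code><size><raw digest>` layout can be assembled by hand to see where each byte comes from. This is a minimal sketch assuming the unsigned varint encoding for both prefixes; `encode_varint` and `wrap_sketch` are hypothetical helpers, and no maximum-size check is performed here.

```python
def encode_varint(value: int) -> bytes:
    """Unsigned varint: 7 bits per byte, high bit set on all but the last byte."""
    out = bytearray()
    while value >= 0x80:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def wrap_sketch(raw_digest: bytes, code: int) -> bytes:
    """<raw digest> -> <code><size><raw digest>, with code and size as varints."""
    return encode_varint(code) + encode_varint(len(raw_digest)) + raw_digest


raw_digest = bytes.fromhex("c0535e4be2b79ffd93291305436bf889314e4a3f")
print(wrap_sketch(raw_digest, 0x12).hex())
# 1214c0535e4be2b79ffd93291305436bf889314e4a3f
```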

Classes

class Multihash (*, codec: Union[str, int, Multicodec])

Container class for a multihash.

Example usage:

>>> sha2_256 = multihash.get("sha2-256")
>>> sha2_256
Multihash(codec='sha2-256')
class Multihash:
    """
        Container class for a multihash.

        Example usage:

        ```py
        >>> sha2_256 = multihash.get("sha2-256")
        >>> sha2_256
        Multihash(codec='sha2-256')
        ```
    """

    # WeakValueDictionary[str, Multihash]
    _cache: ClassVar[WeakValueDictionary] = WeakValueDictionary() # type: ignore

    _codec: Multicodec
    _implementation: MultihashImpl

    __slots__ = ("__weakref__", "_codec", "_implementation")

    def __new__(cls, *, codec: Union[str, int, Multicodec]) -> "Multihash":
        # check that the codec exists:
        if isinstance(codec, str):
            codec = multicodec.get(codec)
        elif isinstance(codec, int):
            codec = multicodec.get(code=codec)
        else:
            validate(codec, Multicodec)
            existing_codec = multicodec.get(codec.name)
            if existing_codec != codec:
                raise err.ValueError(f"Multicodec named {repr(codec.name)} exists, but is not the one given.")
            codec = existing_codec
        # check that the codec is a multihash multicodec:
        if codec.tag != "multihash":
            raise err.ValueError(f"Multicodec named {repr(codec.name)} exists, but is not a multihash.")
        implementation: MultihashImpl = raw.get(codec.name)
        _cache = Multihash._cache
        if codec.name in _cache:
            # if a multihash instance with this name is already registered
            instance: Multihash = _cache[codec.name]
            if instance.codec == codec and instance._implementation == implementation:
                # nothing changed, can use the existing instance
                return instance
            # otherwise remove the existing instance
            del _cache[codec.name]
        # create a fresh instance, register it and return it
        instance = super().__new__(cls)
        instance._codec = codec
        instance._implementation = implementation
        _cache[codec.name] = instance
        return instance

    @property
    def name(self) -> str:
        """
            Multihash multicodec name.

            Example usage:

            ```py
            >>> sha2_256.name
            'sha2-256'
            ```
        """
        return self.codec.name

    @property
    def code(self) -> int:
        """
            Multihash multicodec code.

            Example usage:

            ```py
            >>> sha2_256.code
            18
            # 18 = 0x12
            ```
        """
        return self.codec.code

    @property
    def codec(self) -> Multicodec:
        """
            The multicodec for this multihash.

            Example usage:

            ```py
            >>> sha2_256.codec
            Multicodec(name='sha2-256', tag='multihash', code='0x12',
                       status='permanent', description='')
            ```
        """
        return self._codec

    @property
    def max_digest_size(self) -> Optional[int]:
        """
            The maximum size (in bytes) for raw digests of this multihash,
            or `None` if there is no maximum size.
            Used to sense-check the wrapped/unwrapped raw digests.

            Example usage:

            ```py
            >>> sha2_256.max_digest_size
            32
            # 32 bytes = 256 bits
            ```
        """
        _, max_digest_size = self.implementation
        return max_digest_size

    @property
    def implementation(self) ->MultihashImpl:
        """
            Returns the implementation of a multihash multicodec, as a pair:

            ```py
            hash_function, max_digest_size = multihash.get("sha2-256").implementation
            ```

            Above, `hash_function` is the function `bytes->bytes` computing the raw hashes,
            and `max_digest_size` is the max size of the digests produced by `hash_function` (or `None` if
            there is no max size, such as in the case of the 'identity' multihash multicodec).

            Example usage:

            ```py
            >>> sha2_256.implementation
            (<function _hashlib_sha.<locals>.hashfun at 0x0000029396E22280>, 32)
            ```
        """
        return self._implementation
        # hash_function, max_digest_size = raw.get(self.name)
        # return hash_function, max_digest_size

    def wrap(self, raw_digest: BytesLike) -> bytes:
        """
            Wraps a raw digest into a multihash digest:

            ```
            <raw digest> -> <code><size><raw digest>
            ```

            Example usage:

            ```py
            >>> sha2_256 = multihash.get("sha2-256")
            >>> raw_digest = bytes.fromhex(
            ... "c0535e4be2b79ffd93291305436bf889314e4a3f")
            >>> sha2_256.wrap(raw_digest).hex()
            '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
            ```

            See `wrap` for more information.
        """
        validate(raw_digest, BytesLike)
        _, max_digest_size = self.implementation
        size = len(raw_digest)
        if max_digest_size is not None and size > max_digest_size:
            raise err.ValueError(f"Digest size {max_digest_size} is listed for {self.name}, "
                             f"but a digest of larger size {size} was given to be wrapped.")
        return self.codec.wrap(varint.encode(size)+raw_digest)

    def digest(self, data: BytesLike, *, size: Optional[int] = None) -> bytes:
        """
            Computes the raw digest of the given data and wraps it into a multihash digest.
            The optional keyword argument `size` can be used to truncate the
            raw digest to be of the given size (or less) before encoding.

            Example usage:

            ```py
            >>> sha2_256 = multihash.get("sha2-256")
            >>> data = b"Hello world!"
            >>> data.hex()
            '48656c6c6f20776f726c6421'
            >>> sha2_256.digest(data).hex() # full 32-bytes hash
            '1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'
            >>> sha2_256.digest(data, size=20).hex() # truncated hash
            '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
            ```

            See `digest` for more information.
        """
        hf, _ = self.implementation
        raw_digest = hf(data)
        if size is not None:
            raw_digest = raw_digest[:size] # truncate digest
        size = len(raw_digest)
        return self.codec.wrap(varint.encode(size)+raw_digest)

    def unwrap(self, digest: Union[BytesLike, BufferedIOBase]) -> bytes:
        """
            Unwraps a multihash digest into a hash digest:

            ```
            <code><size><raw digest> -> <raw digest>
            ```

            If `digest` is one of bytes, bytearray or memoryview, the method also checks
            that the actual hash digest size matches the size listed by the multihash digest.

            Example usage:

            ```py
            >>> sha2_256 = multihash.get("sha2-256")
            >>> digest = bytes.fromhex(
            ... "1214c0535e4be2b79ffd93291305436bf889314e4a3f")
            >>> sha2_256.unwrap(digest).hex()
            'c0535e4be2b79ffd93291305436bf889314e4a3f'
            ```

        """
        code, raw_digest = unwrap_raw(digest)
        if code != self.code:
            raise err.ValueError(f"Decoded code {code} differs from multihash code {self.code}.")
        _validate_raw_digest_size(self.name, raw_digest, self.max_digest_size)
        return raw_digest

    def __str__(self) -> str:
        return f"multihash.get({repr(self.name)})"

    def __repr__(self) -> str:
        return f"Multihash(codec={repr(self.name)})"

    @property
    def _as_tuple(self) -> Tuple[Type["Multihash"], Multicodec]:
        return (Multihash, self.codec)

    def __hash__(self) -> int:
        return hash(self._as_tuple)

    def __eq__(self, other: Any) -> bool:
        if self is other:
            return True
        if not isinstance(other, Multihash):
            return NotImplemented
        return self._as_tuple == other._as_tuple
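
The `__new__` method above keeps at most one live instance per codec name in a `WeakValueDictionary`. The pattern can be shown in isolation as follows; `Cached` is a hypothetical class used only for illustration, and it omits the codec and implementation comparison that the real class performs before reusing an instance.

```python
from weakref import WeakValueDictionary


class Cached:
    """Hypothetical class: at most one live instance per name."""

    _cache: "WeakValueDictionary[str, Cached]" = WeakValueDictionary()

    # "__weakref__" is required so that slotted instances can be weakly referenced.
    __slots__ = ("__weakref__", "name")

    def __new__(cls, *, name: str) -> "Cached":
        instance = cls._cache.get(name)
        if instance is None:
            # No live instance with this name: create and register a fresh one.
            instance = super().__new__(cls)
            instance.name = name
            cls._cache[name] = instance
        return instance


a = Cached(name="sha2-256")
b = Cached(name="sha2-256")
print(a is b)  # True
```

Because the cache holds weak references only, an entry disappears once no other reference to the instance remains, so the cache does not keep unused objects alive.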

Instance variables

var code : int

Multihash multicodec code.

Example usage:

>>> sha2_256.code
18
# 18 = 0x12
@property
def code(self) -> int:
    """
        Multihash multicodec code.

        Example usage:

        ```py
        >>> sha2_256.code
        18
        # 18 = 0x12
        ```
    """
    return self.codec.code
var codec : Multicodec

The multicodec for this multihash.

Example usage:

>>> sha2_256.codec
Multicodec(name='sha2-256', tag='multihash', code='0x12',
           status='permanent', description='')
@property
def codec(self) -> Multicodec:
    """
        The multicodec for this multihash.

        Example usage:

        ```py
        >>> sha2_256.codec
        Multicodec(name='sha2-256', tag='multihash', code='0x12',
                   status='permanent', description='')
        ```
    """
    return self._codec
var implementation : Tuple[Callable[[Union[bytes, bytearray, memoryview]], bytes], Optional[int]]

Returns the implementation of a multihash multicodec, as a pair:

hash_function, max_digest_size = multihash.get("sha2-256").implementation

Above, hash_function is the function bytes->bytes computing the raw hashes, and max_digest_size is the max size of the digests produced by hash_function (or None if there is no max size, such as in the case of the 'identity' multihash multicodec).

Example usage:

>>> sha2_256.implementation
(<function _hashlib_sha.<locals>.hashfun at 0x0000029396E22280>, 32)
@property
def implementation(self) ->MultihashImpl:
    """
        Returns the implementation of a multihash multicodec, as a pair:

        ```py
        hash_function, max_digest_size = multihash.get("sha2-256").implementation
        ```

        Above, `hash_function` is the function `bytes->bytes` computing the raw hashes,
        and `max_digest_size` is the max size of the digests produced by `hash_function` (or `None` if
        there is no max size, such as in the case of the 'identity' multihash multicodec).

        Example usage:

        ```py
        >>> sha2_256.implementation
        (<function _hashlib_sha.<locals>.hashfun at 0x0000029396E22280>, 32)
        ```
    """
    return self._implementation
    # hash_function, max_digest_size = raw.get(self.name)
    # return hash_function, max_digest_size
var max_digest_size : Optional[int]

The maximum size (in bytes) for raw digests of this multihash, or None if there is no maximum size. Used to sense-check the wrapped/unwrapped raw digests.

Example usage:

>>> sha2_256.max_digest_size
32
# 32 bytes = 256 bits
@property
def max_digest_size(self) -> Optional[int]:
    """
        The maximum size (in bytes) for raw digests of this multihash,
        or `None` if there is no maximum size.
        Used to sense-check the wrapped/unwrapped raw digests.

        Example usage:

        ```py
        >>> sha2_256.max_digest_size
        32
        # 32 bytes = 256 bits
        ```
    """
    _, max_digest_size = self.implementation
    return max_digest_size
var name : str

Multihash multicodec name.

Example usage:

>>> sha2_256.name
'sha2-256'
@property
def name(self) -> str:
    """
        Multihash multicodec name.

        Example usage:

        ```py
        >>> sha2_256.name
        'sha2-256'
        ```
    """
    return self.codec.name

Methods

def digest(self, data: Union[bytes, bytearray, memoryview], *, size: Optional[int] = None) ‑> bytes

Computes the raw digest of the given data and wraps it into a multihash digest. The optional keyword argument size can be used to truncate the raw digest to be of the given size (or less) before encoding.

Example usage:

>>> sha2_256 = multihash.get("sha2-256")
>>> data = b"Hello world!"
>>> data.hex()
'48656c6c6f20776f726c6421'
>>> sha2_256.digest(data).hex() # full 32-bytes hash
'1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'
>>> sha2_256.digest(data, size=20).hex() # truncated hash
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'

See digest() for more information.

def digest(self, data: BytesLike, *, size: Optional[int] = None) -> bytes:
    """
        Computes the raw digest of the given data and wraps it into a multihash digest.
        The optional keyword argument `size` can be used to truncate the
        raw digest to be of the given size (or less) before encoding.

        Example usage:

        ```py
        >>> sha2_256 = multihash.get("sha2-256")
        >>> data = b"Hello world!"
        >>> data.hex()
        '48656c6c6f20776f726c6421'
        >>> sha2_256.digest(data).hex() # full 32-bytes hash
        '1220c0535e4be2b79ffd93291305436bf889314e4a3faec05ecffcbb7df31ad9e51a'
        >>> sha2_256.digest(data, size=20).hex() # truncated hash
        '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
        ```

        See `digest` for more information.
    """
    hf, _ = self.implementation
    raw_digest = hf(data)
    if size is not None:
        raw_digest = raw_digest[:size] # truncate digest
    size = len(raw_digest)
    return self.codec.wrap(varint.encode(size)+raw_digest)
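
The truncation step above is a plain slice of the raw digest, so asking for more bytes than the hash function produces leaves the digest unchanged (hence "or less"). The sketch below reproduces this with `hashlib` directly instead of the library. It assumes code 0x12 for "sha2-256" and relies on both prefixes being below 128, so that each varint is a single byte.

```python
import hashlib

data = b"Hello world!"
full = hashlib.sha256(data).digest()  # 32-byte raw digest
truncated = full[:20]                 # keep only the first 20 bytes

# Code 0x12 and sizes below 128 each fit in a single varint byte.
multihash_full = bytes([0x12, len(full)]) + full
multihash_truncated = bytes([0x12, len(truncated)]) + truncated

print(multihash_full[:2].hex(), len(multihash_full))            # 1220 34
print(multihash_truncated[:2].hex(), len(multihash_truncated))  # 1214 22
print(multihash_full[2:22] == multihash_truncated[2:])          # True
print(full[:64] == full)  # True: a size above 32 does not change the digest
```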
def unwrap(self, digest: Union[bytes, bytearray, memoryview, io.BufferedIOBase]) ‑> bytes

Unwraps a multihash digest into a hash digest:

<code><size><raw digest> -> <raw digest>

If digest is one of bytes, bytearray or memoryview, the method also checks that the actual hash digest size matches the size listed by the multihash digest.

Example usage:

>>> sha2_256 = multihash.get("sha2-256")
>>> digest = bytes.fromhex(
... "1214c0535e4be2b79ffd93291305436bf889314e4a3f")
>>> sha2_256.unwrap(digest).hex()
'c0535e4be2b79ffd93291305436bf889314e4a3f'
def unwrap(self, digest: Union[BytesLike, BufferedIOBase]) -> bytes:
    """
        Unwraps a multihash digest into a hash digest:

        ```
        <code><size><raw digest> -> <raw digest>
        ```

        If `digest` is one of bytes, bytearray or memoryview, the method also checks
        that the actual hash digest size matches the size listed by the multihash digest.

        Example usage:

        ```py
        >>> sha2_256 = multihash.get("sha2-256")
        >>> digest = bytes.fromhex(
        ... "1214c0535e4be2b79ffd93291305436bf889314e4a3f")
        >>> sha2_256.unwrap(digest).hex()
        'c0535e4be2b79ffd93291305436bf889314e4a3f'
        ```

    """
    code, raw_digest = unwrap_raw(digest)
    if code != self.code:
        raise err.ValueError(f"Decoded code {code} differs from multihash code {self.code}.")
    _validate_raw_digest_size(self.name, raw_digest, self.max_digest_size)
    return raw_digest
def wrap(self, raw_digest: Union[bytes, bytearray, memoryview]) ‑> bytes

Wraps a raw digest into a multihash digest:

<raw digest> -> <code><size><raw digest>

Example usage:

>>> sha2_256 = multihash.get("sha2-256")
>>> raw_digest = bytes.fromhex(
... "c0535e4be2b79ffd93291305436bf889314e4a3f")
>>> sha2_256.wrap(raw_digest).hex()
'1214c0535e4be2b79ffd93291305436bf889314e4a3f'

See wrap() for more information.

def wrap(self, raw_digest: BytesLike) -> bytes:
    """
        Wraps a raw digest into a multihash digest:

        ```
        <raw digest> -> <code><size><raw digest>
        ```

        Example usage:

        ```py
        >>> sha2_256 = multihash.get("sha2-256")
        >>> raw_digest = bytes.fromhex(
        ... "c0535e4be2b79ffd93291305436bf889314e4a3f")
        >>> sha2_256.wrap(raw_digest).hex()
        '1214c0535e4be2b79ffd93291305436bf889314e4a3f'
        ```

        See `wrap` for more information.
    """
    validate(raw_digest, BytesLike)
    _, max_digest_size = self.implementation
    size = len(raw_digest)
    if max_digest_size is not None and size > max_digest_size:
        raise err.ValueError(f"Digest size {max_digest_size} is listed for {self.name}, "
                         f"but a digest of larger size {size} was given to be wrapped.")
    return self.codec.wrap(varint.encode(size)+raw_digest)
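
The size guard in `wrap` rejects only digests that are longer than the maximum; shorter (truncated) digests pass, and a maximum of `None` disables the check. A small standalone sketch of that rule follows, where `check_digest_size` is a hypothetical helper and not a library function.

```python
from typing import Optional


def check_digest_size(raw_digest: bytes, max_digest_size: Optional[int]) -> int:
    """Return the digest size, rejecting digests longer than the maximum (if any)."""
    size = len(raw_digest)
    if max_digest_size is not None and size > max_digest_size:
        raise ValueError(f"digest of size {size} exceeds maximum {max_digest_size}")
    return size


print(check_digest_size(bytes(20), 32))     # 20: truncated digests are accepted
print(check_digest_size(bytes(100), None))  # 100: no maximum, nothing to check
```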