TorrentFile API

Torrent Module

module
torrentfile.torrent

Classes and procedures pertaining to the creation of torrent meta files.

Classes

  • TorrentFile construct .torrent file.

  • TorrentFileV2 construct .torrent v2 files using provided data.

  • MetaFile base class for all TorrentFile classes.

Constants

  • BLOCK_SIZE : int size of the leaf blocks hashed for the merkle tree (16KiB).

  • HASH_SIZE : int Length of a sha256 hash.

Bittorrent V2

From Bittorrent.org Documentation pages.

Implementation details for Bittorrent Protocol v2.

Note

All strings in a .torrent file that contain text must be UTF-8 encoded.

Meta Version 2 Dictionary:

  • "announce": The URL of the tracker.

  • "info": This maps to a dictionary, with keys described below.

    • "name": A display name for the torrent. It is purely advisory.

    • "piece length": The number of bytes that each logical piece in the peer protocol refers to, i.e. it sets the granularity of piece, request, bitfield and have messages. It must be a power of two and at least 16KiB. Files are mapped into this piece address space so that each non-empty file is aligned to a piece boundary and occurs in the same order as in the file tree. The last piece of each file may be shorter than the specified piece length, resulting in an alignment gap.

    • "meta version": An integer value, set to 2 to indicate compatibility with the current revision of this specification. Version 1 is not assigned to avoid confusion with BEP3. Future revisions will only increment this value to indicate an incompatible change has been made, for example that hash algorithms were changed due to newly discovered vulnerabilities. Implementations must check this field first and indicate that a torrent is of a newer version than they can handle before performing other validations which may result in more general messages about invalid files.

    • "file tree": A tree of dictionaries where dictionary keys represent UTF-8 encoded path elements. Entries with zero-length keys describe the properties of the composed path at that point. 'UTF-8 encoded' in this context only means that if the native encoding is known at creation time it must be converted to UTF-8. Keys may contain invalid UTF-8 sequences or characters and names that are reserved on specific filesystems. Implementations must be prepared to sanitize them. On most platforms path components exactly matching '.' and '..' must be sanitized since they could lead to directory traversal attacks and conflicting path descriptions. On platforms that require valid UTF-8 path components this sanitizing step must happen after normalizing overlong UTF-8 encodings.

    • "length": Length of the file in bytes. Presence of this field indicates that the dictionary describes a file, not a directory, which means it must not have any sibling entries.

    • "pieces root": For non-empty files this is the root hash of a merkle tree with a branching factor of 2, constructed from 16KiB blocks of the file. The last block may be shorter than 16KiB. The remaining leaf hashes beyond the end of the file required to construct upper layers of the merkle tree are set to zero. As of meta version 2 SHA2-256 is used as digest function for the merkle tree. The hash is stored in its binary form, not as a human-readable string.

  • "piece layers": A dictionary of strings. For each file in the file tree that is larger than the piece size it contains one string value. The keys are the merkle roots while the values consist of concatenated hashes of one layer within that merkle tree. The layer is chosen so that one hash covers piece length bytes. For example, if the piece size is 16KiB then the leaf hashes are used. If a piece size of 128KiB is used then the 3rd layer up from the leaf hashes is used. Layer hashes which exclusively cover data beyond the end of file, i.e. are only needed to balance the tree, are omitted. All hashes are stored in their binary format. A torrent is not valid if this field is absent, the contained hashes do not match the merkle roots or are not from the correct layer.
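
The "pieces root" and "piece layers" rules above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not the library's hasher classes: it keeps the whole file in memory, it assumes non-empty input, and the helper names (merkle_layers, pieces_root, piece_layer) are made up for this example. BLOCK_SIZE and HASH_SIZE take the values implied by the text (16KiB blocks, 32-byte SHA-256 digests).

```python
import hashlib

BLOCK_SIZE = 2**14  # 16KiB leaf blocks
HASH_SIZE = 32      # length of a SHA-256 digest in bytes


def merkle_layers(data):
    """Return all tree layers for non-empty data, leaves first, root last."""
    leaves = [
        hashlib.sha256(data[i:i + BLOCK_SIZE]).digest()
        for i in range(0, len(data), BLOCK_SIZE)
    ]
    # Leaf hashes beyond the end of the file are set to zero so that the
    # leaf count becomes a power of two.
    width = 1
    while width < len(leaves):
        width *= 2
    leaves += [bytes(HASH_SIZE)] * (width - len(leaves))

    layers = [leaves]
    while len(layers[-1]) > 1:
        prev = layers[-1]
        # Branching factor of 2: hash each pair of sibling hashes.
        layers.append([
            hashlib.sha256(prev[i] + prev[i + 1]).digest()
            for i in range(0, len(prev), 2)
        ])
    return layers


def pieces_root(data):
    """Root hash of the merkle tree, in binary form."""
    return merkle_layers(data)[-1][0]


def piece_layer(data, piece_length):
    """Concatenated hashes of the layer where one hash covers piece_length bytes."""
    if len(data) <= piece_length:
        raise ValueError("piece layers only exist for files larger than a piece")
    # 16KiB -> layer 0 (leaf hashes), 128KiB -> layer 3, and so on.
    level = (piece_length // BLOCK_SIZE).bit_length() - 1
    # Hashes that only cover data beyond the end of the file are omitted.
    count = -(-len(data) // piece_length)
    return b"".join(merkle_layers(data)[level][:count])
```

For a file of three 16KiB blocks and a 32KiB piece length, the sketch pads the leaves to four hashes, and the piece layer holds two 32-byte hashes whose SHA-256 is the root.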

Important

The file tree root dictionary itself must not be a file, i.e. it must not contain a zero-length key with a dictionary containing a length key.
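
To make the shape of a "file tree" concrete, here is a small hand-written tree and a walker that lists its files. It is only a sketch: iter_files is a hypothetical helper, the keys are assumed to be already decoded to str, and the all-zero "pieces root" is a placeholder rather than a real hash.

```python
def iter_files(tree, prefix=()):
    """Yield (path_components, properties) for every file in a v2 file tree."""
    for name, node in sorted(tree.items()):
        if name == "":
            # A zero-length key holds the properties of the path composed so far.
            if not prefix:
                raise ValueError("the file tree root must not be a file")
            yield prefix, node
        elif name in (".", ".."):
            # These components could lead to directory traversal.
            raise ValueError(f"unsafe path component: {name!r}")
        else:
            yield from iter_files(node, prefix + (name,))


example_tree = {
    "docs": {
        "readme.txt": {"": {"length": 12, "pieces root": bytes(32)}},
    },
    # Empty files carry a length but no "pieces root".
    "empty.bin": {"": {"length": 0}},
}

for path, props in iter_files(example_tree):
    print("/".join(path), props["length"])
```

Rejecting '.' and '..' outright is one possible sanitizing policy; a client may prefer to rename such components instead.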

Bittorrent V1

v1 meta-dictionary

  • announce: The URL of the tracker.

  • info: This maps to a dictionary, with keys described below.

    • name: maps to a UTF-8 encoded string which is the suggested name to save the file (or directory) as. It is purely advisory.

    • piece length: maps to the number of bytes in each piece the file is split into. For the purposes of transfer, files are split into fixed-size pieces which are all the same length except for possibly the last one, which may be truncated. The piece length is almost always a power of two, most commonly 2^18 = 256 KiB.

    • pieces: maps to a string whose length is a multiple of 20. It is to be subdivided into strings of length 20, each of which is the SHA1 hash of the piece at the corresponding index.

    • length: Present only in the single-file case, where it maps to the length of the file in bytes. Exactly one of length and files is present.

    • files: Present only in the multi-file case, where the download represents a set of files which go in a directory structure. For the purposes of the other keys, the multi-file case is treated as only having a single file by concatenating the files in the order they appear in the files list. The files list is the value files maps to, and is a list of dictionaries containing the following keys:

      • path: A list of UTF-8 encoded strings corresponding to subdirectory names, the last of which is the actual file name.

      • length: Maps to the length of the file in bytes.

Note

In the single-file case, the name key is the name of a file; in the multiple-file case, it is the name of a directory.
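
As a rough illustration of the v1 keys, the sketch below builds a single-file info dictionary from in-memory data and splits the pieces string back into 20-byte SHA1 hashes. The helper names (v1_pieces, split_pieces) and the file name are invented for this example.

```python
import hashlib

SHA1_SIZE = 20  # length of a SHA1 digest in bytes


def v1_pieces(data, piece_length):
    """Concatenate the SHA1 hash of every piece_length-sized piece of data."""
    return b"".join(
        hashlib.sha1(data[i:i + piece_length]).digest()
        for i in range(0, len(data), piece_length)
    )


def split_pieces(pieces):
    """Split a pieces string into its individual 20-byte hashes."""
    if len(pieces) % SHA1_SIZE:
        raise ValueError("pieces length must be a multiple of 20")
    return [pieces[i:i + SHA1_SIZE] for i in range(0, len(pieces), SHA1_SIZE)]


piece_length = 2**18                   # 256 KiB
data = b"x" * (piece_length + 1)       # one full piece plus a 1-byte last piece

info = {
    "name": "example.bin",             # hypothetical file name
    "piece length": piece_length,
    "length": len(data),               # single-file case: length, no files key
    "pieces": v1_pieces(data, piece_length),
}
```

The last piece here is a single byte, which shows that only the final piece may be shorter than the piece length.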

Classes
  • MetaFile Base Class for all TorrentFile classes.
  • TorrentFile Class for creating Bittorrent meta files.
  • TorrentFileV2 Class for creating Bittorrent meta v2 files.
  • TorrentFileHybrid Construct the Hybrid torrent meta file with provided parameters.

MetaFile

Base Class for all TorrentFile classes.

Parameters:

Name          Type    Description                                        Default
path          `str`   Target path to torrent content; must be provided.  None
announce      `str`   One or more tracker URLs.                          None
comment       `str`   A comment.                                         None
piece_length  `int`   Size of torrent pieces.                            None
private       `bool`  For private trackers.                              False
outfile       `str`   Target path to write the .torrent file.            None
source        `str`   Private tracker source.                            None
noprogress    `bool`  If True, disable showing the progress bar.         False

Source code in torrentfile\torrent.py
class MetaFile:
    """Base Class for all TorrentFile classes.

    Parameters
    ----------
    path : `str`
        target path to torrent content.  Default: None
    announce : `str`
        One or more tracker URLs.  Default: None
    comment : `str`
        A comment.  Default: None
    piece_length : `int`
        Size of torrent pieces.  Default: None
    private : `bool`
        For private trackers.  Default: False
    outfile : `str`
        target path to write .torrent file. Default: None
    source : `str`
        Private tracker source. Default: None
    noprogress : `bool`
        If True disable showing the progress bar.
    """

    hasher = None

    @classmethod
    def set_callback(cls, func):
        """
        Assign a callback function for the Hashing class to call for each hash.

        Parameters
        ----------
        func : function
            The callback function which accepts a single parameter.
        """
        if "hasher" in vars(cls) and vars(cls)["hasher"]:
            cls.hasher.set_callback(func)

    # fmt: off
    def __init__(self, path=None, announce=None, private=False,
                 source=None, piece_length=None, comment=None,
                 outfile=None, url_list=None, noprogress=False):
        """Construct MetaFile superclass and assign local attributes."""
        if not path:
            raise utils.MissingPathError

        # base path to torrent content.
        self.path = path

        # Format piece_length attribute.
        if piece_length:
            self.piece_length = utils.normalize_piece_length(piece_length)
        else:
            self.piece_length = utils.path_piece_length(self.path)

        # Assign announce URL to empty string if none provided.
        if not announce:
            self.announce = ""
            self.announce_list = [[""]]

        # Most torrent clients have editing trackers as a feature.
        elif isinstance(announce, str):
            self.announce = announce
            self.announce_list = [announce]
        elif isinstance(announce, Sequence):
            self.announce = announce[0]
            self.announce_list = [announce]

        if private:
            self.private = 1
        else:
            self.private = None

        self.outfile = outfile
        self.noprogress = noprogress
        self.comment = comment
        self.url_list = url_list
        self.source = source
        self.meta = {
            "announce": self.announce,
            "announce-list": self.announce_list,
            "created by": f"TorrentFile:v{version}",
            "creation date": int(datetime.timestamp(datetime.now())),
            "info": {},
        }
        logger.debug("Announce list = %s", str(self.announce_list))
        if comment:
            self.meta["info"]["comment"] = comment
        if private:
            self.meta["info"]["private"] = 1
        if source:
            self.meta["info"]["source"] = source
        if url_list:
            self.meta["url-list"] = url_list
        self.meta["info"]["name"] = os.path.basename(self.path)
        self.meta["info"]["piece length"] = self.piece_length
    # fmt: on

    def assemble(self):
        """Overload in subclasses.

        Raises
        ------
        `Exception`
            NotImplementedError
        """
        raise NotImplementedError

    def sort_meta(self):
        """Sort the info and meta dictionaries."""
        meta = self.meta
        meta["info"] = dict(sorted(list(meta["info"].items())))
        meta = dict(sorted(list(meta.items())))
        return meta

    def write(self, outfile=None):
        """Write meta information to .torrent file.

        Parameters
        ----------
        outfile : `str`
            Destination path for .torrent file. default=None

        Returns
        -------
        outfile : `str`
            Where the .torrent file was written.
        meta : `dict`
            .torrent meta information.
        """
        if outfile is not None:
            self.outfile = outfile

        if self.outfile is None:
            self.outfile = str(self.path) + ".torrent"

        self.meta = self.sort_meta()
        pyben.dump(self.meta, self.outfile)
        return self.outfile, self.meta

__init__(self, path=None, announce=None, private=False, source=None, piece_length=None, comment=None, outfile=None, url_list=None, noprogress=False) special

Construct MetaFile superclass and assign local attributes.

Source code in torrentfile\torrent.py
def __init__(self, path=None, announce=None, private=False,
             source=None, piece_length=None, comment=None,
             outfile=None, url_list=None, noprogress=False):
    """Construct MetaFile superclass and assign local attributes."""
    if not path:
        raise utils.MissingPathError

    # base path to torrent content.
    self.path = path

    # Format piece_length attribute.
    if piece_length:
        self.piece_length = utils.normalize_piece_length(piece_length)
    else:
        self.piece_length = utils.path_piece_length(self.path)

    # Assign announce URL to empty string if none provided.
    if not announce:
        self.announce = ""
        self.announce_list = [[""]]

    # Most torrent clients have editing trackers as a feature.
    elif isinstance(announce, str):
        self.announce = announce
        self.announce_list = [announce]
    elif isinstance(announce, Sequence):
        self.announce = announce[0]
        self.announce_list = [announce]

    if private:
        self.private = 1
    else:
        self.private = None

    self.outfile = outfile
    self.noprogress = noprogress
    self.comment = comment
    self.url_list = url_list
    self.source = source
    self.meta = {
        "announce": self.announce,
        "announce-list": self.announce_list,
        "created by": f"TorrentFile:v{version}",
        "creation date": int(datetime.timestamp(datetime.now())),
        "info": {},
    }
    logger.debug("Announce list = %s", str(self.announce_list))
    if comment:
        self.meta["info"]["comment"] = comment
    if private:
        self.meta["info"]["private"] = 1
    if source:
        self.meta["info"]["source"] = source
    if url_list:
        self.meta["url-list"] = url_list
    self.meta["info"]["name"] = os.path.basename(self.path)
    self.meta["info"]["piece length"] = self.piece_length
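
The announce handling in the constructor above has three branches. The standalone function below mirrors them so the resulting values are easy to inspect. normalize_announce is a hypothetical name, the tracker URLs are placeholders, and the final TypeError is an addition of this sketch (the source shown simply leaves the attributes unset for other types).

```python
from collections.abc import Sequence


def normalize_announce(announce=None):
    """Return (announce, announce_list) following the branches shown above."""
    if not announce:
        # No tracker given: empty string and a single empty tier.
        return "", [[""]]
    if isinstance(announce, str):
        # A single URL; note that the list is flat in this branch.
        return announce, [announce]
    if isinstance(announce, Sequence):
        # Several URLs: the first is the primary, all of them form one tier.
        return announce[0], [announce]
    raise TypeError("announce must be a str or a sequence of str")


print(normalize_announce())
print(normalize_announce("http://tracker.example/announce"))
print(normalize_announce(["http://a.example/announce", "http://b.example/announce"]))
```

Writing the branches out this way shows that a single string yields a flat list while a sequence yields a list containing one tier.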

assemble(self)

Overload in subclasses.

Exceptions:

Type                   Description
`NotImplementedError`  Always raised; subclasses must override this method.

Source code in torrentfile\torrent.py
def assemble(self):
    """Overload in subclasses.

    Raises
    ------
    `Exception`
        NotImplementedError
    """
    raise NotImplementedError

set_callback(func) classmethod

Assign a callback function for the Hashing class to call for each hash.

Parameters:

Name  Type      Description                                              Default
func  function  The callback function which accepts a single parameter.  required

Source code in torrentfile\torrent.py
@classmethod
def set_callback(cls, func):
    """
    Assign a callback function for the Hashing class to call for each hash.

    Parameters
    ----------
    func : function
        The callback function which accepts a single parameter.
    """
    if "hasher" in vars(cls) and vars(cls)["hasher"]:
        cls.hasher.set_callback(func)

sort_meta(self)

Sort the info and meta dictionaries.

Source code in torrentfile\torrent.py
def sort_meta(self):
    """Sort the info and meta dictionaries."""
    meta = self.meta
    meta["info"] = dict(sorted(list(meta["info"].items())))
    meta = dict(sorted(list(meta.items())))
    return meta
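
Bencoded dictionaries must list their keys in sorted order, which is why the meta and info dictionaries are sorted before writing. The sketch below reproduces the same two-level sort on a plain dictionary; unlike the method above it returns new dictionaries instead of modifying the input, and the sample values are made up.

```python
def sort_meta(meta):
    """Return a copy of meta with the top-level and info keys sorted."""
    info = dict(sorted(meta["info"].items()))
    return dict(sorted({**meta, "info": info}.items()))


meta = {
    "info": {"piece length": 16384, "name": "example"},
    "created by": "example",
    "announce": "",
}
ordered = sort_meta(meta)
print(list(ordered))          # top-level keys in sorted order
print(list(ordered["info"]))  # info keys in sorted order
```

Only these two levels are sorted here; deeper dictionaries, such as a v2 file tree, depend on how they were built or on the encoder sorting keys itself.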

write(self, outfile=None)

Write meta information to .torrent file.

Parameters:

Name     Type   Description                              Default
outfile  `str`  Destination path for the .torrent file.  None

Returns:

Type    Description
`str`   Where the .torrent file was written.
`dict`  The .torrent meta information.

Source code in torrentfile\torrent.py
def write(self, outfile=None):
    """Write meta information to .torrent file.

    Parameters
    ----------
    outfile : `str`
        Destination path for .torrent file. default=None

    Returns
    -------
    outfile : `str`
        Where the .torrent file was written.
    meta : `dict`
        .torrent meta information.
    """
    if outfile is not None:
        self.outfile = outfile

    if self.outfile is None:
        self.outfile = str(self.path) + ".torrent"

    self.meta = self.sort_meta()
    pyben.dump(self.meta, self.outfile)
    return self.outfile, self.meta
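
The write method hands the sorted dictionary to pyben.dump, whose internals are not shown here. For orientation, the following is a minimal sketch of the standard bencoding rules (integers, byte strings, lists and key-sorted dictionaries); it is an illustration and not pyben's code.

```python
def bencode(value):
    """Encode ints, str/bytes, lists and dicts using the bencoding rules."""
    if isinstance(value, bool):
        raise TypeError("booleans have no bencoded form")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, (bytes, bytearray)):
        return b"%d:%s" % (len(value), bytes(value))
    if isinstance(value, (list, tuple)):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must appear in sorted order.
        encoded = {
            (key.encode("utf-8") if isinstance(key, str) else bytes(key)): item
            for key, item in value.items()
        }
        body = b"".join(
            bencode(key) + bencode(encoded[key]) for key in sorted(encoded)
        )
        return b"d" + body + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


print(bencode({"info": {"name": "f", "length": 0}}))
```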

TorrentFile (MetaFile)

Class for creating Bittorrent meta files.

Construct Torrentfile class instance object.

Parameters:

Name    Type    Description                                  Default
kwargs  `dict`  Dictionary containing torrent file options.  {}

Source code in torrentfile\torrent.py
class TorrentFile(MetaFile):
    """Class for creating Bittorrent meta files.

    Construct *Torrentfile* class instance object.

    Parameters
    ----------
    kwargs : `dict`
        Dictionary containing torrent file options.
    """

    hasher = Hasher

    def __init__(self, **kwargs):
        """Construct TorrentFile instance with given keyword args.

        Parameters
        ----------
        kwargs : `dict`
            dictionary of keyword args passed to superclass.
        """
        super().__init__(**kwargs)
        logger.debug("Making Bittorrent V1 meta file.")
        self.assemble()

    def assemble(self):
        """Assemble components of torrent metafile.

        Returns
        -------
        `dict`
            metadata dictionary for torrent file
        """
        info = self.meta["info"]
        size, filelist = utils.filelist_total(self.path)
        if os.path.isfile(self.path):
            info["length"] = size
        else:
            info["files"] = [
                {
                    "length": os.path.getsize(path),
                    "path": os.path.relpath(path, self.path).split(os.sep),
                }
                for path in filelist
            ]

        pieces = bytearray()
        feeder = Hasher(filelist, self.piece_length)
        if self.noprogress:
            for piece in feeder:
                pieces.extend(piece)
        else:
            from tqdm import tqdm

            for piece in tqdm(
                iterable=feeder,
                desc="Hashing Content",
                total=size // self.piece_length,
                unit="bytes",
                unit_scale=True,
                unit_divisor=self.piece_length,
                initial=0,
                leave=True,
            ):
                pieces.extend(piece)
        info["pieces"] = pieces

hasher (_CbMixin)

Piece hasher for Bittorrent V1 files.

Takes a sorted list of all file paths and calculates the sha1 hash of fixed-size pieces of file data, continuing seamlessly from each file into the next, until the last piece, which may be smaller than the others.

Parameters:

Name          Type    Description                             Default
paths         `list`  List of files.                          required
piece_length  `int`   Size of chunks to split the data into.  required

Source code in torrentfile\torrent.py
class Hasher(_CbMixin):
    """Piece hasher for Bittorrent V1 files.

    Takes a sorted list of all file paths, calculates sha1 hash
    for fixed size pieces of file data from each file
    seamlessly until the last piece which may be smaller than others.

    Parameters
    ----------
    paths : `list`
        List of files.
    piece_length : `int`
        Size of chunks to split the data into.
    """

    def __init__(self, paths, piece_length):
        """Generate hashes of piece length data from filelist contents."""
        self.piece_length = piece_length
        self.paths = paths
        self.total = sum([os.path.getsize(i) for i in self.paths])
        self.index = 0
        self.current = open(self.paths[0], "rb")
        logger.debug(
            "Hashing v1 torrent file. Size: %s Piece Length: %s",
            humanize_bytes(self.total),
            humanize_bytes(self.piece_length),
        )

    def __iter__(self):
        """Iterate through feed pieces.

        Returns
        -------
        self : `iterator`
            Iterator for leaves/hash pieces.
        """
        return self

    def _handle_partial(self, arr):
        """Complete a partial piece that spans two or more files.

        Parameters
        ----------
        arr : `bytearray`
            Incomplete piece containing partial data.

        Returns
        -------
        digest : `bytes`
            SHA1 digest of the complete piece.
        """
        while len(arr) < self.piece_length and self.next_file():
            target = self.piece_length - len(arr)
            temp = bytearray(target)
            size = self.current.readinto(temp)
            arr.extend(temp[:size])
            if size == target:
                break
        return sha1(arr).digest()  # nosec

    def next_file(self):
        """Seamlessly transition to next file in file list."""
        self.index += 1
        if self.index < len(self.paths):
            self.current.close()
            self.current = open(self.paths[self.index], "rb")
            return True
        return False

    def __next__(self):
        """Generate piece-length pieces of data from input file list."""
        while True:
            piece = bytearray(self.piece_length)
            size = self.current.readinto(piece)
            if size == 0:
                if not self.next_file():
                    raise StopIteration
            elif size < self.piece_length:
                return self._handle_partial(piece[:size])
            else:
                return sha1(piece).digest()  # nosec
__init__(self, paths, piece_length) special

Generate hashes of piece length data from filelist contents.

Source code in torrentfile\torrent.py
def __init__(self, paths, piece_length):
    """Generate hashes of piece length data from filelist contents."""
    self.piece_length = piece_length
    self.paths = paths
    self.total = sum([os.path.getsize(i) for i in self.paths])
    self.index = 0
    self.current = open(self.paths[0], "rb")
    logger.debug(
        "Hashing v1 torrent file. Size: %s Piece Length: %s",
        humanize_bytes(self.total),
        humanize_bytes(self.piece_length),
    )
__iter__(self) special

Iterate through feed pieces.

Returns:

Type        Description
`iterator`  Iterator for leaves/hash pieces.

Source code in torrentfile\torrent.py
def __iter__(self):
    """Iterate through feed pieces.

    Returns
    -------
    self : `iterator`
        Iterator for leaves/hash pieces.
    """
    return self
__next__(self) special

Generate piece-length pieces of data from input file list.

Source code in torrentfile\torrent.py
def __next__(self):
    """Generate piece-length pieces of data from input file list."""
    while True:
        piece = bytearray(self.piece_length)
        size = self.current.readinto(piece)
        if size == 0:
            if not self.next_file():
                raise StopIteration
        elif size < self.piece_length:
            return self._handle_partial(piece[:size])
        else:
            return sha1(piece).digest()  # nosec
next_file(self)

Seamlessly transition to next file in file list.

Source code in torrentfile\torrent.py
def next_file(self):
    """Seamlessly transition to next file in file list."""
    self.index += 1
    if self.index < len(self.paths):
        self.current.close()
        self.current = open(self.paths[self.index], "rb")
        return True
    return False
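
The key property of the v1 hasher is that pieces run across file boundaries, so the result equals hashing the concatenation of all files. The generator below is a simplified sketch of that idea using in-memory streams instead of file paths; iter_piece_hashes is a hypothetical name and the logic is not the class above.

```python
import hashlib
import io


def iter_piece_hashes(streams, piece_length):
    """Yield the SHA1 digest of each piece, letting pieces span streams."""
    buf = bytearray()
    for stream in streams:
        while True:
            # Only read what is still missing from the current piece.
            chunk = stream.read(piece_length - len(buf))
            if not chunk:
                break  # stream exhausted; keep the partial piece for the next one
            buf.extend(chunk)
            if len(buf) == piece_length:
                yield hashlib.sha1(buf).digest()
                buf.clear()
    if buf:
        # The last piece may be shorter than piece_length.
        yield hashlib.sha1(buf).digest()


parts = [b"aaaa", b"bb", b"ccccc"]
hashes = list(iter_piece_hashes([io.BytesIO(p) for p in parts], piece_length=4))
print(len(hashes))
```

With a piece length of 4, the second piece is made of the two bytes of the second stream plus the first two bytes of the third.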

__init__(self, **kwargs) special

Construct TorrentFile instance with given keyword args.

Parameters:

Name    Type    Description                                       Default
kwargs  `dict`  Dictionary of keyword args passed to superclass.  {}

Source code in torrentfile\torrent.py
def __init__(self, **kwargs):
    """Construct TorrentFile instance with given keyword args.

    Parameters
    ----------
    kwargs : `dict`
        dictionary of keyword args passed to superclass.
    """
    super().__init__(**kwargs)
    logger.debug("Making Bittorrent V1 meta file.")
    self.assemble()

assemble(self)

Assemble components of torrent metafile.

Returns:

Type    Description
`dict`  Metadata dictionary for torrent file.

Source code in torrentfile\torrent.py
def assemble(self):
    """Assemble components of torrent metafile.

    Returns
    -------
    `dict`
        metadata dictionary for torrent file
    """
    info = self.meta["info"]
    size, filelist = utils.filelist_total(self.path)
    if os.path.isfile(self.path):
        info["length"] = size
    else:
        info["files"] = [
            {
                "length": os.path.getsize(path),
                "path": os.path.relpath(path, self.path).split(os.sep),
            }
            for path in filelist
        ]

    pieces = bytearray()
    feeder = Hasher(filelist, self.piece_length)
    if self.noprogress:
        for piece in feeder:
            pieces.extend(piece)
    else:
        from tqdm import tqdm

        for piece in tqdm(
            iterable=feeder,
            desc="Hashing Content",
            total=size // self.piece_length,
            unit="bytes",
            unit_scale=True,
            unit_divisor=self.piece_length,
            initial=0,
            leave=True,
        ):
            pieces.extend(piece)
    info["pieces"] = pieces
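
Two details of the assembly step are easy to check in isolation: how the multi-file `files` entries are derived from paths, and how many pieces a given size produces. The sketch below is self-contained and makes assumptions: v1_file_entries and piece_count are hypothetical helpers, and the traversal order (sorted os.walk) stands in for utils.filelist_total, whose ordering is not shown here.

```python
import os


def piece_count(size, piece_length):
    """Number of pieces needed for size bytes (ceiling division)."""
    return -(-size // piece_length)


def v1_file_entries(root):
    """Build v1 'files' entries for every file below root."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make the traversal order deterministic
        for filename in sorted(filenames):
            full = os.path.join(dirpath, filename)
            entries.append({
                "length": os.path.getsize(full),
                # Path components relative to the torrent's root directory.
                "path": os.path.relpath(full, root).split(os.sep),
            })
    return entries


# 256 KiB + 1 byte needs two pieces; floor division would report one.
print(piece_count(2**18 + 1, 2**18), (2**18 + 1) // 2**18)
```

The floor division used for the progress bar total above is therefore one short whenever the content size is not an exact multiple of the piece length, which affects only the displayed total, not the hashes.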

TorrentFileHybrid (MetaFile)

Construct the Hybrid torrent meta file with provided parameters.

Parameters:

Name    Type    Description                             Default
kwargs  `dict`  Keyword arguments for torrent options.  {}

Source code in torrentfile\torrent.py
class TorrentFileHybrid(MetaFile):
    """Construct the Hybrid torrent meta file with provided parameters.

    Parameters
    ----------
    kwargs : `dict`
        Keyword arguments for torrent options.
    """

    hasher = HasherHybrid

    def __init__(self, **kwargs):
        """Create Bittorrent v1 v2 hybrid metafiles."""
        super().__init__(**kwargs)
        logger.debug("Creating Hybrid torrent file.")
        self.name = os.path.basename(self.path)
        self.hashes = []
        self.piece_layers = {}
        self.pbar = None
        self.pieces = []
        self.files = []
        self.assemble()

    def assemble(self):
        """Assemble the parts of the torrentfile into meta dictionary."""
        info = self.meta["info"]
        info["meta version"] = 2

        if not self.noprogress:
            from tqdm import tqdm

            lst = utils.get_file_list(self.path)
            self.pbar = tqdm(
                desc="Hashing Files:",
                total=len(lst),
                leave=True,
                unit="file",
            )

        if os.path.isfile(self.path):
            info["file tree"] = {self.name: self._traverse(self.path)}
            info["length"] = os.path.getsize(self.path)
            if self.pbar:
                self.pbar.update(n=1)
        else:
            info["file tree"] = self._traverse(self.path)
            info["files"] = self.files
        info["pieces"] = b"".join(self.pieces)
        self.meta["piece layers"] = self.piece_layers
        return info

    def _traverse(self, path):
        """Build meta dictionary while walking directory.

        Parameters
        ----------
        path : `str`
            Path to target file.
        """
        if os.path.isfile(path):
            file_size = os.path.getsize(path)

            self.files.append(
                {
                    "length": file_size,
                    "path": os.path.relpath(path, self.path).split(os.sep),
                }
            )

            if file_size == 0:
                if self.pbar:
                    self.pbar.update(n=1)
                return {"": {"length": file_size}}

            file_hash = HasherHybrid(path, self.piece_length)

            if file_size > self.piece_length:
                self.piece_layers[file_hash.root] = file_hash.piece_layer

            self.hashes.append(file_hash)
            self.pieces.extend(file_hash.pieces)

            if file_hash.padding_file:
                self.files.append(file_hash.padding_file)

            if self.pbar:
                self.pbar.update(n=1)

            return {"": {"length": file_size, "pieces root": file_hash.root}}

        tree = {}
        if os.path.isdir(path):
            for name in sorted(os.listdir(path)):
                tree[name] = self._traverse(os.path.join(path, name))
        return tree

hasher (_CbMixin)

Calculate root and piece hashes for creating hybrid torrent file.

Create merkle tree layers from sha256-hashed 16KiB blocks of the file's contents. With a branching factor of 2, adjacent hashes are merged until each hash covers piece_length bytes, which gives the piece layer; merging then continues up to the root hash.
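The pairwise merging described above can be sketched as follows. This is a minimal illustration, not the library's code: `merkle_root_sketch` and `piece_hash_sketch` are hypothetical names, and the sketch assumes the list of hashes passed to `merkle_root_sketch` already has a power-of-two length. It always pads a short piece to the full piece width, which corresponds to the case where a file spans more than one piece.

```python
from hashlib import sha256

BLOCK_SIZE = 2**14  # 16 KiB leaf size, mirroring the module constant
HASH_SIZE = 32  # length of a sha256 digest


def merkle_root_sketch(hashes):
    """Reduce a power-of-two sized list of digests to one root digest."""
    layer = list(hashes)
    while len(layer) > 1:
        # Hash each adjacent pair to build the next layer up (branching factor 2).
        layer = [
            sha256(layer[i] + layer[i + 1]).digest()
            for i in range(0, len(layer), 2)
        ]
    return layer[0]


def piece_hash_sketch(piece, piece_length):
    """Return the merkle root covering one piece of file data."""
    leaves = [
        sha256(piece[i : i + BLOCK_SIZE]).digest()
        for i in range(0, len(piece), BLOCK_SIZE)
    ]
    # Pad with zero-filled hashes until the piece has its full leaf count.
    leaves += [bytes(HASH_SIZE)] * (piece_length // BLOCK_SIZE - len(leaves))
    return merkle_root_sketch(leaves)
```

Calling `piece_hash_sketch(data, 4 * BLOCK_SIZE)` on a chunk shorter than four blocks hashes the available blocks, fills the remaining leaves with zeroed hashes, and returns a single 32-byte digest for the piece layer.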

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| path | `str` | path to target file. | required |
| piece_length | `int` | piece length for data chunks. | required |

Source code in torrentfile\torrent.py
class HasherHybrid(_CbMixin):
    """Calculate root and piece hashes for creating hybrid torrent file.

    Create merkle tree layers from sha256 hashed 16KiB blocks of contents.
    With a branching factor of 2, merge layer hashes until blocks equal
    piece_length bytes for the piece layer, and then the root hash.

    Parameters
    ----------
    path : `str`
        path to target file.
    piece_length : `int`
        piece length for data chunks.
    """

    def __init__(self, path, piece_length):
        """Construct Hasher class instances for each file in torrent."""
        self.path = path
        self.piece_length = piece_length
        self.pieces = []
        self.layer_hashes = []
        self.piece_layer = None
        self.root = None
        self.padding_piece = None
        self.padding_file = None
        self.amount = piece_length // BLOCK_SIZE
        logger.debug(
            "Hashing partial Hybrid torrent file. Piece Length: %s Path: %s",
            humanize_bytes(self.piece_length),
            str(self.path),
        )
        with open(path, "rb") as data:
            self.process_file(data)

    def _pad_remaining(self, block_count):
        """Generate Hash sized, 0 filled bytes for padding.

        Parameters
        ----------
        block_count : `int`
            current total number of blocks collected.

        Returns
        -------
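        padding : `list`
            Zero-filled, hash-sized `bytes` objects that fill the remaining
            portion of the tree.
        """
        # when earlier pieces exist, pad the final piece to a full piece
        remaining = self.amount - block_count
        if not self.layer_hashes:
            # when the file fits in a single piece, pad only to the next
            # power of two
            power2 = next_power_2(block_count)
            remaining = power2 - block_count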
        return [bytes(HASH_SIZE) for _ in range(remaining)]

    def process_file(self, data):
        """Calculate layer hashes for contents of file.

        Parameters
        ----------
        data : `BytesIO`
            File opened in read mode.
        """
        while True:
            plength = self.piece_length
            blocks = []
            piece = sha1()  # nosec
            total = 0
            block = bytearray(BLOCK_SIZE)
            for _ in range(self.amount):
                size = data.readinto(block)
                if not size:
                    break
                total += size
                plength -= size
                blocks.append(sha256(block[:size]).digest())
                piece.update(block[:size])
            if not blocks:
                break
            if len(blocks) != self.amount:
                padding = self._pad_remaining(len(blocks))
                blocks.extend(padding)
            layer_hash = merkle_root(blocks)
            if self._cb:
                self._cb(layer_hash)
            self.layer_hashes.append(layer_hash)
            if plength > 0:
                self.padding_file = {
                    "attr": "p",
                    "length": plength,
                    "path": [".pad", str(plength)],
                }
                piece.update(bytes(plength))
            self.pieces.append(piece.digest())  # nosec
        self._calculate_root()

    def _calculate_root(self):
        """Calculate the root hash for opened file."""
        self.piece_layer = b"".join(self.layer_hashes)

        if len(self.layer_hashes) > 1:
            pad_piece = merkle_root([bytes(32) for _ in range(self.amount)])

            pow2 = next_power_2(len(self.layer_hashes))
            remainder = pow2 - len(self.layer_hashes)

            self.layer_hashes += [pad_piece for _ in range(remainder)]
        self.root = merkle_root(self.layer_hashes)
__init__(self, path, piece_length) special

Construct Hasher class instances for each file in torrent.

Source code in torrentfile\torrent.py
def __init__(self, path, piece_length):
    """Construct Hasher class instances for each file in torrent."""
    self.path = path
    self.piece_length = piece_length
    self.pieces = []
    self.layer_hashes = []
    self.piece_layer = None
    self.root = None
    self.padding_piece = None
    self.padding_file = None
    self.amount = piece_length // BLOCK_SIZE
    logger.debug(
        "Hashing partial Hybrid torrent file. Piece Length: %s Path: %s",
        humanize_bytes(self.piece_length),
        str(self.path),
    )
    with open(path, "rb") as data:
        self.process_file(data)
process_file(self, data)

Calculate layer hashes for contents of file.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| data | `BytesIO` | File opened in read mode. | required |

Source code in torrentfile\torrent.py
def process_file(self, data):
    """Calculate layer hashes for contents of file.

    Parameters
    ----------
    data : `BytesIO`
        File opened in read mode.
    """
    while True:
        plength = self.piece_length
        blocks = []
        piece = sha1()  # nosec
        total = 0
        block = bytearray(BLOCK_SIZE)
        for _ in range(self.amount):
            size = data.readinto(block)
            if not size:
                break
            total += size
            plength -= size
            blocks.append(sha256(block[:size]).digest())
            piece.update(block[:size])
        if not blocks:
            break
        if len(blocks) != self.amount:
            padding = self._pad_remaining(len(blocks))
            blocks.extend(padding)
        layer_hash = merkle_root(blocks)
        if self._cb:
            self._cb(layer_hash)
        self.layer_hashes.append(layer_hash)
        if plength > 0:
            self.padding_file = {
                "attr": "p",
                "length": plength,
                "path": [".pad", str(plength)],
            }
            piece.update(bytes(plength))
        self.pieces.append(piece.digest())  # nosec
    self._calculate_root()
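The v1 side of the hybrid hashing, where a short final piece is zero-filled and a padding file entry is recorded, can be sketched on its own. This is a minimal sketch under stated assumptions: `v1_piece_sketch` is a hypothetical helper, it takes the piece contents as in-memory bytes, and it records the number of zero bytes as the padding file's length, which is how padding files are generally described.

```python
from hashlib import sha1


def v1_piece_sketch(chunk, piece_length):
    """Return (sha1 digest, padding-file entry or None) for one piece."""
    piece = sha1()  # nosec - SHA1 is required by the v1 metadata format
    piece.update(chunk)
    shortfall = piece_length - len(chunk)
    padding_file = None
    if shortfall > 0:
        # Zero-fill the piece so the next file starts on a piece boundary.
        piece.update(bytes(shortfall))
        padding_file = {
            "attr": "p",
            "length": shortfall,
            "path": [".pad", str(shortfall)],
        }
    return piece.digest(), padding_file
```

A chunk that already fills the piece yields `None` for the padding entry, so only files whose size is not a multiple of the piece length contribute a `.pad` entry to the v1 file list.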

__init__(self, **kwargs) special

Create Bittorrent v1 v2 hybrid metafiles.

Source code in torrentfile\torrent.py
def __init__(self, **kwargs):
    """Create Bittorrent v1 v2 hybrid metafiles."""
    super().__init__(**kwargs)
    logger.debug("Creating Hybrid torrent file.")
    self.name = os.path.basename(self.path)
    self.hashes = []
    self.piece_layers = {}
    self.pbar = None
    self.pieces = []
    self.files = []
    self.assemble()

assemble(self)

Assemble the parts of the torrentfile into meta dictionary.

Source code in torrentfile\torrent.py
def assemble(self):
    """Assemble the parts of the torrentfile into meta dictionary."""
    info = self.meta["info"]
    info["meta version"] = 2

    if not self.noprogress:
        from tqdm import tqdm

        lst = utils.get_file_list(self.path)
        self.pbar = tqdm(
            desc="Hashing Files:",
            total=len(lst),
            leave=True,
            unit="file",
        )

    if os.path.isfile(self.path):
        info["file tree"] = {self.name: self._traverse(self.path)}
        info["length"] = os.path.getsize(self.path)
        if self.pbar:
            self.pbar.update(n=1)
    else:
        info["file tree"] = self._traverse(self.path)
        info["files"] = self.files
    info["pieces"] = b"".join(self.pieces)
    self.meta["piece layers"] = self.piece_layers
    return info

TorrentFileV2 (MetaFile)

Class for creating Bittorrent meta v2 files.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| kwargs | `dict` | Keyword arguments for torrent file options. | {} |

Source code in torrentfile\torrent.py
class TorrentFileV2(MetaFile):
    """Class for creating Bittorrent meta v2 files.

    Parameters
    ----------
    kwargs : `dict`
        Keyword arguments for torrent file options.
    """

    hasher = HasherV2

    def __init__(self, **kwargs):
        """Construct `TorrentFileV2` Class instance from given parameters.

        Parameters
        ----------
        kwargs : `dict`
            keyword arguments to pass to superclass.
        """
        super().__init__(**kwargs)
        logger.debug("Create .torrent v2 file.")
        self.piece_layers = {}
        self.hashes = []
        self.pbar = None
        self.assemble()

    def update(self):
        """Update for the progress bar."""
        if self.pbar:
            self.pbar.update(n=1)

    def assemble(self):
        """Assemble then return the meta dictionary for encoding.

        Returns
        -------
        meta : `dict`
            Metainformation about the torrent.
        """
        info = self.meta["info"]

        if not self.noprogress:
            from tqdm import tqdm

            lst = utils.get_file_list(self.path)
            self.pbar = tqdm(
                desc="Hashing Files:",
                total=len(lst),
                leave=True,
                unit="file",
            )

        if os.path.isfile(self.path):
            info["file tree"] = {info["name"]: self._traverse(self.path)}
            info["length"] = os.path.getsize(self.path)
            self.update()
        else:
            info["file tree"] = self._traverse(self.path)

        info["meta version"] = 2
        self.meta["piece layers"] = self.piece_layers

    def _traverse(self, path):
        """Walk directory tree.

        Parameters
        ----------
        path : `str`
            Path to file or directory.
        """
        if os.path.isfile(path):
            # Calculate Size and hashes for each file.
            size = os.path.getsize(path)

            if size == 0:
                self.update()
                return {"": {"length": size}}

            fhash = HasherV2(path, self.piece_length)

            if size > self.piece_length:
                self.piece_layers[fhash.root] = fhash.piece_layer
            self.update()
            return {"": {"length": size, "pieces root": fhash.root}}

        file_tree = {}
        if os.path.isdir(path):
            for name in sorted(os.listdir(path)):
                file_tree[name] = self._traverse(os.path.join(path, name))
        return file_tree
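The shape of the nested "file tree" dictionary produced by a recursive walk like `_traverse` can be seen in a stripped-down sketch. `file_tree_sketch` is a hypothetical name, and the sketch deliberately omits hashing, so the entries carry only "length" and no "pieces root".

```python
import os


def file_tree_sketch(path):
    """Return a nested dict mirroring the directory layout under `path`."""
    if os.path.isfile(path):
        # A zero-length key marks the entry that describes the file itself.
        return {"": {"length": os.path.getsize(path)}}
    tree = {}
    if os.path.isdir(path):
        # Sorting keeps the key order deterministic across platforms.
        for name in sorted(os.listdir(path)):
            tree[name] = file_tree_sketch(os.path.join(path, name))
    return tree
```

For a directory holding `a.txt` (3 bytes) and an empty file `sub/b.bin`, the result would be `{"a.txt": {"": {"length": 3}}, "sub": {"b.bin": {"": {"length": 0}}}}`.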

hasher (_CbMixin)

Calculate the root hash and piece layers for file contents.

Iterates over 16KiB blocks of data from given file, hashes the data, then creates a hash tree from the individual block hashes until size of hashed data equals the piece-length. Then continues the hash tree until root hash is calculated.
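The whole-file procedure, including the two padding rules, can be sketched over in-memory bytes. This is an illustrative sketch, not the library's code: `next_power_2_sketch`, `merkle_root_sketch`, and `hash_file_sketch` are hypothetical stand-ins, and the sketch assumes a non-empty input because zero-length files are skipped before hashing.

```python
from hashlib import sha256

BLOCK_SIZE = 2**14  # 16 KiB leaf size, mirroring the module constant
HASH_SIZE = 32  # length of a sha256 digest


def next_power_2_sketch(value):
    """Return the smallest power of two that is >= value."""
    power = 1
    while power < value:
        power *= 2
    return power


def merkle_root_sketch(hashes):
    """Reduce a power-of-two sized list of digests to one root digest."""
    layer = list(hashes)
    while len(layer) > 1:
        layer = [
            sha256(layer[i] + layer[i + 1]).digest()
            for i in range(0, len(layer), 2)
        ]
    return layer[0]


def hash_file_sketch(data, piece_length):
    """Return (root, piece_layer) for non-empty in-memory file contents."""
    if not data:
        raise ValueError("zero-length files carry no hashes")
    per_piece = piece_length // BLOCK_SIZE
    layer_hashes = []
    for start in range(0, len(data), piece_length):
        piece = data[start : start + piece_length]
        leaves = [
            sha256(piece[i : i + BLOCK_SIZE]).digest()
            for i in range(0, len(piece), BLOCK_SIZE)
        ]
        if len(leaves) != per_piece:
            # A file that fits in one piece pads only to the next power of
            # two; the short last piece of a larger file pads to a full piece.
            if layer_hashes:
                target = per_piece
            else:
                target = next_power_2_sketch(len(leaves))
            leaves += [bytes(HASH_SIZE)] * (target - len(leaves))
        layer_hashes.append(merkle_root_sketch(leaves))
    piece_layer = b"".join(layer_hashes)
    if len(layer_hashes) > 1:
        # Pad the piece layer with the root of an all-zero piece.
        pad_piece = merkle_root_sketch([bytes(HASH_SIZE)] * per_piece)
        target = next_power_2_sketch(len(layer_hashes))
        layer_hashes += [pad_piece] * (target - len(layer_hashes))
    return merkle_root_sketch(layer_hashes), piece_layer
```

For a file smaller than one block, the root is simply the sha256 digest of the contents and the piece layer holds that single hash; such a file would not get an entry in "piece layers" because its size does not exceed the piece length.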

Parameters:

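| Name | Type | Description | Default |
|------|------|-------------|---------|
| path | `str` | Path to file. | required |
| piece_length | `int` | Piece length in bytes; each piece-layer hash covers this much data. | required |
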
Source code in torrentfile\torrent.py
class HasherV2(_CbMixin):
    """Calculate the root hash and piece layers for file contents.

    Iterates over 16KiB blocks of data from given file, hashes the data,
    then creates a hash tree from the individual block hashes until size of
    hashed data equals the piece-length.  Then continues the hash tree until
    root hash is calculated.

    Parameters
    ----------
    path : `str`
        Path to file.
    piece_length : `int`
        Piece length in bytes; each piece-layer hash covers this much data.
    """

    def __init__(self, path, piece_length):
        """Calculate and store hash information for specific file."""
        self.path = path
        self.root = None
        self.piece_layer = None
        self.layer_hashes = []
        self.piece_length = piece_length
        self.num_blocks = piece_length // BLOCK_SIZE
        logger.debug(
            "Hashing partial v2 torrent file. Piece Length: %s Path: %s",
            humanize_bytes(self.piece_length),
            str(self.path),
        )

        with open(self.path, "rb") as fd:
            self.process_file(fd)

    def process_file(self, fd):
        """Calculate hashes over 16KiB chunks of file content.

        Parameters
        ----------
        fd : `BufferedReader`
            File opened in binary read mode.
        """
        while True:
            total = 0
            blocks = []
            leaf = bytearray(BLOCK_SIZE)
            # generate leaves of merkle tree

            for _ in range(self.num_blocks):
                size = fd.readinto(leaf)
                total += size
                if not size:
                    break
                blocks.append(sha256(leaf[:size]).digest())

            # an empty blocks list means EOF
            if not blocks:
                break
            if len(blocks) != self.num_blocks:
                # the final piece is short: when the file spans multiple
                # pieces, pad the leaves up to a full piece
                remaining = self.num_blocks - len(blocks)
                if not self.layer_hashes:
                    # when the file fits within a single piece, pad only
                    # up to the next power of two
                    power2 = next_power_2(len(blocks))
                    remaining = power2 - len(blocks)

                # pad the rest with zeroed hashes to fill the remaining space.
                padding = [bytes(32) for _ in range(remaining)]
                blocks.extend(padding)
            # calculate the root hash for the merkle tree up to piece-length

            layer_hash = merkle_root(blocks)
            if self._cb:
                self._cb(layer_hash)
            self.layer_hashes.append(layer_hash)
        self._calculate_root()

    def _calculate_root(self):
        """Calculate root hash for the target file."""
        self.piece_layer = b"".join(self.layer_hashes)
        hashes = len(self.layer_hashes)
        if hashes > 1:
            pow2 = next_power_2(hashes)
            remainder = pow2 - hashes
            pad_piece = [bytes(HASH_SIZE) for _ in range(self.num_blocks)]
            for _ in range(remainder):
                self.layer_hashes.append(merkle_root(pad_piece))
        self.root = merkle_root(self.layer_hashes)
__init__(self, path, piece_length) special

Calculate and store hash information for specific file.

Source code in torrentfile\torrent.py
def __init__(self, path, piece_length):
    """Calculate and store hash information for specific file."""
    self.path = path
    self.root = None
    self.piece_layer = None
    self.layer_hashes = []
    self.piece_length = piece_length
    self.num_blocks = piece_length // BLOCK_SIZE
    logger.debug(
        "Hashing partial v2 torrent file. Piece Length: %s Path: %s",
        humanize_bytes(self.piece_length),
        str(self.path),
    )

    with open(self.path, "rb") as fd:
        self.process_file(fd)
process_file(self, fd)

Calculate hashes over 16KiB chunks of file content.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| fd | `BufferedReader` | File opened in binary read mode. | required |

Source code in torrentfile\torrent.py
def process_file(self, fd):
    """Calculate hashes over 16KiB chunks of file content.

    Parameters
    ----------
    fd : `BufferedReader`
        File opened in binary read mode.
    """
    while True:
        total = 0
        blocks = []
        leaf = bytearray(BLOCK_SIZE)
        # generate leaves of merkle tree

        for _ in range(self.num_blocks):
            size = fd.readinto(leaf)
            total += size
            if not size:
                break
            blocks.append(sha256(leaf[:size]).digest())

        # an empty blocks list means EOF
        if not blocks:
            break
        if len(blocks) != self.num_blocks:
            # the final piece is short: when the file spans multiple
            # pieces, pad the leaves up to a full piece
            remaining = self.num_blocks - len(blocks)
            if not self.layer_hashes:
                # when the file fits within a single piece, pad only
                # up to the next power of two
                power2 = next_power_2(len(blocks))
                remaining = power2 - len(blocks)

            # pad the rest with zeroed hashes to fill the remaining space.
            padding = [bytes(32) for _ in range(remaining)]
            blocks.extend(padding)
        # calculate the root hash for the merkle tree up to piece-length

        layer_hash = merkle_root(blocks)
        if self._cb:
            self._cb(layer_hash)
        self.layer_hashes.append(layer_hash)
    self._calculate_root()

__init__(self, **kwargs) special

Construct TorrentFileV2 Class instance from given parameters.

Parameters:

| Name | Type | Description | Default |
|------|------|-------------|---------|
| kwargs | `dict` | keyword arguments to pass to superclass. | {} |

Source code in torrentfile\torrent.py
def __init__(self, **kwargs):
    """Construct `TorrentFileV2` Class instance from given parameters.

    Parameters
    ----------
    kwargs : `dict`
        keyword arguments to pass to superclass.
    """
    super().__init__(**kwargs)
    logger.debug("Create .torrent v2 file.")
    self.piece_layers = {}
    self.hashes = []
    self.pbar = None
    self.assemble()

assemble(self)

Assemble then return the meta dictionary for encoding.

Returns:

| Type | Description |
|------|-------------|
| `dict` | Metainformation about the torrent. |


Source code in torrentfile\torrent.py
def assemble(self):
    """Assemble then return the meta dictionary for encoding.

    Returns
    -------
    meta : `dict`
        Metainformation about the torrent.
    """
    info = self.meta["info"]

    if not self.noprogress:
        from tqdm import tqdm

        lst = utils.get_file_list(self.path)
        self.pbar = tqdm(
            desc="Hashing Files:",
            total=len(lst),
            leave=True,
            unit="file",
        )

    if os.path.isfile(self.path):
        info["file tree"] = {info["name"]: self._traverse(self.path)}
        info["length"] = os.path.getsize(self.path)
        self.update()
    else:
        info["file tree"] = self._traverse(self.path)

    info["meta version"] = 2
    self.meta["piece layers"] = self.piece_layers

update(self)

Update for the progress bar.

Source code in torrentfile\torrent.py
def update(self):
    """Update for the progress bar."""
    if self.pbar:
        self.pbar.update(n=1)