TorrentFile API
Torrent
Module
torrentfile.torrent
Classes and procedures pertaining to the creation of torrent meta files.
Classes
- TorrentFile: construct .torrent file.
- TorrentFileV2: construct .torrent v2 files using provided data.
- MetaFile: base class for all MetaFile classes.
Constants
- BLOCK_SIZE : int
  Size in bytes of the 16KiB data blocks hashed to form the leaves of the merkle tree.
- HASH_SIZE : int
  Length in bytes of a sha256 hash (32).
Bittorrent V2
From Bittorrent.org Documentation pages.
Implementation details for Bittorrent Protocol v2.
Note
All strings in a .torrent file that contain text must be UTF-8 encoded.
Meta Version 2 Dictionary:
- "announce": The URL of the tracker.
- "info": This maps to a dictionary, with keys described below.
  - "name": A display name for the torrent. It is purely advisory.
  - "piece length": The number of bytes that each logical piece in the peer protocol refers to, i.e. it sets the granularity of piece, request, bitfield and have messages. It must be a power of two and at least 16KiB. Files are mapped into this piece address space so that each non-empty file is aligned to a piece boundary and occurs in the same order as in the file tree. The last piece of each file may be shorter than the specified piece length, resulting in an alignment gap.
  - "meta version": An integer value, set to 2 to indicate compatibility with the current revision of this specification. Version 1 is not assigned to avoid confusion with BEP3. Future revisions will only increment this value to indicate an incompatible change has been made, for example that hash algorithms were changed due to newly discovered vulnerabilities. Implementations must check this field first and indicate that a torrent is of a newer version than they can handle before performing other validations which may result in more general messages about invalid files.
  - "file tree": A tree of dictionaries where dictionary keys represent UTF-8 encoded path elements. Entries with zero-length keys describe the properties of the composed path at that point. 'UTF-8 encoded' in this context only means that if the native encoding is known at creation time it must be converted to UTF-8. Keys may contain invalid UTF-8 sequences or characters and names that are reserved on specific filesystems. Implementations must be prepared to sanitize them. On most platforms path components exactly matching '.' and '..' must be sanitized since they could lead to directory traversal attacks and conflicting path descriptions. On platforms that require valid UTF-8 path components this sanitizing step must happen after normalizing overlong UTF-8 encodings.
    - "length": Length of the file in bytes. Presence of this field indicates that the dictionary describes a file, not a directory, which means it must not have any sibling entries.
    - "pieces root": For non-empty files this is the root hash of a merkle tree with a branching factor of 2, constructed from 16KiB blocks of the file. The last block may be shorter than 16KiB. The remaining leaf hashes beyond the end of the file required to construct upper layers of the merkle tree are set to zero. As of meta version 2, SHA2-256 is used as the digest function for the merkle tree. The hash is stored in its binary form, not as a human-readable string.
- "piece layers": A dictionary of strings. For each file in the file tree that is larger than the piece size it contains one string value. The keys are the merkle roots while the values consist of concatenated hashes of one layer within that merkle tree. The layer is chosen so that one hash covers piece length bytes. For example, if the piece size is 16KiB then the leaf hashes are used. If a piece size of 128KiB is used then the 3rd layer up from the leaf hashes is used. Layer hashes which exclusively cover data beyond the end of file, i.e. are only needed to balance the tree, are omitted. All hashes are stored in their binary format. A torrent is not valid if this field is absent, or if the contained hashes do not match the merkle roots or are not from the correct layer.
Important
The file tree root dictionary itself must not be a file, i.e. it must not contain a zero-length key with a dictionary containing a length key.
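As a rough illustration of the "pieces root" description above, the sketch below hashes a stream in 16KiB blocks with SHA-256, pads the leaf layer with zeroed hashes up to a power of two, and merges pairs upward. The names `merkle_root` and `pieces_root` are chosen for this example only and are not claimed to match the library's own helpers.

```python
import hashlib
import io

BLOCK_SIZE = 16 * 1024  # 16KiB leaf blocks, as described above
HASH_SIZE = 32          # length of a SHA-256 digest in bytes


def merkle_root(hashes):
    """Reduce a power-of-two list of digests to one root (branching factor 2)."""
    layer = list(hashes)
    while len(layer) > 1:
        layer = [
            hashlib.sha256(layer[i] + layer[i + 1]).digest()
            for i in range(0, len(layer), 2)
        ]
    return layer[0]


def pieces_root(stream):
    """Hash a binary stream in 16KiB blocks and return the merkle root."""
    leaves = []
    while True:
        block = stream.read(BLOCK_SIZE)
        if not block:
            break
        leaves.append(hashlib.sha256(block).digest())
    if not leaves:
        raise ValueError("empty files have no pieces root")
    # Pad with zeroed hashes up to the next power of two.
    size = 1
    while size < len(leaves):
        size *= 2
    leaves.extend(bytes(HASH_SIZE) for _ in range(size - len(leaves)))
    return merkle_root(leaves)


# Three full blocks: the fourth leaf is a zeroed padding hash.
root = pieces_root(io.BytesIO(b"a" * (BLOCK_SIZE * 3)))
print(root.hex())
```

A file that fits in a single block has a root equal to the SHA-256 digest of its contents, since no merging is needed.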
Bittorrent V1
v1 meta-dictionary
- announce: The URL of the tracker.
- info: This maps to a dictionary, with keys described below.
  - name: maps to a UTF-8 encoded string which is the suggested name to save the file (or directory) as. It is purely advisory.
  - piece length: maps to the number of bytes in each piece the file is split into. For the purposes of transfer, files are split into fixed-size pieces which are all the same length except for possibly the last one, which may be truncated. piece length is almost always a power of two, most commonly 2^18 = 256KiB.
  - pieces: maps to a string whose length is a multiple of 20. It is to be subdivided into strings of length 20, each of which is the SHA1 hash of the piece at the corresponding index.
  - length: Only present if the torrent contents is a single file. Maps to the length of the file in bytes.
  - files: Only present if the torrent contains a set of files which go in a directory structure. If length is present instead, the download represents a single file. For the purposes of the other keys, the multi-file case is treated as only having a single file by concatenating the files in the order they appear in the files list. The files list is the value files maps to, and is a list of dictionaries containing the following keys:
    - path: A list of UTF-8 encoded strings corresponding to subdirectory names, the last of which is the actual file name.
    - length: Maps to the length of the file in bytes.
Note
In the single file case, the name key is the name of a file; in the multiple file case, it's the name of a directory.
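To make the layout of the v1 pieces value concrete, here is a small sketch that splits the concatenated string into 20-byte SHA1 digests and checks one piece against its digest. The helper names `split_pieces` and `verify_piece` and the sample payload are made up for illustration.

```python
import hashlib

SHA1_SIZE = 20  # length of a SHA1 digest in bytes


def split_pieces(pieces):
    """Split the concatenated `pieces` value into 20-byte SHA1 digests."""
    if len(pieces) % SHA1_SIZE:
        raise ValueError("pieces length must be a multiple of 20")
    return [pieces[i:i + SHA1_SIZE] for i in range(0, len(pieces), SHA1_SIZE)]


def verify_piece(data, index, pieces):
    """Check one piece of payload data against the digest at `index`."""
    return hashlib.sha1(data).digest() == split_pieces(pieces)[index]


# Two made-up pieces, hashed and concatenated the way `pieces` stores them.
payload = [b"first piece", b"second piece"]
pieces = b"".join(hashlib.sha1(p).digest() for p in payload)
print(len(split_pieces(pieces)))
print(verify_piece(b"second piece", 1, pieces))
```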
- MetaFile: Base class for all TorrentFile classes.
- TorrentFile: Class for creating Bittorrent meta files.
- TorrentFileV2: Class for creating Bittorrent meta v2 files.
- TorrentFileHybrid: Construct the Hybrid torrent meta file with provided parameters.
MetaFile
Base Class for all TorrentFile classes.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | `str` | target path to torrent content. Default: None | None |
announce | `str` | One or more tracker URLs. Default: None | None |
comment | `str` | A comment. Default: None | None |
piece_length | `int` | Size of torrent pieces. Default: None | None |
private | `bool` | For private trackers. Default: False | False |
outfile | `str` | target path to write .torrent file. Default: None | None |
source | `str` | Private tracker source. Default: None | None |
noprogress | `bool` | If True disable showing the progress bar. | False |
Source code in torrentfile\torrent.py
class MetaFile:
    """Base Class for all TorrentFile classes.

    Parameters
    ----------
    path : `str`
        target path to torrent content. Default: None
    announce : `str`
        One or more tracker URLs. Default: None
    comment : `str`
        A comment. Default: None
    piece_length : `int`
        Size of torrent pieces. Default: None
    private : `bool`
        For private trackers. Default: False
    outfile : `str`
        target path to write .torrent file. Default: None
    source : `str`
        Private tracker source. Default: None
    noprogress : `bool`
        If True disable showing the progress bar.
    """

    hasher = None

    @classmethod
    def set_callback(cls, func):
        """
        Assign a callback function for the Hashing class to call for each hash.

        Parameters
        ----------
        func : function
            The callback function which accepts a single parameter.
        """
        if "hasher" in vars(cls) and vars(cls)["hasher"]:
            cls.hasher.set_callback(func)

    # fmt: off
    def __init__(self, path=None, announce=None, private=False,
                 source=None, piece_length=None, comment=None,
                 outfile=None, url_list=None, noprogress=False):
        """Construct MetaFile superclass and assign local attributes."""
        if not path:
            raise utils.MissingPathError
        # base path to torrent content.
        self.path = path
        # Format piece_length attribute.
        if piece_length:
            self.piece_length = utils.normalize_piece_length(piece_length)
        else:
            self.piece_length = utils.path_piece_length(self.path)
        # Assign announce URL to empty string if none provided.
        if not announce:
            self.announce = ""
            self.announce_list = [[""]]
        # Most torrent clients have editing trackers as a feature.
        elif isinstance(announce, str):
            self.announce = announce
            self.announce_list = [announce]
        elif isinstance(announce, Sequence):
            self.announce = announce[0]
            self.announce_list = [announce]
        if private:
            self.private = 1
        else:
            self.private = None
        self.outfile = outfile
        self.noprogress = noprogress
        self.comment = comment
        self.url_list = url_list
        self.source = source
        self.meta = {
            "announce": self.announce,
            "announce-list": self.announce_list,
            "created by": f"TorrentFile:v{version}",
            "creation date": int(datetime.timestamp(datetime.now())),
            "info": {},
        }
        logger.debug("Announce list = %s", str(self.announce_list))
        if comment:
            self.meta["info"]["comment"] = comment
        if private:
            self.meta["info"]["private"] = 1
        if source:
            self.meta["info"]["source"] = source
        if url_list:
            self.meta["url-list"] = url_list
        self.meta["info"]["name"] = os.path.basename(self.path)
        self.meta["info"]["piece length"] = self.piece_length
    # fmt: on

    def assemble(self):
        """Overload in subclasses.

        Raises
        ------
        `Exception`
            NotImplementedError
        """
        raise NotImplementedError

    def sort_meta(self):
        """Sort the info and meta dictionaries."""
        meta = self.meta
        meta["info"] = dict(sorted(list(meta["info"].items())))
        meta = dict(sorted(list(meta.items())))
        return meta

    def write(self, outfile=None):
        """Write meta information to .torrent file.

        Parameters
        ----------
        outfile : `str`
            Destination path for .torrent file. default=None

        Returns
        -------
        outfile : `str`
            Where the .torrent file was written.
        meta : `dict`
            .torrent meta information.
        """
        if outfile is not None:
            self.outfile = outfile
        if self.outfile is None:
            self.outfile = str(self.path) + ".torrent"
        self.meta = self.sort_meta()
        pyben.dump(self.meta, self.outfile)
        return self.outfile, self.meta
__init__(self, path=None, announce=None, private=False, source=None, piece_length=None, comment=None, outfile=None, url_list=None, noprogress=False)
special
Construct MetaFile superclass and assign local attributes.
Source code in torrentfile\torrent.py
def __init__(self, path=None, announce=None, private=False,
             source=None, piece_length=None, comment=None,
             outfile=None, url_list=None, noprogress=False):
    """Construct MetaFile superclass and assign local attributes."""
    if not path:
        raise utils.MissingPathError
    # base path to torrent content.
    self.path = path
    # Format piece_length attribute.
    if piece_length:
        self.piece_length = utils.normalize_piece_length(piece_length)
    else:
        self.piece_length = utils.path_piece_length(self.path)
    # Assign announce URL to empty string if none provided.
    if not announce:
        self.announce = ""
        self.announce_list = [[""]]
    # Most torrent clients have editing trackers as a feature.
    elif isinstance(announce, str):
        self.announce = announce
        self.announce_list = [announce]
    elif isinstance(announce, Sequence):
        self.announce = announce[0]
        self.announce_list = [announce]
    if private:
        self.private = 1
    else:
        self.private = None
    self.outfile = outfile
    self.noprogress = noprogress
    self.comment = comment
    self.url_list = url_list
    self.source = source
    self.meta = {
        "announce": self.announce,
        "announce-list": self.announce_list,
        "created by": f"TorrentFile:v{version}",
        "creation date": int(datetime.timestamp(datetime.now())),
        "info": {},
    }
    logger.debug("Announce list = %s", str(self.announce_list))
    if comment:
        self.meta["info"]["comment"] = comment
    if private:
        self.meta["info"]["private"] = 1
    if source:
        self.meta["info"]["source"] = source
    if url_list:
        self.meta["url-list"] = url_list
    self.meta["info"]["name"] = os.path.basename(self.path)
    self.meta["info"]["piece length"] = self.piece_length
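The announce handling in the constructor can be read as a small normalization step. The sketch below restates those three branches as a standalone function so the resulting shapes are easy to compare. The function name `normalize_announce`, the placeholder tracker URLs, and the final `TypeError` are assumptions of this example; the constructor itself has no such fallback branch.

```python
from collections.abc import Sequence


def normalize_announce(announce=None):
    """Return (announce, announce_list) following the branches shown above."""
    if not announce:
        # No tracker given: empty URL and a single empty tier.
        return "", [[""]]
    if isinstance(announce, str):
        # A single URL is kept as-is and wrapped in a list.
        return announce, [announce]
    if isinstance(announce, Sequence):
        # First URL becomes the primary tracker; all URLs form one tier.
        return announce[0], [list(announce)]
    # Not in the original constructor: added so the sketch fails loudly.
    raise TypeError("announce must be a string or a sequence of strings")


# Placeholder URLs, used only for illustration.
print(normalize_announce())
print(normalize_announce("https://tracker.example/announce"))
print(normalize_announce(["https://a.example/announce", "https://b.example/announce"]))
```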
assemble(self)
Overload in subclasses.
Exceptions:
Type | Description |
---|---|
`NotImplementedError` | Always raised; subclasses must override this method. |
Source code in torrentfile\torrent.py
def assemble(self):
    """Overload in subclasses.

    Raises
    ------
    `Exception`
        NotImplementedError
    """
    raise NotImplementedError
set_callback(func)
classmethod
Assign a callback function for the Hashing class to call for each hash.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
func | function | The callback function which accepts a single parameter. | required |
Source code in torrentfile\torrent.py
@classmethod
def set_callback(cls, func):
    """
    Assign a callback function for the Hashing class to call for each hash.

    Parameters
    ----------
    func : function
        The callback function which accepts a single parameter.
    """
    if "hasher" in vars(cls) and vars(cls)["hasher"]:
        cls.hasher.set_callback(func)
sort_meta(self)
Sort the info and meta dictionaries.
Source code in torrentfile\torrent.py
def sort_meta(self):
    """Sort the info and meta dictionaries."""
    meta = self.meta
    meta["info"] = dict(sorted(list(meta["info"].items())))
    meta = dict(sorted(list(meta.items())))
    return meta
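Bencoded dictionaries require their keys in sorted order, which is why the meta and info dictionaries are sorted before writing. The following is a minimal standalone sketch of the same idea, written as a free function that returns a new dictionary instead of touching instance state. The sample dictionary is made up.

```python
def sort_meta(meta):
    """Return a copy of `meta` with top-level and "info" keys sorted."""
    ordered = dict(sorted(meta.items()))
    ordered["info"] = dict(sorted(meta["info"].items()))
    return ordered


meta = {
    "info": {"piece length": 16384, "name": "example"},
    "announce": "",
}
result = sort_meta(meta)
print(list(result))
print(list(result["info"]))
```

Python dictionaries preserve insertion order, so rebuilding them from sorted items is enough to fix the order seen by the encoder.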
write(self, outfile=None)
Write meta information to .torrent file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
outfile | `str` | Destination path for .torrent file. Default: None | None |
Returns:
Type | Description |
---|---|
`str` | Where the .torrent file was written. |
`dict` | .torrent meta information. |
Source code in torrentfile\torrent.py
def write(self, outfile=None):
    """Write meta information to .torrent file.

    Parameters
    ----------
    outfile : `str`
        Destination path for .torrent file. default=None

    Returns
    -------
    outfile : `str`
        Where the .torrent file was written.
    meta : `dict`
        .torrent meta information.
    """
    if outfile is not None:
        self.outfile = outfile
    if self.outfile is None:
        self.outfile = str(self.path) + ".torrent"
    self.meta = self.sort_meta()
    pyben.dump(self.meta, self.outfile)
    return self.outfile, self.meta
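The write method hands the sorted dictionary to pyben.dump, which serializes it as bencode. The sketch below is not pyben's code; it is a minimal stand-in encoder, with the hypothetical name `bencode`, meant only to show what the serialized form of a meta dictionary looks like.

```python
def bencode(value):
    """Encode ints, bytes, str, lists and dicts as bencode."""
    if isinstance(value, bool):
        raise TypeError("bencode has no boolean type")
    if isinstance(value, int):
        return b"i%de" % value
    if isinstance(value, str):
        value = value.encode("utf-8")
    if isinstance(value, (bytes, bytearray)):
        return b"%d:%s" % (len(value), bytes(value))
    if isinstance(value, list):
        return b"l" + b"".join(bencode(item) for item in value) + b"e"
    if isinstance(value, dict):
        # Keys are byte strings and must appear in sorted order.
        items = sorted(
            ((k.encode("utf-8") if isinstance(k, str) else k, v)
             for k, v in value.items()),
            key=lambda pair: pair[0],
        )
        return b"d" + b"".join(bencode(k) + bencode(v) for k, v in items) + b"e"
    raise TypeError(f"cannot bencode {type(value).__name__}")


encoded = bencode({"info": {"name": "example", "piece length": 16384}})
print(encoded)
```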
TorrentFile (MetaFile)
Class for creating Bittorrent meta files.
Construct Torrentfile class instance object.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
kwargs | `dict` | Dictionary containing torrent file options. | {} |
Source code in torrentfile\torrent.py
class TorrentFile(MetaFile):
    """Class for creating Bittorrent meta files.

    Construct *Torrentfile* class instance object.

    Parameters
    ----------
    kwargs : `dict`
        Dictionary containing torrent file options.
    """

    hasher = Hasher

    def __init__(self, **kwargs):
        """Construct TorrentFile instance with given keyword args.

        Parameters
        ----------
        kwargs : `dict`
            dictionary of keyword args passed to superclass.
        """
        super().__init__(**kwargs)
        logger.debug("Making Bittorrent V1 meta file.")
        self.assemble()

    def assemble(self):
        """Assemble components of torrent metafile.

        Returns
        -------
        `dict`
            metadata dictionary for torrent file
        """
        info = self.meta["info"]
        size, filelist = utils.filelist_total(self.path)
        if os.path.isfile(self.path):
            info["length"] = size
        else:
            info["files"] = [
                {
                    "length": os.path.getsize(path),
                    "path": os.path.relpath(path, self.path).split(os.sep),
                }
                for path in filelist
            ]
        pieces = bytearray()
        feeder = Hasher(filelist, self.piece_length)
        if self.noprogress:
            for piece in feeder:
                pieces.extend(piece)
        else:
            from tqdm import tqdm
            for piece in tqdm(
                iterable=feeder,
                desc="Hashing Content",
                total=size // self.piece_length,
                unit="bytes",
                unit_scale=True,
                unit_divisor=self.piece_length,
                initial=0,
                leave=True,
            ):
                pieces.extend(piece)
        info["pieces"] = pieces
hasher (_CbMixin)
Piece hasher for Bittorrent V1 files.
Takes a sorted list of all file paths and calculates the sha1 hash for fixed size pieces of file data, moving from file to file seamlessly until the last piece, which may be smaller than the others.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
paths | `list` | List of files. | required |
piece_length | `int` | Size of chunks to split the data into. | required |
Source code in torrentfile\torrent.py
class Hasher(_CbMixin):
    """Piece hasher for Bittorrent V1 files.

    Takes a sorted list of all file paths, calculates sha1 hash
    for fixed size pieces of file data from each file
    seamlessly until the last piece which may be smaller than others.

    Parameters
    ----------
    paths : `list`
        List of files.
    piece_length : `int`
        Size of chunks to split the data into.
    """

    def __init__(self, paths, piece_length):
        """Generate hashes of piece length data from filelist contents."""
        self.piece_length = piece_length
        self.paths = paths
        self.total = sum([os.path.getsize(i) for i in self.paths])
        self.index = 0
        self.current = open(self.paths[0], "rb")
        logger.debug(
            "Hashing v1 torrent file. Size: %s Piece Length: %s",
            humanize_bytes(self.total),
            humanize_bytes(self.piece_length),
        )

    def __iter__(self):
        """Iterate through feed pieces.

        Returns
        -------
        self : `iterator`
            Iterator for leaves/hash pieces.
        """
        return self

    def _handle_partial(self, arr):
        """Define the handling of partial pieces that span 2 or more files.

        Parameters
        ----------
        arr : `bytearray`
            Incomplete piece containing partial data

        Returns
        -------
        digest : `bytes`
            SHA1 digest of the complete piece.
        """
        while len(arr) < self.piece_length and self.next_file():
            target = self.piece_length - len(arr)
            temp = bytearray(target)
            size = self.current.readinto(temp)
            arr.extend(temp[:size])
            if size == target:
                break
        return sha1(arr).digest()  # nosec

    def next_file(self):
        """Seamlessly transition to next file in file list."""
        self.index += 1
        if self.index < len(self.paths):
            self.current.close()
            self.current = open(self.paths[self.index], "rb")
            return True
        return False

    def __next__(self):
        """Generate piece-length pieces of data from input file list."""
        while True:
            piece = bytearray(self.piece_length)
            size = self.current.readinto(piece)
            if size == 0:
                if not self.next_file():
                    raise StopIteration
            elif size < self.piece_length:
                return self._handle_partial(piece[:size])
            else:
                return sha1(piece).digest()  # nosec
__init__(self, paths, piece_length)
special
Generate hashes of piece length data from filelist contents.
Source code in torrentfile\torrent.py
def __init__(self, paths, piece_length):
    """Generate hashes of piece length data from filelist contents."""
    self.piece_length = piece_length
    self.paths = paths
    self.total = sum([os.path.getsize(i) for i in self.paths])
    self.index = 0
    self.current = open(self.paths[0], "rb")
    logger.debug(
        "Hashing v1 torrent file. Size: %s Piece Length: %s",
        humanize_bytes(self.total),
        humanize_bytes(self.piece_length),
    )
__iter__(self)
special
Iterate through feed pieces.
Returns:
Type | Description |
---|---|
`iterator` | Iterator for leaves/hash pieces. |
Source code in torrentfile\torrent.py
def __iter__(self):
    """Iterate through feed pieces.

    Returns
    -------
    self : `iterator`
        Iterator for leaves/hash pieces.
    """
    return self
__next__(self)
special
Generate piece-length pieces of data from input file list.
Source code in torrentfile\torrent.py
def __next__(self):
    """Generate piece-length pieces of data from input file list."""
    while True:
        piece = bytearray(self.piece_length)
        size = self.current.readinto(piece)
        if size == 0:
            if not self.next_file():
                raise StopIteration
        elif size < self.piece_length:
            return self._handle_partial(piece[:size])
        else:
            return sha1(piece).digest()  # nosec
next_file(self)
Seamlessly transition to next file in file list.
Source code in torrentfile\torrent.py
def next_file(self):
    """Seamlessly transition to next file in file list."""
    self.index += 1
    if self.index < len(self.paths):
        self.current.close()
        self.current = open(self.paths[self.index], "rb")
        return True
    return False
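The key behavior of the v1 hasher is that pieces run across file boundaries as if all files were one concatenated stream. Here is a simplified sketch of that idea using in-memory streams instead of files on disk. The generator name `iter_piece_hashes` is hypothetical, and the buffering approach is a swapped-in simplification, not the class's readinto-based implementation.

```python
import hashlib
import io


def iter_piece_hashes(streams, piece_length):
    """Yield SHA1 digests of fixed-size pieces over concatenated streams."""
    buffer = bytearray()
    for stream in streams:
        while True:
            chunk = stream.read(piece_length - len(buffer))
            if not chunk:
                break  # this stream is exhausted; carry the remainder over
            buffer.extend(chunk)
            if len(buffer) == piece_length:
                yield hashlib.sha1(buffer).digest()
                buffer.clear()
    if buffer:
        # The final piece may be shorter than piece_length.
        yield hashlib.sha1(buffer).digest()


# "abcde" + "fgh" with 4-byte pieces: the second piece spans both streams.
files = [io.BytesIO(b"abcde"), io.BytesIO(b"fgh")]
hashes = list(iter_piece_hashes(files, piece_length=4))
print([h.hex() for h in hashes])
```

Joining the yielded digests with `b"".join(hashes)` gives a value with the same layout as the v1 pieces key.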
__init__(self, **kwargs)
special
Construct TorrentFile instance with given keyword args.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
kwargs | `dict` | dictionary of keyword args passed to superclass. | {} |
Source code in torrentfile\torrent.py
def __init__(self, **kwargs):
    """Construct TorrentFile instance with given keyword args.

    Parameters
    ----------
    kwargs : `dict`
        dictionary of keyword args passed to superclass.
    """
    super().__init__(**kwargs)
    logger.debug("Making Bittorrent V1 meta file.")
    self.assemble()
assemble(self)
Assemble components of torrent metafile.
Returns:
Type | Description |
---|---|
`dict` | metadata dictionary for torrent file |
Source code in torrentfile\torrent.py
def assemble(self):
    """Assemble components of torrent metafile.

    Returns
    -------
    `dict`
        metadata dictionary for torrent file
    """
    info = self.meta["info"]
    size, filelist = utils.filelist_total(self.path)
    if os.path.isfile(self.path):
        info["length"] = size
    else:
        info["files"] = [
            {
                "length": os.path.getsize(path),
                "path": os.path.relpath(path, self.path).split(os.sep),
            }
            for path in filelist
        ]
    pieces = bytearray()
    feeder = Hasher(filelist, self.piece_length)
    if self.noprogress:
        for piece in feeder:
            pieces.extend(piece)
    else:
        from tqdm import tqdm
        for piece in tqdm(
            iterable=feeder,
            desc="Hashing Content",
            total=size // self.piece_length,
            unit="bytes",
            unit_scale=True,
            unit_divisor=self.piece_length,
            initial=0,
            leave=True,
        ):
            pieces.extend(piece)
    info["pieces"] = pieces
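For the multi-file case, each entry of the files list pairs a length with a path split into components relative to the torrent root. The sketch below builds such entries for a temporary directory. The function name `build_file_entries` and the use of `os.walk` with sorted names are assumptions of this example; the ordering produced by the library's `utils.filelist_total` is not shown in this section and may differ.

```python
import os
import tempfile


def build_file_entries(root):
    """List v1 `files` entries for every file below `root`, in sorted order."""
    entries = []
    for dirpath, dirnames, filenames in os.walk(root):
        dirnames.sort()  # make traversal order deterministic
        for name in sorted(filenames):
            full = os.path.join(dirpath, name)
            entries.append(
                {
                    "length": os.path.getsize(full),
                    "path": os.path.relpath(full, root).split(os.sep),
                }
            )
    return entries


with tempfile.TemporaryDirectory() as root:
    os.makedirs(os.path.join(root, "sub"))
    with open(os.path.join(root, "a.txt"), "wb") as fd:
        fd.write(b"12345")
    with open(os.path.join(root, "sub", "b.txt"), "wb") as fd:
        fd.write(b"xyz")
    entries = build_file_entries(root)
    print(entries)
```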
TorrentFileHybrid (MetaFile)
Construct the Hybrid torrent meta file with provided parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
kwargs | `dict` | Keyword arguments for torrent options. | {} |
Source code in torrentfile\torrent.py
class TorrentFileHybrid(MetaFile):
    """Construct the Hybrid torrent meta file with provided parameters.

    Parameters
    ----------
    kwargs : `dict`
        Keyword arguments for torrent options.
    """

    hasher = HasherHybrid

    def __init__(self, **kwargs):
        """Create Bittorrent v1 v2 hybrid metafiles."""
        super().__init__(**kwargs)
        logger.debug("Creating Hybrid torrent file.")
        self.name = os.path.basename(self.path)
        self.hashes = []
        self.piece_layers = {}
        self.pbar = None
        self.pieces = []
        self.files = []
        self.assemble()

    def assemble(self):
        """Assemble the parts of the torrentfile into meta dictionary."""
        info = self.meta["info"]
        info["meta version"] = 2
        if not self.noprogress:
            from tqdm import tqdm
            lst = utils.get_file_list(self.path)
            self.pbar = tqdm(
                desc="Hashing Files:",
                total=len(lst),
                leave=True,
                unit="file",
            )
        if os.path.isfile(self.path):
            info["file tree"] = {self.name: self._traverse(self.path)}
            info["length"] = os.path.getsize(self.path)
            if self.pbar:
                self.pbar.update(n=1)
        else:
            info["file tree"] = self._traverse(self.path)
            info["files"] = self.files
        info["pieces"] = b"".join(self.pieces)
        self.meta["piece layers"] = self.piece_layers
        return info

    def _traverse(self, path):
        """Build meta dictionary while walking directory.

        Parameters
        ----------
        path : `str`
            Path to target file.
        """
        if os.path.isfile(path):
            file_size = os.path.getsize(path)
            self.files.append(
                {
                    "length": file_size,
                    "path": os.path.relpath(path, self.path).split(os.sep),
                }
            )
            if file_size == 0:
                if self.pbar:
                    self.pbar.update(n=1)
                return {"": {"length": file_size}}
            file_hash = HasherHybrid(path, self.piece_length)
            if file_size > self.piece_length:
                self.piece_layers[file_hash.root] = file_hash.piece_layer
            self.hashes.append(file_hash)
            self.pieces.extend(file_hash.pieces)
            if file_hash.padding_file:
                self.files.append(file_hash.padding_file)
            if self.pbar:
                self.pbar.update(n=1)
            return {"": {"length": file_size, "pieces root": file_hash.root}}
        tree = {}
        if os.path.isdir(path):
            for name in sorted(os.listdir(path)):
                tree[name] = self._traverse(os.path.join(path, name))
        return tree
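The nested dictionary returned by the traversal uses an empty-string key to mark a file entry. To show how that shape can be read back, this sketch walks a hand-written file tree and lists each file with its path components. The function name `iter_files` and the sample tree, including its all-zero pieces root, are invented for the example.

```python
def iter_files(tree, prefix=()):
    """Yield (path_parts, properties) for each file in a v2 file tree."""
    for name, node in sorted(tree.items()):
        if name == "":
            # A zero-length key holds the properties of the path so far.
            yield prefix, node
        else:
            yield from iter_files(node, prefix + (name,))


# Hand-written tree with one hashed file and one empty file.
file_tree = {
    "dir": {"a.txt": {"": {"length": 0}}},
    "b.txt": {"": {"length": 3, "pieces root": bytes(32)}},
}
for parts, props in iter_files(file_tree):
    print("/".join(parts), props["length"])
```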
hasher (_CbMixin)
Calculate root and piece hashes for creating hybrid torrent file.
Create merkle tree layers from sha256 hashed 16KiB blocks of contents. With a branching factor of 2, merge layer hashes until blocks equal piece_length bytes for the piece layer, and then the root hash.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path | `str` | path to target file. | required |
piece_length | `int` | piece length for data chunks. | required |
Source code in torrentfile\torrent.py
class HasherHybrid(_CbMixin):
"""Calculate root and piece hashes for creating hybrid torrent file.
Create merkle tree layers from sha256 hashed 16KiB blocks of contents.
With a branching factor of 2, merge layer hashes until blocks equal
piece_length bytes for the piece layer, and then the root hash.
Parameters
----------
path : `str`
path to target file.
piece_length : `int`
piece length for data chunks.
"""
def __init__(self, path, piece_length):
        """Construct Hasher class instances for each file in torrent."""
        self.path = path
        self.piece_length = piece_length
        self.pieces = []
        self.layer_hashes = []
        self.piece_layer = None
        self.root = None
        self.padding_piece = None
        self.padding_file = None
        self.amount = piece_length // BLOCK_SIZE
        logger.debug(
            "Hashing partial Hybrid torrent file. Piece Length: %s Path: %s",
            humanize_bytes(self.piece_length),
            str(self.path),
        )
        with open(path, "rb") as data:
            self.process_file(data)
    def _pad_remaining(self, block_count):
        """Generate Hash sized, 0 filled bytes for padding.
        Parameters
        ----------
        block_count : `int`
            current total number of blocks collected.
        Returns
        -------
        padding : `list` of `bytes`
            Padding to fill remaining portion of tree.
        """
        # by default, pad up to a full piece worth of blocks
        remaining = self.amount - block_count
        if not self.layer_hashes:
            # when the file fits within a single piece
            power2 = next_power_2(block_count)
            remaining = power2 - block_count
        return [bytes(HASH_SIZE) for _ in range(remaining)]
    def process_file(self, data):
        """Calculate layer hashes for contents of file.
        Parameters
        ----------
        data : `BytesIO`
            File opened in read mode.
        """
        while True:
            plength = self.piece_length
            blocks = []
            piece = sha1()  # nosec
            total = 0
            block = bytearray(BLOCK_SIZE)
            for _ in range(self.amount):
                size = data.readinto(block)
                if not size:
                    break
                total += size
                plength -= size
                blocks.append(sha256(block[:size]).digest())
                piece.update(block[:size])
            if not blocks:
                break
            if len(blocks) != self.amount:
                padding = self._pad_remaining(len(blocks))
                blocks.extend(padding)
            layer_hash = merkle_root(blocks)
            if self._cb:
                self._cb(layer_hash)
            self.layer_hashes.append(layer_hash)
            if plength > 0:
                # the pad file length is the number of missing bytes,
                # not the size of the last read
                self.padding_file = {
                    "attr": "p",
                    "length": plength,
                    "path": [".pad", str(plength)],
                }
                piece.update(bytes(plength))
            self.pieces.append(piece.digest())  # nosec
        self._calculate_root()
    def _calculate_root(self):
        """Calculate the root hash for opened file."""
        self.piece_layer = b"".join(self.layer_hashes)
        if len(self.layer_hashes) > 1:
            pad_piece = merkle_root([bytes(32) for _ in range(self.amount)])
            pow2 = next_power_2(len(self.layer_hashes))
            remainder = pow2 - len(self.layer_hashes)
            self.layer_hashes += [pad_piece for _ in range(remainder)]
        self.root = merkle_root(self.layer_hashes)
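The padding rule in `_pad_remaining` can be illustrated in isolation. The sketch below is not the library's code: `next_power_2` is a stand-in written here on the assumption that the real helper returns the smallest power of two greater than or equal to its argument, and `padding_count` is a hypothetical name for the arithmetic only.

```python
def next_power_2(value: int) -> int:
    """Stand-in helper (assumed behavior): smallest power of two >= value."""
    power = 1
    while power < value:
        power <<= 1
    return power


def padding_count(block_count: int, blocks_per_piece: int, single_piece: bool) -> int:
    """Number of zero-filled hashes needed to complete the current piece.

    single_piece mirrors the `not self.layer_hashes` check above: no earlier
    piece has been hashed, so the file fits in one piece.
    """
    if single_piece:
        # pad only up to the next power of two
        return next_power_2(block_count) - block_count
    # pad up to a full piece worth of blocks
    return blocks_per_piece - block_count


print(padding_count(3, 16, single_piece=True))   # 1
print(padding_count(3, 16, single_piece=False))  # 13
```

A file that fits in one piece therefore gets a smaller tree than a trailing partial piece of a larger file, which is always padded to the full piece size.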
__init__(self, path, piece_length)
special
Construct Hasher class instances for each file in torrent.
Source code in torrentfile\torrent.py
def __init__(self, path, piece_length):
    """Construct Hasher class instances for each file in torrent."""
    self.path = path
    self.piece_length = piece_length
    self.pieces = []
    self.layer_hashes = []
    self.piece_layer = None
    self.root = None
    self.padding_piece = None
    self.padding_file = None
    self.amount = piece_length // BLOCK_SIZE
    logger.debug(
        "Hashing partial Hybrid torrent file. Piece Length: %s Path: %s",
        humanize_bytes(self.piece_length),
        str(self.path),
    )
    with open(path, "rb") as data:
        self.process_file(data)
process_file(self, data)
Calculate layer hashes for contents of file.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
data |
`BytesIO` |
File opened in read mode. |
required |
Source code in torrentfile\torrent.py
def process_file(self, data):
    """Calculate layer hashes for contents of file.
    Parameters
    ----------
    data : `BytesIO`
        File opened in read mode.
    """
    while True:
        plength = self.piece_length
        blocks = []
        piece = sha1()  # nosec
        total = 0
        block = bytearray(BLOCK_SIZE)
        for _ in range(self.amount):
            size = data.readinto(block)
            if not size:
                break
            total += size
            plength -= size
            blocks.append(sha256(block[:size]).digest())
            piece.update(block[:size])
        if not blocks:
            break
        if len(blocks) != self.amount:
            padding = self._pad_remaining(len(blocks))
            blocks.extend(padding)
        layer_hash = merkle_root(blocks)
        if self._cb:
            self._cb(layer_hash)
        self.layer_hashes.append(layer_hash)
        if plength > 0:
            # the pad file length is the number of missing bytes,
            # not the size of the last read
            self.padding_file = {
                "attr": "p",
                "length": plength,
                "path": [".pad", str(plength)],
            }
            piece.update(bytes(plength))
        self.pieces.append(piece.digest())  # nosec
    self._calculate_root()
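For the v1 half of a hybrid torrent, the last piece of a file is hashed as if it were followed by zero bytes up to the piece boundary, and the number of missing bytes becomes the pad file. A minimal sketch of that one step, with `padded_piece_hash` as a hypothetical name and an in-memory `bytes` value standing in for the file contents:

```python
from hashlib import sha1


def padded_piece_hash(data: bytes, piece_length: int):
    """Return the SHA1 digest of a piece padded with zeros, and the pad size.

    Illustrative only: assumes len(data) <= piece_length.
    """
    pad = piece_length - len(data)
    piece = sha1()  # nosec - SHA1 is what the v1 format specifies
    piece.update(data)
    # same effect as piece.update(bytes(plength)) in the listing above
    piece.update(bytes(pad))
    return piece.digest(), pad


digest, pad = padded_piece_hash(b"abc", 8)
print(len(digest), pad)  # 20 5
```

The pad size returned here corresponds to the `plength` value that names the `.pad` entry in the listing.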
__init__(self, **kwargs)
special
Create Bittorrent v1 v2 hybrid metafiles.
Source code in torrentfile\torrent.py
def __init__(self, **kwargs):
    """Create Bittorrent v1 v2 hybrid metafiles."""
    super().__init__(**kwargs)
    logger.debug("Creating Hybrid torrent file.")
    self.name = os.path.basename(self.path)
    self.hashes = []
    self.piece_layers = {}
    self.pbar = None
    self.pieces = []
    self.files = []
    self.assemble()
assemble(self)
Assemble the parts of the torrentfile into meta dictionary.
Source code in torrentfile\torrent.py
def assemble(self):
    """Assemble the parts of the torrentfile into meta dictionary."""
    info = self.meta["info"]
    info["meta version"] = 2
    if not self.noprogress:
        from tqdm import tqdm
        lst = utils.get_file_list(self.path)
        self.pbar = tqdm(
            desc="Hashing Files:",
            total=len(lst),
            leave=True,
            unit="file",
        )
    if os.path.isfile(self.path):
        info["file tree"] = {self.name: self._traverse(self.path)}
        info["length"] = os.path.getsize(self.path)
        if self.pbar:
            self.pbar.update(n=1)
    else:
        info["file tree"] = self._traverse(self.path)
        info["files"] = self.files
    info["pieces"] = b"".join(self.pieces)
    self.meta["piece layers"] = self.piece_layers
    return info
TorrentFileV2 (MetaFile)
Class for creating Bittorrent meta v2 files.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
kwargs |
`dict` |
Keyword arguments for torrent file options. |
{} |
Source code in torrentfile\torrent.py
class TorrentFileV2(MetaFile):
    """Class for creating Bittorrent meta v2 files.
    Parameters
    ----------
    kwargs : `dict`
        Keyword arguments for torrent file options.
    """
    hasher = HasherV2
    def __init__(self, **kwargs):
        """Construct `TorrentFileV2` Class instance from given parameters.
        Parameters
        ----------
        kwargs : `dict`
            keyword arguments to pass to superclass.
        """
        super().__init__(**kwargs)
        logger.debug("Create .torrent v2 file.")
        self.piece_layers = {}
        self.hashes = []
        self.pbar = None
        self.assemble()
    def update(self):
        """Update for the progress bar."""
        if self.pbar:
            self.pbar.update(n=1)
    def assemble(self):
        """Assemble then return the meta dictionary for encoding.
        Returns
        -------
        meta : `dict`
            Metainformation about the torrent.
        """
        info = self.meta["info"]
        if not self.noprogress:
            from tqdm import tqdm
            lst = utils.get_file_list(self.path)
            self.pbar = tqdm(
                desc="Hashing Files:",
                total=len(lst),
                leave=True,
                unit="file",
            )
        if os.path.isfile(self.path):
            info["file tree"] = {info["name"]: self._traverse(self.path)}
            info["length"] = os.path.getsize(self.path)
            self.update()
        else:
            info["file tree"] = self._traverse(self.path)
        info["meta version"] = 2
        self.meta["piece layers"] = self.piece_layers
    def _traverse(self, path):
        """Walk directory tree.
        Parameters
        ----------
        path : `str`
            Path to file or directory.
        """
        if os.path.isfile(path):
            # Calculate size and hashes for each file.
            size = os.path.getsize(path)
            if size == 0:
                self.update()
                return {"": {"length": size}}
            fhash = HasherV2(path, self.piece_length)
            if size > self.piece_length:
                self.piece_layers[fhash.root] = fhash.piece_layer
            self.update()
            return {"": {"length": size, "pieces root": fhash.root}}
        file_tree = {}
        if os.path.isdir(path):
            for name in sorted(os.listdir(path)):
                file_tree[name] = self._traverse(os.path.join(path, name))
        return file_tree
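The shape of the "file tree" that `_traverse` builds may be easier to see without the hashing. The sketch below is a simplified stand-in, not the library's method: `file_tree` is a hypothetical function that records only the `length` key and omits `pieces root`, and the directory layout is made up for the demonstration.

```python
import os
import tempfile


def file_tree(path):
    """Build a v2-style nested dict of path elements (lengths only)."""
    if os.path.isfile(path):
        # a zero-length key holds the properties of the file at this path
        return {"": {"length": os.path.getsize(path)}}
    return {
        name: file_tree(os.path.join(path, name))
        for name in sorted(os.listdir(path))
    }


with tempfile.TemporaryDirectory() as root:
    os.mkdir(os.path.join(root, "docs"))
    with open(os.path.join(root, "docs", "a.txt"), "wb") as fh:
        fh.write(b"hello")
    with open(os.path.join(root, "b.bin"), "wb") as fh:
        fh.write(b"")
    tree = file_tree(root)

print(tree)
# {'b.bin': {'': {'length': 0}}, 'docs': {'a.txt': {'': {'length': 5}}}}
```

Each directory level becomes one dictionary keyed by entry name, and each file ends in an entry with an empty-string key, matching the return values in the listing above.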
hasher (_CbMixin)
Calculate the root hash and piece layers for file contents.
Iterates over 16KiB blocks of data from given file, hashes the data, then creates a hash tree from the individual block hashes until size of hashed data equals the piece-length. Then continues the hash tree until root hash is calculated.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
path |
`str` |
Path to file. |
required |
piece_length |
`int` |
Size of layer hashes pieces. |
required |
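The description above can be sketched for a single piece. This is an illustration under assumptions, not the module's code: `merkle_root` is written here as a plain pairwise SHA-256 reduction over a power-of-two list of digests, `piece_layer_hash` is a hypothetical name, and the piece is held in memory rather than read from a file. It covers a piece of a multi-piece file, which is padded with zero-filled hashes to a full piece.

```python
from hashlib import sha256

BLOCK_SIZE = 2**14  # 16 KiB leaf blocks
HASH_SIZE = 32      # length of a SHA-256 digest


def merkle_root(hashes):
    """Assumed helper: pairwise-hash a power-of-two list down to one root."""
    hashes = list(hashes)
    while len(hashes) > 1:
        hashes = [
            sha256(hashes[i] + hashes[i + 1]).digest()
            for i in range(0, len(hashes), 2)
        ]
    return hashes[0]


def piece_layer_hash(piece: bytes, piece_length: int) -> bytes:
    """Hash one piece: SHA-256 each 16 KiB block, zero-pad, reduce to a root."""
    leaves = [
        sha256(piece[i:i + BLOCK_SIZE]).digest()
        for i in range(0, len(piece), BLOCK_SIZE)
    ]
    # fill the rest of the piece with zero-filled hashes
    leaves += [bytes(HASH_SIZE)] * (piece_length // BLOCK_SIZE - len(leaves))
    return merkle_root(leaves)


layer = piece_layer_hash(b"a" * BLOCK_SIZE, 4 * BLOCK_SIZE)
print(len(layer))  # 32
```

Because the piece length is a power of two, the number of leaves per piece is too, so the pairwise reduction always ends in a single hash.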
Source code in torrentfile\torrent.py
class HasherV2(_CbMixin):
    """Calculate the root hash and piece layers for file contents.
    Iterates over 16KiB blocks of data from given file, hashes the data,
    then creates a hash tree from the individual block hashes until size of
    hashed data equals the piece-length. Then continues the hash tree until
    root hash is calculated.
    Parameters
    ----------
    path : `str`
        Path to file.
    piece_length : `int`
        Size of layer hashes pieces.
    """
    def __init__(self, path, piece_length):
        """Calculate and store hash information for specific file."""
        self.path = path
        self.root = None
        self.piece_layer = None
        self.layer_hashes = []
        self.piece_length = piece_length
        self.num_blocks = piece_length // BLOCK_SIZE
        logger.debug(
            "Hashing partial v2 torrent file. Piece Length: %s Path: %s",
            humanize_bytes(self.piece_length),
            str(self.path),
        )
        with open(self.path, "rb") as fd:
            self.process_file(fd)
    def process_file(self, fd):
        """Calculate hashes over 16KiB chunks of file content.
        Parameters
        ----------
        fd : `BufferedReader`
            Opened file in read mode.
        """
        while True:
            total = 0
            blocks = []
            leaf = bytearray(BLOCK_SIZE)
            # generate leaves of merkle tree
            for _ in range(self.num_blocks):
                size = fd.readinto(leaf)
                total += size
                if not size:
                    break
                blocks.append(sha256(leaf[:size]).digest())
            # an empty blocks list means EOF
            if not blocks:
                break
            if len(blocks) != self.num_blocks:
                # the file doesn't fill the last piece;
                # when the file contains multiple pieces, pad to a full piece
                remaining = self.num_blocks - len(blocks)
                if not self.layer_hashes:
                    # when the file fits within a single piece
                    power2 = next_power_2(len(blocks))
                    remaining = power2 - len(blocks)
                # pad the rest with zeroes to fill remaining space.
                padding = [bytes(32) for _ in range(remaining)]
                blocks.extend(padding)
            # calculate the root hash for the merkle tree up to piece-length
            layer_hash = merkle_root(blocks)
            if self._cb:
                self._cb(layer_hash)
            self.layer_hashes.append(layer_hash)
        self._calculate_root()
    def _calculate_root(self):
        """Calculate root hash for the target file."""
        self.piece_layer = b"".join(self.layer_hashes)
        hashes = len(self.layer_hashes)
        if hashes > 1:
            pow2 = next_power_2(hashes)
            remainder = pow2 - hashes
            pad_piece = [bytes(HASH_SIZE) for _ in range(self.num_blocks)]
            for _ in range(remainder):
                self.layer_hashes.append(merkle_root(pad_piece))
        self.root = merkle_root(self.layer_hashes)
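The final step in `_calculate_root` pads the list of per-piece hashes to a power of two before reducing it to the file's root. A sketch of that step alone, again assuming `merkle_root` is a pairwise SHA-256 reduction; `file_root` and the sample digests are made up for illustration.

```python
from hashlib import sha256


def merkle_root(hashes):
    """Assumed helper: pairwise-hash a power-of-two list down to one root."""
    hashes = list(hashes)
    while len(hashes) > 1:
        hashes = [
            sha256(hashes[i] + hashes[i + 1]).digest()
            for i in range(0, len(hashes), 2)
        ]
    return hashes[0]


def file_root(layer_hashes, blocks_per_piece):
    """Pad the piece layer with empty-piece roots, then reduce to one root."""
    layers = list(layer_hashes)
    if len(layers) > 1:
        # root of a piece made entirely of zero-filled block hashes
        pad_piece = merkle_root([bytes(32)] * blocks_per_piece)
        target = 1
        while target < len(layers):
            target <<= 1
        layers += [pad_piece] * (target - len(layers))
    return merkle_root(layers)


# three made-up piece hashes: one pad piece is appended to reach four
sample = [sha256(bytes([i])).digest() for i in range(3)]
root = file_root(sample, blocks_per_piece=2)
print(len(root))  # 32
```

Note that the padding entry is the root of an all-zero piece, not a zero-filled hash, which is why the listing calls `merkle_root(pad_piece)` before appending.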
__init__(self, path, piece_length)
special
Calculate and store hash information for specific file.
Source code in torrentfile\torrent.py
def __init__(self, path, piece_length):
    """Calculate and store hash information for specific file."""
    self.path = path
    self.root = None
    self.piece_layer = None
    self.layer_hashes = []
    self.piece_length = piece_length
    self.num_blocks = piece_length // BLOCK_SIZE
    logger.debug(
        "Hashing partial v2 torrent file. Piece Length: %s Path: %s",
        humanize_bytes(self.piece_length),
        str(self.path),
    )
    with open(self.path, "rb") as fd:
        self.process_file(fd)
process_file(self, fd)
Calculate hashes over 16KiB chunks of file content.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
fd |
`BufferedReader` |
Opened file in read mode. |
required |
Source code in torrentfile\torrent.py
def process_file(self, fd):
    """Calculate hashes over 16KiB chunks of file content.
    Parameters
    ----------
    fd : `BufferedReader`
        Opened file in read mode.
    """
    while True:
        total = 0
        blocks = []
        leaf = bytearray(BLOCK_SIZE)
        # generate leaves of merkle tree
        for _ in range(self.num_blocks):
            size = fd.readinto(leaf)
            total += size
            if not size:
                break
            blocks.append(sha256(leaf[:size]).digest())
        # an empty blocks list means EOF
        if not blocks:
            break
        if len(blocks) != self.num_blocks:
            # the file doesn't fill the last piece;
            # when the file contains multiple pieces, pad to a full piece
            remaining = self.num_blocks - len(blocks)
            if not self.layer_hashes:
                # when the file fits within a single piece
                power2 = next_power_2(len(blocks))
                remaining = power2 - len(blocks)
            # pad the rest with zeroes to fill remaining space.
            padding = [bytes(32) for _ in range(remaining)]
            blocks.extend(padding)
        # calculate the root hash for the merkle tree up to piece-length
        layer_hash = merkle_root(blocks)
        if self._cb:
            self._cb(layer_hash)
        self.layer_hashes.append(layer_hash)
    self._calculate_root()
__init__(self, **kwargs)
special
Construct TorrentFileV2 Class instance from given parameters.
Parameters:
Name | Type | Description | Default |
---|---|---|---|
kwargs |
`dict` |
keyword arguments to pass to superclass. |
{} |
Source code in torrentfile\torrent.py
def __init__(self, **kwargs):
    """Construct `TorrentFileV2` Class instance from given parameters.
    Parameters
    ----------
    kwargs : `dict`
        keyword arguments to pass to superclass.
    """
    super().__init__(**kwargs)
    logger.debug("Create .torrent v2 file.")
    self.piece_layers = {}
    self.hashes = []
    self.pbar = None
    self.assemble()
assemble(self)
Assemble then return the meta dictionary for encoding.
Returns:
Type | Description |
---|---|
`dict` |
Metainformation about the torrent. |
Source code in torrentfile\torrent.py
def assemble(self):
    """Assemble then return the meta dictionary for encoding.
    Returns
    -------
    meta : `dict`
        Metainformation about the torrent.
    """
    info = self.meta["info"]
    if not self.noprogress:
        from tqdm import tqdm
        lst = utils.get_file_list(self.path)
        self.pbar = tqdm(
            desc="Hashing Files:",
            total=len(lst),
            leave=True,
            unit="file",
        )
    if os.path.isfile(self.path):
        info["file tree"] = {info["name"]: self._traverse(self.path)}
        info["length"] = os.path.getsize(self.path)
        self.update()
    else:
        info["file tree"] = self._traverse(self.path)
    info["meta version"] = 2
    self.meta["piece layers"] = self.piece_layers
update(self)
Update for the progress bar.
Source code in torrentfile\torrent.py
def update(self):
    """Update for the progress bar."""
    if self.pbar:
        self.pbar.update(n=1)