thelper package
Top-level package for the ‘thelper’ framework.
Running import thelper will recursively import all important subpackages and modules.
Subpackages
- thelper.data package
- thelper.gui package
- thelper.nn package
- thelper.optim package
- thelper.tasks package
- thelper.train package
- thelper.transforms package
Submodules
thelper.cli module
Command-line module, for use with a __main__ entrypoint.
This module contains the primary functions used to create or resume a training session, to start a visualization session, or to start an annotation session. The basic argument that needs to be provided by the user to create any kind of session is a configuration dictionary. For sessions that produce outputs, the path to a directory where to save the data is also needed.
- thelper.cli.annotate_data(config, save_dir)
Launches an annotation session for a dataset using a specialized GUI tool.
Note that the annotation type must be supported by the GUI tool. The annotations created by the user during the session will be saved in the session directory.
- Parameters
config – a dictionary that provides all required dataset and GUI tool configuration parameters; see thelper.data.utils.create_parsers() and thelper.gui.utils.create_annotator() for more information.
save_dir – the path to the root directory where the session directory should be saved. Note that this is not the path to the session directory itself, but its parent, which may also contain other session directories.
- thelper.cli.create_session(config, save_dir)
Creates a session to train a model.
All generated outputs (model checkpoints and logs) will be saved in a directory named after the session (the name itself is specified in config), and located in save_dir.
- Parameters
config – a dictionary that provides all required data configuration and trainer parameters; see thelper.train.base.Trainer and thelper.data.utils.create_loaders() for more information. Here, it is only expected to contain a name field that specifies the name of the session.
save_dir – the path to the root directory where the session directory should be saved. Note that this is not the path to the session directory itself, but its parent, which may also contain other session directories.
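The relationship between save_dir and the session directory can be sketched as follows. This is a minimal illustration and not the framework's implementation: the helper name session_dir_for and the example values are hypothetical, and the only facts taken from the description above are that the session directory is named after the name field of config and is located in save_dir.

```python
import os


def session_dir_for(config, save_dir):
    """Hypothetical helper: returns where a session's outputs would be stored."""
    # the documentation only requires `config` to contain a `name` field here
    session_name = config["name"]
    # `save_dir` is the parent (root) directory, not the session directory itself
    return os.path.join(save_dir, session_name)


config = {"name": "my_first_session"}  # illustrative session name
session_dir = session_dir_for(config, "sessions")
print(session_dir)
```

Several sessions can share the same save_dir, since each one gets its own subdirectory.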
- thelper.cli.export_model(config, save_dir)
Launches a model exportation session.
This function will export a model defined via a configuration file into a new checkpoint that can be loaded elsewhere. The model can be built using the framework, or provided via its type, construction parameters, and weights. Its exported format will be compatible with the framework, and may also be an optimized/compiled version obtained using PyTorch’s JIT tracer.
The configuration dictionary must minimally contain a ‘model’ section that provides details on the model to be exported. A section named ‘export’ can be used to provide settings regarding the exportation approaches to use, and the task interface to save with the model. If a task is not explicitly defined in the ‘export’ section, the session configuration will be parsed for a ‘datasets’ section that can be used to define it. Otherwise, it must be provided through the model.
The exported checkpoint containing the model will be saved in the session’s output directory.
- Parameters
config – a dictionary that provides all required data configuration parameters; see thelper.nn.utils.create_model() for more information.
save_dir – the path to the root directory where the session directory should be saved. Note that this is not the path to the session directory itself, but its parent, which may also contain other session directories.
- thelper.cli.main(args=None, argparser=None)
Main entrypoint to use with console applications.
This function parses command line arguments and dispatches the execution based on the selected operating mode. Run with --help for information on the available arguments.
Warning
If you are trying to resume a session that was previously executed using a now unavailable GPU, you will have to force the checkpoint data to be loaded on CPU using --map-location=cpu (or using -m=cpu).
- thelper.cli.make_argparser()
Creates the (default) argument parser to use for the main entrypoint.
The argument parser will contain different “operating modes” that dictate the high-level behavior of the CLI. This function may be modified in branches of the framework to add project-specific features.
- thelper.cli.resume_session(ckptdata, save_dir, config=None, eval_only=False)
Resumes a previously created training session.
Since the saved checkpoints contain the original session’s configuration, the config argument can be set to None if the session should simply pick up where it was interrupted. Otherwise, the config argument can be set to a new configuration that will override the older one. This is useful when fine-tuning a model, or when testing on a new dataset.
Warning
If a session is resumed with an overriding configuration, the user must make sure that the inputs/outputs of the older model are compatible with the new parameters. For example, with classifiers, this means that the number of classes parsed by the dataset (and thus to be predicted by the model) should remain the same. This is a limitation of the framework that should be addressed in a future update.
Warning
A resumed session will not be compatible with its original RNG states if the number of workers used is changed. To get 100% reproducible results, make sure you run with the same worker count.
- Parameters
ckptdata – raw checkpoint data loaded via torch.load(); it will be parsed by the various parts of the framework that need to reload their previous state.
save_dir – the path to the root directory where the session directory should be saved. Note that this is not the path to the session directory itself, but its parent, which may also contain other session directories.
config – a dictionary that provides all required data configuration and trainer parameters; see thelper.train.base.Trainer and thelper.data.utils.create_loaders() for more information. Here, it is only expected to contain a name field that specifies the name of the session.
eval_only – specifies whether training should be resumed or the model should only be evaluated.
- thelper.cli.setup(args=None, argparser=None)
Sets up the argument parser (if not already done externally) and parses the input CLI arguments.
This function may return an error code (integer) if the program should exit immediately. Otherwise, it will return the parsed arguments to use in order to redirect the execution flow of the entrypoint.
- thelper.cli.split_data(config, save_dir)
Launches a dataset splitting session.
This mode will generate an HDF5 archive that contains the split datasets defined in the session configuration file. This archive can then be reused in a new training session to guarantee a fixed distribution of training, validation, and testing samples. It can also be used outside the framework in order to reproduce an experiment.
The configuration dictionary must minimally contain two sections: ‘datasets’ and ‘loaders’. A third section, ‘split’, can be used to provide settings regarding the archive packing and compression approaches to use.
The HDF5 archive will be saved in the session’s output directory.
- Parameters
config – a dictionary that provides all required data configuration parameters; see thelper.data.utils.create_loaders() for more information.
save_dir – the path to the root directory where the session directory should be saved. Note that this is not the path to the session directory itself, but its parent, which may also contain other session directories.
- thelper.cli.visualize_data(config)
Displays the images used in a training session.
This mode does not generate any output, and is only used to visualize the (transformed) images used in a training session. This is useful to debug the data augmentation and base transformation pipelines and make sure the modified images are valid. It does not attempt to load a model or instantiate a trainer, meaning the related fields are not required inside config.
If the configuration dictionary includes a ‘loaders’ field, it will be parsed and used. Otherwise, if only a ‘datasets’ field is available, basic loaders will be instantiated to load the data. The ‘loaders’ field can also be ignored if ‘ignore_loaders’ is found within the ‘viz’ section of the config and set to True. Each minibatch will be displayed via pyplot or OpenCV. The display will block and wait for user input, unless ‘block’ is set to False within the ‘kwargs’ config of the ‘viz’ section.
- Parameters
config – a dictionary that provides all required data configuration parameters; see thelper.data.utils.create_loaders() for more information.
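As an illustration of the ‘viz’ settings mentioned above, a configuration dictionary for this mode might look like the fragment below. Only the key names ‘datasets’, ‘viz’, ‘ignore_loaders’, ‘kwargs’ and ‘block’ come from the description; the content of ‘datasets’ is project-specific and is left as an empty placeholder here.

```python
# Illustrative configuration fragment; the content of 'datasets' is
# project-specific and intentionally left empty here (placeholder).
config = {
    "datasets": {},  # placeholder: dataset definitions go here
    "viz": {
        "ignore_loaders": True,  # skip the 'loaders' field even if present
        "kwargs": {
            "block": False,  # do not wait for user input between minibatches
        },
    },
}

# this dictionary is what would be passed to thelper.cli.visualize_data
viz_config = config.get("viz", {})
print(viz_config["ignore_loaders"], viz_config["kwargs"]["block"])
```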
thelper.typedefs module
Typing definitions for thelper.
- class thelper.typedefs.DataLoader(dataset, batch_size=1, shuffle=False, sampler=None, batch_sampler=None, num_workers=0, collate_fn=<function default_collate>, pin_memory=False, drop_last=False, timeout=0, worker_init_fn=None)
Bases: torch.utils.data.dataloader.DataLoader
- thelper.typedefs.LoaderType
alias of thelper.typedefs.DataLoader
- thelper.typedefs.ModelType
alias of thelper.typedefs.Module
thelper.utils module
General utilities module.
This module only contains non-ML specific functions, i/o helpers, and matplotlib/pyplot drawing calls.
- class thelper.utils.Struct(**kwargs)
Bases: object
Generic runtime-defined C-like data structure (maps constructor elements to fields).
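The idea of mapping constructor elements to fields can be sketched with a small stand-in class. StructSketch is a hypothetical name used for illustration only, and the actual class may be implemented differently.

```python
class StructSketch:
    """Stand-in for a runtime-defined, C-like data structure (illustrative only)."""

    def __init__(self, **kwargs):
        # every keyword argument passed to the constructor becomes a field
        self.__dict__.update(kwargs)


point = StructSketch(x=3, y=4, label="corner")
print(point.x, point.y, point.label)
```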
- thelper.utils.apply_color_map(image, colormap, dst=None)
Applies a color map to an image of 8-bit color indices; works similarly to cv2.applyColorMap (v3.3.1).
- thelper.utils.check_func_signature(func, params)
Checks whether the signature of a function matches the expected parameter list.
- thelper.utils.clipstr(s, size, fill=' ')
Clips a string to a specific length, with an optional fill character.
- thelper.utils.decode_data(data, approach='lz4', **kwargs)
Decodes a binary array using a given coding approach.
- Parameters
data – the binary array to decode.
approach – the encoding; supports none, lz4, jpg, png.
- thelper.utils.download_file(url, root, filename, md5=None)
Downloads a file from a given URL to a local destination.
- Parameters
url – path to query for the file (query will be based on urllib).
root – destination folder where the file should be saved.
filename – destination name for the file.
md5 – optional, for md5 integrity check.
- Returns
The path to the downloaded file.
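An MD5 integrity check of the kind the md5 parameter refers to can be sketched with the standard library. The helper md5_matches is hypothetical and only illustrates the general technique; how the framework performs its own check is not documented here.

```python
import hashlib
import os
import tempfile


def md5_matches(filepath, expected_md5, chunk_size=8192):
    """Hypothetical helper: checks a file's MD5 digest against an expected hex string."""
    digest = hashlib.md5()
    with open(filepath, "rb") as fd:
        # read in chunks so large downloads do not need to fit in memory
        for chunk in iter(lambda: fd.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest() == expected_md5


# usage on a small temporary file standing in for a downloaded file
with tempfile.TemporaryDirectory() as root:
    path = os.path.join(root, "data.bin")
    with open(path, "wb") as fd:
        fd.write(b"hello")
    expected = hashlib.md5(b"hello").hexdigest()
    ok = md5_matches(path, expected)
    bad = md5_matches(path, "0" * 32)
print(ok, bad)
```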
- thelper.utils.draw(task, input, pred=None, target=None, block=False, ch_transpose=True, flip_bgr=False, redraw=None, **kwargs)
Draws and returns a figure of a model input/predictions/targets using pyplot or OpenCV.
- thelper.utils.draw_bbox(image, tl, br, text, color, box_thickness=2, font_thickness=1, font_scale=0.4, show=False, block=False, win_name='bbox')
Draws a single bounding box on a given image (used in thelper.utils.draw_bboxes()).
- thelper.utils.draw_bboxes(images, preds=None, bboxes=None, color_map=None, redraw=None, block=False, min_confidence=0.5, **kwargs)
Draws and returns a set of bounding box prediction results.
- thelper.utils.draw_classifs(images, preds=None, labels=None, class_names_map=None, redraw=None, block=False, **kwargs)
Draws and returns a set of classification results.
- thelper.utils.draw_confmat(confmat, class_list, size_inch=(5, 5), dpi=320, normalize=False, keep_unset=False, show=False, block=False)
Draws and returns a confusion matrix figure using pyplot.
- thelper.utils.draw_errbars(labels, min_values, max_values, stddev_values, mean_values, xlabel='', ylabel='Raw Value', show=False, block=False)
Draws and returns an error bar histogram figure using pyplot.
- thelper.utils.draw_histogram(data, bins=50, xlabel='', ylabel='Proportion', show=False, block=False)
Draws and returns a histogram figure using pyplot.
- thelper.utils.draw_images(images, captions=None, redraw=None, show=True, block=False, use_cv2=True, cv2_flip_bgr=True, img_shape=None, max_img_size=None, grid_size_x=None, grid_size_y=None, caption_opts=None, window_name=None)
Draws a set of images with optional captions.
- thelper.utils.draw_pascalvoc_curve(metrics, size_inch=(5, 5), dpi=320, show=False, block=False)
Draws and returns a precision-recall curve according to pascalvoc metrics.
- thelper.utils.draw_popbars(labels, counts, xlabel='', ylabel='Pop. Count', show=False, block=False)
Draws and returns a bar histogram figure using pyplot.
- thelper.utils.draw_predicts(images, preds=None, targets=None, swap_channels=False, redraw=None, block=False, **kwargs)
Draws and returns a set of generic prediction results.
- thelper.utils.draw_roc_curve(fpr, tpr, labels=None, size_inch=(5, 5), dpi=320, show=False, block=False)
Draws and returns an ROC curve figure using pyplot.
- thelper.utils.draw_segments(images, preds=None, masks=None, color_map=None, redraw=None, block=False, **kwargs)
Draws and returns a set of segmentation results.
- thelper.utils.encode_data(data, approach='lz4', **kwargs)
Encodes a numpy array using a given coding approach.
- Parameters
data – the numpy array to encode.
approach – the encoding; supports none, lz4, jpg, png.
- thelper.utils.extract_tar(filepath, root, flags='r:gz')
Extracts the content of a tar file to a specific location.
- Parameters
filepath – location of the tar archive.
root – where to extract the archive’s content.
flags – extra flags passed to tarfile.open.
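The behavior described above can be approximated with the standard tarfile module. In this sketch, extract_tar_sketch is a hypothetical stand-in, and forwarding flags as the mode argument of tarfile.open is an assumption based on the parameter description.

```python
import io
import os
import tarfile
import tempfile


def extract_tar_sketch(filepath, root, flags="r:gz"):
    """Illustrative stand-in: extracts a tar archive's content into `root`."""
    # assumption: `flags` is forwarded as the mode argument of tarfile.open
    with tarfile.open(filepath, flags) as tar:
        tar.extractall(path=root)


with tempfile.TemporaryDirectory() as tmp:
    # build a tiny gzip-compressed archive to extract
    archive_path = os.path.join(tmp, "sample.tar.gz")
    payload = b"some content"
    with tarfile.open(archive_path, "w:gz") as tar:
        info = tarfile.TarInfo(name="hello.txt")
        info.size = len(payload)
        tar.addfile(info, io.BytesIO(payload))
    out_dir = os.path.join(tmp, "out")
    os.makedirs(out_dir)
    extract_tar_sketch(archive_path, out_dir)
    with open(os.path.join(out_dir, "hello.txt"), "rb") as fd:
        extracted = fd.read()
print(extracted)
```

As with any archive extraction, this should only be used on archives from a trusted source.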
- thelper.utils.get_available_cuda_devices(attempts_per_device=5)
Tests all visible CUDA devices and returns a list of available ones.
- Returns
List of available CUDA device IDs (integers). An empty list means no CUDA device is available, and the app should fall back to CPU.
- thelper.utils.get_bgr_from_hsl(hue, sat, light)
Converts a single HSL triplet (0-360 hue, 0-1 sat & lightness) into an 8-bit RGB triplet.
- thelper.utils.get_caller_name(skip=2)
Returns the name of a caller in the format module.class.method.
- Parameters
skip – specifies how many levels of stack to skip while getting the caller.
- Returns
An empty string is returned if skipped levels exceed stack height; otherwise, returns the requested caller name.
- thelper.utils.get_displayable_heatmap(array, convert_rgb=True)
Returns a ‘displayable’ array that has been min-maxed and mapped to color triplets.
- thelper.utils.get_displayable_image(image, grayscale=False)
Returns a ‘displayable’ image that has been normalized and padded to three channels.
- thelper.utils.get_env_list()
Returns a list of all packages installed in the current environment.
If the required packages cannot be imported, the returned list will be empty. Note that some packages may not be properly detected by this approach, and it is pretty hacky, so use it with a grain of salt (i.e. logging is fine).
- thelper.utils.get_file_paths(input_path, data_root, allow_glob=False, can_be_dir=False)
Parses a wildcard-compatible file name pattern at a given root level for valid file paths.
- thelper.utils.get_git_stamp()
Returns a print-friendly SHA signature for the framework’s underlying git repository (if found).
- thelper.utils.get_glob_paths(input_glob_pattern, can_be_dir=False)
Parses a wildcard-compatible file name pattern for valid file paths.
- thelper.utils.get_key(key, config, msg=None, delete=False)
Returns a value given a dictionary key, throwing if not available.
- thelper.utils.get_key_def(key, config, default=None, msg=None, delete=False)
Returns a value given a dictionary key, or the default value if it cannot be found.
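The difference between the two key getters can be sketched as follows. The functions below are hypothetical stand-ins: the exception type (KeyError), the use of msg as the error message, and the effect of delete (removing the entry once it has been read) are assumptions inferred from the parameter names, not documented behavior.

```python
def get_key_sketch(key, config, msg=None, delete=False):
    """Illustrative stand-in: returns config[key], raising if the key is missing."""
    if key not in config:
        # assumption: a missing key is reported with an exception carrying `msg`
        raise KeyError(msg if msg is not None else f"missing key '{key}'")
    # assumption: `delete` removes the entry from the dictionary once read
    return config.pop(key) if delete else config[key]


def get_key_def_sketch(key, config, default=None, delete=False):
    """Illustrative stand-in: returns config[key], or `default` if missing."""
    if key not in config:
        return default
    return config.pop(key) if delete else config[key]


config = {"name": "session", "epochs": 10}
epochs = get_key_sketch("epochs", config)
batch_size = get_key_def_sketch("batch_size", config, default=32)
name = get_key_sketch("name", config, delete=True)
print(epochs, batch_size, name, config)
```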
- thelper.utils.get_label_color_mapping(idx)
Returns the PASCAL VOC color triplet for a given label index.
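For reference, the commonly used PASCAL VOC color map derives each color from the bits of the label index. The sketch below implements that standard scheme under the name voc_color_sketch (hypothetical); the channel order returned by the framework's own function (RGB or BGR) is not stated here, so the (R, G, B) order is an assumption.

```python
def voc_color_sketch(idx):
    """Returns the (R, G, B) PASCAL VOC color for a label index (illustrative)."""
    r = g = b = 0
    for shift in range(8):
        # each group of three low bits feeds one bit of R, G and B, from the top down
        r |= ((idx >> 0) & 1) << (7 - shift)
        g |= ((idx >> 1) & 1) << (7 - shift)
        b |= ((idx >> 2) & 1) << (7 - shift)
        idx >>= 3
    return r, g, b


background = voc_color_sketch(0)
first_class = voc_color_sketch(1)
print(background, first_class)
```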
- thelper.utils.get_log_stamp()
Returns a print-friendly and filename-friendly identification string containing platform and time.
- thelper.utils.get_save_dir(out_root, dir_name, config=None, resume=False, backup_ext='.json')
Returns a directory path in which the app can save its data.
If a folder with name dir_name already exists in the directory out_root, then the user will be asked to pick a new name. If the user refuses, sys.exit(1) is called. If config is not None, it will be saved to the output directory as a json file. Finally, a logs directory will also be created in the output directory for writing logger files.
- Parameters
out_root – path to the directory root where the save directory should be created.
dir_name – name of the save directory to create. If it already exists, a new one will be requested.
config – dictionary of app configuration parameters. Used to overwrite i/o queries, and will be written to the save directory in json format to test writing. Default is None.
resume – specifies whether this session is new, or resumed from an older one (in the latter case, overwriting is allowed, and the user will never have to choose a new folder).
backup_ext – extension to use when creating configuration file backups.
- Returns
The path to the created save directory for this session.
- thelper.utils.import_class(fullname)
General-purpose runtime class importer.
- Supported syntax:
module.package.Class will import the fully qualified Class located in package from the installed module
/some/path/mod.pkg.Cls will import Cls as fully qualified mod.pkg.Cls from the /some/path directory
- Parameters
fullname – the fully qualified class name to be imported.
- Returns
The imported class.
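The first supported syntax (module.package.Class) can be illustrated with importlib. This is a minimal sketch under the hypothetical name import_class_sketch; it does not cover the path-based form (/some/path/mod.pkg.Cls), and the error raised for a malformed name is an assumption.

```python
import collections
import importlib


def import_class_sketch(fullname):
    """Illustrative stand-in: imports a class from its fully qualified name."""
    # split 'module.package.Class' into 'module.package' and 'Class'
    module_name, _, class_name = fullname.rpartition(".")
    if not module_name:
        raise ValueError(f"'{fullname}' is not a fully qualified class name")
    module = importlib.import_module(module_name)
    return getattr(module, class_name)


ordered_dict_cls = import_class_sketch("collections.OrderedDict")
instance = ordered_dict_cls(a=1)
print(ordered_dict_cls, instance)
```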
- thelper.utils.import_function(fullname, params=None)
General-purpose runtime function importer, with support for param binding.
- Parameters
fullname – the fully qualified function name to be imported.
params – optional params dictionary to bind to the function call via functools.
- Returns
The imported function, with optionally bound parameters.
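Parameter binding via functools can be sketched as shown below. import_function_sketch is a hypothetical stand-in, and binding the dictionary entries as keyword arguments with functools.partial is an assumption about how the binding is done.

```python
import functools
import importlib


def import_function_sketch(fullname, params=None):
    """Illustrative stand-in: imports a function and optionally binds parameters to it."""
    module_name, _, func_name = fullname.rpartition(".")
    func = getattr(importlib.import_module(module_name), func_name)
    if params:
        # bind the given parameters as keyword arguments of the function
        func = functools.partial(func, **params)
    return func


# json.dumps is used here only because it is a convenient standard function
to_sorted_json = import_function_sketch("json.dumps", {"sort_keys": True})
result = to_sorted_json({"b": 1, "a": 2})
print(result)
```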
- thelper.utils.init_logger(log_level=0, filename=None, force_stdout=False)
Initializes the framework logger with a specific filter level, and optional file output.
- thelper.utils.is_scalar(val)
Returns whether the input value is a scalar according to numpy and PyTorch.
- thelper.utils.load_checkpoint(ckpt, map_location=None, always_load_latest=False, check_version=True)
Loads a session checkpoint via PyTorch, checks its compatibility, and returns its data.
If the ckpt parameter is a path to a valid directory, then that directory will be searched for a checkpoint. If multiple checkpoints are found, the latest will be returned (based on the epoch index in its name). If always_load_latest is set to False and if a checkpoint named ckpt.best.pth is found, it will be returned instead.
- Parameters
ckpt – a file-like object or a path to the checkpoint file or session directory.
map_location – a function, string or a dict specifying how to remap storage locations. See torch.load for more information.
always_load_latest – toggles whether to always try to load the latest checkpoint if a session directory is provided (instead of loading the ‘best’ checkpoint).
check_version – toggles whether the checkpoint’s version should be checked for compatibility issues, and whether the user should be queried on how to proceed.
- Returns
Content of the checkpoint (a dictionary).
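The selection rule for session directories can be sketched on a plain list of file names. Only the name ckpt.best.pth comes from the description above; the ckpt.<epoch>.pth pattern used for the other checkpoints is a hypothetical naming scheme chosen for illustration, as is the helper name pick_checkpoint_sketch.

```python
import re


def pick_checkpoint_sketch(filenames, always_load_latest=False):
    """Illustrative stand-in: picks which checkpoint file of a session directory to load."""
    if not always_load_latest and "ckpt.best.pth" in filenames:
        # the 'best' checkpoint takes priority unless the latest one is forced
        return "ckpt.best.pth"
    # assumption: other checkpoints are named 'ckpt.<epoch>.pth' (hypothetical pattern)
    pattern = re.compile(r"^ckpt\.(\d+)\.pth$")
    epochs = {}
    for name in filenames:
        match = pattern.match(name)
        if match:
            epochs[int(match.group(1))] = name
    if not epochs:
        raise FileNotFoundError("no checkpoint found")
    # the latest checkpoint is the one with the highest epoch index
    return epochs[max(epochs)]


files = ["ckpt.0001.pth", "ckpt.0010.pth", "ckpt.0002.pth", "ckpt.best.pth"]
best = pick_checkpoint_sketch(files)
latest = pick_checkpoint_sketch(files, always_load_latest=True)
print(best, latest)
```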
- thelper.utils.lreplace(string, old_prefix, new_prefix)
Replaces a single occurrence of old_prefix in the given string by new_prefix.
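A prefix replacement of this kind can be sketched in a few lines. The parameter names suggest that only an occurrence at the start of the string is replaced; that reading is an assumption, and lreplace_sketch is a hypothetical stand-in rather than the framework's implementation.

```python
def lreplace_sketch(string, old_prefix, new_prefix):
    """Illustrative stand-in: replaces `old_prefix` at the start of `string` only."""
    if old_prefix and string.startswith(old_prefix):
        # later occurrences of `old_prefix` inside the string are left untouched
        return new_prefix + string[len(old_prefix):]
    return string


renamed = lreplace_sketch("module.layer.module.weight", "module.", "model.")
unchanged = lreplace_sketch("layer.module.weight", "module.", "model.")
print(renamed, unchanged)
```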
- thelper.utils.migrate_checkpoint(ckptdata)
Migrates the content of an incompatible or outdated checkpoint to the current version of the framework.
This function might not be able to fix all backward compatibility issues (e.g. it cannot fix class interfaces that were changed). Perfect reproducibility of tests cannot be guaranteed either if this migration tool is used.
- Parameters
ckptdata – checkpoint data in dictionary form obtained via thelper.utils.load_checkpoint. Note that the data contained in this dictionary will be modified in-place.
- Returns
An updated checkpoint dictionary that should be compatible with the current version of the framework.
- thelper.utils.migrate_config(config, cfg_ver_str)
Migrates the content of an incompatible or outdated configuration to the current version of the framework.
This function might not be able to fix all backward compatibility issues (e.g. it cannot fix class interfaces that were changed). Perfect reproducibility of tests cannot be guaranteed either if this migration tool is used.
- Parameters
config – session configuration dictionary obtained e.g. by parsing a JSON file. Note that the data contained in this dictionary will be modified in-place.
cfg_ver_str – string representing the version for which the configuration was created (e.g. “0.2.0”).
- Returns
An updated configuration dictionary that should be compatible with the current version of the framework.
- thelper.utils.query_string(question, choices=None, default=None, allow_empty=False, bypass=None)
Asks the user a question and returns the answer (a generic string).
- Parameters
question – the string that is presented to the user.
choices – a list of predefined choices that the user can pick from. If None, then whatever the user types will be accepted.
default – the presumed answer if the user just hits <Enter>. If None, then an answer is required to continue.
allow_empty – defines whether an empty answer should be accepted.
bypass – the returned value if the bypass_queries global variable is set to True. Can be None, in which case the function will throw an exception.
- Returns
The string entered by the user.
- thelper.utils.query_yes_no(question, default=None, bypass=None)
Asks the user a yes/no question and returns the answer.
- Parameters
question – the string that is presented to the user.
default – the presumed answer if the user just hits <Enter>. It must be ‘yes’, ‘no’, or None (meaning an answer is required).
bypass – the option to select if the bypass_queries global variable is set to True. Can be None, in which case the function will throw an exception.
- Returns
True for ‘yes’, or False for ‘no’ (or their respective variations).
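The way a typed answer and the default interact can be sketched without any console I/O. parse_yes_no_sketch is a hypothetical helper, and the exact set of accepted answer variations is an assumption; the bypass mechanism is not modeled here.

```python
def parse_yes_no_sketch(answer, default=None):
    """Illustrative stand-in: maps a typed answer to True/False, or None if invalid."""
    # assumption: these are the accepted variations of 'yes' and 'no'
    valid = {"yes": True, "ye": True, "y": True, "no": False, "n": False}
    if default is not None and default not in ("yes", "no"):
        raise ValueError("default must be 'yes', 'no', or None")
    answer = answer.strip().lower()
    if answer == "" and default is not None:
        # the user just hit <Enter>: fall back to the presumed answer
        return valid[default]
    return valid.get(answer)  # None means the question must be asked again


print(parse_yes_no_sketch("Y"), parse_yes_no_sketch("", default="no"))
```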
- thelper.utils.reporthook(count, block_size, total_size)
Report hook used to display a download progress bar when using urllib requests.
- thelper.utils.resolve_import(fullname)
Class name resolver.
Takes a string corresponding to a module and class fullname to be imported with thelper.utils.import_class() and resolves any backward compatibility issues related to renamed or moved classes.
- Parameters
fullname – the fully qualified class name to be resolved.
- Returns
The resolved class fullname.
- thelper.utils.safe_crop(image, tl, br, bordertype=0, borderval=0, force_copy=False)
Safely crops a region from within an image, padding borders if needed.
- Parameters
image – the image to crop (provided as a numpy array).
tl – a tuple or list specifying the (x,y) coordinates of the top-left crop corner.
br – a tuple or list specifying the (x,y) coordinates of the bottom-right crop corner.
bordertype – border copy type to use when the image is too small for the required crop size. See cv2.copyMakeBorder for more information.
borderval – border value to use when the image is too small for the required crop size. See cv2.copyMakeBorder for more information.
force_copy – defines whether to force a copy of the target image region even when it can be avoided.
- Returns
The cropped image.
- thelper.utils.save_config(config, path, force_convert=True)
Saves the given session/object configuration dictionary to the provided path.
The type of file that is created is based on the extension specified in the path. If the file cannot hold some of the objects within the configuration, they will be converted to strings before serialization, unless force_convert is set to False (in which case the function will raise an exception).
- Parameters
config – the session/object configuration dictionary to save.
path – the path specifying where to create the output file. The extension used will determine what type of backup to create (e.g. Pickle = .pkl, JSON = .json).
force_convert – specifies whether non-serializable types should be converted if necessary.
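The extension-based dispatch and the force_convert behavior can be sketched with the standard json and pickle modules. save_config_sketch is a hypothetical stand-in; using default=str for the string conversion and raising TypeError in strict mode are properties of this sketch, not documented behavior of the framework.

```python
import json
import os
import pickle
import tempfile


def save_config_sketch(config, path, force_convert=True):
    """Illustrative stand-in: saves a configuration as JSON or Pickle, based on the extension."""
    ext = os.path.splitext(path)[1].lower()
    if ext == ".json":
        # `default=str` converts non-serializable objects to strings; without it,
        # the json module raises a TypeError on such objects
        with open(path, "w") as fd:
            json.dump(config, fd, indent=4, default=str if force_convert else None)
    elif ext == ".pkl":
        with open(path, "wb") as fd:
            pickle.dump(config, fd)
    else:
        raise ValueError(f"unsupported extension '{ext}'")


config = {"name": "session", "value": complex(1, 2)}  # complex is not JSON-serializable
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "config.json")
    save_config_sketch(config, path)
    with open(path) as fd:
        reloaded = json.load(fd)
    try:
        save_config_sketch(config, path, force_convert=False)
        strict_failed = False
    except TypeError:
        strict_failed = True
print(reloaded, strict_failed)
```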
- thelper.utils.save_env_list(path)
Saves a list of all packages installed in the current environment to a log file.
- Parameters
path – the path where the log file should be created.
- thelper.utils.setup_cudnn(config)
Parses the provided config for CUDNN flags and sets up PyTorch accordingly.
- thelper.utils.setup_cv2(config)
Parses the provided config for OpenCV flags and sets up its global state accordingly.
- thelper.utils.setup_globals(config)
Parses the provided config for global flags and sets up the global state accordingly.
- thelper.utils.setup_plt(config)
Parses the provided config for matplotlib flags and sets up its global state accordingly.
- thelper.utils.str2bool(s)
Converts a string to a boolean.
If the lower case version of the provided string matches any of ‘true’, ‘1’, or ‘yes’, then the function returns True.
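The matching rule above amounts to a membership test on the lower-cased string. In this sketch, str2bool_sketch is a hypothetical stand-in, and returning False for every other string is an assumption, since only the True case is documented.

```python
def str2bool_sketch(s):
    """Illustrative stand-in: True only for 'true', '1' or 'yes' (case-insensitive)."""
    # assumption: any other string maps to False
    return s.lower() in ("true", "1", "yes")


print(str2bool_sketch("Yes"), str2bool_sketch("0"))
```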
- thelper.utils.str2size(input_str)
Returns a (WIDTH, HEIGHT) integer size tuple from a string formatted as ‘WxH’.
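Parsing a ‘WxH’ string can be sketched as follows. str2size_sketch is a hypothetical stand-in, and accepting an upper-case ‘X’ as the separator is an assumption of this sketch.

```python
def str2size_sketch(input_str):
    """Illustrative stand-in: parses 'WxH' into a (WIDTH, HEIGHT) integer tuple."""
    # assumption: the separator is a single 'x' (either case)
    width, height = input_str.lower().split("x")
    return int(width), int(height)


size = str2size_sketch("640x480")
print(size)
```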