srccheck - Checksum functionality for source code
- Purpose:
This module is used to perform checksum calculations on a collection of files to verify if the checksum calculated on each file matches the expected checksum value.
In practical terms, an application can call the check() method, passing a reference file which lists the filepaths to be checksummed along with their expected checksums. If the checksum values match the reference file, a value of True is returned to the caller application, signaling the inspected source code files have not been modified and are ‘safe’ for use. Otherwise, a value of False is returned to the caller and the filename of each failing file is printed to the terminal.
- Platform:
Linux/Windows | Python 3.7+
- Developer:
J Berendt
- Email:
- Comments:
n/a
- Example usage:
Generate an un-encrypted reference file:
>>> from utils4.srccheck import srccheck
>>> files = ['list.c', 'of.py', 'files.sql']
>>> srccheck.generate(filepaths=files, encrypt=False)
Verify checksums from within an application, with an un-encrypted reference file:
>>> from utils4.srccheck import srccheck
>>> srccheck.check(ref_file='path/to/srccheck.ref')
True
Generate an encrypted reference file:
>>> from utils4.srccheck import srccheck
>>> files = ['list.c', 'of.py', 'files.sql']
>>> srccheck.generate(filepaths=files, encrypt=True)
Verify checksums from within an application, with an encrypted reference file:
>>> from utils4.srccheck import srccheck
>>> srccheck.check(ref_file='path/to/srccheck.ref', key_file='path/to/srccheck.key')
True
Advanced usage:
If you wish to delay the output of mismatched files (to give the caller application display control), the caller can redirect the output from the check() method into a buffer and display it at a more appropriate time. For example:
>>> from contextlib import redirect_stdout
>>> from io import StringIO
>>> from utils4.srccheck import srccheck
>>> buff = StringIO()
>>> with redirect_stdout(buff):
...     test = srccheck.check(ref_file='path/to/srccheck.ref')
>>> # ...
>>> if not test:
...     print(buff.getvalue())
Checksum verification has failed for the following:
- 02-01_first.c
- 10-09_ptr_exchange.c
- 06-ex07.c
- 15-ex05_col_output.c
- 02-03_multi_lines.c
>>> buff.close()
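The buffering pattern above can be wrapped in a small helper. The following is a minimal sketch and is not part of utils4: `call_captured` is a hypothetical name, and `fake_check` is a stand-in for srccheck.check() so the example runs without a reference file.

```python
from contextlib import redirect_stdout
from io import StringIO
from typing import Any, Callable, Tuple


def call_captured(func: Callable[..., Any],
                  *args: Any,
                  **kwargs: Any) -> Tuple[Any, str]:
    """Call ``func`` and return its result along with anything it printed."""
    with StringIO() as buff, redirect_stdout(buff):
        result = func(*args, **kwargs)
        text = buff.getvalue()  # Read the buffer before it is closed.
    return result, text


def fake_check(ref_file: str) -> bool:
    """Hypothetical stand-in for srccheck.check(); reports one failure."""
    print('Checksum verification has failed for the following:')
    print('- example.c')
    return False


if __name__ == '__main__':
    ok, report = call_captured(fake_check, ref_file='path/to/srccheck.ref')
    # Nothing has reached the terminal yet; the caller decides when to show it.
    if not ok:
        print(report, end='')
```

Returning the captured text alongside the result keeps the display decision with the caller, which is the purpose of the redirection.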
- class srccheck.SourceCheck
Verify source code checksum values are as expected.
- check(ref_file: str, key_file: str = '') -> bool
Verify the provided source code file checksums are as expected.
If any checksums do not match, the names of those files are reported to the terminal.
- Parameters:
ref_file (str) – Full path to the reference file containing the full paths to the file(s) to be tested and the associated checksum value(s).
key_file (str, optional) – Full path to the key file. If a key file is not provided, the method assumes the reference file is in plaintext CSV and does not attempt to decrypt. Defaults to ‘’.
Note
If the key_file argument is not provided, it is assumed the ref_file is a plaintext CSV file, and decryption is not attempted.
If the key_file argument is provided, it is assumed the ref_file has been encrypted, and decryption is carried out.
- Raises:
FileNotFoundError – If either the reference file or the key file does not exist.
- Returns:
True if every file’s checksum value agrees with the checksum listed in the reference file; otherwise False.
- Return type:
bool
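To make the comparison concrete, a check against a plaintext reference file can be approximated with the standard library alone. This is a sketch under stated assumptions, not the utils4 implementation: the names `md5_of` and `find_mismatches` are hypothetical, the reference file is assumed to hold `filepath,checksum` rows as described for unencrypted reference files, and MD5 is assumed from the `md5_hash_string` placeholders in that layout.

```python
import csv
import hashlib
from typing import List


def md5_of(path: str, chunk_size: int = 65536) -> str:
    """Return the hex MD5 digest of a file, read in chunks."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        for chunk in iter(lambda: f.read(chunk_size), b''):
            digest.update(chunk)
    return digest.hexdigest()


def find_mismatches(ref_file: str) -> List[str]:
    """Return the paths whose current checksum differs from the reference."""
    failed = []
    with open(ref_file, newline='', encoding='utf-8') as f:
        for row in csv.reader(f):
            if not row:
                continue  # Skip blank lines.
            path, expected = row[0], row[1]
            if md5_of(path) != expected.strip():
                failed.append(path)
    return failed
```

An empty list corresponds to the True case; a non-empty list holds the file names that would be reported. As with check(), a missing reference file raises FileNotFoundError when it is opened.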
- generate(filepaths: List[str], encrypt: bool = False)
Generate the reference file containing the source file checksums and, if encryption is requested, the associated key file.
- Parameters:
filepaths (list[str]) – A list of full paths which are to be included in the reference file.
encrypt (bool, optional) – Encrypt the reference file and generate a key file. Defaults to False.
- Reference File:
If unencrypted:
The reference file is a flat, plaintext CSV file with the file path as the first field and the checksum value as the second field.
For example:
filepath_01,md5_hash_string_01
filepath_02,md5_hash_string_02
filepath_03,md5_hash_string_03
...
filepath_NN,md5_hash_string_NN
If encrypted:
The reference file is a serialised, encrypted representation of the full path and associated checksum value for all provided files, in JSON format. This data is written to the srccheck.ref file.
A unique encryption key is created with each call to this method, and stored to the srccheck.key file.
To perform checks, both the reference file and the key file must be provided to the check() method.
Note
These files are a pair. If one file is lost, the other file is useless.
- Layout:
If encrypted:
The layout of the deserialised and decrypted reference file is in basic JSON format, with the file paths as the keys, and the checksum values as the values.
For example:
{"filepath_01": "md5_hash_string_01", "filepath_02": "md5_hash_string_02", "filepath_03": "md5_hash_string_03", ..., "filepath_NN": "md5_hash_string_NN"}
- Raises:
FileNotFoundError – If any of the files provided to the filepaths argument does not exist.
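The two reference layouts described above can be illustrated with the standard library. This is a hedged sketch rather than the generate() implementation: `build_checksums`, `write_plain_ref` and `to_json_layout` are hypothetical names, MD5 is assumed from the `md5_hash_string` placeholders, and the encryption step is deliberately omitted because the cipher is not specified here.

```python
import csv
import hashlib
import json
from typing import Dict, List


def build_checksums(filepaths: List[str]) -> Dict[str, str]:
    """Map each file path to the hex MD5 digest of its contents."""
    sums = {}
    for path in filepaths:
        with open(path, 'rb') as f:  # Raises FileNotFoundError if missing.
            sums[path] = hashlib.md5(f.read()).hexdigest()
    return sums


def write_plain_ref(checksums: Dict[str, str], ref_file: str) -> None:
    """Write ``filepath,checksum`` rows, one per file (unencrypted layout)."""
    with open(ref_file, 'w', newline='', encoding='utf-8') as f:
        csv.writer(f).writerows(checksums.items())


def to_json_layout(checksums: Dict[str, str]) -> str:
    """Serialise the mapping in the JSON layout shown for the decrypted file."""
    return json.dumps(checksums)
```

Building the path-to-checksum mapping once and then choosing the output layout keeps the two formats consistent with each other: the CSV rows and the JSON object carry exactly the same pairs.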