srccheck - Checksum functionality for source code

Purpose:

This module calculates checksums for a collection of files and verifies that the checksum calculated for each file matches its expected value.

In practical terms, an application calls the check() method, passing a reference file which lists the filepaths to be checksummed along with their expected checksums. If every calculated checksum matches the reference file, a value of True is returned to the caller application, signaling the inspected source code files have not been modified and are ‘safe’ for use. Otherwise, a value of False is returned to the caller and the filename of each failing file is printed to the terminal.

Platform:

Linux/Windows | Python 3.7+

Developer:

J Berendt

Email:

development@s3dev.uk

Comments:

n/a

Example usage:

Generate an un-encrypted reference file:

>>> from utils4.srccheck import srccheck

>>> files = ['list.c', 'of.py', 'files.sql']
>>> srccheck.generate(filepaths=files, encrypt=False)

Verify checksums from within an application, with an un-encrypted reference file:

>>> from utils4.srccheck import srccheck

>>> srccheck.check(ref_file='path/to/srccheck.ref')
True

Generate an encrypted reference file:

>>> from utils4.srccheck import srccheck

>>> files = ['list.c', 'of.py', 'files.sql']
>>> srccheck.generate(filepaths=files, encrypt=True)

Verify checksums from within an application, with an encrypted reference file:

>>> from utils4.srccheck import srccheck

>>> srccheck.check(ref_file='path/to/srccheck.ref',
                   key_file='path/to/srccheck.key')
True

Advanced usage:

If you wish to delay the output of mismatched files (to give the caller application display control), the caller can redirect the output from the check() method into a buffer and display it at a more appropriate time. For example:

>>> from contextlib import redirect_stdout
>>> from io import StringIO
>>> from utils4.srccheck import srccheck

>>> buff = StringIO()
>>> with redirect_stdout(buff):
...     test = srccheck.check(ref_file='path/to/srccheck.ref')

>>> # ...

>>> if not test:
...     print(buff.getvalue())
>>> buff.close()

Checksum verification has failed for the following:
- 02-01_first.c
- 10-09_ptr_exchange.c
- 06-ex07.c
- 15-ex05_col_output.c
- 02-03_multi_lines.c

class srccheck.SourceCheck

Verify source code checksum values are as expected.

check(ref_file: str, key_file: str = '') → bool

Verify the provided source code file checksums are as expected.

If any checksums do not match, the names of those files are reported to the terminal.

Parameters:
  • ref_file (str) – Full path to the reference file containing the full paths to the file(s) to be tested and the associated checksum value(s).

  • key_file (str, optional) – Full path to the key file. If a key file is not provided, the method assumes the reference file is in plaintext CSV and does not attempt to decrypt. Defaults to '' (an empty string).

Note

If the key_file argument is not provided, it is assumed the ref_file is a plaintext CSV file, and decryption is not attempted.

If the key_file argument is provided, it is assumed the ref_file has been encrypted, and decryption is carried out.

Raises:

FileNotFoundError – If either the reference file or the key file does not exist.

Returns:

True if every file’s checksum value agrees with the checksum listed in the reference file; otherwise False.

Return type:

bool
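
Because check() raises FileNotFoundError rather than returning False when the reference file or key file is missing, a caller may prefer to treat both outcomes as a failed verification. The following is a minimal sketch only: safe_check is a hypothetical helper (not part of this module), and it assumes the checker is passed in as a callable accepting the same keyword arguments as check().

```python
from typing import Callable


def safe_check(check: Callable[..., bool],
               ref_file: str,
               key_file: str = '') -> bool:
    """Run a check()-style callable, treating missing files as a failure.

    Hypothetical helper for illustration; not part of the module.
    """
    try:
        return check(ref_file=ref_file, key_file=key_file)
    except FileNotFoundError as err:
        # Raised when the reference file or the key file does not exist.
        print(f'Source verification could not run: {err}')
        return False
```

In an application this might be called as safe_check(srccheck.check, 'path/to/srccheck.ref').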

generate(filepaths: List[str], encrypt: bool = False)

Generate the reference file containing the source file checksums and, if encryption is requested, the associated key file.

Parameters:
  • filepaths (list[str]) – A list of full paths which are to be included in the reference file.

  • encrypt (bool, optional) – Encrypt the reference file and generate a key file. Defaults to False.

Reference File:

If unencrypted:

The reference file is a flat, plaintext CSV file with the file path as the first field and the checksum value as the second field.

For example:

filepath_01,md5_hash_string_01
filepath_02,md5_hash_string_02
filepath_03,md5_hash_string_03
...
filepath_NN,md5_hash_string_NN
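
The plaintext layout above can be written and verified with the standard library alone. The following is a minimal sketch and not the module's implementation: the helper names md5_of, write_reference and verify_reference are hypothetical, and it assumes each checksum is the hexadecimal MD5 digest of the file's raw bytes and that the CSV has no header row.

```python
import csv
import hashlib
from typing import List


def md5_of(path: str) -> str:
    """Return the hex MD5 digest of a file's bytes (assumed checksum format)."""
    digest = hashlib.md5()
    with open(path, 'rb') as f:
        # Read in chunks so large files are not loaded into memory at once.
        for chunk in iter(lambda: f.read(65536), b''):
            digest.update(chunk)
    return digest.hexdigest()


def write_reference(filepaths: List[str], ref_file: str) -> None:
    """Write one 'filepath,checksum' row per file (hypothetical helper)."""
    with open(ref_file, 'w', newline='', encoding='utf-8') as f:
        writer = csv.writer(f)
        for path in filepaths:
            writer.writerow([path, md5_of(path)])


def verify_reference(ref_file: str) -> List[str]:
    """Return the paths whose current checksum differs from the reference."""
    failed = []
    with open(ref_file, newline='', encoding='utf-8') as f:
        for path, expected in csv.reader(f):
            if md5_of(path) != expected:
                failed.append(path)
    return failed
```

An empty list from verify_reference() corresponds to a passing check; any returned paths are the files that would be reported as failing.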

If encrypted:

The reference file is a serialised, encrypted representation of the full path and associated checksum value for all provided files, in JSON format. This data is written to the srccheck.ref file.

A unique encryption key is created with each call to this method, and stored to the srccheck.key file.

To perform checks, both the reference file and the key file must be provided to the check() method.

Note

These files are a pair. If one file is lost, the other file is useless.

Layout:

If encrypted:

The layout of the deserialised and decrypted reference file is in basic JSON format, with the file paths as the keys, and checksum values as the values.

For example:

{"filepath_01": "md5_hash_string_01",
 "filepath_02": "md5_hash_string_02",
 "filepath_03": "md5_hash_string_03",
 ...,
 "filepath_NN": "md5_hash_string_NN"}
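
Once the reference data has been decrypted and deserialised, verification reduces to comparing each mapped checksum against a freshly calculated one. The following is a minimal sketch under stated assumptions: the decrypted content is available as a JSON string, checksums are hexadecimal MD5 digests of the file bytes, and the function name failed_files is hypothetical. Decryption is deliberately left out, as the module's encryption scheme is not described here.

```python
import hashlib
import json
from pathlib import Path
from typing import List


def failed_files(decrypted_json: str) -> List[str]:
    """Return paths whose MD5 digest differs from the mapped checksum.

    Hypothetical helper; assumes a {"filepath": "md5 hex digest"} mapping.
    """
    expected = json.loads(decrypted_json)
    failed = []
    for path, checksum in expected.items():
        # Hash the file's raw bytes and compare with the stored value.
        actual = hashlib.md5(Path(path).read_bytes()).hexdigest()
        if actual != checksum:
            failed.append(path)
    return failed
```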

Raises:

FileNotFoundError – If any of the files provided to the filepaths argument do not exist.