Transformers

Each survey type that passes through SDX may require:

  • validation of data fields
  • application of business logic
  • conversion to downstream formats

We give each survey type its own transformer class. The class implements the custom behaviour required.

To minimise code duplication, each new transformer class should inherit from the base class Transformer, which already provides the common functionality.

Two legacy classes, ImageTransformer and PDFTransformer, exist for the generation of page images. Transformer classes delegate to them for image generation.

Transformer

Transform survey data into formats required downstream.

class sdx.common.transformer.Transformer(response, seq_nr=0, log=None)

A base class for SDX transformers.

Subclasses should define the contents of the following class variables:

defn = []

Transformer subclasses declare their transforms in this class variable. Each element is a 3-tuple consisting of:

  1. An integer or range corresponding to one or more question ids.
  2. A default value for the question(s).
  3. A processing function.

For example, to declare question ids 151, 152 and 153 as unsigned integers with a default of 0:

defn = [
    (range(151, 154, 1), 0, Processor.unsigned_integer),
]

package = 'sdx.common'

Defines the package where survey definitions can be found. For deployed Python packages, this will be a dotted package name. If your Transformer class is deployed from a source repository, you should set this class variable to __name__.

pattern = '../surveys/{survey_id}.{inst_id}.json'

Defines a search pattern for survey definitions based on identifiers. The path is relative to the location specified by package above.
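As a small illustration of how such a pattern resolves, the sketch below fills in the placeholders with str.format. The identifier values are assumptions chosen to match the survey file name used in the PDFTransformer example (144.0001.json). How Transformer joins the result to the package location is not shown here.

```python
# The search pattern as declared on the Transformer base class.
pattern = '../surveys/{survey_id}.{inst_id}.json'

# Illustrative identifiers (assumed values, not read from a real response).
resolved = pattern.format(survey_id='144', inst_id='0001')
print(resolved)
```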

classmethod ops()

Publish the sequence of operations for the transform.

Return an ordered mapping from question id to default value and processing function.
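The relationship between defn and this ordered mapping can be sketched as follows. This is not the library's implementation: expand and unsigned_integer are hypothetical stand-ins, and the use of integer keys is an assumption made for illustration only.

```python
from collections import OrderedDict


def unsigned_integer(value, default=0):
    """Hypothetical stand-in for Processor.unsigned_integer."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        return default
    return number if number >= 0 else default


# Same shape as a defn declaration: (question ids, default, function).
defn = [
    (range(151, 154, 1), 0, unsigned_integer),
    (200, 0, unsigned_integer),
]


def expand(defn):
    """Flatten defn into an ordered mapping keyed by question id."""
    ops = OrderedDict()
    for ids, default, fn in defn:
        # A single integer is treated as a one-element sequence.
        for qid in ([ids] if isinstance(ids, int) else ids):
            ops[qid] = (default, fn)
    return ops


ops = expand(defn)
print(list(ops))
```

Each range is expanded in declaration order, so the order of the keys follows the order of the entries in defn.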

classmethod transform(data, survey=None)

Perform a transform on survey data.

static create_zip(locn, manifest)

Create a zip archive from a local directory and a manifest list.

Return the contents of the zip as bytes.
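A minimal sketch of this kind of operation, using only the standard library, might look like the following. It assumes the manifest is a list of file paths relative to locn; the actual manifest format used by create_zip may differ, and zip_from_manifest is a hypothetical name.

```python
import io
import os
import tempfile
import zipfile


def zip_from_manifest(locn, manifest):
    """Zip the files named in manifest (paths relative to locn) into bytes."""
    buf = io.BytesIO()
    with zipfile.ZipFile(buf, 'w', zipfile.ZIP_DEFLATED) as archive:
        for name in manifest:
            # arcname keeps the archive paths relative to locn.
            archive.write(os.path.join(locn, name), arcname=name)
    return buf.getvalue()


# Demonstration with a throwaway directory and a single file.
with tempfile.TemporaryDirectory() as locn:
    with open(os.path.join(locn, 'a.txt'), 'w') as f:
        f.write('hello')
    payload = zip_from_manifest(locn, ['a.txt'])
```

Building the archive in memory means the bytes can be returned after the temporary directory has been removed.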

pack(settings=None, img_seq=None, tmp='tmp')

Perform transformation on the survey data and pack the output into a zip file.

Return the contents of the zip as bytes. The object maintains a temporary directory while the output is generated.

ImageTransformer

The SDX Image Transformer:

class sdx.common.transforms.ImageTransformer.ImageTransformer(logger, settings, survey, response_data, sequence_no=1000)

static extract_pdf_images(path, f_name)

Extract pages from a PDF document.

Parameters:
  • path (str) – The location of the working directory.
  • f_name (str) – The file name of the PDF document.
Returns:

A sorted sequence of image file names.

This method delegates to the pdftoppm utility.
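Delegating to pdftoppm could be sketched roughly as below. The function names, the 'image' output prefix and the choice of JPEG output are assumptions made for illustration; the options that extract_pdf_images actually passes are not documented here.

```python
import glob
import os
import subprocess


def pdftoppm_command(f_name, prefix='image'):
    """Build an argument list that renders each PDF page as a JPEG."""
    return ['pdftoppm', '-jpeg', f_name, prefix]


def extract_images(path, f_name, prefix='image'):
    """Run pdftoppm in the working directory and list the page images."""
    # Requires the pdftoppm utility to be installed and on the PATH.
    subprocess.run(pdftoppm_command(f_name, prefix), cwd=path, check=True)
    return sorted(glob.glob(os.path.join(path, prefix + '*.jpg')))
```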

create_image_sequence(path, nmbr_seq=None)

Renumber the image sequence extracted from the PDF.

Parameters:
  • path (str) – The location of the working directory.
  • nmbr_seq (list or generator) – A sequence or generator of integers.
Returns:

A generator of file paths.
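The renumbering idea can be sketched as a generator that renames each file using the next integer from the sequence. The function name renumber, the zero-padded naming scheme and the fallback to counting from 1 are all assumptions for illustration; the real method's file naming is not described here.

```python
import itertools
import os
import tempfile


def renumber(path, names, nmbr_seq=None):
    """Rename each file to a number drawn from nmbr_seq; yield new paths."""
    numbers = iter(nmbr_seq) if nmbr_seq is not None else itertools.count(1)
    for name, number in zip(sorted(names), numbers):
        ext = os.path.splitext(name)[1]
        # Hypothetical naming scheme: four-digit number plus extension.
        new_path = os.path.join(path, '{0:04d}{1}'.format(number, ext))
        os.rename(os.path.join(path, name), new_path)
        yield new_path


# Demonstration with two empty placeholder files.
with tempfile.TemporaryDirectory() as path:
    for name in ('image-1.jpg', 'image-2.jpg'):
        open(os.path.join(path, name), 'w').close()
    renamed = [
        os.path.basename(p)
        for p in renumber(path, ['image-1.jpg', 'image-2.jpg'],
                          nmbr_seq=itertools.count(1000))
    ]
```

Because the function is a generator, no file is renamed until the caller iterates over the result.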

create_image_index(images)

Take a list of images and create an index CSV from them.

create_zip(images, index)

Create a zip from a renumbered sequence.

cleanup(locn)

Remove all temporary files by removing the top-level directory.

PDFTransformer

SDX PDF Transformer.

Example:

python transform/transformers/PDFTransformer.py --survey transform/surveys/144.0001.json < tests/replies/ukis-01.json > output.pdf

class sdx.common.transforms.PDFTransformer.PDFTransformer