Transformers
Each survey type that passes through SDX may require:
- validation of data fields
- application of business logic
- conversion to downstream formats
We give each survey type its own transformer class. The class implements the custom behaviour required.
To minimise code duplication, each new transformer class should inherit from the Transformer base class, which already provides the common functionality.
Two legacy classes exist for the generation of page images. Transformer classes delegate to them for image generation:
Transformer
Transform survey data into formats required downstream.
class sdx.common.transformer.Transformer(response, seq_nr=0, log=None)
A base class for SDX transformers.
Subclasses should define the contents of the following class variables:
defn = []
Transformer subclasses declare their transforms in this class variable. Each element is a 3-tuple consisting of:
- An integer or range corresponding to one or more question ids.
- A default value for the question(s).
- A processing function.
For example, to declare question ids 151, 152 and 153 as unsigned integers with a default of 0:
defn = [ (range(151, 154, 1), 0, Processor.unsigned_integer), ]
package = 'sdx.common'
Defines the package where survey definitions can be found. For deployed Python packages, this will be a dotted package name. If your Transformer class is deployed from a source repository, you should set this class variable to __name__.
pattern = '../surveys/{survey_id}.{inst_id}.json'
Defines a search pattern for survey definitions based on identifiers. The path is relative to the location specified by package above.
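As a rough illustration of how the pattern might be filled in, the sketch below formats it with a pair of identifiers and resolves it against a package directory. The helper name definition_path, the directory "sdx/common" and the identifier values "023" and "0203" are assumptions made for this example; the library's own lookup may work differently.

```python
import posixpath

# Class variable values as documented above.
package = "sdx.common"
pattern = "../surveys/{survey_id}.{inst_id}.json"


def definition_path(package_dir, survey_id, inst_id):
    """Hypothetical helper: resolve the pattern relative to a package directory."""
    relative = pattern.format(survey_id=survey_id, inst_id=inst_id)
    # normpath collapses the leading "../" against the package directory.
    return posixpath.normpath(posixpath.join(package_dir, relative))


# The identifiers here are illustrative values only.
print(definition_path("sdx/common", "023", "0203"))  # sdx/surveys/023.0203.json
```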
classmethod ops()
Publish the sequence of operations for the transform.
Return an ordered mapping from question id to default value and processing function.
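The relationship between defn and ops() can be sketched as follows. This is a minimal stand-alone illustration, not the library's implementation: expand_defn and unsigned_integer are hypothetical stand-ins (the latter for Processor.unsigned_integer), and the real mapping may use a different key type.

```python
from collections import OrderedDict


def unsigned_integer(value, default=0):
    """Hypothetical stand-in for Processor.unsigned_integer."""
    try:
        number = int(value)
    except (TypeError, ValueError):
        return default
    return number if number >= 0 else default


def expand_defn(defn):
    """Flatten 3-tuples into an ordered mapping: question id -> (default, function)."""
    ops = OrderedDict()
    for ids, default, fn in defn:
        # An element may name a single question id or a range of them.
        id_group = [ids] if isinstance(ids, int) else ids
        for qid in id_group:
            ops[qid] = (default, fn)
    return ops


defn = [
    (range(151, 154, 1), 0, unsigned_integer),
    (160, "", str),  # question id 160 is an illustrative extra entry
]
ops = expand_defn(defn)
print(list(ops))  # [151, 152, 153, 160]
```

Declaration order is preserved, so the transforms are published in the order the subclass lists them.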
ImageTransformer
The SDX Image Transformer:
class sdx.common.transforms.ImageTransformer.ImageTransformer(logger, settings, survey, response_data, sequence_no=1000)
static extract_pdf_images(path, f_name)
Extract pages from a PDF document.
Parameters:
- path (str) – The location of the working directory.
- f_name (str) – The file name of the PDF document.
Returns: A sorted sequence of image file names.
This method delegates to the pdftoppm utility.
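Delegating to pdftoppm could look roughly like the sketch below. It is an assumption-laden illustration rather than the SDX code: the function names, the "page" output prefix and the choice of JPEG output are all hypothetical, and only the documented inputs (a working directory and a PDF file name) and the sorted result are taken from the description above.

```python
import os
import subprocess


def build_pdftoppm_command(path, f_name, prefix="page"):
    """Build the argument list: pdftoppm -jpeg <input.pdf> <output-root>."""
    return [
        "pdftoppm",
        "-jpeg",
        os.path.join(path, f_name),
        os.path.join(path, prefix),
    ]


def list_page_images(path, prefix="page"):
    """Return the sorted image file names that share the output prefix."""
    return sorted(
        name
        for name in os.listdir(path)
        if name.startswith(prefix) and name.endswith(".jpg")
    )


def extract_pdf_images_sketch(path, f_name, prefix="page"):
    """Run pdftoppm in the working directory, then collect the page images."""
    # check=True raises CalledProcessError if pdftoppm exits with an error.
    subprocess.run(build_pdftoppm_command(path, f_name, prefix), check=True)
    return list_page_images(path, prefix)
```

Keeping the command construction and the directory listing in separate functions lets both be checked without the pdftoppm binary installed.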
-
static