sensortoolkit.lib_utils._setup.SensorSetup

class SensorSetup(name, path=None)

Bases: sensortoolkit.lib_utils._setup._Setup

Interactive class for handling the sensor data ingestion process.

Users specify various attributes about sensor datasets, including column names for parameter data and timestamp entries. A renaming scheme is then constructed for converting the original naming scheme for columns into a standardized format for parameter names. The formatting for columns with date or time-like entries is then specified. The file type for sensor data is selected from a dictionary of valid data types that can be ingested.

Parameters
  • name (str) – The name assigned to the air sensor. Typically includes the sensor make (manufacturer) and model.

  • path (str, optional) – The path to the directory where the user intends to store data, figures, and reports relating to the sensor being tested. Defaults to None.

Methods

add_param_attrib

Assign parameter header attribute.

checkParamUnits

Prompt user to indicate whether units for passed parameter are the same as the preset units specified for the corresponding SDFS parameter.

config

Wrapper method for standard configuration setup.

copyDataSets

Copy recorded datasets from the selected file or folder location to the ../data/sensor_data/[sensor_name]/raw_data directory path.

exportSetup

Save the setup configuration to a setup.json file.

loadDataFile

Helper function for loading the first few rows of recorded datasets.

loadPreviousSetup

Ask the user if a previous setup config exists for the type of sensor or reference dataset that they are loading.

parseDataSets

Load the first few rows of recorded sensor datasets located in the ../data/sensor_data/[sensor_name]/raw_data directory path.

printSelectionBanner

Display a banner indicating the current configuration step.

selectDataSets

Choose the selection scheme for pointing to recorded data files.

setColumnHeaders

Manually set column headers if the user indicates 'None' for the row index for the column headers in setHeaderIndex().

setDataExtension

Select the file data extension for the datasets that will be loaded.

setDataRelPath

Assign the relative path for the recorded dataset subdirectory.

setDateTimeFormat

Configure the date/time formatting for date/time column(s) specified in setTimeHeaders().

setHeaderIndex

Select the integer index position for the row containing headers.

setParamHeaders

Select the SDFS parameters corresponding to column names discovered by parseDataSets().

setSerials

Indicate unique serial identifiers for each sensor unit tested.

setTimeHeaders

Specify the column(s) containing date/timestamp information.

setTimeZone

Select the time zone associated with the date/time column(s).

specifyCustomIngest

Ask the user whether a custom, prewritten ingestion module will be used to import sensor data instead of the standard_ingest() method.

Attributes

custom_params

data_types

pp

sdfs_params

add_param_attrib(param, attrib_key, attrib_val)

Assign parameter header attribute.

Search through the column index entries. If the parameter name is within a column index subdictionary, add the passed attribute key and value to that entry.

Parameters
  • param (str) – The name of the parameter.

  • attrib_key (str) – The key to assign to the subdictionary entry.

  • attrib_val (int, float, or str) – The value to assign to the subdictionary entry.

Returns

None.
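
The lookup described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the `col_headers` layout (column index keys mapping to subdictionaries of header names) and the `add_attrib` helper are assumptions made for the example.

```python
def add_attrib(col_headers, param, attrib_key, attrib_val):
    """Add attrib_key: attrib_val to every subdictionary entry named param."""
    for headers in col_headers.values():
        if param in headers:
            headers[param][attrib_key] = attrib_val


# Hypothetical layout: column index -> {header name -> attributes}
col_headers = {
    "col_idx_0": {"Time": {"header_class": "datetime"}},
    "col_idx_1": {"PM25": {"header_class": "parameter"}},
}

# Only the entry holding "PM25" receives the new key and value.
add_attrib(col_headers, "PM25", "unit_transform", 1)
```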

checkParamUnits(param, sdfs_param)

Prompt user to indicate whether units for passed parameter are the same as the preset units specified for the corresponding SDFS parameter.

Parameters
  • param (str) – The name of the parameter as logged in recorded datasets.

  • sdfs_param (str) – The name of the SDFS parameter corresponding to the recorded parameter.

Returns

A scalar quantity for converting the concentrations from the unit basis in which data were recorded to the unit basis for the SDFS parameter.

Return type

val (int, float, or NoneType)
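
To illustrate how a scalar conversion factor of this kind might be applied, the sketch below scales values recorded in one unit basis onto another. The `convert_units` helper, the ppm-to-ppb example, and the treatment of `None` as "no conversion needed" are all assumptions for illustration; the source does not state when `None` is returned.

```python
def convert_units(values, scale):
    """Scale recorded values onto the target unit basis.

    Assumption: a scale of None means no conversion is applied.
    """
    if scale is None:
        return list(values)
    return [v * scale for v in values]


# Assumed example: data recorded in ppm, target unit basis in ppb.
recorded_ppm = [0.5, 0.125]
converted_ppb = convert_units(recorded_ppm, 1000)
```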

config()

Wrapper method for standard configuration setup.

Utilized by both sensor and reference setup schemes.

Returns

None.

copyDataSets()

Copy recorded datasets from the selected file or folder location to the ../data/sensor_data/[sensor_name]/raw_data directory path.

Returns

None.

exportSetup()

Save the setup configuration to a setup.json file.

Returns

None.
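
As a rough picture of what saving a configuration to a setup.json file involves, the sketch below writes a dictionary to JSON and reads it back. The keys shown are hypothetical placeholders, not the actual contents of the file written by this method.

```python
import json
import os
import tempfile

# Hypothetical configuration entries, for illustration only.
setup = {"name": "Example_Make_Model", "dtype": ".csv", "header_index": 0}

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "setup.json")
    # Write the configuration as indented JSON.
    with open(path, "w") as f:
        json.dump(setup, f, indent=4)
    # Read it back to confirm the round trip preserves the entries.
    with open(path) as f:
        restored = json.load(f)
```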

loadDataFile(file, **kwargs)

Helper function for loading the first few rows of recorded datasets.

Parameters

file (str) – Full path to dataset file.

Keyword Arguments

nrows (int) – The number of rows to load for the passed dataset. Defaults to 1.

Raises

TypeError – If data type is not in the list of valid extensions.

Returns

A DataFrame containing the first few rows of recorded datasets.

Return type

df (pandas DataFrame)
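
The behavior described above (check the extension, then read only the first `nrows` rows) can be sketched with the standard library. This is an assumption-laden stand-in: the `load_head` helper and `VALID_EXTENSIONS` tuple are hypothetical, it handles only delimited text files, and it returns lists of strings rather than a pandas DataFrame.

```python
import csv
import itertools
import os
import tempfile

VALID_EXTENSIONS = (".csv", ".txt")  # assumed subset, for illustration


def load_head(file, nrows=1):
    """Return the first nrows rows of a delimited text file as lists."""
    ext = os.path.splitext(file)[1].lower()
    if ext not in VALID_EXTENSIONS:
        raise TypeError(f"Unsupported data type: {ext!r}")
    with open(file, newline="") as f:
        # islice stops reading after nrows rows.
        return list(itertools.islice(csv.reader(f), nrows))


# Write a small example file, then preview its first two rows.
with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "example.csv")
    with open(path, "w", newline="") as f:
        f.write("Time,PM25\n2021-01-01 00:00,5.2\n2021-01-01 00:01,5.4\n")
    preview = load_head(path, nrows=2)
```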

loadPreviousSetup()

Ask the user if a previous setup config exists for the type of sensor or reference dataset that they are loading. If they choose to use a previously configured setup.json file, use the file attributes to fill in various setup components, such as the parameter renaming, datetime formatting, etc.

Returns

None.

parseDataSets(print_banner=True)

Load the first few rows of recorded sensor datasets located in the ../data/sensor_data/[sensor_name]/raw_data directory path.

The names of column headers are located based on the indicated header index. A list of unique column headers is stored for subsequent reassignment of column header names.

Parameters

print_banner (bool, optional) – If 'True', a banner indicating the title of the section, user input options, and notes will be printed to the console. Defaults to True.

Returns

None.
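
A minimal sketch of collecting unique column headers across several dataset previews, assuming each preview is a list of rows and the header row sits at a known index. The `unique_headers` helper and the sample rows are hypothetical.

```python
def unique_headers(previews, header_index):
    """Collect unique header names, in first-seen order, across previews."""
    headers = []
    for rows in previews:
        for name in rows[header_index]:
            if name not in headers:
                headers.append(name)
    return headers


# Two hypothetical previews; row 0 is a metadata line, row 1 holds headers.
previews = [
    [["Logger v1.0"], ["Time", "PM25"], ["2021-01-01 00:00", "5.2"]],
    [["Logger v1.0"], ["Time", "PM25", "RH"], ["2021-01-01 00:00", "5.4", "40"]],
]
column_names = unique_headers(previews, header_index=1)
```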

printSelectionBanner(select_type, options=[], notes=[])

Display a banner indicating the current configuration step.

Parameters
  • select_type (str) – The title of the configuration section.

  • options (list, optional) –

    List of interactive options indicating keyword characters used to modify the state of the console and a description of what entering that keyword does. Defaults to [].

    Example

    >>> options = ['..press X to end adding entries',
                   '..press D to delete the previous entry']
    

  • notes (list, optional) – A list of strings containing notes or resources that may provide helpful context for the selection input or operation. Defaults to [].

Returns

None.
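
For a sense of how a banner with a title, options, and notes could be assembled, consider the sketch below. The `format_banner` helper, its `width` parameter, and the layout are assumptions; the actual console output of this method may look different. The sketch uses `None` defaults in place of list defaults as a design choice of the example.

```python
def format_banner(select_type, options=None, notes=None, width=60):
    """Build a banner with a centered title, then options and notes."""
    options = options or []
    notes = notes or []
    rule = "=" * width
    lines = [rule, select_type.center(width), rule]
    if options:
        lines.append("Options")
        lines.extend(f"  {opt}" for opt in options)
    if notes:
        lines.append("Notes")
        lines.extend(f"  {note}" for note in notes)
    lines.append(rule)
    return "\n".join(lines)


banner = format_banner(
    "Set Time Zone",
    options=["..press X to end adding entries"],
)
print(banner)
```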

selectDataSets()

Choose the selection scheme for pointing to recorded data files.

Selection options include the following:

  • 'directory', which will locate and copy all of the data files in the specified directory for the indicated data type

  • 'recursive directory', which will locate and copy all data files within the specified directory and any subdirectories contained within the indicated folder path

  • 'files', which will copy only the files that the user manually selects within a directory

Returns

None.
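
The difference between the 'directory' and 'recursive directory' schemes can be sketched with `pathlib`. The `find_files` helper and the file names are hypothetical, and the sketch only lists matching files rather than copying them.

```python
import tempfile
from pathlib import Path


def find_files(folder, extension, recursive=False):
    """List file names with the given extension, optionally in subfolders."""
    pattern = f"*{extension}"
    folder = Path(folder)
    matches = folder.rglob(pattern) if recursive else folder.glob(pattern)
    return sorted(p.name for p in matches if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "unit_a.csv").write_text("x\n")
    (root / "notes.txt").write_text("x\n")
    (root / "nested").mkdir()
    (root / "nested" / "unit_b.csv").write_text("x\n")

    # 'directory': only files directly inside the folder.
    top_level = find_files(root, ".csv")
    # 'recursive directory': the folder and all of its subdirectories.
    all_levels = find_files(root, ".csv", recursive=True)
```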

setColumnHeaders(print_banner=True)

Manually set column headers if the user indicates 'None' for the row index for the column headers in setHeaderIndex().

Parameters

print_banner (bool, optional) – If 'True', a banner indicating the title of the section, user input options, and notes will be printed to the console. Defaults to True.

Raises

ValueError – Raised if the value of the entered index is invalid (less than zero).

Returns

None.

setDataExtension()

Select the file data extension for the datasets that will be loaded.

Choose the corresponding data file type for recorded datasets from '.csv', '.txt', or '.xlsx'.

Returns

None.

setDataRelPath()

Assign the relative path for the recorded dataset subdirectory.

The relative path stems from the project path.

For sensor data, the relative path to raw (recorded) datasets should appear something like: /data/sensor_data/[sensor_name]/raw_data where ‘sensor_name’ is the name given to the air sensor.

For reference datasets, the relative path to raw (recorded) datasets should appear something like: /data/reference_data/[reference_data_source]/raw/[sitename_siteid] where ‘reference_data_source’ is the source or API service from which data were acquired, ‘sitename’ is the name given to the site, and ‘siteid’ is the AQS ID for the site (if applicable).

Returns

None.
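
The two path layouts described above can be assembled as in the sketch below. The helper names and the example sensor, source, and site values are placeholders chosen for illustration.

```python
from pathlib import PurePosixPath


def sensor_raw_path(sensor_name):
    """Relative path to raw sensor datasets, following the layout above."""
    return PurePosixPath("/data/sensor_data") / sensor_name / "raw_data"


def reference_raw_path(source, site_name, site_id):
    """Relative path to raw reference datasets, following the layout above."""
    site_folder = f"{site_name}_{site_id}"
    return PurePosixPath("/data/reference_data") / source / "raw" / site_folder


# Placeholder names; substitute the actual sensor, source, and site values.
sensor_path = str(sensor_raw_path("Example_Make_Model"))
ref_path = str(reference_raw_path("example_source", "Example_Site", "000000000"))
```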

setDateTimeFormat()

Configure the date/time formatting for date/time column(s) specified in setTimeHeaders().

Returns

None.
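
As an illustration of what a date/time format specification does, the sketch below parses a timestamp string with the standard library. It assumes formats are written with strftime-style directives; the specific format string and sample timestamp are made up for the example.

```python
from datetime import datetime

# Assumed layout of the timestamp column: year-month-day hour:minute:second.
time_format = "%Y-%m-%d %H:%M:%S"

# Parsing succeeds only if the string matches the format exactly.
stamp = datetime.strptime("2021-06-01 13:45:00", time_format)
```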

setHeaderIndex(print_banner=True)

Select the integer index position for the row containing headers.

Parameters

print_banner (bool, optional) – If 'True', a banner indicating the title of the section, user input options, and notes will be printed to the console. Defaults to True.

Returns

None.

setParamHeaders(print_banner=True)

Select the SDFS parameters corresponding to column names discovered by parseDataSets().

A parameter renaming dictionary is created for reassigning the names of header labels.

Parameters

print_banner (bool, optional) – If 'True', a banner indicating the title of the section, user input options, and notes will be printed to the console. Defaults to True.

Returns

None.
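
A parameter renaming dictionary of the kind described above maps original header labels to standardized names. In the sketch below, the original labels and the target names are placeholders; consult the SDFS parameter list for the actual standardized names.

```python
# Hypothetical mapping: original header label -> standardized name.
rename = {"PM2.5 (ug/m3)": "PM25", "Temp (C)": "Temp"}

# One recorded row, keyed by the original header labels.
row = {"PM2.5 (ug/m3)": 5.2, "Temp (C)": 21.0, "Unused": 0}

# Labels without a mapping are left unchanged.
renamed = {rename.get(col, col): val for col, val in row.items()}
```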

setSerials()

Indicate unique serial identifiers for each sensor unit tested.

The identifying keyword for each sensor unit should be indicated within the recorded sensor dataset file names.

Returns

None.
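
Because each serial keyword is expected to appear in the corresponding file names, files can be matched to sensor units by substring. The sketch below assumes a dictionary of unit IDs to serial keywords; the `group_by_serial` helper, the serial values, and the file names are hypothetical.

```python
def group_by_serial(filenames, serials):
    """Group file names by the serial keyword they contain."""
    grouped = {unit: [] for unit in serials}
    for name in filenames:
        for unit, keyword in serials.items():
            if keyword in name:
                grouped[unit].append(name)
    return grouped


# Hypothetical unit ID -> serial keyword mapping and file names.
serials = {"1": "SN01", "2": "SN02"}
filenames = ["Example_SN01_raw.csv", "Example_SN02_raw.csv", "readme.txt"]
grouped = group_by_serial(filenames, serials)
```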

setTimeHeaders(print_banner=True)

Specify the column(s) containing date/timestamp information.

Parameters

print_banner (bool, optional) – If 'True', a banner indicating the title of the section, user input options, and notes will be printed to the console. Defaults to True.

Returns

None.

setTimeZone()

Select the time zone associated with the date/time column(s).

Time zones should be valid time zone names recognized by the pytz library.

Returns

None.

specifyCustomIngest()

Ask the user whether a custom, prewritten ingestion module will be used to import sensor data instead of the standard_ingest() method.

Returns

None.