sensortoolkit.lib_utils._setup.ReferenceSetup

class ReferenceSetup(path)[source]

Bases: sensortoolkit.lib_utils._setup._Setup

Interactive class for handling the reference data ingestion process.

Parameters

path (str, optional) – The path to the directory where the user intends to store data, figures, and reports relating to the sensor being tested. Defaults to None.

Methods

add_param_attrib

Assign parameter header attribute.

checkParamUnits

Prompt user to indicate whether units for passed parameter are the same as the preset units specified for the corresponding SDFS parameter.

config

Wrapper method for standard configuration setup.

copyDataSets

Copy recorded datasets from the selected file or folder location to the ../data/sensor_data/[sensor_name]/raw_data directory path.

displayMethods

Helper function for printing an abbreviated dataset of reference methods corresponding to the indicated parameter.

exportSetup

Save the setup configuration to a setup.json file.

loadDataFile

Helper function for loading the first few rows of recorded datasets.

loadPreviousSetup

Ask the user if a previous setup config exists for the type of sensor or reference dataset that they are loading.

localRefDataIngest

Wrapper method for ingesting reference datasets acquired locally.

parseDataSets

Load the first few rows of recorded sensor datasets located in the ../data/sensor_data/[sensor_name]/raw_data directory path.

printSelectionBanner

Display a banner indicating the current configuration step.

processAirNowTech

Wrapper method for calling the sensortoolkit.reference.preprocess_airnowtech() method for converting downloaded AirNowTech datasets to SDFS format.

selectDataSets

Choose the selection scheme for pointing to recorded data files.

selectDataSource

Select the service/source from which reference data were acquired.

setColumnHeaders

Manually set column headers if the user indicates 'None' for the row index for the column headers in setHeaderIndex().

setDataExtension

Select the file extension for the datasets that will be loaded.

setDataRelPath

Assign the relative path for the recorded dataset subdirectory.

setDateTimeFormat

Configure the date/time formatting for date/time column(s) specified in setTimeHeaders().

setHeaderIndex

Select the integer index position for the row containing headers.

setParamHeaders

Select the SDFS parameters corresponding to column names discovered by parseDataSets().

setParamMetaCols

Prompt user to enter various parameter metadata attributes.

setSiteInfo

Prompt user to enter various site attributes.

setTimeHeaders

Specify the column(s) containing date/timestamp information.

setTimeZone

Select the time zone associated with the date/time column(s).

specifyCustomIngest

Ask the user whether a custom, prewritten ingestion module will be used to import sensor data instead of the standard_ingest() method.

Attributes

api_services

critera_params

criteria_lookup

criteria_methods_path

custom_params

data_types

met_lookup

met_methods_path

pp

sdfs_params

add_param_attrib(param, attrib_key, attrib_val)

Assign parameter header attribute.

Search through the column index entries. If the parameter name is found within a column index subdictionary, add the passed attribute key and value to that entry.

Parameters
  • param (str) – The name of the parameter.

  • attrib_key (str) – The key to assign to the subdictionary entry.

  • attrib_val (int, float, or str) – The value to assign to the subdictionary entry.

Returns

None.
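The lookup described above can be sketched as follows. This is a minimal illustration, not the library's implementation: the nested dictionary layout (column index, then parameter name, then attributes) and the standalone function taking `col_headers` as an argument are assumptions made for the example.

```python
def add_param_attrib(col_headers, param, attrib_key, attrib_val):
    """Add attrib_key/attrib_val to each subdictionary that lists param."""
    for entries in col_headers.values():
        if param in entries:
            entries[param][attrib_key] = attrib_val


# Hypothetical layout: column index -> {parameter name -> attributes}
col_headers = {
    "col_idx_0": {"Timestamp": {"header_class": "datetime"}},
    "col_idx_1": {"PM25_ugm3": {"header_class": "parameter"}},
}

add_param_attrib(col_headers, "PM25_ugm3", "sdfs_param", "PM25")
print(col_headers["col_idx_1"])
```

Entries that do not contain the parameter name are left unchanged.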

checkParamUnits(param, sdfs_param)

Prompt user to indicate whether units for passed parameter are the same as the preset units specified for the corresponding SDFS parameter.

Parameters
  • param (str) – The name of the parameter as logged in recorded datasets.

  • sdfs_param (str) – The name of the SDFS parameter corresponding to the recorded parameter.

Returns

A scalar quantity for converting the concentrations from the unit basis in which data were recorded to the unit basis for the SDFS parameter.

Return type

val (int, float, or Nonetype)
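As an illustration of how such a scalar might be applied, the sketch below multiplies recorded values by the conversion factor. The helper name `apply_unit_scale` is hypothetical, and treating a None value as "units already match" is an assumption not stated in the documentation.

```python
def apply_unit_scale(values, scale):
    """Multiply recorded values by scale.

    Assumption: a scale of None means the recorded units already match
    the SDFS units, so values are returned unchanged.
    """
    if scale is None:
        return list(values)
    return [value * scale for value in values]


# Example: data recorded in ppm, target unit basis in ppb (factor of 1000)
recorded_ppm = [0.5, 1.25, 2.0]
converted_ppb = apply_unit_scale(recorded_ppm, 1000)
print(converted_ppb)
```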

config()

Wrapper method for standard configuration setup.

Utilized by both sensor and reference setup schemes.

Returns

None.

copyDataSets()

Copy recorded datasets from the selected file or folder location to the ../data/sensor_data/[sensor_name]/raw_data directory path.

Returns

None.

displayMethods(param_code, lookup_data)[source]

Helper function for printing an abbreviated dataset of reference methods corresponding to the indicated parameter.

Parameters
  • param_code (int) – AQS parameter code.

  • lookup_data (pandas DataFrame) – AQS method code lookup table containing a list of FRM/FEM reference methods.

Returns

A table containing a listing of reference methods designated FRM/FEMs for the indicated parameter.

Return type

table (pandas DataFrame)

exportSetup()

Save the setup configuration to a setup.json file.

Returns

None.
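A rough sketch of writing a configuration to a setup.json file is shown below. The configuration keys and the `export_setup` helper are hypothetical; the attributes the library actually serializes are not listed here.

```python
import json
import tempfile
from pathlib import Path


def export_setup(config, folder):
    """Write config to <folder>/setup.json and return the file path."""
    path = Path(folder) / "setup.json"
    with open(path, "w", encoding="utf-8") as f:
        json.dump(config, f, indent=4)
    return path


# Hypothetical configuration entries, for illustration only
config = {"name": "example_site", "file_extension": ".csv", "header_iloc": 0}

with tempfile.TemporaryDirectory() as tmp:
    saved = export_setup(config, tmp)
    saved_name = saved.name
    # Read the file back to confirm the round trip
    restored = json.loads(saved.read_text(encoding="utf-8"))
```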

loadDataFile(file, **kwargs)

Helper function for loading the first few rows of recorded datasets.

Parameters

file (str) – Full path to dataset file.

Keyword Arguments

nrows (int) – The number of rows to load for the passed dataset. Defaults to 1.

Raises

TypeError – If data type is not in the list of valid extensions.

Returns

A DataFrame containing the first few rows of recorded datasets.

Return type

df (pandas DataFrame)
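The behavior described above (check the extension, then read only the first rows) can be sketched with the standard library. The method itself returns a pandas DataFrame; this dependency-free version returns a list of rows instead, and both the `load_first_rows` name and the `VALID_EXTENSIONS` tuple are assumptions for the example.

```python
import csv
import tempfile
from itertools import islice
from pathlib import Path

VALID_EXTENSIONS = (".csv", ".txt")  # assumed subset for this sketch


def load_first_rows(file, nrows=1):
    """Return the first nrows rows of a delimited text file."""
    if Path(file).suffix.lower() not in VALID_EXTENSIONS:
        raise TypeError(f"Unsupported file type: {file}")
    with open(file, newline="", encoding="utf-8") as f:
        return list(islice(csv.reader(f), nrows))


with tempfile.TemporaryDirectory() as tmp:
    sample = Path(tmp) / "sample.csv"
    sample.write_text(
        "time,pm25\n2021-01-01 00:00,5.1\n2021-01-01 01:00,6.3\n",
        encoding="utf-8",
    )
    preview = load_first_rows(sample, nrows=2)

print(preview)
```

Reading only a few rows keeps the preview fast even for large recorded datasets.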

loadPreviousSetup()

Ask the user if a previous setup config exists for the type of sensor or reference dataset that they are loading. If they choose to use a previously configured setup.json file, use the file attributes to fill in various setup components, such as the parameter renaming, datetime formatting, etc.

Returns

None.

localRefDataIngest()[source]

Wrapper method for ingesting reference datasets acquired locally.

Datasets are ingested into SDFS format via the sensortoolkit.ingest.standard_ingest() method and processed datasets are grouped into one of three parameter classifications ('PM', 'Gases', or 'Met'). These datasets are then saved in monthly intervals to the ../data/reference_data/local/[sitename_siteid]/processed directory path.

Returns

None.
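To illustrate what saving in monthly intervals involves, the sketch below groups timestamped records by calendar month. It is only a schematic: the `group_by_month` helper and the `YYYYMM` key format are assumptions, and the library's own file naming is not described here.

```python
from collections import defaultdict
from datetime import datetime


def group_by_month(rows):
    """Group (timestamp, value) pairs under a 'YYYYMM' key."""
    groups = defaultdict(list)
    for timestamp, value in rows:
        groups[timestamp.strftime("%Y%m")].append((timestamp, value))
    return dict(groups)


rows = [
    (datetime(2021, 1, 15, 0), 4.2),
    (datetime(2021, 1, 31, 23), 5.0),
    (datetime(2021, 2, 1, 0), 6.1),
]
monthly = group_by_month(rows)
print(sorted(monthly))
```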

parseDataSets(print_banner=True)

Load the first few rows of recorded sensor datasets located in the ../data/sensor_data/[sensor_name]/raw_data directory path.

The names of column headers are located based on the indicated header index. A list of unique column headers is stored for subsequent reassignment of column header names.

Parameters

print_banner (bool, optional) – If 'True', a banner indicating the title of the section, user input options, and notes will be printed to the console. Defaults to True.

Returns

None.

printSelectionBanner(select_type, options=[], notes=[])

Display a banner indicating the current configuration step.

Parameters
  • select_type (str) – The title of the configuration section.

  • options (list, optional) –

    List of interactive options indicating keyword characters used to modify the state of the console and a description of what entering that keyword does. Defaults to [].

    Example

    >>> options = ['..press X to end adding entries',
                   '..press D to delete the previous entry']
    

  • notes (list, optional) – A list of strings containing notes or resources that may provide helpful context for the selection input or operation. Defaults to [].

Returns

None.
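A banner of this kind might be assembled as in the sketch below. The layout, the `width` parameter, and the `format_banner` name are assumptions; the actual console output of the method may differ. The sketch uses None defaults rather than mutable list defaults, which is the safer Python idiom.

```python
def format_banner(select_type, options=None, notes=None, width=60):
    """Build a text banner with a centered title, options, and notes."""
    lines = ["=" * width, select_type.center(width), "=" * width]
    if options:
        lines.append("Options")
        lines.extend(f"  {option}" for option in options)
    if notes:
        lines.append("Notes")
        lines.extend(f"  {note}" for note in notes)
    return "\n".join(lines)


banner = format_banner(
    "Select Data Sets",
    options=[
        "..press X to end adding entries",
        "..press D to delete the previous entry",
    ],
)
print(banner)
```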

processAirNowTech()[source]

Wrapper method for calling the sensortoolkit.reference.preprocess_airnowtech() method for converting downloaded AirNowTech datasets to SDFS format.

Returns

None.

selectDataSets()

Choose the selection scheme for pointing to recorded data files.

Selection options include the following:

  • 'directory', which will locate and copy all of the data files in the specified directory for the indicated data type

  • 'recursive directory', which will locate and copy all data files within the specified directory and any subdirectories contained within the indicated folder path

  • 'files', which copies over files that the user manually selects within a directory

Returns

None.
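The difference between the 'directory' and 'recursive directory' schemes can be sketched with pathlib. The `find_data_files` helper is hypothetical and only locates files; the copy step and the manual 'files' scheme are omitted.

```python
import tempfile
from pathlib import Path


def find_data_files(folder, extension, scheme="directory"):
    """Locate files with the given extension under folder."""
    folder = Path(folder)
    if scheme == "directory":
        matches = folder.glob(f"*{extension}")    # top level only
    elif scheme == "recursive directory":
        matches = folder.rglob(f"*{extension}")   # includes subdirectories
    else:
        raise ValueError(f"Unknown selection scheme: {scheme}")
    return sorted(p for p in matches if p.is_file())


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)
    (root / "nested").mkdir()
    (root / "a.csv").write_text("x\n")
    (root / "nested" / "b.csv").write_text("x\n")
    (root / "notes.txt").write_text("x\n")

    top_level = [p.name for p in find_data_files(root, ".csv")]
    recursive = [
        p.name for p in find_data_files(root, ".csv", "recursive directory")
    ]

print(top_level, recursive)
```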

selectDataSource()[source]

Select the service/source from which reference data were acquired.

Choose from the following options:

  • 'local': Data files acquired locally (e.g., local transfer from agency overseeing reference instrumentation at air monitoring site).

  • 'airnowtech': User has downloaded files from the AirNowTech system and has saved files locally to the user’s system.

  • 'aqs': User will query EPA’s Air Quality System (AQS) API for reference data.

  • 'airnow': User will query the AirNow API for reference data.

Returns

None.

setColumnHeaders(print_banner=True)

Manually set column headers if the user indicates 'None' for the row index for the column headers in setHeaderIndex().

Parameters

print_banner (bool, optional) – If 'True', a banner indicating the title of the section, user input options, and notes will be printed to the console. Defaults to True.

Raises

ValueError – Raised if the entered index is invalid (less than zero).

Returns

None.

setDataExtension()

Select the file extension for the datasets that will be loaded.

Choose the corresponding data file type for recorded datasets from '.csv', '.txt', or '.xlsx'.

Returns

None.

setDataRelPath()

Assign the relative path for the recorded dataset subdirectory.

The relative path stems from the project path.

For sensor data, the relative path to raw (recorded) datasets should look something like: /data/sensor_data/[sensor_name]/raw_data where ‘sensor_name’ is the name given to the air sensor.

For reference datasets, the relative path to raw (recorded) datasets should look something like: /data/reference_data/[reference_data_source]/raw/[sitename_siteid] where ‘reference_data_source’ is the source or API service from which data were acquired, ‘sitename’ is the name given to the site, and ‘siteid’ is the AQS ID for the site (if applicable).

Returns

None.
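The two path patterns above can be sketched with pathlib. The helper names and the example sensor, source, and site values are hypothetical; only the directory patterns come from the description.

```python
from pathlib import PurePosixPath


def sensor_rel_path(sensor_name):
    """Relative path to raw sensor datasets."""
    return PurePosixPath("/data/sensor_data") / sensor_name / "raw_data"


def reference_rel_path(source, site_name, site_id):
    """Relative path to raw reference datasets for a site."""
    site_folder = f"{site_name}_{site_id}"
    return PurePosixPath("/data/reference_data") / source / "raw" / site_folder


sensor_path = str(sensor_rel_path("Example_Sensor"))
ref_path = str(reference_rel_path("airnowtech", "Example_Site", "000000000"))
print(sensor_path)
print(ref_path)
```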

setDateTimeFormat()

Configure the date/time formatting for date/time column(s) specified in setTimeHeaders().

Returns

None.
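As an illustration, and assuming the format is expressed with strftime-style directives as is conventional in Python, the sketch below parses a single timestamp column and a pair of separate date and time columns. The sample strings and formats are made up for the example.

```python
from datetime import datetime

# Single column holding both date and time
single_format = "%Y-%m-%d %H:%M:%S"
parsed_single = datetime.strptime("2021-03-05 14:30:00", single_format)

# Separate date and time columns, joined before parsing
date_format = "%m/%d/%Y"
time_format = "%H:%M"
parsed_split = datetime.strptime(
    "03/05/2021" + " " + "14:30",
    f"{date_format} {time_format}",
)

print(parsed_single, parsed_split)
```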

setHeaderIndex(print_banner=True)

Select the integer index position for the row containing headers.

Parameters

print_banner (bool, optional) – If 'True', a banner indicating the title of the section, user input options, and notes will be printed to the console. Defaults to True.

Returns

None.

setParamHeaders(print_banner=True)

Select the SDFS parameters corresponding to column names discovered by parseDataSets().

A parameter renaming dictionary is created for reassigning the names of header labels.

Parameters

print_banner (bool, optional) – If 'True', a banner indicating the title of the section, user input options, and notes will be printed to the console. Defaults to True.

Returns

None.
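A parameter renaming dictionary of this kind maps recorded header labels to SDFS parameter names. The sketch below applies such a mapping to a list of headers; the header labels and the mapping itself are illustrative assumptions.

```python
# Hypothetical mapping from recorded header labels to SDFS parameter names
rename = {"PM2.5 (ug/m3)": "PM25", "Temp (C)": "Temp"}

headers = ["Timestamp", "PM2.5 (ug/m3)", "Temp (C)"]

# Headers without an entry in the mapping keep their original name
renamed = [rename.get(header, header) for header in headers]
print(renamed)
```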

setParamMetaCols(param, sdfs_param)[source]

Prompt user to enter various parameter metadata attributes.

The user is prompted to enter the following attributes:

  • Units

  • Parameter AQS Code

  • Reference Method Code

  • Parameter Occurrence Code

Parameters
  • param (str) – The name of the parameter as it appears in recorded datasets.

  • sdfs_param (str) – The corresponding SDFS parameter name.

Returns

None.

setSiteInfo()[source]

Prompt user to enter various site attributes.

The user is prompted to provide the following site attributes:

  • Site name

  • Agency overseeing site

  • Site AQS ID

  • Site latitude

  • Site longitude

Returns

None.

setTimeHeaders(print_banner=True)

Specify the column(s) containing date/timestamp information.

Parameters

print_banner (bool, optional) – If 'True', a banner indicating the title of the section, user input options, and notes will be printed to the console. Defaults to True.

Returns

None.

setTimeZone()

Select the time zone associated with the date/time column(s).

Timezones should be valid timezone names recognized by the pytz library.

Returns

None.

specifyCustomIngest()

Ask the user whether a custom, prewritten ingestion module will be used to import sensor data instead of the standard_ingest() method.

Returns

None.