sensortoolkit.calculate._regression_stats.regression_stats
- regression_stats(sensor_df_obj, ref_df_obj, deploy_dict, param, serials, verbose=True)
Compute OLS regression statistics.
Module is used to compute the following regressions:
Sensor vs. FRM/FEM
Sensor vs. Inter-sensor average
For each instance, the dependent and independent variables are assigned as hourly/daily sensor data vs. hourly/daily reference data. Note that the ref_df_obj object can be either a DataFrame containing FRM/FEM concentrations or a DataFrame containing intersensor averages, depending on the use case. The ‘ref’ label refers more to the fact that the dataset is used as the independent variable for regressions.
Note
The DataFrames within the sensor_df_obj and ref_df_obj arguments should contain data reported at the same sampling frequency (e.g., if a sensor DataFrame containing data at 1-hour averaged intervals is passed to sensor_df_obj, the reference DataFrame passed to ref_df_obj must also contain data at 1-hour averaged intervals).
- Parameters
sensor_df_obj (pandas DataFrame or list of pandas DataFrames) – Either a DataFrame or list of DataFrames containing sensor parameter measurements. Data corresponding to the passed parameter name are used as the dependent variable.
ref_df_obj (pandas DataFrame) – Reference DataFrame (either FRM/FEM or inter-sensor averages). Data corresponding to the passed parameter name are used as the independent variable.
deploy_dict (dict) – A dictionary containing descriptive statistics and textual information about the deployment (testing agency, site, time period, etc.), sensors tested, and site conditions during the evaluation.
param (str) – Parameter name for which to compute regression statistics.
serials (dict) – A dictionary of sensor serial identifiers for each unit in a testing group.
verbose (bool) – If True, print statements are displayed in the console output. Defaults to True.
- Returns
Statistics DataFrame for either sensor vs. FRM/FEM or sensor vs. intersensor mean OLS regression.
- Return type
stats_df (pandas DataFrame)
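The sampling-frequency requirement and the roles of the two variables can be illustrated with a minimal sketch. This is not sensortoolkit's implementation: it uses only pandas and NumPy, the data are synthetic, and the column name PM25, the helper ols_stats, and the 15-minute sensor interval are assumptions made purely for illustration. The sensor data are first averaged to the reference's 1-hour interval, then paired on timestamp and fitted with the reference as the independent variable.

```python
import numpy as np
import pandas as pd

# Synthetic, hypothetical data: a reference monitor reporting hourly and
# a sensor reporting every 15 minutes over the same 24 hours.
idx_1h = pd.date_range("2021-01-01", periods=24, freq="60min")
idx_15min = pd.date_range("2021-01-01", periods=96, freq="15min")

rng = np.random.default_rng(0)
ref_df = pd.DataFrame({"PM25": rng.uniform(5, 35, size=24)}, index=idx_1h)

# Noise-free sensor signal: 1.2 * reference + 2.0, with each hourly value
# repeated for the four 15-minute samples that fall within that hour.
sensor_vals = np.repeat(1.2 * ref_df["PM25"].to_numpy() + 2.0, 4)
sensor_df = pd.DataFrame({"PM25": sensor_vals}, index=idx_15min)


def ols_stats(sensor, ref, param):
    """Hypothetical helper: OLS of sensor (dependent) on ref (independent)."""
    # Pair values on timestamp and drop rows missing either measurement.
    paired = pd.concat(
        [ref[param].rename("ref"), sensor[param].rename("sensor")], axis=1
    ).dropna()
    x = paired["ref"].to_numpy()
    y = paired["sensor"].to_numpy()
    slope, intercept = np.polyfit(x, y, deg=1)
    r_squared = np.corrcoef(x, y)[0, 1] ** 2
    return paired, slope, intercept, r_squared


# Average the sensor data to 1-hour intervals so both DataFrames share
# the same sampling frequency before regressing.
sensor_1h = sensor_df.resample("60min").mean()

paired, slope, intercept, r_squared = ols_stats(sensor_1h, ref_df, "PM25")
print(f"n={len(paired)} slope={slope:.3f} intercept={intercept:.3f} R^2={r_squared:.3f}")
```

Passing the 15-minute sensor_df directly would pair only the samples that happen to fall on the hour, which is one reason matching the averaging interval beforehand matters.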