dosertools.data_processing package
Submodules
dosertools.data_processing.array module
- dosertools.data_processing.array.closest_index_for_value(dataset: pandas.core.frame.DataFrame, column: str, value: float) -> int
Finds the closest value for a given value in a column and returns its index.
- Parameters
dataset (pandas.DataFrame) – the dataframe that contains at least the column “column”; that column must be numeric (int or float)
column (str) – name of the column in which to look for the closest value
value (float) – numeric value to compare entries in dataset[column] to
- Returns
closest_index_for_value (int) – Closest index in dataset[column] for given value
Examples
Given dataframe ‘df’ with column ‘a’ with values [-1,0,1,2] and ‘b’ with values [‘c’,1,1,1.2], the following would result from use of the function.
closest_index_for_value(df, 'a', 1.1) = 2
closest_index_for_value(df, 'a', 1.9) = 3
closest_index_for_value(df, 'b', 1.1) --> TypeError
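The lookup described above can be sketched with pandas as follows. This is a minimal illustration, not the dosertools implementation: the name closest_index_sketch and the explicit dtype check are assumptions made for the example.

```python
import pandas as pd


def closest_index_sketch(dataset: pd.DataFrame, column: str, value: float) -> int:
    """Return the index label of the entry in dataset[column] nearest to value."""
    # Assumption: non-numeric columns are rejected with an explicit TypeError.
    if not pd.api.types.is_numeric_dtype(dataset[column]):
        raise TypeError(f"column {column!r} must be numeric")
    # Absolute distance of every entry from the target value.
    distances = (dataset[column] - value).abs()
    # idxmin returns the index label of the smallest distance.
    return int(distances.idxmin())


df = pd.DataFrame({"a": [-1, 0, 1, 2], "b": ["c", 1, 1, 1.2]})
index_near_1p1 = closest_index_sketch(df, "a", 1.1)
index_near_1p9 = closest_index_sketch(df, "a", 1.9)
```

With the dataframe from the example, the two calls reproduce the documented results for column ‘a’.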
- dosertools.data_processing.array.continuous_nonzero(array: numpy.ndarray) -> numpy.ndarray
Returns array with index pairs indicating blocks of nonzero in given array.
Returns an array with shape (m, 2), where m is the number of “blocks” of non-zeros. The first column is the index of the first non-zero in each block; the second is the index of the first zero following the block. If a block reaches the end of the array, the second index is the size of the array (one past the last index), e.g. [0,0,-1,1,-1,1] yields [[2,6]]. This follows the numpy slicing convention, where array[a:a+n] yields the values at indices a through a+n-1.
- Parameters
array (np.ndarray) – array in which to look for nonzero blocks; the array must be numeric (integer or float)
- Returns
continuous_nonzero (np.ndarray) – (m, 2) array where m is the number of “blocks” of non-zeros. The first column is the index of the first non-zero, the second is the index of the first zero following the block.
Examples
array                  continuous_nonzero(array)
[1,1,1,1,0,0,1,1,0]    [[0,4],[6,8]]
[0,0,-1,1,-1,1]        [[2,6]]
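One common way to find such blocks is to pad a boolean mask and look for its edges with np.diff. The sketch below illustrates that technique under the half-open index convention shown in the examples; the function name nonzero_blocks_sketch is hypothetical and the code is not taken from dosertools.

```python
import numpy as np


def nonzero_blocks_sketch(array: np.ndarray) -> np.ndarray:
    """Return (m, 2) start/stop index pairs for runs of nonzero values."""
    # Pad the nonzero mask with False on both sides so that blocks touching
    # either end of the array still produce a start edge and a stop edge.
    mask = np.concatenate(([False], np.asarray(array) != 0, [False]))
    # After casting to int, diff is +1 where a block starts and -1 where it
    # stops; both are nonzero, so flatnonzero lists the edges in order.
    edges = np.flatnonzero(np.diff(mask.astype(int)))
    # Edges alternate start, stop, start, stop, ...
    return edges.reshape(-1, 2)


blocks_a = nonzero_blocks_sketch(np.array([1, 1, 1, 1, 0, 0, 1, 1, 0]))
blocks_b = nonzero_blocks_sketch(np.array([0, 0, -1, 1, -1, 1]))
```

Replacing the `!= 0` comparison with `== 0` gives the corresponding blocks of zeros.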
- dosertools.data_processing.array.continuous_zero(array: numpy.ndarray) -> numpy.ndarray
Returns array with index pairs indicating blocks of zero in given array.
Returns an array with shape (m, 2), where m is the number of “blocks” of zeros. The first column is the index of the first zero in each block; the second is the index of the first non-zero following the block. If a block reaches the end of the array, the second index is the size of the array (one past the last index), e.g. [1,1,1,1,0,0,1,1,0] yields [[4,6],[8,9]]. This follows the numpy slicing convention, where array[a:a+n] yields the values at indices a through a+n-1.
- Parameters
array (np.ndarray) – array in which to look for zero runs; the array must be numeric (integer or float)
- Returns
nonzero_runs (np.ndarray) – (m, 2) array where m is the number of “blocks” of zeros. The first column is the index of the first zero, the second is the index of the first non-zero following the block.
Examples
array                  continuous_zero(array)
[1,1,1,1,0,0,1,1,0]    [[4,6],[8,9]]
[0,0,-1,1,-1,1]        [[0,2]]
- dosertools.data_processing.array.is_array_numeric(array: numpy.ndarray) -> bool
Return True if array is float or int (numeric), otherwise False
- Parameters
array (np.ndarray) – array to check if numeric
- Returns
is_array_numeric (bool) – True if array is float or signed/unsigned int, otherwise False
Examples
is_array_numeric([0,1,2,3]) = True
is_array_numeric([1.1,1.2,1.5]) = True
is_array_numeric(['a','b','c']) = False
is_array_numeric([True,False,False]) = False
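A check with this behavior can be written against the numpy dtype kind code. The following is a sketch of that idea, assuming list inputs are converted with np.asarray; the name is_numeric_sketch is hypothetical and this is not necessarily how dosertools implements the check.

```python
import numpy as np


def is_numeric_sketch(array) -> bool:
    """True for signed/unsigned integer or float data, False otherwise."""
    # dtype.kind is a one-character code: 'i' signed int, 'u' unsigned int,
    # 'f' float. Booleans ('b') and strings ('U') are deliberately excluded.
    return np.asarray(array).dtype.kind in "iuf"


checks = [
    is_numeric_sketch([0, 1, 2, 3]),
    is_numeric_sketch([1.1, 1.2, 1.5]),
    is_numeric_sketch(["a", "b", "c"]),
    is_numeric_sketch([True, False, False]),
]
```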
- dosertools.data_processing.array.is_dataframe_column_numeric(dataset: pandas.core.frame.DataFrame, column: str) -> bool
Return True if column in dataset is float or int (numeric), otherwise False
- Parameters
dataset (pandas.DataFrame) – the dataframe that contains at least the column “column”
column (str) – name of column to check if numeric
- Returns
is_dataframe_column_numeric (bool) – True if column in dataset is float or int, otherwise False
Examples
Given dataframe ‘df’ with column ‘a’ with values [-1,0,1,2] and ‘b’ with values [‘c’,1,1,1.2], the following would result from use of the function.
is_dataframe_column_numeric(df, 'a') = True
is_dataframe_column_numeric(df, 'b') = False
is_dataframe_column_numeric(df, 'c') --> KeyError
dosertools.data_processing.csv module
- dosertools.data_processing.csv.csv_to_dataframe(csv: str, fname_format: str, sampleinfo_format: str, optional_settings: dict = {}) -> pandas.core.frame.DataFrame
Reads in a csv into a dataframe with sample parameters.
- Parameters
csv (string) – Path to csv file to import.
tc_bounds (np.array) – Two value array containing the upper and lower bounds in “D/D0” where tc will be found in between.
fname_format (str) – The format of the fname, with parameter names separated by the delimiter specified by fname_split. ex. “date_sampleinfo_fps_run”
sampleinfo_format (str) – The format of the sampleinfo section of the fname, with parameter names separated by the delimiter specified by sample_split.
optional_settings (dict) – A dictionary of optional settings.
- Optional Settings and Defaults
fname_split (string) – The delimiter for splitting folder/file names, used in fname_format. Default is “_”.
sample_split (string) – The delimiter for splitting the sampleinfo tag in folder/file names, used in sampleinfo_format. Default is “-”.
- Returns
csv_to_dataframe (pd.DataFrame) – dataframe with data from csv, sample information from filename, strain rate and critical time calculated
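To illustrate how fname_format, sampleinfo_format, and the two delimiters relate to a filename, here is a small stand-alone sketch of the splitting logic. The function name parse_fname_sketch, the example filename, and the sampleinfo tags “polymer-mw-conc” are all hypothetical; dosertools may parse names differently.

```python
def parse_fname_sketch(fname, fname_format, sampleinfo_format,
                       fname_split="_", sample_split="-"):
    """Map the tags in the format strings onto the pieces of a filename."""
    # Pair each tag in fname_format with the matching piece of the filename.
    tags = fname_format.split(fname_split)
    values = fname.split(fname_split)
    params = dict(zip(tags, values))
    # The "sampleinfo" piece is split again using the sampleinfo delimiter.
    if "sampleinfo" in params:
        sample_tags = sampleinfo_format.split(sample_split)
        sample_values = params["sampleinfo"].split(sample_split)
        params.update(zip(sample_tags, sample_values))
    return params


params = parse_fname_sketch(
    "20210929_PEO-6M-0.01_25000_1",   # hypothetical filename, no extension
    "date_sampleinfo_fps_run",
    "polymer-mw-conc",                # hypothetical sampleinfo tags
)
```

All values stay as strings in this sketch; converting tags such as fps or run to numbers would be a separate step.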
- dosertools.data_processing.csv.generate_df(csv_location: Union[str, bytes, os.PathLike], fname_format: str, sampleinfo_format: str, optional_settings: dict = {}) -> pandas.core.frame.DataFrame
Reads in all csvs and processes them into a dataframe.
Reads in data from all csvs in csv_location, processes each (adding strain rate, critical time, diameter at critical time, and parameters from the filename), and puts all data into one dataframe. Loops csv_to_dataframe over all csvs in the folder.
- Parameters
csv_location (path-like) – folder in which csvs to process are stored
tc_bounds (np.array) – two value array containing the upper and lower bounds in “D/D0” where tc will be found in between
fname_format (str) – the format of the fname, with parameter names separated by the delimiter specified by fname_split. ex. “date_sampleinfo_fps_run”
sampleinfo_format (str) – the format of the sampleinfo section of the fname, with parameter names separated by the delimiter specified by sample_split
optional_settings (dict) – A dictionary of optional settings.
- Returns
generate_df (pd.DataFrame) – dataframe containing data from all csvs in csv_location
- Optional Settings and Defaults
verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.
dosertools.data_processing.extension module
- dosertools.data_processing.extension.add_critical_time(dataset: pandas.core.frame.DataFrame, optional_settings: dict = {}) -> pandas.core.frame.DataFrame
Finds critical time from maximum in strain rate, adds relevant columns.
Finds the critical time from the maximum in the strain rate within the bounds in D/D0 specified by tc_bounds. Adds the columns “tc (s)” (critical time), “t-tc (s)” (time past critical time), and “Dtc/D0” (diameter at critical time divided by initial diameter) to the dataset.
- Parameters
dataset (pandas.DataFrame) – dataset to which to add the “tc (s)”, “t-tc (s)”, and “Dtc/D0” columns; must contain “D/D0”, “time (s)”, and “strain rate (1/s)” columns
optional_settings (dict) – A dictionary of optional settings.
- Optional Settings and Defaults
tc_bounds (2 element list of floats) – [start, end] The D/D0 to bound the start and end for finding the critical time. Default is [0.3,0.07].
- Returns
add_critical_time (pd.DataFrame) – dataset with “tc (s)”, “t-tc (s)”, and “Dtc/D0” columns added
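The procedure described for add_critical_time can be sketched roughly as below. Several details are assumptions made for illustration only: that the bounds select rows whose D/D0 lies between start and end inclusive, that “Dtc/D0” is the D/D0 value on the row where the maximum occurs, and the name critical_time_sketch itself.

```python
import pandas as pd


def critical_time_sketch(dataset: pd.DataFrame, tc_bounds=(0.3, 0.07)) -> pd.DataFrame:
    """Add "tc (s)", "t-tc (s)", and "Dtc/D0" columns to a copy of dataset."""
    start, end = tc_bounds
    # Assumption: keep only rows whose D/D0 lies within the bounds.
    in_bounds = dataset[(dataset["D/D0"] <= start) & (dataset["D/D0"] >= end)]
    # Row where the strain rate is largest inside the bounded region.
    tc_row = in_bounds.loc[in_bounds["strain rate (1/s)"].idxmax()]
    result = dataset.copy()
    result["tc (s)"] = tc_row["time (s)"]
    result["t-tc (s)"] = result["time (s)"] - tc_row["time (s)"]
    result["Dtc/D0"] = tc_row["D/D0"]
    return result


demo = pd.DataFrame({
    "time (s)": [0.0, 0.1, 0.2, 0.3, 0.4],
    "D/D0": [1.0, 0.5, 0.2, 0.1, 0.05],
    "strain rate (1/s)": [100.0, 5.0, 9.0, 3.0, 50.0],
})
result = critical_time_sketch(demo)
```

In this made-up data the large strain rates at the first and last rows fall outside the default bounds, so the critical time is taken from the row at 0.2 s.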
- dosertools.data_processing.extension.add_strain_rate(dataset: pandas.core.frame.DataFrame) -> pandas.core.frame.DataFrame
Calculates strain rate from “D/D0” and “time (s)” data and adds it to the dataset.
Using the formula -2(d(D/D0)/dt)/(D/D0) for the strain rate, calculates the strain rate at each point in dataset using np.gradient for the derivative. Removes rows where the strain rate is infinite/NaN from the dataset. Returns a dataframe with all existing columns and the new strain rate (1/s) column.
- Parameters
dataset (pandas.DataFrame) – dataset to which to add the “strain rate (1/s)” column; must contain “D/D0” and “time (s)” columns
- Returns
add_strain_rate (pandas.DataFrame) – dataset with strain rate (1/s) column added and all rows with infinite/NaN removed
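The formula -2(d(D/D0)/dt)/(D/D0) can be evaluated with np.gradient along the lines of the sketch below. The function name strain_rate_sketch is hypothetical, and the handling of warnings and row removal is one plausible choice rather than the confirmed dosertools behavior.

```python
import numpy as np
import pandas as pd


def strain_rate_sketch(dataset: pd.DataFrame) -> pd.DataFrame:
    """Add a "strain rate (1/s)" column and drop rows where it is inf/NaN."""
    d = dataset["D/D0"].to_numpy(dtype=float)
    t = dataset["time (s)"].to_numpy(dtype=float)
    # np.gradient(d, t) approximates d(D/D0)/dt on the sampled time points.
    # Dividing by D/D0 = 0 yields inf or NaN, so silence those warnings here.
    with np.errstate(divide="ignore", invalid="ignore"):
        rate = -2.0 * np.gradient(d, t) / d
    result = dataset.copy()
    result["strain rate (1/s)"] = rate
    # Keep only rows with a finite strain rate.
    return result[np.isfinite(result["strain rate (1/s)"])]


# For D/D0 = exp(-t), the formula gives a constant strain rate of 2 (1/s).
times = np.linspace(0.0, 1.0, 101)
demo = pd.DataFrame({"time (s)": times, "D/D0": np.exp(-times)})
with_rate = strain_rate_sketch(demo)
```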
- dosertools.data_processing.extension.truncate_data(dataset: pandas.core.frame.DataFrame, before: bool = True) -> pandas.core.frame.DataFrame
Truncates a dataset before/after the longest block of continuous zeroes.
Given a dataset, truncates the dataset before/after (depending on the True/False value of before) the longest block of continuous zeroes in the “D/D0” column. The longest block of zeroes should occur after the liquid bridge breaks and the readout is no longer accurate.
- Parameters
dataset (pd.DataFrame) – Dataframe containing data to truncate. Dataframe must contain “D/D0” column.
before (bool, optional) – True (default) if truncation should occur at the last nonzero value before the longest block of zeroes. False if truncation should occur at the last zero in the longest block of zeroes.
- Returns
truncate_data (pd.DataFrame) – Dataframe with truncated data.
Examples
DataFrame ex1:
time   D/D0
0      1
0.1    0.9
0.2    0.8
0.3    0
0.4    0.6
0.5    0.4
0.6    0
0.7    0
0.8    0
0.9    0
1.0    0.1
1.1    0.2
1.2    0
1.3    0
1.4    0.1

truncate_data(ex1)
time   D/D0
0      1
0.1    0.9
0.2    0.8
0.3    0
0.4    0.6
0.5    0.4

truncate_data(ex1, False)
time   D/D0
0      1
0.1    0.9
0.2    0.8
0.3    0
0.4    0.6
0.5    0.4
0.6    0
0.7    0
0.8    0
0.9    0

DataFrame ex2:
time
0
0.1

truncate_data(ex2) --> KeyError
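The truncation shown in the examples can be reproduced with the sketch below, which locates blocks of zeros with a padded mask and np.diff and then slices at the longest block. The name truncate_sketch is hypothetical, and returning the dataset unchanged when it contains no zeros is an assumption; the source does not say what dosertools does in that case.

```python
import numpy as np
import pandas as pd


def truncate_sketch(dataset: pd.DataFrame, before: bool = True) -> pd.DataFrame:
    """Cut the dataset at the longest run of zeros in the "D/D0" column."""
    values = dataset["D/D0"].to_numpy()  # raises KeyError if the column is missing
    # Start/stop indices (half-open) of every run of zeros.
    mask = np.concatenate(([False], values == 0, [False]))
    runs = np.flatnonzero(np.diff(mask.astype(int))).reshape(-1, 2)
    if len(runs) == 0:
        return dataset  # assumption: nothing to truncate without zeros
    # Pick the longest run; argmax returns the first one in case of ties.
    start, stop = runs[np.argmax(runs[:, 1] - runs[:, 0])]
    # before=True keeps rows up to the last nonzero value before the run;
    # before=False also keeps the run of zeros itself.
    return dataset.iloc[: (start if before else stop)]


ex1 = pd.DataFrame({
    "time": [0, 0.1, 0.2, 0.3, 0.4, 0.5, 0.6, 0.7, 0.8, 0.9, 1.0, 1.1, 1.2, 1.3, 1.4],
    "D/D0": [1, 0.9, 0.8, 0, 0.6, 0.4, 0, 0, 0, 0, 0.1, 0.2, 0, 0, 0.1],
})
kept_before = truncate_sketch(ex1)
kept_after = truncate_sketch(ex1, before=False)
```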
dosertools.data_processing.figures module
- dosertools.data_processing.figures.layout_time_csvs(df: pandas.core.frame.DataFrame, plot_normalized: bool) -> holoviews.element.geom.Points
Plots a time vs D/D0 graph of all samples and runs in df.
Plots with raw (time) or normalized (t - t_c) depending on the value of plot_normalized.
- Parameters
df (pd.DataFrame) – Dataframe of D/D0 and time data with sample and run information
plot_normalized (bool) – True to normalize time by t_c, the critical time, and plot t - tc on the x-axis. False to plot raw time on the x-axis.
- Returns
hv_layout (hv.Points) – Set of plots for each run and sample included in df
- dosertools.data_processing.figures.layout_viscosity_csvs(df: pandas.core.frame.DataFrame) -> holoviews.element.geom.Points
Plots a strain vs (elongational viscosity / surface tension) graph of all samples and runs in df.
- Parameters
df (pd.DataFrame) – Dataframe with (elongational viscosity / surface tension) and strain with sample and run information
- Returns
hv_layout (hv.Points) – Set of plots for each run and sample included in df
- dosertools.data_processing.figures.save_figure(figure: holoviews.element.geom.Points, figure_name: str, summary_folder: Union[str, bytes, os.PathLike], optional_settings: dict = {}) -> None
Saves the figure as an .html file, which preserves interactivity.
- Parameters
figure (hv.Points) – Set of plots for each run and sample included in df
figure_name (string) – Filename with which to save the figure
summary_folder (Path) – Location to save the figure
optional_settings (dict) – Dictionary of optional settings.
- Optional Settings and Defaults
verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.
- Returns
None, file saved to disk
dosertools.data_processing.fitting module
- dosertools.data_processing.fitting.annotate_summary_df(fitting_results_list: list, header_params: dict) -> pandas.core.frame.DataFrame
Do we want to bring other columns with us like ion, polymer identity, etc? How to code that?
- Parameters
fitting_results_list (list) – generated by find_EC_slope
header_params (dict) – Contains the information profiled in the samplename
- Returns
lambdaE_df (pd.DataFrame) – dataframe containing lambdaE relaxation time for each run from the input df
- dosertools.data_processing.fitting.calculate_elongational_visc(df: pandas.core.frame.DataFrame, summary_df: pandas.core.frame.DataFrame, optional_settings: dict = {}) -> pandas.core.frame.DataFrame
Calculates the quantity (elongational viscosity / surface tension) for each moment in the DOS dataset.
- Parameters
df (pd.DataFrame) – Contains D/D0, time, t - tc, strain rate, D(tc)/D0, etc for multiple runs and samples generated from data_processing.csv.generate_df
summary_df (pd.DataFrame) – Contains relaxation time, D(t_c)/D0, and sample info for all runs and samples generated from data_processing.fitting.make_summary_dataframe
optional_settings (dict) – A dictionary of optional settings.
- Optional Settings and Defaults
needle_diameter_mm (float) – The needle outer diameter in millimeters. Default is 0.7176 mm (22G needle).
- Returns
dataset_w_visc (pd.DataFrame) – Dataset with the additional values calculated by this function: strain and (elongational viscosity / surface tension)
- dosertools.data_processing.fitting.derivative_EC_fit(Dtc_D0: float, lambdaE: float, time: float, tc: float) -> float
Calculates the derivative of the elasto-capillary region.
D(t)/D0 = D(tc)/D0 * exp(-(t - tc)/(3*LambdaE))
D’(t)/D0 = (-1/(3*LambdaE)) * D(tc)/D0 * exp(-(t - tc)/(3*LambdaE))
- Parameters
Dtc_D0 (float) – The normalized diameter at which the transition to EC behavior occurs
lambdaE (float) – The relaxation time of the polymer solution
time (float) – the moment in time at which to evaluate the derivative
tc (float) – the critical time for the experiment; this number is purely empirical (it depends entirely on when in the process the video starts)
- Returns
D’(t)/D0 (float)
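The two expressions given for the elasto-capillary region translate directly into code. The sketch below uses hypothetical function names (ec_diameter, ec_derivative) and made-up parameter values, and compares the analytic derivative against a central finite difference as a sanity check.

```python
import math


def ec_diameter(Dtc_D0: float, lambdaE: float, time: float, tc: float) -> float:
    """D(t)/D0 = D(tc)/D0 * exp(-(t - tc) / (3 * lambdaE))."""
    return Dtc_D0 * math.exp(-(time - tc) / (3 * lambdaE))


def ec_derivative(Dtc_D0: float, lambdaE: float, time: float, tc: float) -> float:
    """D'(t)/D0 = (-1 / (3 * lambdaE)) * D(t)/D0."""
    return (-1 / (3 * lambdaE)) * ec_diameter(Dtc_D0, lambdaE, time, tc)


# Illustrative values only: D(tc)/D0 = 0.2, lambdaE = 1 ms, tc = 10 ms.
args = dict(Dtc_D0=0.2, lambdaE=0.001, tc=0.01)
analytic = ec_derivative(time=0.012, **args)
step = 1e-7
numeric = (ec_diameter(time=0.012 + step, **args)
           - ec_diameter(time=0.012 - step, **args)) / (2 * step)
```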
- dosertools.data_processing.fitting.find_EC_slope(run_dataset: pandas.core.frame.DataFrame, start: float, end: float) -> Tuple[float, float, float]
Finds the exponential decay of the EC region for a single dataset. Also returns the intercept and r value of the fit.
- Parameters
run_dataset (pd.DataFrame) – dataset of a single DOS run containing at least D/D0, time,
start (float) – value of D/D0 to start fitting the EC region from
end (float) – value of D/D0 to end fitting the EC region with
- Returns
slope, intercept, r_value (floats) – slope is the slope of the semilog-transformed version of the dataset. It corresponds to the decaying exponential in the linear-linear version of the data
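A semilog fit of this kind can be sketched with numpy as shown below. The choices here are assumptions for illustration: the natural log of “D/D0” is fitted against “time (s)”, rows are selected where D/D0 lies between start and end inclusive, and np.polyfit and np.corrcoef stand in for whatever fitting routine dosertools actually uses. The name ec_slope_sketch is hypothetical.

```python
import numpy as np
import pandas as pd


def ec_slope_sketch(run_dataset: pd.DataFrame, start: float, end: float):
    """Fit ln(D/D0) against time between the D/D0 bounds start and end."""
    in_region = (run_dataset["D/D0"] <= start) & (run_dataset["D/D0"] >= end)
    section = run_dataset[in_region]
    t = section["time (s)"].to_numpy(dtype=float)
    log_d = np.log(section["D/D0"].to_numpy(dtype=float))
    # A decaying exponential is a straight line on semilog axes.
    slope, intercept = np.polyfit(t, log_d, 1)
    r_value = np.corrcoef(t, log_d)[0, 1]
    return slope, intercept, r_value


# Synthetic run following D/D0 = 0.2 * exp(-t / (3 * lambdaE)), lambdaE = 2 ms.
times = np.linspace(0.0, 0.02, 201)
run = pd.DataFrame({"time (s)": times, "D/D0": 0.2 * np.exp(-times / (3 * 0.002))})
slope, intercept, r_value = ec_slope_sketch(run, 0.1, 0.045)
# From the EC expression D(t)/D0 = D(tc)/D0 * exp(-(t - tc)/(3*LambdaE)),
# the semilog slope is -1/(3*LambdaE), so the relaxation time follows as:
lambdaE = -1 / (3 * slope)
```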
- dosertools.data_processing.fitting.make_summary_dataframe(df: pandas.core.frame.DataFrame, sampleinfo_format: str, optional_settings: dict = {}) -> pandas.core.frame.DataFrame
Condenses a DOS run into an extensional relaxation time by fitting the EC region (t > tc) to a decaying exponential
- Parameters
df (pd.DataFrame) – Contains D/D0, time, t - tc, strain rate, D(tc)/D0, etc for multiple runs and samples generated from data_processing.csv.generate_df
sampleinfo_format (str) – the format of the sampleinfo section of the filename, with parameter names separated by the delimiter specified by sample_split
optional_settings (dict) – A dictionary of optional settings.
- Optional Settings and Defaults
fitting_bounds (2 element list of floats) – [start, end] The D/D0 to bound the start and end of fitting of EC region. Default is [0.1, 0.045].
- Returns
summary_df (pd.DataFrame) – dataframe containing lambdaE (relaxation time) and R(t_c)/R_0 for each run from the input df, along with their sample info
- dosertools.data_processing.fitting.save_processed_df(df: pandas.core.frame.DataFrame, save_location: Union[str, bytes, os.PathLike], optional_settings: dict = {})
Saves the processed dataset from a batch of videos.
The summary dataset includes the fluid properties, the ‘raw’ diameter vs time data, and the sampleinfo from the filename.
- Parameters
df (pd.DataFrame) – Raw dataset with calculated columns for strain rate, strain, etc, and parsed sampleinfo columns.
save_location (path-like) – Path to folder in which to save the csv.
optional_settings (dict) – A dictionary of optional settings.
- Optional Settings and Defaults
skip_existing (bool) – Determines the behavior when a file that a function would generate already exists. True to skip any existing files. False to overwrite (or delete and then write, where overwriting would generate an error). Default is True.
verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.
summary_filename (string) – The base filename (no extension) for saving the summary csvs. If not provided, will be generated automatically based on the current date and time. Default is “” to trigger automatic generation.
- Returns
filename_string (string) – Filename at which the annotated dataframe was saved.
Saves file to disk.
- dosertools.data_processing.fitting.save_summary_df(summary_df: pandas.core.frame.DataFrame, save_location: Union[str, bytes, os.PathLike], optional_settings: dict = {})
Saves the summary dataset from a large processed batch of videos.
The summary dataset includes only the fluid properties, not the ‘raw’ diameter vs time data.
- Parameters
summary_df (pd.DataFrame) – Contains LambdaE, D(tc)/D0, and sample info for multiple runs and samples
save_location (path-like) – path to folder in which to save the csv
filename (string) – Currently not used, but this will be the name to give to the summary dataset csv
optional_settings (dict) – A dictionary of optional settings.
- Optional Settings and Defaults
skip_existing (bool) – Determines the behavior when a file that a function would generate already exists. True to skip any existing files. False to overwrite (or delete and then write, where overwriting would generate an error). Default is True.
verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.
summary_filename (string) – The base filename (no extension) for saving the summary csvs. If not provided, will be generated automatically based on the current date and time. Default is “” to trigger automatic generation.
- Returns
filename_string (string) – Filename at which the summary dataframe was saved.
Saves file to disk.
dosertools.data_processing.integration module
- dosertools.data_processing.integration.binaries_to_csvs(images_folder: Union[str, bytes, os.PathLike], csv_folder: Union[str, bytes, os.PathLike], summary_folder: Union[str, bytes, os.PathLike], short_fname_format: str, sampleinfo_format: str, optional_settings: dict = {})
Converts binary image folders into csvs of D/D0 vs. time.
Given a folder of folders of binary images, converts each set of binary images into a csv of D/D0 vs. time, retaining information in the filename.
- Parameters
images_folder (path-like) – Path to a folder in which the results of image processing were saved (i.e. the folders of binary images).
csv_folder (path-like) – Path to a folder in which to save the csv containing D/D0 vs. time.
short_fname_format (str) – The format of the fname, with parameter names separated by the delimiter specified by fname_split, containing only the tags present in the names of the folders in images_folder. Compared to the format passed to videos_to_binaries, the “vtype” and “remove” tags should be removed. Must contain the “fps” tag. ex. “date_sampleinfo_fps_run”
optional_settings (dict) – A dictionary of optional settings.
- Optional Settings and Defaults
verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.
fname_split (string) – The delimiter for splitting folder/file names, used in fname_format. Default is “_”.
- dosertools.data_processing.integration.csvs_to_summaries(csv_folder: Union[str, bytes, os.PathLike], summary_folder: Union[str, bytes, os.PathLike], short_fname_format: str, sampleinfo_format: str, optional_settings: dict = {})
Processes the raw csvs and determines elongational relaxation time, D(tc)/D0, and elongational viscosity.
- Parameters
csv_folder (path-like) – Path to a folder in which to find the csv containing D/D0 vs. time.
summary_folder (path-like) – Path to a folder in which to save the csv of the summary and the annotated dataset.
short_fname_format (str) – The format of the fname, with parameter names separated by the delimiter specified by fname_split. The format should not have “vtype” and “remove” tags, since the csvs no longer have those formatting tags attached. ex. “date_sampleinfo_fps_run”
sampleinfo_format (str) – The format of the sampleinfo section of the fname, with parameter names separated by the delimiter specified by sample_split.
optional_settings (dict) – A dictionary of optional settings.
- Optional Settings and Defaults
verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.
fname_split (string) – The delimiter for splitting folder/file names, used in fname_format. Default is “_”.
sample_split (string) – The delimiter for splitting the sampleinfo tag in folder/file names, used in sampleinfo_format. Default is “-”.
fitting_bounds (2 element list of floats) – [start, end] The D/D0 to bound the start and end of fitting of EC region. Default is [0.1, 0.045].
tc_bounds (2 element list of floats) – [start, end] The D/D0 to bound the start and end for finding the critical time. Default is [0.3,0.07].
- dosertools.data_processing.integration.multiprocess_binaries_to_csvs(subfolder_index: int, subfolders: list, images_folder: Union[str, bytes, os.PathLike], csv_folder: Union[str, bytes, os.PathLike], short_fname_format: str, tic: float, optional_settings: dict = {}) -> None
Converts binary image folders into csvs of D/D0 vs. time.
Given a folder of folders of binary images, converts each set of binary images into a csv of D/D0 vs. time, retaining information in the filename.
For multiprocessing to work properly, the function that invokes the pool of processors and the function that uses them need to be defined separately. Thus, this function is here, and is called in binaries_to_csvs.
- Parameters
subfolder_index (int) – Index to keep track of which folder we are currently processing
subfolders (list of folders) – List of folders that contain the binaries that this function is reading to produce Diameter and time data
images_folder (path-like) – Path to a folder in which the results of image processing were saved (i.e. the folders of binary images).
csv_folder (path-like) – Path to a folder in which to save the csv containing D/D0 vs. time.
short_fname_format (str) – The format of the fname, with parameter names separated by the delimiter specified by fname_split, containing only the tags present in the names of the folders in images_folder. Compared to the format passed to videos_to_binaries, the “vtype” and “remove” tags should be removed. Must contain the “fps” tag. ex. “date_sampleinfo_fps_run”
tic (float) – Stores the time that the processing began at. Used in verbose mode to determine how long processing takes
optional_settings (dict) – A dictionary of optional settings.
- Optional Settings and Defaults
verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.
fname_split (string) – The delimiter for splitting folder/file names, used in fname_format. Default is “_”.
- dosertools.data_processing.integration.multiprocess_vid_to_bin(file_number: int, fnames: list, exp_videos: list, bg_videos: list, images_folder: Union[str, bytes, os.PathLike], tic: float, optional_settings: dict = {}) -> None
Converts videos in given folder into binary images.
Matches videos in videos_folder into experimental and background pairs, and converts those paired videos into background-subtracted binaries.
For multiprocessing to work properly, the function that invokes the pool of processors and the function that uses them need to be defined separately. Thus, this function is here, and is called in videos_to_binaries.
- Parameters
file_number (int) – Index to keep track of which folder or video we are processing
fnames (list of strings) – List of base folder names for each matched pair of experimental and background folders.
exp_videos (list of paths) – List of paths to experimental video folders that were matched with backgrounds.
bg_videos (list of paths) – List of paths to background video folders matched with exp_videos.
images_folder (path-like) – Path to a folder in which to save the results of image processing, binaries and optional cropped and background-subtracted images.
tic (float) – Stores the time that the processing began at. Used in verbose mode to determine how long processing takes.
optional_settings (dict) – A dictionary of optional settings.
- Optional Settings and Defaults
verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.
experiment_tag (string) – The tag for identifying experimental videos. May be empty (“”). Default is “exp”.
background_tag (string) – The tag for identifying background videos. May not be empty. Default is “bg”.
one_background (bool) – True to use one background for a group of experiments only differing by run number. False to pair backgrounds and experiments 1:1. Default is False.
save_crop (bool) – True to save intermediate cropped images (i.e. experimental video images cropped but not background-subtracted or binarized). Default is False.
save_bg_sub (bool) – True to save background-subtracted images (i.e. experimental video images cropped and background-subtracted but not binarized). Default is False.
skip_existing (bool) – Determines the behavior when a file that a function would generate already exists. True to skip any existing files. False to overwrite (or delete and then write, where overwriting would generate an error). Default is True.
image_extension (string) – The extension for images in the video folder. TIFF recommended. Default is “tif”. Do not include “.”.
- dosertools.data_processing.integration.set_defaults(optional_settings: dict = {}) -> dict
Sets default values for unset keys in optional_settings.
- Parameters
optional_settings (dict) – Dictionary of optional settings.
- Returns
settings (dict) – Dictionary with optional_settings and default values, prioritizing optional_settings values.
- Optional Settings and Defaults
nozzle_row (int) – Row to use for determining the nozzle diameter. Default is 1.
crop_width_coefficient (float) – Multiplied by the calculated nozzle_diameter to determine the buffer on either side of the observed nozzle edges to include in the cropped image. Default is 0.02
crop_height_coefficient (float) – Multiplied by the calculated nozzle_diameter to determine the bottom row that will be included in the cropped image. Default is 2.
crop_nozzle_coefficient (float) – Multiplied by the calculated nozzle_diameter to determine the top row of the cropped image. Default is 0.15.
fname_split (string) – The delimiter for splitting folder/file names, used in fname_format. Default is “_”.
sample_split (string) – The delimiter for splitting the sampleinfo tag in folder/file names, used in sampleinfo_format. Default is “-”.
experiment_tag (string) – The tag for identifying experimental videos. May be empty (“”). Default is “exp”.
background_tag (string) – The tag for identifying background videos. May not be empty. Default is “bg”.
one_background (bool) – True to use one background for a group of experiments only differing by run number. False to pair backgrounds and experiments 1:1. Default is False.
bg_drop_removal (bool) – True to remove the background drop from the background that is subtracted from the image before binarization. False to not alter the background. Default is False.
save_crop (bool) – True to save intermediate cropped images (i.e. experimental video images cropped but not background-subtracted or binarized). Default is False.
save_bg_sub (bool) – True to save background-subtracted images (i.e. experimental video images cropped and background-subtracted but not binarized). Default is False.
fitting_bounds (2 element list of floats) – [start, end] The D/D0 to bound the start and end of fitting of EC region. Default is [0.1, 0.045].
tc_bounds (2 element list of floats) – [start, end] The D/D0 to bound the start and end for finding the critical time. Default is [0.3,0.07].
needle_diameter_mm (float) – The needle outer diameter in millimeters. Default is 0.7176 mm (22G needle).
skip_existing (bool) – Determines the behavior when a file that a function would generate already exists. True to skip any existing files. False to overwrite (or delete and then write, where overwriting would generate an error). Default is True.
verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.
image_extension (string) – The extension for images in the video folder. TIFF recommended. Default is “tif”. Do not include “.”.
summary_filename (string) – The base filename (no extension) for saving the summary csvs. If not provided, will be generated automatically based on the current date and time. Default is “” to trigger automatic generation.
cpu_count (int) – How many cores to use for multithreading/multiprocessing. If nothing provided, default will be the maximum number of cores returned from os.cpu_count()
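The merging behavior described for set_defaults (user-supplied values take priority, defaults fill the rest) can be sketched as follows. Only a handful of the defaults listed above are included, and the names DEFAULTS_SUBSET and set_defaults_sketch are hypothetical; this is not the dosertools source.

```python
import copy

# A subset of the documented defaults, for illustration only.
DEFAULTS_SUBSET = {
    "fname_split": "_",
    "sample_split": "-",
    "fitting_bounds": [0.1, 0.045],
    "tc_bounds": [0.3, 0.07],
    "skip_existing": True,
    "verbose": False,
}


def set_defaults_sketch(optional_settings=None) -> dict:
    """Return the defaults overlaid with any user-supplied settings."""
    # Deep copy so callers cannot mutate the shared default lists.
    settings = copy.deepcopy(DEFAULTS_SUBSET)
    # Keys present in optional_settings override the defaults.
    settings.update(optional_settings or {})
    return settings


settings = set_defaults_sketch({"verbose": True, "tc_bounds": [0.25, 0.08]})
```

The sketch takes None rather than {} as its default argument, which avoids sharing one mutable dictionary across calls.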
- dosertools.data_processing.integration.videos_to_binaries(videos_folder: Union[str, bytes, os.PathLike], images_folder: Union[str, bytes, os.PathLike], fname_format: str, optional_settings: dict = {})
Converts videos in given folder into binary images.
Matches videos in videos_folder into experimental and background pairs, and converts those paired videos into background-subtracted binaries.
- Parameters
videos_folder (path-like) – Path to a folder of experimental and background video folders.
images_folder (path-like) – Path to a folder in which to save the results of image processing, binaries and optional cropped and background-subtracted images.
fname_format (str) – The format of the fname, with parameter names separated by the delimiter specified by fname_split. Must contain the “vtype” tag corresponding to experiment vs. background. Can contain “remove” to remove information that is not relevant or that differs between the experimental and background video names and would prevent matching. ex. “date_sampleinfo_fps_run_vtype_remove_remove”
sampleinfo_format (str) – The format of the sampleinfo section of the fname, with parameter names separated by the delimiter specified by sample_split.
optional_settings (dict) – A dictionary of optional settings.
- Optional Settings and Defaults
verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.
cpu_count (int) – How many cores to use for multithreading/multiprocessing. If nothing provided, default will be the maximum number of cores returned from os.cpu_count()
experiment_tag (string) – The tag for identifying experimental videos. May be empty (“”). Default is “exp”.
background_tag (string) – The tag for identifying background videos. May not be empty. Default is “bg”.
one_background (bool) – True to use one background for a group of experiments only differing by run number. False to pair backgrounds and experiments 1:1. Default is False.
save_crop (bool) – True to save intermediate cropped images (i.e. experimental video images cropped but not background-subtracted or binarized). Default is False.
save_bg_sub (bool) – True to save background-subtracted images (i.e. experimental video images cropped and background-subtracted but not binarized). Default is False.
skip_existing (bool) – Determines the behavior when a file already exists that a function would otherwise generate. True to skip any existing files. False to overwrite (or delete and then write, where overwriting would generate an error). Default is True.
image_extension (string) – The extension for images in the video folder. TIFF recommended. Default is “tif”. Do not include “.”.
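Examples
The matching described above can be pictured with a small standalone sketch. This is not the dosertools implementation: parse_name and pair_videos are hypothetical helpers, the video names are made up, and only 1:1 pairing (one_background False) with a non-empty experiment_tag is covered. Each name is split on fname_split, “remove” fields are dropped, and an experiment is paired with the background whose remaining fields (other than “vtype”) are identical.

```python
def parse_name(name, fname_format, fname_split="_"):
    """Map each tag in fname_format to the matching piece of name (hypothetical helper)."""
    tags = fname_format.split(fname_split)
    parts = name.split(fname_split)
    if len(tags) != len(parts):
        raise ValueError(f"{name!r} does not fit format {fname_format!r}")
    # Fields tagged "remove" are discarded so they cannot prevent matching.
    return {tag: part for tag, part in zip(tags, parts) if tag != "remove"}


def pair_videos(names, fname_format, experiment_tag="exp",
                background_tag="bg", fname_split="_"):
    """Pair experiment and background names 1:1 (illustration only)."""
    experiments, backgrounds = {}, {}
    for name in names:
        fields = parse_name(name, fname_format, fname_split)
        vtype = fields.pop("vtype")
        # Everything except vtype must agree for two videos to match.
        key = tuple(sorted(fields.items()))
        if vtype == background_tag:
            backgrounds[key] = name
        elif vtype == experiment_tag:
            experiments[key] = name
    return [(experiments[k], backgrounds[k]) for k in experiments if k in backgrounds]


fname_format = "date_sampleinfo_fps_run_vtype_remove_remove"
names = [
    "20210929_PEO-0p1_25000_1_exp_cam_001",  # made-up experimental video name
    "20210929_PEO-0p1_25000_1_bg_cam_002",   # made-up background video name
]
pairs = pair_videos(names, fname_format)
```

Here the trailing “cam_001” and “cam_002” fields differ between the two names, which is why they are tagged “remove” in fname_format.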
- dosertools.data_processing.integration.videos_to_csvs(videos_folder: Union[str, bytes, os.PathLike], images_folder: Union[str, bytes, os.PathLike], csv_folder: Union[str, bytes, os.PathLike], summary_folder: Union[str, bytes, os.PathLike], fname_format: str, sampleinfo_format: str, optional_settings: dict = {})[source]¶
Converts videos in given folder into csvs of D/D0 vs. time.
Matches videos in videos_folder into experimental and background pairs, converts those paired videos into background-subtracted binaries, analyzes the resulting binaries to extract D/D0 vs. time, and saves the results to csvs.
- Parameters
videos_folder (path-like) – Path to a folder of experimental and background video folders.
images_folder (path-like) – Path to a folder in which to save the results of image processing, binaries and optional cropped and background-subtracted images.
csv_folder (path-like) – Path to a folder in which to save the csv containing D/D0 vs. time.
fname_format (str) – The format of the fname with parameter names separated by the delimiter specified by fname_split. Must contain the “vtype” tag corresponding to experiment vs. background. Can contain “remove” to remove information that is not relevant or is different between the experimental and background video names and would prevent matching. Must contain “fps” tag. Example: “date_sampleinfo_fps_run_vtype_remove_remove”
optional_settings (dict) – A dictionary of optional settings.
- Optional Settings and Defaults
fname_split (string) – The delimiter for splitting folder/file names, used in fname_format. Default is “_”.
experiment_tag (string) – The tag for identifying experimental videos. May be empty (“”). Default is “exp”.
background_tag (string) – The tag for identifying background videos. May not be empty. Default is “bg”.
one_background (bool) – True to use one background for a group of experiments only differing by run number. False to pair backgrounds and experiments 1:1. Default is False.
save_crop (bool) – True to save intermediate cropped images (i.e. experimental video images cropped but not background-subtracted or binarized). Default is False.
save_bg_sub (bool) – True to save background-subtracted images (i.e. experimental video images cropped and background-subtracted but not binarized). Default is False.
skip_existing (bool) – Determines the behavior when a file already exists that a function would otherwise generate. True to skip any existing files. False to overwrite (or delete and then write, where overwriting would generate an error). Default is True.
verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.
image_extension (string) – The extension for images in the video folder. TIFF recommended. Default is “tif”. Do not include “.”.
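Examples
A minimal sketch of assembling optional_settings from the defaults listed above and passing it to videos_to_csvs. DOCUMENTED_DEFAULTS, build_settings, and run_videos_to_csvs are hypothetical names introduced for illustration. The two checks only restate constraints given in the list (background_tag may not be empty, image_extension must not include “.”) and are not claimed to mirror the library's own validation. Passing only the keys that differ from the defaults is presumably sufficient, so the merge mainly makes the effective settings explicit.

```python
# Defaults copied from the "Optional Settings and Defaults" list above.
DOCUMENTED_DEFAULTS = {
    "fname_split": "_",
    "experiment_tag": "exp",
    "background_tag": "bg",
    "one_background": False,
    "save_crop": False,
    "save_bg_sub": False,
    "skip_existing": True,
    "verbose": False,
    "image_extension": "tif",
}


def build_settings(**overrides):
    """Return the documented defaults updated with overrides (hypothetical helper)."""
    settings = {**DOCUMENTED_DEFAULTS, **overrides}
    if settings["background_tag"] == "":
        raise ValueError("background_tag may not be empty")
    if settings["image_extension"].startswith("."):
        raise ValueError('image_extension must not include "."')
    return settings


def run_videos_to_csvs(videos_folder, images_folder, csv_folder, summary_folder,
                       fname_format, sampleinfo_format, **overrides):
    """Call videos_to_csvs with the argument order shown in its signature."""
    # Imported lazily so the sketch can be read and run without dosertools installed.
    from dosertools.data_processing.integration import videos_to_csvs

    return videos_to_csvs(videos_folder, images_folder, csv_folder, summary_folder,
                          fname_format, sampleinfo_format, build_settings(**overrides))


settings = build_settings(verbose=True, one_background=True)
```

Building a fresh dictionary on each call also avoids sharing one settings object between runs.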
- dosertools.data_processing.integration.videos_to_summaries(videos_folder: Union[str, bytes, os.PathLike], images_folder: Union[str, bytes, os.PathLike], csv_folder: Union[str, bytes, os.PathLike], summary_folder: Union[str, bytes, os.PathLike], fname_format: str, sampleinfo_format: str, optional_settings: dict = {})[source]¶
Full integrating function: converts videos to csv files.
- Parameters
videos_folder (path-like) – Path to a folder of experimental and background video folders.
images_folder (path-like) – Path to a folder in which to save the results of image processing, binaries and optional cropped and background-subtracted images.
csv_folder (path-like) – Path to a folder in which to save the csv containing D/D0 vs. time.
summary_folder (path-like) – Path to a folder in which to save the csv of the summary and the annotated dataset.
fname_format (str) – The format of the fname with parameter names separated by the delimiter specified by fname_split. Must contain the “vtype” tag corresponding to experiment vs. background. Can contain “remove” to remove information that is not relevant or is different between the experimental and background video names and would prevent matching. Example: “date_sampleinfo_fps_run_vtype_remove_remove”
sampleinfo_format (str) – The format of the sampleinfo section of the fname separated by the delimiter specified by sample_split.
optional_settings (dict) – A dictionary of optional settings.