dosertools.data_processing package

Submodules

dosertools.data_processing.array module

dosertools.data_processing.array.closest_index_for_value(dataset: pandas.core.frame.DataFrame, column: str, value: float) int[source]

Finds the closest value for a given value in a column and returns its index.

Parameters
  • dataset (pandas.DataFrame) – the dataframe that contains at least the column “column”; that column must be numeric (int or float)

  • column (str) – name of the column in which to look for the closest value

  • value (float) – numeric value to compare entries in dataset[column] to

Returns

closest_index_for_value (int) – Closest index in dataset[column] for given value

Examples

Given dataframe ‘df’ with column ‘a’ with values [-1,0,1,2] and ‘b’ with values [‘c’,1,1,1.2], the following would result from use of the function.

closest_index_for_value(df,'a',1.1) = 2
closest_index_for_value(df,'a',1.9) = 3
closest_index_for_value(df,'b',1.1) --> TypeError
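The lookup described above can be sketched with plain pandas operations. This is a minimal illustration, not the package's implementation; the name closest_index_sketch is hypothetical, and it assumes "closest" means the smallest absolute difference.

```python
import pandas as pd


def closest_index_sketch(dataset: pd.DataFrame, column: str, value: float) -> int:
    """Return the index label of the entry in dataset[column] nearest to value."""
    # Assumption: closeness is the absolute difference to the target value.
    # Subtracting a float from a non-numeric entry is expected to fail here.
    distances = (dataset[column] - value).abs()
    return int(distances.idxmin())


df = pd.DataFrame({"a": [-1, 0, 1, 2], "b": ["c", 1, 1, 1.2]})
print(closest_index_sketch(df, "a", 1.1))  # 2
print(closest_index_sketch(df, "a", 1.9))  # 3
```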

dosertools.data_processing.array.continuous_nonzero(array: numpy.ndarray) numpy.ndarray[source]

Returns array with index pairs indicating blocks of nonzero in given array.

Returns an array with shape (m, 2), where m is the number of “blocks” of non-zeros. The first column is the index of the first non-zero, the second is the index of the first zero following the block. If the block reaches the end of the array, the second index will be the size of the array. Follows the numpy slicing convention, where array[a:a+n] yields the values at indices a through a+n-1.

Parameters

array (np.ndarray) – array in which to look for nonzero blocks; the array must be numeric (integer or float)

Returns

continuous_nonzero (np.ndarray) – (m, 2) array where m is the number of “blocks” of non-zeros. The first column is the index of the first non-zero, the second is the index of the first zero following the block.

Examples

array                    continuous_nonzero(array)
[1,1,1,1,0,0,1,1,0]      [[0,4],[6,8]]
[0,0,-1,1,-1,1]          [[2,6]]
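One common way to find such blocks is to pad a boolean mask and look for transitions with np.diff. The sketch below illustrates that technique under the half-open [start, stop) convention; nonzero_blocks_sketch is a hypothetical name and the package may compute the blocks differently.

```python
import numpy as np


def nonzero_blocks_sketch(array: np.ndarray) -> np.ndarray:
    """Return an (m, 2) array of [start, stop) index pairs for runs of nonzero values."""
    is_nonzero = np.asarray(array) != 0
    # Pad with False on both ends so runs touching either edge still produce a transition.
    padded = np.concatenate(([False], is_nonzero, [False]))
    # +1 marks the first index of a run, -1 marks the first index after a run.
    changes = np.diff(padded.astype(int))
    starts = np.flatnonzero(changes == 1)
    stops = np.flatnonzero(changes == -1)
    return np.column_stack((starts, stops))


print(nonzero_blocks_sketch(np.array([1, 1, 1, 1, 0, 0, 1, 1, 0])).tolist())  # [[0, 4], [6, 8]]
print(nonzero_blocks_sketch(np.array([0, 0, -1, 1, -1, 1])).tolist())  # [[2, 6]]
```

Blocks of zeros follow from the same idea by building the mask with `== 0` instead of `!= 0`.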

dosertools.data_processing.array.continuous_zero(array: numpy.ndarray) numpy.ndarray[source]

Returns array with index pairs indicating blocks of zero in given array.

Returns an array with shape (m, 2), where m is the number of “blocks” of zeros. The first column is the index of the first zero, the second is the index of the first non-zero following the block. If the block reaches the end of the array, the second index will be the size of the array. Follows the numpy slicing convention, where array[a:a+n] yields the values at indices a through a+n-1.

Parameters

array (np.ndarray) – array in which to look for zero runs; the array must be numeric (integer or float)

Returns

nonzero_runs (np.ndarray) – (m, 2) array where m is the number of “blocks” of zeros. The first column is the index of the first zero, the second is the index of the first non-zero following the block.

Examples

array                    continuous_zero(array)
[1,1,1,1,0,0,1,1,0]      [[4,6],[8,9]]
[0,0,-1,1,-1,1]          [[0,2]]

dosertools.data_processing.array.is_array_numeric(array: numpy.ndarray) bool[source]

Return True if array is float or int (numeric), otherwise False

Parameters

array (np.ndarray) – array to check if numeric

Returns

is_array_numeric (bool) – True if array is float or signed/unsigned int, otherwise False

Examples

is_array_numeric([0,1,2,3]) = True
is_array_numeric([1.1,1.2,1.5]) = True
is_array_numeric(['a','b','c']) = False
is_array_numeric([True,False,False]) = False
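A check of this kind can be expressed through the NumPy dtype kind code. The sketch below is an illustration that reproduces the results listed above; is_numeric_sketch is a hypothetical name and the package's own test may differ.

```python
import numpy as np


def is_numeric_sketch(array) -> bool:
    """True for signed/unsigned integer or float data, False otherwise."""
    # dtype.kind is 'i' (signed int), 'u' (unsigned int) or 'f' (float) for numeric data;
    # booleans ('b'), strings ('U') and generic objects ('O') are rejected.
    return np.asarray(array).dtype.kind in "iuf"


print(is_numeric_sketch([0, 1, 2, 3]))          # True
print(is_numeric_sketch([1.1, 1.2, 1.5]))       # True
print(is_numeric_sketch(["a", "b", "c"]))       # False
print(is_numeric_sketch([True, False, False]))  # False
```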

dosertools.data_processing.array.is_dataframe_column_numeric(dataset: pandas.core.frame.DataFrame, column: str) bool[source]

Return True if column in dataset is float or int (numeric), otherwise False

Parameters
  • dataset (pandas.DataFrame) – the dataframe that contains at least the column “column”

  • column (str) – name of column to check if numeric

Returns

is_dataframe_column_numeric (bool) – True if column in dataset is float or int, otherwise False

Examples

Given dataframe ‘df’ with column ‘a’ with values [-1,0,1,2] and ‘b’ with values [‘c’,1,1,1.2], the following would result from use of the function.

is_dataframe_column_numeric(df,'a') = True
is_dataframe_column_numeric(df,'b') = False
is_dataframe_column_numeric(df,'c') --> KeyError

dosertools.data_processing.csv module

dosertools.data_processing.csv.csv_to_dataframe(csv: str, fname_format: str, sampleinfo_format: str, optional_settings: dict = {}) pandas.core.frame.DataFrame[source]

Reads in a csv into a dataframe with sample parameters.

Parameters
  • csv (string) – Path to csv file to import.

  • tc_bounds (np.array) – Two value array containing the upper and lower bounds in “D/D0” where tc will be found in between.

  • fname_format (str) – The format of the fname with parameter names separated by the delimiter specified by fname_split. Example: “date_sampleinfo_fps_run”

  • sampleinfo_format (str) – The format of the sampleinfo section of the fname, separated by the delimiter specified by sample_split.

  • optional_settings (dict) – A dictionary of optional settings.

Optional Settings and Defaults
  • fname_split (string) – The delimiter for splitting folder/file names, used in fname_format. Default is “_”.

  • sample_split (string) – The delimiter for splitting the sampleinfo tag in folder/file names, used in sampleinfo_format. Default is “-”.

Returns

csv_to_dataframe (pd.DataFrame) – dataframe with data from csv, sample information from filename, strain rate and critical time calculated
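To make the roles of fname_format, sampleinfo_format, fname_split and sample_split concrete, here is a small sketch of how a filename could be mapped onto its tags. The function name, the example filename and the sampleinfo tags “polymer-mw-conc” are hypothetical; the package's own parsing may differ.

```python
def parse_fname_sketch(fname, fname_format, sampleinfo_format,
                       fname_split="_", sample_split="-"):
    """Map the tags in fname_format (and sampleinfo_format) onto the pieces of fname."""
    tags = fname_format.split(fname_split)
    values = fname.split(fname_split)
    if len(tags) != len(values):
        raise ValueError("fname does not match fname_format")
    params = dict(zip(tags, values))
    # Expand the sampleinfo tag into its own sub-parameters.
    if "sampleinfo" in params:
        sample_tags = sampleinfo_format.split(sample_split)
        sample_values = params["sampleinfo"].split(sample_split)
        if len(sample_tags) != len(sample_values):
            raise ValueError("sampleinfo does not match sampleinfo_format")
        params.update(zip(sample_tags, sample_values))
    return params


# Hypothetical filename (extension removed) and formats, for illustration only.
params = parse_fname_sketch(
    "20211004_PEO-1M-0.05_25000_2",
    "date_sampleinfo_fps_run",
    "polymer-mw-conc",
)
print(params["fps"], params["polymer"], params["conc"])  # 25000 PEO 0.05
```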

dosertools.data_processing.csv.generate_df(csv_location: Union[str, bytes, os.PathLike], fname_format: str, sampleinfo_format: str, optional_settings: dict = {}) pandas.core.frame.DataFrame[source]

Reads in all csvs and processes them into a dataframe.

Reads in data from all csvs in csv_location, processes each (adding strain rate, critical time, diameter at critical time, and parameters from the filename), and puts all data into one dataframe. Loops csv_to_dataframe over all csvs in the folder.

Parameters
  • csv_location (path-like) – folder in which csvs to process are stored

  • tc_bounds (np.array) – two value array containing the upper and lower bounds in “D/D0” where tc will be found in between

  • fname_format (str) – the format of the fname with parameter names separated by the delimiter specified by fname_split. Example: “date_sampleinfo_fps_run”

  • sampleinfo_format (str) – the format of the sampleinfo section of the fname separated by the delimiter specified by sample_split

  • optional_settings (dict) – A dictionary of optional settings.

Returns

generate_df (pd.DataFrame) – dataframe containing data from all csvs in csv_location

Optional Settings and Defaults

verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.

dosertools.data_processing.csv.get_csvs(csv_location: Union[str, bytes, os.PathLike]) list[source]

Returns list of csvs in csv_location.

Parameters

csv_location (path-like) – path to a location containing desired csvs

Returns

get_csvs (list) – sorted list of csvs in csv_location as path strings
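A sorted listing like the one described can be produced with the standard library. This sketch assumes that only files directly inside the folder with a “.csv” extension count; list_csvs_sketch is a hypothetical name.

```python
import glob
import os
import tempfile


def list_csvs_sketch(csv_location):
    """Return a sorted list of path strings for the .csv files in csv_location."""
    # Assumption: no recursion into subfolders.
    pattern = os.path.join(os.fspath(csv_location), "*.csv")
    return sorted(glob.glob(pattern))


# Demonstration in a temporary folder with two csvs and one unrelated file.
with tempfile.TemporaryDirectory() as folder:
    for name in ("b.csv", "a.csv", "notes.txt"):
        open(os.path.join(folder, name), "w").close()
    found = [os.path.basename(path) for path in list_csvs_sketch(folder)]

print(found)  # ['a.csv', 'b.csv']
```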

dosertools.data_processing.extension module

dosertools.data_processing.extension.add_critical_time(dataset: pandas.core.frame.DataFrame, optional_settings: dict = {}) pandas.core.frame.DataFrame[source]

Finds critical time from maximum in strain rate, adds relevant columns.

Finds the critical time from the maximum in the strain rate within the bounds in D/D0 specified by tc_bounds. Adds the columns “tc (s)” (critical time), “t-tc (s)” (time past critical time), and “Dtc/D0” (diameter at critical time divided by initial diameter) to the dataset.

Parameters
  • dataset (pandas.DataFrame) – dataset to which to add the “tc (s)”, “t-tc (s)”, and “Dtc/D0” columns; must contain “D/D0”, “time (s)”, and “strain rate (1/s)” columns

  • optional_settings (dict) – A dictionary of optional settings.

Optional Settings and Defaults

tc_bounds (2 element list of floats) – [start, end] The D/D0 to bound the start and end for finding the critical time. Default is [0.3,0.07].

Returns

add_critical_time (pd.DataFrame) – dataset with “tc (s)”, “t-tc (s)”, and “Dtc/D0” columns added
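The idea of locating the critical time can be sketched as follows. This is an illustration only: it assumes the bounds are applied as a simple value filter on “D/D0”, which may not match how the package selects the window, and critical_time_sketch is a hypothetical name. The demo numbers are made up.

```python
import pandas as pd


def critical_time_sketch(dataset: pd.DataFrame, tc_bounds=(0.3, 0.07)) -> pd.DataFrame:
    """Add "tc (s)", "t-tc (s)" and "Dtc/D0" columns based on the strain-rate maximum."""
    start, end = tc_bounds
    # Assumption: the search window is every row with end <= D/D0 <= start.
    in_bounds = dataset[(dataset["D/D0"] <= start) & (dataset["D/D0"] >= end)]
    # Row label of the largest strain rate inside the window.
    tc_label = in_bounds["strain rate (1/s)"].idxmax()
    tc = dataset.loc[tc_label, "time (s)"]
    result = dataset.copy()
    result["tc (s)"] = tc
    result["t-tc (s)"] = result["time (s)"] - tc
    result["Dtc/D0"] = dataset.loc[tc_label, "D/D0"]
    return result


# Made-up data: the largest strain rate overall (row 1) lies outside the window.
demo = pd.DataFrame({
    "time (s)": [0.0, 0.1, 0.2, 0.3, 0.4],
    "D/D0": [1.0, 0.5, 0.2, 0.1, 0.05],
    "strain rate (1/s)": [1.0, 9.0, 4.0, 7.0, 3.0],
})
result = critical_time_sketch(demo)
print(result["tc (s)"].iloc[0], result["Dtc/D0"].iloc[0])  # 0.3 0.1
```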

dosertools.data_processing.extension.add_strain_rate(dataset: pandas.core.frame.DataFrame) pandas.core.frame.DataFrame[source]

Calculates strain rate from “D/D0” and “time (s)” data and adds it to the dataset.

Using the formula -2(d(D/D0)/dt)/(D/D0) for the strain rate, calculates the strain rate at each point in dataset using np.gradient for the derivative. Removes rows where the strain rate is infinite/NaN from the dataset. Returns a dataframe with all existing columns and the new strain rate (1/s) column.

Parameters

dataset (pandas.DataFrame) – dataset to which to add the “strain rate (1/s)” column; must contain “D/D0” and “time (s)” columns

Returns

add_strain_rate (pandas.DataFrame) – dataset with strain rate (1/s) column added and all rows with infinite/NaN removed
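The formula -2(d(D/D0)/dt)/(D/D0) with np.gradient can be sketched as below. It is a minimal illustration, not the package's code; strain_rate_sketch is a hypothetical name. The synthetic data follow an exponential decay with a relaxation time of 0.1 s, for which the strain rate should be close to 2/(3*0.1), and the last point is set to zero to show a non-finite row being dropped.

```python
import numpy as np
import pandas as pd


def strain_rate_sketch(dataset: pd.DataFrame) -> pd.DataFrame:
    """Add a "strain rate (1/s)" column and drop rows where it is infinite or NaN."""
    d = dataset["D/D0"].to_numpy(dtype=float)
    t = dataset["time (s)"].to_numpy(dtype=float)
    # Division by zero is expected where D/D0 is 0; silence the warnings and filter later.
    with np.errstate(divide="ignore", invalid="ignore"):
        rate = -2 * np.gradient(d, t) / d
    result = dataset.copy()
    result["strain rate (1/s)"] = rate
    return result[np.isfinite(result["strain rate (1/s)"])]


t = np.linspace(0.0, 1.0, 101)
d = np.exp(-t / (3 * 0.1))  # synthetic exponential thinning
d[-1] = 0.0                 # a zero reading produces an infinite strain rate
demo = pd.DataFrame({"time (s)": t, "D/D0": d})
result = strain_rate_sketch(demo)
print(len(demo), len(result))  # 101 100
```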

dosertools.data_processing.extension.truncate_data(dataset: pandas.core.frame.DataFrame, before: bool = True) pandas.core.frame.DataFrame[source]

Truncates a dataset before/after the longest block of continuous zeroes.

Given a dataset, truncates the dataset before/after (depending on the True/False value of before) the longest block of continuous zeroes in the “D/D0” column. The longest block of zeroes should occur after the liquid bridge breaks and the readout is no longer accurate.

Parameters
  • dataset (pd.DataFrame) – Dataframe containing data to truncate. Dataframe must contain “D/D0” column.

  • before (bool, optional) – True (default) if truncation should occur at the last nonzero value before the longest block of zeroes. False if truncation should occur at the last zero in the longest block of zeroes.

Returns

truncate_data (pd.DataFrame) – Dataframe with truncated data.

Examples

DataFrame ex1:

time    D/D0
0       1
0.1     0.9
0.2     0.8
0.3     0
0.4     0.6
0.5     0.4
0.6     0
0.7     0
0.8     0
0.9     0
1.0     0.1
1.1     0.2
1.2     0
1.3     0
1.4     0.1

truncate_data(ex1)

time    D/D0
0       1
0.1     0.9
0.2     0.8
0.3     0
0.4     0.6
0.5     0.4

truncate_data(ex1, False)

time    D/D0
0       1
0.1     0.9
0.2     0.8
0.3     0
0.4     0.6
0.5     0.4
0.6     0
0.7     0
0.8     0
0.9     0

DataFrame ex2:

time
0
0.1

truncate_data(ex2) --> KeyError
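The truncation rule can be sketched by locating runs of zeros and cutting at the longest one. This is an illustration that reproduces the ex1 results above; truncate_sketch is a hypothetical name, and returning the data unchanged when no zeros are present is an assumption.

```python
import numpy as np
import pandas as pd


def truncate_sketch(dataset: pd.DataFrame, before: bool = True) -> pd.DataFrame:
    """Cut the dataset before (or after) its longest run of zeros in "D/D0"."""
    is_zero = dataset["D/D0"].to_numpy() == 0  # a missing column raises KeyError
    # Pad the mask so runs at either edge still produce a start and a stop.
    padded = np.concatenate(([False], is_zero, [False]))
    changes = np.diff(padded.astype(int))
    starts = np.flatnonzero(changes == 1)
    stops = np.flatnonzero(changes == -1)
    if len(starts) == 0:
        return dataset  # assumption: nothing to truncate without zeros
    longest = np.argmax(stops - starts)
    cut = starts[longest] if before else stops[longest]
    return dataset.iloc[:cut]


ex1 = pd.DataFrame({
    "time": [round(0.1 * i, 1) for i in range(15)],
    "D/D0": [1, 0.9, 0.8, 0, 0.6, 0.4, 0, 0, 0, 0, 0.1, 0.2, 0, 0, 0.1],
})
print(truncate_sketch(ex1)["time"].tolist())         # [0.0, 0.1, 0.2, 0.3, 0.4, 0.5]
print(len(truncate_sketch(ex1, before=False)))       # 10
```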

dosertools.data_processing.figures module

dosertools.data_processing.figures.layout_time_csvs(df: pandas.core.frame.DataFrame, plot_normalized: bool) holoviews.element.geom.Points[source]

Plots a time vs D/D0 graph of all samples and runs in df.

Plots with raw (time) or normalized (t - t_c) depending on the value of plot_normalized.

Parameters
  • df (pd.DataFrame) – Dataframe of D/D0 and time data with sample and run information

  • plot_normalized (bool) – True to normalize time by t_c, the critical time, and plot t - tc on the x-axis. False to plot raw time on the x-axis.

Returns

hv_layout (hv.Points) – Set of plots for each run and sample included in df

dosertools.data_processing.figures.layout_viscosity_csvs(df: pandas.core.frame.DataFrame) holoviews.element.geom.Points[source]

Plots a strain vs (elongational viscosity / surface tension) graph of all samples and runs in df.

Parameters

df (pd.DataFrame) – Dataframe with (elongational viscosity / surface tension) and strain with sample and run information

Returns

hv_layout (hv.Points) – Set of plots for each run and sample included in df

dosertools.data_processing.figures.save_figure(figure: holoviews.element.geom.Points, figure_name: str, summary_folder: Union[str, bytes, os.PathLike], optional_settings: dict = {}) None[source]

Saves the figure as an .html file which enables interactivity

Parameters
  • figure (hv.Points) – Set of plots for each run and sample included in df

  • figure_name (string) – Filename with which to save the figure

  • summary_folder (Path) – Location to save the figure

  • optional_settings (dict) – Dictionary of optional settings.

Optional Settings and Defaults

verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.

Returns

None, file saved to disk

dosertools.data_processing.fitting module

dosertools.data_processing.fitting.annotate_summary_df(fitting_results_list: list, header_params: dict) pandas.core.frame.DataFrame[source]

Do we want to bring other columns with us like ion, polymer identity, etc? How to code that?

Parameters
  • fitting_results_list (list) – generated by find_EC_slope

  • header_params (dict) – Contains the information profiled in the samplename

Returns

lambdaE_df (pd.DataFrame) – dataframe containing lambdaE relaxation time for each run from the input df

dosertools.data_processing.fitting.calculate_elongational_visc(df: pandas.core.frame.DataFrame, summary_df: pandas.core.frame.DataFrame, optional_settings: dict = {}) pandas.core.frame.DataFrame[source]

Calculates the quantity (elongational viscosity / surface tension) for each moment in the DOS dataset.

Parameters
  • df (pd.DataFrame) – Contains D/D0, time, t - tc, strain rate, D(tc)/D0, etc for multiple runs and samples generated from data_processing.csv.generate_df

  • summary_df (pd.DataFrame) – Contains relaxation time, D(t_c)/D0, and sample info for all runs and samples generated from data_processing.fitting.make_summary_dataframe

  • optional_settings (dict) – A dictionary of optional settings.

Optional Settings and Defaults

needle_diameter_mm (float) – The needle outer diameter in millimeters. Default is 0.7176 mm (22G needle).

Returns

dataset_w_visc (pd.DataFrame) – Dataset with the additional values calculated by this function: strain and (elongational viscosity / surface tension)

dosertools.data_processing.fitting.derivative_EC_fit(Dtc_D0: float, lambdaE: float, time: float, tc: float) float[source]

Calculates the derivative of the elasto-capillary region.

D(t)/D0 = D(tc)/D0 * exp(-(t - tc)/(3*LambdaE))

D’(t)/D0 = (-1/(3*LambdaE)) * D(tc)/D0 * exp(-(t - tc)/(3*LambdaE))

Parameters
  • Dtc_D0 (float) – The normalized diameter at which the transition to EC behavior occurs

  • lambdaE (float) – The relaxation time of the polymer solution

  • time (float) – the moment in time to evaluate the derivative

  • tc (float) – the critical time for the experiment; this number is purely empirical (it depends entirely on when in the process the video starts)

Returns

D’(t)/D0 (float)
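The two expressions above translate directly into code. The sketch below uses hypothetical function names and made-up parameter values, and compares the analytic derivative with a central finite difference as a sanity check.

```python
import math


def ec_diameter_sketch(Dtc_D0: float, lambdaE: float, time: float, tc: float) -> float:
    """D(t)/D0 in the elasto-capillary region."""
    return Dtc_D0 * math.exp(-(time - tc) / (3 * lambdaE))


def ec_derivative_sketch(Dtc_D0: float, lambdaE: float, time: float, tc: float) -> float:
    """D'(t)/D0, the time derivative of the expression above."""
    return (-1 / (3 * lambdaE)) * ec_diameter_sketch(Dtc_D0, lambdaE, time, tc)


# Made-up values for illustration only.
Dtc_D0, lambdaE, tc, t = 0.2, 0.01, 0.05, 0.06
analytic = ec_derivative_sketch(Dtc_D0, lambdaE, t, tc)

# Central finite difference of D(t)/D0 around t.
h = 1e-6
numeric = (ec_diameter_sketch(Dtc_D0, lambdaE, t + h, tc)
           - ec_diameter_sketch(Dtc_D0, lambdaE, t - h, tc)) / (2 * h)
print(abs(analytic - numeric) < 1e-6)  # True
```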

dosertools.data_processing.fitting.find_EC_slope(run_dataset: pandas.core.frame.DataFrame, start: float, end: float) Tuple[float, float, float][source]

Finds the exponential decay of the EC region for a single dataset. Also returns the intercept and r value of the fit

Parameters
  • run_dataset (pd.DataFrame) – dataset of a single DOS run containing at least D/D0, time,

  • start (float) – value of D/D0 to start fitting the EC region from

  • end (float) – value of D/D0 to end fitting the EC region with

Returns

slope, intercept, r_value (floats) – slope is the slope of the semilog-transformed version of the dataset. It corresponds to the decaying exponential in the linear-linear version of the data
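A semilog fit of this kind can be sketched with NumPy. This is not the package's implementation: ec_slope_sketch is a hypothetical name, it takes plain arrays instead of a dataframe, and it assumes a natural-log transform. Under the exponential form D(t)/D0 = D(tc)/D0 * exp(-(t - tc)/(3*lambdaE)), the slope of ln(D/D0) against time equals -1/(3*lambdaE), so the relaxation time can be recovered from the slope.

```python
import numpy as np


def ec_slope_sketch(time, d_over_d0, start, end):
    """Fit ln(D/D0) against time for start >= D/D0 >= end; return slope, intercept, r."""
    time = np.asarray(time, dtype=float)
    d = np.asarray(d_over_d0, dtype=float)
    window = (d <= start) & (d >= end)
    log_d = np.log(d[window])
    # Degree-1 polyfit returns the coefficients highest power first.
    slope, intercept = np.polyfit(time[window], log_d, 1)
    r_value = np.corrcoef(time[window], log_d)[0, 1]
    return slope, intercept, r_value


# Synthetic run with a relaxation time of 0.01 s (made-up data).
t = np.linspace(0.0, 0.1, 201)
d = 0.2 * np.exp(-t / (3 * 0.01))
slope, intercept, r_value = ec_slope_sketch(t, d, start=0.1, end=0.045)
lambdaE = -1 / (3 * slope)
print(round(lambdaE, 6))  # 0.01
```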

dosertools.data_processing.fitting.make_summary_dataframe(df: pandas.core.frame.DataFrame, sampleinfo_format: str, optional_settings: dict = {}) pandas.core.frame.DataFrame[source]

Condenses a DOS run into an extensional relaxation time by fitting the EC region (t > tc) to a decaying exponential

Parameters
  • df (pd.DataFrame) – Contains D/D0, time, t - tc, strain rate, D(tc)/D0, etc for multiple runs and samples generated from data_processing.csv.generate_df

  • sampleinfo_format (str) – the format of the sampleinfo section of the filename separated by the delimiter specified by sample_split

  • optional_settings (dict) – A dictionary of optional settings.

Optional Settings and Defaults

fitting_bounds (2 element list of floats) – [start, end] The D/D0 to bound the start and end of fitting of EC region. Default is [0.1, 0.045].

Returns

summary_df (pd.DataFrame) – dataframe containing lambdaE (relaxation time) and R(t_c)/R_0 for each run from the input df, along with their sample info

dosertools.data_processing.fitting.save_processed_df(df: pandas.core.frame.DataFrame, save_location: Union[str, bytes, os.PathLike], optional_settings: dict = {})[source]

Saves the processed dataset from a batch of videos.

The summary dataset includes the fluid properties, the ‘raw’ diameter vs time data, and the sampleinfo from the filename.

Parameters
  • df (pd.DataFrame) – Raw dataset with calculated columns for strain rate, strain, etc, and parsed sampleinfo columns.

  • save_location (path-like) – Path to folder in which to save the csv.

  • optional_settings (dict) – A dictionary of optional settings.

Optional Settings and Defaults
  • skip_existing (bool) – Determines the behavior when a file already appears to exist when a function would generate it. True to skip any existing files. False to overwrite (or delete and then write, where overwriting would generate an error). Default is True.

  • verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.

  • summary_filename (string) – The base filename (no extension) for saving the summary csvs. If not provided, will be generated automatically based on the current date and time. Default is “” to trigger automatic generation.

Returns

  • filename_string (string) – Filename at which the annotated dataframe was saved.

  • Saves file to disk.

dosertools.data_processing.fitting.save_summary_df(summary_df: pandas.core.frame.DataFrame, save_location: Union[str, bytes, os.PathLike], optional_settings: dict = {})[source]

Saves the summary dataset from a large processed batch of videos.

The summary dataset includes only the fluid properties, not the ‘raw’ diameter vs time data.

Parameters
  • summary_df (pd.DataFrame) – Contains LambdaE, D(tc)/D0, and sample info for multiple runs and samples

  • save_location (path-like) – path to folder in which to save the csv

  • filename (string) – Currently not used, but this will be the name to give to the summary dataset csv

  • optional_settings (dict) – A dictionary of optional settings.

Optional Settings and Defaults
  • skip_existing (bool) – Determines the behavior when a file already appears to exist when a function would generate it. True to skip any existing files. False to overwrite (or delete and then write, where overwriting would generate an error). Default is True.

  • verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.

  • summary_filename (string) – The base filename (no extension) for saving the summary csvs. If not provided, will be generated automatically based on the current date and time. Default is “” to trigger automatic generation.

Returns

  • filename_string (string) – Filename at which the summary dataframe was saved.

  • Saves file to disk.

dosertools.data_processing.integration module

dosertools.data_processing.integration.binaries_to_csvs(images_folder: Union[str, bytes, os.PathLike], csv_folder: Union[str, bytes, os.PathLike], summary_folder: Union[str, bytes, os.PathLike], short_fname_format: str, sampleinfo_format: str, optional_settings: dict = {})[source]

Converts binary image folders into csvs of D/D0 vs. time.

Given a folder of folders of binary images, converts each set of binary images into a csv of D/D0 vs. time, retaining information in the filename.

Parameters
  • images_folder (path-like) – Path to a folder in which the results of image processing were saved (i.e. the folders of binary images).

  • csv_folder (path-like) – Path to a folder in which to save the csv containing D/D0 vs. time.

  • short_fname_format (str) – The format of the fname with parameter names separated by the delimiter specified by fname_split, with only tags present in the names of the folders in images_folder. Should have “vtype” and “remove” tags removed compared to videos_to_binaries. Must contain “fps” tag. Example: “date_sampleinfo_fps_run”

  • optional_settings (dict) – A dictionary of optional settings.

Optional Settings and Defaults
  • verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.

  • fname_split (string) – The delimiter for splitting folder/file names, used in fname_format. Default is “_”.

dosertools.data_processing.integration.csvs_to_summaries(csv_folder: Union[str, bytes, os.PathLike], summary_folder: Union[str, bytes, os.PathLike], short_fname_format: str, sampleinfo_format: str, optional_settings: dict = {})[source]

Processes the raw csvs and determines elongational relaxation time, D(tc)/D0, and elongational viscosity.

Parameters
  • csv_folder (path-like) – Path to a folder in which to find the csv containing D/D0 vs. time.

  • summary_folder (path-like) – Path to a folder in which to save the csv of the summary and the annotated dataset

  • short_fname_format (str) – The format of the fname with parameter names separated by the delimiter specified by fname_split. Format should not have “vtype” and “remove” tags, because csvs will not have those formatting tags still attached. Example: “date_sampleinfo_fps_run”

  • sampleinfo_format (str) – The format of the sampleinfo section of the fname separated by the delimiter specified by sample_split.

  • optional_settings (dict) – A dictionary of optional settings.

Optional Settings and Defaults
  • verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.

  • fname_split (string) – The delimiter for splitting folder/file names, used in fname_format. Default is “_”.

  • sample_split (string) – The delimiter for splitting the sampleinfo tag in folder/file names, used in sampleinfo_format. Default is “-”.

  • fitting_bounds (2 element list of floats) – [start, end] The D/D0 to bound the start and end of fitting of EC region. Default is [0.1, 0.045].

  • tc_bounds (2 element list of floats) – [start, end] The D/D0 to bound the start and end for finding the critical time. Default is [0.3,0.07].

dosertools.data_processing.integration.multiprocess_binaries_to_csvs(subfolder_index: int, subfolders: list, images_folder: Union[str, bytes, os.PathLike], csv_folder: Union[str, bytes, os.PathLike], short_fname_format: str, tic: float, optional_settings: dict = {}) None[source]

Converts binary image folders into csvs of D/D0 vs. time.

Given a folder of folders of binary images, converts each set of binary images into a csv of D/D0 vs. time, retaining information in the filename.

For multiprocessing to work properly, the function that invokes the pool of processors and the function that uses them need to be defined separately. Thus, this function is here, and is called in binaries_to_csvs.
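The split between a worker function and the function that creates the pool can be sketched with the standard library. The names process_subfolder and process_all, the prefix argument and the folder names are hypothetical; the worker here only builds an output filename instead of doing any image processing.

```python
import multiprocessing
from functools import partial


def process_subfolder(subfolder_index, subfolders, prefix):
    """Worker: handle one subfolder, selected by its index in the shared list."""
    # Defined at module level so that it can be pickled and sent to worker processes.
    return f"{prefix}{subfolders[subfolder_index]}.csv"


def process_all(subfolders, prefix, cpu_count=2):
    """Driver: create the pool and hand each index to the worker."""
    worker = partial(process_subfolder, subfolders=subfolders, prefix=prefix)
    with multiprocessing.Pool(cpu_count) as pool:
        return pool.map(worker, range(len(subfolders)))


if __name__ == "__main__":
    # The guard keeps child processes from re-running the driver on import.
    print(process_all(["run1", "run2", "run3"], "csv/"))
```

Passing an index plus the full list, as the signature above does, lets a verbose worker report its position in the batch.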

Parameters
  • subfolder_index (int) – Index to keep track of which folder we are currently processing

  • subfolders (list of folders) – List of folders that contain the binaries that this function is reading to produce Diameter and time data

  • images_folder (path-like) – Path to a folder in which the results of image processing were saved (i.e. the folders of binary images).

  • csv_folder (path-like) – Path to a folder in which to save the csv containing D/D0 vs. time.

  • short_fname_format (str) – The format of the fname with parameter names separated by the delimiter specified by fname_split, with only tags present in the names of the folders in images_folder. Should have “vtype” and “remove” tags removed compared to videos_to_binaries. Must contain “fps” tag. Example: “date_sampleinfo_fps_run”

  • tic (float) – Stores the time that the processing began at. Used in verbose mode to determine how long processing takes

  • optional_settings (dict) – A dictionary of optional settings.

Optional Settings and Defaults
  • verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.

  • fname_split (string) – The delimiter for splitting folder/file names, used in fname_format. Default is “_”.

dosertools.data_processing.integration.multiprocess_vid_to_bin(file_number: int, fnames: list, exp_videos: list, bg_videos: list, images_folder: Union[str, bytes, os.PathLike], tic: float, optional_settings: dict = {}) None[source]

Converts videos in given folder into binary images.

Matches videos in videos_folder into experimental and background pairs, and converts those paired videos into background-subtracted binaries.

For multiprocessing to work properly, the function that invokes the pool of processors and the function that uses them need to be defined separately. Thus, this function is here, and is called in videos_to_binaries.

Parameters
  • file_number (int) – Index to keep track of which folder or video we are processing

  • fnames (list of strings) – List of base folder names for each matched pair of experimental and background folders.

  • exp_videos (list of paths) – List of paths to experimental video folders that were matched with backgrounds.

  • bg_videos (list of paths) – List of paths to background video folders matched with exp_videos.

  • images_folder (path-like) – Path to a folder in which to save the results of image processing, binaries and optional cropped and background-subtracted images.

  • tic (float) – Stores the time that the processing began at. Used in verbose mode to determine how long processing takes.

  • optional_settings (dict) – A dictionary of optional settings.

Optional Settings and Defaults
  • verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.

  • experiment_tag (string) – The tag for identifying experimental videos. May be empty (“”). Default is “exp”.

  • background_tag (string) – The tag for identifying background videos. May not be empty. Default is “bg”.

  • one_background (bool) – True to use one background for a group of experiments only differing by run number. False to pair backgrounds and experiments 1:1. Default is False.

  • save_crop (bool) – True to save intermediate cropped images (i.e. experimental video images cropped but not background-subtracted or binarized). Default is False.

  • save_bg_sub (bool) – True to save background-subtracted images (i.e. experimental video images cropped and background-subtracted but not binarized). Default is False.

  • skip_existing (bool) – Determines the behavior when a file already appears to exist when a function would generate it. True to skip any existing files. False to overwrite (or delete and then write, where overwriting would generate an error). Default is True.

  • image_extension (string) – The extension for images in the video folder. TIFF recommended. Default is “tif”. Do not include “.”.

dosertools.data_processing.integration.set_defaults(optional_settings: dict = {}) dict[source]

Sets default values for unset keys in optional_settings.

Parameters

optional_settings (dict) – Dictionary of optional settings.

Returns

settings (dict) – Dictionary with optional_settings and default values, prioritizing optional_settings values.

Optional Settings and Defaults
  • nozzle_row (int) – Row to use for determining the nozzle diameter. Default is 1.

  • crop_width_coefficient (float) – Multiplied by the calculated nozzle_diameter to determine the buffer on either side of the observed nozzle edges to include in the cropped image. Default is 0.02.

  • crop_height_coefficient (float) – Multiplied by the calculated nozzle_diameter to determine the bottom row that will be included in the cropped image. Default is 2.

  • crop_nozzle_coefficient (float) – Multiplied by the calculated nozzle_diameter to determine the top row of the cropped image. Default is 0.15.

  • fname_split (string) – The delimiter for splitting folder/file names, used in fname_format. Default is “_”.

  • sample_split (string) – The delimiter for splitting the sampleinfo tag in folder/file names, used in sampleinfo_format. Default is “-”.

  • experiment_tag (string) – The tag for identifying experimental videos. May be empty (“”). Default is “exp”.

  • background_tag (string) – The tag for identifying background videos. May not be empty. Default is “bg”.

  • one_background (bool) – True to use one background for a group of experiments only differing by run number. False to pair backgrounds and experiments 1:1. Default is False.

  • bg_drop_removal (bool) – True to remove the background drop from the background that is subtracted from the image before binarization. False to not alter the background. Default is False.

  • save_crop (bool) – True to save intermediate cropped images (i.e. experimental video images cropped but not background-subtracted or binarized). Default is False.

  • save_bg_sub (bool) – True to save background-subtracted images (i.e. experimental video images cropped and background-subtracted but not binarized). Default is False.

  • fitting_bounds (2 element list of floats) – [start, end] The D/D0 to bound the start and end of fitting of EC region. Default is [0.1, 0.045].

  • tc_bounds (2 element list of floats) – [start, end] The D/D0 to bound the start and end for finding the critical time. Default is [0.3,0.07].

  • needle_diameter_mm (float) – The needle outer diameter in millimeters. Default is 0.7176 mm (22G needle).

  • skip_existing (bool) – Determines the behavior when a file already appears to exist when a function would generate it. True to skip any existing files. False to overwrite (or delete and then write, where overwriting would generate an error). Default is True.

  • verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.

  • image_extension (string) – The extension for images in the video folder. TIFF recommended. Default is “tif”. Do not include “.”.

  • summary_filename (string) – The base filename (no extension) for saving the summary csvs. If not provided, will be generated automatically based on the current date and time. Default is “” to trigger automatic generation.

  • cpu_count (int) – How many cores to use for multithreading/multiprocessing. If nothing provided, default will be the maximum number of cores returned from os.cpu_count()
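The merge of user-supplied settings over defaults can be sketched with a plain dictionary update. Only a handful of the defaults listed above are reproduced, and set_defaults_sketch is a hypothetical name rather than the package function.

```python
def set_defaults_sketch(optional_settings=None):
    """Return defaults overlaid with any values given in optional_settings."""
    # A subset of the documented defaults, for illustration.
    defaults = {
        "fname_split": "_",
        "sample_split": "-",
        "tc_bounds": [0.3, 0.07],
        "fitting_bounds": [0.1, 0.045],
        "skip_existing": True,
        "verbose": False,
    }
    settings = dict(defaults)
    # User values take priority over the defaults.
    settings.update(optional_settings or {})
    return settings


settings = set_defaults_sketch({"verbose": True, "tc_bounds": [0.25, 0.1]})
print(settings["verbose"], settings["fname_split"], settings["tc_bounds"])  # True _ [0.25, 0.1]
```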

dosertools.data_processing.integration.videos_to_binaries(videos_folder: Union[str, bytes, os.PathLike], images_folder: Union[str, bytes, os.PathLike], fname_format: str, optional_settings: dict = {})[source]

Converts videos in given folder into binary images.

Matches videos in videos_folder into experimental and background pairs, and converts those paired videos into background-subtracted binaries.

Parameters
  • videos_folder (path-like) – Path to a folder of experimental and background video folders.

  • images_folder (path-like) – Path to a folder in which to save the results of image processing, binaries and optional cropped and background-subtracted images.

  • fname_format (str) – The format of the fname with parameter names separated by the delimiter specified by fname_split. Must contain the “vtype” tag corresponding to experiment vs. background. Can contain “remove” to remove information that is not relevant or is different between the experimental and background video names and would prevent matching. Example: “date_sampleinfo_fps_run_vtype_remove_remove”

  • sampleinfo_format (str) – The format of the sampleinfo section of the fname separated by the delimiter specified by sample_split.

  • optional_settings (dict) – A dictionary of optional settings.

Optional Settings and Defaults
  • verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.

  • cpu_count (int) – How many cores to use for multithreading/multiprocessing. If not provided, defaults to the number of cores returned by os.cpu_count().

  • experiment_tag (string) – The tag for identifying experimental videos. May be empty (“”). Default is “exp”.

  • background_tag (string) – The tag for identifying background videos. May not be empty. Default is “bg”.

  • one_background (bool) – True to use one background for a group of experiments only differing by run number. False to pair backgrounds and experiments 1:1. Default is False.

  • save_crop (bool) – True to save intermediate cropped images (i.e. experimental video images cropped but not background-subtracted or binarized). Default is False.

  • save_bg_sub (bool) – True to save background-subtracted images (i.e. experimental video images cropped and background-subtracted but not binarized). Default is False.

  • skip_existing (bool) – Determines the behavior when a file already exists that a function would otherwise generate. True to skip any existing files. False to overwrite (or delete and then write, where overwriting would generate an error). Default is True.

  • image_extension (string) – The extension for images in the video folder. TIFF recommended. Default is “tif”. Do not include “.”.
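
To make the role of the “vtype” and “remove” tags concrete, here is a minimal sketch of how folder names could be paired using fname_format. It is an illustration under stated assumptions, not the matching code dosertools uses: match_key and pair_videos are hypothetical helpers, the folder names are invented, and every name is assumed to have exactly one value per tag.

```python
from typing import Dict, List, Tuple


def match_key(name: str, fname_format: str,
              fname_split: str = "_") -> Tuple[str, Tuple[str, ...]]:
    """Return (vtype value, tuple of the fields used for matching)."""
    tags = fname_format.split(fname_split)
    values = name.split(fname_split)
    if len(tags) != len(values):
        raise ValueError(f"{name!r} does not fit format {fname_format!r}")
    vtype = ""
    key = []
    for tag, value in zip(tags, values):
        if tag == "vtype":
            vtype = value  # identifies experiment vs. background
        elif tag != "remove":
            key.append(value)  # "remove" fields are ignored when matching
    return vtype, tuple(key)


def pair_videos(names: List[str], fname_format: str,
                experiment_tag: str = "exp",
                background_tag: str = "bg") -> List[Tuple[str, str]]:
    """Pair each experiment with the background that shares its key (1:1)."""
    experiments: Dict[Tuple[str, ...], str] = {}
    backgrounds: Dict[Tuple[str, ...], str] = {}
    for name in names:
        vtype, key = match_key(name, fname_format)
        if vtype == background_tag:
            backgrounds[key] = name
        elif vtype == experiment_tag:
            experiments[key] = name
    return [(experiments[k], backgrounds[k])
            for k in sorted(experiments) if k in backgrounds]


# Invented folder names following the documented example format.
names = [
    "20210929_sampleA_2000_1_exp_0930_a",
    "20210929_sampleA_2000_1_bg_0931_b",
    "20210929_sampleA_2000_2_exp_0940_c",
    "20210929_sampleA_2000_2_bg_0941_d",
]
pairs = pair_videos(names, "date_sampleinfo_fps_run_vtype_remove_remove")
```

The trailing fields differ between each experiment and its background, which is why they are tagged “remove”: once they are dropped, the remaining date, sampleinfo, fps, and run values form identical keys.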

dosertools.data_processing.integration.videos_to_csvs(videos_folder: Union[str, bytes, os.PathLike], images_folder: Union[str, bytes, os.PathLike], csv_folder: Union[str, bytes, os.PathLike], summary_folder: Union[str, bytes, os.PathLike], fname_format: str, sampleinfo_format: str, optional_settings: dict = {})[source]

Converts videos in given folder into csvs of D/D0 vs. time.

Matches videos in videos_folder into experimental and background pairs, converts those paired videos into background-subtracted binaries, analyzes the resulting binaries to extract D/D0 vs. time, and saves the results to csvs.

Parameters
  • videos_folder (path-like) – Path to a folder of experimental and background video folders.

  • images_folder (path-like) – Path to a folder in which to save the results of image processing, binaries and optional cropped and background-subtracted images.

  • csv_folder (path-like) – Path to a folder in which to save the csv containing D/D0 vs. time.

  • fname_format (str) – The format of the fname with parameter names separated by the delimiter specified by fname_split. Must contain the “vtype” tag corresponding to experiment vs. background. Can contain “remove” to remove information that is not relevant or is different between the experimental and background video names and would prevent matching. Must contain “fps” tag. ex. “date_sampleinfo_fps_run_vtype_remove_remove”

  • optional_settings (dict) – A dictionary of optional settings.

Optional Settings and Defaults
  • fname_split (string) – The delimiter for splitting folder/file names, used in fname_format. Default is “_”.

  • experiment_tag (string) – The tag for identifying experimental videos. May be empty (“”). Default is “exp”.

  • background_tag (string) – The tag for identifying background videos. May not be empty. Default is “bg”.

  • one_background (bool) – True to use one background for a group of experiments only differing by run number. False to pair backgrounds and experiments 1:1. Default is False.

  • save_crop (bool) – True to save intermediate cropped images (i.e. experimental video images cropped but not background-subtracted or binarized). Default is False.

  • save_bg_sub (bool) – True to save background-subtracted images (i.e. experimental video images cropped and background-subtracted but not binarized). Default is False.

  • skip_existing (bool) – Determines the behavior when a file already exists that a function would otherwise generate. True to skip any existing files. False to overwrite (or delete and then write, where overwriting would generate an error). Default is True.

  • verbose (bool) – Determines whether processing functions print statements as they progress through major steps. True to see print statements, False to hide non-errors/warnings. Default is False.

  • image_extension (string) – The extension for images in the video folder. TIFF recommended. Default is “tif”. Do not include “.”.
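
Because optional_settings only needs the keys that differ from the defaults, the documented defaults above can be thought of as a base dictionary that user values override. The sketch below shows that idea; resolve_settings and DOCUMENTED_DEFAULTS are hypothetical names for illustration, and the library's own handling of defaults may differ.

```python
from typing import Any, Dict, Optional

# Defaults copied from the "Optional Settings and Defaults" list above.
DOCUMENTED_DEFAULTS: Dict[str, Any] = {
    "fname_split": "_",
    "experiment_tag": "exp",
    "background_tag": "bg",
    "one_background": False,
    "save_crop": False,
    "save_bg_sub": False,
    "skip_existing": True,
    "verbose": False,
    "image_extension": "tif",
}


def resolve_settings(optional_settings: Optional[Dict[str, Any]] = None) -> Dict[str, Any]:
    """Overlay user-provided settings on the documented defaults.

    Hypothetical helper; not part of dosertools.
    """
    settings = dict(DOCUMENTED_DEFAULTS)  # copy so the defaults stay intact
    settings.update(optional_settings or {})
    if not settings["background_tag"]:
        # The documentation states that background_tag may not be empty.
        raise ValueError("background_tag may not be empty")
    return settings


# Only the two overridden keys need to be supplied.
settings = resolve_settings({"one_background": True, "verbose": True})
```

A dictionary such as {"one_background": True, "verbose": True} is the kind of value that would be passed as optional_settings.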

dosertools.data_processing.integration.videos_to_summaries(videos_folder: Union[str, bytes, os.PathLike], images_folder: Union[str, bytes, os.PathLike], csv_folder: Union[str, bytes, os.PathLike], summary_folder: Union[str, bytes, os.PathLike], fname_format: str, sampleinfo_format: str, optional_settings: dict = {})[source]

Full integrating function: converts videos to csv files.

Parameters
  • videos_folder (path-like) – Path to a folder of experimental and background video folders.

  • images_folder (path-like) – Path to a folder in which to save the results of image processing, binaries and optional cropped and background-subtracted images.

  • csv_folder (path-like) – Path to a folder in which to save the csv containing D/D0 vs. time.

  • summary_folder (path-like) – Path to a folder in which to save the csv of the summary and the annotated dataset.

  • fname_format (str) – The format of the fname with parameter names separated by the delimiter specified by fname_split. Must contain the “vtype” tag corresponding to experiment vs. background. Can contain “remove” to remove information that is not relevant or is different between the experimental and background video names and would prevent matching. ex. “date_sampleinfo_fps_run_vtype_remove_remove”

  • sampleinfo_format (str) – The format of the sampleinfo section of the fname separated by the delimiter specified by sample_split.

  • optional_settings (dict) – A dictionary of optional settings.
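
A call to videos_to_summaries might be assembled as in the sketch below. The argument names come from the signature above; the folder layout, the sampleinfo_format value, and the choice of optional settings are placeholders, and it is an assumption that this function accepts the same verbose setting documented for the other integration functions. The run_pipeline wrapper is hypothetical and imports dosertools lazily so the argument dictionary can be inspected without the package installed.

```python
from pathlib import Path

# Hypothetical project layout; replace with real folders.
root = Path("dos_data")

call_kwargs = {
    "videos_folder": root / "videos",
    "images_folder": root / "images",
    "csv_folder": root / "csv",
    "summary_folder": root / "summary",
    # Example format string taken from the parameter description above.
    "fname_format": "date_sampleinfo_fps_run_vtype_remove_remove",
    # Placeholder value; the real tags and delimiter depend on sample_split.
    "sampleinfo_format": "material_concentration",
    # Assumed to be accepted here as for the other integration functions.
    "optional_settings": {"verbose": True},
}


def run_pipeline(kwargs: dict):
    """Run the full videos-to-summaries pipeline (requires dosertools)."""
    # Imported inside the function so building call_kwargs has no dependency.
    from dosertools.data_processing.integration import videos_to_summaries
    return videos_to_summaries(**kwargs)
```

Calling run_pipeline(call_kwargs) would then process every matched video pair found under the videos folder.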

Module contents