find_delay
- find_delay.find_delay(array_1, array_2, freq_array_1=1, freq_array_2=1, compute_envelope=True, window_size_env=1000000.0, overlap_ratio_env=0.5, filter_below=None, filter_over=50, resampling_rate=None, window_size_res=10000000.0, overlap_ratio_res=0.5, resampling_mode='cubic', return_delay_format='index', return_correlation_value=False, threshold=0.9, plot_figure=False, plot_intermediate_steps=False, x_format_figure='auto', path_figure=None, mono_channel=0, verbosity=1)
This function tries to find the timestamp at which an excerpt (array_2) begins in a time series (array_1). The computation is performed through cross-correlation. Beforehand, the envelopes of both arrays can be calculated and filtered (recommended for audio files), and the arrays can be resampled (necessary when the sampling rates of the two arrays differ). The function returns the timestamp of the maximal correlation value, or None if this value is below threshold. Optionally, it can also return a second element, the maximal correlation value.
Added in version 1.0.
Changed in version 1.1: Separated the figure generation into a new function _create_figure.
Changed in version 2.0: Changed parameters number_of_windows to window_size.
Changed in version 2.1: Decreased default window_size_res value from 1e8 to 1e7.
Changed in version 2.3: Corrected the figure saving to a file.
Changed in version 2.4: Modified the cross-correlation to look for the excerpt at the edges of the first array. Added the new parameter x_format_figure, which allows HH:MM:SS times on the x-axis.
Changed in version 2.9: array_1 and array_2 can now be strings containing paths to WAV files. Added the parameter mono_channel.
Important
Because it is easy to get confused: this function returns the timestamp in array_1 where array_2 begins. This means that, if you want to align array_1 and array_2, you need to subtract the delay from each timestamp of array_1: that way, the value at the new timestamp 0 in array_1 will be aligned with the value at timestamp 0 in array_2.
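As a minimal illustration of this alignment, assuming a delay expressed in seconds (the sampling frequency, delay and variable names below are made up for the example):

```python
# Hypothetical values: array_1 sampled at 10 Hz, and a delay of 2.5 s,
# as the function might return with return_delay_format="s".
freq_array_1 = 10
delay_s = 2.5

# Original timestamps of array_1, in seconds: 0.0, 0.1, 0.2, ...
timestamps_1 = [i / freq_array_1 for i in range(50)]

# Subtract the delay from every timestamp of array_1.
aligned_timestamps_1 = [t - delay_s for t in timestamps_1]

# The sample that sat at 2.5 s in array_1 (index 25) now sits at 0.0 s,
# which is where the first sample of array_2 is.
print(aligned_timestamps_1[25])
```

After the shift, the samples of array_1 that precede the excerpt simply get negative timestamps.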
Note
Since version 2.4, this function can find excerpts containing data that lies outside the main array. In other words, if the excerpt starts 1 second before the onset of the original array, the function will return a delay of -1 second. However, this should be avoided, as information missing from the original array will result in a lower correlation; with a substantial amount of data missing from the original array, the function may return erroneous results. This is why it is always preferable to use excerpts that are entirely contained in the original array.
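The general idea (cross-correlate, take the lag of the maximum, reject it if below a threshold) can be sketched with NumPy alone. This is an illustration under simplifying assumptions, not the function's actual implementation: naive_find_delay is a hypothetical helper, the normalisation is deliberately crude, and no envelope, filtering or resampling is applied.

```python
import numpy as np

def naive_find_delay(array_1, array_2, threshold=0.9):
    """Hypothetical helper (not part of the library).

    Returns (delay_in_samples, correlation), or (None, None) when the
    best correlation value is below the threshold.
    """
    a = np.asarray(array_1, dtype=float)
    b = np.asarray(array_2, dtype=float)
    # Crude normalisation (an assumption of this sketch): z-score both
    # arrays and divide by the excerpt length.
    a = (a - a.mean()) / a.std()
    b = (b - b.mean()) / b.std()
    corr = np.correlate(a, b, mode="full") / len(b)
    best = int(np.argmax(corr))
    if corr[best] < threshold:
        return None, None
    # In "full" mode, output index i corresponds to lag i - (len(b) - 1),
    # so negative lags (excerpt starting before array_1) are possible.
    return best - (len(b) - 1), float(corr[best])

rng = np.random.default_rng(0)
signal = rng.standard_normal(1000)

inside = signal[300:500]                                          # fully contained
before = np.concatenate([rng.standard_normal(20), signal[:180]])  # starts 20 samples early
unrelated = rng.standard_normal(200)                              # not an excerpt

delay_inside, _ = naive_find_delay(signal, inside, threshold=0.5)
delay_before, _ = naive_find_delay(signal, before, threshold=0.5)
delay_unrelated, _ = naive_find_delay(signal, unrelated, threshold=0.5)
print(delay_inside, delay_before, delay_unrelated)
```

The excerpt that starts before the onset of the main array yields a negative delay, and its correlation value is lower than for the fully contained excerpt because part of its data has no counterpart in the main array.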
- Parameters:
array_1 (list, np.ndarray or str) –
A first array of samples, or a string containing the path to a WAV file. In the latter case, the parameter freq_array_1 will be ignored and the sampling frequency will be read from the WAV file. Note that if the WAV file contains more than one channel, the function will turn the WAV to mono, by default by keeping only the channel with index 0 (see the parameter mono_channel).
Changed in version 2.9.
array_2 (list, np.ndarray or str) –
A second array of samples, smaller than or of equal size to the first one, that is presumed to be an excerpt from the first one. The amplitude, frequency or values do not have to match exactly the ones from the first array. The parameter can also be a string containing the path to a WAV file (see description of parameter array_1).
Changed in version 2.9.
freq_array_1 (int or float, optional) – The sampling frequency of the first array, in Hz (default: 1).
freq_array_2 (int or float, optional) – The sampling frequency of the second array, in Hz (default: 1).
compute_envelope (bool, optional) – If True (default), calculates the envelope of the array values before performing the cross-correlation.
window_size_env (int or None, optional) –
The size of the windows in which to cut the arrays to calculate the envelope. Cutting long arrays into windows speeds up the computation. If this parameter is set to None, the window size will be set to the number of samples. A good value for this parameter is generally 1 million.
Added in version 2.0.
overlap_ratio_env (float or None, optional) – The ratio of samples overlapping between each window. If this parameter is not None, each window will overlap with the previous (and, logically, the next) by a number of samples equal to the number of samples in a window times the overlap ratio. Then, only the central values of each window will be preserved and concatenated; this discards any "edge" effect due to the windowing. If the parameter is set to None or 0, the windows will not overlap. By default, this parameter is set to 0.5, meaning that each window will share half of its values with the previous window, and half of its values with the next.
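The windowing scheme described above can be sketched as follows. This is one possible reading of the description, not the library's code: window_bounds is a hypothetical helper, and the exact way the function rounds window edges or treats the first and last windows is an assumption.

```python
def window_bounds(n_samples, window_size, overlap_ratio):
    """Hypothetical helper: returns (start, end, keep_start, keep_end) tuples.

    Each window covers [start, end); only [keep_start, keep_end) is kept,
    so that the kept parts tile the array without gaps or repeats.
    """
    overlap = int(window_size * (overlap_ratio or 0))
    if overlap >= window_size:
        raise ValueError("overlap_ratio must be lower than 1")
    step = window_size - overlap   # distance between two window starts
    half = overlap // 2            # samples dropped at the start of a window

    bounds = []
    start = 0
    while True:
        end = min(start + window_size, n_samples)
        is_last = end == n_samples
        # The first window keeps its beginning, the last one keeps its end.
        keep_start = start if start == 0 else start + half
        keep_end = end if is_last else end - (overlap - half)
        bounds.append((start, end, keep_start, keep_end))
        if is_last:
            break
        start += step
    return bounds

# 10 samples, windows of 4 samples, half of each window shared with its neighbours.
for bounds in window_bounds(10, 4, 0.5):
    print(bounds)
```

With an overlap ratio of 0.5, each interior window keeps only its central half, and the kept segments concatenate back to the full length of the array.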
filter_below (int or None, optional) – If set, a high-pass filter will be applied to the envelopes before performing the cross-correlation (default: None, no high-pass filtering).
filter_over (int or None, optional) – If set, a low-pass filter will be applied to the envelopes before performing the cross-correlation (default: 50 Hz).
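To give an intuition of what computing and filtering an envelope does to an audio-like signal, here is a sketch using SciPy. It assumes the envelope is the magnitude of the analytic signal (Hilbert transform) and uses a Butterworth low-pass filter; whether the function uses these exact methods internally is an assumption, and the signal is synthetic.

```python
import numpy as np
from scipy.signal import butter, hilbert, sosfiltfilt

fs = 8000                      # sampling frequency of the synthetic signal, in Hz
t = np.arange(fs) / fs         # one second of samples

# A 1000 Hz carrier whose amplitude varies slowly (5 Hz), like a crude audio signal.
modulation = 1 + 0.5 * np.sin(2 * np.pi * 5 * t)
audio = modulation * np.sin(2 * np.pi * 1000 * t)

# Envelope: magnitude of the analytic signal.
envelope = np.abs(hilbert(audio))

# Low-pass filter at 50 Hz (the default value of filter_over),
# applied forwards and backwards to avoid a phase shift.
sos = butter(4, 50, btype="lowpass", fs=fs, output="sos")
filtered_envelope = sosfiltfilt(sos, envelope)

# Compare with the known modulation, away from the edges of the array.
middle = slice(fs // 4, 3 * fs // 4)
max_error = np.max(np.abs(filtered_envelope[middle] - modulation[middle]))
print(max_error)
```

The filtered envelope follows the slow amplitude changes and discards the fast carrier, which is why cross-correlating envelopes tends to be more robust for audio than cross-correlating raw samples.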
resampling_rate (int or None, optional) – The sampling rate at which to downsample the arrays for the cross-correlation. A larger value will result in longer computation times. Setting the parameter to None will not downsample the arrays, which will result in an error if the two arrays do not have the same sampling frequency. If this parameter is None, the next parameters related to resampling can be ignored. A recommended value for this parameter when working with audio files is 1000, as it will speed up the computation of the cross-correlation while still giving a millisecond-precision delay.
window_size_res (int or None, optional) –
The size of the windows in which to cut the arrays. Cutting long arrays into windows speeds up the computation. If this parameter is set to None, the window size will be set to the number of samples. A good value for this parameter is generally 1e7.
Added in version 2.0.
Changed in version 2.1: Decreased default window_size_res value from 1e8 to 1e7.
overlap_ratio_res (float or None, optional) – The ratio of samples overlapping between each window. If this parameter is not None, each window will overlap with the previous (and, logically, the next) by a number of samples equal to the number of samples in a window times the overlap ratio. Then, only the central values of each window will be preserved and concatenated; this discards any "edge" effect due to the windowing. If the parameter is set to None or 0, the windows will not overlap. By default, this parameter is set to 0.5, meaning that each window will share half of its values with the previous window, and half of its values with the next.
resampling_mode (str, optional) –
This parameter allows for various values:
"linear": performs a linear interpolation via numpy.interp. This method, though simple, may not be very precise for upsampling naturalistic stimuli.
"cubic": performs a cubic interpolation via scipy.interpolate.CubicSpline. This method, while smoother than the linear interpolation, may lead to unwanted oscillations near strong variations in the data.
"pchip": performs a monotonic cubic spline interpolation (Piecewise Cubic Hermite Interpolating Polynomial) via scipy.interpolate.PchipInterpolator.
"akima": performs another type of monotonic cubic spline interpolation, using scipy.interpolate.Akima1DInterpolator.
"take": keeps one out of n samples from the original array. While this is the fastest computation, it will be prone to imprecision if the target sampling rate is not an integer divisor of the original frequency.
"interp1d_XXX": uses the function scipy.interpolate.interp1d. The XXX part of the parameter can be replaced by "linear", "nearest", "nearest-up", "zero", "slinear", "quadratic", "cubic", "previous", and "next" (see the documentation of this function for specifics).
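The interpolators named above can be compared directly on a toy signal. The sketch below calls NumPy and SciPy itself rather than the function described here, so it only illustrates how the underlying methods behave; the frequencies and variable names are arbitrary choices for the example.

```python
import numpy as np
from scipy.interpolate import Akima1DInterpolator, CubicSpline, PchipInterpolator

freq_in, freq_out = 10, 25                  # arbitrary rates, in Hz
t_in = np.arange(0, 2, 1 / freq_in)         # 2 seconds sampled at 10 Hz
y_in = np.sin(2 * np.pi * 1.0 * t_in)       # a 1 Hz sine wave

# New timestamps at 25 Hz, kept within the range of the original ones
# (Akima1DInterpolator returns NaN outside that range by default).
t_out = np.arange(0, t_in[-1], 1 / freq_out)

resampled = {
    "linear": np.interp(t_out, t_in, y_in),
    "cubic": CubicSpline(t_in, y_in)(t_out),
    "pchip": PchipInterpolator(t_in, y_in)(t_out),
    "akima": Akima1DInterpolator(t_in, y_in)(t_out),
}

# Largest deviation of each method from the true sine wave.
truth = np.sin(2 * np.pi * 1.0 * t_out)
errors = {name: float(np.max(np.abs(values - truth))) for name, values in resampled.items()}
for name, error in errors.items():
    print(name, error)

# "take"-style downsampling: keep one sample out of n (here 10 Hz -> 5 Hz).
taken = y_in[::2]
```

On a smooth signal like this one all methods stay close to the truth; the differences become visible around sharp transitions, where the cubic spline may overshoot.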
threshold (float, optional) – The threshold of the minimum correlation value between the two arrays to accept a delay as a solution. If multiple delays are over threshold, the delay with the maximum correlation value will be returned. This value should be between 0 and 1; if the maximum found value is below the threshold, the function will return None instead of a timestamp.
return_delay_format (str, optional) –
This parameter can be either "index", "ms", "s", or "timedelta":
If "index" (default), the function will return the index in array_1 at which array_2 has the highest cross-correlation value.
If "ms", the function will return the timestamp in array_1, in milliseconds, at which array_2 has the highest cross-correlation value.
If "s", the function will return the timestamp in array_1, in seconds, at which array_2 has the highest cross-correlation value.
If "timedelta", the function will return the timestamp in array_1 at which array_2 has the highest cross-correlation value as a datetime.timedelta object. Note that, in the case where the result is negative, the timedelta format may give unexpected display results (-1 second returns -1 days, 86399 seconds).
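The display quirk of negative timedelta objects mentioned above comes from the Python standard library and can be reproduced without calling the function:

```python
from datetime import timedelta

# A delay of -1 second, as could be returned for an excerpt starting
# one second before the onset of array_1.
delay = timedelta(seconds=-1)

print(delay)                  # -1 day, 23:59:59
print(repr(delay))            # datetime.timedelta(days=-1, seconds=86399)

# total_seconds() gives back the signed value.
print(delay.total_seconds())  # -1.0
```

If negative delays are expected, calling total_seconds() on the result (or requesting the "s" or "ms" format) avoids the confusing display.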
return_correlation_value (bool, optional) – If True, the function returns a second value: the correlation value at the returned delay. This value will be None if it is below the specified threshold.
plot_figure (bool, optional) – If set to True, plots a graph showing the result of the cross-correlation using Matplotlib. Note that plotting the figure causes an interruption of the code execution.
plot_intermediate_steps (bool, optional) – If set to True, plots the original arrays, the envelopes (if calculated) and the resampled arrays (if calculated) in addition to the cross-correlation.
x_format_figure (str, optional) –
If set to "time", the values on the x axes of the output will take the HH:MM:SS format (or MM:SS if the time series are less than one hour long). If set to "float", the values on the x axes will be displayed as floats (unit: second). If set to "auto" (default), the format of the values on the x axes will depend on the value of return_delay_format.
Added in version 2.4.
path_figure (str or None, optional) – If set, saves the figure at the given path.
mono_channel (int or str, optional) –
Defines the method used to convert multiple-channel WAV files to mono, if one of the parameters array_1 or array_2 is a path pointing to a WAV file. By default, this parameter value is 0: the channel with index 0 in the WAV file is used as the array, while all the other channels are discarded. This value can be any of the channel indices (using 1 will preserve the channel with index 1, etc.). This parameter can also take the value "average": in that case, a new channel is created by averaging the values of all the channels of the WAV file. Note that this parameter applies to both arrays: if you need to select different channels for each WAV file, open the files before calling the function and pass the samples and frequencies as parameters.
Added in version 2.9.
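When a different channel is needed for each file, the mono conversion can be done by hand before calling the function. The sketch below assumes samples shaped (n_samples, n_channels), which is what scipy.io.wavfile.read returns for multi-channel files; to_mono is a hypothetical helper written for this example, not a function of the library.

```python
import numpy as np

def to_mono(samples, mono_channel=0):
    """Hypothetical helper: reduce (n_samples, n_channels) samples to one channel."""
    samples = np.asarray(samples)
    if samples.ndim == 1:
        # Already mono: nothing to do.
        return samples
    if isinstance(mono_channel, str):
        if mono_channel != "average":
            raise ValueError('mono_channel must be a channel index or "average"')
        # Average all channels, sample by sample.
        return samples.mean(axis=1)
    # Keep a single channel, discard the others.
    return samples[:, mono_channel]

# Toy stereo data standing in for the contents of two WAV files.
stereo = np.array([[0, 10], [2, 20], [4, 40]])

left = to_mono(stereo, 0)
right = to_mono(stereo, 1)
average = to_mono(stereo, "average")
print(left, right, average)
```

The resulting one-dimensional arrays can then be passed as array_1 and array_2, together with the sampling frequencies of the files as freq_array_1 and freq_array_2.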
verbosity (int, optional) –
Sets how much feedback the code will provide in the console output:
0: Silent mode. The code won’t provide any feedback, apart from error messages.
1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.
2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.
- Returns:
int, float, timedelta or None – The sample index, timestamp or timedelta of array_1 at which array_2 can be found (defined by the parameter return_delay_format), or None if array_2 is not found in array_1 (i.e. the maximal correlation value is below threshold).
float or None, optional – Optionally, if return_correlation_value is True, the correlation value at the corresponding index/timestamp.