Audio

Description

Default class for audio recordings matching a Sequence, typically the voice of the subject of the motion capture. This class provides a variety of transformations of the audio stream, such as extracting the envelope, pitch and formants of the speech.

Initialisation

class classes.audio.Audio(path_or_samples, frequency=None, name=None, verbosity=1)

Default class for audio clips matching a Sequence, typically the voice of the subject of the motion capture. This class provides a variety of transformations of the audio stream, such as extracting the envelope, pitch and formants of the speech.

New in version 2.0.

Parameters:
  • path_or_samples (str or list(int)) – The path to the audio file, or a list containing the samples of an audio file. If the parameter is a path, it should point either to a .wav file, or to a file containing the timestamps and samples in text form (.json, .csv, .tsv, .txt or .xlsx). It is also possible to point to a folder containing one file per sample. See Audio formats for the acceptable file types.

  • frequency (int or float, optional) – The sampling frequency, in Hz (samples per second), of the samples passed in path_or_samples. This parameter is ignored if path_or_samples is a path, but is used to define the timestamps of the Audio object if path_or_samples is a list of samples.

  • name (str, optional) – Defines a name for the Audio instance. If a string is provided, the attribute name will take its value. If not, see Audio._define_name_init().

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

samples

A list containing the audio samples, in chronological order.

Type:

np.ndarray(int)

timestamps

A list containing the timestamps matching each audio sample. Consequently, samples and timestamps should have the same length.

Type:

list(float)

frequency

The amount of samples per second.

Type:

int or float

name

Custom name given to the audio. If no name has been provided upon initialisation, it will be defined by Audio._define_name_init().

Type:

str

path

Path to the audio file passed as a parameter upon creation; if samples were provided, this attribute will be None.

Type:

str

files

List of files contained in the path. The list will be of size 1 if the path points to a single file.

Type:

list(str)

kind

An attribute set to "Audio", to differentiate this class from the different types of AudioDerivative.

Type:

str

Magic methods

Audio.__len__()

Returns the number of samples in the audio clip (i.e., the length of the attribute samples).

New in version 2.0.

Returns:

The number of samples in the audio clip.

Return type:

int

Audio.__getitem__(index)

Returns the sample located at the position specified by the parameter index.

Parameters:

index (int) – The index of the sample to return.

Returns:

A sample from the attribute samples.

Return type:

float
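
As a minimal sketch of what these two magic methods imply, the stand-in class below delegates len() and indexing to its samples attribute. MiniAudio is a hypothetical name that is not part of the toolbox, and the real Audio class may implement these methods differently.

```python
class MiniAudio:
    """Hypothetical stand-in for Audio, holding only a list of samples."""

    def __init__(self, samples):
        self.samples = list(samples)

    def __len__(self):
        # Number of samples in the clip
        return len(self.samples)

    def __getitem__(self, index):
        # Sample located at the given index
        return self.samples[index]


clip = MiniAudio([4, 8, 15, 16, 23, 42])
number_of_samples = len(clip)
third_sample = clip[2]
```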

Public methods

Setter functions

Audio.set_name(name)

Sets the name attribute of the Audio instance. This name can be used by display functions or as a means to identify the audio.

New in version 2.0.

Parameters:

name (str) – A name to describe the audio clip.

Example

>>> aud = Audio("C:/Users/Walter/Sequences/audio.wav")
>>> aud.set_name("Audio 28980")

Getter functions

Audio.get_path()

Returns the attribute path of the Audio instance.

New in version 2.0.

Returns:

The path of the Audio instance.

Return type:

str

Audio.get_name()

Returns the attribute name of the Audio instance.

New in version 2.0.

Returns:

The name of the Audio instance.

Return type:

str

Audio.get_samples()

Returns the attribute samples of the Audio instance.

New in version 2.0.

Returns:

The samples of the Audio instance.

Return type:

list(int)

Audio.get_sample(sample_index)

Returns the sample corresponding to the index passed as parameter.

New in version 2.0.

Parameters:

sample_index (int) – The index of the sample.

Returns:

A sample from the sequence.

Return type:

int

Audio.get_number_of_samples()

Returns the number of samples in the audio clip.

New in version 2.0.

Returns:

The amount of samples in the audio clip.

Return type:

int

Audio.get_timestamps()

Returns a list of the timestamps for every sample, in seconds.

New in version 2.0.

Returns:

List of the timestamps of all the samples of the audio clip, in seconds.

Return type:

list(float)

Audio.get_duration()

Returns the duration of the audio clip, in seconds.

New in version 2.0.

Returns:

The duration of the audio clip, in seconds.

Return type:

float

Audio.get_frequency()

Returns the frequency of the audio clip, in hertz.

New in version 2.0.

Returns:

The frequency of the audio clip, in hertz.

Return type:

int or float

Transformation functions

Audio.get_envelope(filter_below=0, filter_over=10, verbosity=1)

Calculates the envelope of the audio clip, applies a band-pass filter if values are provided, and returns an Envelope object.

New in version 2.0.

Parameters:
  • filter_below (int or None, optional) – If neither None nor 0, this value is used as the lowest frequency of the band-pass filter.

  • filter_over (int or None, optional) – If neither None nor 0, this value is used as the highest frequency of the band-pass filter.

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Returns:

The filtered envelope of the audio clip.

Return type:

Envelope

Audio.get_pitch(zeros_as_nan=False, verbosity=1)

Calculates the pitch of the voice in the audio clip, and returns a Pitch object.

New in version 2.0.

Parameters:
  • zeros_as_nan (bool, optional) – If set to True, the values where the pitch is equal to 0 will be replaced by numpy.nan objects.

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Returns:

The pitch of the voice in the audio clip.

Return type:

Pitch
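
The effect of zeros_as_nan can be pictured with the short sketch below. It is only an illustration on a plain list: zeros_to_nan is a hypothetical helper, and math.nan from the standard library is used in place of numpy.nan.

```python
import math


def zeros_to_nan(values):
    """Returns a copy of values where every 0 is replaced by NaN."""
    return [math.nan if value == 0 else value for value in values]


# Hypothetical pitch values, in Hz, where 0 marks samples without a detected pitch
pitch_values = [0, 110.0, 112.5, 0, 108.2]
cleaned = zeros_to_nan(pitch_values)
```

Replacing the zeros by NaN keeps the unvoiced portions from pulling down statistics such as the mean pitch, provided the functions used afterwards ignore NaN values.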

Audio.get_intensity(verbosity=1)

Calculates the intensity of the voice in the audio clip, and returns an Intensity object.

New in version 2.0.

Parameters:

verbosity (int, optional) –

Sets how much feedback the code will provide in the console output:

  • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

  • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

  • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Returns:

The intensity of the voice in the audio clip.

Return type:

Intensity

Audio.get_formant(formant=1, verbosity=1)

Calculates one of the formants of the voice in the audio clip, and returns a Formant object.

New in version 2.0.

Parameters:
  • formant (int, optional) – The number of the formant to calculate: 1 (default), 2, 3, 4 or 5.

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Returns:

The value of a formant of the voice in the audio clip.

Return type:

Formant

Audio.resample(frequency, mode='cubic', name=None, verbosity=1)

Resamples an audio clip to the frequency parameter. It first creates a new set of timestamps at the desired frequency, and then interpolates the original data to the new timestamps.

New in version 2.0.

Parameters:
  • frequency (float) – The frequency, in hertz, at which you want to resample the audio clip. A frequency of 4 will return samples at 0.25 s intervals.

  • mode (str, optional) – The interpolation mode ("cubic" by default). This parameter accepts all the values allowed for the kind parameter of the function scipy.interpolate.interp1d(): "linear", "nearest", "nearest-up", "zero", "slinear", "quadratic", "cubic", "previous", and "next". See the documentation of that function for more details.

  • name (str or None, optional) – Defines the name of the output audio clip. If set to None, the name will be the same as the input audio clip, with the suffix "+RS".

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Returns:

A new audio clip containing resampled timestamps and samples.

Return type:

Audio

Warning

This function allows both the upsampling and the downsampling of audio clips. However, in both cases, the algorithm only estimates the real values of the samples. You should therefore treat the results of upsampling (and, to a lesser extent, of downsampling) with care. You can check the frequency of the original audio clip with Audio.get_frequency().
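
The two steps described for Audio.resample() (creating new timestamps, then interpolating) can be sketched in plain Python. The sketch below is an assumption-laden illustration: resample_linear is a hypothetical helper that uses hand-written linear interpolation, whereas the toolbox relies on scipy.interpolate.interp1d() and defaults to "cubic".

```python
import math


def resample_linear(samples, timestamps, frequency):
    """Resamples (timestamps, samples) at `frequency` Hz with linear interpolation."""
    start, end = timestamps[0], timestamps[-1]
    # Step 1: new timestamps, spaced by 1 / frequency seconds
    count = int(math.floor((end - start) * frequency + 1e-9)) + 1
    new_timestamps = [start + i / frequency for i in range(count)]

    # Step 2: interpolate the original samples at the new timestamps
    new_samples = []
    j = 0  # index of the original sample at or before the current timestamp
    for t in new_timestamps:
        while j < len(timestamps) - 2 and timestamps[j + 1] < t:
            j += 1
        t0, t1 = timestamps[j], timestamps[j + 1]
        ratio = (t - t0) / (t1 - t0)
        new_samples.append(samples[j] + ratio * (samples[j + 1] - samples[j]))
    return new_samples, new_timestamps


# Three samples at 1 Hz, upsampled to 4 Hz (one sample every 0.25 s)
new_samples, new_timestamps = resample_linear([0, 10, 20], [0.0, 1.0, 2.0], 4)
```

Every value in new_samples that does not fall on an original timestamp is an estimate, which is why upsampled data should be interpreted with care.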

Conversion functions

Audio.convert_to_table()

Returns a list of lists where each sublist contains a timestamp and a sample. The first sublist contains the headers of the table. The output then resembles the table found in Tabled formats.

New in version 2.0.

Returns:

A list of lists that can be interpreted as a table, containing headers, and with a timestamp and the matching sample on each row.

Return type:

list(list)
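
A minimal sketch of the structure described above, written as a standalone function. The header names "Timestamp" and "Sample" are assumptions made for the illustration, and to_table is a hypothetical helper; see Tabled formats for the actual layout.

```python
def to_table(timestamps, samples):
    """Builds one header row, followed by one [timestamp, sample] row per sample."""
    table = [["Timestamp", "Sample"]]  # assumed header names
    for timestamp, sample in zip(timestamps, samples):
        table.append([timestamp, sample])
    return table


table = to_table([0.0, 0.25, 0.5], [12, -7, 3])
```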

Audio.convert_to_json()

Returns a dictionary ready to be exported in JSON. The dictionary has two keys: "Sample" is the key to the list of samples, while "Frequency" is the key to the sampling frequency of the audio clip.

New in version 2.0.

Returns:

A dictionary containing the data of the audio clip, ready to be exported in JSON.

Return type:

dict
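
The shape of the returned data can be sketched as follows, using the two keys named above. to_json_data is a hypothetical helper and not the toolbox's implementation; json.dumps() is only used here to show that the dictionary is serialisable.

```python
import json


def to_json_data(samples, frequency):
    """Builds a dictionary with the keys "Sample" and "Frequency"."""
    return {"Sample": list(samples), "Frequency": frequency}


data = to_json_data([12, -7, 3], 44100)
text = json.dumps(data)  # string that can be written to a .json file
```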

Saving function

Audio.save(folder_out, name=None, file_format='xlsx', individual=False, verbosity=1)

Saves an audio clip in a file or a folder. The function saves the audio clip under folder_out/name.file_format. All the non-existent subfolders present in the folder_out path will be created by the function. The function also updates the path attribute of the Audio instance.

New in version 2.0.

Parameters:
  • folder_out (str, optional) – The path to the folder where to save the file or files. If one or more subfolders of the path do not exist, the function will create them. If the string provided is empty (by default), the audio clip will be saved in the current working directory. If the string provided contains a file with an extension, the fields name and file_format will be ignored.

  • name (str or None, optional) – Defines the name of the file or files in which to save the audio clip. If set to None, the name will be the attribute name of the audio clip; if that attribute is also None, the name will be "out". If individual is set to True, each sample will be saved in a different file, with the index of the sample as a suffix after the name (e.g. if the name is "sample" and the file format is "txt", the samples will be saved as sample_0.txt, sample_1.txt, sample_2.txt, etc.).

  • file_format (str or None, optional) –

    The file format in which to save the audio clip. The file format must be "json" (default), "xlsx", "txt", "csv", "tsv", "wav", or, if you are a masochist, "mat". Notes:

    • "xls" will save the file with an .xlsx extension.

    • Any string starting with a dot will be accepted (e.g. ".csv" instead of "csv").

    • "csv;" will force the value separator on ;, while "csv," will force the separator on ,. By default, the function will detect which separator the system uses.

    • "txt" and "tsv" both separate the values by a tabulation.

    • Any other string will not return an error, but will rather be used as a custom extension. The data will be saved as a text file (using tabulations as value separators).

    Warning

    While it is possible to save audio clips as .mat or custom extensions, the toolbox will not recognize these files upon opening. The support for .mat and custom extensions as input may come in a future release, but for now these are just offered as output options.

  • individual (bool, optional) –

    If set to False (default), the function will save the audio clip in a single file. If set to True, the function will save each sample of the audio clip in an individual file, appending an underscore and the index of the sample (starting at 0) after the name. This option is not available and will be ignored if file_format is set to "wav".

    Warning

    It is not recommended to save each sample in a different file. This incredibly tedious way of handling audio files has only been implemented to follow the same logic as for the Sequence files, and should be avoided.

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.
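
The notes on file_format can be summarised by the sketch below. It is a hypothetical helper, not the toolbox's code: resolve_format and its system_separator parameter (standing in for the detection of the system's separator) are assumptions made for the illustration.

```python
def resolve_format(file_format, system_separator=","):
    """Returns (extension, separator); separator is None for non-text formats."""
    file_format = file_format.lstrip(".")  # ".csv" is accepted instead of "csv"
    if file_format == "xls":
        file_format = "xlsx"  # "xls" is saved with an .xlsx extension
    if file_format in ("json", "xlsx", "wav", "mat"):
        return file_format, None
    if file_format in ("csv;", "csv,"):
        return "csv", file_format[-1]  # separator forced by the last character
    if file_format == "csv":
        return "csv", system_separator
    # "txt", "tsv" and any custom extension use tabulations
    return file_format, "\t"


print(resolve_format(".csv;"))
print(resolve_format("xls"))
print(resolve_format("custom"))
```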

Private methods

Initialisation functions

Audio._define_name_init(name, verbosity=1)

Sets the name attribute for an instance of the Audio class, using the name provided during the initialisation, or the path. If no name is provided, the function will create the name based on the path attribute, by defining the name as the last element of the path hierarchy (last subfolder, or file name). For example, if path is "C:/Users/Bender/Documents/Recording001/", the function will set the name to "Recording001". If both name and path are None, the name of the audio clip will be "Unnamed audio".

New in version 2.0.

Parameters:
  • name (str) – The name passed as parameter in Audio.__init__()

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.
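
The naming rule described above can be sketched as a standalone function. define_name is a hypothetical helper; in particular, this sketch keeps the extension when the path points to a file, which may differ from what the toolbox does.

```python
import re


def define_name(name=None, path=None):
    """Picks the provided name, else the last element of the path, else a default."""
    if name is not None:
        return name
    if path is not None:
        # Split on both kinds of slashes and drop empty parts (e.g. trailing slash)
        parts = [part for part in re.split(r"[/\\]", path) if part]
        if parts:
            return parts[-1]
    return "Unnamed audio"


print(define_name(path="C:/Users/Bender/Documents/Recording001/"))
print(define_name())
```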

Audio._load_from_path(path, verbosity=1)

Loads the audio data from the path provided during the initialization.

New in version 2.0.

Parameters:
  • path (str) – Path to the audio file passed as a parameter upon creation.

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Audio._load_from_samples(samples, frequency, verbosity=1)

Loads the audio data when samples and frequency have been provided upon initialisation.

New in version 2.0.

Parameters:
  • samples (list(int)) – A list containing the audio samples, in chronological order.

  • frequency (int or float) – The amount of samples per second.

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Audio._fetch_files_from_folder(verbosity=1)

Finds all the files ending with the accepted extensions (.csv, .json, .tsv, .txt, or .xlsx) in the folder defined by path, and orders the files according to their name.

New in version 2.0.

Note

This function ignores the elements of the directory defined by path if:
  • They don’t have an extension

  • They are a folder

  • Their extension is not one of the accepted ones (.csv, .json, .tsv, .txt, or .xlsx)

  • The file name does not contain an underscore (_)

If a file has a valid extension, the function tries to detect an underscore (_) in the name. The file names should follow the pattern xxxxxx_0.ext, where xxxxxx can be any series of characters, 0 must be the index of the sample (with or without leading zeros), and ext must be an accepted extension (.csv, .json, .tsv, .txt, or .xlsx). The first sample of the audio clip must have the index 0. If the file does not have an underscore in the name, it is ignored. The indices must be coherent with the chronological order of the timestamps.

The function uses the number after the underscore to order the samples. This is due to differences in how file systems handle numbers without leading zeros: some place sample_11.json alphabetically before sample_2.json (1 comes before 2), while other systems place it after, as 11 is greater than 2. To avoid this discrepancy, the function converts the number after the underscore into an integer and orders the files according to that index.

Parameters:

verbosity (int, optional) –

Sets how much feedback the code will provide in the console output:

  • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

  • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

  • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.
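
The filtering and ordering rules described above can be sketched as follows. This is an illustration working on a list of file names only (it does not check whether an element is a folder), and order_sample_files is a hypothetical helper; skipping names whose suffix is not a number is an assumption of the sketch.

```python
import os

ACCEPTED_EXTENSIONS = {".csv", ".json", ".tsv", ".txt", ".xlsx"}


def order_sample_files(file_names):
    """Keeps the files named xxxxxx_N.ext and orders them by the integer N."""
    indexed = []
    for file_name in file_names:
        stem, extension = os.path.splitext(file_name)
        if extension not in ACCEPTED_EXTENSIONS or "_" not in stem:
            continue  # no extension, wrong extension, or no underscore
        index_text = stem.rsplit("_", 1)[1]
        if not index_text.isdigit():
            continue  # assumption: suffixes that are not numbers are skipped
        # Converting to int places sample_2 before sample_11
        indexed.append((int(index_text), file_name))
    return [file_name for _, file_name in sorted(indexed)]


names = ["sample_11.json", "sample_2.json", "notes.txt",
         "sample_0.json", "readme", "clip_1.mat"]
ordered = order_sample_files(names)
```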

Audio._load_samples(verbosity=1)

Loads the single sample files or the global file containing all the samples. Depending on the input, this function calls either Audio._load_single_sample_file() or Audio._load_audio_file().

New in version 2.0.

Parameters:

verbosity (int, optional) –

Sets how much feedback the code will provide in the console output:

  • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

  • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

  • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Audio._load_single_sample_file(path)

Loads the content of a single sample file into the Audio object. Depending on the file type, this function handles the content differently (see Audio formats).

New in version 2.0.

Parameters:

path (str) – The path of a file containing a single sample and timestamp.

Audio._load_audio_file(verbosity=1)

Loads the content of a file containing all the samples of the audio stream. Depending on the file type, this function handles the content differently (see Audio formats).

New in version 2.0.

Parameters:

verbosity (int, optional) –

Sets how much feedback the code will provide in the console output:

  • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

  • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

  • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Audio._read_wav(verbosity=1)

Opens a .wav file using scipy.io.wavfile.read, and loads the attributes samples and frequency. If the .wav file has more than one channel, it is converted to mono by averaging the values of all the channels via tool_functions.stereo_to_mono().

New in version 2.0.

Parameters:

verbosity (int, optional) –

Sets how much feedback the code will provide in the console output:

  • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

  • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

  • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.
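
The conversion to mono by averaging can be pictured with the sketch below. It is not the code of tool_functions.stereo_to_mono(), whose implementation is not shown here: to_mono is a hypothetical helper, and the input is assumed to hold one list of channel values per sample.

```python
def to_mono(frames):
    """Averages the channel values of each frame into a single value."""
    return [sum(frame) / len(frame) for frame in frames]


# Three stereo frames: [left, right]
stereo = [[1, 3], [2, 4], [-6, 0]]
mono = to_mono(stereo)
```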

Audio._read_text_file(verbosity=1)

Opens a .json, .xlsx, .csv, .tsv or .txt file containing the timestamps and samples of the audio, and loads the attributes samples, timestamps, and frequency. Depending on the file type, this function handles the content differently (see Audio formats).

New in version 2.0.

Parameters:

verbosity (int, optional) –

Sets how much feedback the code will provide in the console output:

  • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

  • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

  • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Audio._calculate_frequency()

Determines the frequency (number of samples per second) of the audio by calculating the time elapsed between the first two timestamps. This function is automatically called when reading the timestamps and samples from a text file.

New in version 2.0.
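
As a minimal sketch of this calculation, assuming timestamps expressed in seconds and evenly spaced samples, the frequency is the inverse of the interval between the first two timestamps. frequency_from_timestamps is a hypothetical helper.

```python
def frequency_from_timestamps(timestamps):
    """Returns the number of samples per second from the first two timestamps."""
    return 1 / (timestamps[1] - timestamps[0])


frequency = frequency_from_timestamps([0.0, 0.25, 0.5, 0.75])
```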

Audio._calculate_timestamps()

Calculates the timestamps of the audio samples from the frequency and the number of samples. This function is automatically called when reading the frequency and samples from an audio file, or when a list of samples and a frequency are passed as parameters during the initialisation.

New in version 2.0.
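
The reverse calculation can be sketched in the same way. The sketch assumes that the first sample is at 0 seconds, which is not stated above, and timestamps_from_frequency is a hypothetical helper.

```python
def timestamps_from_frequency(number_of_samples, frequency):
    """Returns one timestamp per sample, spaced by 1 / frequency seconds."""
    return [index / frequency for index in range(number_of_samples)]


timestamps = timestamps_from_frequency(5, 4)
```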

Saving functions

Audio._save_json(folder_out, name=None, individual=False, verbosity=1)

Saves an audio clip as a json file or files. This function is called by the Audio.save() method, and saves the Audio instance as folder_out/name.file_format.

New in version 2.0.

Parameters:
  • folder_out (str) – The path to the folder where to save the file or files. If one or more subfolders of the path do not exist, the function will create them.

  • name (str or None, optional) – Defines the name of the file or files in which to save the audio clip. If set to None, the name will be "out" if individual is False, or "sample" if individual is True.

  • individual (bool, optional) –

    If set to False (default), the function will save the audio clip in a single file. If set to True, the function will save each sample of the audio clip in an individual file, appending an underscore and the index of the sample (starting at 0) after the name.

    Warning

    It is not recommended to save each sample in a different file. This incredibly tedious way of handling audio files has only been implemented to follow the same logic as for the Sequence files, and should be avoided.

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Audio._save_mat(folder_out, name=None, individual=False, verbosity=1)

Saves an audio clip as a Matlab .mat file or files. This function is called by the Audio.save() method, and saves the Audio instance as folder_out/name.file_format.

New in version 2.0.

Important

This function depends on the module scipy.

Parameters:
  • folder_out (str) – The path to the folder where to save the file or files. If one or more subfolders of the path do not exist, the function will create them.

  • name (str or None, optional) – Defines the name of the file or files in which to save the audio clip. If set to None, the name will be "out" if individual is False, or "sample" if individual is True.

  • individual (bool, optional) –

    If set to False (default), the function will save the audio clip in a single file. If set to True, the function will save each sample of the audio clip in an individual file, appending an underscore and the index of the sample (starting at 0) after the name.

    Warning

    It is not recommended to save each sample in a different file. This incredibly tedious way of handling audio files has only been implemented to follow the same logic as for the Sequence files, and should be avoided.

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Audio._save_xlsx(folder_out, name=None, individual=False, verbosity=1)

Saves an audio clip as an Excel .xlsx file or files. This function is called by the Audio.save() method, and saves the Audio instance as folder_out/name.file_format.

New in version 2.0.

Important

This function depends on the module openpyxl.

Parameters:
  • folder_out (str) – The path to the folder where to save the file or files. If one or more subfolders of the path do not exist, the function will create them.

  • name (str or None, optional) – Defines the name of the file or files in which to save the audio clip. If set to None, the name will be "out" if individual is False, or "sample" if individual is True.

  • individual (bool, optional) –

    If set to False (default), the function will save the audio clip in a single file. If set to True, the function will save each sample of the audio clip in an individual file, appending an underscore and the index of the sample (starting at 0) after the name.

    Warning

    It is not recommended to save each sample in a different file. This incredibly tedious way of handling audio files has only been implemented to follow the same logic as for the Sequence files, and should be avoided.

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.

Audio._save_wav(folder_out, name=None)

Saves an audio clip as a .wav file or files. This function is called by the Audio.save() method, and saves the Audio instance as folder_out/name.file_format.

New in version 2.0.

Parameters:
  • folder_out (str) – The path to the folder where to save the file or files. If one or more subfolders of the path do not exist, the function will create them.

  • name (str or None, optional) – Defines the name of the file in which to save the audio clip. If set to None, the name will be "out".

Audio._save_txt(folder_out, name=None, file_format='csv', individual=False, verbosity=1)

Saves an audio clip as a .txt, .csv, .tsv, or custom extension file or files. This function is called by the Audio.save() method, and saves the Audio instance as folder_out/name.file_format.

New in version 2.0.

Parameters:
  • folder_out (str) – The path to the folder where to save the file or files. If one or more subfolders of the path do not exist, the function will create them.

  • name (str or None, optional) – Defines the name of the file or files in which to save the audio clip. If set to None, the name will be "out" if individual is False, or "sample" if individual is True.

  • file_format (str, optional) – The file format in which to save the audio clip. The file format can be "txt", "csv" (default) or "tsv". "csv;" will force the value separator to ";", while "csv," will force the separator to ",". By default, the function will detect which separator the system uses. "txt" and "tsv" both separate the values by a tabulation. Any other string will not return an error, but will rather be used as a custom extension. The data will be saved as a text file (using tabulations as value separators).

  • individual (bool, optional) –

    If set to False (default), the function will save the audio clip in a single file. If set to True, the function will save each sample of the audio clip in an individual file, appending an underscore and the index of the sample (starting at 0) after the name.

    Warning

    It is not recommended to save each sample in a different file. This incredibly tedious way of handling audio files has only been implemented to follow the same logic as for the Sequence files, and should be avoided.

  • verbosity (int, optional) –

    Sets how much feedback the code will provide in the console output:

    • 0: Silent mode. The code won’t provide any feedback, apart from error messages.

    • 1: Normal mode (default). The code will provide essential feedback such as progression markers and current steps.

    • 2: Chatty mode. The code will provide all possible information on the events happening. Note that this may clutter the output and slow down the execution.