Manual¶
This page is a detailed guide for using ms3 for different tasks. It supposes you are working in an interactive Python interpreter such as IPython, Jupyter, Google Colab, or just the console.
Good to know¶
Terminology¶
Measure counts (MC) vs. measure numbers (MN)¶
Measure counts are strictly increasing numbers for all <measure> nodes in the score, regardless of their length. This information is crucial for correctly addressing positions in a MuseScore file and is shown in the software’s status bar. The first measure is always counted as 1 (following MuseScore’s convention), even if it is an anacrusis.
Measure numbers are the traditional way by which humans refer to positions in a score. They follow a couple of conventions which can be summarised as counting complete bars. Quite often, a complete bar (MN) can be made up of two <measure> nodes (MC). In the context of this library, score addressability needs to be maintained for humans and computers, therefore a mapping MC -> MN is preserved in the score information DataFrames.
Onset positions¶
Onsets express positions of events in a score as their distance from the beginning of the corresponding MC or MN. The distances are expressed as fractions of a whole note. In other words, beat 1 has onset 0, an event on beat 2 of a 4/4 meter has onset 1/4, and so on.
Since there are two ways of referencing measures (MC and MN), there are also two ways of expressing onsets:
mc_onset expresses the distance from the corresponding MC
mn_onset expresses the distance from the corresponding MN
In most cases, the two values will be identical, but take as an example the case where a 4/4 measure with MN 8 is divided into MC 9 of length 3/4 and MC 10 of length 1/4 because of a repeat sign or a double bar line. Since MC 9 corresponds to the first part of MN 8, the two onset values are identical. But for the anacrusis on beat 4, the values differ: mc_onset is 0 but mn_onset is 3/4 because this is the distance from the beginning of MN 8.
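The relation between the two onsets can be sketched in a few lines of Python. The helper name mn_onset_from_mc is hypothetical and not part of ms3; the sketch assumes that the offset of each MC within its MN is already known (this is the information held by the mc_offset column described under Column Names).

```python
from fractions import Fraction


def mn_onset_from_mc(mc_onset, mc_offset):
    """Hypothetical helper: distance from the beginning of the MN."""
    return Fraction(mc_onset) + Fraction(mc_offset)


# MN 8 in 4/4 is split into MC 9 (length 3/4) and MC 10 (length 1/4).
# MC 9 starts together with MN 8; MC 10 starts 3/4 after the start of MN 8.
mc_offsets = {9: Fraction(0), 10: Fraction(3, 4)}

first_beat_mc9 = mn_onset_from_mc(Fraction(0), mc_offsets[9])
anacrusis_mc10 = mn_onset_from_mc(Fraction(0), mc_offsets[10])
print(first_beat_mc9, anacrusis_mc10)
```

For MC 9 both onsets coincide, whereas the anacrusis in MC 10 has mc_onset 0 and mn_onset 3/4.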
Read-only mode¶
Read-only mode allows for faster parsing that uses less memory. Scores parsed in read-only mode cannot be changed because the original XML structure is not kept in memory.
Parsing¶
This chapter explains how to
parse a single score to access and manipulate the contained information using a Score object
parse a group of scores to access and manipulate the contained information using a Parse object.
Parsing a single score¶
Import the library.
To parse a single score, we will use the class Score. We could import the whole library:
>>> import ms3
>>> s = ms3.Score()
or simply import the class:
>>> from ms3 import Score
>>> s = Score()
Locate the MuseScore 3 score you want to parse.
Tip
MSCZ files are ZIP files containing the uncompressed MSCX. In order to trace the score’s version history, it is recommended to always work with MSCX files.
In the examples, we parse the annotated first page of Giovanni Battista Pergolesi’s influential Stabat Mater. The file is called stabat.mscx and can be downloaded from here (open the link and press Ctrl + S to save the file, or right-click on the link and choose Save link as...).
Create a Score object.
In the example, the MuseScore 3 file is located at ~/ms3/docs/stabat.mscx so we can simply create the object and bind it to the variable s like so:
>>> from ms3 import Score
>>> s = Score('~/ms3/docs/stabat.mscx')
Inspect the object.
To have a look at the created object we can simply evaluate its variable:
>>> s
MuseScore file
--------------

~/ms3/docs/stabat.mscx

Attached annotations
--------------------

48 labels:
staff  voice  label_type  color_name
3      2      0 (dcml)    default       48
Parsing options¶
Score.__init__(musescore_file=None, infer_label_types=['dcml'], read_only=False, labels_cfg={}, logger_cfg={}, parser='bs4', ms=None)
Parameters
musescore_file (str, optional) – Path to the MuseScore file to be parsed.
infer_label_types (list or dict, optional) – Determine which label types are determined automatically. Defaults to [‘dcml’]. Pass [] to infer only main types 0 - 3. Pass {'type_name': r"^(regular)(Expression)$"} to call ms3.Score.new_type().
read_only (bool, optional) – Defaults to False, meaning that the parsing is slower and uses more memory in order to allow for manipulations of the score, such as adding and deleting labels. Set to True if you’re only extracting information.
labels_cfg (dict) – Store a configuration dictionary to determine the output format of the Annotations object representing the currently attached annotations. See MSCX.labels_cfg.
logger_cfg (dict, optional) – The following options are available:
‘name’: LOGGER_NAME -> by default the logger name is based on the parsed file(s)
‘level’: {‘W’, ‘D’, ‘I’, ‘E’, ‘C’, ‘WARNING’, ‘DEBUG’, ‘INFO’, ‘ERROR’, ‘CRITICAL’}
‘file’: PATH_TO_LOGFILE to store all log messages under the given path.
parser ('bs4', optional) – The only XML parser currently implemented is BeautifulSoup 4.
ms (str, optional) – If you want to parse musicXML files or MuseScore 2 files by temporarily converting them, pass the path or command of your local MuseScore 3 installation. If you’re using the standard path, you may try ‘auto’, or ‘win’ for Windows, ‘mac’ for MacOS, or ‘mscore’ for Linux.
Parsing multiple scores¶
Import the library.
To parse multiple scores, we will use the class ms3.Parse. We could import the whole library:
>>> import ms3
>>> p = ms3.Parse()
or simply import the class:
>>> from ms3 import Parse
>>> p = Parse()
Locate the folder containing MuseScore files.
In this example, we are going to parse all files included in the ms3 repository which has been cloned into the home directory and therefore has the path ~/ms3.
Create a Parse object.
The object is created by calling the class with the directory to scan, and bound to the variable p. By default, scores are grouped by the subdirectories they are in and one key is automatically created for each of them to access the files separately.
>>> from ms3 import Parse
>>> p = Parse('~/ms3')
>>> p
58 files.
KEY                  -> EXTENSIONS
----------------------------------
tests/measures       -> {'.tsv': 7}
docs                 -> {'.mscx': 4}
tests/harmonies      -> {'.tsv': 4}
tests/notes          -> {'.tsv': 7}
tests                -> {'.tsv': 25, '.xml': 1}
tests/MS3            -> {'.mscx': 7}
tests/repeat_dummies -> {'.mscx': 3}

None of the 15 score files have been parsed.
1 files would need to be converted, for which you need to set the 'ms' property to your MuseScore 3 executable.
By default, present TSV files are detected and can be parsed as well, allowing one to access already extracted information without parsing the scores anew. In order to select only particular files, a regular expression can be passed to the parameter file_re. In the following example, only files ending in mscx are collected in the object ($ stands for the end of the filename; without it, files including the string ‘mscx’ anywhere in their names would be selected, too):
>>> from ms3 import Parse
>>> p = Parse('~/ms3', file_re='mscx$', key='ms3')
>>> p
14 files.
KEY -> EXTENSIONS
-----------------
ms3 -> {'.mscx': 14}

None of the 14 score files have been parsed.
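The effect of the trailing $ can be tried out with Python's re module alone. This is only an illustration: the filenames are invented, and the sketch assumes that the pattern is searched anywhere in the filename (re.search), as the description of file_re suggests.

```python
import re

# Invented filenames for illustration.
filenames = ["stabat.mscx", "stabat.mscx.bak", "mscx_notes.tsv", "cujus.mscz"]


def select(pattern, names):
    """Keep the names in which the pattern is found anywhere (re.search)."""
    return [name for name in names if re.search(pattern, name)]


with_anchor = select("mscx$", filenames)    # only names ending in 'mscx'
without_anchor = select("mscx", filenames)  # 'mscx' anywhere in the name
print(with_anchor)
print(without_anchor)
```

Only the anchored pattern excludes names such as stabat.mscx.bak, which merely contain the string.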
In this example, we assigned the key 'ms3'. Note that the same MSCX files that were distributed over several keys in the previous example are now grouped together. Keys allow operations to be performed on a particular group of selected files. For example, we could add MSCX files from another folder using the method add_dir() and the key 'other':
>>> p.add_dir('~/other_folder', file_re='mscx$', key='other')
>>> p
72 files.
KEY   -> EXTENSIONS
-------------------
other -> {'.mscx': 58}
ms3   -> {'.mscx': 14}

None of the 72 score files have been parsed.
Most methods of the Parse object have a keys parameter to perform an operation on a particular group of files.
Parse the scores.
In order to simply parse all registered MuseScore files, call the method parse_mscx(). Alternatively, you can pass the argument keys to parse only one (or several) selected group(s) to save time. The argument level controls how many log messages you see; here, it is set to ‘critical’ or ‘c’ to suppress all warnings:
>>> p.parse_mscx(keys='ms3', level='c')
>>> p
/home/hentsche/PycharmProjects/ms3/src/ms3/bs4_parser.py:349: PerformanceWarning: DataFrame is highly fragmented. This is usually the result of calling `frame.insert` many times, which has poor performance. Consider joining all columns at once using pd.concat(axis=1) instead. To get a de-fragmented frame, use `newframe = frame.copy()`
  self._cl[col] = np.nan
INFO Parse -- parse.py (line 1709) parse_mscx(): All 14 files have been parsed successfully.
72 files.
KEY   -> EXTENSIONS
-------------------
ms3   -> {'.mscx': 14}
other -> {'.mscx': 58}

14/72 MSCX files have been parsed.

8 of them have annotations attached.
KEY -> ANNOTATION LAYERS
------------------------
ms3 -> staff  voice  label_type          color
    -> 1      1      0 (dcml)            default              568
    -> 3      1      0 (dcml)            blue                 1
    ->                                   cyan                 1
    ->                                   default              63
    ->                                   lime                 1
    ->                                   magenta              1
    ->                                   ms3_aquamarine       1
    ->                                   ms3_blue             1
    ->                                   ms3_chartreuse       1
    ->                                   ms3_cornflowerblue   1
    ->                                   ms3_darkcyan         1
    ->                                   ms3_darkgoldenrod    1
    ->                                   ms3_darkgray         1
    ->                                   ms3_darkmagenta      1
    ->                                   ms3_darkolivegreen   1
    ->                                   ms3_darkslateblue    1
    ->                                   ms3_darkviolet       1
    ->                                   ms3_deeppink         1
    ->                                   ms3_deepskyblue      1
    ->                                   ms3_dodgerblue       1
    ->                                   ms3_green            1
    ->                                   ms3_indianred        1
    ->                                   ms3_indigo           1
    ->                                   ms3_khaki            1
    ->                                   ms3_lawngreen        1
    ->                                   ms3_lightcoral       1
    ->                                   ms3_lightgreen       1
    ->                                   ms3_lightpink        1
    ->                                   ms3_lightsalmon      1
    ->                                   ms3_lightsteelblue   1
    ->                                   ms3_maroon           1
    ->                                   ms3_mediumorchid     1
    ->                                   ms3_mediumseagreen   1
    ->                                   ms3_navy             1
    ->                                   ms3_olive            1
    ->                                   ms3_orange           1
    ->                                   ms3_orangered        1
    ->                                   ms3_palegreen        1
    ->                                   ms3_paleturquoise    1
    ->                                   ms3_royalblue        1
    ->                                   ms3_sienna           1
    ->                                   ms3_teal             1
    ->                                   ms3_violet           1
    ->                                   red                  1
    ->                                   springgreen          1
    ->                                   white                1
    ->                                   yellow               1
    ->               2 (Roman Numeral)   ms3_darkred          1
    ->               3 (Absolute Chord)  ms3_darkred          1
    ->               3 (dcml)            ms3_darkgreen        1
    -> 1      1      0 (Plain Text)      default              7
    ->               3 (Absolute Chord)  default              166
    -> 3      2      0 (dcml)            default              48
    -> 2      1      0 (dcml)            default              167
As we can see, only the files with the key ‘ms3’ were parsed and the table shows an overview of the counts of the included label types in the different notational layers (i.e. staff & voice), grouped by their colours.
Parsing options¶
Parse.__init__(directory=None, paths=None, key=None, index=['rel_paths', 'fnames'], file_re=None, folder_re='.*', exclude_re='^(\\.|_)', recursive=True, simulate=False, labels_cfg={}, logger_cfg={}, ms=None)
Parameters
directory, key, index, file_re, folder_re, exclude_re, recursive (optional) – Arguments for the method add_folder(). If dir is not passed, no files are added to the new object except if you pass paths.
paths (Collection or str, optional) – List of file paths you want to add. If dir is also passed, all files will be combined in the same object. WARNING: If you want to use a custom index, don’t use both arguments simultaneously.
index (element or Collection of {‘key’, ‘i’, Collection, ‘full_paths’, ‘rel_paths’, ‘scan_paths’, ‘paths’, ‘files’, ‘fnames’, ‘fexts’}) – Change this parameter if you want to create particular indices for output DataFrames. The resulting index must be unique (for identification) and have as many elements as added files. Every single element or Collection of elements ∈ {‘key’, ‘i’, Collection, ‘full_paths’, ‘rel_paths’, ‘scan_paths’, ‘paths’, ‘files’, ‘fnames’, ‘fexts’} stands for an index level in the MultiIndex. If you pass a Collection that does not start with one of the defined keywords, it is interpreted as an index level itself and needs to have at least as many elements as the number of added files. The default None is equivalent to passing (key, i), i.e. a MultiIndex of IDs which is always unique. The keywords correspond to the dictionaries of Parse object that contain the constituents of the file paths.
simulate (bool, optional) – Pass True if no parsing is actually to be done.
logger_cfg (dict, optional) – The following options are available:
‘name’: LOGGER_NAME -> by default the logger name is based on the parsed file(s)
‘level’: {‘W’, ‘D’, ‘I’, ‘E’, ‘C’, ‘WARNING’, ‘DEBUG’, ‘INFO’, ‘ERROR’, ‘CRITICAL’}
‘path’: Directory in which log files are stored. If ‘file’ is relative, this path is used as root, otherwise, it is ignored.
‘file’: PATH_TO_LOGFILE Pass absolute path to store all log messages in a single log file. If PATH_TO_LOGFILE is relative, multiple log files are created dynamically, relative to the original MSCX files’ paths. If ‘path’ is set, the corresponding subdirectory structure is created there.
ms (str, optional) – If you want to parse musicXML files or MuseScore 2 files by temporarily converting them, pass the path or command of your local MuseScore 3 installation. If you’re using the standard path, you may try ‘auto’, or ‘win’ for Windows, ‘mac’ for MacOS, or ‘mscore’ for Linux.
Extracting score information¶
One of ms3’s main functionalities is storing the information contained in parsed scores as tabular files (TSV format). More information on the generated files is summarized here
Using the commandline¶
The most convenient way to achieve this is the command ms3 extract. Its capital-letter parameters summarize the available tables:
-M [folder], --measures [folder]
Folder where to store TSV files with measure information needed for tasks such as unfolding repetitions.
-N [folder], --notes [folder]
Folder where to store TSV files with information on all notes.
-R [folder], --rests [folder]
Folder where to store TSV files with information on all rests.
-L [folder], --labels [folder]
Folder where to store TSV files with information on all annotation labels.
-X [folder], --expanded [folder]
Folder where to store TSV files with expanded DCML labels.
-E [folder], --events [folder]
Folder where to store TSV files with all events (notes, rests, articulation, etc.) without further processing.
-C [folder], --chords [folder]
Folder where to store TSV files with <chord> tags, i.e. groups of notes in the same voice with identical onset and duration. The tables include lyrics, slurs, and other markup.
-D [path], --metadata [path]
Directory or full path for storing one TSV file with metadata. If no filename is included in the path, it is called metadata.tsv
The typical way to use this command for a corpus of scores is to keep the MuseScore files in a subfolder (called, for example, MS3) and to use the parameters’ default values, effectively creating additional subfolders for each extracted aspect next to each folder containing MuseScore files. For example, if we take the folder structure of the ms3 repository:
ms3
├── docs
│   ├── cujus.mscx
│   ├── o_quam.mscx
│   ├── quae.mscx
│   └── stabat.mscx
└── tests
    ├── MS3
    │   ├── 05_symph_fant.mscx
    │   ├── 76CASM34A33UM.mscx
    │   ├── BWV_0815.mscx
    │   ├── D973deutscher01.mscx
    │   ├── Did03M-Son_regina-1762-Sarti.mscx
    │   ├── K281-3.mscx
    │   └── stabat_03_coloured.mscx
    └── repeat_dummies
        ├── repeats0.mscx
        ├── repeats1.mscx
        └── repeats2.mscx
Upon calling ms3 extract -N, two new notes folders containing note lists are created:
ms3
├── docs
│   ├── cujus.mscx
│   ├── o_quam.mscx
│   ├── quae.mscx
│   └── stabat.mscx
├── notes
│   ├── cujus.tsv
│   ├── o_quam.tsv
│   ├── quae.tsv
│   └── stabat.tsv
└── tests
    ├── MS3
    │   ├── 05_symph_fant.mscx
    │   ├── 76CASM34A33UM.mscx
    │   ├── BWV_0815.mscx
    │   ├── D973deutscher01.mscx
    │   ├── Did03M-Son_regina-1762-Sarti.mscx
    │   ├── K281-3.mscx
    │   └── stabat_03_coloured.mscx
    ├── notes
    │   ├── 05_symph_fant.tsv
    │   ├── 76CASM34A33UM.tsv
    │   ├── BWV_0815.tsv
    │   ├── D973deutscher01.tsv
    │   ├── Did03M-Son_regina-1762-Sarti.tsv
    │   ├── K281-3.tsv
    │   ├── repeats0.tsv
    │   ├── repeats1.tsv
    │   ├── repeats2.tsv
    │   └── stabat_03_coloured.tsv
    └── repeat_dummies
        ├── repeats0.mscx
        ├── repeats1.mscx
        └── repeats2.mscx
We witness this behaviour because the default value is ../notes, interpreted as a relative path in relation to each MuseScore file. Alternatively, a relative path can be specified without initial ./ or ../, e.g. ms3 extract -N notes, to store the note lists in a recreated sub-directory structure:
ms3
├── docs
├── notes
│   ├── docs
│   └── tests
│       ├── MS3
│       └── repeat_dummies
└── tests
    ├── MS3
    └── repeat_dummies
A third option consists in specifying an absolute path which causes all note lists to be stored in the specified folder, e.g. ms3 extract -N ~/notes:
~/notes
├── 05_symph_fant.tsv
├── 76CASM34A33UM.tsv
├── BWV_0815.tsv
├── cujus.tsv
├── D973deutscher01.tsv
├── Did03M-Son_regina-1762-Sarti.tsv
├── K281-3.tsv
├── o_quam.tsv
├── quae.tsv
├── repeats0.tsv
├── repeats1.tsv
├── repeats2.tsv
├── stabat_03_coloured.tsv
└── stabat.tsv
Note that this leads to problems if MuseScore files from different subdirectories have identical filenames. In any case it is good practice to not use nested folders to allow for easier file access. For example, a typical DCML corpus will store all MuseScore files in the MS3 folder and include at least the folders created by ms3 extract -N -M -X:
.
├── harmonies
├── measures
├── MS3
└── notes
Extracting score information manually¶
What ms3 extract effectively does is create a Parse object and call its methods parse_mscx() and then store_lists(). Compared to the command, the method store_lists() allows for storing two additional aspects, namely notes_and_rests and cadences (if the score contains cadence labels). For each of the available aspects, {notes, measures, rests, notes_and_rests, events, labels, chords, cadences, expanded}, the method provides two parameters, namely _folder (where to store TSVs) and _suffix, i.e. a slug appended to the respective filenames; both are prefixed with the name of the aspect, e.g. notes_folder and notes_suffix. If the parameter simulate=True is passed, no files are written but the file paths to be created are returned. Since corpora might have quite diverse directory structures, ms3 gives you various ways of specifying folders which will be explained in detail in the following section.
Briefly, the rules for specifying the folders are as follows:
absolute folder (e.g. ~/labels): Store all files in this particular folder without creating subfolders.
relative folder starting with ./ or ../: relative folders are created “at the end” of the original subdirectory structure, i.e. relative to the MuseScore files.
relative folder not starting with ./ or ../ (e.g. rests): relative folders are created at the top level (of the original directory or the specified root_dir) and the original subdirectory structure is replicated in each of them.
To see examples for the three possibilities, see the following section.
Specifying folders¶
Consider a two-level folder structure contained in the root directory . which is the one passed to Parse:
.
├── docs
│   ├── cujus.mscx
│   ├── o_quam.mscx
│   ├── quae.mscx
│   └── stabat.mscx
└── tests
    └── MS3
        ├── 05_symph_fant.mscx
        ├── 76CASM34A33UM.mscx
        ├── BWV_0815.mscx
        ├── D973deutscher01.mscx
        ├── Did03M-Son_regina-1762-Sarti.mscx
        └── K281-3.mscx
The first level contains the subdirectories docs (4 files) and tests (6 files in the subdirectory MS3). Now we look at the three different ways to specify folders for storing notes and measures.
Absolute Folders¶
When we specify absolute paths, all files are stored in the specified directories. In this example, the measures and notes are stored in the two specified subfolders of the home directory ~, regardless of the original subdirectory structure.
>>> p.store_lists(notes_folder='~/notes', measures_folder='~/measures')
~
├── measures
│   ├── 05_symph_fant.tsv
│   ├── 76CASM34A33UM.tsv
│   ├── BWV_0815.tsv
│   ├── cujus.tsv
│   ├── D973deutscher01.tsv
│   ├── Did03M-Son_regina-1762-Sarti.tsv
│   ├── K281-3.tsv
│   ├── o_quam.tsv
│   ├── quae.tsv
│   └── stabat.tsv
└── notes
    ├── 05_symph_fant.tsv
    ├── 76CASM34A33UM.tsv
    ├── BWV_0815.tsv
    ├── cujus.tsv
    ├── D973deutscher01.tsv
    ├── Did03M-Son_regina-1762-Sarti.tsv
    ├── K281-3.tsv
    ├── o_quam.tsv
    ├── quae.tsv
    └── stabat.tsv
Relative Folders¶
In contrast, specifying relative folders recreates the original subdirectory structure. There are two different possibilities for that. The first possibility is to pass plain folder names, meaning that the subdirectory structure (docs and tests) is recreated in each of the folders:
>>> p.store_lists(root_dir='~/tsv', notes_folder='notes', measures_folder='measures')
~/tsv
├── measures
│   ├── docs
│   │   ├── cujus.tsv
│   │   ├── o_quam.tsv
│   │   ├── quae.tsv
│   │   └── stabat.tsv
│   └── tests
│       └── MS3
│           ├── 05_symph_fant.tsv
│           ├── 76CASM34A33UM.tsv
│           ├── BWV_0815.tsv
│           ├── D973deutscher01.tsv
│           ├── Did03M-Son_regina-1762-Sarti.tsv
│           └── K281-3.tsv
└── notes
    ├── docs
    │   ├── cujus.tsv
    │   ├── o_quam.tsv
    │   ├── quae.tsv
    │   └── stabat.tsv
    └── tests
        └── MS3
            ├── 05_symph_fant.tsv
            ├── 76CASM34A33UM.tsv
            ├── BWV_0815.tsv
            ├── D973deutscher01.tsv
            ├── Did03M-Son_regina-1762-Sarti.tsv
            └── K281-3.tsv
Note that in this example, we have specified a root_dir. Leaving this argument out will create the same structure in the directory from which the Parse object was created, i.e. the folder structure would be:
.
├── docs
├── measures
│   ├── docs
│   └── tests
│       └── MS3
├── notes
│   ├── docs
│   └── tests
│       └── MS3
└── tests
    └── MS3
If, instead, you want to create the specified relative folders relative to each MuseScore file’s location, specify them with an initial dot. ./ means “relative to the original path” and ../ one level up from the original path. To exemplify both:
>>> p.store_lists(root_dir='~/tsv', notes_folder='./notes', measures_folder='../measures')
~/tsv
├── docs
│   └── notes
│       ├── cujus.tsv
│       ├── o_quam.tsv
│       ├── quae.tsv
│       └── stabat.tsv
├── measures
│   ├── cujus.tsv
│   ├── o_quam.tsv
│   ├── quae.tsv
│   └── stabat.tsv
└── tests
    ├── measures
    │   ├── 05_symph_fant.tsv
    │   ├── 76CASM34A33UM.tsv
    │   ├── BWV_0815.tsv
    │   ├── D973deutscher01.tsv
    │   ├── Did03M-Son_regina-1762-Sarti.tsv
    │   └── K281-3.tsv
    └── MS3
        └── notes
            ├── 05_symph_fant.tsv
            ├── 76CASM34A33UM.tsv
            ├── BWV_0815.tsv
            ├── D973deutscher01.tsv
            ├── Did03M-Son_regina-1762-Sarti.tsv
            └── K281-3.tsv
The notes folders are created in directories where MuseScore files are located, and the measures folders one directory above, respectively. Leaving out the root_dir argument would lead to the same folder structure but in the directory from which the Parse object has been created. In a similar manner, the arguments p.store_lists(notes_folder='.', measures_folder='.') would create the TSV files just next to the MuseScore files. However, this would lead to warnings such as
Warning
The notes at ~/ms3/docs/cujus.tsv have been overwritten with measures.
In such a case we need to specify a suffix for at least one of the two aspects:
p.store_lists(notes_folder='.', notes_suffix='_notes',
              measures_folder='.', measures_suffix='_measures')
Examples¶
Until you are sure that you have picked the right parameters for your desired output, you can simply use the simulate=True argument which lets you view the paths without actually creating any files. In this variant, all aspects are stored each in individual folders but with identical filenames:
>>> p = Parse('~/ms3/docs', key='pergo')
>>> p.parse_mscx()
>>> p.store_lists( notes_folder='./notes',
rests_folder='./rests',
notes_and_rests_folder='./notes_and_rests',
simulate=True
)
['~/ms3/docs/notes/cujus.tsv',
'~/ms3/docs/rests/cujus.tsv',
'~/ms3/docs/notes_and_rests/cujus.tsv',
'~/ms3/docs/notes/o_quam.tsv',
'~/ms3/docs/rests/o_quam.tsv',
'~/ms3/docs/notes_and_rests/o_quam.tsv',
'~/ms3/docs/notes/quae.tsv',
'~/ms3/docs/rests/quae.tsv',
'~/ms3/docs/notes_and_rests/quae.tsv',
'~/ms3/docs/notes/stabat.tsv',
'~/ms3/docs/rests/stabat.tsv',
'~/ms3/docs/notes_and_rests/stabat.tsv']
In this variant, the different ways of specifying folders are exemplified. To demonstrate all subtleties we parse the same four files but this time from the perspective of ~/ms3:
>>> p = Parse('~/ms3', folder_re='docs', key='pergo')
>>> p.parse_mscx()
>>> p.store_lists( notes_folder='./notes', # relative to ms3/docs
measures_folder='../measures', # one level up from ms3/docs
rests_folder='rests', # relative to the parsed directory
labels_folder='~/labels', # absolute folder
expanded_folder='~/labels', expanded_suffix='_exp',
simulate = True
)
['~/ms3/docs/notes/cujus.tsv',
'~/ms3/rests/docs/cujus.tsv',
'~/ms3/measures/cujus.tsv',
'~/labels/cujus.tsv',
'~/labels/cujus_exp.tsv',
'~/ms3/docs/notes/o_quam.tsv',
'~/ms3/rests/docs/o_quam.tsv',
'~/ms3/measures/o_quam.tsv',
'~/labels/o_quam.tsv',
'~/labels/o_quam_exp.tsv',
'~/ms3/docs/notes/quae.tsv',
'~/ms3/rests/docs/quae.tsv',
'~/ms3/measures/quae.tsv',
'~/labels/quae.tsv',
'~/labels/quae_exp.tsv',
'~/ms3/docs/notes/stabat.tsv',
'~/ms3/rests/docs/stabat.tsv',
'~/ms3/measures/stabat.tsv',
'~/labels/stabat.tsv',
'~/labels/stabat_exp.tsv']
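The paths listed above can be reproduced with a small sketch of the three folder rules. The function resolve_folder and its parameters are hypothetical and only mirror the behaviour described in this section; this is not how ms3 implements it. The sketch uses posixpath so that the ~ is kept as written instead of being expanded.

```python
import posixpath


def resolve_folder(folder, rel_dir, root_dir):
    """Hypothetical sketch of the three folder rules (not ms3's own code).

    folder   -- value passed as e.g. notes_folder
    rel_dir  -- directory of the MuseScore file, relative to the scanned directory
    root_dir -- the scanned directory (or the root_dir argument)
    """
    if folder.startswith("~") or posixpath.isabs(folder):
        # Absolute folder: used as is, without subfolders.
        return folder
    if folder in (".", "..") or folder.startswith(("./", "../")):
        # Relative to the location of each MuseScore file.
        return posixpath.normpath(posixpath.join(root_dir, rel_dir, folder))
    # Plain folder name: created at the top level, subdirectories replicated.
    return posixpath.normpath(posixpath.join(root_dir, folder, rel_dir))


# The settings of the second example, applied to cujus.mscx in ~/ms3/docs.
settings = [("./notes", ""), ("rests", ""), ("../measures", ""),
            ("~/labels", ""), ("~/labels", "_exp")]
paths = [
    posixpath.join(resolve_folder(folder, "docs", "~/ms3"), "cujus" + suffix + ".tsv")
    for folder, suffix in settings
]
print(paths)
```

Under these assumptions the five resulting paths match the first five entries of the simulated output above.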
Column Names¶
General Columns¶
mc Measure Counts¶
Measure count, identifier for the measure units in the XML encoding. Always starts with 1 for correspondence to MuseScore’s status bar. For more detailed information, please refer to Measure counts (MC) vs. measure numbers (MN).
mn Measure Numbers¶
Measure number, continuous count of complete measures as used in printed editions. Starts with 1 except for pieces beginning with a pickup measure, numbered as 0. MNs are identical for first and second endings! For more detailed information, please refer to Measure counts (MC) vs. measure numbers (MN).
mc_onset¶
The value for mc_onset represents, expressed as fraction of a whole note, a position in a measure where 0 corresponds to the earliest possible position (in most cases beat 1). For more detailed information, please refer to Onset positions.
Tip
When loading a table from a TSV file, it is recommended to parse the text of this column with fractions.Fraction to be able to calculate with the values. MS3 does this automatically.
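As a rough illustration of this tip, the following sketch reads a small table with Python's csv module and converts the onset columns with fractions.Fraction. The table content is invented for the example; only the column names are taken from this manual.

```python
import csv
import io
from fractions import Fraction

# Invented excerpt of a table; real files contain more columns.
tsv_text = "mc\tmn\tmc_onset\tmn_onset\n9\t8\t1/2\t1/2\n10\t8\t0\t3/4\n"

rows = []
for row in csv.DictReader(io.StringIO(tsv_text), delimiter="\t"):
    row["mc"] = int(row["mc"])
    row["mn"] = int(row["mn"])
    # Fraction accepts strings such as '3/4' as well as '0'.
    row["mc_onset"] = Fraction(row["mc_onset"])
    row["mn_onset"] = Fraction(row["mn_onset"])
    rows.append(row)

# With Fractions, the values can be used in exact calculations.
difference = rows[1]["mn_onset"] - rows[1]["mc_onset"]
print(difference)
```

Keeping the values as Fraction objects avoids the rounding errors that floating-point numbers would introduce, e.g. for triplet positions.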
mn_onset¶
The value for mn_onset represents, expressed as fraction of a whole note, a position in a measure where 0 corresponds to the earliest possible position of the corresponding measure number (MN). For more detailed information, please refer to Onset positions.
quarterbeats¶
This column expresses positions, otherwise accessible only as a tuple (mc, mc_onset), as a running count of quarter notes from the piece’s beginning (quarterbeat = 0). If second endings are present in the score, only the last ending is counted in order to give authentic values to such a score, as if played without repetitions. If repetitions are unfolded, i.e. the table corresponds to a full play-through of the score, all endings are taken into account correctly.
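For a score without repetitions or alternative endings, the conversion from (mc, mc_onset) to quarterbeats can be sketched as follows. The measure lengths and all names are made up for the illustration; the handling of voltas described above is deliberately left out.

```python
from fractions import Fraction
from itertools import accumulate

# Hypothetical act_dur per MC: a 1/4 pickup followed by two 4/4 measures.
act_durs = {1: Fraction(1, 4), 2: Fraction(1), 3: Fraction(1)}

# Quarterbeat at which every MC begins: running sum of the preceding
# durations, converted from whole notes to quarter notes (times 4).
mcs = sorted(act_durs)
starts = dict(zip(mcs, accumulate([Fraction(0)] + [act_durs[mc] * 4 for mc in mcs[:-1]])))


def quarterbeats(mc, mc_onset):
    """Running count of quarter notes from the beginning of the piece."""
    return starts[mc] + Fraction(mc_onset) * 4


print(quarterbeats(1, 0), quarterbeats(2, 0), quarterbeats(3, Fraction(1, 2)))
```

In this example the pickup starts at quarterbeat 0, MC 2 at quarterbeat 1, and beat 3 of MC 3 falls on quarterbeat 7.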
Measures¶
act_dur Actual duration of a measure¶
The value of act_dur in most cases equals the time signature, expressed as a fraction; meaning for example that a “normal” measure in 6/8 has act_dur = 3/4. A measure of irregular length, for example a pickup measure of length 1/8, would have act_dur = 1/8.
The value of act_dur plays an important part in inferring MNs from MCs. See also the columns dont_count and numbering_offset.
barline¶
The column barline encodes information about the measure’s final bar line.
breaks¶
The column breaks may include three different values: {'line', 'page', 'section'} which represent the different break types. In the case of section breaks, MuseScore
dont_count Measures excluded from bar count¶
This is a binary value that corresponds to MuseScore’s setting Exclude from bar count from the Bar Properties menu. The value is 1 for pickup bars, second MCs of divided MNs and some volta measures, and NaN otherwise.
keysig Key Signatures¶
The feature keysig represents the key signature of a particular measure. It is an integer which, if positive, represents the number of sharps, and if negative, the number of flats. E.g.: 3: three sharps, -2: two flats, 0: no accidentals.
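The encoding can be spelled out in a tiny helper. The function describe_keysig is hypothetical and only serves to illustrate how the integer is read.

```python
def describe_keysig(keysig):
    """Hypothetical helper: verbalize a keysig integer."""
    if keysig > 0:
        return f"{keysig} sharp{'s' if keysig != 1 else ''}"
    if keysig < 0:
        return f"{-keysig} flat{'s' if keysig != -1 else ''}"
    return "no accidentals"


descriptions = [describe_keysig(k) for k in (3, -2, 0)]
print(descriptions)
```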
mc_offset Offset of a MC¶
The column mc_offset, in most cases, has the value 0 because it expresses the deviation of this MC’s mc_onset 0 (beginning of the MC) from beat 1 of the corresponding MN. If the value is a fraction > 0, it means that this MC is part of a MN which is composed of at least two MCs, and it expresses the current MC’s offset in terms of the duration of all (usually 1) preceding MCs which are also part of the corresponding MN. In the standard case that one MN would be split in two MCs, the first MC would have mc_offset = 0, and the second one mc_offset = the previous MC’s act_dur.
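The rule can be sketched as a running sum of act_dur over the MCs that share one MN. The measure data below are invented, and the sketch assumes that consecutive MCs with the same MN are parts of the same bar, which does not hold for alternative endings because their MNs are identical as well.

```python
from fractions import Fraction

# Invented excerpt: (mc, mn, act_dur); MN 8 is split into MC 9 and MC 10.
measures = [
    (8, 7, Fraction(1)),
    (9, 8, Fraction(3, 4)),
    (10, 8, Fraction(1, 4)),
    (11, 9, Fraction(1)),
]

mc_offset = {}
previous_mn, elapsed = None, Fraction(0)
for mc, mn, act_dur in measures:
    if mn != previous_mn:
        # A new MN begins, so the offset starts over at 0.
        elapsed = Fraction(0)
    mc_offset[mc] = elapsed
    elapsed += act_dur
    previous_mn = mn

print(mc_offset)
```

Only MC 10 receives a non-zero offset, namely the act_dur of MC 9.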
next¶
Every cell in this column has at least one integer, namely the MC of the subsequent bar, or -1 in the case of the last. In the case of repetitions, measures can have more than one subsequent MC, in which case the integers are separated by ', '.
The column is used for checking whether irregular measure lengths even themselves out because otherwise the inferred MNs might be wrong. Also, it is needed for MS3’s unfold repeats functionality (TODO).
Developers
Within MS3, the next column holds tuples, which MS3 should normally store as strings without parentheses. For example, the tuple (17, 1) is stored as '17, 1'. However, users might have extracted and stored a raw DataFrame from a Score object and MS3 needs to handle both formats.
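A reader that accepts both formats could look like the following sketch. The function parse_next is hypothetical and not part of ms3; it merely shows that the two string variants, as well as an actual tuple, can be reduced to the same tuple of integers.

```python
def parse_next(value):
    """Hypothetical helper: turn a stored 'next' cell into a tuple of ints."""
    if isinstance(value, (tuple, list)):
        return tuple(int(v) for v in value)
    # Remove surrounding whitespace and optional parentheses, then split.
    text = str(value).strip().strip("()")
    return tuple(int(part) for part in text.split(",") if part.strip())


parsed = [parse_next(v) for v in ("17, 1", "(17, 1)", "-1", (17, 1))]
print(parsed)
```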
numbering_offset Offsetting MNs¶
MuseScore’s measure number counter can be reset at a given MC by using the Add to bar number setting from the Bar Properties menu. If numbering_offset ≠ 0, the counting offset is added to the current MN and all subsequent MNs are inferred accordingly.
Scores which include several pieces (e.g. variations or a suite) sometimes use numbering_offset instead of section breaks to simulate a restart of the MN count at every new section. This leads to ambiguous MNs.
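How dont_count and numbering_offset interact when MNs are inferred from MCs can be sketched as a running sum. This is a simplified illustration with invented data, not the procedure ms3 actually applies; missing values (NaN in the tables) are written as 0 here.

```python
from itertools import accumulate

# Invented measures: (mc, dont_count, numbering_offset).
measures = [
    (1, 1, 0),   # pickup bar, excluded from the count
    (2, 0, 0),
    (3, 0, 0),   # first part of a divided bar
    (4, 1, 0),   # second part, excluded from the count
    (5, 0, 0),
    (6, 0, -3),  # counting restarts, e.g. at a new section
]

# Every counted measure raises the MN by 1; offsets are added on top and
# carry over to all subsequent measures.
steps = [(1 - dont_count) + offset for _, dont_count, offset in measures]
mns = dict(zip((mc for mc, _, _ in measures), accumulate(steps)))
print(mns)
```

The pickup receives MN 0, both parts of the divided bar share MN 2, and the offset of -3 makes MC 6 restart at MN 1, which is why such scores end up with ambiguous MNs.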
repeats¶
The column repeats indicates the presence of repeat signs and can have the values {'start', 'end', 'startend', 'firstMeasure', 'lastMeasure'}. MS3 performs a test on the repeat signs’ plausibility and throws warnings when some inference is required for this.
The repeats column needs to have the correct repeat sign structure in order to have a correct next column which, in turn, is required for MS3’s unfolding repetitions functionality. (TODO)
timesig Time Signatures¶
The time signature timesig of a particular measure is expressed as a string, e.g. '2/2'. The actual duration of a measure can deviate from the time signature for notational reasons: For example, a pickup bar could have an actual duration of 1/4 but still be part of a '3/8' meter, which usually has an actual duration of 3/8.
Tip
When loading a table from a file, time signatures are not parsed as fractions because then both '2/2' and '4/4', for example, would become 1.
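The loss of information is easy to verify with fractions.Fraction. The helper split_timesig is hypothetical; it shows one way to keep numerator and denominator apart while still being able to derive the nominal measure length.

```python
from fractions import Fraction


def split_timesig(timesig):
    """Hypothetical helper: '6/8' -> (6, 8), keeping both components."""
    numerator, denominator = timesig.split("/")
    return int(numerator), int(denominator)


# As fractions, the two meters become indistinguishable.
same_length = Fraction("2/2") == Fraction("4/4") == 1

# As tuples they stay distinct, and the length can still be computed.
two_two, four_four = split_timesig("2/2"), split_timesig("4/4")
six_eight_length = Fraction(*split_timesig("6/8"))
print(same_length, two_two, four_four, six_eight_length)
```

The last value, 3/4, corresponds to the act_dur of a regular 6/8 measure.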
volta¶
In the case of first and second (third etc.) endings, this column holds the number of every “bracket”, “house”, or volta, which should increase from 1. This is required for MS3’s unfold repeats function (TODO) to work.
The MNs for all voltas except those of the first one need to be amended to match those of the first volta. In the case where these voltas have only one measure each, the dont_count option suffices. If the voltas have more than one measure, the numbering_offset setting needs to be used.