Help
1 Basic Note on Help
How EDICTOR can be used has been illustrated in various forms over the last years, since the first version was published in 2017. Although it would be nice to have a complete tutorial that explains all major features in one place, it seems unrealistic to provide this, also because creating such a tutorial would deprive me of the time to develop new features. So for now, users interested in the tool will have to check out the resources mentioned here, contact us via GitHub in case of bugs, or just test the tool and see how it works in action.
Testing EDICTOR in action should be fairly easy, since you cannot break anything if you work with the tool. The recommendation is to click a lot and to check which buttons can be clicked. Left and right mouse clicks play an important role, so you should make sure to test what happens if you right-click or left-click a certain field. If you open panels in EDICTOR and do not know what these panels do, there is always a little question mark on the top right of the panel. If you click this question mark, basic information on the panel will be shown. In this way, it should be possible to become acquainted with the tool rather quickly.
2 Tutorials
The following tutorials have been published over the last years and shared freely on preprint servers or larger repositories.
List, J.-M. (2017): Historical Language Comparison with LingPy and EDICTOR [Historischer Sprachvergleich mit LingPy und EDICTOR]. Department of Linguistic and Cultural Evolution: Max-Planck Institute for the Science of Human History.
3 Overview with Examples
3.1 Basic File Types
EDICTOR expects TSV files as input. The columns should be separated by tabs, and the file should contain a header line indicating the content of the individual columns. The first column should provide numerical identifiers of all rows in your data. Each row corresponds to one word. One of the remaining columns should be called DOCULECT and contain the names of the individual languages in your sample. Another column should be called CONCEPT and contain the concepts (or glosses) for individual words. The word form should be provided in segmented form (with spaces separating the individual sounds) in a column TOKENS.
Note that EDICTOR accepts some alternative column names, and you do not have to write them in capital letters, but we strongly recommend adhering to these basic guidelines in order to make sure the tool works properly.
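The layout described above can be sketched in a few lines of Python. This is only an illustration: the header name ID for the first column, the helper names, and the four example rows are assumptions made for this sketch and are not taken from an EDICTOR sample file.

```python
import csv
import io

# Column names named in the text above; "ID" as header of the
# first column is an assumption made for this sketch.
REQUIRED = ["DOCULECT", "CONCEPT", "TOKENS"]

# Illustrative rows only (not real project data).
rows = [
    [1, "German", "hand", "h a n t"],
    [2, "English", "hand", "h æ n d"],
    [3, "German", "foot", "f uː s"],
    [4, "English", "foot", "f ʊ t"],
]


def write_wordlist(handle, rows):
    """Write a header line and the rows as tab-separated values."""
    writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
    writer.writerow(["ID"] + REQUIRED)
    writer.writerows(rows)


def check_wordlist(text):
    """Return a list of problems found in TSV text (empty if none)."""
    reader = csv.reader(io.StringIO(text), delimiter="\t")
    header = next(reader)
    problems = [f"missing column {c}" for c in REQUIRED if c not in header]
    for line in reader:
        if not line[0].isdigit():
            problems.append(f"non-numeric ID: {line[0]!r}")
    return problems


buffer = io.StringIO()
write_wordlist(buffer, rows)
wordlist_text = buffer.getvalue()
print(wordlist_text)
print(check_wordlist(wordlist_text))
```

To produce a file for the BROWSE FILE field, the same function can be called with a file opened via open("wordlist.tsv", "w", encoding="utf-8", newline=""), where the filename is again only a placeholder.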
A sample file with just a few lines for inspection can be downloaded here and opened in EDICTOR three by clicking on the box below.
Illustration of File Formats
3.2 Opening Files in EDICTOR
EDICTOR offers two major ways to open a file and manipulate the data. You can open a file by starting from an empty EDICTOR instance, then clicking into the BROWSE FILE field at the top left, and selecting the TSV file stored on your system. This will not upload your data to the server, but only make it accessible to the application in the browser on your system. You can test this by downloading this file and then selecting it after opening a fresh instance.
The alternative way to open files is by passing the filename via the URL. This only works, however, if you use the local version of EDICTOR that runs with Python on a local server. If you pass filenames to the standard URL (https://edictor.org/edictor.html), only files that have been uploaded to the server in the data folder are accessible.
If you use the local EDICTOR version, EDICTOR will first search for files in the current working directory. If files cannot be found, EDICTOR will search in the data directory.
Files are passed via the URL by adding the attribute file=filename to the URL.
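As a small sketch of how such a URL can be assembled, the following Python snippet appends the file attribute to the standard address mentioned above. The helper name edictor_url and the filename mydata.tsv are hypothetical. For a local installation, the base address would have to be replaced by whatever address your local server reports.

```python
from urllib.parse import urlencode


def edictor_url(base, **params):
    """Append URL-encoded parameters to a base address."""
    return base + "?" + urlencode(params)


# Standard address from the text; "mydata.tsv" is a placeholder filename.
PUBLIC = "https://edictor.org/edictor.html"
url = edictor_url(PUBLIC, file="mydata.tsv")
print(url)
```

Using urlencode rather than plain string concatenation makes sure that spaces or special characters in a filename are escaped properly.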
If you run EDICTOR locally, you can also open SQLite databases. In order to do so, you must make sure to have created a valid SQLite database of an existing database file (tools like PyEDICTOR have options to export data to the SQLite format required by EDICTOR) and place the database into the sqlite folder where your tool is installed. You must also make sure that access to the folder does not require root rights (which can be guaranteed by installing EDICTOR with a local virtual environment). SQLite files are opened from the URL by passing the arguments file and remote_dbase. The file keyword here refers to the name of the table inside the SQLite database and the remote_dbase keyword refers to the name of the table in the SQLite database that contains the data (which is stored in triples).
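Before passing names via the URL, it can help to check which tables a database file actually contains. The following sketch uses only Python's standard sqlite3 module. The helper names and the example values are hypothetical, and which name belongs to which of the two URL arguments should be checked against your own installation.

```python
import sqlite3
from contextlib import closing
from urllib.parse import urlencode


def list_tables(db_path):
    """Return the names of all tables stored in an SQLite file."""
    with closing(sqlite3.connect(db_path)) as conn:
        rows = conn.execute(
            "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
        ).fetchall()
    return [name for (name,) in rows]


def sqlite_url(base, file, remote_dbase):
    """Build a URL that carries the two arguments named in the text."""
    return base + "?" + urlencode({"file": file, "remote_dbase": remote_dbase})


# Hypothetical base address and names, for illustration only.
print(sqlite_url("http://localhost/edictor.html", "mytable", "mydata"))
```

Calling list_tables with the path of a database in the sqlite folder returns the table names it contains, which can then be compared with the values used in the URL.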
3.2 Editing Data in the Wordlist Panel
The first panel you see after having successfully opened a file in EDICTOR is the Wordlist panel. This panel offers several possibilities to edit your data, similar to a simple spreadsheet editor. If you open a wordlist file, you can directly edit any field in the Wordlist panel, except for the field that shows the identifier (ID) of the row. Since EDICTOR computes internal data representations from the content of the TSV file, however, you should not edit the fields DOCULECT and CONCEPT. Editing is as simple as clicking into a field and then modifying the content. Pressing ENTER accepts the modification, while pressing ESCAPE restores the original value. With the arrow keys, you can navigate up and down (this is equivalent to pressing ENTER, so modified content will be accepted), and with CTRL in combination with the left and the right arrow, you can switch between columns. If you want to delete a row or add a new row to your data, you must click on the ID field; a new window will then open and ask you for confirmation or provide further instructions.
EDICTOR comes with a rudimentary routine that allows you to segment the data (similar to the functionality in LingPy to segment entries on phonetic transcriptions into their sounds). To trigger this functionality, you must insert an entry into the TOKENS field that is preceded by a space. Spaces to the left or to the right of entries in TOKENS are generally not accepted. Adding a space to the beginning of a phonetic transcription sequence thus informs EDICTOR to segment the data.
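The leading-space convention can be illustrated with a deliberately naive sketch in Python. This is not EDICTOR's own routine and not the one in LingPy: the function names are hypothetical, and the rule used here (attach combining marks and modifier letters to the preceding symbol) is an assumption chosen only to make the idea concrete.

```python
import unicodedata


def naive_segment(form):
    """Split a transcription into space-separated segments.

    Combining marks and modifier letters (such as aspiration or
    length marks) are attached to the preceding symbol.
    """
    segments = []
    for char in form.strip():
        if char.isspace():
            continue
        attaches = (
            unicodedata.combining(char) != 0
            or unicodedata.category(char) == "Lm"
        )
        if attaches and segments:
            segments[-1] += char
        else:
            segments.append(char)
    return " ".join(segments)


def normalize_tokens(entry):
    """Segment an entry only if it starts with a space."""
    if entry.startswith(" "):
        return naive_segment(entry)
    return entry


print(normalize_tokens(" tʰɔxtər"))  # tʰ ɔ x t ə r
print(normalize_tokens("t a k"))     # t a k (already segmented, unchanged)
```

A rule this simple splits affricates and diphthongs into their parts, so the result of any automatic segmentation should be checked by hand.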
3.3 Assigning Words to Cognate Sets
There are several possibilities to assign words to cognate sets. The first and most important decision that you need to make is whether you want to annotate cognates on the level of the words in your data or on the level of morphemes. The former mode is called "full cognates" in EDICTOR and you can make sure to use this mode by opening the SETTINGS panel and checking the checkbox "full" for the cognate and colexification mode. The latter mode is called "partial cognates" in EDICTOR and can be selected in the same way. You can also specify the mode from the URL, if you open the edictor.html file with the parameters morphology_mode=partial or morphology_mode=full.