APIs for different linguistic databases can be accessed with lingtypology.db_apis.
import lingtypology.db_apis
Lingtypology attempts to provide a unified API for the supported language databases, so the classes in this module share some common attributes and methods. This section describes them, with examples for Autotyp, Wals and Phoible.
from lingtypology.db_apis import Autotyp, Wals, Phoible
features_list
You can get the list of available features from the database using this attribute.
Autotyp().features_list[:10]  # Truncated so it does not take too much space
Note: Phoible has no features_list attribute because it has no features. However, it has subsets_list, which lists the available subsets of Phoible data.
Phoible().subsets_list
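Since some classes expose features_list and Phoible exposes subsets_list instead, code that handles several databases may want to look up whichever is present. The following is a minimal sketch under that assumption only; available_selectors and the two stand-in classes are hypothetical names introduced here for illustration and are not part of lingtypology.

```python
def available_selectors(db):
    """Return (attribute name, values) for whichever list the object exposes.

    Assumes only what the text above states: most classes expose
    `features_list`, while Phoible exposes `subsets_list` instead.
    """
    for attr in ("features_list", "subsets_list"):
        values = getattr(db, attr, None)
        if values is not None:
            return attr, list(values)
    raise AttributeError("object has neither features_list nor subsets_list")


# Stand-ins for illustration only; the real classes fetch their data online.
class FakeFeatureDb:
    features_list = ["Agreement", "Clusivity"]


class FakePhoible:
    subsets_list = ["UPSID", "SPA"]


print(available_selectors(FakeFeatureDb()))
print(available_selectors(FakePhoible()))
```

With the real classes, the same helper would presumably be called as available_selectors(Autotyp()) or available_selectors(Phoible()).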
get_df and get_json
These two methods access the database and return the data as a pandas.DataFrame or a dict, respectively. Example of usage:
Autotyp('Agreement', 'Clusivity').get_df().head()
Note: for Phoible and Autotyp you can use the strip_na parameter (list, default: []) to strip rows that have an empty cell in the given columns. Compare the following.
No strip_na (empty cells are replaced with '~N/A~'):
Phoible().get_df().head()
With the tones column given to strip_na:
Phoible().get_df(strip_na=['tones']).head()
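The effect of strip_na can be pictured with a small pure-Python sketch on toy rows. This is not lingtypology's implementation; strip_rows is a hypothetical helper, the rows are made up, and treating None and the empty string as empty alongside '~N/A~' is an assumption.

```python
NA = "~N/A~"  # placeholder the text says is used for empty cells


def strip_rows(rows, strip_na=()):
    """Keep only rows whose cells in the `strip_na` columns are filled."""
    return [
        row for row in rows
        if all(row.get(col) not in (None, "", NA) for col in strip_na)
    ]


# Toy rows; the values are made up for illustration.
rows = [
    {"language": "A", "tones": 3},
    {"language": "B", "tones": NA},
    {"language": "C", "tones": 0},
]

# No columns given: every row is kept.
print(len(strip_rows(rows)))
# Rows with an empty 'tones' cell are dropped; a real value of 0 is kept.
print([r["language"] for r in strip_rows(rows, ["tones"])])
```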
Note: by default, calling get_df or get_json prints the citation. To disable this, set show_citation to False.
p = Phoible()
p.show_citation = False
p.get_df(strip_na=['tones']).head()
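If citations need to be silenced in many places, the two steps can be wrapped in one function. The sketch below relies only on the show_citation attribute and the get_df method described here; get_df_quietly and FakeDb are hypothetical names, and FakeDb merely imitates the described behaviour so the example runs offline.

```python
def get_df_quietly(db, **kwargs):
    """Disable citation printing on `db`, then return `db.get_df(**kwargs)`."""
    db.show_citation = False
    return db.get_df(**kwargs)


# Stand-in class for illustration; it mimics only the behaviour described above.
class FakeDb:
    show_citation = True

    def get_df(self, strip_na=()):
        if self.show_citation:
            print("Citation: ...")
        return {"strip_na": list(strip_na)}


db = FakeDb()
result = get_df_quietly(db, strip_na=["tones"])
print(result)
```

With a real object this would presumably read get_df_quietly(Phoible(), strip_na=['tones']).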
citation
You can get the citation for each database using the citation attribute. E.g.:
from lingtypology.db_apis import Autotyp
print(Autotyp().citation)
Note: if you use Wals, the citation is shown for every feature. If you want a general citation for the whole of Wals, use general_citation.
w = Wals('1a', '2a')
print(w.citation)
print(w.general_citation)
It is possible to access Wals data (online) using lingtypology.db_apis.Wals.
from lingtypology.db_apis import Wals
wals_page = Wals('1a', '2a').get_df()
wals_page.head()
Map example for feature 1A:
m = lingtypology.LingMap(wals_page.language)
m.add_custom_coordinates(wals_page.coordinates)
m.add_features(wals_page._1A)
m.legend_title = 'Consonant Inventory'
m.colors = lingtypology.gradient(5, 'yellow', 'green')
m.create_map()
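In the map example, the feature requested as '1a' is read back from the column _1A. Judging from that single example, the column name looks like the upper-cased feature code with an underscore prefix. The helper below encodes that guess; wals_column is a hypothetical name and the naming rule is an assumption, not documented behaviour.

```python
def wals_column(feature):
    """Guess the DataFrame column name for a WALS feature code.

    Assumption drawn from the example above only: feature '1a' appears
    as column '_1A' (underscore prefix, upper-cased code).
    """
    return "_" + feature.strip().upper()


print(wals_column("1a"))
print(wals_column("2a"))
```

If the assumption holds, wals_page[wals_column('1a')] would select the same column as wals_page._1A, which is convenient when looping over several features.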
It is possible to access Autotyp data (online) using lingtypology.db_apis.
Unlike in Wals, each new table name passed into Autotyp gives several additional columns:
Autotyp_table = Autotyp('Gender', 'Agreement').get_df(strip_na=['Gender.binned4'])
Autotyp_table.head()
Now we can draw a map out of gender data from multiple languages.
m = lingtypology.LingMap(Autotyp_table.language)
m.add_features(Autotyp_table['Gender.binned4'])
m.colors = lingtypology.gradient(4, color1='yellow', color2='red')
m.legend_title = 'Genders'
m.create_map()
from lingtypology.db_apis import AfBo
adj = AfBo('adjectivizer').get_df()
adj.head()
m = lingtypology.LingMap(adj.language_recipient)
m.add_features(adj['adjectivizer'], numeric=True)
m.legend_title = 'Adj'
m.create_map()
from lingtypology.db_apis import Sails
To get a pandas.DataFrame of features and descriptions:
Sails().features_descriptions.head()
To get descriptions for particular features:
Sails().feature_descriptions('ICU10', 'ICU11')
To get the SAILS data as a dict, you can use the get_json method. To get the data as a pandas.DataFrame, you can run:
sails = Sails('ICU3', 'ICU4')
df = sails.get_df()
df.head()
Map example:
m = lingtypology.LingMap(df.language)
m.add_features(df.ICU3_desc)
m.legend_title = sails.feature_descriptions('ICU3').Description.at[0]
m.start_location = (9, -79)
m.start_zoom = 5
m.legend_position = 'bottomleft'
m.create_map()
from lingtypology.db_apis import Phoible
Unlike with the other databases, you do not pass features into Phoible; you pass a subset instead. Take a look:
p = Phoible()
p.get_df().head()
Some languages have several entries because Phoible data consists of several different subsets. You can get the list of available subsets:
p.subsets_list
... and pass them into the class:
p = Phoible(subset='SPA')
df = p.get_df(strip_na=['tones'])
df.head()
You can also get non-aggregated data by setting aggregated to False when initializing the class.
Phoible(aggregated=False).get_df().head()
Map example:
m = lingtypology.LingMap(df.language)
m.colormap_colors = ('white', 'red')
m.add_features(df.tones, numeric=True)
m.start_zoom = 2
m.legend_title = 'Tones'
m.legend_position = 'bottomleft'
m.create_map()
Another example (slow due to the large amount of data):
df = Phoible(subset='UPSID', aggregated=False).get_df()
#Get all languages with ejectives
df = df[df.raisedLarynxEjective == '+']
#Remove duplicates
df = df.drop_duplicates(subset='Glottocode')
df.head()
m = lingtypology.LingMap(df.Glottocode, glottocode=True)
m.title = 'Languages with Ejectives'
m.tiles = 'Stamen Terrain'
m.radius = 5
m.opacity = 0.5
m.colors = ('blue',)
m.create_map()
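The filter-and-deduplicate step in the last example can be followed on a toy table without downloading anything. The sketch below mirrors it in plain Python: keep rows where raisedLarynxEjective is '+', then keep the first row per Glottocode, which is what pandas drop_duplicates does by default. first_per_key is a hypothetical helper, and the glottocodes are placeholders rather than Phoible data.

```python
def first_per_key(rows, key):
    """Keep the first row for each value of `key`, like drop_duplicates(subset=key)."""
    seen = set()
    kept = []
    for row in rows:
        if row[key] not in seen:
            seen.add(row[key])
            kept.append(row)
    return kept


# Toy segment rows; glottocodes and values are placeholders, not Phoible data.
rows = [
    {"Glottocode": "aaaa0001", "raisedLarynxEjective": "+"},
    {"Glottocode": "aaaa0001", "raisedLarynxEjective": "+"},
    {"Glottocode": "aaaa0001", "raisedLarynxEjective": "-"},
    {"Glottocode": "bbbb0002", "raisedLarynxEjective": "-"},
    {"Glottocode": "cccc0003", "raisedLarynxEjective": "+"},
]

# Step 1: keep only rows marked as ejective.
ejective = [r for r in rows if r["raisedLarynxEjective"] == "+"]
# Step 2: one row per language.
languages = first_per_key(ejective, "Glottocode")
print([r["Glottocode"] for r in languages])
```

Non-aggregated data appears to hold many rows per language, which is why the deduplication is needed before one marker per language can be drawn.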