Coverage for lingpy/compare/sanity.py : 98%

""" Module provides basic checks for wordlists. """
from __future__ import (
    unicode_literals, print_function, absolute_import, division)
[w for w in wordlist.get_list( col=taxA, flat=True, entry=concepts ) if w in wordlist.get_list( col=taxB, flat=True, entry=concepts)])
wordlist.cols}
"""Compute mutual coverage for all language pairs in your data.

Parameters
----------
wordlist : ~lingpy.basic.wordlist.Wordlist
    Your Wordlist object (or a descendant class).
concepts : str (default="concept")
    The column which stores your concepts.

Returns
-------
coverage : dict
    A dictionary of dictionaries whose value is the number of items two
    languages share.

Examples
--------
Compute coverage for the KSL.qlc dataset::

    >>> from lingpy.compare.sanity import mutual_coverage
    >>> from lingpy import *
    >>> from lingpy.tests.util import test_data
    >>> wl = Wordlist(test_data('KSL.qlc'))
    >>> cov = mutual_coverage(wl)
    >>> cov['English']['German']
    200

See also
--------
mutual_coverage_check
mutual_coverage_subset
"""
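The idea behind mutual coverage can be sketched without LingPy, using a plain dictionary in place of a Wordlist. The helper name `pairwise_coverage` and the `toy` data are hypothetical and only illustrate the counting; they are not LingPy's implementation.

```python
from collections import defaultdict
from itertools import combinations


def pairwise_coverage(concepts_by_language):
    """Count shared concepts for every pair of languages (illustrative only)."""
    coverage = defaultdict(dict)
    for lang_a, lang_b in combinations(sorted(concepts_by_language), 2):
        shared = len(
            set(concepts_by_language[lang_a]) & set(concepts_by_language[lang_b])
        )
        # Store the count symmetrically, so cov[a][b] == cov[b][a].
        coverage[lang_a][lang_b] = shared
        coverage[lang_b][lang_a] = shared
    return dict(coverage)


# Hypothetical toy data: language -> concepts attested in that language.
toy = {
    "English": ["hand", "foot", "eye"],
    "German": ["hand", "foot"],
    "French": ["hand", "eye"],
}
cov = pairwise_coverage(toy)
print(cov["English"]["German"])
```

The result mirrors the dictionary-of-dictionaries shape described in the docstring: the outer key is one language, the inner key the other.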
"""Check whether a given mutual coverage is fulfilled by the dataset.

Parameters
----------
wordlist : ~lingpy.basic.wordlist.Wordlist
    Your Wordlist object (or a descendant class).
concepts : str (default="concept")
    The column which stores your concepts.
threshold : int
    The threshold which should be checked.

Returns
-------
c : bool
    True if coverage is fulfilled for all language pairs, False otherwise.

Examples
--------
Compute minimal mutual coverage for the KSL dataset::

    >>> from lingpy.compare.sanity import mutual_coverage_check
    >>> from lingpy import *
    >>> from lingpy.tests.util import test_data
    >>> wl = Wordlist(test_data('KSL.qlc'))
    >>> for i in range(wl.height, 1, -1):
    ...     if mutual_coverage_check(wl, i):
    ...         print('mutual coverage is {0}'.format(i))
    ...         break
    mutual coverage is 200

See also
--------
mutual_coverage
mutual_coverage_subset
"""
return True
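The check described above can be sketched on toy data. This assumes that "fulfilled" means every language pair shares at least `threshold` concepts; the helper `coverage_holds` and the `toy` dictionary are hypothetical stand-ins for a Wordlist, not LingPy code.

```python
from itertools import combinations


def coverage_holds(concepts_by_language, threshold):
    """Return True if every language pair shares at least `threshold` concepts."""
    return all(
        len(set(concepts_by_language[a]) & set(concepts_by_language[b])) >= threshold
        for a, b in combinations(concepts_by_language, 2)
    )


# Hypothetical toy data: language -> concepts attested in that language.
toy = {
    "English": ["hand", "foot", "eye"],
    "German": ["hand", "foot"],
    "French": ["hand", "eye"],
}

# Scan downward from the largest concept list, as the docstring example does,
# and stop at the first threshold that all pairs satisfy.
largest = max(len(concepts) for concepts in toy.values())
minimal = next(t for t in range(largest, 0, -1) if coverage_holds(toy, t))
print("mutual coverage is {0}".format(minimal))
```

Scanning from high to low means the first threshold that passes is the largest value every pair satisfies, which is the minimal mutual coverage of the dataset.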
"""Compute maximal mutual coverage for all language in a wordlist.

Parameters
----------
wordlist : ~lingpy.basic.wordlist.Wordlist
    Your Wordlist object (or a descendant class).
concepts : str (default="concept")
    The column which stores your concepts.
threshold : int
    The threshold which should be checked.

Returns
-------
coverage : tuple
    A tuple consisting of the number of languages for which the coverage
    could be found as well as a list of all pairings in which this
    coverage is possible. The list itself contains the mutual coverage
    inside each pair and the list of languages.

Examples
--------
Compute all sets of languages with coverage at 200 for the KSL dataset::

    >>> from lingpy.compare.sanity import mutual_coverage_subset
    >>> from lingpy import *
    >>> from lingpy.tests.util import test_data
    >>> wl = Wordlist(test_data('KSL.qlc'))
    >>> number_of_languages, pairs = mutual_coverage_subset(wl, 200)
    >>> for number_of_items, languages in pairs:
    ...     print(number_of_items, ','.join(languages))
    200 Albanian,English,French,German,Hawaiian,Navajo,Turkish

See also
--------
mutual_coverage
mutual_coverage_check
"""
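One way to picture the subset search is a brute-force scan over language combinations, from the largest down. This is a sketch under assumptions, not LingPy's algorithm: the helper `largest_covered_subsets` and the `toy` data are hypothetical, and reporting the smallest pairwise count for each subset is an illustrative choice.

```python
from itertools import combinations


def largest_covered_subsets(concepts_by_language, threshold):
    """Find the largest language subsets whose pairs all share >= threshold concepts."""
    languages = sorted(concepts_by_language)
    concept_sets = {lang: set(concepts_by_language[lang]) for lang in languages}
    # Try the biggest subsets first and stop at the first size that works.
    for size in range(len(languages), 1, -1):
        found = []
        for subset in combinations(languages, size):
            weakest = min(
                len(concept_sets[a] & concept_sets[b])
                for a, b in combinations(subset, 2)
            )
            if weakest >= threshold:
                found.append((weakest, list(subset)))
        if found:
            return size, found
    return 0, []


# Hypothetical toy data: language -> concepts attested in that language.
toy = {
    "English": ["hand", "foot", "eye"],
    "German": ["hand", "foot"],
    "French": ["hand", "eye"],
}
number_of_languages, pairs = largest_covered_subsets(toy, 2)
for number_of_items, languages in pairs:
    print(number_of_items, ",".join(languages))
```

Brute force is exponential in the number of languages, so this only suits small examples.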
"""Check the number of synonyms per language and concept.

Parameters
----------
wordlist : ~lingpy.basic.wordlist.Wordlist
    Your Wordlist object (or a descendant class).
concepts : str (default="concept")
    The column which stores your concepts.
languages : str (default="doculect")
    The column which stores your language names.

Returns
-------
synonyms : dict
    A dictionary with language and concept as key and the number of
    synonyms as value.

Examples
--------
Calculate synonymy in KSL.qlc dataset::

    >>> from lingpy.compare.sanity import synonymy
    >>> from lingpy import *
    >>> from lingpy.tests.util import test_data
    >>> wl = Wordlist(test_data('KSL.qlc'))
    >>> syns = synonymy(wl)
    >>> for a, b in syns.items():
    ...     if b > 1:
    ...         print(a[0], a[1], b)

There is no case where synonymy exceeds 1 word per concept per language,
since :evobib:`Kessler2001` paid particular attention to avoiding
synonyms.
"""
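The synonym count can be illustrated on a few rows, counting the entries per (language, concept) pair. The helper `count_synonyms` and the `rows` data are hypothetical and stand in for the language and concept columns of a Wordlist.

```python
from collections import Counter


def count_synonyms(rows):
    """Count entries per (language, concept); rows are (language, concept, form)."""
    return Counter((language, concept) for language, concept, _form in rows)


# Hypothetical toy rows; German has two entries for the concept "big".
rows = [
    ("English", "hand", "hand"),
    ("German", "hand", "Hand"),
    ("German", "big", "groß"),
    ("German", "big", "dick"),
]
syns = count_synonyms(rows)
for (language, concept), number in syns.items():
    if number > 1:
        print(language, concept, number)
```

Unlike the KSL data, this toy input contains one synonym pair on purpose, so the loop prints a line.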