lexnlp.extract.en.tests package

Submodules

lexnlp.extract.en.tests.test_acts module

class lexnlp.extract.en.tests.test_acts.TestGetActs(methodName='runTest')

Bases: lexnlp.extract.de.tests.test_amounts.AssertionMixin

test_ambiguous_cases()
test_correct_cases()
test_file_samples()
test_wrong_cases()

lexnlp.extract.en.tests.test_amounts module

Amount unit tests for English.

This module implements unit tests for the amount extraction functionality in English.

Todo:
  • More pathological and difficult cases

lexnlp.extract.en.tests.test_amounts.test_error_case_1()

Test encountered error case.

lexnlp.extract.en.tests.test_amounts.test_error_case_2()

Test encountered error case.

lexnlp.extract.en.tests.test_amounts.test_get_amount()

Test default get amount behavior.

lexnlp.extract.en.tests.test_amounts.test_get_amount_non_round_float()

Test get amount behavior with source return.

lexnlp.extract.en.tests.test_amounts.test_get_amount_source()

Test get amount behavior with source return.

lexnlp.extract.en.tests.test_amounts_plain module

class lexnlp.extract.en.tests.test_amounts_plain.TestAmountsPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_amounts()
test_file_samples()
test_fraction_symbol()

lexnlp.extract.en.tests.test_citations module

Citation unit tests for English.

This module implements unit tests for the citation extraction functionality in English.

Todo:
  • Better testing for exact text in return sources

  • More pathological and difficult cases

lexnlp.extract.en.tests.test_citations.test_get_citations()

Test default get citation behavior.

lexnlp.extract.en.tests.test_citations.test_get_citations_as_dict()

lexnlp.extract.en.tests.test_citations_plain module

class lexnlp.extract.en.tests.test_citations_plain.TestCitationsPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_citations()
test_file_samples()

lexnlp.extract.en.tests.test_conditions module

Condition unit tests for English.

This module implements unit tests for the condition extraction functionality in English.

Todo:
  • Better testing for exact text in return sources

  • More pathological and difficult cases

lexnlp.extract.en.tests.test_conditions.test_condition_fixed_example()

lexnlp.extract.en.tests.test_conditions_plain module

class lexnlp.extract.en.tests.test_conditions_plain.TestConditionsPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_file_samples()

lexnlp.extract.en.tests.test_constraints module

Constraints unit tests for English.

This module implements unit tests for the constraint extraction functionality in English.

Todo:
  • Better testing for exact text in return sources

  • More pathological and difficult cases

lexnlp.extract.en.tests.test_constraints.test_constraint_fixed_example()

lexnlp.extract.en.tests.test_constraints_plain module

class lexnlp.extract.en.tests.test_constraints_plain.TestConstraintsPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_constraints()
test_file_samples()

lexnlp.extract.en.tests.test_courts module

Court/jurisdiction unit tests for English.

This module implements unit tests for the court/jurisdiction extraction functionality in English.

Todo:
  • Re-introduce known bad cases with better master data or more flexible approach

  • More pathological and difficult cases

class lexnlp.extract.en.tests.test_courts.TestParseEnCourts(methodName='runTest')

Bases: unittest.case.TestCase

test_file_samples()
test_parse_empty_text()
test_parse_simply_text()

lexnlp.extract.en.tests.test_courts.test_court_config_setup()

Test setup of CourtConfig object.

lexnlp.extract.en.tests.test_courts.test_court_config_setup_wo_alias()

lexnlp.extract.en.tests.test_courts.test_courts()

Test court extraction.

lexnlp.extract.en.tests.test_courts.test_courts_longest_match()

Tests the case where the name or alias of one court is a substring of another court's name or alias. For each conflicting match, the court with the longest alias should be returned. If the court with the shorter alias also matches elsewhere in the text, outside that conflict, both courts should be returned.
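The longest-match rule described above can be illustrated with a small, self-contained sketch. It does not use LexNLP's court configuration objects or extraction functions: the helpers `find_alias_matches` and `resolve_longest`, the alias table, and the court identifiers are all hypothetical and exist only to show the expected outcome.

```python
import re
from typing import Dict, List, Tuple

Match = Tuple[int, int, str]  # (start, end, court_id)


def find_alias_matches(text: str, aliases: Dict[str, str]) -> List[Match]:
    """Return every occurrence of every alias, including overlapping ones."""
    matches = []
    for alias, court_id in aliases.items():
        for m in re.finditer(re.escape(alias), text):
            matches.append((m.start(), m.end(), court_id))
    return matches


def resolve_longest(matches: List[Match]) -> List[Match]:
    """Keep the longest match in each overlapping group.

    Matches that do not overlap a kept match survive, so a court with a
    shorter alias is still reported when it occurs elsewhere in the text.
    """
    kept: List[Match] = []
    # Consider longer matches first; break ties by position.
    for start, end, court_id in sorted(matches, key=lambda m: (m[0] - m[1], m[0])):
        if all(end <= k_start or start >= k_end for k_start, k_end, _ in kept):
            kept.append((start, end, court_id))
    return sorted(kept)


# Hypothetical alias table: one alias is a substring of the other.
aliases = {"Supreme Court": "short", "Supreme Court of Ohio": "long"}
text = "The Supreme Court of Ohio ruled; later the Supreme Court declined review."

result = [court_id for _, _, court_id in resolve_longest(find_alias_matches(text, aliases))]
print(result)
```

In this sketch the first occurrence resolves to the court with the longer alias, while the second, standalone occurrence still yields the court with the shorter alias.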

lexnlp.extract.en.tests.test_courts.test_courts_rs()

Test court extraction with return sources.

lexnlp.extract.en.tests.test_cusip module

class lexnlp.extract.en.tests.test_cusip.TestGetCUSIP(methodName='runTest')

Bases: lexnlp.extract.de.tests.test_amounts.AssertionMixin

test_correct_cases()
test_file_samples()
test_wrong_cases()

lexnlp.extract.en.tests.test_dates module

Date unit tests for English.

This module implements unit tests for the date extraction functionality in English.

Todo:
  • Implement document-level date detection to identify anomalous dates

  • Better testing for exact text in return sources

  • Resolve example bad dates

  • More pathological and difficult cases

class lexnlp.extract.en.tests.test_dates.TestDates(methodName='runTest')

Bases: unittest.case.TestCase

check_dates_set(date_src: List[Tuple[int, int, int]])

Test date extraction with provided dates.

test_build_model()

Test build model by running default train.

test_date_feature_1()

Test date feature engineering.

test_date_feature_1_bigram()

Test date feature engineering with bigrams.

test_date_may()

Test that “ may ” alone does not parse.

test_fixed_date_set()
test_fixed_dates()

Test date extraction from fixed examples.

test_fixed_dates_nonstrict()

Test date extraction from fixed examples.

test_fixed_dates_source()

Test date extraction from fixed examples with source.

test_fixed_raw_dates()

Test raw date extraction from fixed examples.

test_get_raw_dates()
test_random_date_set()

lexnlp.extract.en.tests.test_dates.expected_data_converter(expected)

lexnlp.extract.en.tests.test_dates_plain module

class lexnlp.extract.en.tests.test_dates_plain.TestDatesPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_another_may()
test_august()
test_date_first_aug()
test_dates()
test_dates_times()
test_file_samples()
test_fp()
test_moar_dates()
test_more_more_dates()
test_no_dates()
test_one_date_this()
test_section()
test_should_be_fixed()
test_two_dates_strict()
test_two_ranges()

lexnlp.extract.en.tests.test_definitions module

Definition unit tests for English.

This module implements unit tests for the definition extraction functionality in English.

Todo:
  • Better testing for exact text in return sources

  • More pathological and difficult cases

class lexnlp.extract.en.tests.test_definitions.TestEnglishDefinitions(methodName='runTest')

Bases: unittest.case.TestCase

test_abbr_strip()
test_annotations()
test_apostrophe_in_definition()
test_capitalized_false_positive()
test_capitalized_with_trigger()
test_capitalized_with_trigger_in_the_middle_of_sentense()
test_def_called()
test_definition_fixed()
test_definition_ml()
test_definition_quoted()
test_definition_quoted_new_line()
test_definitions_in_one_sentence()
test_definitions_in_sentences_text()
test_definitions_simple()
test_dot_in_definition()
test_emma()
test_enquoted()
test_fp_pronoun()
test_fp_service_words()
test_include_multitoken_definition()

The text (each an “Obligation” and collectively, the “Obligations”) appears to be the definition itself, but the parser skipped it because it has more than MAX_TERM_TOKENS (presently 5) words.

The behavior was therefore changed: 10 words are now allowed because the text contains 2 possible “definitions”.

test_merge_defs()
test_merge_defs_consumed()
test_misbrackets()
test_newlines()
test_noun_pattern_false_positive()
test_obvious_embraced_definition()
test_parenthesis()
test_parse_in_extra_quotes()
test_parse_moodys()
test_process_ugly_braces_def()
test_quotes_removed()
test_reffered_to_def()
test_reffered_to_def_excess_words()
test_start_word_shall_be_false_positive()
test_the()
test_the_corporation_false_positive()
test_too_long_definition()
test_trigger_word_fullmatches()
test_trim_defined_term()
test_unbal_quotes()
test_unpared_brackets()
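The token limit mentioned in the note for test_include_multitoken_definition() can be sketched as follows. This is only an illustration of the idea that the allowed length grows with the number of candidate terms; it is not LexNLP's parser. The function `within_token_limit` is hypothetical, straight quotes stand in for the curly quotes of the example text, and the value 5 for `MAX_TERM_TOKENS` is taken from the note.

```python
import re

MAX_TERM_TOKENS = 5  # value quoted in the test's note


def within_token_limit(candidate: str) -> bool:
    """Allow MAX_TERM_TOKENS words per quoted term found in the candidate."""
    quoted_terms = re.findall(r'"([^"]+)"', candidate)
    # A candidate with no quoted term is treated as holding a single term.
    term_count = max(1, len(quoted_terms))
    token_count = len(candidate.split())
    return token_count <= MAX_TERM_TOKENS * term_count


two_terms = 'each an "Obligation" and collectively, the "Obligations"'
one_term = 'the "Obligation" as amended from time to time hereafter'

print(within_token_limit(two_terms))
print(within_token_limit(one_term))
```

The first candidate has seven words and two quoted terms, so it fits within the doubled limit of 10; the second has nine words and one quoted term, so it exceeds the limit of 5.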

lexnlp.extract.en.tests.test_definitions_template module

class lexnlp.extract.en.tests.test_definitions_template.TestDefinitionsTemplate(methodName='runTest')

Bases: unittest.case.TestCase

test_file_samples()

lexnlp.extract.en.tests.test_definitions_template.get_definitions_sorted(text: str)

lexnlp.extract.en.tests.test_dict_entities module

Dict entity general unit tests.

class lexnlp.extract.en.tests.test_dict_entities.TestDictEntities(methodName='runTest')

Bases: unittest.case.TestCase

test_abbreviations_simple()
test_alias_is_blacklisted()
test_am_pm_none()
test_common_search_all_languages()
test_conflicts_equal_length_take_same_language()
test_conflicts_take_longest_match()
test_equal_aliases_in_dif_languages()
test_find_dict_entities_empty_text()
test_get_alias_id()
test_get_alias_text()
test_get_entity_id()
test_normalize_text()
test_plural_case_matching()
test_prepare_alias_blacklist_dict()

lexnlp.extract.en.tests.test_distance module

Distance unit tests for English.

This module implements unit tests for the distance extraction functionality in English.

Todo:
  • More pathological and difficult cases

lexnlp.extract.en.tests.test_distance.test_get_distance()

Test distance extraction.

lexnlp.extract.en.tests.test_distance.test_get_distance_source()

Test distance extraction with source.

lexnlp.extract.en.tests.test_distances_plain module

class lexnlp.extract.en.tests.test_distances_plain.TestDistancesPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_distances_digits()
test_distances_words()
test_file_samples()

lexnlp.extract.en.tests.test_durations module

Duration unit tests for English.

This module implements unit tests for the duration extraction functionality in English.

Todo:
  • Better testing for exact text in return sources

  • More pathological and difficult cases

lexnlp.extract.en.tests.test_durations.test_get_durations()

Test durations.

lexnlp.extract.en.tests.test_durations.test_get_durations_source()

Test durations with source.

lexnlp.extract.en.tests.test_durations_plain module

class lexnlp.extract.en.tests.test_durations_plain.TestDurationsPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_a_and_b()
test_durations_days()
test_durations_digits()
test_file_samples()

lexnlp.extract.en.tests.test_geoentities module

Geo entity unit tests for English.

This module implements unit tests for the geo entity extraction functionality in English.

lexnlp.extract.en.tests.test_geoentities.load_entities_dict()
lexnlp.extract.en.tests.test_geoentities.test_geoentities()
lexnlp.extract.en.tests.test_geoentities.test_geoentities_alias_filtering()
lexnlp.extract.en.tests.test_geoentities.test_geoentities_counting()
lexnlp.extract.en.tests.test_geoentities.test_geoentities_en_equal_match_take_lowest_id()
lexnlp.extract.en.tests.test_geoentities.test_geoentities_en_equal_match_take_top_prio()

lexnlp.extract.en.tests.test_geoentities_plain module

class lexnlp.extract.en.tests.test_geoentities_plain.TestGeoentitiesPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_file_samples()
test_multiline_address()
test_simple_address()

lexnlp.extract.en.tests.test_geoentities_plain.make_geoconfig()

lexnlp.extract.en.tests.test_geoentities_plain.parse_geo_annotations(text: str) → List[lexnlp.extract.common.annotations.geo_annotation.GeoAnnotation]

lexnlp.extract.en.tests.test_introductory_words_detector module

class lexnlp.extract.en.tests.test_introductory_words_detector.TestIntroductoryWordsDetector(methodName='runTest')

Bases: unittest.case.TestCase

test_negative()
test_negative_combined()
test_positive()

lexnlp.extract.en.tests.test_money module

Money unit tests for English.

This module implements unit tests for the money extraction functionality in English.

Todo:
  • More pathological and difficult cases

class lexnlp.extract.en.tests.test_money.MoneyTest(methodName='runTest')

Bases: unittest.case.TestCase

test_get_money_order()

At one point get_money() returned money values in reversed order. This test ensures they are returned in their original order.

test_get_money_problem1()

Problem: it was returning 23.6 instead of 23.62 for such cases.
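The two regressions covered by MoneyTest (result order and a truncated fractional part) can be illustrated with a standalone sketch. It deliberately avoids calling get_money(), whose signature is not shown here, and the cause of the original bugs is not stated in this reference. The regular expression `MONEY_RE` and the helper `parse_dollar_amounts` are hypothetical.

```python
import re
from decimal import Decimal
from typing import List

# Hypothetical pattern: a dollar sign, digits with optional thousands
# separators, and an optional fractional part of any length.
MONEY_RE = re.compile(r"\$\s*(\d[\d,]*(?:\.\d+)?)")


def parse_dollar_amounts(text: str) -> List[Decimal]:
    """Return dollar amounts in document order, keeping all fractional digits."""
    # re.finditer yields matches left to right, so order is preserved.
    # Decimal avoids binary floating-point rounding of values such as 23.62.
    return [Decimal(m.group(1).replace(",", "")) for m in MONEY_RE.finditer(text)]


amounts = parse_dollar_amounts("Fees of $23.62 and $1,000.5 apply.")
print(amounts)
```

A regression test in this style would assert both the sequence of values and the exact value of each amount, which is what the two test names above suggest.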

lexnlp.extract.en.tests.test_money.test_get_money()

Test money extraction.

lexnlp.extract.en.tests.test_money.test_get_money_source()

Test money extraction with source.

lexnlp.extract.en.tests.test_money_plain module

class lexnlp.extract.en.tests.test_money_plain.TestMoneyPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_file_samples()
test_money()

lexnlp.extract.en.tests.test_money_plain.get_money_annotations_sorted(text)

lexnlp.extract.en.tests.test_parsing_speed module

class lexnlp.extract.en.tests.test_parsing_speed.TestParsingSpeed(methodName='runTest')

Bases: unittest.case.TestCase

This method is not named test_XXX because it is not intended for (automatic) regression tests.

check_time(text: str, func: Callable, func_name: str, times: Dict[str, float]) → None
en_parsers_speed()
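A timing helper with the signature listed above might look like the following sketch. The body is an assumption and not the code of TestParsingSpeed: it assumes that `func` takes the text as its only argument and returns an iterable, and that `times` maps each `func_name` to elapsed seconds.

```python
import time
from typing import Callable, Dict


def check_time(text: str, func: Callable, func_name: str,
               times: Dict[str, float]) -> None:
    """Time one call of func(text) and store the elapsed seconds in times."""
    start = time.perf_counter()
    # Assumption: func returns an iterable (possibly a lazy generator),
    # so it is consumed here to make sure the work is actually done.
    for _ in func(text):
        pass
    times[func_name] = time.perf_counter() - start


times: Dict[str, float] = {}
sample = "The term is 30 days. " * 1000

# Stand-in callables; a real speed check would pass extraction functions.
check_time(sample, str.split, "split", times)
check_time(sample, lambda t: (w for w in t.split() if w.isdigit()), "digits", times)

for name, seconds in times.items():
    print(f"{name}: {seconds:.6f} s")
```

Consuming the result inside the timed region matters when the function is a generator, since otherwise only the creation of the generator would be measured.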

lexnlp.extract.en.tests.test_percent_plain module

class lexnlp.extract.en.tests.test_percent_plain.TestPercentPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_file_samples()
test_percent()
test_percent_amount()
test_percent_fraction()
test_percent_mix_fraction()

lexnlp.extract.en.tests.test_percents module

Percent unit tests for English.

This module implements unit tests for the percent extraction functionality in English.

Todo:
  • Better testing for exact text in return sources

  • More pathological and difficult cases

lexnlp.extract.en.tests.test_percents.test_get_percents()

Test default get percent behavior.

lexnlp.extract.en.tests.test_percents.test_get_percents_source()

Test get percent behavior with source return.

lexnlp.extract.en.tests.test_phone_plain module

class lexnlp.extract.en.tests.test_phone_plain.TestPhonePlain(methodName='runTest')

Bases: unittest.case.TestCase

test_file_samples()
test_phone()

lexnlp.extract.en.tests.test_pii module

PII unit tests for English.

This module implements unit tests for the PII extraction functionality in English.

Todo:
  • Better testing for exact text in return sources

  • Add more PII examples

class lexnlp.extract.en.tests.test_pii.TestPII

Bases: object

test_path = '/home/alex/dev/michael/contraxsuite/lexpredict-lexnlp/test_data/lexnlp/extract/en/tests/test_pii/'
test_pii_list()
test_pii_list_source()
test_ssn_list()

Test SSN detection.

test_ssn_list_source()

Test SSN detection.

test_us_phone_list()

Test US phone number detection.

test_us_phone_list_source()

Test US phone number detection.

lexnlp.extract.en.tests.test_ratios module

Ratio unit tests for English.

This module implements unit tests for the ratio extraction functionality in English.

Todo:
  • Better testing for exact text in return sources

  • More pathological and difficult cases

lexnlp.extract.en.tests.test_ratios.test_get_ratios()

Test ratio extraction.

lexnlp.extract.en.tests.test_ratios.test_get_ratios_source()

Test ratio extraction with source.

lexnlp.extract.en.tests.test_ratios_plain module

class lexnlp.extract.en.tests.test_ratios_plain.TestRatiosPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_file_samples()
test_ratio_slash()
test_ratios()

lexnlp.extract.en.tests.test_regulations module

Regulation unit tests for English.

This module implements unit tests for the regulation extraction functionality in English.

Todo:
  • Better testing for exact text in return sources

  • More pathological and difficult cases

  • test_parse_comission should pick one and only one record

class lexnlp.extract.en.tests.test_regulations.TestRegulations(methodName='runTest')

Bases: unittest.case.TestCase

test_get_regulations_csv()

Test default get regulations behavior.

test_parse_comission()

lexnlp.extract.en.tests.test_regulations_plain module

class lexnlp.extract.en.tests.test_regulations_plain.TestRegulationsPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_file_samples()
test_regulations()

lexnlp.extract.en.tests.test_span_tokenizer module

class lexnlp.extract.en.tests.test_span_tokenizer.TestSpanTokenizer(methodName='runTest')

Bases: unittest.case.TestCase

test_split_dont()
test_split_plain()
test_split_simplest_case()
test_split_with_quotes()

lexnlp.extract.en.tests.test_ssn_plain module

class lexnlp.extract.en.tests.test_ssn_plain.TestSsnPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_file_samples()
test_ssn()

lexnlp.extract.en.tests.test_trademarks module

Trademark unit tests for English.

This module implements unit tests for the trademark extraction functionality in English.

Todo:
  • Better testing for exact text in return sources

  • More pathological and difficult cases

lexnlp.extract.en.tests.test_trademarks.test_trademarks()

lexnlp.extract.en.tests.test_trademarks_plain module

class lexnlp.extract.en.tests.test_trademarks_plain.TestTrademarksPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_annotation_coords()
test_file_samples()
test_trademarks()

lexnlp.extract.en.tests.test_urls module

URL unit tests for English.

This module implements unit tests for the URL extraction functionality in English.

Todo:
  • Better testing for exact text in return sources

  • More pathological and difficult cases

lexnlp.extract.en.tests.test_urls.test_urls()

lexnlp.extract.en.tests.test_urls_plain module

class lexnlp.extract.en.tests.test_urls_plain.TestRatiosPlain(methodName='runTest')

Bases: unittest.case.TestCase

test_file_samples()
test_ratios()

Module contents