lexnlp.extract.en.tests package¶
Submodules¶
lexnlp.extract.en.tests.test_acts module¶
lexnlp.extract.en.tests.test_amounts module¶
Amount unit tests for English.
This module implements unit tests for the amount extraction functionality in English.
- Todo:
More pathological and difficult cases
-
lexnlp.extract.en.tests.test_amounts.
test_error_case_1
()¶ Test encountered error case. :return:
-
lexnlp.extract.en.tests.test_amounts.
test_error_case_2
()¶ Test encountered error case. :return:
-
lexnlp.extract.en.tests.test_amounts.
test_get_amount
()¶ Test default get amount behavior. :return:
-
lexnlp.extract.en.tests.test_amounts.
test_get_amount_non_round_float
()¶ Test get amount behavior with non-round float values. :return:
-
lexnlp.extract.en.tests.test_amounts.
test_get_amount_source
()¶ Test get amount behavior with source return. :return:
lexnlp.extract.en.tests.test_amounts_plain module¶
lexnlp.extract.en.tests.test_citations module¶
Citation unit tests for English.
This module implements unit tests for the citation extraction functionality in English.
- Todo:
Better testing for exact text in return sources
More pathological and difficult cases
-
lexnlp.extract.en.tests.test_citations.
test_get_citations
()¶ Test default get citation behavior. :return:
-
lexnlp.extract.en.tests.test_citations.
test_get_citations_as_dict
()¶
lexnlp.extract.en.tests.test_citations_plain module¶
lexnlp.extract.en.tests.test_conditions module¶
Condition unit tests for English.
This module implements unit tests for the condition extraction functionality in English.
- Todo:
Better testing for exact text in return sources
More pathological and difficult cases
-
lexnlp.extract.en.tests.test_conditions.
test_condition_fixed_example
()¶
lexnlp.extract.en.tests.test_conditions_plain module¶
lexnlp.extract.en.tests.test_constraints module¶
Constraints unit tests for English.
This module implements unit tests for the constraint extraction functionality in English.
- Todo:
Better testing for exact text in return sources
More pathological and difficult cases
-
lexnlp.extract.en.tests.test_constraints.
test_constraint_fixed_example
()¶
lexnlp.extract.en.tests.test_constraints_plain module¶
lexnlp.extract.en.tests.test_copyright module¶
Copyright unit tests for English.
This module implements unit tests for the copyright extraction functionality in English.
- Todo:
Better testing for exact text in return sources
More pathological and difficult cases
-
lexnlp.extract.en.tests.test_copyright.
test_copyright
()¶
lexnlp.extract.en.tests.test_copyright_plain module¶
-
class
lexnlp.extract.en.tests.test_copyright_plain.
TestCopyrightPlain
(methodName='runTest')¶ Bases:
unittest.case.TestCase
-
test_big_file
()¶
-
test_copyrights
()¶
-
test_file_samples
()¶
-
test_text_coords
()¶
-
-
lexnlp.extract.en.tests.test_copyright_plain.
get_copyright_verbose_annotations
(text: str) → Generator[[lexnlp.extract.common.annotations.copyright_annotation.CopyrightAnnotation, None], None]¶
lexnlp.extract.en.tests.test_courts module¶
Court/jurisdiction unit tests for English.
This module implements unit tests for the court/jurisdiction extraction functionality in English.
- Todo:
Re-introduce known bad cases with better master data or more flexible approach
More pathological and difficult cases
-
class
lexnlp.extract.en.tests.test_courts.
TestParseEnCourts
(methodName='runTest')¶ Bases:
unittest.case.TestCase
-
test_file_samples
()¶
-
test_parse_empty_text
()¶
-
test_parse_simply_text
()¶
-
-
lexnlp.extract.en.tests.test_courts.
test_court_config_setup
()¶ Test setup of CourtConfig object. :return:
-
lexnlp.extract.en.tests.test_courts.
test_court_config_setup_wo_alias
()¶
-
lexnlp.extract.en.tests.test_courts.
test_courts
()¶ Test court extraction. :return:
-
lexnlp.extract.en.tests.test_courts.
test_courts_longest_match
()¶ Test the case where the name or alias of one court is a substring of the name or alias of another. For each such conflict, the court with the longest alias should be returned. If the court with the shorter alias also has a separate match elsewhere in the text, both courts should be returned. :return:
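The longest-match rule described above can be illustrated with a minimal, self-contained sketch. It does not call LexNLP's court extraction code; the function `resolve_longest_match`, the `(start, end, court)` tuple layout, and the sample aliases are all hypothetical and exist only to show the expected behavior.

```python
from typing import List, Tuple

# Hypothetical match layout: (start offset, end offset, court name).
Match = Tuple[int, int, str]


def resolve_longest_match(matches: List[Match]) -> List[Match]:
    """Keep the longest of any overlapping matches; keep every non-overlapping match."""
    kept: List[Match] = []
    # Visit longer matches first so that they win any overlap conflict.
    for cand in sorted(matches, key=lambda m: (-(m[1] - m[0]), m[0])):
        overlaps = any(cand[0] < k[1] and k[0] < cand[1] for k in kept)
        if not overlaps:
            kept.append(cand)
    return sorted(kept)


text = "The Supreme Court of Ohio reversed; the Supreme Court declined review."

# Hypothetical alias table: the second alias is a substring of the first.
aliases = (
    ("Supreme Court of Ohio", "Supreme Court of Ohio"),
    ("Supreme Court", "U.S. Supreme Court"),
)

matches: List[Match] = []
for alias, court in aliases:
    start = text.find(alias)
    while start != -1:
        matches.append((start, start + len(alias), court))
        start = text.find(alias, start + 1)

resolved = resolve_longest_match(matches)
court_names = [court for _, _, court in resolved]
```

In this sketch the shorter alias matches twice: once inside the longer alias, where it is discarded, and once on its own later in the sentence, where it is kept. Both courts therefore appear in `court_names`.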
-
lexnlp.extract.en.tests.test_courts.
test_courts_rs
()¶ Test court extraction with return sources. :return:
lexnlp.extract.en.tests.test_cusip module¶
-
class
lexnlp.extract.en.tests.test_cusip.
TestGetCUSIP
(methodName='runTest')¶ Bases:
lexnlp.extract.de.tests.test_amounts.AssertionMixin
-
test_correct_cases
()¶
-
test_file_samples
()¶
-
test_wrong_cases
()¶
-
lexnlp.extract.en.tests.test_dates module¶
Date unit tests for English.
This module implements unit tests for the date extraction functionality in English.
- Todo:
Implement document-level date detection to identify anomalous dates
Better testing for exact text in return sources
Resolve example bad dates
More pathological and difficult cases
-
class
lexnlp.extract.en.tests.test_dates.
TestDates
(methodName='runTest')¶ Bases:
unittest.case.TestCase
-
check_dates_set
(date_src: List[Tuple[int, int, int]])¶ Test date extraction with provided dates.
-
test_build_model
()¶ Test build model by running default train. :return:
-
test_date_feature_1
()¶ Test date feature engineering.
-
test_date_feature_1_bigram
()¶ Test date feature engineering with bigrams.
-
test_date_may
()¶ Test that " may " alone does not parse.
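As an illustration of the behavior this test guards, the sketch below accepts the word "May" as part of a date only when a day or a year follows it. This is a simplified, hypothetical regular-expression filter written for this page, not the approach LexNLP itself uses; the names `MAY_DATE` and `find_may_dates` are assumptions.

```python
import re
from typing import List

# Hypothetical pattern: "May" counts as a date only when followed by a day
# number (optionally with a four-digit year) or by a four-digit year alone.
MAY_DATE = re.compile(r"\bMay\s+(?:\d{1,2}(?:,\s*\d{4})?|\d{4})\b", re.IGNORECASE)


def find_may_dates(text: str) -> List[str]:
    """Return date-like spans that start with the month name 'May'."""
    return [m.group(0) for m in MAY_DATE.finditer(text)]


modal_verb = find_may_dates("The Buyer may terminate this Agreement.")
full_date = find_may_dates("Dated May 5, 2017.")
month_year = find_may_dates("Delivery is due in May 2017.")
```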
-
test_fixed_date_set
()¶
-
test_fixed_dates
()¶ Test date extraction from fixed examples.
-
test_fixed_dates_nonstrict
()¶ Test date extraction from fixed examples in non-strict mode.
-
test_fixed_dates_source
()¶ Test date extraction from fixed examples with source.
-
test_fixed_raw_dates
()¶ Test raw date extraction from fixed examples.
-
test_get_raw_dates
()¶
-
test_random_date_set
()¶
-
-
lexnlp.extract.en.tests.test_dates.
expected_data_converter
(expected)¶
lexnlp.extract.en.tests.test_dates_plain module¶
-
class
lexnlp.extract.en.tests.test_dates_plain.
TestDatesPlain
(methodName='runTest')¶ Bases:
unittest.case.TestCase
-
test_another_may
()¶
-
test_august
()¶
-
test_date_first_aug
()¶
-
test_dates
()¶
-
test_dates_times
()¶
-
test_file_samples
()¶
-
test_fp
()¶
-
test_moar_dates
()¶
-
test_more_more_dates
()¶
-
test_no_dates
()¶
-
test_one_date_this
()¶
-
test_section
()¶
-
test_should_be_fixed
()¶
-
test_two_dates_strict
()¶
-
test_two_ranges
()¶
-
lexnlp.extract.en.tests.test_definitions module¶
Definition unit tests for English.
This module implements unit tests for the definition extraction functionality in English.
- Todo:
Better testing for exact text in return sources
More pathological and difficult cases
-
class
lexnlp.extract.en.tests.test_definitions.
TestEnglishDefinitions
(methodName='runTest')¶ Bases:
unittest.case.TestCase
-
test_abbr_strip
()¶
-
test_annotations
()¶
-
test_apostrophe_in_definition
()¶
-
test_capitalized_false_positive
()¶
-
test_capitalized_with_trigger
()¶
-
test_capitalized_with_trigger_in_the_middle_of_sentense
()¶
-
test_def_called
()¶
-
test_definition_fixed
()¶
-
test_definition_ml
()¶
-
test_definition_quoted
()¶
-
test_definition_quoted_new_line
()¶
-
test_definitions_in_one_sentence
()¶
-
test_definitions_in_sentences_text
()¶
-
test_definitions_simple
()¶
-
test_dot_in_definition
()¶
-
test_emma
()¶
-
test_enquoted
()¶
-
test_fp_pronoun
()¶
-
test_fp_service_words
()¶
-
test_include_multitoken_definition
()¶ The text (each an “Obligation” and collectively, the “Obligations”) is itself a definition, but the parser used to skip it because it has more than MAX_TERM_TOKENS (presently 5) words.
The behavior has therefore been changed: 10 words are now allowed because the text contains 2 possible definitions.
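One way to read the rule above is that the word limit scales with the number of quoted terms in the candidate text. The sketch below encodes that reading. It is an assumption made for illustration, not LexNLP's parser code; only the constant name `MAX_TERM_TOKENS` and its value of 5 come from the docstring, while `QUOTED` and `within_token_limit` are hypothetical.

```python
import re

MAX_TERM_TOKENS = 5  # per-term word limit quoted in the docstring above

# Hypothetical pattern for a term enclosed in straight or curly double quotes.
QUOTED = re.compile(r'[“"]([^”"]+)[”"]')


def within_token_limit(candidate: str) -> bool:
    """Allow MAX_TERM_TOKENS words for each quoted term found in the candidate."""
    quoted_terms = QUOTED.findall(candidate)
    allowed = MAX_TERM_TOKENS * max(1, len(quoted_terms))
    return len(candidate.split()) <= allowed


candidate = "each an “Obligation” and collectively, the “Obligations”"
word_count = len(candidate.split())
accepted = within_token_limit(candidate)
```

The candidate has more than 5 words, so a fixed limit would reject it, but with two quoted terms the scaled limit of 10 accepts it.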
-
test_merge_defs
()¶
-
test_merge_defs_consumed
()¶
-
test_misbrackets
()¶
-
test_newlines
()¶
-
test_noun_pattern_false_positive
()¶
-
test_obvious_embraced_definition
()¶
-
test_parenthesis
()¶
-
test_parse_in_extra_quotes
()¶
-
test_parse_moodys
()¶
-
test_process_ugly_braces_def
()¶
-
test_quotes_removed
()¶
-
test_reffered_to_def
()¶
-
test_reffered_to_def_excess_words
()¶
-
test_start_word_shall_be_false_positive
()¶
-
test_the
()¶
-
test_the_corporation_false_positive
()¶
-
test_too_long_definition
()¶
-
test_trigger_word_fullmatches
()¶
-
test_trim_defined_term
()¶
-
test_unbal_quotes
()¶
-
test_unpared_brackets
()¶
-
lexnlp.extract.en.tests.test_definitions_template module¶
-
class
lexnlp.extract.en.tests.test_definitions_template.
TestDefinitionsTemplate
(methodName='runTest')¶ Bases:
unittest.case.TestCase
-
test_file_samples
()¶
-
-
lexnlp.extract.en.tests.test_definitions_template.
get_definitions_sorted
(text: str)¶
lexnlp.extract.en.tests.test_dict_entities module¶
Dict entity general unit tests.
-
class
lexnlp.extract.en.tests.test_dict_entities.
TestDictEntities
(methodName='runTest')¶ Bases:
unittest.case.TestCase
-
test_abbreviations_simple
()¶
-
test_alias_is_blacklisted
()¶
-
test_am_pm_none
()¶
-
test_common_search_all_languages
()¶
-
test_conflicts_equal_length_take_same_language
()¶
-
test_conflicts_take_longest_match
()¶
-
test_equal_aliases_in_dif_languages
()¶
-
test_find_dict_entities_empty_text
()¶
-
test_get_alias_id
()¶
-
test_get_alias_text
()¶
-
test_get_entity_id
()¶
-
test_normalize_text
()¶
-
test_plural_case_matching
()¶
-
test_prepare_alias_blacklist_dict
()¶
-
lexnlp.extract.en.tests.test_distance module¶
Distance unit tests for English.
This module implements unit tests for the distance extraction functionality in English.
- Todo:
More pathological and difficult cases
-
lexnlp.extract.en.tests.test_distance.
test_get_distance
()¶ Test distance extraction. :return:
-
lexnlp.extract.en.tests.test_distance.
test_get_distance_source
()¶ Test distance extraction with source. :return:
lexnlp.extract.en.tests.test_distances_plain module¶
lexnlp.extract.en.tests.test_durations module¶
Duration unit tests for English.
This module implements unit tests for the duration extraction functionality in English.
- Todo:
Better testing for exact text in return sources
More pathological and difficult cases
-
lexnlp.extract.en.tests.test_durations.
test_get_durations
()¶ Test durations. :return:
-
lexnlp.extract.en.tests.test_durations.
test_get_durations_source
()¶ Test durations with source. :return:
lexnlp.extract.en.tests.test_durations_plain module¶
lexnlp.extract.en.tests.test_geoentities module¶
Geo entity unit tests for English.
This module implements unit tests for the geo entity extraction functionality in English.
-
lexnlp.extract.en.tests.test_geoentities.
load_entities_dict
()¶
-
lexnlp.extract.en.tests.test_geoentities.
test_geoentities
()¶
-
lexnlp.extract.en.tests.test_geoentities.
test_geoentities_alias_filtering
()¶
-
lexnlp.extract.en.tests.test_geoentities.
test_geoentities_counting
()¶
-
lexnlp.extract.en.tests.test_geoentities.
test_geoentities_en_equal_match_take_lowest_id
()¶
-
lexnlp.extract.en.tests.test_geoentities.
test_geoentities_en_equal_match_take_top_prio
()¶
lexnlp.extract.en.tests.test_geoentities_plain module¶
-
class
lexnlp.extract.en.tests.test_geoentities_plain.
TestGeoentitiesPlain
(methodName='runTest')¶ Bases:
unittest.case.TestCase
-
test_file_samples
()¶
-
test_multiline_address
()¶
-
test_simple_address
()¶
-
-
lexnlp.extract.en.tests.test_geoentities_plain.
make_geoconfig
()¶
-
lexnlp.extract.en.tests.test_geoentities_plain.
parse_geo_annotations
(text: str) → List[lexnlp.extract.common.annotations.geo_annotation.GeoAnnotation]¶
lexnlp.extract.en.tests.test_introductory_words_detector module¶
lexnlp.extract.en.tests.test_money module¶
Money unit tests for English.
This module implements unit tests for the money extraction functionality in English.
- Todo:
More pathological and difficult cases
-
class
lexnlp.extract.en.tests.test_money.
MoneyTest
(methodName='runTest')¶ Bases:
unittest.case.TestCase
-
test_get_money_order
()¶ At one point get_money() was returning money values in reversed order. This test ensures that the order is correct. :return:
-
test_get_money_problem1
()¶ Problem: it was returning 23.6 instead of 23.62 for such cases. :return:
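The two regressions covered by this class (result order and truncated decimals) can be shown with a small standalone sketch. It does not use LexNLP's get_money(); the pattern `MONEY`, the function `find_dollar_amounts`, and the assumption that the correct order is the order of appearance in the text are all hypothetical.

```python
import re
from decimal import Decimal
from typing import List, Tuple

# Hypothetical pattern: a dollar sign, digits with optional thousands
# separators, and an optional fractional part of any length.
MONEY = re.compile(r"\$\s*(\d{1,3}(?:,\d{3})+|\d+)(\.\d+)?")


def find_dollar_amounts(text: str) -> List[Tuple[Decimal, str]]:
    """Return (amount, currency) pairs in the order they occur in the text."""
    found = []
    for m in MONEY.finditer(text):  # finditer yields matches left to right
        digits = m.group(1).replace(",", "") + (m.group(2) or "")
        # Decimal keeps every fractional digit, so 23.62 is not cut to 23.6.
        found.append((Decimal(digits), "USD"))
    return found


amounts = find_dollar_amounts("Pay $23.62 now and $1,000 later.")
```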
-
-
lexnlp.extract.en.tests.test_money.
test_get_money
()¶ Test money extraction. :return:
-
lexnlp.extract.en.tests.test_money.
test_get_money_source
()¶ Test money extraction with source. :return:
lexnlp.extract.en.tests.test_money_plain module¶
-
class
lexnlp.extract.en.tests.test_money_plain.
TestMoneyPlain
(methodName='runTest')¶ Bases:
unittest.case.TestCase
-
test_file_samples
()¶
-
test_money
()¶
-
-
lexnlp.extract.en.tests.test_money_plain.
get_money_annotations_sorted
(text)¶
lexnlp.extract.en.tests.test_parsing_speed module¶
-
class
lexnlp.extract.en.tests.test_parsing_speed.
TestParsingSpeed
(methodName='runTest')¶ Bases:
unittest.case.TestCase
This method is not named test_XXX because it is not intended for (automatic) regression tests.
-
check_time
(text: str, func: Callable, func_name: str, times: Dict[str, float]) → None¶
-
en_parsers_speed
()¶
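A timing helper with the check_time signature listed above could look like the following sketch. Only the signature is taken from this page; the body is an assumption, including the choice of `time.perf_counter` and the expectation that `func` returns an iterable that must be consumed for the work to be measured.

```python
import time
from typing import Callable, Dict


def check_time(text: str, func: Callable, func_name: str, times: Dict[str, float]) -> None:
    """Run func on text, drain its results, and record the elapsed seconds."""
    start = time.perf_counter()
    # Assumption: func returns an iterable (possibly a lazy generator), so the
    # results are consumed here to make sure the parsing work is included.
    for _ in func(text):
        pass
    times[func_name] = time.perf_counter() - start


times: Dict[str, float] = {}
# str.split stands in for a parser function in this illustration.
check_time("alpha beta gamma " * 1000, str.split, "str.split", times)
```

Collecting the timings in a shared dictionary keyed by function name makes it easy to compare several parsers over the same text after a single run.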
-
lexnlp.extract.en.tests.test_percent_plain module¶
lexnlp.extract.en.tests.test_percents module¶
Percent unit tests for English.
This module implements unit tests for the percent extraction functionality in English.
- Todo:
Better testing for exact text in return sources
More pathological and difficult cases
-
lexnlp.extract.en.tests.test_percents.
test_get_percents
()¶ Test default get percent behavior. :return:
-
lexnlp.extract.en.tests.test_percents.
test_get_percents_source
()¶ Test get percent behavior with source return. :return:
lexnlp.extract.en.tests.test_phone_plain module¶
lexnlp.extract.en.tests.test_pii module¶
PII unit tests for English.
This module implements unit tests for the PII extraction functionality in English.
- Todo:
Better testing for exact text in return sources
Add more PII examples
-
class
lexnlp.extract.en.tests.test_pii.
TestPII
¶ Bases:
object
-
test_path
= '/home/alex/dev/michael/contraxsuite/lexpredict-lexnlp/test_data/lexnlp/extract/en/tests/test_pii/'¶
-
test_pii_list
()¶
-
test_pii_list_source
()¶
-
test_ssn_list
()¶ Test SSN detection. :return:
-
test_ssn_list_source
()¶ Test SSN detection with source return. :return:
-
test_us_phone_list
()¶ Test US phone number detection. :return:
-
test_us_phone_list_source
()¶ Test US phone number detection with source return. :return:
-
lexnlp.extract.en.tests.test_ratios module¶
Ratio unit tests for English.
This module implements unit tests for the ratio extraction functionality in English.
- Todo:
Better testing for exact text in return sources
More pathological and difficult cases
-
lexnlp.extract.en.tests.test_ratios.
test_get_ratios
()¶ Test ratio extraction. :return:
-
lexnlp.extract.en.tests.test_ratios.
test_get_ratios_source
()¶ Test ratio extraction with source. :return:
lexnlp.extract.en.tests.test_ratios_plain module¶
lexnlp.extract.en.tests.test_regulations module¶
Regulation unit tests for English.
This module implements unit tests for the regulation extraction functionality in English.
- Todo:
Better testing for exact text in return sources
More pathological and difficult cases
test_parse_comission should pick one and only one record
lexnlp.extract.en.tests.test_regulations_plain module¶
lexnlp.extract.en.tests.test_span_tokenizer module¶
lexnlp.extract.en.tests.test_ssn_plain module¶
lexnlp.extract.en.tests.test_trademarks module¶
Trademark unit tests for English.
This module implements unit tests for the trademark extraction functionality in English.
- Todo:
Better testing for exact text in return sources
More pathological and difficult cases
-
lexnlp.extract.en.tests.test_trademarks.
test_trademarks
()¶
lexnlp.extract.en.tests.test_trademarks_plain module¶
lexnlp.extract.en.tests.test_urls module¶
URL unit tests for English.
This module implements unit tests for the URL extraction functionality in English.
- Todo:
Better testing for exact text in return sources
More pathological and difficult cases
-
lexnlp.extract.en.tests.test_urls.
test_urls
()¶