15.1.17. crate_anon.anonymise.scrub¶
Copyright (C) 2015-2018 Rudolf Cardinal (rudolf@pobox.com).
This file is part of CRATE.
CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with CRATE. If not, see <http://www.gnu.org/licenses/>.
Scrubber classes for CRATE anonymiser.
-
class
crate_anon.anonymise.scrub.
NonspecificScrubber
(replacement_text: str, hasher: cardinal_pythonlib.hash.GenericHasher, anonymise_codes_at_word_boundaries_only: bool = True, anonymise_numbers_at_word_boundaries_only: bool = True, blacklist: crate_anon.anonymise.scrub.WordList = None, scrub_all_numbers_of_n_digits: List[int] = None, scrub_all_uk_postcodes: bool = False)[source]¶
-
class
crate_anon.anonymise.scrub.
PersonalizedScrubber
(replacement_text_patient: str, replacement_text_third_party: str, hasher: cardinal_pythonlib.hash.GenericHasher, anonymise_codes_at_word_boundaries_only: bool = True, anonymise_dates_at_word_boundaries_only: bool = True, anonymise_numbers_at_word_boundaries_only: bool = True, anonymise_numbers_at_numeric_boundaries_only: bool = True, anonymise_strings_at_word_boundaries_only: bool = True, min_string_length_for_errors: int = 4, min_string_length_to_scrub_with: int = 3, scrub_string_suffixes: List[str] = None, string_max_regex_errors: int = 0, whitelist: crate_anon.anonymise.scrub.WordList = None, nonspecific_scrubber: crate_anon.anonymise.scrub.NonspecificScrubber = None, debug: bool = False)[source]¶ Accepts patient-specific (patient and third-party) information, and uses that to scrub text.
-
add_value
(value: Any, scrub_method: crate_anon.anonymise.constants.SCRUBMETHOD, patient: bool = True, clear_cache: bool = True) → None[source]¶ Add a specific value via a specific scrub_method.
The patient flag controls whether it’s treated as a patient value or a third-party value.
-
-
class
crate_anon.anonymise.scrub.
ScrubberBase
(hasher: cardinal_pythonlib.hash.GenericHasher)[source]¶ Scrubber base class.
-
class
crate_anon.anonymise.scrub.
WordList
(filenames: Iterable[str] = None, words: Iterable[str] = None, replacement_text: str = '[---]', hasher: cardinal_pythonlib.hash.GenericHasher = None, suffixes: List[str] = None, at_word_boundaries_only: bool = True, max_errors: int = 0, regex_method: bool = False)[source]¶