API Reference
This section provides detailed information about the classes and functions provided by Zink.
Main Module (zink.zink)
- zink.zink.redact(text, categories=None, placeholder=None, use_cache=True, use_json_mapping=True, extractor=None, merger=None, replacer=None, auto_parallel=False, chunk_size=1000, max_workers=4)
Module-level convenience function that uses a global instance for caching. If 'auto_parallel' is True and len(text) > chunk_size, the text is processed by a concurrent pipeline; otherwise it is processed in a single pass.
- zink.zink.replace(text, categories=None, user_replacements=None, ensure_consistency=True, use_cache=True, use_json_mapping=True, extractor=None, merger=None, replacer=None, auto_parallel=False, chunk_size=1000, max_workers=4)
Module-level convenience function that uses a global instance for caching.
- zink.zink.replace_with_my_data(text, categories=None, user_replacements=None, ensure_consistency=True, use_json_mapping=True, extractor=None, merger=None, replacer=None, auto_parallel=False, chunk_size=1000, max_workers=4)
Module-level convenience function. 'replace_with_my_data' typically does not rely on caching, but large texts can still be processed concurrently when 'auto_parallel' is True.
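The 'auto_parallel' rule described for these functions can be illustrated with a small, self-contained sketch. It does not call Zink: `single_pass` is a hypothetical stand-in for the real single-pass logic, and the fixed-size chunking and thread pool are assumptions made for illustration, since the actual chunking strategy is not documented in this section.

```python
from concurrent.futures import ThreadPoolExecutor


def single_pass(text):
    # Hypothetical stand-in for the real single-pass redaction logic.
    return text.replace("Alice", "person_REDACTED")


def run(text, auto_parallel=False, chunk_size=1000, max_workers=4):
    # The concurrent path is taken only when it is requested AND the text
    # is longer than chunk_size; every other case falls back to one pass.
    if auto_parallel and len(text) > chunk_size:
        chunks = [text[i:i + chunk_size] for i in range(0, len(text), chunk_size)]
        with ThreadPoolExecutor(max_workers=max_workers) as pool:
            # Executor.map yields results in input order, so the joined
            # output keeps the original chunk order.
            return "".join(pool.map(single_pass, chunks))
    return single_pass(text)


sample = "Alice met Bob. " * 10  # 150 characters, 15 per sentence
parallel_result = run(sample, auto_parallel=True, chunk_size=15)
serial_result = run(sample)
```

Note that naive fixed-size chunks can cut an entity in half at a boundary; the sample above avoids this only because each sentence is exactly `chunk_size` characters long.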
Extractor Module (zink.extractor)
- class zink.extractor.EntityExtractor(model_name='deepanwa/NuNerZero_onnx')
Bases: object
- predict(text, labels=None)
Predict entities in the given text.
- Parameters:
text (str) -- The input text.
labels (list of str, optional) -- Only entities with these labels will be returned. If None, all detected entities are returned.
- Returns:
A list of dictionaries, each containing 'start', 'end', 'label', and 'text'.
- Return type:
list of dict
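To make the return shape concrete, here is a minimal sketch of the entity dictionaries and of the 'labels' filter. It does not load the model: the sample entities are written by hand, `filter_by_labels` is a hypothetical helper, and two details are assumptions rather than documented behavior: that 'end' is an exclusive offset and that labels are matched exactly.

```python
def filter_by_labels(entities, labels=None):
    # labels=None means "return everything", mirroring the documented default.
    if labels is None:
        return list(entities)
    wanted = set(labels)
    return [e for e in entities if e["label"] in wanted]


text = "John works at Acme in Paris."

# Hand-written entities in the documented shape; 'end' is assumed exclusive,
# so text[start:end] recovers the entity text.
entities = [
    {"start": 0, "end": 4, "label": "person", "text": "John"},
    {"start": 14, "end": 18, "label": "company", "text": "Acme"},
    {"start": 22, "end": 27, "label": "location", "text": "Paris"},
]

people_and_places = filter_by_labels(entities, labels=["person", "location"])
everything = filter_by_labels(entities)
```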
Merger Module (zink.merger)
Result Module (zink.result)
- class zink.result.PseudonymizationResult(original_text: str, anonymized_text: str, replacements: List[ReplacementDetail] = <factory>, features: Dict = <factory>)
Bases: object
Result of the pseudonymization process.
- anonymized_text: str
- features: Dict
- original_text: str
- replacements: List[ReplacementDetail]
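The `<factory>` defaults in the signature suggest a dataclass with default factories. The sketch below shows what such a definition could look like; it is not the library's source. The `ReplacementDetail` fields are hypothetical (they are not documented in this section), and the use of `list` and `dict` as the factories is an assumption.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class ReplacementDetail:
    # Hypothetical fields, for illustration only.
    original: str
    pseudonym: str


@dataclass
class PseudonymizationResult:
    original_text: str
    anonymized_text: str
    # default_factory gives each instance its own list/dict instead of
    # sharing one mutable default across instances.
    replacements: List[ReplacementDetail] = field(default_factory=list)
    features: Dict = field(default_factory=dict)


result = PseudonymizationResult(
    original_text="John lives in Paris.",
    anonymized_text="Alex lives in Rome.",
)
result.replacements.append(ReplacementDetail(original="John", pseudonym="Alex"))
other = PseudonymizationResult(original_text="", anonymized_text="")
```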
Replacer Subpackage (zink.replacer)
- class zink.replacer.EntityReplacer(use_json_mapping=False)
- replace_entities(entities, text, user_replacements=None)
Replace entities in the text with pseudonyms, with randomized replacements.
- Parameters:
entities (list of dict) -- A list of dictionaries, each containing 'start', 'end', 'label', and 'text'.
text (str) -- The original text.
user_replacements (dict, optional) -- A dictionary of user-defined replacements for specific entity labels. If provided, these will override the JSON-based mappings.
- Returns:
The text with entities replaced by pseudonyms.
- Return type:
str
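As a rough illustration of span-based replacement with randomized pseudonyms, the sketch below works on the documented entity dictionaries without using Zink. `DEFAULT_POOLS` and `replace_spans` are hypothetical names, the pools stand in for the JSON-based mappings, and treating 'user_replacements' as a mapping from label to a single string is an assumption.

```python
import random

# Hypothetical pseudonym pools standing in for the JSON-based mappings.
DEFAULT_POOLS = {
    "person": ["Alex Doe", "Sam Roe"],
    "location": ["Springfield", "Riverton"],
}


def replace_spans(entities, text, user_replacements=None, rng=random):
    user_replacements = user_replacements or {}
    # Replace from the last span to the first so earlier offsets stay valid
    # even when a pseudonym is longer or shorter than the original.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        label = ent["label"]
        if label in user_replacements:
            # User-supplied values take precedence over the default pools.
            pseudonym = user_replacements[label]
        elif label in DEFAULT_POOLS:
            pseudonym = rng.choice(DEFAULT_POOLS[label])
        else:
            pseudonym = f"{label}_REDACTED"
        text = text[:ent["start"]] + pseudonym + text[ent["end"]:]
    return text


text = "John lives in Paris."
entities = [
    {"start": 0, "end": 4, "label": "person", "text": "John"},
    {"start": 14, "end": 19, "label": "location", "text": "Paris"},
]
overridden = replace_spans(entities, text, {"person": "Jane", "location": "Rome"})
randomized = replace_spans(entities, text, rng=random.Random(0))
```

Because the pseudonym is drawn independently for each entity, two mentions of the same name may receive different replacements in this sketch.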
- replace_entities_ensure_consistency(entities, text, user_replacements=None)
Replace entities in the text with pseudonyms, ensuring consistent replacements.
- Parameters:
entities (list of dict) -- A list of dictionaries, each containing 'start', 'end', 'label', and 'text'.
text (str) -- The original text.
user_replacements (dict, optional) -- A dictionary of user-defined replacements for specific entity labels. If provided, these will override the JSON-based mappings.
- Returns:
The text with entities replaced by pseudonyms.
- Return type:
str
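One way to think about consistent replacement is a cache keyed by the entity's label and text, so repeated mentions map to the same pseudonym. The sketch below is an illustration under that assumption and is not Zink's implementation; `replace_consistently` and the numbered pseudonym format are hypothetical.

```python
def replace_consistently(entities, text):
    mapping = {}   # (label, entity text) -> pseudonym
    counters = {}  # label -> how many distinct entities seen so far

    # First pass, in document order: assign one pseudonym per distinct entity.
    for ent in sorted(entities, key=lambda e: e["start"]):
        key = (ent["label"], ent["text"])
        if key not in mapping:
            counters[ent["label"]] = counters.get(ent["label"], 0) + 1
            mapping[key] = f"{ent['label']}_{counters[ent['label']]}"

    # Second pass, right to left: substitute without shifting earlier offsets.
    for ent in sorted(entities, key=lambda e: e["start"], reverse=True):
        pseudonym = mapping[(ent["label"], ent["text"])]
        text = text[:ent["start"]] + pseudonym + text[ent["end"]:]
    return text


text = "Ann met Bob. Ann left."
entities = [
    {"start": 0, "end": 3, "label": "person", "text": "Ann"},
    {"start": 8, "end": 11, "label": "person", "text": "Bob"},
    {"start": 13, "end": 16, "label": "person", "text": "Ann"},
]
consistent = replace_consistently(entities, text)
```

Both mentions of "Ann" receive the same pseudonym here, which is the property the method name refers to.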
This subpackage provides various replacement strategies. It is used internally by the main zink.replace function.