Anonymize: Realistic random data

The Anonymize extension lets you generate anonymous data that resembles real data.

class recipe.Anonymize(*args, **kwargs)[source]

Allows ingredient values in recipes to be anonymized.

Anonymize extends Ingredient to support an anonymizer formatting string property that uses Faker providers (https://faker.readthedocs.io/en/master/providers.html) to generate a string value.

Alternatively, the Ingredient.anonymizer property can be a function that takes a value and returns an anonymized value.

Here is an example of both types of anonymizers:

# A formatting string
Dimension(MyTable.student_name, anonymizer='{fake:name}')

# A function that reverses the string
Dimension(MyTable.student_name, anonymizer=lambda v: v[::-1])
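A function anonymizer can be any callable that takes a value and returns a replacement. The following is a minimal sketch of two such callables; the names reverse_anonymizer and hash_anonymizer and the "student-" prefix are hypothetical and not part of the library.

```python
import hashlib


def reverse_anonymizer(value):
    """Reverse the string, matching the lambda shown above."""
    return value[::-1]


def hash_anonymizer(value):
    """Map a value to a short, stable pseudonym derived from its SHA-256 hash.

    Hypothetical helper: the same input always yields the same output,
    so grouping by the anonymized value still works.
    """
    digest = hashlib.sha256(str(value).encode("utf-8")).hexdigest()
    return f"student-{digest[:8]}"
```

Either function could then be passed as the anonymizer argument in place of the lambda, for example anonymizer=hash_anonymizer. Note that simple transforms like reversal or an unsalted hash are easy to undo or guess, so they are better suited to demos than to protecting sensitive data.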

Anonymizer strings can contain multiple generators

Anonymizer strings are Python format strings:

anonymizer='{fake:random_uppercase_letter}{fake:random_uppercase_letter}{fake:random_uppercase_letter} - {fake:name}'

will generate values like ‘ABC - Bob Smith’.
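Because the anonymizer is a Python format string, each {fake:...} field is resolved independently, with the text after the colon passed as the format spec. The sketch below illustrates one way this mechanism could work using the __format__ protocol. StubProviders is a hypothetical stand-in for a Faker instance so that the example has no dependencies, and ProviderFormatter is an illustrative name, not the library's actual implementation.

```python
import random
import string


class StubProviders:
    """Hypothetical stand-in for a Faker instance (assumption, not Faker itself)."""

    def __init__(self, seed=0):
        self._rng = random.Random(seed)

    def random_uppercase_letter(self):
        return self._rng.choice(string.ascii_uppercase)

    def name(self):
        return self._rng.choice(["Bob Smith", "Ana Lopez", "Wei Chen"])


class ProviderFormatter:
    """Calls the provider method named by the format spec, e.g. '{fake:name}'."""

    def __init__(self, providers):
        self._providers = providers

    def __format__(self, spec):
        # str.format passes everything after the colon as `spec`.
        return str(getattr(self._providers, spec)())


template = (
    "{fake:random_uppercase_letter}{fake:random_uppercase_letter}"
    "{fake:random_uppercase_letter} - {fake:name}"
)
# Each field triggers a separate provider call, so the three letters can differ.
value = template.format(fake=ProviderFormatter(StubProviders()))
```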

Parameterized anonymizer strings

Parameters can optionally be provided following a bar “|” in a key1=value1,key2=value2 format. For example:

anonymizer='{fake:lexify|text=????,letters=abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ}'

is the same as the following:

from faker import Faker

fake = Faker()
fake.lexify(text="????", letters="abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ")
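To make the parameter syntax concrete, here is a minimal sketch of how the part after "fake:" could be split into a provider name and keyword arguments. The function parse_spec is hypothetical and is not taken from the library's source. It keeps every value as a string and does not handle values that themselves contain commas.

```python
def parse_spec(spec):
    """Split 'provider|k1=v1,k2=v2' into ('provider', {'k1': 'v1', 'k2': 'v2'})."""
    provider, _, raw_params = spec.partition("|")
    kwargs = {}
    if raw_params:
        for pair in raw_params.split(","):
            # Split on the first '=' only, so values may contain '='.
            key, _, val = pair.partition("=")
            kwargs[key.strip()] = val
    return provider, kwargs


provider, kwargs = parse_spec("lexify|text=????,letters=abc")
# provider == "lexify"; kwargs == {"text": "????", "letters": "abc"}
```

With a Faker instance in hand, the parsed result could then be dispatched as getattr(fake, provider)(**kwargs), which corresponds to the fake.lexify(...) call shown above.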

Details

This flips the anonymize flag on all Ingredients used in the recipe.

It injects a boolean ingredient.meta._anonymize property on each ingredient used.

Anonymize should occur last in the list of extension_classes.

add_ingredients()[source]

Puts the anonymizers in the last position of each ingredient's formatters.
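The effect of placing the anonymizer last can be sketched as follows, assuming formatters are applied to a value in list order. The apply_formatters helper and the sample formatters are hypothetical and only illustrate the ordering.

```python
def apply_formatters(value, formatters):
    """Apply each formatter in turn, feeding the output of one into the next."""
    for formatter in formatters:
        value = formatter(value)
    return value


def reverse_anonymizer(value):
    return value[::-1]


# Ordinary display formatters come first...
formatters = [str.strip, str.title]
# ...and the anonymizer is appended so it runs on the fully formatted value.
formatters.append(reverse_anonymizer)

result = apply_formatters("  bob smith ", formatters)
# result == "htimS boB"
```

Under this assumption, no later formatter can alter the anonymized output, and the anonymizer always sees the value in its final display form.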

anonymize(value)[source]

Sets whether this recipe should be anonymized. The default is False.