Ingredient: Using anonymizers
This example defines a shelf whose state dimension carries an anonymizer, then builds a recipe from it with the Anonymize extension.
from examples_base import *
shelf = Shelf({
    'state': Dimension(Census.state, anonymizer=lambda v: v[::-1]),
    'age': WtdAvgMetric(Census.age, Census.pop2000),
    'gender': Dimension(Census.gender),
    'population': Metric(func.sum(Census.pop2000), formatters=[
        lambda value: int(round(value, -6) / 1000000)
    ])
})
recipe = Recipe(shelf=shelf, session=oven.Session(), extension_classes=[Anonymize])\
    .dimensions('state').metrics('population')
# Look at the output.
print(recipe.to_sql())
recipe.dataset.df
SELECT census.gender AS gender,
sum(census.pop2000) AS population_raw
FROM census
GROUP BY census.gender
| | gender | population_raw | gender_id | population |
|---|---|---|---|---|
| 0 | F | 143534804 | F | 144 |
| 1 | M | 137392517 | M | 137 |
Formatters are Python code that runs after the row data is retrieved from the database. The original value is available as ingredient_raw. The SQL query returns the ingredient_raw value, and the ingredient value is added by calling the formatter.
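The relationship between the raw and formatted values can be sketched in plain Python, without the recipe library. The two lambdas are copied from the shelf definition above; `apply_formatters` is a hypothetical helper written for this illustration and is not part of the library's API. The sketch also assumes an anonymizer is an ordinary callable applied to a single value, in the same way as a formatter.

```python
# Formatter and anonymizer copied from the shelf definition above.
population_formatter = lambda value: int(round(value, -6) / 1000000)
state_anonymizer = lambda v: v[::-1]


def apply_formatters(raw_value, formatters):
    """Hypothetical helper: run each formatter in order on the raw value."""
    value = raw_value
    for formatter in formatters:
        value = formatter(value)
    return value


# Rows as they come back from the SQL query: only the *_raw value is present.
rows = [
    {"gender": "F", "population_raw": 143534804},
    {"gender": "M", "population_raw": 137392517},
]

# Add the formatted value next to the raw one, after the query has run.
for row in rows:
    row["population"] = apply_formatters(
        row["population_raw"], [population_formatter]
    )
    print(row["gender"], row["population_raw"], row["population"])

# Assumption: the anonymizer is called on a plain string value.
print(state_anonymizer("Vermont"))  # prints "tnomreV"
```

Both values stay available on each row, which is why the output table has a population_raw column as well as a population column.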