15.1.9. crate_anon.anonymise.dd
Copyright (C) 2015-2018 Rudolf Cardinal (rudolf@pobox.com).
This file is part of CRATE.
CRATE is free software: you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation, either version 3 of the License, or (at your option) any later version.
CRATE is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
You should have received a copy of the GNU General Public License along with CRATE. If not, see <http://www.gnu.org/licenses/>.
Data dictionary classes for CRATE anonymiser.
The data dictionary is stored as a TSV file rather than a database table, for ease of editing by multiple authors.
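As a rough illustration of why a TSV file is convenient to work with, the sketch below parses a small tab-separated data dictionary using Python's standard csv module. The column names (src_db, src_table, src_field, dest_table) and the sample rows are assumptions made for illustration; this is not CRATE's actual file layout or loading code.

```python
import csv
import io

# Hypothetical TSV content; the column names are illustrative assumptions.
SAMPLE_TSV = (
    "src_db\tsrc_table\tsrc_field\tdest_table\n"
    "mydb\tpatient\tpatient_id\tpatient\n"
    "mydb\tnote\tnote_text\tnote\n"
)


def read_dd_rows(tsv_text: str) -> list:
    """Parse tab-separated text into a list of dicts, one per data row."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [dict(row) for row in reader]


rows = read_dd_rows(SAMPLE_TSV)
```

Because each row is plain text, several authors can edit the file in a spreadsheet or text editor and merge changes with ordinary version-control tools.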
-
class crate_anon.anonymise.dd.DataDictionary(config: Config)
Class representing an entire data dictionary.
-
check_against_source_db() → None
Check DD validity against the source database. Also caches the SQLAlchemy source column type.
-
check_valid(prohibited_fieldnames: List[str] = None, check_against_source_db: bool = True) → None
Check DD validity internally and, if check_against_source_db is true, also against the source database.
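The prohibited_fieldnames parameter suggests that one of the checks compares field names against a disallowed list. A minimal sketch of such a check follows; the helper find_prohibited and the sample field names are hypothetical, and this is not CRATE's implementation.

```python
from typing import List, Optional


def find_prohibited(
    fieldnames: List[str],
    prohibited_fieldnames: Optional[List[str]] = None,
) -> List[str]:
    """Return, in sorted order, the field names that are on the prohibited list."""
    # Treat a missing list (None) as "nothing is prohibited".
    prohibited = set(prohibited_fieldnames or [])
    return sorted(name for name in fieldnames if name in prohibited)


# Hypothetical field names and a hypothetical prohibited list.
clashes = find_prohibited(["id", "nhs_number", "note"], ["nhs_number", "surname"])
```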
-
get_dest_table_for_src_db_table
For a given source database/table, return the single or the first destination table.
-
get_dest_tables_for_src_db_table
For a given source database/table, return a SortedSet of destination tables.
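The relationship between get_dest_tables_for_src_db_table and get_dest_table_for_src_db_table can be sketched as follows. The Row type, the sample rows, and the choice to return None when nothing matches are all assumptions for illustration; the sketch uses a sorted list in place of a SortedSet to stay within the standard library.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    """Hypothetical, simplified stand-in for a data dictionary row."""
    src_db: str
    src_table: str
    dest_table: str


ROWS = [
    Row("db1", "note", "note_b"),
    Row("db1", "note", "note_a"),
    Row("db1", "patient", "patient"),
]


def dest_tables(rows: List[Row], src_db: str, src_table: str) -> List[str]:
    """Unique destination tables for one source table, in sorted order."""
    return sorted(
        {r.dest_table for r in rows
         if r.src_db == src_db and r.src_table == src_table}
    )


def first_dest_table(rows: List[Row], src_db: str, src_table: str) -> Optional[str]:
    """The single (or first, in sort order) destination table, if any."""
    tables = dest_tables(rows, src_db, src_table)
    return tables[0] if tables else None  # None on no match is an assumption
```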
-
get_dest_tables_with_patient_info
Return a SortedSet of destination table names that have patient information.
-
get_fieldnames_for_src_table
For a given source database name/table, return a SortedSet of source fields.
-
get_int_pk_ddr
For a given source database name and table, return the DD row for the integer PK for that table.
Returns None if there is no such data dictionary row.
-
get_int_pk_name
For a given source database name and table, return the field name of the integer PK for that table.
-
get_optout_defining_fields
Return a SortedSet of (src_db, src_table, src_field, pidfield, mpidfield) tuples.
-
get_patient_src_tables_with_active_dest
For a given source database name, return a SortedSet of source tables that have an active destination table.
-
get_pk_ddr
For a given source database name and table, return the DD row for the PK for that table, whether integer or not.
Returns None if there is no such data dictionary row.
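The difference between get_pk_ddr and get_int_pk_ddr, including the None result when no row matches, can be sketched like this. The Row type with its is_pk and is_int flags and the helper pk_row are hypothetical simplifications, not CRATE's row class or lookup code.

```python
from typing import List, NamedTuple, Optional


class Row(NamedTuple):
    """Hypothetical, simplified stand-in for a data dictionary row."""
    src_db: str
    src_table: str
    src_field: str
    is_pk: bool
    is_int: bool


ROWS = [
    Row("db1", "note", "note_id", True, True),
    Row("db1", "note", "text", False, False),
    Row("db1", "code", "code", True, False),  # non-integer PK
]


def pk_row(rows: List[Row], src_db: str, src_table: str,
           require_int: bool = False) -> Optional[Row]:
    """Return the PK row for a table, optionally only if it is an integer PK."""
    for r in rows:
        if r.src_db == src_db and r.src_table == src_table and r.is_pk:
            if require_int and not r.is_int:
                continue  # a PK exists, but it is not an integer
            return r
    return None  # no matching row
```

Callers should test the result against None before using it, since a table may have a non-integer PK or no PK row at all.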
-
get_rows_for_src_table
For a given source database name/table, return a SortedSet of DD rows.
-
get_scrub_from_db_table_pairs
Return a SortedSet of (source database name, source table) tuples for tables whose fields contain scrub_src (scrub-from) information.
-
get_scrub_from_rows
Return a SortedSet of DD rows for all fields containing scrub_src (scrub-from) information.
-
get_src_db_tablepairs_w_int_pk
Return a SortedSet of (source database name, source table) tuples for tables that have an integer PK.
-
get_src_db_tablepairs_w_pt_info
Return a SortedSet of (source database name, source table) tuples for tables that have patient information.
-
get_src_dbs_tables_for_dest_table
For a given destination table, return a SortedSet of (dbname, table) tuples.
-
get_src_dbs_tables_with_no_pt_info_int_pk
Return a SortedSet of (source database name, source table) tuples where the table has no patient information and has an integer PK.
-
get_src_dbs_tables_with_no_pt_info_no_pk
Return a SortedSet of (source database name, source table) tuples where the table has no patient information and no integer PK.
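The two "no patient information" queries split the same group of tables by whether an integer PK is present. A minimal sketch of that partition using plain set operations is shown below; the table names and the three input sets are made up for illustration, and sorted lists stand in for SortedSet.

```python
# Hypothetical (source database name, source table) pairs.
all_tables = {("db1", "patient"), ("db1", "lookup"), ("db1", "log")}
with_pt_info = {("db1", "patient")}
with_int_pk = {("db1", "patient"), ("db1", "lookup")}

# Tables with no patient information...
no_pt_info = all_tables - with_pt_info

# ...that have an integer PK,
no_pt_info_int_pk = sorted(no_pt_info & with_int_pk)

# ...and those that do not.
no_pt_info_no_pk = sorted(no_pt_info - with_int_pk)
```

Every table without patient information falls into exactly one of the two results.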
-
get_src_tables_with_active_dest
For a given source database name, return a SortedSet of source tables that have an active destination.
-
get_src_tables_with_patient_info
For a given source database name, return a SortedSet of source tables that have patient information.
-