BanzaiDB.fabfile package

Submodules

BanzaiDB.fabfile.variants module

BanzaiDB.fabfile.variants.extract_positions(position_list, strain_set, verbose)
BanzaiDB.fabfile.variants.fetch_given_strain_position(strain, position)

Given a strain ID and a ‘change’ position, return the known details

Prints the position, locus tag, product, class and subclass

Parameters:
  • strain (string) – the strain ID
  • position (int) – the position relative to the reference
Returns:

a dictionary (JSON)

BanzaiDB.fabfile.variants.filter_counts(list_of_elements, minimum)

Filter out elements of a list that are not observed a minimum number of times

Parameters:
  • list_of_elements (list) – a list of for example positions
  • minimum (int) – the minimum number of times a value must be observed
Returns:

a dictionary of value:observation key value pairs
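
The filtering described above can be sketched with collections.Counter. This is a minimal illustration, not the module's actual code: the name filter_counts_sketch is hypothetical, and it assumes "a minimum of times" means a count greater than or equal to minimum.

```python
from collections import Counter


def filter_counts_sketch(list_of_elements, minimum):
    """Hypothetical stand-in for filter_counts.

    Assumption: a value is kept when it is observed at least
    `minimum` times (count >= minimum).
    """
    counts = Counter(list_of_elements)
    # Build the value:observation dictionary, dropping rare values.
    return {value: seen for value, seen in counts.items() if seen >= minimum}


positions = [101, 101, 250, 101, 250, 999]
print(filter_counts_sketch(positions, 2))
```

Position 999 is seen once, so it is dropped when minimum is 2.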

BanzaiDB.fabfile.variants.get_num_strains()

Get the number of strains in the study

Queries all strains in the database and accounts for whether the reference has been included in the run (if so, it is removed from the count)

Returns: the number of strains as an int
BanzaiDB.fabfile.variants.get_required_strains(strains)

Returns a list of the strains stored in the database if the argument strains=None

If strains is None, the database is queried for all stored strains.

If strains is not None, the strain string is simply split on the space delimiter.

Parameters: strains (string or None) – a space delimited string of strain IDs
Returns: a list of strains (if strains is None, all those stored in the database)
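
The two branches described above can be sketched as follows. This is only an illustration: get_required_strains_sketch and its fetch_all callback are hypothetical, with fetch_all standing in for the database query that the real function performs.

```python
def get_required_strains_sketch(strains, fetch_all=lambda: []):
    """Hypothetical stand-in for get_required_strains.

    `fetch_all` is an assumed callback that replaces the real
    database query so the sketch can run without a database.
    """
    if strains is None:
        # No strains given: fall back to everything that is stored.
        return list(fetch_all())
    # Strains given: split the string on the space delimiter.
    return strains.split(' ')


print(get_required_strains_sketch('AEXT01-FSL-S3-026 QMA0306.gz'))
print(get_required_strains_sketch(None, fetch_all=lambda: ['ASCC880519']))
```
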
BanzaiDB.fabfile.variants.get_variants_by_keyword

Return variants whose “Product” field matches regular_expression

Supported regular expression syntax: https://code.google.com/p/re2/wiki/Syntax

By default: print (in CSV) results with headers: StrainID, Position, LocusTag, Class, SubClass

Parameters:
  • regular_expression
  • ROW – [def = ‘Product’] toggle searching of other table headers
  • verbose – [def = True] toggle if printing results
  • plucking – [def = ‘StrainID Position LocusTag Class SubClass’] toggle headers based on table headers
Returns:

List containing JSON elements with the data: ‘StrainID’, ‘Position’, ‘LocusTag’, ‘Class’, ‘SubClass’ for each result
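
To make the matching and plucking behaviour concrete, here is a minimal in-memory sketch. It is not the module's implementation: the function name and the sample records are invented for illustration, and it uses Python's re module, whose syntax differs in places from the RE2 syntax the real query supports.

```python
import re

DEFAULT_PLUCKING = 'StrainID Position LocusTag Class SubClass'


def variants_by_keyword_sketch(variants, regular_expression,
                               row='Product', plucking=DEFAULT_PLUCKING):
    """Hypothetical in-memory analogue of get_variants_by_keyword.

    Assumption: `variants` is a list of dicts keyed by table header.
    """
    pattern = re.compile(regular_expression)
    fields = plucking.split()
    results = []
    for variant in variants:
        # Search the chosen column (Product by default) for a match.
        if pattern.search(str(variant.get(row, ''))):
            # Keep only the plucked headers in each result.
            results.append({field: variant.get(field) for field in fields})
    return results


# Invented sample records, for illustration only.
sample = [
    {'StrainID': 'S1', 'Position': 100, 'LocusTag': 'tag_0001',
     'Product': 'DNA gyrase subunit A', 'Class': 'c1', 'SubClass': 's1'},
    {'StrainID': 'S2', 'Position': 200, 'LocusTag': 'tag_0002',
     'Product': 'hypothetical protein', 'Class': 'c2', 'SubClass': None},
]
print(variants_by_keyword_sketch(sample, 'gyrase'))
```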

BanzaiDB.fabfile.variants.get_variants_in_range

Return all the variants in the given [start:end] range (inclusive)

By default: print (in CSV) results with headers: StrainID, Position, LocusTag, Class, SubClass

Examples:

# All variants in the 1Kb range of 60K-61K
fab variants.get_variants_in_range:60000,61000

# Narrow down to a particular position and redefine the output
fab variants.get_variants_in_range:60760,60760,plucking='StrainID Position Class Product'
Parameters:
  • start – the genomic location start
  • end – the genomic location end
  • verbose – [def = True] toggle if printing results
  • plucking – [def = ‘StrainID Position LocusTag Class SubClass’] toggle headers based on table values
Returns:

List containing JSON elements with the data: ‘StrainID’, ‘Position’, ‘LocusTag’, ‘Class’, ‘SubClass’ for each result
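
The inclusive range semantics can be sketched on an in-memory list. The function name and sample records below are hypothetical; the int() conversions reflect the fact that arguments given on the fab command line arrive as strings.

```python
DEFAULT_PLUCKING = 'StrainID Position LocusTag Class SubClass'


def variants_in_range_sketch(variants, start, end, plucking=DEFAULT_PLUCKING):
    """Hypothetical in-memory analogue of get_variants_in_range.

    Assumption: `variants` is a list of dicts with an integer Position.
    """
    # Command-line task arguments are strings, so normalise to int.
    start, end = int(start), int(end)
    fields = plucking.split()
    return [
        {field: variant.get(field) for field in fields}
        for variant in variants
        # Both ends are included, so start == end selects one position.
        if start <= variant['Position'] <= end
    ]


# Invented sample records, for illustration only.
sample = [
    {'StrainID': 'S1', 'Position': 59999, 'Class': 'c1'},
    {'StrainID': 'S1', 'Position': 60000, 'Class': 'c1'},
    {'StrainID': 'S2', 'Position': 60760, 'Class': 'c2'},
    {'StrainID': 'S2', 'Position': 61000, 'Class': 'c2'},
]
print(variants_in_range_sketch(sample, '60760', '60760',
                               plucking='StrainID Position'))
```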

BanzaiDB.fabfile.variants.list_membership(combined, list1, list2)
BanzaiDB.fabfile.variants.plot_variant_positions

Generate a PDF of SNP positions for given strains using GenomeDiagram

Places the reference features on the outer ring

User has to provide a space delimited list of strains (see warning below)

BanzaiDB.fabfile.variants.position_counter(strains)

Pull all the positions at which we observe changes

Note

This query could be sped up?

BanzaiDB.fabfile.variants.strain_variant_stats

Print the number of variants and variant classes for all strains

Example usage:

fab variants.strain_variant_stats
fab variants.strain_variant_stats:'AEXT01-FSL-S3-026 QMA0306.gz'
Parameters:
  • strains – [def = None] Print info about all strains unless given a space delimited list of specific strains
  • verbose – [def = True] print to STDOUT
Returns:

a list of results in CSV

BanzaiDB.fabfile.variants.variant_hotspots

Return the most prevalent variant positions (default = 100)

Example usage:

fab variants.variant_hotspots
fab variants.variant_hotspots:250
Parameters:most_prevalent – [def = 100]
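
A rough idea of what "most prevalent" means can be sketched with Counter.most_common. The function name and the flat list of positions are assumptions made for illustration; the real task reads its positions from the database.

```python
from collections import Counter


def variant_hotspots_sketch(positions, most_prevalent=100):
    """Hypothetical analogue of variant_hotspots.

    Assumption: `positions` holds one entry per observed variant,
    so a position's count is the number of times it was seen.
    """
    # most_common returns (position, count) pairs, highest count first.
    return Counter(positions).most_common(int(most_prevalent))


print(variant_hotspots_sketch([5, 5, 7, 5, 7, 9], most_prevalent=2))
```
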
BanzaiDB.fabfile.variants.variant_positions_within_atleast

Return positions that have at least this many variants

By default the minimum number will be equal to all the strains in the study.

Example usage:

fab variants.variant_positions_within_atleast
fab variants.variant_positions_within_atleast:16
Parameters:minimum_at_position – [def = None] minimum number of variants conserved in N strains at this position
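
The default behaviour (a minimum equal to the number of strains) can be sketched as below. Everything here is an assumption for illustration: the function name, the (strain, position) input pairs, and deriving the strain count from the input rather than from the database as get_num_strains does.

```python
from collections import Counter


def positions_within_atleast_sketch(observations, minimum_at_position=None):
    """Hypothetical analogue of variant_positions_within_atleast.

    Assumption: `observations` is an iterable of (strain_id, position)
    pairs, and each strain counts at most once per position.
    """
    unique = set(observations)
    if minimum_at_position is None:
        # Default: the position must vary in every strain seen.
        minimum_at_position = len({strain for strain, _ in unique})
    counts = Counter(position for _, position in unique)
    return sorted(position for position, seen in counts.items()
                  if seen >= int(minimum_at_position))


obs = [('A', 10), ('B', 10), ('C', 10), ('A', 20)]
print(positions_within_atleast_sketch(obs))
print(positions_within_atleast_sketch(obs, 1))
```
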
BanzaiDB.fabfile.variants.what_differentiates_strains

Get variant positions that differentiate two given sets of strains

Example usage:

fab $BANZAIDB_LOCATION/fabfile/ variants.what_differentiates_strains:ASCC880519,'ASCC881171 ASCC881475'
Parameters:
  • strain_set1 – a space delimited string of strains that belong in set1
  • strain_set2 – a space delimited string of strains that belong in set2
Returns:

2 lists of JSON that define the variants that are unique to set1 (1st) and set2 (2nd)
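
One plausible reading of "unique to set1" and "unique to set2" is a pair of set differences over the positions seen in each group. The sketch below works under that assumption; the function name, the variants_by_strain mapping and the pooling of positions across each set are all hypothetical and may not match the real query.

```python
def what_differentiates_sketch(variants_by_strain, strain_set1, strain_set2):
    """Hypothetical analogue of what_differentiates_strains.

    Assumptions: `variants_by_strain` maps a strain ID to a set of
    variant positions, and positions are pooled across each strain set.
    """
    set1 = strain_set1.split(' ')
    set2 = strain_set2.split(' ')
    # Pool every position observed in any strain of each set.
    seen1 = set().union(*(variants_by_strain[s] for s in set1))
    seen2 = set().union(*(variants_by_strain[s] for s in set2))
    # Positions found on one side only, set1 first and set2 second.
    return sorted(seen1 - seen2), sorted(seen2 - seen1)


# Invented positions for the strain IDs used in the example above.
data = {
    'ASCC880519': {1, 2, 3},
    'ASCC881171': {2, 3, 4},
    'ASCC881475': {3, 5},
}
print(what_differentiates_sketch(data, 'ASCC880519', 'ASCC881171 ASCC881475'))
```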

Module contents