Package tfasta

tfasta: Parses and creates fasta files.

Copyright (c) 2014, James C. Stroud; All rights reserved.


Version: 0.3.1

Classes
  FastaTemplate
This class encapsulates template information for parsing fasta files.
Functions
  fasta_parser(filename, template=None, greedy=None, dogaps=False)
Given a filename, returns an iterator that iterates over the fasta file.
  make_fasta(name, seq, width=60)
Give it a sequence name and a sequence (seq) and it returns a fasta representation as a str.
  make_fasta_from_dict(adict, width=60)
Give it a dict of sequences keyed by name of the sequence and it returns a fasta representation as a str.
  string_fasta_parser(astr, template=None, dogaps=False)
Given astr (string of fasta), returns an iterator that iterates over the fasta file.
  io_fasta_parser(fastafile, template=None, dogaps=False)
Helper generator function for fasta_parser and string_fasta_parser.
Variables
  T_DEF = TEMPLATES['default']
  T_SWISS = TEMPLATES['swissprot']
  T_PDB = TEMPLATES['pdb']
  T_NR = TEMPLATES['nr']
  T_NRBLAST = TEMPLATES['nrblast']
  FASTA_WIDTH = 60
  __package__ = 'tfasta'
Function Details

fasta_parser(filename, template=None, greedy=None, dogaps=False)

Given a filename, returns an iterator over the records of the fasta file. Each yielded dictionary is keyed according to the fields in template and also includes the sequence under the key "sequence". Yielding dictionaries allows for flexibility in the types of fasta files parsed.

The file format is not checked, so make sure it's a fasta file.

Parameters:
  • filename (str) - name of the fasta file
  • template (FastaTemplate) - instance of the FastaTemplate class. Choose from TEMPLATES or define your own.
  • greedy (bool) - whether to read the whole fasta file into memory at once. Set to True for many smaller files, or to False for one or a few very large files.
  • dogaps (bool) - whether to keep "-" characters in the sequence after parsing the file
    • if False, gaps are removed from the sequence
    • keeping gaps (True) is handy when processing an alignment
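
The greedy option trades memory for fewer reads. The following is a minimal sketch of that trade-off, not tfasta's actual implementation. The helper name sketch_open_fasta is hypothetical, and the sketch assumes that greedy reading means loading the full text of the file into memory before parsing.

```python
import io
import os
import tempfile


def sketch_open_fasta(filename, greedy=False):
    """Hypothetical helper (not part of tfasta): return a file-like object."""
    if greedy:
        # Read everything at once and close the file immediately.
        with open(filename) as handle:
            return io.StringIO(handle.read())
    # Lazy: the caller reads line by line and must close the handle.
    return open(filename)


# Demonstrate both modes on a small temporary file.
with tempfile.NamedTemporaryFile("w", suffix=".fasta", delete=False) as tmp:
    tmp.write(">seq1\nACGT\n")
    path = tmp.name

greedy_lines = sketch_open_fasta(path, greedy=True).readlines()
with sketch_open_fasta(path, greedy=False) as lazy_handle:
    lazy_lines = list(lazy_handle)
os.remove(path)
```

Both modes produce the same lines. They differ in when the file is read and in how long the file handle stays open.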

make_fasta(name, seq, width=60)

Given a sequence name and a sequence (seq), returns a fasta representation as a str.

Parameters:
  • name (str) - name of sequence
  • seq (str) - sequence as a str
Returns: str
a string representation of a fasta record
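
As an illustration of what a fasta record looks like, here is a minimal sketch that does not use tfasta itself. The function name sketch_make_fasta is hypothetical. It assumes the conventional layout of a ">" header line followed by the sequence wrapped at width characters, and the exact output of make_fasta may differ (for example in trailing newlines).

```python
def sketch_make_fasta(name, seq, width=60):
    """Hypothetical stand-in for make_fasta; not tfasta's actual code."""
    lines = [">" + name]
    # Break the sequence into chunks of at most `width` characters.
    lines.extend(seq[i:i + width] for i in range(0, len(seq), width))
    return "\n".join(lines) + "\n"


record = sketch_make_fasta("seq1", "ACGT" * 4, width=10)
print(record)
```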

make_fasta_from_dict(adict, width=60)

Given a dict of sequences keyed by sequence name, returns a fasta representation as a str.

Parameters:
  • adict (dict) - dict of sequences keyed by name
Returns: str
fasta representation of sequences as a str
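
A minimal sketch of the dict-to-fasta idea, written without tfasta. The name sketch_make_fasta_from_dict is hypothetical, and the sketch assumes records are emitted in the dict's iteration order. The ordering used by make_fasta_from_dict is not stated here and may differ.

```python
def sketch_make_fasta_from_dict(adict, width=60):
    """Hypothetical stand-in for make_fasta_from_dict; not tfasta's code."""
    records = []
    for name, seq in adict.items():
        # Wrap each sequence at `width` characters under its ">" header.
        chunks = [seq[i:i + width] for i in range(0, len(seq), width)]
        records.append("\n".join([">" + name] + chunks))
    return "\n".join(records) + "\n"


text = sketch_make_fasta_from_dict({"a": "ACGT", "b": "TTGGCC"}, width=4)
print(text)
```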

string_fasta_parser(astr, template=None, dogaps=False)

Given astr (a string of fasta text), returns an iterator over the fasta records it contains. Each yielded dictionary is keyed according to the fields in template and also includes the sequence under the key "sequence". Yielding dictionaries allows for flexibility in the types of fasta files parsed.

This function will do its best to remove unneeded whitespace, including line breaks.

Apart from extra whitespace, astr must be properly formatted fasta text.

Parameters:
  • astr (str) - fasta text
  • template (FastaTemplate) - instance of the FastaTemplate class. Choose from TEMPLATES or define your own.
  • dogaps (bool) - whether to keep "-" characters in the sequence after parsing the text
    • if False, gaps are removed from the sequence
    • keeping gaps (True) is handy when processing an alignment
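
The whitespace cleanup described above can be pictured as follows. This is a sketch under assumptions, not the cleanup tfasta actually performs: the helper name sketch_clean_fasta_text is hypothetical, and it simply strips each line, drops empty lines, and wraps the result in a file-like object.

```python
import io


def sketch_clean_fasta_text(astr):
    """Hypothetical cleanup: strip each line and drop the empty ones."""
    lines = (line.strip() for line in astr.splitlines())
    return io.StringIO("\n".join(line for line in lines if line) + "\n")


# Indented text with blank lines, as might come from a pasted snippet.
messy = "\n    >seq1 example  \n    ACGT\n\n    TTGA  \n"
cleaned = sketch_clean_fasta_text(messy).getvalue()
print(cleaned)
```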

io_fasta_parser(fastafile, template=None, dogaps=False)

Helper generator function for fasta_parser and string_fasta_parser.

Given fastafile (a file-like object, open for reading), returns an iterator over the fasta records it contains. Each yielded dictionary is keyed according to the fields in template and also includes the sequence under the key "sequence".

Parameters:
  • fastafile - file-like object containing fasta text, opened for reading
  • template (FastaTemplate) - instance of the FastaTemplate class. Choose from TEMPLATES or define your own.
  • dogaps (bool) - whether to keep "-" characters in the sequence after parsing the file
    • if False, gaps are removed from the sequence
    • keeping gaps (True) is handy when processing an alignment
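
To make the generator pattern and the effect of dogaps concrete, here is a minimal self-contained sketch. It is not tfasta's implementation. The function name sketch_io_fasta_parser and the "name" key are hypothetical (in tfasta the keys come from the template); only the "sequence" key is taken from the description above.

```python
import io


def sketch_io_fasta_parser(fastafile, dogaps=False):
    """Hypothetical generator: yield one dict per fasta record."""
    name, chunks = None, []
    for line in fastafile:
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            # A new header closes the previous record, if there was one.
            if name is not None:
                yield {"name": name, "sequence": "".join(chunks)}
            name, chunks = line[1:], []
        elif name is not None:
            # Drop gap characters unless the caller asked to keep them.
            chunks.append(line if dogaps else line.replace("-", ""))
    if name is not None:
        yield {"name": name, "sequence": "".join(chunks)}


aligned = io.StringIO(">a\nAC-GT\n>b\nA--GT\n")
kept = [rec["sequence"] for rec in sketch_io_fasta_parser(aligned, dogaps=True)]
aligned.seek(0)
dropped = [rec["sequence"] for rec in sketch_io_fasta_parser(aligned)]
```

With dogaps=True the aligned sequences keep their "-" columns, so they stay the same length. With the default, the gaps are stripped and the lengths can differ.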