Bibulous is a drop-in replacement for BibTeX, with the primary advantage that the bibliography template format is compact and very easy to modify.
The basic program flow upon object initialization is as follows:
Reduce the case of the string to lower case, except for the first character in the string, and except if any given character is at nonzero brace level.
Parameters : | s : str
|
---|---|
Returns : | t : str
|
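The case-reduction rule above can be illustrated with a minimal sketch. The function name `sentence_case_sketch` is hypothetical and this is not the Bibulous implementation; it assumes curly braces are the only delimiters and ignores escaped braces.

```python
def sentence_case_sketch(s):
    """Lowercase a string except its first character and any braced text."""
    out = []
    level = 0  # current brace nesting depth
    for i, ch in enumerate(s):
        if ch == '{':
            level += 1
        elif ch == '}':
            level = max(level - 1, 0)
        # Keep the first character and anything at nonzero brace level as-is.
        if i == 0 or level > 0:
            out.append(ch)
        else:
            out.append(ch.lower())
    return ''.join(out)


print(sentence_case_sketch("The {GaAs} Band Structure"))
```

Text wrapped in braces keeps its capitalization, which is how BibTeX users conventionally protect acronyms and proper nouns.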
Split a string into tokens, taking care not to allow the separator to act unless at brace level zero.
Parameters : | s : str
|
---|---|
Returns : | tokens : list of str
|
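As a rough illustration of brace-aware splitting, the sketch below splits only where the separator occurs at brace level zero. The name `split_at_level_zero` and the default separator `' and '` are assumptions made for the example, not part of the documented API.

```python
def split_at_level_zero(s, sep=' and '):
    """Split s on sep, ignoring separators that sit inside curly braces."""
    tokens = []
    level = 0   # current brace nesting depth
    start = 0   # start index of the token being collected
    i = 0
    while i < len(s):
        ch = s[i]
        if ch == '{':
            level += 1
        elif ch == '}':
            level -= 1
        if level == 0 and s.startswith(sep, i):
            tokens.append(s[start:i])
            i += len(sep)
            start = i
            continue
        i += 1
    tokens.append(s[start:])
    return tokens


print(split_at_level_zero("Smith, John and {Barnes and Noble} and Doe, Jane"))
```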
Parse a name field (“author” or “editor”) of a BibTeX entry into a list of dicts, one for each person.
Parameters : | namefield : str
key : str, optional
disable : list of int, optional
|
---|---|
Returns : | namelist : list
|
Take a BibTeX string representing a single person’s name and parse it into its first, middle, last, etc pieces.
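BibTeX accepts names in three formats: “First von Last”, “von Last, First”, and “von Last, Jr, First”. These three categories can be separated by counting the number of commas that appear.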
Parameters : | namestr : str
disable : list of int, optional
|
---|---|
Returns : | namedict : dict
|
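The comma-counting idea can be sketched as follows. The function name `classify_name_format` and the returned labels are hypothetical; the sketch assumes that commas inside curly braces should not be counted.

```python
def classify_name_format(namestr):
    """Guess which BibTeX name format a string uses from its comma count."""
    level = 0
    ncommas = 0
    for ch in namestr:
        if ch == '{':
            level += 1
        elif ch == '}':
            level -= 1
        elif ch == ',' and level == 0:
            ncommas += 1  # only count commas outside braces
    formats = {
        0: 'First von Last',
        1: 'von Last, First',
        2: 'von Last, Jr, First',
    }
    return formats.get(ncommas, 'unrecognized')


print(classify_name_format("Ludwig van Beethoven"))
print(classify_name_format("van Beethoven, Ludwig"))
```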
From an input name element (first, middle, prefix, last, or suffix), convert it to its initials.
Parameters : | name : str
options : dict of bools, optional
|
---|---|
Returns : | new_name : str
|
Generate a list of level numbers for each character in a string.
Parameters : | s : str
delims : tuple of two strings
operator : str
|
---|---|
Returns : | levels : list of ints
|
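A minimal sketch of per-character level counting is shown below. It is an illustration only: the name `delim_levels_sketch` is hypothetical, the `operator` parameter is left out, delimiters are assumed to be single characters, and the delimiters themselves are assigned the level of the text that encloses them (the real function may use a different convention).

```python
def delim_levels_sketch(s, delims=('{', '}')):
    """Return the nesting level of each character in s."""
    levels = []
    level = 0
    for ch in s:
        if ch == delims[1]:
            level = max(level - 1, 0)  # leave the group before recording
        levels.append(level)
        if ch == delims[0]:
            level += 1                 # enter the group after recording
    return levels


print(delim_levels_sketch("a{b{c}d}e"))
```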
A debugging tool for showing delimiter levels and the input string next to one another.
Parameters : | s : str
levels : list of ints
|
---|
Return a list which gives the “quotation level” of each character in the string.
Parameters : | s : str
disable : list of ints, optional
|
---|---|
Returns : | alevels : list of ints
blevels : list of ints
clevels : list of ints
|
Notes
When using double-quotes, it is easy to break the parser, so they should be used only sparingly.
Split a string at locations given by a list of indices.
This can be used more flexibly than Python’s native string split() function, when the character you are splitting on is not always a valid splitting location.
Parameters : | s : str
ilist : list
|
---|---|
Returns : | slist : list of str
|
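Index-based splitting might look like the sketch below. The name `split_at_indices` is hypothetical, and the sketch assumes that the character at each split index is discarded, which may differ from the actual behavior.

```python
def split_at_indices(s, ilist):
    """Split s at the given character positions, dropping those characters."""
    pieces = []
    start = 0
    for i in sorted(ilist):
        pieces.append(s[start:i])
        start = i + 1  # skip the character at the split position
    pieces.append(s[start:])
    return pieces


# Split only at the commas outside the braces (indices 1 and 8).
print(split_at_indices("a,b{c,d},e", [1, 8]))
```

The caller decides which positions are valid (for example, only separators at brace level zero), which is what makes this more flexible than `str.split()`.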
Split a string using more than one separator.
Copied from http://stackoverflow.com/questions/1059559/python-strings-split-with-multiple-separators.
Parameters : | s : str
sep : list of str
|
---|---|
Returns : | res : list
|
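One common way to split on several separators uses the standard `re` module, as sketched below. This is not necessarily the approach taken in the linked answer or in Bibulous; the name `multisplit_sketch` is hypothetical.

```python
import re


def multisplit_sketch(s, seps):
    """Split s on any of the literal separator strings in seps."""
    # Try longer separators first so that a separator that is a prefix of
    # another one does not win the match; escape regex metacharacters.
    ordered = sorted(seps, key=len, reverse=True)
    pattern = '|'.join(re.escape(sep) for sep in ordered)
    return re.split(pattern, s)


print(multisplit_sketch("a, b; c", [", ", "; "]))
```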
This function will return the input string if it finds there are no nested operators inside (i.e. when the number of delimiters found is < 2).
Parameters : | s : str
delims : tuple of two strings
odd_operator : str
even_operator : str
disable : list of int, optional
|
---|---|
Returns : | s : str
|
Find nested quotes within strings and, if necessary, replace them with the proper nesting (i.e. outer quotes use ``...'' while inner quotes use `...').
Parameters : | s : str
disable : list of int, optional
|
---|---|
Returns : | s : str
|
Remove the LaTeX-based formatting elements from a string so that a sorting function can use alphanumerical sorting on the string.
Parameters : | s : str
|
---|---|
Returns : | p : str
|
Translate LaTeX-markup special characters to their Unicode equivalents.
Parameters : | s : str
|
---|---|
Returns : | s : str
|
From the middle name of a single person, check if any of the names should be placed into the “prefix” and move them there.
Parameters : | namedict : dict
|
---|---|
Returns : | namedict : dict
|
Given a bibliography entry’s edition number, format it as an ordinal (i.e. “1st”, “2nd” instead of “1”, “2”) in the way that it will appear on the formatted page.
Parameters : | edition_field : str
disable : list of int, optional
|
---|---|
Returns : | edition_ordinal_str : str
|
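The ordinal conversion can be sketched for English suffixes as below. The function name `edition_ordinal_sketch` is hypothetical, and returning non-numeric input unchanged is an assumption rather than documented behavior.

```python
def edition_ordinal_sketch(edition_field):
    """Turn '1' into '1st', '2' into '2nd', and so on (English only)."""
    if not edition_field.isdigit():
        return edition_field  # assumption: leave non-numeric editions alone
    n = int(edition_field)
    if 10 <= n % 100 <= 20:
        suffix = 'th'  # 11th, 12th, 13th, ...
    else:
        suffix = {1: 'st', 2: 'nd', 3: 'rd'}.get(n % 10, 'th')
    return f"{n}{suffix}"


print(edition_ordinal_sketch("2"))
print(edition_ordinal_sketch("11"))
```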
Write a bibliography database dictionary into a .bib file.
Parameters : | filename : str
bibdata : dict
abbrevs : dict, optional
|
---|
Given a string containing the “pages” field of a bibliographic entry, figure out the start and end pages.
Parameters : | pages_str : str
citekey : str, optional
disable : list of int, optional
|
---|---|
Returns : | startpage : str
endpage : str
|
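A simplified sketch of page-range parsing follows. The name `parse_pages_sketch` is hypothetical, and returning an empty string for a missing end page is an assumption; the real function also takes `citekey` and `disable`, which are omitted here.

```python
import re


def parse_pages_sketch(pages_str):
    """Split a 'pages' field such as '123--145' into (startpage, endpage)."""
    # Accept one or more hyphens, en dashes, or em dashes as the range mark.
    parts = re.split(r'\s*[-\u2013\u2014]+\s*', pages_str.strip(), maxsplit=1)
    startpage = parts[0]
    endpage = parts[1] if len(parts) > 1 else ''
    return startpage, endpage


print(parse_pages_sketch("123--145"))
print(parse_pages_sketch("77"))
```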
Given a string containing either a single “name” > “abbreviation” pair or a list of such pairs, parse the string into a dictionary of names and abbreviations.
Parameters : | abbrevstr : str
|
---|---|
Returns : | nameabbrev_dict : dict
|
Given a key that matches an already-present key in the input dictionary, generate a new key by appending zeros to the key string.
Parameters : | sortkey : str
sortdict : dict
|
---|---|
Returns : | newkey : str
|
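The zero-appending scheme described above could be sketched like this; the function name `unique_sortkey_sketch` is hypothetical.

```python
def unique_sortkey_sketch(sortkey, sortdict):
    """Append zeros to sortkey until it no longer collides with sortdict."""
    newkey = sortkey
    while newkey in sortdict:
        newkey += '0'
    return newkey


existing = {'smith2001': 'entry1', 'smith20010': 'entry2'}
print(unique_sortkey_sketch('smith2001', existing))
```

Because a trailing zero sorts directly after the original string, colliding entries stay adjacent in the sorted bibliography.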
Remove elements from a Python script which pose the most egregious security risks; also replace some identifiers with their correct namespace representation.
Parameters : | line : str
|
---|---|
Returns : | filtered : str
|
Check if an input string represents an integer value. Although a trivial function, it will be useful for user scripts.
Parameters : | s : str
|
---|---|
Returns : | is_integer : bool
|
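One plausible way to write such a check is sketched below, using Python's built-in `int()` conversion. The name `str_is_integer_sketch` is hypothetical, and the actual function may accept a different set of inputs.

```python
def str_is_integer_sketch(s):
    """Return True if s can be converted to an int, False otherwise."""
    try:
        int(s)
        return True
    except (TypeError, ValueError):
        return False


print(str_is_integer_sketch("42"))
print(str_is_integer_sketch("4.2"))
```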
Print a warning message, with the option to disable any given message.
Parameters : | msg : str
disable : list of int, optional
|
---|
Create an alpha-style citation key (typically the first three letters of the author’s last name, followed by the last two digits of the year).
Parameters : | entry : dict
|
---|---|
Returns : | alpha : str
|
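The typical alpha-style key can be sketched as below. The entry layout is an assumption: the keys `'lastname'` and `'year'` are hypothetical stand-ins, since the structure of the real entry dictionary is not described here, and multi-author handling is ignored.

```python
def alpha_key_sketch(entry):
    """Build a key from three letters of the last name and two year digits."""
    last = entry.get('lastname', '')   # hypothetical field name
    year = str(entry.get('year', ''))  # hypothetical field name
    return last[:3] + year[-2:]


print(alpha_key_sketch({'lastname': 'Knuth', 'year': '1984'}))
```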
Split a string, but only if the splitting character is at level 0 or 1 and not higher.
Parameters : | s : str
splitchar : str
levels : list of ints
|
---|---|
Returns : | split_list : list of str
|
Split the variable name into “name” (left-hand-side part), “iterator” (middle part), and “remainder” (the right-hand-side part).
With these three elements, we will know how to build a template variable inside the implicit loop.
Parameters : | variable : str
|
---|---|
Returns : | var_dict : dict
|
Get a list of the indexed variables within a template.
Parameters : | templatestr : str
|
---|---|
Returns : | indexed_vars : list of str
|
Get the number of names associated with a given entry, assuming priority to authornames and then to editornames.
Parameters : | entry : dict
templatestr : str
|
---|---|
Returns : | namelist : list of dicts
|
Format a list of dictionaries (one dict for each person) into a long string, with the format according to the directives in the bibliography style template.
Parameters : | namelist : list of dicts
nametype : str, {‘author’, ‘editor’}, optional
options : dict, optional
|
---|---|
Returns : | namestr : str
|
Convert a name dictionary into a formatted name string.
Parameters : | namedict : dict
options : dict, optional
|
---|---|
Returns : | namestr : str
|
Bibdata is a class to hold all data related to a bibliography database, a citation list, and a style template.
To initialize the class, either call it with the filename of the .aux file containing the relevant file locations (for the .bib database files and the .bst template files) or simply call it with a list of all filenames to be used (i.e. database_name.bib, style_template_name.bst and main_filename.aux). The output file (the LaTeX-formatted bibliography) is assumed to have the same filename root as the .aux file, but with .bbl as its extension.
Attributes
abbrevs | dict | The list of abbreviations given in the bibliography database file(s). The dictionary keys are the abbreviations, and the values are their full forms. |
bibdata | dict | The database of bibliography entries and fields derived from parsing the bibliography database file(s). |
bstdict | dict | The style template for formatting the bibliography. The dictionary keys are the entrytypes, with the dictionary values their string template. |
citedict | dict | The dictionary of citation keys, and their corresponding numerical order of citation. |
debug | bool | Whether to turn on debugging features. |
filedict | dict | The dictionary of filenames associated with the bibliographic data. The dictionary consists of keys bib, bst, aux, tex, and bbl. The first two are lists of filenames, while the others contain only a single filename. |
filename | str | (For error messages and debugging) The name of the file currently being parsed. |
i | int | (For error messages and debugging) The line of the file currently being parsed. |
options | dict | The dictionary containing the various option settings from the style template (BST) files. |
specials | dict | The dictionary containing the special templates from the BST file(s). |
abbrevkey_pattern | compiled regular expression object | The regex used to search for abbreviation keys. |
anybrace_pattern | compiled regular expression object | The regex used to search for curly braces { or }. |
anybraceorquote_pattern | compiled regular expression object | The regex used to search for curly braces or for double-quotes, i.e. {, }, or ". |
endbrace_pattern | compiled regular expression object | The regex used to search for an ending curly brace, i.e. ‘}’. |
quote_pattern | compiled regular expression object | The regex used to search for a double-quote, i.e. ". |
startbrace_pattern | compiled regular expression object | The regex used to search for a starting curly brace, {. |
culldata | bool | Whether to cull the database so that only cited entries are parsed. Setting this to False means that the entire BIB file database will be parsed. When True, the BIB file parser will only parse those entries corresponding to keys in the citedict. Setting this to True provides significant speedups for large databases. |
parse_only_entrykeys | bool | When comparing a database file against a citation list, all we are initially interested in are the entrykeys and not the data. So, in our first pass through the database, we can use this flag to skip the data and get only the keys themselves. |
Methods
parse_bibfile(filename) | Parse a “.bib” file to generate a dictionary representing a bibliography database. |
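parse_bibentry(entrystr, entrytype) | Given a string representing the entire contents of the BibTeX-format bibliography entry, parse the contents and place them into the bibliography preamble string, the set of abbreviations, and the bibliography database dictionary. |
parse_bibfield(entrystr[, entrykey]) | For a given string representing the raw contents of a BibTeX-format bibliography entry, parse the contents into a dictionary of key:value pairs corresponding to the field names and field values. |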
parse_auxfile(filename[, debug]) | Read in an .aux file and convert the \citation{} entries found there into a dictionary of citekeys and citation order number. |
parse_bstfile(filename) | Convert a Bibulous-type bibliography style template into a dictionary. |
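write_bblfile([filename, write_preamble, ...]) | Given a bibliography database (bibdata), a dictionary containing the citations called out (citedict), and a bibliography style template (bstdict), write the LaTeX-format file for the formatted bibliography. |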
create_citation_list() | Create the list of citation keys, sorted into the proper order. |
format_bibitem(citekey[, debug]) | Create the “\bibitem{...}” string to insert into the “.bbl” file. |
generate_sortkey(citekey) | From a bibliography entry and the formatting template options, generate a sorting key for the entry. |
insert_crossref_data(entrykey[, fieldname]) | Insert crossref info into a bibliography database dictionary. |
write_citeextract(outputfile[, write_abbrevs]) | Extract a sub-database from a large bibliography database, with the former containing only those entries cited in the .aux file (and any relevant cross-references). |
write_authorextract(searchname[, ...]) | Extract a sub-database from a large bibliography database, with the former containing only those entries citing the given author/editor. |
replace_abbrevs_with_full(fieldstr, resultstr) | Given an input str, locate the abbreviation key within it and replace the abbreviation with its full form. |
generate_bibitem_label(citekey) | Generate the bibitem label. |
get_bibfilenames(filename) | If the input is a filename ending in .aux, then read through the .aux file and locate the lines \bibdata{...} and \bibstyle{...} to get the filename(s) for the bibliography database and style template. |
check_citekeys_in_datakeys() | Check to see if all of the citation keys (from the AUX file) exist within the current set of database entrykeys. |
add_crossrefs_to_searchkeys() | Add any cross-referenced entrykeys into the searchkeys, the list which is used to cull the database so that only necessary entries are parsed. |
insert_specials(entrykey) | Insert “special” fields into a database entry. |
validate_templatestr(templatestr, entrytype) | Validate the template string so that it contains no formatting errors. |
fillout_implicit_loop(templatestr, entrykey) | From a template containing an implicit loop (‘...’ notation), build a full-size template without an ellipsis. |
template_substitution(templatestr, entrykey) | Substitute database entry variables into template string. |
insert_title_into_template(title_var, ...) | Insert the title field into a template string. |
remove_nested_template_options_brackets(...) | Given a template string, go through each options sequence [...] and search for undefined variables. |
remove_template_options_brackets(...) | Given a template string, go through each options sequence [...] and search for undefined variables. |
simplify_template_bracket(templatestr, ...) | From an “options train” [...|...|...], find the first fully defined block in the sequence. |
get_variable(bibentry, variable[, options]) | Get the variable (i.e. entry field) from within the current bibliography entry. |
get_indexed_variable(field, indexer[, ...]) | Get the result of dot-indexing into a field. |
Add any cross-referenced entrykeys into the searchkeys, the list which is used to cull the database so that only necessary entries are parsed.
Check to see if all of the citation keys (from the AUX file) exist within the current set of database entrykeys.
Returns : | is_complete : bool
|
---|
From a template containing an implicit loop (‘...’ notation), build a full-size template without an ellipsis.
Right now, the code only allows one implicit loop in any given template.
Parameters : | templatestr : str
|
---|---|
Returns : | new_templatestr : str
|
Create the “\bibitem{...}” string to insert into the “.bbl” file.
This is the workhorse function of Bibulous. For a given citation key, find the corresponding entry in the bibliography database. From the entry’s entrytype, look up the relevant template in bstdict and start replacing template variables with formatted elements of the database entry. Once all template variables have been replaced, formatting of that entry is complete. Write the result to the BBL file.
Parameters : | citekey : str
|
---|---|
Returns : | itemstr : str
|
Generate the bibitem label.
Parameters : | citekey : str
|
---|---|
Returns : | citelabel : str
|
From a bibliography entry and the formatting template options, generate a sorting key for the entry.
Parameters : | citekey : str
|
---|---|
Returns : | sortkey : str
|
If the input is a filename ending in .aux, then read through the .aux file and locate the lines \bibdata{...} and \bibstyle{...} to get the filename(s) for the bibliography database and style template.
Also determine whether to set the culldata flag. If the input is a single AUX filename, then the default is to set culldata=True. If the input is a list of filenames, then assume that this is the complete list of files to use (i.e. ignore the contents of the AUX file except for generating the citedict), and set culldata=False.
Parameters : | filename : str
|
---|---|
Returns : | filedict : dict
|
Get the result of dot-indexing into a field. This can be accessing an element of a list or dictionary, or the result of operating on a string with a function.
Parameters : | field : str
indexer : str
entrykey : str
options : dict
|
---|---|
Returns : | result : str
|
Get the variable (i.e. entry field) from within the current bibliography entry.
Parameters : | bibentry : dict
variable : str
options : dict
|
---|---|
Returns : | result : str
|
Insert crossref info into a bibliography database dictionary.
Loop through a bibliography database dictionary and, for each entry which has a “crossref” field, locate the crossref entry and insert any missing bibliographic information into the main entry’s fields.
Parameters : | entrykey : str
fieldname : str, optional
|
---|---|
Returns : | foundit : bool
|
Insert “special” fields into a database entry.
Parameters : | entrykey : str
|
---|
Insert the title field into a template string.
This requires more work than simply performing a string replacement, because there can be punctuation conflicts when the title itself ends with punctuation, while the template itself has punctuation immediately following the title.
Parameters : | title_var : str
templatestr : str
bibentry : dict
|
---|---|
Returns : | templatestr : str
|
Read in an .aux file and convert the \citation{} entries found there into a dictionary of citekeys and citation order number.
Parameters : | filename : str
|
---|
Given a string representing the entire contents of the BibTeX-format bibliography entry, parse the contents and place them into the bibliography preamble string, the set of abbreviations, and the bibliography database dictionary.
Parameters : | entrystr : str
entrytype : str
|
---|
For a given string representing the raw contents of a BibTeX-format bibliography entry, parse the contents into a dictionary of key:value pairs corresponding to the field names and field values.
Parameters : | entrystr : str
entrykey : str
|
---|---|
Returns : | fd : dict
|
Parse a “.bib” file to generate a dictionary representing a bibliography database.
Parameters : | filename : str
|
---|
Convert a Bibulous-type bibliography style template into a dictionary.
The resulting dictionary consists of keys which are the various entrytypes, and values which are the template strings. In addition, any formatting options are stored in the “options” key as a dictionary of option_name:option_value pairs.
Parameters : | filename : str
|
---|
Given a template string, go through each options sequence [...] and search for undefined variables. In each sequence, return only the block of each sequence in which all variables are defined (i.e. with outer braces removed).
Parameters : | templatestr : str
entry : dict
variables : list of str
|
---|
Given a template string, go through each options sequence [...] and search for undefined variables. In each sequence, return only the block of each sequence in which all variables are defined.
Parameters : | templatestr : str
entry : dict
variables : list of str
|
---|
Given an input str, locate the abbreviation key within it and replace the abbreviation with its full form.
Once the abbreviation key is found, remove it from the “fieldstr” and add the full form to the “resultstr”.
Parameters : | fieldstr : str
resultstr : str
|
---|---|
Returns : | fieldstr : str
resultstr : str
end_of_field : bool
|
From an “options train” [...|...|...], find the first fully defined block in the sequence.
A style template string contains grammatical features of the form [...|...|...], which we can call options sequences. Each “block” in the sequence (divided from the others by a | symbol) contains fields which, if defined, replace the entire options sequence in the returned string.
When the options sequence ends with “|]” (i.e. at least one of the blocks is required to be defined) but we find that none of the blocks have all their variables defined, then we replace the entire block with the “undefstr”.
Parameters : | templatestr : str
variables : list of str
bibentry : dict
|
---|---|
Returns : | arg : str
|
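The block-selection logic described above can be approximated with the sketch below. Several details are assumptions made for illustration: variables are taken to be written as `<name>`, a variable counts as defined when its name is a key of the entry dictionary, the default `undefstr` value is arbitrary, and an optional sequence with no defined block collapses to an empty string. The function name `first_defined_block` is hypothetical, and nested sequences are not handled.

```python
import re


def first_defined_block(sequence, entry, undefstr='???'):
    """Pick the first block of '[...|...|...]' whose variables are defined."""
    inner = sequence[1:-1]           # strip the outer square brackets
    required = inner.endswith('|')   # a trailing '|' marks a required sequence
    blocks = inner.split('|')
    if required:
        blocks = blocks[:-1]         # drop the empty block after the last '|'
    for block in blocks:
        names = re.findall(r'<([^<>]+)>', block)
        if all(name in entry for name in names):
            return block
    return undefstr if required else ''


entry = {'volume': '12'}
print(first_defined_block('[, <volume>(<number>)|, <volume>|]', entry))
```

Here the first block is skipped because `number` is undefined, so the second block is returned with its variables still unsubstituted.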
Substitute database entry variables into template string.
Parameters : | templatestr : str
entrykey : str
|
---|---|
Returns : | templatestr : str
|
Validate the template string so that it contains no formatting errors.
Parameters : | templatestr : str
entrytype : str
|
---|---|
Returns : | okay : bool
|
Extract a sub-database from a large bibliography database, with the former containing only those entries citing the given author/editor.
Parameters : | searchname : str or dict
outputfile : str, optional
write_abbrevs : bool
|
---|
Given the input database file(s) and style file(s), write out an AUX file containing citations to all unique database entries.
This function is only provided as a utility, and is not actually used except during troubleshooting.
Parameters : | filename : str
|
---|
Given a bibliography database (bibdata), a dictionary containing the citations called out (citedict), and a bibliography style template (bstdict), write the LaTeX-format file for the formatted bibliography.
Start with the preamble and then loop over the citations one by one, formatting each entry one at a time, and put \end{thebibliography} at the end when done.
Parameters : | filename : str, optional
write_preamble : bool, optional
write_postamble : bool, optional
bibsize : str, optional
|
---|
Extract a sub-database from a large bibliography database, with the former containing only those entries cited in the .aux file (and any relevant cross-references).
Parameters : | filedict : str
outputfile : str, optional
write_abbrevs : bool
|
---|