pycrossword  0.3
Pure-Python implementation of a crossword puzzle generator and editor
pycross.dbapi.Sqlitedb Class Reference

SQLite database driver implementation wrapping the standard Python sqlite3 methods.

Public Member Functions

def __init__ (self, dbname=None)
 Constructor initializes DB driver connection.

def __del__ (self)
 Destructor disconnects from DB.

def setpath (self, dbname, fullpath=False, recreate=False, connect=True)
 Initializes the path to the DB file and establishes a connection if required.

def connect (self)
 Connects to the DB file (Sqlitedb::dbpath).

def disconnect (self, commit_trailing=True)
 Disconnects from the currently open DB.

def create_db (self, overwrite=False)
 Creates the DB in Sqlitedb::dbpath, optionally overwriting the existing file.

def create_tables (self)
 Creates the default table structure in the DB.

def get_pos (self)
 Retrieves the list of parts of speech present in the DB.

def standard_posrules (self, lang)
 Returns the default Hunspell-formatted metadata patterns for the three common parts of speech (noun, verb, adjective).

def standard_replacements (self, lang)
 Returns the default replacement rules for a language to use in Hunspell imports.

def add_from_hunspell (self, dicfile, posrules, posrules_strict=True, posdelim='/', lcase=True, replacements=None, remove_hyphens=True, filter_out=None, commit_each=1000, on_word=None, on_commit=None)
 Imports a Hunspell-formatted dictionary file into the DB.

def add_all_from_hunspell (self, languages=None, on_commit=None, on_dict_add=None)
 Imports all Hunspell-formatted dictionaries found in 'assets/dic'.
 

Public Attributes

 dbpath
 str full path to the DB

 conn
 internal DB connection object (SQLite driver)
 

Detailed Description

SQLite database driver implementation wrapping the standard Python sqlite3 methods.

Convenience methods are provided to connect to and disconnect from the DB, to create or recreate the DB with the default set of tables (for use as a word source), and to import Hunspell dictionary data.

Constructor & Destructor Documentation

◆ __init__()

def pycross.dbapi.Sqlitedb.__init__ (   self,
  dbname = None 
)

Constructor initializes DB driver connection.

Parameters
dbname (str): path to database file (*.db) or an abbreviated language name for preinstalled DB files stored in 'assets/dic', e.g. 'en' (='assets/dic/en.db')

◆ __del__()

def pycross.dbapi.Sqlitedb.__del__ (   self)

Destructor disconnects from DB.

Member Function Documentation

◆ add_all_from_hunspell()

def pycross.dbapi.Sqlitedb.add_all_from_hunspell (   self,
  languages = None,
  on_commit = None,
  on_dict_add = None 
)

Imports all Hunspell-formatted dictionaries found in 'assets/dic'.

Warning
All imported dictionary files must have the '.dic' extension.
Parameters
languages (iterable): list of languages to import, e.g. ['en', 'fr'] (any others found will be skipped). Default = None (import all dictionaries found)
on_commit (callable): callback function called each time a batch of records is written to the DB.
Callback prototype is:
on_commit(records_committed: int, dic_file: str) -> None
on_dict_add (callable): callback function called each time a dictionary has been imported.
Callback prototype is:
on_dict_add(dic_file: str, lang: str, records_from_file: int, total_records: int) -> None
Returns
int number of words imported from the dictionaries (aggregate)
See also
add_from_hunspell()
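
The two callback prototypes above can be tried out without running a real import. The following is a minimal sketch that assumes only the documented signatures. The driver `fake_import` and its `dictionaries` argument are hypothetical stand-ins written for this illustration and are not part of pycross. Whether `records_committed` is cumulative per file is also an assumption.

```python
from typing import Callable, Dict, List, Optional

progress: List[str] = []


def on_commit(records_committed: int, dic_file: str) -> None:
    # Matches the documented prototype: on_commit(records_committed, dic_file)
    progress.append(f"{dic_file}: {records_committed} records committed")


def on_dict_add(dic_file: str, lang: str,
                records_from_file: int, total_records: int) -> None:
    # Matches the documented prototype for on_dict_add
    progress.append(f"{lang}: +{records_from_file} (total {total_records})")


def fake_import(dictionaries: Dict[str, int], commit_each: int = 1000,
                on_commit: Optional[Callable] = None,
                on_dict_add: Optional[Callable] = None) -> int:
    """Hypothetical stand-in: 'dictionaries' maps a file name to a word count."""
    total = 0
    for dic_file, count in dictionaries.items():
        committed = 0
        while committed < count:
            # Assumption: the counter passed to on_commit is cumulative per file
            committed = min(committed + commit_each, count)
            if on_commit:
                on_commit(committed, dic_file)
        total += count
        lang = dic_file.rsplit(".", 1)[0]
        if on_dict_add:
            on_dict_add(dic_file, lang, count, total)
    return total


total = fake_import({"en.dic": 2500, "fr.dic": 800},
                    on_commit=on_commit, on_dict_add=on_dict_add)
for line in progress:
    print(line)
```

With the real class, the same two functions would be passed through the `on_commit` and `on_dict_add` keyword arguments of add_all_from_hunspell().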

◆ add_from_hunspell()

def pycross.dbapi.Sqlitedb.add_from_hunspell (   self,
  dicfile,
  posrules,
  posrules_strict = True,
  posdelim = '/',
  lcase = True,
  replacements = None,
  remove_hyphens = True,
  filter_out = None,
  commit_each = 1000,
  on_word = None,
  on_commit = None 
)

Imports a Hunspell-formatted dictionary file into the DB.

Hunspell dictionaries can be downloaded from LibreOffice or GitHub. Default dictionaries and prebuilt SQLite databases are found in assets/dic.

Parameters
dicfile (str): path to the dictionary file to import (*.dic).
Warning
The file must be in plain text format, with each word on a new line, optionally followed by a slash (see the 'posdelim' argument) and metadata (parts of speech etc.)
Parameters
posrules (dict): part-of-speech regular expression parsing rules in the format:
{'N': 'regex for nouns', 'V': 'regex for verbs', ...}
     Possible keys are: 'N' [noun], 'V' [verb], 'ADV' [adverb], 'ADJ' [adjective], 
     'P' [participle], 'PRON' [pronoun], 'I' [interjection], 
     'C' [conjunction], 'PREP' [preposition], 'PROP' [proposition], 
     'MISC' [miscellaneous / other], 'NONE' [no POS]
 
posrules_strict (bool): if True (default), only words whose part of speech matches a rule in the posrules dict will be imported [all other words will be skipped]. If False, the remaining words will be imported with 'MISC' and 'NONE' POS markers.
posdelim (str): delimiter separating the word from its part of speech [default = '/']
lcase (bool): if True (default), words will be imported in lower case; otherwise, the original case is kept
replacements (dict): character replacement rules in the format:
{'char_from': 'char_to', ...}
Default = None (no replacements)
remove_hyphens (bool): if True (default), all hyphens ['-'] will be removed from the words
filter_out (dict): regex-based rules to filter out [exclude] words in the format:
{'word': ['regex1', 'regex2', ...], 'pos': ['regex1', 'regex2', ...]}
Words matching these rules will not be imported. The 'pos' rules can be used to screen out specific parts of speech. Match rules for words are applied AFTER replacements, in the order of the regex list. Default = None (no filter rules apply).
commit_each (int): number of insert operations after which the transaction will be committed (default = 1000)
on_word (callable): callback function called when a word is imported into the DB.
Callback prototype is:
on_word([word: str, part_of_speech: str, records_committed: int]) -> None
on_commit (callable): callback function called each time a batch of records is written to the DB. Callback prototype is:
on_commit(records_committed: int, dic_file: str) -> None
Returns
int number of words imported from the dictionary
See also
add_all_from_hunspell()
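
To show how the parameters above interact on a single dictionary line, here is a minimal sketch. It is not the library's implementation. The function `parse_hunspell_line`, the sample flag patterns, the choice of 'MISC' for unmatched metadata versus 'NONE' for missing metadata, and the matching of 'pos' filters against the POS key are all assumptions made for illustration.

```python
import re
from typing import Dict, List, Optional, Tuple


def parse_hunspell_line(line: str, posrules: Dict[str, str],
                        posrules_strict: bool = True, posdelim: str = "/",
                        lcase: bool = True,
                        replacements: Optional[Dict[str, str]] = None,
                        remove_hyphens: bool = True,
                        filter_out: Optional[Dict[str, List[str]]] = None
                        ) -> Optional[Tuple[str, str]]:
    """Illustrative only: return (word, pos), or None if the line is skipped."""
    line = line.strip()
    if not line:
        return None
    # Split "word/metadata" at the first delimiter
    word, _, meta = line.partition(posdelim)
    if lcase:
        word = word.lower()
    for char_from, char_to in (replacements or {}).items():
        word = word.replace(char_from, char_to)
    if remove_hyphens:
        word = word.replace("-", "")
    # First rule whose regex matches the metadata decides the part of speech
    pos = None
    if meta:
        for key, pattern in posrules.items():
            if re.search(pattern, meta):
                pos = key
                break
    if pos is None:
        if posrules_strict:
            return None
        # Assumption: unmatched metadata -> 'MISC', no metadata -> 'NONE'
        pos = "MISC" if meta else "NONE"
    # Word filters run after replacements, in list order
    rules = filter_out or {}
    if any(re.search(rx, word) for rx in rules.get("word", [])):
        return None
    if any(re.search(rx, pos) for rx in rules.get("pos", [])):
        return None
    return word, pos


sample_rules = {"N": r"[MS]", "V": r"[DG]"}  # hypothetical flag patterns
print(parse_hunspell_line("Well-being/MS", sample_rules))
print(parse_hunspell_line("quickly", sample_rules, posrules_strict=False))
```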

◆ connect()

def pycross.dbapi.Sqlitedb.connect (   self)

Connects to the DB file (Sqlitedb::dbpath).

Returns
bool True on success, False on failure
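
A boolean success flag of this kind can be built on the standard sqlite3 module by catching `sqlite3.Error`. The sketch below illustrates that pattern only. The helper `try_connect` is a hypothetical name, and the real method stores the connection in Sqlitedb::conn rather than returning it.

```python
import os
import sqlite3
import tempfile


def try_connect(dbpath: str):
    """Hypothetical helper: return (connection, True) or (None, False)."""
    conn = None
    try:
        conn = sqlite3.connect(dbpath)
        # Run a trivial query so that an unusable file is detected early
        conn.execute("SELECT 1")
        return conn, True
    except sqlite3.Error:
        if conn is not None:
            conn.close()
        return None, False


conn_ok, ok = try_connect(":memory:")

# A path inside a directory that does not exist cannot be opened
missing = os.path.join(tempfile.gettempdir(), "no_such_dir_pycross_demo", "x.db")
conn_bad, bad = try_connect(missing)
print(ok, bad)
```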

◆ create_db()

def pycross.dbapi.Sqlitedb.create_db (   self,
  overwrite = False 
)

Creates the DB in Sqlitedb::dbpath, optionally overwriting the existing file.

Parameters
overwrite (bool): True to overwrite the existing file (default = False)
Warning
If set to True, all data in the DB file (if present) will be lost!
Returns
bool True on success, False on failure
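
The overwrite semantics can be sketched with the standard library alone. This is an illustration under stated assumptions, not the library's code: the free function `create_db` is a stand-in for the method, and returning False when the file exists and `overwrite` is False is an assumption about the failure case.

```python
import os
import sqlite3
import tempfile


def create_db(dbpath: str, overwrite: bool = False) -> bool:
    """Stand-in sketch: create an empty SQLite file at dbpath."""
    if os.path.exists(dbpath):
        if not overwrite:
            # Assumption: an existing file is left alone and reported as failure
            return False
        os.remove(dbpath)  # all existing data is lost at this point
    try:
        conn = sqlite3.connect(dbpath)
        # Writing a pragma makes sure the database header reaches the disk
        conn.execute("PRAGMA user_version = 1")
        conn.close()
        return True
    except (sqlite3.Error, OSError):
        return False


with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, "demo.db")
    first = create_db(path)
    second = create_db(path)
    forced = create_db(path, overwrite=True)
    exists = os.path.exists(path)
print(first, second, forced, exists)
```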

◆ create_tables()

def pycross.dbapi.Sqlitedb.create_tables (   self)

Creates the default table structure in the DB.

Returns
bool True on success, False on failure
See also
utils::globalvars::SQL_TABLES

◆ disconnect()

def pycross.dbapi.Sqlitedb.disconnect (   self,
  commit_trailing = True 
)

Disconnects from the currently open DB.

Parameters
commit_trailing (bool): True (default) to commit all pending changes to the DB before disconnecting
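
The effect of `commit_trailing` can be seen with plain sqlite3: rows inserted but not committed are discarded when the connection is closed. A minimal sketch, assuming sqlite3's default transaction handling. The function `disconnect` here is a stand-in for the method, and the `words` table is a hypothetical example schema.

```python
import os
import sqlite3
import tempfile


def disconnect(conn: sqlite3.Connection, commit_trailing: bool = True) -> None:
    """Stand-in sketch: optionally commit pending changes, then close."""
    if commit_trailing:
        conn.commit()
    conn.close()


def rows_after_disconnect(commit_trailing: bool) -> int:
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "demo.db")
        conn = sqlite3.connect(path)
        conn.execute("CREATE TABLE words (word TEXT)")  # hypothetical table
        conn.commit()
        # This insert stays pending until commit() is called
        conn.execute("INSERT INTO words VALUES ('puzzle')")
        disconnect(conn, commit_trailing)
        # Reopen the file to see what was actually persisted
        check = sqlite3.connect(path)
        count = check.execute("SELECT COUNT(*) FROM words").fetchone()[0]
        check.close()
        return count


print(rows_after_disconnect(True), rows_after_disconnect(False))
```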

◆ get_pos()

def pycross.dbapi.Sqlitedb.get_pos (   self)

Retrieves the list of parts of speech present in the DB.

Returns
list parts of speech in the short form, e.g. ['N', 'V']
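
A query of this kind might look as follows. The actual table layout is defined in utils::globalvars::SQL_TABLES and is not reproduced on this page, so the table `demo_pos` and its column are purely hypothetical, and the sort order is an assumption.

```python
import sqlite3
from typing import List

conn = sqlite3.connect(":memory:")
# Hypothetical schema, used only for this illustration
conn.execute("CREATE TABLE demo_pos (pos TEXT PRIMARY KEY)")
conn.executemany("INSERT INTO demo_pos VALUES (?)", [("N",), ("V",), ("ADJ",)])
conn.commit()


def get_pos(connection: sqlite3.Connection) -> List[str]:
    """Return the short POS markers stored in the (hypothetical) table."""
    cursor = connection.execute("SELECT pos FROM demo_pos ORDER BY pos")
    return [row[0] for row in cursor]


pos_list = get_pos(conn)
print(pos_list)
conn.close()
```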

◆ setpath()

def pycross.dbapi.Sqlitedb.setpath (   self,
  dbname,
  fullpath = False,
  recreate = False,
  connect = True 
)

Initializes the path to the DB file and establishes a connection if required.

Parameters
dbname (str): path to database file (*.db) or an abbreviated language name (see __init__())
fullpath (bool): True to indicate that the 'dbname' argument is the full file path (default = False)
recreate (bool): True to recreate the database file with the default table structure (default = False).
Warning
If set to True, all data in the DB file (if present) will be lost!
Parameters
connect (bool): True (default) to attempt connecting to the DB immediately
Returns
bool True on success, False on failure
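
The way `dbname` and `fullpath` combine can be sketched as a small path helper. This is an assumption-based illustration: `resolve_dbpath` is a hypothetical function, and resolving 'assets/dic' relative to the current directory is a simplification, since the real base directory is not stated here.

```python
import os

ASSETS_DIC = os.path.join("assets", "dic")  # location named in the docs


def resolve_dbpath(dbname: str, fullpath: bool = False) -> str:
    """Hypothetical helper: map a language code or a path to a DB file path."""
    if fullpath:
        # The caller already supplied the complete file path
        return dbname
    # Otherwise treat dbname as an abbreviated language name, e.g. 'en'
    return os.path.join(ASSETS_DIC, dbname + ".db")


print(resolve_dbpath("en"))
print(resolve_dbpath(os.path.join("data", "custom.db"), fullpath=True))
```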

◆ standard_posrules()

def pycross.dbapi.Sqlitedb.standard_posrules (   self,
  lang 
)

Returns the default Hunspell-formatted metadata patterns for the three common parts of speech (noun, verb, adjective).

The returned patterns depend on the language.

Parameters
lang (str): language for which the matching patterns are requested, e.g. 'en' or 'ru'
Returns
dict POS to regex pattern matching table in the format:
{'N': 'regex pattern for nouns', 'V': 'regex pattern for verbs', 'ADJ': 'regex pattern for adjectives'}
If the language is invalid (none of 'en', 'ru', 'fr' or 'de'), None is returned.
Reimplement this method as needed to support other languages / parts of speech formats.
See also
add_from_hunspell()
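
Reimplementing the method for another language could follow the usual subclass-and-delegate pattern. In the sketch below, `DriverStub` is a hypothetical stand-in for the real class so that the example runs on its own, and every regex pattern is a placeholder rather than a pattern shipped with pycross.

```python
from typing import Dict, Optional


class DriverStub:
    """Hypothetical stand-in for the real driver class."""

    def standard_posrules(self, lang: str) -> Optional[Dict[str, str]]:
        if lang == "en":
            # Placeholder patterns, not the library's actual defaults
            return {"N": r"po:noun", "V": r"po:verb", "ADJ": r"po:adj"}
        return None


class ExtendedDriver(DriverStub):
    """Adds rules for one more language and one more part of speech."""

    def standard_posrules(self, lang: str) -> Optional[Dict[str, str]]:
        if lang == "es":
            return {"N": r"po:noun", "V": r"po:verb",
                    "ADJ": r"po:adj", "ADV": r"po:adv"}
        # Fall back to the inherited defaults for all other languages
        return super().standard_posrules(lang)


driver = ExtendedDriver()
print(driver.standard_posrules("es"))
print(driver.standard_posrules("xx"))
```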

◆ standard_replacements()

def pycross.dbapi.Sqlitedb.standard_replacements (   self,
  lang 
)

Returns the default replacement rules for a language to use in Hunspell imports.

Parameters
lang (str): language for which the replacement rules are requested, e.g. 'ru' or 'fr'
Returns
dict default replacements in the format:
{'character to replace': 'replacement character'}
If the language is not supported (currently only 'ru' and 'fr' are), None is returned.
Reimplement this method as needed to add other languages / replaced characters.
See also
add_from_hunspell()
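
A dict in this format maps directly onto Python's `str.translate`. The sketch below shows one way to apply such rules to a word. The helper `apply_replacements` and the sample rule are hypothetical, and the approach assumes every key is a single character.

```python
from typing import Dict, Optional


def apply_replacements(word: str, replacements: Optional[Dict[str, str]]) -> str:
    """Hypothetical helper: apply {'char_from': 'char_to'} rules to a word."""
    if not replacements:
        return word
    # str.maketrans accepts a dict whose keys are single characters
    return word.translate(str.maketrans(replacements))


sample = {"ё": "е"}  # illustrative rule, not necessarily the library default
print(apply_replacements("ёлка", sample))
print(apply_replacements("word", None))
```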

Member Data Documentation

◆ conn

pycross.dbapi.Sqlitedb.conn

internal DB connection object (SQLite driver)

◆ dbpath

pycross.dbapi.Sqlitedb.dbpath

str full path to the DB

