midgard.parsers._parser
Basic functionality for parsing datafiles, extended by individual parsers
Description:
This module contains functions and classes for parsing datafiles. It should typically be used by calling parsers.parse_file:
Example:
from midgard import parsers
my_new_parser = parsers.parse_file('my_new_parser', 'file_name.txt', ...)
my_data = my_new_parser.as_dict()
Parser
Parser(file_path:Union[str, pathlib.Path], encoding:Union[str, NoneType]=None, logger:Union[Callable[[str], NoneType], NoneType]=<built-in function print>) -> None
An abstract base class that has basic methods for parsing a datafile
This class provides functionality for parsing a file. You should inherit from one of the specific parsers, such as ChainParser, LineParser, or SinexParser.
Attributes:
file_path (Path): Path to the datafile that will be read.
file_encoding (String): Encoding of the datafile.
parser_name (String): Name of the parser (as needed to call parsers.parse_...).
data_available (Boolean): Indicator of whether data are available.
data (Dict): The (observation) data read from file.
meta (Dict): Metainformation read from file.
Parser.as_dataframe()
as_dataframe(self, index:Union[str, List[str], NoneType]=None) -> pandas.core.frame.DataFrame
Return the parsed data as a Pandas DataFrame
This is a basic implementation, assuming the self.data dictionary has a simple structure. More advanced parsers may need to reimplement this method.
Args:
index: Optional name of the field to use as index. May also be a list of strings.
Returns:
Pandas DataFrame with the parsed data.
Parser.as_dataset()
as_dataset(self) -> NoReturn
Return the parsed data as a Midgard Dataset
This is a basic implementation, assuming the self.data dictionary has a simple structure. More advanced parsers may need to reimplement this method.
Returns:
Dataset: The parsed data.
Parser.as_dict()
as_dict(self, include_meta:bool=False) -> Dict[str, Any]
Return the parsed data as a dictionary
This is a basic implementation, simply returning a copy of self.data. More advanced parsers may need to reimplement this method.
Args:
include_meta: Whether to include meta-data in the returned dictionary (default: False).
Returns:
Dictionary with the parsed data.
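The copy semantics described for as_dict can be illustrated with a small standalone function. This is only a sketch, not Midgard's code: the function name as_dict_sketch is hypothetical, and storing the metadata under a "__meta__" key is an assumption about how include_meta might behave.

```python
from typing import Any, Dict


def as_dict_sketch(
    data: Dict[str, Any], meta: Dict[str, Any], include_meta: bool = False
) -> Dict[str, Any]:
    """Return a shallow copy of data, optionally with the metadata attached."""
    result = dict(data)  # shallow copy: the caller cannot add keys to `data`
    if include_meta:
        # Assumption: the "__meta__" key name is chosen for this sketch only
        result["__meta__"] = dict(meta)
    return result


data = {"station": ["OSLO", "BERGEN"]}
meta = {"source": "example"}

plain = as_dict_sketch(data, meta)
with_meta = as_dict_sketch(data, meta, include_meta=True)
```

Because the copy is shallow, adding keys to the returned dictionary leaves the original untouched, but mutable values such as lists are still shared between the two.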
Parser.parse()
parse(self) -> 'Parser'
Parse data
This is a basic implementation that carries out the whole pipeline of reading and parsing datafiles including calculating secondary data.
Subclasses should typically implement (at least) the read_data method.
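The pipeline described above can be sketched with a small stand-in class. This is a simplified illustration under stated assumptions, not Midgard's implementation: the class name PipelineSketch is hypothetical, the input is a list of strings instead of a file, and the exact order and conditions of the steps are assumed from the method descriptions in this reference.

```python
from typing import Any, Dict, List


class PipelineSketch:
    """Stand-in for a parser; it does not inherit from Midgard's Parser."""

    def __init__(self, lines: List[str]) -> None:
        self.lines = lines  # stands in for the contents of file_path
        self.data: Dict[str, Any] = {}
        self.meta: Dict[str, Any] = {}
        self.data_available = True

    def setup_parser(self) -> Any:
        """Nothing to set up in this sketch."""
        return None

    def read_data(self) -> None:
        """Store one float per non-empty line in self.data."""
        values = [float(line) for line in self.lines if line.strip()]
        if not values:
            self.data_available = False
            return
        self.data["value"] = values

    def postprocess_data(self) -> None:
        """Calculate secondary data from what was read."""
        self.data["total"] = sum(self.data["value"])

    def parse(self) -> "PipelineSketch":
        """Run the whole pipeline and return self so calls can be chained."""
        self.setup_parser()
        self.read_data()
        if self.data_available:  # assumption: skip postprocessing without data
            self.postprocess_data()
        return self


parsed = PipelineSketch(["1.5", "2.5", ""]).parse()
empty = PipelineSketch([]).parse()
```

Returning self from parse mirrors the documented return type and allows a call such as PipelineSketch(...).parse().data.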
Parser.postprocess_data()
postprocess_data(self) -> None
Do simple manipulations on the data after they are read
Simple manipulations of the data may be performed in postprocessors after the data are read. Postprocessors should be kept simple so that a parser returns as true a representation of the data file as possible. Advanced calculations may be done inside apriori classes or similar.
To add a postprocessor, define it in its own method, and override the setup_postprocessors method to return a list of all postprocessors.
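A minimal sketch of this pattern follows. The class PostprocessorSketch and its two postprocessors are hypothetical and do not use Midgard itself; the loop in postprocess_data is an assumption about how the returned list is consumed, based on the signature List[Callable[[], NoneType]].

```python
from typing import Any, Callable, Dict, List


class PostprocessorSketch:
    """Stand-in class showing the postprocessor pattern; not a Midgard parser."""

    def __init__(self, data: Dict[str, Any]) -> None:
        self.data = data
        self.meta: Dict[str, Any] = {}

    def setup_postprocessors(self) -> List[Callable[[], None]]:
        """List the postprocessors in the order they should run."""
        return [self._strip_names, self._count_rows]

    def postprocess_data(self) -> None:
        """Call each postprocessor in turn (assumed behavior)."""
        for postprocessor in self.setup_postprocessors():
            postprocessor()

    def _strip_names(self) -> None:
        """Remove surrounding whitespace from the name field."""
        self.data["name"] = [name.strip() for name in self.data["name"]]

    def _count_rows(self) -> None:
        """Record the number of rows as metainformation."""
        self.meta["num_rows"] = len(self.data["name"])


sketch = PostprocessorSketch({"name": [" OSLO ", "BERGEN\n"]})
sketch.postprocess_data()
```

Each postprocessor takes no arguments and returns nothing; it works directly on self.data and self.meta, which keeps the individual steps small and easy to reorder.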
Parser.read_data()
read_data(self) -> None
Read data from the data file
Data should be read from self.file_path and stored in the dictionary self.data. A description of the data may be placed in the dictionary self.meta. If data are not available for some reason, self.data_available should be set to False.
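The contract above could look roughly like the following sketch. It is written as a standalone class rather than a Midgard subclass; the class name ReadDataSketch, the two-column file format, and the "comments" key in meta are all assumptions made for illustration.

```python
import pathlib
import tempfile
from typing import Any, Dict


class ReadDataSketch:
    """Stand-in reader for a hypothetical 'station height' text file."""

    def __init__(self, file_path: pathlib.Path) -> None:
        self.file_path = pathlib.Path(file_path)
        self.data: Dict[str, Any] = {}
        self.meta: Dict[str, Any] = {}
        self.data_available = self.file_path.exists()

    def read_data(self) -> None:
        """Read self.file_path into self.data and self.meta."""
        if not self.file_path.exists():
            self.data_available = False
            return
        with open(self.file_path, encoding="utf-8") as fid:
            for line in fid:
                if line.startswith("#"):
                    # Comment lines describe the data, so they go into meta
                    self.meta.setdefault("comments", []).append(line[1:].strip())
                    continue
                fields = line.split()
                if len(fields) != 2:
                    continue  # skip blank or malformed lines
                station, height = fields
                self.data.setdefault("station", []).append(station)
                self.data.setdefault("height", []).append(float(height))
        if not self.data:
            self.data_available = False


with tempfile.TemporaryDirectory() as tmp_dir:
    path = pathlib.Path(tmp_dir) / "heights.txt"
    path.write_text("# heights in meters\nOSLO 12.5\nBERGEN 3.0\n", encoding="utf-8")
    reader = ReadDataSketch(path)
    reader.read_data()

    missing = ReadDataSketch(pathlib.Path(tmp_dir) / "no_such_file.txt")
    missing.read_data()
```

Storing each field as a list under its own key gives self.data the kind of simple column-oriented structure that the basic as_dict and as_dataframe implementations are described as assuming.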
Parser.setup_parser()
setup_parser(self) -> Any
Set up information needed for the parser
Parser.setup_postprocessors()
setup_postprocessors(self) -> List[Callable[[], NoneType]]
List postprocessors that should be called after parsing
Parser.update_dataset()
update_dataset(self, dset:Any) -> NoReturn
Update the given dataset with the parsed data
This is a basic implementation, assuming the self.data dictionary has a simple structure. More advanced parsers may need to reimplement this method.
Args:
dset: The dataset to update with parsed data.