midgard.parsers._parser

Basic functionality for parsing datafiles, extended by individual parsers

Description:

This module contains functions and classes for parsing datafiles. It should typically be used by calling parsers.parse_file:

Example:

from midgard import parsers
my_new_parser = parsers.parse_file('my_new_parser', 'file_name.txt', ...)
my_data = my_new_parser.as_dict()

Parser

Parser(file_path:Union[str, pathlib.Path], encoding:Union[str, NoneType]=None, logger:Union[Callable[[str], NoneType], NoneType]=<built-in function print>) -> None

An abstract base class that has basic methods for parsing a datafile

This class provides functionality for parsing a file. Rather than subclassing it directly, you should inherit from one of the specific parsers, for instance ChainParser, LineParser or SinexParser.

Attributes:

file_path (Path): Path to the datafile that will be read.
file_encoding (String): Encoding of the datafile.
parser_name (String): Name of the parser (as needed to call parsers.parse_...).
data_available (Boolean): Indicator of whether data are available.
data (Dict): The (observation) data read from file.
meta (Dict): Metainformation read from file.

Parser.as_dataframe()

as_dataframe(self, index:Union[str, List[str], NoneType]=None) -> pandas.core.frame.DataFrame

Return the parsed data as a Pandas DataFrame

This is a basic implementation, assuming the self.data-dictionary has a simple structure. More advanced parsers may need to reimplement this method.

Args:

Returns:

Pandas DataFrame with the parsed data.

Parser.as_dataset()

as_dataset(self) -> NoReturn

Return the parsed data as a Midgard Dataset

This is a basic implementation, assuming the self.data-dictionary has a simple structure. More advanced parsers may need to reimplement this method.

Returns:

Parser.as_dict()

as_dict(self, include_meta:bool=False) -> Dict[str, Any]

Return the parsed data as a dictionary

This is a basic implementation, simply returning a copy of self.data. More advanced parsers may need to reimplement this method.

Args:

Returns:

Dictionary with the parsed data.
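As a rough illustration of the copy semantics described above, the following standalone sketch mimics as_dict on a stand-in class. The class name SketchParser, the sample data and the "__meta__" key used when include_meta is set are all assumptions made for this example; they are not taken from the Midgard source.

```python
from typing import Any, Dict


class SketchParser:
    """Hypothetical stand-in for a parser; not Midgard's Parser class."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {"station": ["A", "B"], "height": [10.0, 12.5]}
        self.meta: Dict[str, Any] = {"source": "example"}

    def as_dict(self, include_meta: bool = False) -> Dict[str, Any]:
        # Shallow copy: adding or removing keys in the result leaves self.data intact
        result = dict(self.data)
        if include_meta:
            # Assumption: the meta dictionary is stored under a "__meta__" key
            result["__meta__"] = dict(self.meta)
        return result


parser = SketchParser()
copied = parser.as_dict()
copied["extra"] = [1, 2]
print("extra" in parser.data)  # False
```

Note that a shallow copy still shares the nested lists with self.data, so a parser holding nested structures may want a deeper copy.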

Parser.parse()

parse(self) -> 'Parser'

Parse data

This is a basic implementation that carries out the whole pipeline of reading and parsing datafiles including calculating secondary data.

Subclasses should typically implement (at least) the read_data-method.
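The pipeline might look roughly like the sketch below, which records the order in which the steps run. PipelineSketch is a hypothetical stand-in class, and both the exact step order and the choice to skip postprocessing when no data are available are assumptions rather than confirmed Midgard behaviour.

```python
from typing import Any, Dict, List


class PipelineSketch:
    """Hypothetical stand-in showing one possible parse pipeline."""

    def __init__(self) -> None:
        self.data: Dict[str, Any] = {}
        self.meta: Dict[str, Any] = {}
        self.data_available = True
        self.calls: List[str] = []  # Records the step order, for illustration only

    def setup_parser(self) -> Any:
        self.calls.append("setup_parser")

    def read_data(self) -> None:
        self.calls.append("read_data")
        self.data["value"] = [1, 2, 3]

    def postprocess_data(self) -> None:
        self.calls.append("postprocess_data")

    def parse(self) -> "PipelineSketch":
        self.setup_parser()
        self.read_data()
        # Assumption: postprocessing is skipped when no data were read
        if self.data_available:
            self.postprocess_data()
        return self  # Returning self allows calls to be chained


data = PipelineSketch().parse().data
print(data)  # {'value': [1, 2, 3]}
```

Returning the parser itself matches the documented return type and lets a caller write the parse call and the data access on one line.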

Parser.postprocess_data()

postprocess_data(self) -> None

Do simple manipulations on the data after they are read

Simple manipulations of the data may be performed in postprocessors after the data are read. Postprocessors should be kept simple so that a parser returns as true a representation of the data file as possible. Advanced calculations may be done inside apriori classes or similar.

To add a postprocessor, define it in its own method, and override the setup_postprocessors-method to return a list of all postprocessors.
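A minimal sketch of this pattern is shown below. PostprocessSketch, the _add_height_in_metres method and the sample data are hypothetical, and the loop in postprocess_data assumes that postprocessors are called in list order without arguments, which the return type of setup_postprocessors suggests but this section does not state outright.

```python
from typing import Callable, Dict, List


class PostprocessSketch:
    """Hypothetical stand-in showing how postprocessors could be registered."""

    def __init__(self) -> None:
        self.data: Dict[str, List[float]] = {"height_cm": [1000.0, 1250.0]}

    def setup_postprocessors(self) -> List[Callable[[], None]]:
        # List every postprocessor that should run after the data are read
        return [self._add_height_in_metres]

    def postprocess_data(self) -> None:
        # Assumption: postprocessors run in list order and take no arguments
        for postprocessor in self.setup_postprocessors():
            postprocessor()

    def _add_height_in_metres(self) -> None:
        # A deliberately simple manipulation: a unit conversion into a new field
        self.data["height_m"] = [h / 100 for h in self.data["height_cm"]]


parser = PostprocessSketch()
parser.postprocess_data()
print(parser.data["height_m"])  # [10.0, 12.5]
```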

Parser.read_data()

read_data(self) -> None

Read data from the data file

Data should be read from self.file_path and stored in the dictionary self.data. A description of the data may be placed in the dictionary self.meta. If data are not available for some reason, self.data_available should be set to False.
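The contract described above could be met along the lines of the following sketch. ReadSketch is a hypothetical stand-in that does not inherit from the Midgard Parser, and the file layout (comment lines starting with "#", followed by name/value pairs) is invented for the example.

```python
import pathlib
import tempfile
from typing import Any, Dict, Optional, Union


class ReadSketch:
    """Hypothetical stand-in; only the attribute names follow the text above."""

    def __init__(self, file_path: Union[str, pathlib.Path], encoding: Optional[str] = None) -> None:
        self.file_path = pathlib.Path(file_path)
        self.file_encoding = encoding
        self.data: Dict[str, Any] = {}
        self.meta: Dict[str, Any] = {}
        self.data_available = True

    def read_data(self) -> None:
        if not self.file_path.exists():
            # No file to read: flag it instead of raising
            self.data_available = False
            return
        with open(self.file_path, mode="rt", encoding=self.file_encoding) as fid:
            for line in fid:
                if line.startswith("#"):
                    # Comment lines describe the data, so they go into self.meta
                    self.meta.setdefault("comments", []).append(line[1:].strip())
                    continue
                fields = line.split()
                if len(fields) != 2:
                    continue  # Skip blank or malformed lines
                name, value = fields
                self.data.setdefault("name", []).append(name)
                self.data.setdefault("value", []).append(float(value))


with tempfile.TemporaryDirectory() as tmp_dir:
    path = pathlib.Path(tmp_dir) / "example.txt"
    path.write_text("# demo file\nA 1.5\nB 2.0\n", encoding="utf-8")
    parser = ReadSketch(path, encoding="utf-8")
    parser.read_data()

print(parser.data)  # {'name': ['A', 'B'], 'value': [1.5, 2.0]}
print(parser.meta)  # {'comments': ['demo file']}
```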

Parser.setup_parser()

setup_parser(self) -> Any

Set up information needed for the parser

Parser.setup_postprocessors()

setup_postprocessors(self) -> List[Callable[[], NoneType]]

List postprocessors that should be called after parsing

Parser.update_dataset()

update_dataset(self, dset:Any) -> NoReturn

Update the given dataset with the parsed data

This is a basic implementation, assuming the self.data-dictionary has a simple structure. More advanced parsers may need to reimplement this method.

Args: