The Collins Economics Result Object (CERO)

A core concept in the operation of ConCERO is the ‘Collins Economics Result Object’ (CERO), which serves as a standard format for data interchange between economic modelling programs. Conceptually, the CERO is a set of instances of a ‘fundamental data type’, which is discussed in ConCERO’s Design Philosophy documentation.

Software-wise, the CERO is a pandas.DataFrame with some additional constraints. Those constraints are:

  • cero.index must be an instance of the pandas.Index class, and
  • cero.columns must be an instance of the pandas.DatetimeIndex class, and
  • both cero.index and cero.columns values must be unique and
  • all index values must be valid identifiers (see below) and
  • cero data/array values must all be of 32-bit floating-point type (specifically, be instances of a subclass of the numpy.float32 class),

where cero is a CERO. The values of cero.index are referred to as identifiers.
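
The structural constraints above can be checked with plain pandas and NumPy. The following is a minimal sketch only, not ConCERO's own validation (that is the job of CERO.is_cero, documented in the Technical Reference). The function name satisfies_structural_constraints is hypothetical, and identifier validity is deliberately left out because it is covered in the CERO Identifiers section.

```python
import numpy as np
import pandas as pd


def satisfies_structural_constraints(df) -> bool:
    """Hypothetical helper: checks the index, column and dtype constraints only."""
    if not isinstance(df, pd.DataFrame):
        return False
    # Rows are labelled by a pandas.Index (isinstance also accepts subclasses).
    if not isinstance(df.index, pd.Index):
        return False
    # Columns must be labelled by a pandas.DatetimeIndex.
    if not isinstance(df.columns, pd.DatetimeIndex):
        return False
    # Neither axis may contain repeated labels.
    if not (df.index.is_unique and df.columns.is_unique):
        return False
    # Every column must hold 32-bit floats.
    return all(dtype == np.float32 for dtype in df.dtypes)


frame = pd.DataFrame(
    np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32),
    index=pd.Index(["GDP", "CPI"]),
    columns=pd.to_datetime(["2017-01-01", "2018-01-01"]),
)
ok_float32 = satisfies_structural_constraints(frame)
ok_float64 = satisfies_structural_constraints(frame.astype(np.float64))
```

Here ok_float32 is True, while ok_float64 is False because the values are no longer 32-bit floats.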

CERO Identifiers

As mentioned previously, values of the index of a CERO are referred to as identifiers. Identifiers are subject to a couple of restrictions. They are:

  • The identifier must be unique - that is, no other value of cero.index can be exactly the same.

  • The identifier must be either:
    • a string (str) with no commas, or
    • a tuple of strings, where each string does not have any commas.
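
The two restrictions above can be expressed as a short check. This is a sketch under the rules as stated in this section; the helper names is_valid_identifier and all_identifiers_valid are hypothetical and are not part of ConCERO.

```python
def is_valid_identifier(value) -> bool:
    """Hypothetical helper: True for a comma-free str or a tuple of comma-free strs."""
    if isinstance(value, str):
        return "," not in value
    if isinstance(value, tuple):
        return all(isinstance(part, str) and "," not in part for part in value)
    return False


def all_identifiers_valid(identifiers) -> bool:
    """Hypothetical helper: every identifier is valid and none is repeated."""
    identifiers = list(identifiers)
    if not all(is_valid_identifier(identifier) for identifier in identifiers):
        return False
    # Valid identifiers are hashable, so a set can be used to detect repeats.
    return len(set(identifiers)) == len(identifiers)


valid = all_identifiers_valid(["GDP", ("L_OUTPUT", "Electricity", "AUS")])
repeated = all_identifiers_valid(["GDP", "GDP"])
with_comma = all_identifiers_valid(["hello,world"])
```

In this sketch valid is True, whereas repeated and with_comma are both False.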

The comma constraint is a result of how ConCERO interprets commas when reading YAML files - ConCERO interprets commas as a string-splitting character. Thus, if a configuration file contains the string:

"hello,world"

in the context of CERO identifiers, then this will be interpreted as the python tuple:

('hello','world')

Note also that any whitespace is stripped when the string is split, so the string:

"hello, world"

also becomes:

('hello','world')

and this:

" L_OUTPUT, Electricity, AUS"

becomes:

("L_OUTPUT","Electricity","AUS")
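
The splitting behaviour shown in these examples can be mimicked in a few lines of Python. This is an illustrative sketch, not ConCERO's parsing code: parse_identifier is a hypothetical name, and leaving a comma-free string untouched is an assumption, since the text above only describes what happens when a string is split.

```python
def parse_identifier(text: str):
    """Hypothetical helper mirroring the comma-splitting rule described above."""
    if "," not in text:
        # Assumption: a string without commas stays a plain string identifier.
        return text
    # Split on commas and strip surrounding whitespace from each field.
    return tuple(field.strip() for field in text.split(","))


examples = ["hello,world", "hello, world", " L_OUTPUT, Electricity, AUS"]
parsed = [parse_identifier(example) for example in examples]
```

With these inputs, parsed holds ('hello', 'world') twice, followed by ('L_OUTPUT', 'Electricity', 'AUS').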

The advantage of the tuple form of identifier is that it preserves ordered relationships, even though that order has no meaning within the CERO itself. This is necessary to store data with more than two dimensions in a two-dimensional structure. It also allows for the implementation of sets (see Sets), which give the user significant flexibility and power when selecting identifiers of interest. In summary, sets allow the user to select large numbers of identifiers by listing sets rather than every identifier.
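
To illustrate why the field order matters, the sketch below stores data indexed by (variable, commodity, region) in a two-dimensional frame and then selects rows by region. It uses plain pandas and a Python set; it is not ConCERO's Sets feature or its syntax, and the identifiers and values are made up for illustration.

```python
import numpy as np
import pandas as pd

# Hypothetical identifiers: (variable, commodity, region).
identifiers = [
    ("L_OUTPUT", "Electricity", "AUS"),
    ("L_OUTPUT", "Electricity", "NZL"),
    ("L_OUTPUT", "Coal", "AUS"),
    ("L_OUTPUT", "Coal", "USA"),
]
frame = pd.DataFrame(
    np.arange(8, dtype=np.float32).reshape(4, 2),
    # tupleize_cols=False keeps each tuple as a single label in a flat index.
    index=pd.Index(identifiers, tupleize_cols=False),
    columns=pd.to_datetime(["2017-01-01", "2018-01-01"]),
)

# Select every row whose third field (the region) belongs to the chosen set.
regions = {"AUS", "NZL"}
mask = [identifier[2] in regions for identifier in frame.index]
selected = frame.loc[mask]
```

Because the region is always the third field, one set of region names is enough to pick out three of the four rows without listing each identifier.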

Technical Reference

The functions listed below may be of interest if users wish to directly interact with a CERO (a pandas.DataFrame with additional constraints).

class cero.CERO
exception CEROIndexConflict
exception EmptyCERO
exception InvalidCERO
static combine_ceros(ceros: list, overwrite: bool = True, verify_cero: bool = True) → pandas.core.frame.DataFrame

Combine multiple CEROs (provided as a list) into a common CERO. If overwrite is True, a CERO that is later in ceros (i.e. has a higher index) will overwrite the merger of all preceding CEROs. If overwrite is False and duplicate indices are detected, a CERO.CEROIndexConflict exception will be raised.

If verify_cero is True, then a check is performed before and after combination to ensure that only CEROs are combined with other CEROs, to form a CERO. By disabling this, combine_ceros can be applied to pandas.DataFrames as well.
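
The overwrite semantics can be pictured with plain pandas. The sketch below is not the implementation of combine_ceros: combine_frames and IndexConflict are hypothetical stand-ins, and it assumes that a later frame replaces whole rows that share an identifier with an earlier frame.

```python
import numpy as np
import pandas as pd


class IndexConflict(Exception):
    """Hypothetical stand-in for CERO.CEROIndexConflict."""


def combine_frames(frames, overwrite=True):
    """Hypothetical sketch: stack frames, letting later rows replace earlier ones."""
    combined = pd.concat(frames)
    # Mark every occurrence of an identifier except the last one.
    superseded = combined.index.duplicated(keep="last")
    if superseded.any() and not overwrite:
        raise IndexConflict(f"duplicate identifiers: {list(combined.index[superseded])}")
    return combined[~superseded]


years = pd.to_datetime(["2017-01-01", "2018-01-01"])
first = pd.DataFrame(
    np.array([[1, 2], [3, 4]], dtype=np.float32), index=["GDP", "CPI"], columns=years
)
second = pd.DataFrame(np.array([[10, 20]], dtype=np.float32), index=["GDP"], columns=years)

merged = combine_frames([first, second])
```

In merged, the "GDP" row carries the values from second, while "CPI" is unchanged. Calling combine_frames([first, second], overwrite=False) raises IndexConflict instead.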

static create_cero_index(values: List[Union[str, tuple]])

Creates a pandas.Index object that adheres to CERO constraints.
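
For orientation, a flat index holding both string and tuple identifiers can be built directly with pandas. This sketch shows one way to do so and is an assumption about the general approach, not a statement of how create_cero_index is implemented.

```python
import pandas as pd

values = ["GDP", ("L_OUTPUT", "Electricity", "AUS")]

# tupleize_cols=False stops pandas from converting tuples into MultiIndex levels,
# so each tuple remains a single label alongside the plain strings.
index = pd.Index(values, tupleize_cols=False)
labels = list(index)
```

The resulting index is a flat pandas.Index rather than a pandas.MultiIndex, and labels equals the original list of values.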

static create_empty()

Returns an empty CERO.

static is_cero(obj, raise_exception=True, empty_ok=True)

Tests obj to identify if it has all the properties of a CERO.

Parameters:
  • obj – The object that may or may not be a CERO.
  • raise_exception – If True, an exception is raised in the event that obj is not a CERO (the default behaviour). Otherwise, False is returned in the event that obj is not a CERO.
  • empty_ok – If False, obj must have at least one value that is not a NaN to qualify as a CERO. True by default.
Returns:

static read_csv(csv_file)

Reads CEROs that have been exported to a csv file. It is assumed that ‘;’ is used to separate the fields (if more than one) of the identifier.

Parameters:
  • csv_file (str) – Path to the file containing the CERO.
Returns:
  pandas.DataFrame – The imported CERO.
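
The ‘;’ convention can be illustrated with a small stand-alone reader. Everything about the file layout here is an assumption made for the example (identifiers in the first column, years as column headers), and read_cero_like_csv is a hypothetical function, not CERO.read_csv itself.

```python
import io

import numpy as np
import pandas as pd


def read_cero_like_csv(handle):
    """Hypothetical reader for the assumed layout described above."""
    frame = pd.read_csv(handle, index_col=0)
    # Split multi-field identifiers on ';'; single-field ones stay plain strings.
    identifiers = [
        tuple(label.split(";")) if ";" in label else label for label in frame.index
    ]
    frame.index = pd.Index(identifiers, tupleize_cols=False)
    # Interpret the column headers as years.
    frame.columns = pd.to_datetime(frame.columns, format="%Y")
    return frame.astype(np.float32)


csv_text = (
    "identifier,2017,2018\n"
    "GDP,1.0,2.0\n"
    "L_OUTPUT;Electricity;AUS,3.0,4.0\n"
)
cero_like = read_cero_like_csv(io.StringIO(csv_text))
```

After reading, the second row is labelled by the tuple ('L_OUTPUT', 'Electricity', 'AUS'), the columns form a pandas.DatetimeIndex and all values are 32-bit floats.
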
static read_xlsx(xlsx_file, *args, **kwargs)

Reads CEROs that have been exported to xlsx files.

Parameters:
  • xlsx_file (str) – Name of the xlsx file that the CERO resides in.

static rename_index_values(cero: pandas.core.frame.DataFrame, map_dict: dict, inplace: bool = True)

Parameters:
  • cero – The CERO object to rename the index values of. The order of the CERO.index imposes order on the mapping operation - that is, the CERO retains its original order.
  • map_dict – A dict whose (key, value) pairs are (old name, new name).
Returns:
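
The effect of such a mapping can be reproduced with pandas alone. The sketch below uses DataFrame.rename on a made-up frame; whether rename_index_values relies on that method internally is an assumption, and the identifier names are hypothetical.

```python
import numpy as np
import pandas as pd

frame = pd.DataFrame(
    np.array([[1.0, 2.0], [3.0, 4.0]], dtype=np.float32),
    index=pd.Index(["GDP", "CPI"]),
    columns=pd.to_datetime(["2017-01-01", "2018-01-01"]),
)

# Keys are old identifiers, values are new identifiers.
map_dict = {"GDP": "RealGDP"}

# Returns a renamed copy; row order is unchanged and unmapped labels are kept.
renamed = frame.rename(index=map_dict)
```

The renamed frame keeps its original row order, with "GDP" replaced by "RealGDP" and "CPI" left as it was.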

Created on Wed Dec 20 10:20:32 2017

@author: Lyle Collins
@email: Lyle.Collins@csiro.au