pyjstat

pyjstat is a Python module for manipulating JSON-stat formatted data.

This module reads and writes the JSON-stat [1] format with Python, using the data frame structures provided by the widely used pandas library [2]. JSON-stat is a simple, lightweight JSON format for data dissemination. pyjstat is inspired by rjstat [3], a library by ajschumacher for reading and writing JSON-stat with R.

pyjstat is written and maintained by Miguel Expósito Martín and is distributed under the Apache 2.0 License (see LICENSE file).

[1] http://json-stat.org/ for JSON-stat information
[2] http://pandas.pydata.org for Python Data Analysis Library information
[3] https://github.com/ajschumacher/rjstat for rjstat library information

Example

Importing a JSON-stat file into a pandas data frame can be done as follows:

import json
from urllib.request import urlopen

import pyjstat

# Download the sample and deserialize it before decoding
url = 'http://json-stat.org/samples/oecd-canada.json'
results = pyjstat.from_json_stat(json.load(urlopen(url)))
print(results)

pyjstat.check_input(naming)

Check and validate input params.

Parameters: naming (string) – a string containing the naming type (label or id).
Returns: Nothing
Raises: ValueError – if the parameter is not in the allowed list.
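The check described above can be sketched as follows. This is a minimal stand-in written for illustration, not pyjstat's own code: the name check_input_sketch and the error message are assumptions.

```python
def check_input_sketch(naming):
    """Hypothetical stand-in for the validation described above."""
    allowed = ("label", "id")
    if naming not in allowed:
        raise ValueError("naming must be 'label' or 'id', got %r" % (naming,))


check_input_sketch("label")  # accepted, returns None

try:
    check_input_sketch("name")
except ValueError as err:
    print(err)
```

Validating the argument once, up front, means the other functions can assume naming is one of the two allowed strings.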

pyjstat.from_json_stat(datasets, naming='label')

Decode JSON-stat format into pandas.DataFrame object

Parameters:
  • datasets (OrderedDict) – data in JSON-stat format, previously deserialized to a Python object by json.load() or json.loads(), for example.
  • naming (string, optional) – dimension naming. Possible values: ‘label’ or ‘id’.
Returns:

output – list of pandas.DataFrame with imported data.

Return type:

list
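To illustrate the kind of object the datasets parameter expects, here is a sketch that deserializes a tiny JSON-stat bundle with the standard json module. The dataset name "demo" and all figures are made up for the example, and the layout assumes the JSON-stat 1.x convention of keeping id and size inside dimension.

```python
import json
from collections import OrderedDict

# Tiny hand-written JSON-stat 1.x style bundle (illustrative data only).
RAW = """
{
  "demo": {
    "label": "Demo dataset",
    "value": [1.5, 2.5, 3.5, 4.5],
    "dimension": {
      "id": ["area", "year"],
      "size": [2, 2],
      "area": {
        "label": "Area",
        "category": {
          "index": {"CA": 0, "US": 1},
          "label": {"CA": "Canada", "US": "United States"}
        }
      },
      "year": {
        "label": "Year",
        "category": {"index": {"2012": 0, "2013": 1}}
      }
    }
  }
}
"""

# object_pairs_hook keeps the key order of the JSON text.
datasets = json.loads(RAW, object_pairs_hook=OrderedDict)
print(list(datasets))                       # ['demo']
print(datasets["demo"]["dimension"]["id"])  # ['area', 'year']
```

An object built this way is what would be passed as the first argument of from_json_stat.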

pyjstat.get_df_row(dimensions, naming='label', i=0, record=None)

Generate row dimension values for a pandas dataframe.

Parameters:
  • dimensions (list) – list of pandas dataframes with dimension labels generated by get_dim_label or get_dim_index methods.
  • naming (string, optional) – dimension naming. Possible values: ‘label’ or ‘id’.
  • i (int) – dimension list iteration index. Defaults to 0; it is used in the recursive calls to the method.
  • record (list) – list of values representing a pandas dataframe row, except for the value column. Defaults to None, meaning an empty record; it is used in the recursive calls to the method.
Yields:

list – list with pandas dataframe column values except for value column
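The recursion over i and record can be sketched with plain lists. This is a simplified stand-in, not pyjstat's code: the name iter_rows is hypothetical, and each dimension is a list of category names rather than a pandas data frame.

```python
def iter_rows(dimensions, i=0, record=None):
    """Yield one list per combination of dimension categories.

    Hypothetical stand-in: `dimensions` is a list of plain lists here,
    whereas the function documented above receives pandas data frames.
    """
    if record is None:
        record = []
    for category in dimensions[i]:
        row = record + [category]
        if i == len(dimensions) - 1:
            # Last dimension reached: the row is complete.
            yield row
        else:
            # Recurse into the next dimension with the partial row.
            for full_row in iter_rows(dimensions, i + 1, row):
                yield full_row


areas = ["Canada", "United States"]
years = ["2012", "2013"]
for row in iter_rows([areas, years]):
    print(row)
```

The last dimension varies fastest, which matches the row-major order in which JSON-stat stores its flat value array.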

pyjstat.get_dim_index(js_dict, dim)

Get index from a given dimension.

Parameters:
  • js_dict (dict) – dictionary containing dataset data and metadata.
  • dim (string) – dimension name obtained from JSON file.
Returns:

dim_index – DataFrame with index-based dimension data.

Return type:

pandas.DataFrame

pyjstat.get_dim_label(js_dict, dim)

Get label from a given dimension.

Parameters:
  • js_dict (dict) – dictionary containing dataset data and metadata.
  • dim (string) – dimension name obtained from JSON file.
Returns:

dim_label – DataFrame with label-based dimension data.

Return type:

pandas.DataFrame
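As a rough illustration of what the index-based and label-based lookups read from a dataset, the sketch below walks the category object of one dimension. The function names are hypothetical, the results are lists of pairs rather than data frames, and falling back to the category id when no label exists is a choice made for this sketch. A dimension that omits index altogether is not handled.

```python
def dim_index_sketch(js_dict, dim):
    """Return (category id, position) pairs ordered by position."""
    index = js_dict["dimension"][dim]["category"]["index"]
    if isinstance(index, list):
        # Array form: the position is the place in the list.
        return [(cat_id, pos) for pos, cat_id in enumerate(index)]
    # Object form: ids map to explicit positions.
    return sorted(index.items(), key=lambda pair: pair[1])


def dim_label_sketch(js_dict, dim):
    """Return (category id, label) pairs, falling back to the id."""
    category = js_dict["dimension"][dim]["category"]
    labels = category.get("label", {})
    return [(cat_id, labels.get(cat_id, cat_id))
            for cat_id, _ in dim_index_sketch(js_dict, dim)]


# Illustrative data only.
dataset = {
    "dimension": {
        "area": {
            "category": {
                "index": {"US": 1, "CA": 0},
                "label": {"CA": "Canada", "US": "United States"},
            }
        }
    }
}
print(dim_index_sketch(dataset, "area"))  # [('CA', 0), ('US', 1)]
print(dim_label_sketch(dataset, "area"))  # [('CA', 'Canada'), ('US', 'United States')]
```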

pyjstat.get_dimensions(js_dict, naming)

Get dimensions from input data.

Parameters:
  • js_dict (dict) – dictionary containing dataset data and metadata.
  • naming (string, optional) – dimension naming. Possible values: ‘label’ or ‘id’.
Returns:

dimensions – list of pandas data frames with dimension category data; dim_names – list of strings with dimension names.

Return type:

list

pyjstat.get_values(js_dict)

Get values from input data.

Parameters: js_dict (dict) – dictionary containing dataset data and metadata.
Returns: values – list of dataset values.
Return type: list
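JSON-stat allows value to be either an array or an object keyed by position. The sketch below flattens both forms into a list. It is an assumption-laden illustration, not pyjstat's code: the name get_values_sketch is hypothetical, the sizes are read from dimension.size as in JSON-stat 1.x, and the documented function may not handle the sparse form this way.

```python
def get_values_sketch(js_dict):
    """Return the dataset values as a flat list (None for missing cells)."""
    values = js_dict["value"]
    if isinstance(values, list):
        return list(values)
    # Sparse object form: keys are string positions in the flat array.
    total = 1
    for size in js_dict["dimension"]["size"]:
        total *= size
    return [values.get(str(pos)) for pos in range(total)]


# Illustrative data only.
dense = {"value": [1.5, None, 3.5, 4.5], "dimension": {"size": [2, 2]}}
sparse = {"value": {"0": 1.5, "3": 4.5}, "dimension": {"size": [2, 2]}}
print(get_values_sketch(dense))   # [1.5, None, 3.5, 4.5]
print(get_values_sketch(sparse))  # [1.5, None, None, 4.5]
```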

pyjstat.to_json_stat(input_df, value='value')

Encode pandas.DataFrame object into JSON-stat format. The DataFrames must have exactly one value column.

Parameters:
  • input_df (pandas.DataFrame) – pandas data frame (or list of data frames) to encode.
  • value (string) – name of value column.
Returns:

output – String with JSON-stat object.

Return type:

string
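The idea behind the encoding can be sketched without pandas: every column except the value column becomes a dimension, and the value column becomes the flat value array. This is a simplified illustration, not pyjstat's output format. The name encode_sketch is hypothetical, rows are plain dicts instead of a data frame, and the rows are assumed to be complete and already in row-major order.

```python
import json
from collections import OrderedDict


def encode_sketch(rows, value="value"):
    """Build a simplified JSON-stat-like string from complete, ordered rows."""
    dim_names = [name for name in rows[0] if name != value]
    dimension = OrderedDict([("id", dim_names), ("size", [])])
    for name in dim_names:
        # Unique categories in order of first appearance.
        categories = list(OrderedDict.fromkeys(row[name] for row in rows))
        dimension["size"].append(len(categories))
        index = OrderedDict((cat, pos) for pos, cat in enumerate(categories))
        dimension[name] = {"category": {"index": index}}
    dataset = OrderedDict([("dimension", dimension),
                           ("value", [row[value] for row in rows])])
    return json.dumps(dataset)


# Illustrative data only.
rows = [
    {"area": "Canada", "year": "2012", "value": 1.5},
    {"area": "Canada", "year": "2013", "value": 2.5},
    {"area": "United States", "year": "2012", "value": 3.5},
    {"area": "United States", "year": "2013", "value": 4.5},
]
decoded = json.loads(encode_sketch(rows))
print(decoded["dimension"]["size"])  # [2, 2]
print(decoded["value"])              # [1.5, 2.5, 3.5, 4.5]
```

Having exactly one value column keeps the split unambiguous: everything else is treated as a dimension.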

pyjstat.uniquify(seq)

Return unique values in a list in the original order. See: http://www.peterbe.com/plog/uniqifiers-benchmark

Parameters: seq (list) – original list.
Returns: list without duplicates preserving original order.
Return type: list
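One common way to remove duplicates while keeping the original order is to track the items already seen in a set. The sketch below shows that approach under a hypothetical name; it is not necessarily how pyjstat implements uniquify, and it assumes the items are hashable.

```python
def uniquify_sketch(seq):
    """Return the items of seq without duplicates, in first-seen order."""
    seen = set()
    result = []
    for item in seq:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


print(uniquify_sketch(["b", "a", "b", "c", "a"]))  # ['b', 'a', 'c']
```

Converting the list to a set alone would also drop duplicates, but it would not preserve the order in which categories first appear.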
