pyjstat is a Python module for JSON-stat formatted data manipulation.
This module allows reading and writing the JSON-stat [1] format with Python, using the data frame structures provided by the widely used pandas library [2]. The JSON-stat format is a simple, lightweight JSON format for data dissemination. pyjstat is inspired by rjstat [3], a library to read and write JSON-stat with R, by ajschumacher.
pyjstat is written and maintained by Miguel Expósito Martín and is distributed under the Apache 2.0 License (see LICENSE file).
[1] http://json-stat.org/ (JSON-stat information)
[2] http://pandas.pydata.org (Python Data Analysis Library information)
[3] https://github.com/ajschumacher/rjstat (rjstat library information)
Example
Importing a JSON-stat file into a pandas data frame can be done as follows:
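import json
from urllib.request import urlopen

import pyjstat

# Python 3 imports; the sample dataset is fetched over HTTP and parsed as JSON.
url = 'http://json-stat.org/samples/oecd-canada.json'
results = pyjstat.from_json_stat(json.load(urlopen(url)))
print(results)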
Custom JSON encoder class for NumPy data types.
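An encoder of this kind is usually a `json.JSONEncoder` subclass that overrides `default()`. The following is a minimal sketch, not pyjstat's own class: the names `ArrayLikeEncoder` and `FakeArray` are hypothetical, and the sketch assumes the objects to encode expose a `tolist()` method, as NumPy arrays and scalars do. A small stand-in class is used so the example runs without NumPy installed.

```python
import json


class ArrayLikeEncoder(json.JSONEncoder):
    """Hypothetical encoder: serialise objects that expose tolist()."""

    def default(self, obj):
        # NumPy arrays and scalars provide tolist(), which returns plain
        # Python lists and numbers that the json module can handle.
        if hasattr(obj, "tolist"):
            return obj.tolist()
        # Anything else falls through to the standard TypeError.
        return super().default(obj)


class FakeArray:
    """Stand-in for a NumPy array, used only for this illustration."""

    def __init__(self, data):
        self._data = list(data)

    def tolist(self):
        return list(self._data)


encoded = json.dumps({"value": FakeArray([1, 2, 3])}, cls=ArrayLikeEncoder)
```

Passing the class through the `cls` argument of `json.dumps` keeps the conversion logic in one place instead of converting values by hand before serialising.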
Check and validate input params.
Parameters: | naming (string) – a string containing the naming type (label or id). |
---|---|
Returns: | Nothing |
Raises: | ValueError – if the parameter is not in the allowed list. |
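The check described above can be sketched as follows. This is an illustration under stated assumptions rather than pyjstat's code: the function name `validate_naming` and the constant `ALLOWED_NAMING` are hypothetical, and the allowed list is assumed to hold exactly the two naming types mentioned in the parameter description.

```python
# Assumption: the allowed list contains the two naming types from the docs.
ALLOWED_NAMING = ("label", "id")


def validate_naming(naming):
    """Hypothetical validator: return nothing, raise ValueError on bad input."""
    if naming not in ALLOWED_NAMING:
        raise ValueError(
            "naming must be one of %s, got %r" % (ALLOWED_NAMING, naming)
        )


validate_naming("label")  # passes silently
```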
Decode JSON-stat formatted data into pandas.DataFrame objects.
Parameters: |
|
---|---|
Returns: | results – list of pandas.DataFrame with imported data. |
Return type: | list |
Decode a JSON-stat dict into a pandas.DataFrame object. Helper method that should be called inside from_json_stat().
Parameters: |
|
---|---|
Returns: | output – pandas.DataFrame with converted data. |
Return type: | DataFrame |
Generate row dimension values for a pandas dataframe.
Parameters: |
|
---|---|
Yields: | list – list with pandas dataframe column values except for value column |
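To illustrate what such row values look like, the sketch below enumerates every combination of dimension categories with `itertools.product`. It is a swapped-in technique, not necessarily how pyjstat generates rows; the function name `iter_dimension_rows` and the sample categories are hypothetical. The last dimension varies fastest, which matches the row-major order in which JSON-stat stores its values.

```python
from itertools import product


def iter_dimension_rows(categories_per_dimension):
    """Yield one list of category values per observation (hypothetical helper)."""
    # product() iterates the last sequence fastest, i.e. row-major order.
    for combination in product(*categories_per_dimension):
        yield list(combination)


# Two illustrative dimensions: a year and a country code.
rows = list(iter_dimension_rows([["2012", "2013"], ["CA", "ES", "FR"]]))
```

Each yielded list can then be paired with the value at the same position to form a complete data frame row.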
Get index from a given dimension.
Parameters: |
|
---|---|
Returns: | dim_index – DataFrame with index-based dimension data. |
Return type: | pandas.DataFrame |
Get label from a given dimension.
Parameters: |
|
---|---|
Returns: | dim_label – DataFrame with label-based dimension data. |
Return type: | pandas.DataFrame |
Get dimensions from input data.
Parameters: |
|
---|---|
Returns: | dimensions – list of pandas data frames with dimension category data; dim_names – list of strings with dimension names. |
Return type: | list |
Get values from input data.
Parameters: |
|
---|---|
Returns: | values – list of dataset values. |
Return type: | list |
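In JSON-stat, the `value` field can be either an array or an object keyed by position, the latter being a sparse form that omits missing observations. The sketch below shows one way to normalise both forms into a plain list. It is an assumption about how this step could be done, not pyjstat's implementation; the function name `normalise_values` and the explicit `size` argument are hypothetical.

```python
def normalise_values(value, size):
    """Return a dense list of `size` items from a JSON-stat "value" field."""
    if isinstance(value, dict):
        # Sparse form: keys are positions given as strings; gaps are missing data.
        dense = [None] * size
        for position, item in value.items():
            dense[int(position)] = item
        return dense
    # Array form: already dense, copy it into a plain list.
    return list(value)


sparse = normalise_values({"0": 1.5, "3": 2.0}, 4)
dense = normalise_values([1, None, 3], 3)
```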
Convert variable to integer or string depending on the case.
Parameters: | variable (string) – a string containing a real string or an integer. |
---|---|
Returns: | variable – an integer or a string, depending on the content of variable. |
Return type: | int, string |
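A conversion of this kind can be sketched with a `try`/`except` around `int()`. The name `to_int_or_str` is hypothetical and the behaviour shown is an assumption based on the description above, not a copy of pyjstat's function.

```python
def to_int_or_str(variable):
    """Return an int when the string holds an integer, else the string itself."""
    try:
        return int(variable)
    except ValueError:
        # Not an integer literal: keep the original string untouched.
        return variable


as_int = to_int_or_str("2012")
as_str = to_int_or_str("Canada")
```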
Parameters: |
|
---|---|
Returns: | output – String with JSON-stat object. |
Return type: | string |
Convert variable to integer or string depending on the case.
Parameters: | variable (string) – a string containing a real string or an integer. |
---|---|
Returns: | variable – an integer or a string, depending on the content of variable. |
Return type: | int, string |
Return unique values in a list in the original order. See: http://www.peterbe.com/plog/uniqifiers-benchmark
Parameters: | seq (list) – original list. |
---|---|
Returns: | list without duplicates preserving original order. |
Return type: | list |
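Order-preserving de-duplication is commonly written with a set of already seen items. The sketch below is one such variant, assuming the list items are hashable; the name `unique_in_order` is hypothetical and the code is not taken from pyjstat.

```python
def unique_in_order(seq):
    """Return the items of seq without duplicates, keeping first occurrences."""
    seen = set()
    result = []
    for item in seq:
        if item not in seen:
            seen.add(item)
            result.append(item)
    return result


deduplicated = unique_in_order(["b", "a", "b", "c", "a"])
```

The set gives constant-time membership checks, while the separate list records the order of first appearance.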