Code documentation

findagg.py

platform:Unix
synopsis:CMIP5 aggregation discovery upon local TDS IPSL-ESGF datanode or CICLAD filesystem.
class findagg.InstituteInfo(name)[source]

Lists the models of an institute according to the Data Reference Syntax (DRS).

Parameters:name (str) – The institute to process
Returns:The models from the institute
Return type:list
class findagg.ProcessingContext(args, requirements)[source]

Encapsulates the following processing context/information for main process:

Attribute          Type         Description
self.ensembles     list         Ensembles from request
self.experiments   list         Experiments from request
self.institute     str          Institute in process
self.institutes    list         Institutes from a directory
self.model         str          Model in process
self.outputfile    str          Output file for available aggregations
self.pool          pool object  Pool of workers (from multithreading)
self.urls          list         List of URLs to call
self.variables     list         Variables from request
self.verbose       boolean      True if verbose mode
self.miss          boolean      True to output missing data
self.xml           boolean      True to scan XML aggregations
self.tds           boolean      True to scan THREDDS aggregations
self.inter         boolean      True to scan both aggregation types
Parameters:args (dict) – Parsed command-line arguments
Returns:The processing context
Return type:dict

Raises:
  • Error – If no --tds or --xml flag is set
  • Error – If the --inter option is set without both the --tds and --xml flags
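
The two error conditions above can be sketched as follows. This is only an illustration of the rule, not the module's actual code: the check_flags helper and the local Error class are hypothetical names, and the three flags are assumed to arrive as booleans.

```python
class Error(Exception):
    """Stand-in for the exception type raised by findagg (assumed)."""


def check_flags(tds, xml, inter):
    """Validate the --tds/--xml/--inter combination (hypothetical helper)."""
    # At least one aggregation type must be requested.
    if not (tds or xml):
        raise Error("At least one of --tds or --xml must be set")
    # --inter only makes sense when both aggregation types are scanned.
    if inter and not (tds and xml):
        raise Error("--inter requires both --tds and --xml")
    return True
```

With this sketch, check_flags(True, True, True) passes, while check_flags(True, False, True) raises Error.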
findagg.get_args()[source]

Returns parsed command-line arguments. See find_agg -h for full description.

findagg.init_logging(logdir)[source]

Initializes the logging configuration (output, message formatting). When a logfile is used, its name is unique and formatted as follows: name-YYYYMMDD-HHMMSS-PID.log

Parameters:logdir (str) – The relative or absolute logfile directory. If None, standard output is used.
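
A minimal sketch of the behaviour described above, built on the standard logging module. The names logfile_name and init_logging_sketch, the default name "findagg", the INFO level, and the message format are all assumptions made for illustration; only the logfile naming pattern and the fallback to standard output come from the description.

```python
import logging
import os
import sys
from datetime import datetime


def logfile_name(name, now=None, pid=None):
    """Return 'name-YYYYMMDD-HHMMSS-PID.log' (hypothetical helper)."""
    now = datetime.now() if now is None else now
    pid = os.getpid() if pid is None else pid
    return "{0}-{1}-{2}.log".format(name, now.strftime("%Y%m%d-%H%M%S"), pid)


def init_logging_sketch(logdir, name="findagg"):
    """Log to a unique file under logdir, or to standard output if logdir is None."""
    fmt = "%(asctime)s %(levelname)s %(message)s"  # assumed message format
    if logdir is None:
        logging.basicConfig(stream=sys.stdout, level=logging.INFO, format=fmt)
    else:
        # The directory is assumed to exist already; this sketch does not create it.
        path = os.path.join(logdir, logfile_name(name))
        logging.basicConfig(filename=path, level=logging.INFO, format=fmt)
```

Including the timestamp and the process ID in the file name is what keeps concurrent runs from writing to the same logfile.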
findagg.get_requirements(path)[source]

Loads the requirements from the JSON template.

Parameters:path (str) – The path of the JSON file with requirements
Returns:The configuration information
Return type:dict
Raises Error:If the JSON file parsing fails
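
As a rough sketch of this contract, the loader below reads a JSON file and converts a parsing failure into an Error. The load_requirements name and the local Error class are hypothetical stand-ins, and the keys of the returned dictionary depend on the JSON template, which is not described here.

```python
import json


class Error(Exception):
    """Stand-in for the exception type raised by findagg (assumed)."""


def load_requirements(path):
    """Return the parsed JSON content as a dict, or raise Error on invalid JSON."""
    try:
        with open(path) as handle:
            return json.load(handle)
    except ValueError as exc:
        # json raises ValueError (JSONDecodeError in Python 3) on malformed input.
        raise Error("Cannot parse JSON file {0}: {1}".format(path, exc))
```

A missing file still raises the usual IOError/OSError in this sketch; only parsing failures are wrapped.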
findagg.get_ensembles_list(ctx)[source]

Returns the ensembles list given an institute and a model.

Parameters:ctx (dict) – The processing context (as a ProcessingContext() class instance)
Returns:The ensembles list without duplicates
Return type:list
findagg.get_aggregation_urls(ctx)[source]

Yields the aggregation URLs to test.

Parameters:ctx (dict) – The processing context (as a ProcessingContext() class instance)
Returns:An iterator over the rebuilt URLs
Return type:iter
findagg.get_aggregation_xmls(ctx)[source]

Like get_aggregation_urls(), but returns an iterator over the rebuilt XML paths.

Parameters:ctx (dict) – The processing context (as a ProcessingContext() class instance)
Returns:An iterator over the rebuilt XML paths
Return type:iter
findagg.test_url(url)[source]

Tests a URL response.

Parameters:url (str) – The url to test
Returns:True if the aggregation url exists
Return type:boolean
Raises Error:If an HTTP request fails
findagg.test_xml(xml)[source]

Like test_url(), but tests if an xml path exists.

Parameters:xml (str) – The xml path to test
Returns:True if the xml aggregation exists
Return type:boolean
findagg.all_urls_exist(ctx)[source]

Returns a flag indicating whether all urls exist or not.

Parameters:ctx (dict) – The processing context (as a ProcessingContext() class instance)
Returns:True if all aggregation urls exist
Return type:boolean
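
One way to compute such a flag with a pool of workers is sketched below. It is an assumption-laden illustration, not the module's code: all_exist is a hypothetical helper, the per-URL check is passed in as a function instead of calling test_url() directly, and a thread-based pool from multiprocessing.dummy is assumed.

```python
from multiprocessing.dummy import Pool  # thread-based pool with the Pool API


def all_exist(urls, check, workers=4):
    """Return True if check(url) is true for every url (hypothetical helper)."""
    pool = Pool(workers)
    try:
        # map() runs the checks concurrently and preserves input order.
        return all(pool.map(check, urls))
    finally:
        pool.close()
        pool.join()
```

Note that all() returns True for an empty list, so a request that produces no URLs counts as "all exist" in this sketch.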
findagg.all_xmls_exist(ctx)[source]

Like all_urls_exist(), but returns a flag indicating whether all xml paths exist or not.

Parameters:ctx (dict) – The processing context (as a ProcessingContext() class instance)
Returns:True if all XML aggregations exist
Return type:boolean
findagg.write_urls(ctx)[source]

Writes all available aggregations into output file.

Parameters:ctx (dict) – The processing context (as a ProcessingContext() class instance)
findagg.write_xmls(ctx)[source]

Like write_urls(), but writes available xml paths into the output file.

Parameters:ctx (dict) – The processing context (as a ProcessingContext() class instance)
findagg.url2path(url)[source]

Converts an aggregation url into a file path.

Parameters:url (str) – The url to convert
Returns:The corresponding path on filesystem
Return type:str
findagg.get_missing_tree(url)[source]

Returns the master missing tree where the data should be.

Parameters:url (str) – The url to convert using url2path()
Returns:The child tree where data should be on the filesystem
Return type:str
findagg.get_missing_data(ctx)[source]

Writes the sorted list of missing data.

Parameters:ctx (dict) – The processing context (as a ProcessingContext() class instance)
findagg.get_missing_urls(ctx)[source]

Like get_missing_data(), but writes the sorted list of missing aggregation URLs.

Parameters:ctx (dict) – The processing context (as a ProcessingContext() class instance)
findagg.get_missing_xmls(ctx)[source]

Like get_missing_urls(), but writes the sorted list of missing xml paths.

Parameters:ctx (dict) – The processing context (as a ProcessingContext() class instance)
findagg.main()[source]
Main process that:
  • Instantiates the processing context,
  • Tests all THREDDS aggregation URLs,
  • Tests all XML aggregation paths,
  • Checks if data exist when an aggregation is missing,
  • Prints or logs the search results.

Module author: Levavasseur Guillaume (CNRS/IPSL) <glipsl@ipsl.jussieu.fr>
