Make GTFS 2.0.0 documentation

Module constants

make_gtfs.constants.BUFFER = 10

Meters to buffer trip paths to find stops

make_gtfs.constants.PROTOFEED_ATTRS = ['frequencies', 'meta', 'service_windows', 'shapes', 'shapes_extra', 'stops']
make_gtfs.constants.SEP = '-'

Character to separate different chunks within an ID

Module protofeed

class make_gtfs.protofeed.ProtoFeed(frequencies=None, meta=None, service_windows=None, shapes=None, stops=None)

Bases: object

A ProtoFeed instance holds the source data from which to build a GTFS feed, plus a little metadata.

Attributes are

  • service_windows: DataFrame
  • frequencies: DataFrame; has speeds filled in
  • meta: DataFrame
  • shapes: GeoDataFrame
  • shapes_extra: dictionary of the form <shape ID> -> <trip directions using the shape (0, 1, or 2)>
  • stops: DataFrame (optional)

copy()

Return a copy of this ProtoFeed, that is, a feed with all the same attributes.

make_gtfs.protofeed.read_protofeed(path)

Read the data files at the given directory path (string or Path object) and build a ProtoFeed from them. Validate the resulting ProtoFeed. If invalid, raise a ValueError specifying the errors. Otherwise, return the resulting ProtoFeed.

The data files needed to build a ProtoFeed are

  • frequencies.csv: (required) A CSV file containing route frequency information. The CSV file contains the columns
    • route_short_name: (required) String. A unique short name for the route, e.g. ‘51X’
    • route_long_name: (required) String. Full name of the route that is more descriptive than route_short_name
    • route_type: (required) Integer. The GTFS type of the route
    • service_window_id: (required) String. A service window ID for the route taken from the file service_windows.csv
    • direction: (required) Integer 0, 1, or 2. Indicates whether the route travels in GTFS direction 0 (value 0), in GTFS direction 1 (value 1), or in both directions (value 2). If 2, trips will be created that travel in both directions along the route’s path, each direction operating at the given frequency. Otherwise, trips will be created that travel in only the given direction.
    • frequency: (required) Integer. The frequency of the route during the service window in vehicles per hour
    • speed: (optional) Float. The speed of the route in kilometers per hour
    • shape_id: (required) String. A shape ID that is listed in shapes.geojson and corresponds to the linestring of the (route, direction, service window) tuple.
  • meta.csv: (required) A CSV file containing network metadata. The CSV file contains the columns
    • agency_name: (required) String. The name of the transport agency
    • agency_url: (required) String. A fully qualified URL for the transport agency
    • agency_timezone: (required) String. Timezone where the transit agency is located. Timezone names never contain the space character but may contain an underscore. Refer to http://en.wikipedia.org/wiki/List_of_tz_zones for a list of valid values
    • start_date, end_date: (required) Strings. The start and end dates for which all this network information is valid, formatted as YYYYMMDD strings
    • default_route_speed: (required) Float. Default speed in kilometers per hour to assign to routes with no speed entry in the file frequencies.csv
  • service_windows.csv: (required) A CSV file containing service window information. A service window is a time interval and a set of days of the week during which all routes have constant service frequency, e.g. Saturday and Sunday 07:00 to 09:00. The CSV file contains the columns
    • service_window_id: (required) String. A unique identifier for a service window
    • start_time, end_time: (required) Strings. The start and end times of the service window in HH:MM:SS format where the hour is less than 24
    • monday, tuesday, wednesday, thursday, friday, saturday, sunday: (required) Integer 0 or 1. Indicates whether the service is active on the given day (1) or not (0)
  • shapes.geojson: (required) A GeoJSON file containing route shapes. The file consists of one feature collection of LineString features, where each feature’s properties contain at least the attribute shape_id, which links the route’s shape to the route’s information in frequencies.csv.
  • stops.csv: (optional) A CSV file containing all the required and optional fields of stops.txt in the GTFS
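
The file layout above can be illustrated with a short sketch that writes a minimal set of the required files using only the standard library. All values (route names, agency, coordinates, the window ID weekday_peak) are made up for illustration, and the helper names write_csv and write_sample_protofeed_dir are hypothetical, not part of make_gtfs.

```python
import csv
import json
import tempfile
from pathlib import Path


def write_csv(path, rows):
    """Write a list of dicts to a CSV file, using the first row's keys as the header."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def write_sample_protofeed_dir(directory):
    """Write made-up sample input files into the directory and return it as a Path."""
    directory = Path(directory)
    write_csv(directory / "frequencies.csv", [{
        "route_short_name": "51X",
        "route_long_name": "Example Express",  # made-up value
        "route_type": 3,  # 3 is the GTFS route type for bus
        "service_window_id": "weekday_peak",
        "direction": 2,  # both directions
        "frequency": 4,  # vehicles per hour
        "speed": 30.0,  # kilometers per hour (optional column)
        "shape_id": "shape_1",
    }])
    write_csv(directory / "meta.csv", [{
        "agency_name": "Example Transit",
        "agency_url": "https://example.com",
        "agency_timezone": "Pacific/Auckland",
        "start_date": "20240101",
        "end_date": "20241231",
        "default_route_speed": 25.0,
    }])
    days = ["monday", "tuesday", "wednesday", "thursday",
            "friday", "saturday", "sunday"]
    window = {"service_window_id": "weekday_peak",
              "start_time": "07:00:00", "end_time": "09:00:00"}
    # Active (1) on weekdays, inactive (0) on the weekend
    window.update({day: int(day not in ("saturday", "sunday")) for day in days})
    write_csv(directory / "service_windows.csv", [window])
    shapes = {
        "type": "FeatureCollection",
        "features": [{
            "type": "Feature",
            "properties": {"shape_id": "shape_1"},
            "geometry": {"type": "LineString",
                         "coordinates": [[174.76, -36.85], [174.78, -36.86]]},
        }],
    }
    (directory / "shapes.geojson").write_text(json.dumps(shapes))
    return directory


sample_dir = write_sample_protofeed_dir(tempfile.mkdtemp())
# With make_gtfs installed, this directory could then be passed to
# make_gtfs.protofeed.read_protofeed(sample_dir), per the signature documented above.
```

The optional stops.csv is omitted here, so stops would be derived from the shapes as described under build_stops().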

Module validators

Validators for ProtoFeeds. Designed along the lines of gtfstk.validators.py.

make_gtfs.validators.check_for_invalid_columns(problems, table, df)

Check for invalid columns in the given ProtoFeed DataFrame.

Parameters:
  • problems (list) –

    A list of four-tuples, each containing

    1. A problem type (string) equal to 'error' or 'warning'; 'error' means the ProtoFeed spec is violated; 'warning' means there is a problem but it is not a ProtoFeed spec violation
    2. A message (string) that describes the problem
    3. A ProtoFeed table name, e.g. 'meta', in which the problem occurs
    4. A list of rows (integers) of the table’s DataFrame where the problem occurs
  • table (string) – Name of a ProtoFeed table
  • df (DataFrame) – The ProtoFeed table corresponding to table
Returns:

The problems list extended as follows. Check whether the DataFrame contains extra columns not in the ProtoFeed and append to the problems list one warning for each extra column.

Return type:

list

make_gtfs.validators.check_for_required_columns(problems, table, df)

Check that the given ProtoFeed table has the required columns.

Parameters:
  • problems (list) –

    A list of four-tuples, each containing

    1. A problem type (string) equal to 'error' or 'warning'; 'error' means the ProtoFeed spec is violated; 'warning' means there is a problem but it is not a ProtoFeed spec violation
    2. A message (string) that describes the problem
    3. A ProtoFeed table name, e.g. 'meta', in which the problem occurs
    4. A list of rows (integers) of the table’s DataFrame where the problem occurs
  • table (string) – Name of a ProtoFeed table
  • df (DataFrame) – The ProtoFeed table corresponding to table
Returns:

The problems list extended as follows. Check that the DataFrame contains the columns required by the ProtoFeed spec and append to the problems list one error for each missing column.

Return type:

list
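
As a rough illustration of the problems format used by the two column checks above, the following sketch works on a plain list of column names instead of a DataFrame. It is not the library's implementation: the function names, the message wording, and the META_REQUIRED constant are hypothetical, although the column names themselves are the required meta.csv columns listed earlier.

```python
# Required columns of meta.csv, as listed in the read_protofeed() description
META_REQUIRED = ["agency_name", "agency_url", "agency_timezone",
                 "start_date", "end_date", "default_route_speed"]


def check_required_columns_sketch(problems, table, columns, required):
    """Append one 'error' problem for each required column missing from columns."""
    for col in required:
        if col not in columns:
            # Items have the form [problem type, message, table, rows]
            problems.append(["error", f"Missing column {col}", table, []])
    return problems


def check_invalid_columns_sketch(problems, table, columns, allowed):
    """Append one 'warning' problem for each column that is not in allowed."""
    for col in columns:
        if col not in allowed:
            problems.append(["warning", f"Unrecognized column {col}", table, []])
    return problems


columns = ["agency_name", "agency_url", "colour"]
problems = check_required_columns_sketch([], "meta", columns, META_REQUIRED)
problems = check_invalid_columns_sketch(problems, "meta", columns, META_REQUIRED)
```

Here problems ends up with four errors (one per missing required column) followed by one warning for the extra column colour.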

make_gtfs.validators.check_frequencies(pfeed, *, as_df=False, include_warnings=False)

Check that pfeed.frequencies follows the ProtoFeed spec. Return a list of problems of the form described in gt.check_table(); the list will be empty if no problems are found.

make_gtfs.validators.check_meta(pfeed, *, as_df=False, include_warnings=False)

Analog of check_frequencies() for pfeed.meta

make_gtfs.validators.check_service_windows(pfeed, *, as_df=False, include_warnings=False)

Analog of check_frequencies() for pfeed.service_windows

make_gtfs.validators.check_shapes(pfeed, *, as_df=False, include_warnings=False)

Analog of check_frequencies() for pfeed.shapes

make_gtfs.validators.check_stops(pfeed, *, as_df=False, include_warnings=False)

Analog of check_frequencies() for pfeed.stops

make_gtfs.validators.valid_speed(x)

Return True if x is a positive number; otherwise return False.
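
A minimal sketch of such a check is shown below. How the library treats booleans, NaN, or numeric strings is not stated here, so the choices in this sketch (reject all three) are assumptions, and the name valid_speed_sketch is hypothetical.

```python
from numbers import Real


def valid_speed_sketch(x):
    """Return True if x is a real number greater than zero; otherwise return False."""
    # Assumption: booleans are rejected even though bool is a subclass of int,
    # and non-numeric values such as strings are rejected rather than converted.
    if isinstance(x, bool) or not isinstance(x, Real):
        return False
    # NaN compares False against everything, so it is rejected here too
    return x > 0
```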

make_gtfs.validators.validate(pfeed, *, as_df=True, include_warnings=True)

Check whether the given pfeed satisfies the ProtoFeed spec.

Parameters:
  • pfeed (ProtoFeed) –
  • as_df (boolean) – If True, then return the resulting report as a DataFrame; otherwise return the result as a list
  • include_warnings (boolean) – If True, then include problems of types 'error' and 'warning'; otherwise, only return problems of type 'error'
Returns:

Run all the table-checking functions: check_frequencies(), check_meta(), etc. This yields a possibly empty list of items [problem type, message, table, rows]. If as_df, then format the error list as a DataFrame with the columns

  • 'type': ‘error’ or ‘warning’; ‘error’ means the ProtoFeed spec is violated; ‘warning’ means there is a problem but it’s not a ProtoFeed spec violation
  • 'message': description of the problem
  • 'table': table in which problem occurs, e.g. ‘meta’
  • 'rows': rows of the table’s DataFrame where problem occurs

Return early if the pfeed is missing required tables or required columns.

Return type:

list or DataFrame
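
The effect of include_warnings and the column layout of the report can be sketched in plain Python as follows. This is only an illustration under the assumption that the report is built from [problem type, message, table, rows] items; format_problems_sketch is a hypothetical helper that returns a list of dicts as a stand-in for the DataFrame.

```python
COLUMNS = ["type", "message", "table", "rows"]


def format_problems_sketch(problems, include_warnings=True):
    """Return problems as a list of dicts keyed by COLUMNS, optionally dropping warnings."""
    if not include_warnings:
        problems = [p for p in problems if p[0] == "error"]
    # With pandas available, pd.DataFrame(problems, columns=COLUMNS) would give
    # the equivalent table.
    return [dict(zip(COLUMNS, p)) for p in problems]


sample_problems = [
    ["error", "Invalid speed", "frequencies", [0, 3]],  # made-up problems
    ["warning", "Unrecognized column colour", "meta", []],
]
report_all = format_problems_sketch(sample_problems)
report_errors = format_problems_sketch(sample_problems, include_warnings=False)
```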

Module main

make_gtfs.main.buffer_side(linestring, side, buffer)

Given a Shapely LineString, a side of the LineString (string; ‘left’ = left hand side of LineString, ‘right’ = right hand side of LineString, or ‘both’ = both sides), and a buffer size in the distance units of the LineString, buffer the LineString on the given side by the buffer size and return the resulting Shapely polygon.

make_gtfs.main.build_agency(pfeed)

Given a ProtoFeed, return a DataFrame representing agency.txt

make_gtfs.main.build_calendar_etc(pfeed)

Given a ProtoFeed, return a DataFrame representing calendar.txt and a dictionary of the form <service window ID> -> <service ID>, respectively.

make_gtfs.main.build_feed(pfeed, buffer=10)

make_gtfs.main.build_routes(pfeed)

Given a ProtoFeed, return a DataFrame representing routes.txt.

make_gtfs.main.build_shapes(pfeed)

Given a ProtoFeed, return a DataFrame representing shapes.txt. Only use shape IDs that occur in both pfeed.shapes and pfeed.frequencies. Create reversed shapes where routes traverse shapes in both directions.
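
The idea of reversed shapes can be sketched without GeoPandas, using a dictionary of coordinate lists and a shapes_extra-style dictionary. The directed shape ID format (shape ID, SEP, direction) and the choice of which direction gets the reversed geometry are assumptions made for this example, and build_shape_rows_sketch is a hypothetical name.

```python
SEP = "-"  # same value as make_gtfs.constants.SEP


def build_shape_rows_sketch(coords_by_shape, shapes_extra):
    """
    coords_by_shape: shape ID -> list of (lon, lat) points
    shapes_extra: shape ID -> trip directions using the shape (0, 1, or 2)
    Return a list of dicts with the shapes.txt columns.
    """
    rows = []
    for shape_id, direction in shapes_extra.items():
        if shape_id not in coords_by_shape:
            continue  # only use shape IDs present in both inputs
        coords = coords_by_shape[shape_id]
        if direction == 2:
            # Assumption: direction 1 follows the drawn order, direction 0 its reverse
            variants = [("1", coords), ("0", list(reversed(coords)))]
        else:
            variants = [(str(direction), coords)]
        for suffix, points in variants:
            for seq, (lon, lat) in enumerate(points):
                rows.append({
                    "shape_id": f"{shape_id}{SEP}{suffix}",
                    "shape_pt_sequence": seq,
                    "shape_pt_lon": lon,
                    "shape_pt_lat": lat,
                })
    return rows


shape_rows = build_shape_rows_sketch(
    {"a": [(0, 0), (1, 1)], "b": [(5, 5), (6, 6)], "unused": [(9, 9), (8, 8)]},
    {"a": 2, "b": 0, "missing": 1},
)
```

In this example shape a yields two directed shapes, b yields one, and the IDs unused and missing are skipped because each appears in only one of the inputs.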

make_gtfs.main.build_stop_ids(shape_id)

Create a pair of stop IDs based on the given shape ID.

make_gtfs.main.build_stop_names(shape_id)

Create a pair of stop names based on the given shape ID.

make_gtfs.main.build_stop_times(pfeed, routes, shapes, stops, trips, buffer=10)

Given a ProtoFeed and its corresponding routes (DataFrame), shapes (DataFrame), stops (DataFrame), and trips (DataFrame), return a DataFrame representing stop_times.txt. Includes the optional shape_dist_traveled column. Don’t make stop times for trips with no nearby stops.

make_gtfs.main.build_stops(pfeed, shapes=None)

Given a ProtoFeed, return a DataFrame representing stops.txt. If pfeed.stops is not None, then return that. Otherwise, require built shapes output by build_shapes(), create one stop at the beginning (the first point) of each shape and one at the end (the last point) of each shape, and drop stops with duplicate coordinates. Note that this will yield one stop for shapes that are loops.
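
The fallback behaviour (one stop at each end of every shape, duplicates by coordinates dropped) can be sketched in plain Python as below. The stop ID format and the function name build_end_stops_sketch are hypothetical; the real IDs and names come from build_stop_ids() and build_stop_names(), whose formats are not specified here.

```python
SEP = "-"  # same value as make_gtfs.constants.SEP


def build_end_stops_sketch(coords_by_shape):
    """
    coords_by_shape: shape ID -> list of (lon, lat) points
    Return a list of stop dicts, one per distinct end-point coordinate.
    """
    stops = []
    seen = set()
    for shape_id, coords in coords_by_shape.items():
        for label, (lon, lat) in (("start", coords[0]), ("end", coords[-1])):
            if (lon, lat) in seen:
                continue  # drop stops with duplicate coordinates
            seen.add((lon, lat))
            stops.append({
                "stop_id": f"stop{SEP}{shape_id}{SEP}{label}",  # hypothetical format
                "stop_lon": lon,
                "stop_lat": lat,
            })
    return stops


end_stops = build_end_stops_sketch({
    "line": [(0, 0), (1, 0), (2, 0)],
    "loop": [(5, 5), (6, 5), (5, 5)],  # first point equals last point
})
```

The loop shape contributes a single stop because its first and last points coincide.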

make_gtfs.main.build_trips(pfeed, routes, service_by_window)

Given a ProtoFeed and its corresponding routes (DataFrame), service-by-window (dictionary), return a DataFrame representing trips.txt. Trip IDs encode route, direction, and service window information to make it easy to compute stop times later.
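
To show why encoding information in the trip ID is convenient, here is a sketch that joins chunks with SEP and splits them again later. The chunk order, the leading "t" prefix, and both function names are assumptions for illustration only; the actual trip ID layout is not documented in this section.

```python
SEP = "-"  # same value as make_gtfs.constants.SEP


def make_trip_id_sketch(route_id, window_id, direction, index):
    """Join the chunks into one ID; refuse chunks that contain SEP themselves."""
    chunks = [str(route_id), str(window_id), str(direction), str(index)]
    if any(SEP in chunk for chunk in chunks):
        raise ValueError(f"Chunks must not contain {SEP!r}")
    return SEP.join(["t"] + chunks)


def parse_trip_id_sketch(trip_id):
    """Recover the chunks from an ID built by make_trip_id_sketch()."""
    _, route_id, window_id, direction, index = trip_id.split(SEP)
    return {
        "route_id": route_id,
        "window_id": window_id,
        "direction": int(direction),
        "index": int(index),
    }


trip_id = make_trip_id_sketch("r51X", "weekday_peak", 0, 3)
trip_info = parse_trip_id_sketch(trip_id)
```

Splitting on SEP is only safe when no chunk contains the separator, which is why this sketch rejects such chunks up front.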

make_gtfs.main.get_duration(timestr1, timestr2, units='s')

Return the duration of the time period between the first and second time string in the given units. Allowable units are ‘s’ (seconds), ‘min’ (minutes), ‘h’ (hours). Assume timestr1 < timestr2.
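
A minimal sketch of this computation follows, assuming both time strings are in the HH:MM:SS format used by service_windows.csv. The names get_duration_sketch and timestr_to_seconds are hypothetical, and raising ValueError for an unknown unit is a choice made for this example.

```python
SECONDS_PER_UNIT = {"s": 1, "min": 60, "h": 3600}


def timestr_to_seconds(timestr):
    """Convert an HH:MM:SS string to seconds past midnight."""
    hours, minutes, seconds = (int(part) for part in timestr.split(":"))
    return 3600 * hours + 60 * minutes + seconds


def get_duration_sketch(timestr1, timestr2, units="s"):
    """Return the time from timestr1 to timestr2 in the given units, as a float."""
    if units not in SECONDS_PER_UNIT:
        raise ValueError(f"Units must be one of {sorted(SECONDS_PER_UNIT)}")
    delta = timestr_to_seconds(timestr2) - timestr_to_seconds(timestr1)
    return delta / SECONDS_PER_UNIT[units]


window_hours = get_duration_sketch("07:00:00", "09:00:00", units="h")
```

For the service window 07:00:00 to 09:00:00 this gives 2.0 hours, the kind of value that can be combined with a frequency in vehicles per hour when generating trips.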

make_gtfs.main.get_nearby_stops(geo_stops, linestring, side, buffer=10)

Given a GeoDataFrame of stops, a Shapely LineString in the same coordinate system, a side of the LineString (string; ‘left’ = left hand side of LineString, ‘right’ = right hand side of LineString, or ‘both’ = both sides), and a buffer in the distance units of that coordinate system, return a GeoDataFrame of all the stops that lie within buffer distance units to the given side of the LineString.
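
The side-aware selection can be illustrated in plain Python. Note that this sketch swaps in a different technique from the one described above: instead of buffering a Shapely LineString, it measures each stop's distance to its nearest line segment and uses the sign of a cross product to decide the side. Coordinates are assumed to be planar (for example, meters in a projected system), and all names are hypothetical.

```python
import math


def nearest_on_segment(p, a, b):
    """
    Return (distance, cross) for point p relative to the segment from a to b.
    cross > 0 means p lies to the left of the direction of travel a -> b.
    """
    (ax, ay), (bx, by), (px, py) = a, b, p
    dx, dy = bx - ax, by - ay
    length_sq = dx * dx + dy * dy
    # Parameter of the closest point on the segment, clamped to [0, 1]
    t = 0.0 if length_sq == 0 else max(
        0.0, min(1.0, ((px - ax) * dx + (py - ay) * dy) / length_sq))
    cx, cy = ax + t * dx, ay + t * dy
    distance = math.hypot(px - cx, py - cy)
    cross = dx * (py - ay) - dy * (px - ax)
    return distance, cross


def get_nearby_stops_sketch(stops, line, side, buffer=10):
    """
    stops: stop ID -> (x, y); line: list of (x, y) points in travel order.
    Return the IDs of stops within buffer of the line on the given side.
    """
    if side not in ("left", "right", "both"):
        raise ValueError("side must be 'left', 'right', or 'both'")
    nearby = []
    for stop_id, p in stops.items():
        # Pick the nearest segment; min() compares distances first
        distance, cross = min(
            nearest_on_segment(p, a, b) for a, b in zip(line, line[1:]))
        if distance > buffer:
            continue
        if (side == "both"
                or (side == "left" and cross > 0)
                or (side == "right" and cross < 0)):
            nearby.append(stop_id)
    return nearby


sample_stops = {"north": (50, 5), "south": (50, -5), "far": (50, 50)}
sample_line = [(0, 0), (100, 0)]  # travelling east
left_stops = get_nearby_stops_sketch(sample_stops, sample_line, "left")
```

For an eastbound line, the left-hand side is the north side, so only the stop north is selected here. In this sketch, a stop lying exactly on the line is returned only for side 'both'.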

Module cli
