Make GTFS 2.0.0 documentation
Module constants
make_gtfs.constants.BUFFER = 10
    Meters to buffer trip paths to find stops

make_gtfs.constants.PROTOFEED_ATTRS = ['frequencies', 'meta', 'service_windows', 'shapes', 'shapes_extra', 'stops']

make_gtfs.constants.SEP = '-'
    Character to separate different chunks within an ID
Module protofeed
class make_gtfs.protofeed.ProtoFeed(frequencies=None, meta=None, service_windows=None, shapes=None, stops=None)
    Bases: object
    A ProtoFeed instance holds the source data from which to build a GTFS feed, plus a little metadata.
    Attributes are
    - service_windows: DataFrame
    - frequencies: DataFrame; has speeds filled in
    - meta: DataFrame
    - shapes: GeoDataFrame
    - shapes_extra: dictionary of the form <shape ID> -> <trip directions using the shape (0, 1, or 2)>

    copy()
        Return a copy of this ProtoFeed, that is, a feed with all the same attributes.
make_gtfs.protofeed.read_protofeed(path)
    Read the data files at the given directory path (string or Path object) and build a ProtoFeed from them. Validate the resulting ProtoFeed. If invalid, raise a ValueError specifying the errors. Otherwise, return the resulting ProtoFeed.
    The data files needed to build a ProtoFeed are
    - frequencies.csv: (required) A CSV file containing route frequency information. The CSV file contains the columns
        - route_short_name: (required) String. A unique short name for the route, e.g. ‘51X’
        - route_long_name: (required) String. Full name of the route that is more descriptive than route_short_name
        - route_type: (required) Integer. The GTFS type of the route
        - service_window_id: (required) String. A service window ID for the route taken from the file service_windows.csv
        - direction: (required) Integer 0, 1, or 2. Indicates whether the route travels in GTFS direction 0, GTFS direction 1, or in both directions. In the last case, trips will be created that travel in both directions along the route’s path, each direction operating at the given frequency. Otherwise, trips will be created that travel in only the given direction.
        - frequency: (required) Integer. The frequency of the route during the service window in vehicles per hour.
        - speed: (optional) Float. The speed of the route in kilometers per hour
        - shape_id: (required) String. A shape ID that is listed in shapes.geojson and corresponds to the linestring of the (route, direction, service window) tuple.
    - meta.csv: (required) A CSV file containing network metadata. The CSV file contains the columns
        - agency_name: (required) String. The name of the transport agency
        - agency_url: (required) String. A fully qualified URL for the transport agency
        - agency_timezone: (required) String. Timezone where the transit agency is located. Timezone names never contain the space character but may contain an underscore. Refer to http://en.wikipedia.org/wiki/List_of_tz_zones for a list of valid values
        - start_date, end_date: (required) Strings. The start and end dates for which all this network information is valid, formatted as YYYYMMDD strings
        - default_route_speed: (required) Float. Default speed in kilometers per hour to assign to routes with no speed entry in the file frequencies.csv
    - service_windows.csv: (required) A CSV file containing service window information. A service window is a time interval and a set of days of the week during which all routes have constant service frequency, e.g. Saturday and Sunday 07:00 to 09:00. The CSV file contains the columns
        - service_window_id: (required) String. A unique identifier for a service window
        - start_time, end_time: (required) Strings. The start and end times of the service window in HH:MM:SS format where the hour is less than 24
        - monday, tuesday, wednesday, thursday, friday, saturday, sunday: (required) Integer 0 or 1. Indicates whether the service is active on the given day (1) or not (0)
    - shapes.geojson: (required) A GeoJSON file containing route shapes. The file consists of one feature collection of LineString features, where each feature’s properties contains at least the attribute shape_id, which links the route’s shape to the route’s information in frequencies.csv.
    - stops.csv: (optional) A CSV file containing all the required and optional fields of stops.txt in the GTFS
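To illustrate the file layout described above, the following sketch writes a minimal set of source files into a temporary directory using only the standard library. Only the file names and column names come from the descriptions above; every value (route names, agency, dates, coordinates, the shape ID 'shp1') and the helper names write_csv and write_example_source are made up for the example.

```python
import csv
import json
import tempfile
from pathlib import Path

# Hypothetical example values; only the column names come from the spec above.
FREQUENCIES = [{
    "route_short_name": "51X",
    "route_long_name": "Example Express",
    "route_type": 3,                 # GTFS route type 3 = bus
    "service_window_id": "weekday_peak",
    "direction": 2,                  # 2 = trips in both directions
    "frequency": 4,                  # vehicles per hour
    "speed": "",                     # optional column, left empty here
    "shape_id": "shp1",
}]
META = [{
    "agency_name": "Example Transit",
    "agency_url": "http://example.com",
    "agency_timezone": "Pacific/Auckland",
    "start_date": "20180101",
    "end_date": "20181231",
    "default_route_speed": 20.0,     # kilometers per hour
}]
SERVICE_WINDOWS = [{
    "service_window_id": "weekday_peak",
    "start_time": "07:00:00",
    "end_time": "09:00:00",
    "monday": 1, "tuesday": 1, "wednesday": 1, "thursday": 1, "friday": 1,
    "saturday": 0, "sunday": 0,
}]
SHAPES = {
    "type": "FeatureCollection",
    "features": [{
        "type": "Feature",
        "properties": {"shape_id": "shp1"},
        "geometry": {
            "type": "LineString",
            "coordinates": [[174.76, -36.85], [174.78, -36.86]],
        },
    }],
}


def write_csv(path, rows):
    """Write a list of dicts to a CSV file with a header row."""
    with open(path, "w", newline="") as f:
        writer = csv.DictWriter(f, fieldnames=list(rows[0].keys()))
        writer.writeheader()
        writer.writerows(rows)


def write_example_source(directory):
    """Write the example files into ``directory`` and return it as a Path."""
    directory = Path(directory)
    write_csv(directory / "frequencies.csv", FREQUENCIES)
    write_csv(directory / "meta.csv", META)
    write_csv(directory / "service_windows.csv", SERVICE_WINDOWS)
    (directory / "shapes.geojson").write_text(json.dumps(SHAPES))
    return directory


example_dir = write_example_source(tempfile.mkdtemp())
```

A directory like example_dir is the kind of path read_protofeed() expects, although whether these particular values pass validation has not been checked here.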
Module validators
Validators for ProtoFeeds. Designed along the lines of gtfstk.validators.py.
make_gtfs.validators.check_for_invalid_columns(problems, table, df)
    Check for invalid columns in the given ProtoFeed DataFrame.
    Parameters:
    - problems (list) – A four-tuple containing
        - A problem type (string) equal to 'error' or 'warning'; 'error' means the ProtoFeed is violated; 'warning' means there is a problem but it is not a ProtoFeed violation
        - A message (string) that describes the problem
        - A ProtoFeed table name, e.g. 'meta', in which the problem occurs
        - A list of rows (integers) of the table’s DataFrame where the problem occurs
    - table (string) – Name of a ProtoFeed table
    - df (DataFrame) – The ProtoFeed table corresponding to table
    Returns: The problems list extended as follows. Check whether the DataFrame contains extra columns not in the ProtoFeed and append to the problems list one warning for each extra column.
    Return type: list
make_gtfs.validators.check_for_required_columns(problems, table, df)
    Check that the given ProtoFeed table has the required columns.
    Parameters:
    - problems (list) – A four-tuple containing
        - A problem type (string) equal to 'error' or 'warning'; 'error' means the ProtoFeed is violated; 'warning' means there is a problem but it is not a ProtoFeed violation
        - A message (string) that describes the problem
        - A ProtoFeed table name, e.g. 'meta', in which the problem occurs
        - A list of rows (integers) of the table’s DataFrame where the problem occurs
    - table (string) – Name of a ProtoFeed table
    - df (DataFrame) – The ProtoFeed table corresponding to table
    Returns: The problems list extended as follows. Check that the DataFrame contains the columns required by the ProtoFeed spec and append to the problems list one error for each column missing.
    Return type: list
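The way the column checks extend the problems list can be pictured with a small standard-library sketch. It works on plain lists of column names instead of DataFrames, and the REQUIRED and OPTIONAL sets, the function names, and the message wording are hypothetical stand-ins for this example, not the library's internals.

```python
# Hypothetical column sets for the 'meta' table, copied from the columns
# documented for meta.csv; the real library may store these differently.
REQUIRED = {
    "agency_name", "agency_url", "agency_timezone",
    "start_date", "end_date", "default_route_speed",
}
OPTIONAL = set()


def check_required(problems, table, columns):
    """Append one 'error' problem per required column missing from ``columns``."""
    for col in sorted(REQUIRED - set(columns)):
        problems.append(["error", f"Missing column {col}", table, []])
    return problems


def check_extra(problems, table, columns):
    """Append one 'warning' problem per column outside the known columns."""
    for col in sorted(set(columns) - REQUIRED - OPTIONAL):
        problems.append(["warning", f"Unrecognized column {col}", table, []])
    return problems


problems = []
columns = ["agency_name", "agency_url", "start_date", "end_date", "notes"]
check_required(problems, "meta", columns)
check_extra(problems, "meta", columns)
for problem in problems:
    print(problem)
```

Each problem follows the [problem type, message, table, rows] layout; the rows entry is left empty because a missing or extra column is not tied to particular rows.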
make_gtfs.validators.check_frequencies(pfeed, *, as_df=False, include_warnings=False)
    Check that pfeed.frequencies follows the ProtoFeed spec. Return a list of problems of the form described in gt.check_table(); the list will be empty if no problems are found.
make_gtfs.validators.check_meta(pfeed, *, as_df=False, include_warnings=False)
    Analog of check_frequencies() for pfeed.meta

make_gtfs.validators.check_service_windows(pfeed, *, as_df=False, include_warnings=False)
    Analog of check_frequencies() for pfeed.service_windows

make_gtfs.validators.check_shapes(pfeed, *, as_df=False, include_warnings=False)
    Analog of check_frequencies() for pfeed.shapes

make_gtfs.validators.check_stops(pfeed, *, as_df=False, include_warnings=False)
    Analog of check_frequencies() for pfeed.stops
make_gtfs.validators.valid_speed(x)
    Return True if x is a positive number; otherwise return False.
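A positive-number test of this kind might look like the sketch below. The function name is hypothetical, and the handling of booleans, NaN, and infinity is an assumption of this example; the description above only says "positive number".

```python
import numbers


def is_positive_number(x):
    """Return True if ``x`` is a finite real number greater than zero."""
    # Assumption: booleans are rejected even though bool subclasses int.
    if isinstance(x, bool) or not isinstance(x, numbers.Real):
        return False
    # NaN fails both comparisons; infinity fails the upper bound.
    return 0 < x < float("inf")
```

For instance, is_positive_number(20.0) is True, while a string such as "20" or a zero speed is rejected.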
make_gtfs.validators.validate(pfeed, *, as_df=True, include_warnings=True)
    Check whether the given pfeed satisfies the ProtoFeed spec.
    Parameters:
    - pfeed (ProtoFeed) –
    - as_df (boolean) – If True, then return the resulting report as a DataFrame; otherwise return the result as a list
    - include_warnings (boolean) – If True, then include problems of types 'error' and 'warning'; otherwise, only return problems of type 'error'
    Returns: Run all the table-checking functions: check_frequencies(), check_meta(), etc. This yields a possibly empty list of items [problem type, message, table, rows]. If as_df, then format the error list as a DataFrame with the columns
    - 'type': ‘error’ or ‘warning’; ‘error’ means the ProtoFeed spec is violated; ‘warning’ means there is a problem but it’s not a ProtoFeed spec violation
    - 'message': description of the problem
    - 'table': table in which problem occurs, e.g. ‘routes’
    - 'rows': rows of the table’s DataFrame where problem occurs
    Return early if the pfeed is missing required tables or required columns.
    Return type: list or DataFrame
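The effect of include_warnings and the four report columns can be sketched without pandas. In this example a list of dicts stands in for the DataFrame, and format_report is a hypothetical helper, not part of make_gtfs.

```python
REPORT_COLUMNS = ["type", "message", "table", "rows"]


def format_report(problems, include_warnings=True):
    """Turn [type, message, table, rows] items into a list of records.

    Warnings are dropped when ``include_warnings`` is False.
    """
    if not include_warnings:
        problems = [p for p in problems if p[0] == "error"]
    return [dict(zip(REPORT_COLUMNS, p)) for p in problems]


example_problems = [
    ["error", "Missing column agency_timezone", "meta", []],
    ["warning", "Unrecognized column notes", "meta", []],
]
errors_only = format_report(example_problems, include_warnings=False)
full_report = format_report(example_problems)
```

Here errors_only holds a single record and full_report holds both, each keyed by 'type', 'message', 'table', and 'rows'.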
Module main
make_gtfs.main.buffer_side(linestring, side, buffer)
    Given a Shapely LineString, a side of the LineString (string; ‘left’ = left hand side of LineString, ‘right’ = right hand side of LineString, or ‘both’ = both sides), and a buffer size in the distance units of the LineString, buffer the LineString on the given side by the buffer size and return the resulting Shapely polygon.
make_gtfs.main.build_agency(pfeed)
    Given a ProtoFeed, return a DataFrame representing agency.txt.

make_gtfs.main.build_calendar_etc(pfeed)
    Given a ProtoFeed, return a DataFrame representing calendar.txt and a dictionary of the form <service window ID> -> <service ID>, respectively.

make_gtfs.main.build_feed(pfeed, buffer=10)

make_gtfs.main.build_routes(pfeed)
    Given a ProtoFeed, return a DataFrame representing routes.txt.
make_gtfs.main.build_shapes(pfeed)
    Given a ProtoFeed, return a DataFrame representing shapes.txt. Only use shape IDs that occur in both pfeed.shapes and pfeed.frequencies. Create reversed shapes where routes traverse shapes in both directions.
make_gtfs.main.build_stop_ids(shape_id)
    Create a pair of stop IDs based on the given shape ID.

make_gtfs.main.build_stop_names(shape_id)
    Create a pair of stop names based on the given shape ID.
make_gtfs.main.build_stop_times(pfeed, routes, shapes, stops, trips, buffer=10)
    Given a ProtoFeed and its corresponding routes (DataFrame), shapes (DataFrame), stops (DataFrame), and trips (DataFrame), return a DataFrame representing stop_times.txt. Includes the optional shape_dist_traveled column. Don’t make stop times for trips with no nearby stops.
make_gtfs.main.build_stops(pfeed, shapes=None)
    Given a ProtoFeed, return a DataFrame representing stops.txt. If pfeed.stops is not None, then return that. Otherwise, require built shapes output by build_shapes(), create one stop at the beginning (the first point) of each shape and one at the end (the last point) of each shape, and drop stops with duplicate coordinates. Note that this will yield one stop for shapes that are loops.
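The endpoint rule, including the single stop produced for a loop, can be illustrated with plain Python data. The sketch assumes shapes are given as a dictionary of shape ID -> list of (lon, lat) points; the function name and the stop ID scheme are hypothetical and differ from what build_stop_ids() produces.

```python
def endpoint_stops(shapes):
    """Return one stop per distinct shape endpoint.

    ``shapes`` maps a shape ID to a list of (lon, lat) points.
    """
    stops = []
    seen = set()
    for shape_id, points in shapes.items():
        for suffix, point in (("start", points[0]), ("end", points[-1])):
            if point in seen:
                continue  # drop stops with duplicate coordinates
            seen.add(point)
            stops.append({
                "stop_id": f"stop-{shape_id}-{suffix}",  # hypothetical ID scheme
                "stop_lon": point[0],
                "stop_lat": point[1],
            })
    return stops


shapes = {
    "line": [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)],
    "loop": [(5.0, 5.0), (6.0, 5.0), (6.0, 6.0), (5.0, 5.0)],
}
stops = endpoint_stops(shapes)
```

The open shape contributes two stops, while the loop contributes only one because its first and last points coincide.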
make_gtfs.main.build_trips(pfeed, routes, service_by_window)
    Given a ProtoFeed and its corresponding routes (DataFrame) and service-by-window (dictionary), return a DataFrame representing trips.txt. Trip IDs encode route, direction, and service window information to make it easy to compute stop times later.
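The idea of encoding route, service window, and direction in a trip ID can be sketched as follows. The chunk order, the 't' prefix, the trailing trip index, and both function names are assumptions made for illustration; the actual ID layout used by build_trips() is not specified here. Only the separator matches make_gtfs.constants.SEP.

```python
SEP = "-"  # same separator character as make_gtfs.constants.SEP


def make_trip_id(route_id, window_id, direction, index):
    """Join the chunks into one ID (hypothetical layout)."""
    return SEP.join(["t", route_id, window_id, str(direction), str(index)])


def parse_trip_id(trip_id):
    """Recover (route_id, window_id, direction, index) from a trip ID.

    Assumes the individual chunks do not themselves contain SEP.
    """
    _, route_id, window_id, direction, index = trip_id.split(SEP)
    return route_id, window_id, int(direction), int(index)


trip_id = make_trip_id("r51X", "weekdaypeak", 0, 3)
decoded = parse_trip_id(trip_id)
```

Because the chunks can be split back out, later steps such as computing stop times can look up the route and service window from the trip ID alone.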
make_gtfs.main.get_duration(timestr1, timestr2, units='s')
    Return the duration of the time period between the first and second time string in the given units. Allowable units are ‘s’ (seconds), ‘min’ (minutes), ‘h’ (hours). Assume timestr1 < timestr2.
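A duration computation of this kind can be sketched with the standard library alone. The function below is a hypothetical stand-in rather than the library's code, and it assumes both inputs are HH:MM:SS strings as used in service_windows.csv.

```python
def duration_between(timestr1, timestr2, units="s"):
    """Return the time from ``timestr1`` to ``timestr2`` (HH:MM:SS strings)
    in seconds ('s'), minutes ('min'), or hours ('h')."""
    divisors = {"s": 1, "min": 60, "h": 3600}
    if units not in divisors:
        raise ValueError(f"units must be one of {sorted(divisors)}")

    def to_seconds(timestr):
        hours, minutes, seconds = (int(part) for part in timestr.split(":"))
        return hours * 3600 + minutes * 60 + seconds

    return (to_seconds(timestr2) - to_seconds(timestr1)) / divisors[units]
```

For a service window from 07:00:00 to 09:00:00, this gives 2 hours, 120 minutes, or 7200 seconds depending on units.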
make_gtfs.main.get_nearby_stops(geo_stops, linestring, side, buffer=10)
    Given a GeoDataFrame of stops, a Shapely LineString in the same coordinate system, a side of the LineString (string; ‘left’ = left hand side of LineString, ‘right’ = right hand side of LineString, or ‘both’ = both sides), and a buffer in the distance units of that coordinate system, do the following. Return a GeoDataFrame of all the stops that lie within buffer distance units to the side of the LineString.