Working with realtime transit data

This is a quick overview of how to work with Kingston's realtime transit data. Before we start, you will need to install the gtfs-realtime-bindings package using pip.

In [3]:
# commands preceded with an exclamation point are run on your system's command line
!pip3 install --user gtfs-realtime-bindings
Collecting gtfs-realtime-bindings
  Using cached gtfs-realtime-bindings-0.0.5.tar.gz
Requirement already satisfied: setuptools in /usr/lib/python3.6/site-packages (from gtfs-realtime-bindings)
Collecting protobuf (from gtfs-realtime-bindings)
  Downloading protobuf-3.5.0-cp36-cp36m-manylinux1_x86_64.whl (6.4MB)
    100% |████████████████████████████████| 6.4MB 191kB/s eta 0:00:01
Requirement already satisfied: six>=1.9 in /usr/lib/python3.6/site-packages (from protobuf->gtfs-realtime-bindings)
Installing collected packages: protobuf, gtfs-realtime-bindings
  Running setup.py install for gtfs-realtime-bindings ... done
Successfully installed gtfs-realtime-bindings-0.0.5 protobuf-3.5.0

Now that we have the package, let's load it and work through a quick example. We will use the vehicle positions feed here; service alerts and trip updates are available as well.

In [4]:
from google.transit import gtfs_realtime_pb2
import requests
In [5]:
feed = gtfs_realtime_pb2.FeedMessage()
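# requests fetches the raw bytes from a URL, in this case the positions of all of Kingston's buses
response = requests.get('http://kingston.metrolinx.tmix.se/gtfs-realtime/vehicleupdates.pb')
# stop here with an HTTPError if the server did not return the feed
response.raise_for_status()
# decode the binary protobuf payload into feed
feed.ParseFromString(response.content)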
Out[5]:
6996
In [11]:
print('There are {} buses in the dataset.'.format(len(feed.entity)))
# looking closely at the first bus
bus = feed.entity[0]
There are 67 buses in the dataset.
In [10]:
bus
Out[10]:
id: "3490109994"
is_deleted: false
vehicle {
  trip {
    trip_id: "859988:42588:5297"
    start_date: "20171120"
    schedule_relationship: SCHEDULED
    route_id: "1"
  }
  position {
    latitude: 44.256954193115234
    longitude: -76.48285675048828
    bearing: 11.6015625
    speed: 4.5
  }
  current_stop_sequence: 36
  current_status: STOPPED_AT
  timestamp: 1511189681
  congestion_level: RUNNING_SMOOTHLY
  stop_id: "00212"
  vehicle {
    id: "3490109994"
    label: "3490109994"
  }
}
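The timestamp field above is a POSIX time (seconds since 1 January 1970 UTC), as defined by the GTFS Realtime specification. Here is a minimal sketch of turning it into a readable datetime using only the standard library. The value is copied from the output above, and the variable names are made up for this example.

```python
from datetime import datetime, timezone

# timestamp value copied from the bus shown above (seconds since the Unix epoch)
raw_timestamp = 1511189681

# build a timezone-aware datetime in UTC
observed_at = datetime.fromtimestamp(raw_timestamp, tz=timezone.utc)
print(observed_at.isoformat())  # prints 2017-11-20T14:54:41+00:00
```

With a live feed you would pass bus.vehicle.timestamp instead of the hard-coded number.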
In [12]:
# bus is not a plain dict; it is a protobuf message object (a FeedEntity)
print(type(bus))
<class 'gtfs_realtime_pb2.FeedEntity'>
In [23]:
# you can use dot notation to get individual pieces of data
print('bus ID:', bus.id, '\n')
bus ID: 3490109994 

In [25]:
# nested pieces of data just need additional dots
bus.vehicle.position
Out[25]:
latitude: 44.256954193115234
longitude: -76.48285675048828
bearing: 11.6015625
speed: 4.5
In [26]:
bus.vehicle.position.speed
Out[26]:
4.5
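According to the GTFS Realtime specification, speed is reported in meters per second and bearing in degrees clockwise from true north. Below is a small sketch of turning those into friendlier values. The helper names to_kmh and to_compass are made up for this example, and the numbers are copied from the output above.

```python
# eight compass points, each covering a 45 degree slice centered on its direction
COMPASS_POINTS = ['N', 'NE', 'E', 'SE', 'S', 'SW', 'W', 'NW']

def to_kmh(speed_ms):
    """Convert meters per second to kilometers per hour."""
    return speed_ms * 3.6

def to_compass(bearing_deg):
    """Map a bearing in degrees to the nearest of eight compass points."""
    index = round(bearing_deg / 45) % 8
    return COMPASS_POINTS[index]

# values copied from bus.vehicle.position above
speed_kmh = to_kmh(4.5)
heading = to_compass(11.6015625)
print('{:.1f} km/h heading {}'.format(speed_kmh, heading))  # prints 16.2 km/h heading N
```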

Alternatively, if you'd rather use the data as a dict, it might be worth looking into the protobuf3-to-dict package.

In [32]:
!pip3 install --user protobuf3-to-dict
Collecting protobuf3-to-dict
  Downloading protobuf3-to-dict-0.1.5.tar.gz
Requirement already satisfied: protobuf>=2.3.0 in /home/jstaf/.local/lib/python3.6/site-packages (from protobuf3-to-dict)
Requirement already satisfied: six in /usr/lib/python3.6/site-packages (from protobuf3-to-dict)
Requirement already satisfied: setuptools in /usr/lib/python3.6/site-packages (from protobuf>=2.3.0->protobuf3-to-dict)
Installing collected packages: protobuf3-to-dict
  Running setup.py install for protobuf3-to-dict ... done
Successfully installed protobuf3-to-dict-0.1.5
In [34]:
from protobuf_to_dict import protobuf_to_dict
In [36]:
# convert to dict from our original protobuf feed
buses_dict = protobuf_to_dict(feed)
type(buses_dict)
Out[36]:
dict
In [41]:
# get our first bus
bus_again = buses_dict['entity'][0]
bus_again
Out[41]:
{'id': '3490109994',
 'is_deleted': False,
 'vehicle': {'congestion_level': 1,
  'current_status': 1,
  'current_stop_sequence': 36,
  'position': {'bearing': 11.6015625,
   'latitude': 44.256954193115234,
   'longitude': -76.48285675048828,
   'speed': 4.5},
  'stop_id': '00212',
  'timestamp': 1511189681,
  'trip': {'route_id': '1',
   'schedule_relationship': 0,
   'start_date': '20171120',
   'trip_id': '859988:42588:5297'},
  'vehicle': {'id': '3490109994', 'label': '3490109994'}}}
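Note that enum fields such as current_status come out of the dict as plain integers rather than names. Here is a minimal sketch of mapping them back to the names shown in the protobuf output earlier. The lookup tables are written out by hand from the GTFS Realtime specification rather than taken from either package, and label_enums is a hypothetical helper.

```python
# integer-to-name tables copied by hand from the GTFS Realtime specification
ENUM_TABLES = {
    'current_status': {0: 'INCOMING_AT', 1: 'STOPPED_AT', 2: 'IN_TRANSIT_TO'},
    'congestion_level': {
        0: 'UNKNOWN_CONGESTION_LEVEL',
        1: 'RUNNING_SMOOTHLY',
        2: 'STOP_AND_GO',
        3: 'CONGESTION',
        4: 'SEVERE_CONGESTION',
    },
}

def label_enums(vehicle):
    """Return a shallow copy of a vehicle dict with enum integers replaced by names."""
    labeled = dict(vehicle)
    for field, names in ENUM_TABLES.items():
        if field in labeled:
            value = labeled[field]
            # fall back to the raw integer if the value is not in our table
            labeled[field] = names.get(value, value)
    return labeled

# trimmed copy of the vehicle dict shown above
sample_vehicle = {'congestion_level': 1, 'current_status': 1, 'stop_id': '00212'}
labeled_vehicle = label_enums(sample_vehicle)
print(labeled_vehicle)
```

With the real data you would call label_enums(bus_again['vehicle']). The original dict is left unchanged.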
In [43]:
# We can then use the bus data just as we would a normal dict.

# Get the speed, for example
bus_again['vehicle']['position']['speed']
Out[43]:
4.5
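Because the converted feed is just nested dicts and lists, ordinary Python tools apply to it. Here is a rough sketch of counting buses per route. sample_feed is a hand-made stand-in with the same shape as buses_dict (only the first entity comes from the output above; the rest are invented for illustration), and count_by_route is a hypothetical helper.

```python
from collections import Counter

# stand-in for buses_dict; entities after the first are invented examples
sample_feed = {
    'entity': [
        {'id': '3490109994', 'vehicle': {'trip': {'route_id': '1'}}},
        {'id': 'example-2', 'vehicle': {'trip': {'route_id': '2'}}},
        {'id': 'example-3', 'vehicle': {'trip': {'route_id': '2'}}},
        {'id': 'example-4', 'vehicle': {'position': {'speed': 0.0}}},
    ]
}

def count_by_route(feed_dict):
    """Count entities per route_id, using 'unknown' when no route is given."""
    counts = Counter()
    for entity in feed_dict.get('entity', []):
        # optional fields are simply absent from the dict, so use .get throughout
        trip = entity.get('vehicle', {}).get('trip', {})
        counts[trip.get('route_id', 'unknown')] += 1
    return counts

route_counts = count_by_route(sample_feed)
print(route_counts)
```

Using .get with defaults matters here because, in this sample, fields that are not set are missing from the dict entirely instead of being present with an empty value.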