This example shows how to open and use one of the provided datasets. We can open one of the transit data files with the following code:
library(tidyverse)
# some bus stops have weird arrival times, but we'll pretend that's not an issue here
bus_stops <- read_csv('/home/datasets/transit/stop_times.txt')
number of columns of result is not a multiple of vector length (arg 1)
1360 parsing failures.
 row col            expected   actual   file
1112 arrival_time   valid date 24:00:00 '/home/datasets/transit/stop_times.txt'
1112 departure_time valid date 24:00:00 '/home/datasets/transit/stop_times.txt'
3922 arrival_time   valid date 25:45:00 '/home/datasets/transit/stop_times.txt'
3922 departure_time valid date 25:45:00 '/home/datasets/transit/stop_times.txt'
3923 arrival_time   valid date 25:46:00 '/home/datasets/transit/stop_times.txt'
.... .............. .......... ........ ......................................
See problems(...) for more details.
# take a peek at our dataset
head(bus_stops)
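The parsing failures reported above all involve times at or past 24:00:00. If the file follows the GTFS convention (an assumption here, based on the stop_times.txt file name), such values are legitimate: they describe stops made after midnight on the same service day. One possible workaround, sketched below, is to read the two time columns as text (for example with readr's col_types = cols(arrival_time = col_character(), departure_time = col_character())) and convert them to seconds yourself. The helper gtfs_time_to_seconds is a hypothetical name introduced for this sketch and is not part of readr, and the sample values stand in for the real file.

```r
# Hypothetical helper (not part of readr or the tidyverse): convert
# "HH:MM:SS" strings, where HH may be 24 or more, into seconds since
# the start of the service day. Malformed or missing values become NA.
gtfs_time_to_seconds <- function(x) {
  parts <- strsplit(x, ":", fixed = TRUE)
  vapply(parts, function(p) {
    if (length(p) != 3 || anyNA(p)) return(NA_real_)
    p <- as.numeric(p)
    p[1] * 3600 + p[2] * 60 + p[3]
  }, numeric(1))
}

# A few sample values standing in for the departure_time column,
# including two that the default parser rejected above.
sample_times <- c("08:15:00", "24:00:00", "25:45:00")

seconds <- gtfs_time_to_seconds(sample_times)
hours <- seconds / 3600  # fractional hours, convenient for plotting
print(data.frame(sample_times, seconds, hours))
```

Keeping the values as seconds (or fractional hours) preserves the fact that 25:45:00 belongs to the previous service day, which a clock time of 01:45:00 would lose.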
We could then create a nice plot using ggplot2 to show us the best time to take the bus.
ggplot(bus_stops, aes(x = departure_time)) + geom_histogram(color = 'gray25', fill='dodgerblue') +
theme_classic() + ylab('Number of bus departures in schedule') + xlab('Departure time')
It looks like the best time to take the bus is between 9 am and 6 pm! Conversely, there are almost no buses before 6 am (who would have thought?).
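To back up a reading of the histogram with numbers, one option is to count scheduled departures per hour. The sketch below uses base R only and a handful of made-up departure times, not values from the dataset. It assumes the times are available as "HH:MM:SS" text and folds hours past midnight back onto the 24-hour clock.

```r
# Illustrative departure times (invented for this sketch, not from the data)
departure_time <- c("05:50:00", "09:10:00", "09:40:00",
                    "13:05:00", "17:30:00", "25:45:00")

# Hour of the service day: everything before the first colon
hour <- as.integer(sub(":.*$", "", departure_time))

# Fold hours past midnight (24, 25, ...) back onto the clock
clock_hour <- hour %% 24

# Count departures for every hour of the day, including empty hours
departures_per_hour <- table(factor(clock_hour, levels = 0:23))

# Hour(s) with the most scheduled departures
busiest <- names(departures_per_hour)[departures_per_hour == max(departures_per_hour)]
print(departures_per_hour)
print(busiest)
```

With the real data, the same table could be built from the departure_time column of bus_stops once it has been read in as text.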