Trips Data Source

All classes in this module implement the abstract class TripsDataSource, which is described below.

TripsDataSource class

class TripsDataSource(city_name, data_source_id, vehicles_type_id)

TripsDataSource is an abstract class that contains the information needed to describe a trip. The other classes of this module implement it. The constructor takes the following parameters:

Parameters
  • city_name (str) – City name. The name is also used to determine the timezone to which the city belongs.

  • data_source_id (str) – Data source from which the information is taken. This allows multiple data sources to be associated with the same city (for example, from different operators).

  • vehicles_type_id (str) – Type of service represented by the data source (e.g. car sharing or e-scooter)
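
To illustrate how a concrete data source might build on this constructor, here is a minimal sketch. It is not the module's actual code: the class names TripsDataSourceSketch and ExampleCityTrips, the attribute names, and the argument values are all hypothetical, chosen only to mirror the three parameters listed above.

```python
from abc import ABC, abstractmethod


class TripsDataSourceSketch(ABC):
    """Hypothetical stand-in for TripsDataSource, not the module's real code."""

    def __init__(self, city_name, data_source_id, vehicles_type_id):
        # Store the three identifiers described in the parameter list.
        self.city_name = city_name
        self.data_source_id = data_source_id
        self.vehicles_type_id = vehicles_type_id

    @abstractmethod
    def load_raw(self):
        """Each concrete data source reads its own raw format."""


class ExampleCityTrips(TripsDataSourceSketch):
    """Hypothetical concrete source for an imaginary operator."""

    def __init__(self):
        super().__init__("Example City", "example_operator", "car_sharing")

    def load_raw(self):
        # A real implementation would read the operator's raw files here.
        return None


source = ExampleCityTrips()
print(source.city_name, source.data_source_id, source.vehicles_type_id)
```

Because load_raw is declared abstract in the sketch, instantiating TripsDataSourceSketch directly raises a TypeError, which is the usual way to force each data source to supply its own loader.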

load_norm(year, month)

Loads a previously created normalized file from disk. Given a year and a month, it checks whether the file for that period exists, looking in the city folder for a file named in the same format used by save_norm.

Parameters
  • year (int) – year expressed as a four-digit number (e.g. 1999)

  • month (int) – month expressed as a number (e.g. for November the method expects to receive 11)

Returns

A pandas.DataFrame containing the data read if the file exists, otherwise an empty DataFrame
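
The lookup described above can be sketched as follows. This is only an illustration under stated assumptions: the function name load_norm_sketch and the explicit city_folder argument are hypothetical, the CSV variant of the file is the one read, and the month is assumed not to be zero-padded in the file name (the source only shows 2017_11.csv).

```python
import os
import tempfile

import pandas as pd


def load_norm_sketch(city_folder, year, month):
    """Return the normalized data for a period, or an empty DataFrame."""
    # Assumed naming scheme: <year>_<month number>.csv inside the city folder.
    path = os.path.join(city_folder, f"{year}_{month}.csv")
    if os.path.exists(path):
        return pd.read_csv(path)
    # No file for the requested period: return an empty DataFrame.
    return pd.DataFrame()


# Usage with a temporary folder standing in for the city folder.
with tempfile.TemporaryDirectory() as folder:
    sample = pd.DataFrame({"trip_id": [1, 2]})  # hypothetical column
    sample.to_csv(os.path.join(folder, "2017_11.csv"), index=False)
    found = load_norm_sketch(folder, 2017, 11)
    missing = load_norm_sketch(folder, 1999, 1)
    print(len(found), missing.empty)
```

Returning an empty DataFrame rather than raising lets a caller test the result with its empty attribute before deciding whether to normalise the raw data again.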

load_raw()

Loads the data to be preprocessed. Since the data format differs across datasets, the method is left abstract and each city has its own implementation. All implementations read the data with the pandas read_csv function.

Returns

nothing
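
As an illustration of what a city-specific implementation might look like, the sketch below reads a CSV with pandas.read_csv and keeps the result on the instance, since the method returns nothing. The class name ExampleRawLoader, the raw_source constructor argument, the attribute trips_df_raw, and the column names are all assumptions made for the example, not names taken from the module.

```python
import io

import pandas as pd


class ExampleRawLoader:
    """Hypothetical data source that reads its raw trips from a CSV."""

    def __init__(self, raw_source):
        # raw_source may be a file path or a file-like object (assumption).
        self.raw_source = raw_source
        self.trips_df_raw = pd.DataFrame()

    def load_raw(self):
        # Store the raw data on the instance; nothing is returned.
        self.trips_df_raw = pd.read_csv(self.raw_source)


# In-memory CSV standing in for an operator's raw export.
raw_csv = io.StringIO("Start Time,End Time\n2017-11-01 08:00:00,2017-11-01 08:30:00\n")
loader = ExampleRawLoader(raw_csv)
loader.load_raw()
print(loader.trips_df_raw.shape)
```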

normalise()

This method standardizes the data format. Again, the implementation depends heavily on the data source, and almost all modules override the method.

Returns

A normalized pandas.DataFrame
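
The source does not specify the normalized schema, so the following is only a sketch of the kind of work a normalisation step typically does: renaming source-specific columns, parsing timestamps, and deriving a common field. The function name normalise_sketch and every column name here are hypothetical.

```python
import pandas as pd


def normalise_sketch(raw_df):
    """Map a hypothetical raw layout onto a hypothetical common layout."""
    # Rename source-specific headers to shared names (assumed names).
    norm = raw_df.rename(
        columns={"Start Time": "start_time", "End Time": "end_time"}
    )
    # Parse the timestamp strings into datetime values.
    norm["start_time"] = pd.to_datetime(norm["start_time"])
    norm["end_time"] = pd.to_datetime(norm["end_time"])
    # Derive the trip duration in seconds.
    norm["duration"] = (norm["end_time"] - norm["start_time"]).dt.total_seconds()
    return norm[["start_time", "end_time", "duration"]]


raw = pd.DataFrame(
    {
        "Start Time": ["2017-11-01 08:00:00"],
        "End Time": ["2017-11-01 08:30:00"],
    }
)
normalized = normalise_sketch(raw)
print(normalized)
```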

save_norm()

Stores the normalized data in both a CSV file and a pickle file. The files are named <year>_<month number>.csv (or .pickle), for example 2017_11.csv.

Returns

nothing
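
The naming scheme above can be sketched as follows. The documented method takes no arguments, so how it determines the folder and the period is not stated here. In this sketch they are passed explicitly as an assumption, and the function name save_norm_sketch and the sample column are hypothetical.

```python
import os
import tempfile

import pandas as pd


def save_norm_sketch(norm_df, city_folder, year, month):
    """Write the normalized data as <year>_<month>.csv and .pickle."""
    base = os.path.join(city_folder, f"{year}_{month}")
    # Same data, two formats: CSV for inspection, pickle for fast reloads.
    norm_df.to_csv(base + ".csv", index=False)
    norm_df.to_pickle(base + ".pickle")


# Usage with a temporary folder standing in for the city folder.
with tempfile.TemporaryDirectory() as folder:
    data = pd.DataFrame({"duration": [1800.0, 600.0]})  # hypothetical column
    save_norm_sketch(data, folder, 2017, 11)
    written = sorted(os.listdir(folder))
    reloaded = pd.read_pickle(os.path.join(folder, "2017_11.pickle"))
    print(written)
```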