
Imputation of trip data for a docked bike-sharing system


Affiliations
1 Department of Civil Engineering, Rajiv Gandhi Institute of Technology, Kottayam 686 501, India
2 Department of Civil Engineering, Indian Institute of Science, Bengaluru 560 012, India
3 Department of Civil Engineering, Transport Division, Universidad de Chile, Chile
 

Mobile application-based transportation services are reshaping urban transportation in both the developed and developing worlds. They generate massive amounts of data that can provide deeper insights into urban travel activity than were previously available. The bike-sharing service (BSS) market is growing rapidly, with new service providers entering the arena. However, several BSS start-ups in India have failed in recent years. These cases have one aspect in common: user dissatisfaction caused by insufficient or ineffective rebalancing approaches. BSS operators rely on data insights to drive their policies and strategies, yet the data generated by these services contain several incomplete records, such as trips with a missing origin or destination, as a result of various technical errors. Because most BSS modelling focuses on trip origin and destination, ignoring (listwise deleting) trips with missing information discards valuable data still present in the other observed variables, including trip duration and the date and time of the trip. This study proposes two methods for imputing the missing data: (i) a probabilistic approach based on Bayes’ theorem, and (ii) a machine learning approach based on the k-nearest neighbor algorithm. The methodologies are presented in detail. Data from a BSS that operated on the Indian Institute of Science campus, Bengaluru, India, are used to illustrate the proposed approaches. This is followed by a brief discussion of the results and a comparison of the performance of the two approaches.
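
The two imputation ideas named in the abstract can be illustrated with a minimal sketch. This is not the authors' implementation: the toy trip records, the 5-minute duration bins, the Laplace smoothing, the unscaled Euclidean distance, and all function names below are assumptions made purely for illustration. The sketch imputes a missing destination from the observed origin, trip duration, and start hour.

```python
from collections import Counter
import math

# Hypothetical toy records: (origin, destination, duration_min, start_hour).
# These values are invented for illustration, not taken from the study.
TRIPS = [
    ("A", "B", 5, 9), ("A", "B", 6, 9), ("A", "C", 12, 10),
    ("A", "C", 14, 18), ("B", "A", 5, 17), ("A", "B", 4, 8),
]


def duration_bin(minutes, width=5):
    """Discretise trip duration into fixed-width bins (assumed width)."""
    return int(minutes // width)


def bayes_destination_posterior(trips, origin, duration):
    """P(dest | origin, bin) is proportional to P(bin | origin, dest) * P(dest | origin)."""
    same_origin = [t for t in trips if t[0] == origin]
    if not same_origin:
        return {}
    prior_counts = Counter(t[1] for t in same_origin)
    target = duration_bin(duration)
    n_bins = len({duration_bin(t[2]) for t in trips})
    scores = {}
    for dest, n_dest in prior_counts.items():
        hits = sum(
            1 for t in same_origin
            if t[1] == dest and duration_bin(t[2]) == target
        )
        # Laplace smoothing (an assumption) avoids zero likelihoods.
        likelihood = (hits + 1) / (n_dest + n_bins)
        prior = n_dest / len(same_origin)
        scores[dest] = likelihood * prior
    total = sum(scores.values())
    return {dest: s / total for dest, s in scores.items()}


def knn_destination(trips, origin, duration, start_hour, k=3):
    """Majority vote among the k complete trips closest in (duration, hour)."""
    same_origin = [t for t in trips if t[0] == origin]
    ranked = sorted(
        same_origin,
        key=lambda t: math.hypot(t[2] - duration, t[3] - start_hour),
    )
    votes = Counter(t[1] for t in ranked[:k])
    return votes.most_common(1)[0][0]


# A trip from station "A" lasting 13 minutes, starting at 11:00,
# whose destination was not recorded.
posterior = bayes_destination_posterior(TRIPS, "A", 13)
bayes_guess = max(posterior, key=posterior.get)
knn_guess = knn_destination(TRIPS, "A", 13, 11)
print(bayes_guess, knn_guess)
```

In practice the features would likely need scaling before computing distances, and the start hour is cyclic, which this sketch ignores.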

Keywords

Bike-sharing system, imputation, incomplete records, origin and destination, probabilistic and machine learning approaches, trip data.

Authors

Milan Mathew Thomas
Department of Civil Engineering, Rajiv Gandhi Institute of Technology, Kottayam 686 501, India
Ashish Verma
Department of Civil Engineering, Indian Institute of Science, Bengaluru 560 012, India
Sai Kiran Mayakuntla
Department of Civil Engineering, Transport Division, Universidad de Chile, Chile







DOI: https://doi.org/10.18520/cs%2Fv122%2Fi3%2F310-318