Mobile application-based transportation services are reshaping the urban transportation industries of both the developed and developing worlds. They generate massive amounts of data, which have the potential to provide deeper insights into urban travel activity than ever before. The bike-sharing service (BSS) market is growing rapidly, with new service providers entering the arena. However, several BSS start-ups in India have failed in recent years. These cases have one aspect in common: user dissatisfaction caused by insufficient or ineffective rebalancing. BSS operators rely on data insights to drive their policies and strategies, yet the data generated by these services contain many incomplete records, such as trips with a missing origin or destination, as a result of various technical errors. Because most BSS modelling focuses on trip origin and destination, ignoring (that is, listwise deleting) trips with missing information discards valuable data that are still present in the other observed variables, including trip duration and the date and time of the trip. This study proposes two methods for imputing the missing data: (i) a probabilistic approach based on Bayes’ theorem, and (ii) a machine learning approach based on the k-nearest neighbor algorithm. The methodologies for their analyses are presented in detail. Data from a BSS that operated on the Indian Institute of Science campus, Bengaluru, India, are used to illustrate the proposed approaches. This is followed by a brief discussion of the results and a comparison of the performance
Keywords
Bike-sharing system, imputation, incomplete records, origin and destination, probabilistic and machine learning approaches, trip data.
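The two imputation approaches named in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the toy trip records, the 5-minute duration bins, the choice of features (duration and hour of day), and the function names `bayes_impute` and `knn_impute` are all assumptions made for illustration. The sketch imputes a missing destination from the observed origin, trip duration, and start hour.

```python
from collections import Counter

# Hypothetical complete records: (origin, duration_min, start_hour, destination)
trips = [
    ("A", 5, 9, "B"),
    ("A", 6, 9, "B"),
    ("A", 15, 18, "C"),
    ("A", 14, 17, "C"),
    ("A", 16, 18, "C"),
    ("B", 5, 10, "A"),
]

def duration_bin(minutes, width=5):
    """Discretise duration so that likelihoods can be estimated by counting (assumed bin width)."""
    return minutes // width

def bayes_impute(trips, origin, duration):
    """Bayes' theorem: P(dest | bin, origin) is proportional to P(bin | dest, origin) * P(dest | origin)."""
    same_origin = [t for t in trips if t[0] == origin]
    prior_counts = Counter(t[3] for t in same_origin)
    joint_counts = Counter((t[3], duration_bin(t[1])) for t in same_origin)
    b = duration_bin(duration)
    scores = {
        dest: (joint_counts[(dest, b)] / n) * (n / len(same_origin))
        for dest, n in prior_counts.items()
    }
    total = sum(scores.values())
    if total == 0:
        return None, {}  # no complete trip supports this origin and duration bin
    posterior = {dest: s / total for dest, s in scores.items()}
    return max(posterior, key=posterior.get), posterior

def knn_impute(trips, origin, duration, hour, k=3):
    """Majority vote among the k complete trips closest in (duration, hour); features are unscaled."""
    candidates = [t for t in trips if t[0] == origin]
    nearest = sorted(
        candidates,
        key=lambda t: (t[1] - duration) ** 2 + (t[2] - hour) ** 2,
    )[:k]
    votes = Counter(t[3] for t in nearest)
    return votes.most_common(1)[0][0] if votes else None

# A record with a missing destination: origin "A", 15 minutes, starting at 18:00
best, posterior = bayes_impute(trips, "A", 15)
print(best, posterior)
print(knn_impute(trips, "A", 15, 18, k=3))
```

In this sketch the probabilistic method returns a full posterior over candidate stations, which is useful when the uncertainty of an imputed value matters, whereas the nearest-neighbor method returns only a single label. In practice the features would probably need scaling before computing distances.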