Quantitative Study of Traffic Accident Prediction Models: A Case Study of Virginia Accidents
Traffic accidents are a serious problem that threatens people's lives, health, and property, so reducing them is a pressing public safety need. This paper proposes two data mining models for predicting accident risk, one based on the decision tree algorithm and one on the naive Bayes algorithm. Each classifier predicts the potential severity of a traffic accident from a set of attributes describing weather conditions, accident timing, and road properties. The models are developed using data on accidents in Virginia between 2016 and 2021. The performance of each model is measured with several metrics, including accuracy, precision, recall, and F1-score. To compare the models statistically, the study employs three quantitative analysis tools: the approximate visual test, paired observations, and ANOVA. The experimental results show that the decision tree outperforms naive Bayes in prediction accuracy.
Traffic Accidents, Severity Prediction, Quantitative Analysis, Decision Tree and Naive Bayes Algorithms.
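As a rough illustration of the evaluation described in the abstract, the sketch below computes accuracy, precision, recall, and F1-score for one class and then builds a paired-observations confidence interval for the accuracy difference between two models. It is a minimal sketch under stated assumptions, not the authors' implementation: the function names `binary_metrics` and `paired_ci`, the labels, and the per-run accuracies are all hypothetical, and the numbers are made up for illustration rather than taken from the paper. Severity in practice is likely multi-class, so the metrics here treat a single class as "positive".

```python
from math import sqrt
from statistics import mean, stdev


def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, precision, recall and F1 with one class treated as positive."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    denom = precision + recall
    f1 = 2 * precision * recall / denom if denom else 0.0
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


def paired_ci(a, b, t_crit):
    """Confidence interval for the mean of the paired differences a[i] - b[i].

    t_crit is the Student-t critical value for len(a) - 1 degrees of freedom.
    """
    diffs = [x - y for x, y in zip(a, b)]
    half_width = t_crit * stdev(diffs) / sqrt(len(diffs))
    return mean(diffs) - half_width, mean(diffs) + half_width


# Hypothetical labels (1 = severe, 0 = not severe); not data from the paper.
y_true = [1, 0, 1, 1, 0, 0, 1, 0]
y_pred = [1, 0, 0, 1, 0, 1, 1, 0]
metrics = binary_metrics(y_true, y_pred)

# Hypothetical accuracies of two models on the same five train/test splits.
tree_acc = [0.82, 0.80, 0.84, 0.81, 0.83]
nb_acc = [0.74, 0.75, 0.73, 0.76, 0.72]

# 2.776 is the two-sided 95% t critical value for 4 degrees of freedom.
low, high = paired_ci(tree_acc, nb_acc, t_crit=2.776)
significant = low > 0 or high < 0  # interval excludes zero

print(metrics)
print(f"95% CI for accuracy difference: ({low:.4f}, {high:.4f})")
print("difference is statistically significant:", significant)
```

If the interval excludes zero, the paired-observations method treats the difference between the two models as statistically significant at the chosen confidence level; if it contains zero, the measurements do not distinguish them.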