
A Novel Approach to Outlier Detection using Modified Grey Wolf Optimization and k-Nearest Neighbors Algorithm


Affiliations
1 Department of CSE, Ajay Kumar Garg Engineering College, Ghaziabad - 201009, Uttar Pradesh, India
2 Department of CSE and IT, Jaypee University of Information Technology, Waknaghat -173234, Himachal Pradesh, India
3 Department of CSE and IT, Jaypee Institute of Information Technology, Noida - 201301, Uttar Pradesh, India
 

Objectives: Detecting anomalies in datasets is an interesting yet challenging problem. This work proposes a hybrid model that uses meta-heuristics to detect dataset anomalies efficiently. Methods/Statistical Analysis: A distance-based modified grey wolf optimization algorithm is designed that uses the k-Nearest Neighbor algorithm to improve results. The proposed approach works well with supervised datasets and reports anomalies for each attribute of the dataset, based on a threshold derived from a confidence interval. Findings: The proposed approach works well with both regression and classification datasets in the supervised scenario. Results are reported as the number of anomalies detected and as accuracy computed from a confusion matrix; for the classification dataset, the minority class is treated as anomalous relative to the majority class. The evaluation varies the threshold and the value of k; the optimal parameters depend on the dataset domain and data distribution and can be identified and used accordingly. Application/Improvements: The proposed approach can be used for anomaly detection on supervised datasets from different domains. It can also be extended to the unsupervised scenario by integrating it with k-means clustering.

Keywords

Data Mining, Grey Wolf Optimization, k-Nearest Neighbor, Machine Learning, Outlier Detection.
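
The per-attribute thresholding and confusion-matrix evaluation described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the grey wolf optimization step is omitted entirely, the score is assumed to be the mean distance to the k nearest neighbours within a single attribute, and the confidence-interval threshold is assumed to take the form mean + z * standard deviation of the scores. All function names, the parameter `z`, and the sample data are hypothetical.

```python
import statistics


def knn_scores(values, k):
    """Mean absolute distance from each value to its k nearest neighbours (one attribute)."""
    scores = []
    for i, v in enumerate(values):
        dists = sorted(abs(v - w) for j, w in enumerate(values) if j != i)
        scores.append(sum(dists[:k]) / k)
    return scores


def flag_anomalies(values, k=3, z=1.96):
    """Flag values whose k-NN score exceeds mean + z * stdev of all scores.

    Assumption: this cutoff stands in for the confidence-interval threshold
    mentioned in the abstract, whose exact form is not given there.
    """
    scores = knn_scores(values, k)
    cutoff = statistics.mean(scores) + z * statistics.pstdev(scores)
    return [s > cutoff for s in scores]


def confusion_counts(predicted, actual):
    """Return (tp, fp, fn, tn), treating True as the anomalous (minority) class."""
    pairs = list(zip(predicted, actual))
    tp = sum(1 for p, a in pairs if p and a)
    fp = sum(1 for p, a in pairs if p and not a)
    fn = sum(1 for p, a in pairs if not p and a)
    tn = sum(1 for p, a in pairs if not p and not a)
    return tp, fp, fn, tn


def accuracy(tp, fp, fn, tn):
    """Fraction of records whose predicted label matches the true label."""
    return (tp + tn) / (tp + fp + fn + tn)


# Hypothetical single attribute; the last record plays the minority class.
values = [10.0, 10.2, 9.9, 10.1, 10.3, 9.8, 10.0, 25.0]
is_minority = [False] * 7 + [True]

flags = flag_anomalies(values, k=3, z=1.96)
counts = confusion_counts(flags, is_minority)
print(counts, accuracy(*counts))
```

Raising `z` makes the detector stricter and lowering it flags more records, which mirrors the abstract's point that the threshold and k need to be tuned to the dataset's domain and distribution.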

Authors

Reema Aswani
Department of CSE, Ajay Kumar Garg Engineering College, Ghaziabad - 201009, Uttar Pradesh, India
S. P. Ghrera
Department of CSE and IT, Jaypee University of Information Technology, Waknaghat -173234, Himachal Pradesh, India
Satish Chandra
Department of CSE and IT, Jaypee Institute of Information Technology, Noida - 201301, Uttar Pradesh, India




DOI: https://doi.org/10.17485/ijst%2F2016%2Fv9i44%2F136721