
Updating Solving Set Algorithm of Outlier Detection to Reduce the Iterations for Large Data Sets and its Application to Fault Diagnosis


Affiliations
1 Vishwakarma Institute of Technology, Pune, Maharashtra, India
2 Department of Computer Engineering, Vishwakarma Institute of Technology, Pune, Maharashtra, India
     



This paper updates the original solving set algorithm for detecting possible outliers so that it needs fewer iterations and therefore less time. The original algorithm selects the initial solving set at random; we instead select it carefully, using the standard deviation of each pattern with respect to the others. The proposed modification requires less time and fewer iterations than the original. Our experiments indicate that the modification requires around one half to two thirds of the patterns in the initial solving set to be those with maximum standard deviation. We compared the original and updated algorithms on a synthetic 2-dimensional data set, described in Section II, and on a fault diagnosis data set from NASA. The updated algorithm takes less time to detect outliers than the original and exhibits a better outlier detection rate along with better cluster entropy. These three properties, a better outlier detection rate, less time, and better cluster entropy, make the modification suitable for outlier detection in large data sets.

Keywords

Data Mining, Distance-Based Outlier, Fault Diagnosis, Outlier Detection.
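
The abstract does not define how the standard deviation of a pattern "with respect to" the others is computed, so the following is only a minimal sketch under stated assumptions, not the authors' implementation. It assumes the score of a pattern is its root-mean-square Euclidean distance to all other patterns, and that the requested fraction of the initial solving set is filled with the highest-scoring patterns while the remainder is drawn at random. The names `deviation_scores` and `initial_solving_set`, and the parameters `size`, `fraction`, and `seed`, are hypothetical.

```python
import math
import random


def deviation_scores(patterns):
    """Assumed score: RMS Euclidean distance from each pattern to all others."""
    n = len(patterns)
    if n < 2:
        raise ValueError("need at least two patterns")
    scores = []
    for i, p in enumerate(patterns):
        squared = sum(math.dist(p, q) ** 2
                      for j, q in enumerate(patterns) if j != i)
        scores.append(math.sqrt(squared / (n - 1)))
    return scores


def initial_solving_set(patterns, size, fraction=0.5, seed=0):
    """Pick `size` pattern indices for the initial solving set.

    About `fraction` of them are the highest-scoring patterns (assumed
    reading of "maximum standard deviation"); the rest are chosen at
    random from the remaining patterns, as in the original algorithm.
    """
    if not 0 < size <= len(patterns):
        raise ValueError("size must be between 1 and len(patterns)")
    scores = deviation_scores(patterns)
    # Indices ordered from largest to smallest score.
    ranked = sorted(range(len(patterns)), key=lambda i: scores[i], reverse=True)
    n_top = min(size, max(1, math.ceil(fraction * size)))
    chosen = ranked[:n_top]
    # Fill the remaining slots randomly; a fixed seed keeps runs repeatable.
    chosen += random.Random(seed).sample(ranked[n_top:], size - n_top)
    return chosen


if __name__ == "__main__":
    # Four clustered 2-dimensional patterns and one distant pattern.
    points = [(0, 0), (0, 1), (1, 0), (1, 1), (10, 10)]
    print(initial_solving_set(points, size=2, fraction=0.5))
```

Under this assumed score, the distant pattern at index 4 ranks first, so it always enters the initial solving set. Setting `fraction` between 0.5 and about 0.67 corresponds to the range the abstract reports.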

Authors

P. S. Dhabe
Vishwakarma Institute of Technology, Pune, Maharashtra, India
A. S. Shingare
Department of Computer Engineering, Vishwakarma Institute of Technology, Pune, Maharashtra, India
M. L. Dhore
Department of Computer Engineering, Vishwakarma Institute of Technology, Pune, Maharashtra, India
