The PDF file you selected should load here if your Web browser has a PDF reader plug-in installed (for example, a recent version of Adobe Acrobat Reader).

If you would like more information about how to print, save, and work with PDFs, Highwire Press provides a helpful Frequently Asked Questions about PDFs.

Alternatively, you can download the PDF file directly to your computer, from where it can be opened using a PDF reader. To download the PDF, click the Download link above.

Fullscreen Fullscreen Off


Objectives: Curse of Dimensionality and the attribute relevance is the matter of great concern now these days while dealing with the higher dimensional data sets or Big Data, especially to detect the projected outliers. The objective of this research paper is to construct a Robust and a scalable model to prominently highlight the higher dimensional outliers in an effective and an efficient manner. Methods/Analysis: In order to detect the projected outliers, an algorithm EKFFK-Means with a hybrid approach is constructed using two important methodologies- Extended Kalman Filter (EKF) and Fuzzy K-Means. EKF is used to linearize the higher dimensional data by estimating the current mean and covariance by enhancing the Kalman gain and then fuzzy K-Means confirms the outlying property of each data instance and categorizes them in an effective and an efficient way using the membership label. Findings: A model EKFFK-Means is constructed that further creates 30 clusters from the complete data set to detect the projected outliers and various parameters like accuracy, cluster validity, True positive rate, False positive rate , robustness and cluster quality are calculated. Improvements: This algorithm is further compared with HPStream and CLUStream and is proved better against various parameters.

Keywords

Clustering, Projected Outliers, Robustness, Scalability, Unsupervised.
User