High Dimensional Data Mining Using Clustering
Clustering is one of the major tasks in data mining. Clustering algorithms are based on a criterion that maximizes inter-cluster distance and minimizes intra-cluster distance. In high-dimensional feature spaces, their performance and efficiency deteriorate considerably. A large number of dimensions confuses clustering algorithms: it becomes difficult to group similar data points because the distances between points become almost the same, a problem usually called the “curse of dimensionality”. One class of algorithms addresses this by removing irrelevant and redundant dimensions and performing clustering on the remaining subset. Dimensionality reduction techniques such as Principal Component Analysis (PCA) are also used for feature reduction. However, if different subsets of the points cluster well on different subspaces of the feature space, a global dimensionality reduction will fail. To overcome these problems, recent research has proposed computing subspace clusters. Existing subspace clustering algorithms have two common limitations. First, they usually have problems with subspace clusters of different dimensionality. Second, they often fail to discover clusters of different shapes and dimensionalities. The goal of this project is to develop new, efficient, and effective methods for high-dimensional clustering.
Keywords
Data Mining, High Dimensional Clustering, Distance Measure.
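The distance-concentration effect that the abstract refers to as the curse of dimensionality can be illustrated with a minimal sketch. It is not taken from the paper: the function name `relative_contrast`, the uniform random data, the sample size, and the chosen dimensionalities are all assumptions made for illustration. The sketch measures how much the farthest point differs from the nearest point, relative to the nearest, as the number of dimensions grows.

```python
import math
import random


def relative_contrast(n_points, n_dims, seed=0):
    """Return (d_max - d_min) / d_min for Euclidean distances
    from one random query point to n_points random points.

    Hypothetical helper for illustration; data are uniform in [0, 1]^n_dims.
    """
    rng = random.Random(seed)
    query = [rng.random() for _ in range(n_dims)]
    dists = []
    for _ in range(n_points):
        point = [rng.random() for _ in range(n_dims)]
        dists.append(math.dist(query, point))
    return (max(dists) - min(dists)) / min(dists)


# Assumed settings: 500 points, a few dimensionalities to compare.
contrasts = {d: relative_contrast(500, d) for d in (2, 10, 100, 1000)}
for d, c in contrasts.items():
    print(f"dims={d:5d}  relative contrast={c:.3f}")
```

Under these assumptions the relative contrast shrinks sharply as the dimensionality increases, meaning the nearest and farthest neighbours become hard to tell apart. This is why distance-based clustering criteria lose discriminating power in high-dimensional spaces.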