Towards Efficient Distributed Algorithm with Minimum Communication Overhead
Organizations today are often geographically distributed. Typically, each site stores its own day-to-day data locally, and that data is continually updated. Centralized data mining algorithms cannot be used in such organizations to discover useful patterns, because merging the datasets from different sites is often infeasible and incurs large network communication costs. Distributed data mining has therefore emerged as an active sub-domain of data mining research. In distributed association rule mining, one of the major challenges is reducing communication overhead: data sites must exchange a large amount of information during the mining process, and this exchange generates the overhead. This report proposes an association rule mining algorithm that minimizes the communication overhead among the participating data sites. Instead of transmitting all itemsets and their counts, the algorithm transmits a binary vector of frequent (large) itemsets using the Message Passing Interface (MPI). Another challenge is reducing the number of database scans needed to generate the frequent itemsets. To address this, an algorithm termed "Efficient Distributed Dynamic Itemset Counting" is proposed. It reduces the time spent scanning the partitioned database, which improves the overall performance of the algorithm.
Keywords
Association Rules, Distributed Environment, Minimum Communication Cost, Dynamic Itemset Counting, Frequent Pattern Growth, Support and Confidence.
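The binary-vector idea in the abstract can be illustrated with a minimal sketch. It is not the proposed algorithm itself: the function names, the toy transactions, the fixed candidate ordering shared by all sites, and the bitwise-OR merge are all assumptions made for illustration, and the sites are simulated in one process rather than communicating over MPI. Each site sends one bit per candidate itemset (1 if the itemset is locally frequent) instead of the itemset and its count.

```python
def local_support(transactions, itemset):
    """Fraction of a site's transactions that contain every item of itemset."""
    if not transactions:
        return 0.0
    hits = sum(1 for t in transactions if itemset <= t)
    return hits / len(transactions)


def encode_frequent(transactions, candidates, min_support):
    """Build one bit per candidate: 1 if it is locally frequent at this site."""
    return [
        1 if local_support(transactions, c) >= min_support else 0
        for c in candidates
    ]


def merge_vectors(vectors):
    """Bitwise OR across sites (assumed merge rule, not taken from the paper)."""
    return [int(any(bits)) for bits in zip(*vectors)]


# Hypothetical candidate list; every site must use the same ordering.
candidates = [
    frozenset(c)
    for c in (("a",), ("b",), ("c",), ("a", "b"), ("a", "c"), ("b", "c"))
]

# Hypothetical local partitions held by two sites.
site_1 = [{"a", "b"}, {"a", "b", "c"}, {"a"}, {"b"}]
site_2 = [{"c"}, {"a", "c"}, {"c", "b"}, {"c"}]

vector_1 = encode_frequent(site_1, candidates, min_support=0.5)
vector_2 = encode_frequent(site_2, candidates, min_support=0.5)
merged = merge_vectors([vector_1, vector_2])

print(vector_1)  # bits sent by site 1
print(vector_2)  # bits sent by site 2
print(merged)    # candidates that are frequent at one site or more
```

With the same relative support threshold at every site, an itemset that is frequent over the whole database must be frequent at one site or more, so the merged vector marks a superset of the globally frequent itemsets. Exact counts would then need to be exchanged only for the marked candidates.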