
Scalable Recommendation System with MapReduce


     



When the number of users in a recommendation system grows very large, the standard approach of sequentially examining each item and looking at all interacting users does not scale. Our proposed system addresses this problem with a MapReduce algorithm for item comparison and Top-N recommendation that scales linearly with the number of users. We use similarity-based neighborhood methods for recommendation, which infer their predictions by finding users with similar taste or items that have been similarly rated. In MapReduce, the data to be processed is split and stored block-wise across the machines of a cluster in a distributed file system (DFS) and is usually represented as (key, value) tuples. The algorithm is parallel: it partitions the data across the cluster, and in general it supports a wide range of similarity measures.

Keywords

MapReduce, Parallel Algorithm, Similarity, Pairwise Comparison.
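
The pairwise item comparison and Top-N step described in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: it simulates the map, shuffle, and reduce phases in a single Python process instead of on a cluster, assumes binary (user, item) interactions, and picks cosine similarity as one example of the similarity measures the abstract mentions. The toy data and all function names (`run_mapreduce`, `map_item_pairs`, `recommend`, and so on) are hypothetical.

```python
from collections import defaultdict
from itertools import combinations
from math import sqrt

# Hypothetical toy input: (user, item) interaction tuples.
INTERACTIONS = [
    ("alice", "A"), ("alice", "B"), ("alice", "C"),
    ("bob", "A"), ("bob", "B"),
    ("carol", "B"), ("carol", "C"),
    ("dave", "A"),
]

def run_mapreduce(records, mapper, reducer):
    """Simulate one MapReduce pass in-process: map, shuffle by key, reduce."""
    groups = defaultdict(list)
    for record in records:
        for key, value in mapper(record):
            groups[key].append(value)          # shuffle: group values by key
    results = []
    for key in sorted(groups):
        results.extend(reducer(key, groups[key]))
    return results

# Pass 1: gather the items each user interacted with.
def map_by_user(record):
    user, item = record
    yield user, item

def reduce_user_items(user, items):
    yield user, sorted(set(items))

# Pass 2: per user, emit item occurrences and item co-occurrences.
def map_item_pairs(record):
    _user, items = record
    for item in items:
        yield (item, item), 1                  # occurrence count of one item
    for pair in combinations(items, 2):
        yield pair, 1                          # co-occurrence of an item pair

def reduce_sum(pair, ones):
    yield pair, sum(ones)

def item_similarities(interactions):
    """Return (user -> items, (item_a, item_b) -> cosine similarity)."""
    user_items = run_mapreduce(interactions, map_by_user, reduce_user_items)
    counts = dict(run_mapreduce(user_items, map_item_pairs, reduce_sum))
    sims = {}
    for (a, b), cooc in counts.items():
        if a != b:
            # Cosine similarity for binary interaction vectors.
            sims[(a, b)] = cooc / sqrt(counts[(a, a)] * counts[(b, b)])
    return dict(user_items), sims

def recommend(user, user_items, sims, n=2):
    """Rank unseen items by summed similarity to the user's own items."""
    seen = set(user_items[user])
    scores = defaultdict(float)
    for (a, b), sim in sims.items():
        if a in seen and b not in seen:
            scores[b] += sim
        elif b in seen and a not in seen:
            scores[a] += sim
    ranked = sorted(scores, key=lambda item: (-scores[item], item))
    return ranked[:n]

USER_ITEMS, SIMS = item_similarities(INTERACTIONS)
```

With this toy data, `recommend("dave", USER_ITEMS, SIMS, n=1)` returns `["B"]`, because item B co-occurs with dave's only item A more often than C does. In the second pass each mapper call works on a single user's item list, so in this sketch the users can be processed independently of one another.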
