
An Efficient Algorithm Distance Calculation of Page Sequences Using Dynamic Programming


Affiliations
1 Department of Computer Application, Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh, India
     



Web data is growing rapidly, but the information on the web is inconsistent because it combines many different types of content, and the data are heterogeneous. This heterogeneity makes extracting relevant information from the web a difficult task. Web usage mining extracts relevant information from the large volume of data recorded in web logs, which contain intrinsic information about the web pages accessed. Because web log data is so large, it is better to work with a small subset at a time than to process the complete data set. This requires a distance or similarity function that measures how far apart two user sessions are. Clustering then groups users who exhibit similar browsing patterns. In this paper we propose an efficient algorithm for calculating the similarity between two user sessions based on sequence alignment, using the dynamic programming technique known as Longest Common Subsequence.

Keywords

Clustering, Longest Common Subsequence, Web Logs, Web Usage Mining.
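
To make the idea in the abstract concrete, the following is a minimal sketch of how a Longest Common Subsequence (LCS) computed by dynamic programming could yield a distance between two user sessions, where each session is an ordered list of page identifiers. It is not the authors' algorithm: the function names `lcs_length` and `session_distance`, the example page names, and in particular the normalization by the longer session's length are assumptions made for illustration only.

```python
from typing import Hashable, Sequence


def lcs_length(a: Sequence[Hashable], b: Sequence[Hashable]) -> int:
    """Return the length of the longest common subsequence of a and b.

    Standard dynamic programming recurrence, keeping only the previous
    row of the table so memory is O(len(b)) instead of O(len(a) * len(b)).
    """
    prev = [0] * (len(b) + 1)
    for x in a:
        curr = [0]
        for j, y in enumerate(b, start=1):
            if x == y:
                # Pages match: extend the best subsequence ending before both.
                curr.append(prev[j - 1] + 1)
            else:
                # No match: carry over the better of skipping x or skipping y.
                curr.append(max(prev[j], curr[j - 1]))
        prev = curr
    return prev[-1]


def session_distance(a: Sequence[Hashable], b: Sequence[Hashable]) -> float:
    """Distance in [0, 1] between two sessions (assumed normalization).

    0.0 means one session is a subsequence of an equally long session
    (identical order of pages); 1.0 means no page is shared in order.
    """
    if not a and not b:
        return 0.0
    return 1.0 - lcs_length(a, b) / max(len(a), len(b))


# Hypothetical sessions: ordered page visits taken from a web log.
session_1 = ["home", "products", "cart", "checkout"]
session_2 = ["home", "search", "products", "checkout"]

print(lcs_length(session_1, session_2))
print(session_distance(session_1, session_2))
```

A pairwise distance of this kind could then be fed to a clustering method to group users with similar browsing patterns. Other normalizations, such as dividing by the sum or the minimum of the two session lengths, are equally plausible choices.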

Authors

Saurabh Dhyani
Department of Computer Application, Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh, India
Ghanshyam Singh Thakur
Department of Computer Application, Maulana Azad National Institute of Technology, Bhopal, Madhya Pradesh, India
