





Mining Scalable Multidimensional Sequential User Access Logs Using Parallel Partitioning Transaction Reduction Algorithm
Web usage mining refers to the automatic discovery and analysis of patterns in clickstream data generated by user interactions with Web resources on one or more Web sites. The primary data sources used in Web usage mining are server log files, which include Web server access logs and application server logs. Web usage mining techniques are used to analyze the usage patterns of a Web site, and the user access log is the source from which user access patterns are extracted. These patterns are preprocessed using methods such as data fusion, data cleaning, session identification, unique user identification, page view identification, term view identification, and path completion. To make the entire preprocessing stage faster, a hash map is used to organize the data. Preprocessing yields distinct groups of users with common interests. It is applied to the usage patterns stored in the Web server access logs in order to provide a clean, unique, and reduced dataset for pattern mining. This reduces the size of the original dataset, which makes pattern mining and analysis easier and increases prediction accuracy. Numerous pattern mining approaches can then be applied to the cleaned data. Careful preprocessing improves the quality of the pattern mining methods, and the results can be used by recommender systems to model user behavior. The key objective is to carry out the above activities at high speed and to achieve high prediction accuracy by concentrating on data preprocessing, discovery of interesting patterns, and assessment.
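As an illustration of the preprocessing stage described above, the following is a minimal Python sketch of three of the named steps: data cleaning, user identification, and session identification, with a hash map (a Python dict) holding the per-user data. It is not the authors' implementation. It assumes log lines in Common Log Format, treats the client IP address as the user identity, and uses a 30-minute inactivity timeout to split sessions; these choices, along with all function and constant names, are hypothetical and not taken from the paper.

```python
import re
from collections import defaultdict
from datetime import datetime, timedelta

# Assumption: Common Log Format, e.g.
# 1.1.1.1 - - [10/Oct/2023:13:00:00 +0000] "GET /index.html HTTP/1.1" 200 512
LOG_PATTERN = re.compile(
    r'(?P<ip>\S+) \S+ \S+ \[(?P<time>[^\]]+)\] '
    r'"(?P<method>\S+) (?P<url>\S+) [^"]*" (?P<status>\d{3}) \S+'
)
# Assumption: these resource types are not page views and are dropped.
IGNORED_SUFFIXES = (".gif", ".jpg", ".png", ".css", ".js", ".ico")
# Assumption: a 30-minute gap starts a new session (a common heuristic).
SESSION_TIMEOUT = timedelta(minutes=30)


def parse_line(line):
    """Return (ip, timestamp, url, status), or None for malformed lines."""
    match = LOG_PATTERN.match(line)
    if match is None:
        return None
    timestamp = datetime.strptime(match.group("time"), "%d/%b/%Y:%H:%M:%S %z")
    return match.group("ip"), timestamp, match.group("url"), int(match.group("status"))


def clean(records):
    """Data cleaning: keep successful requests for page-like resources."""
    for ip, timestamp, url, status in records:
        if status != 200:
            continue
        if url.lower().endswith(IGNORED_SUFFIXES):
            continue
        yield ip, timestamp, url


def sessionize(records):
    """Group requests per user in a hash map, then split them into sessions."""
    hits_by_user = defaultdict(list)  # hash map: user key -> list of requests
    for ip, timestamp, url in records:
        hits_by_user[ip].append((timestamp, url))

    sessions = {}
    for ip, hits in hits_by_user.items():
        hits.sort()  # order each user's requests by time
        user_sessions, current, last_seen = [], [], None
        for timestamp, url in hits:
            if last_seen is not None and timestamp - last_seen > SESSION_TIMEOUT:
                user_sessions.append(current)
                current = []
            current.append(url)
            last_seen = timestamp
        if current:
            user_sessions.append(current)
        sessions[ip] = user_sessions
    return sessions


def preprocess(lines):
    """Run parsing, cleaning, and session identification over raw log lines."""
    parsed = (parse_line(line) for line in lines)
    return sessionize(clean(record for record in parsed if record is not None))
```

Calling `preprocess` on an iterable of raw log lines returns a dict that maps each user key to a list of sessions, where each session is the ordered list of URLs requested. The dict gives average constant-time lookup per user, which is the kind of speedup the hash map is meant to provide. Data fusion, page view and term view identification, and path completion are left out of this sketch.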