
A General Model for Sequential Pattern Mining with a Progressive Database


Affiliations
1 Sri Indu College of Engineering & Technology, India
2 Nagarjuna University, Guntur, India
     



Although many recent studies address the mining of sequential patterns in a static database and in a database with increasing data, these works generally do not fully explore the effect of deleting old data from the sequences in the database. When sequential patterns are generated, newly arriving patterns may not be identified as frequent because of the presence of old data and sequences. Even worse, obsolete sequential patterns that are no longer frequent may remain in the reported results. In practice, users are usually more interested in recent data than in old data. To capture the dynamic nature of data addition and deletion, we propose a general model of sequential pattern mining with a progressive database, in which data may be static, inserted, or deleted. In addition, we present a progressive algorithm, Pisa (Progressive mining of Sequential patterns), which progressively discovers sequential patterns within a defined period of interest (POI). The POI is a sliding window that advances continuously as time goes by. Pisa uses a progressive sequential tree to efficiently maintain the latest data sequences, discover the complete set of up-to-date sequential patterns, and delete obsolete data and patterns accordingly. The height of the proposed sequential pattern tree is bounded by the length of the POI, which effectively limits the memory required by Pisa to significantly less than that needed by the alternative method, Direct Appending (DirApp). Note that sequential pattern mining with a static database and with an incremental database are special cases of progressive sequential pattern mining: by changing the start time and end time of the POI, Pisa can also handle a static database or an incremental database. The complexity of the proposed algorithms is analyzed.
The experimental results show that Pisa not only outperforms the prior methods in execution time by orders of magnitude but also scales gracefully.
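
To make the progressive-database model concrete, the following is a minimal brute-force sketch of mining within a sliding POI. It is not the Pisa algorithm and does not build a progressive sequential tree; it only illustrates how events older than the POI stop contributing to support. The function name `frequent_patterns_in_poi`, the `(sequence_id, timestamp, item)` event layout, the single-item elements, and the `max_len` cap are all assumptions made for illustration.

```python
from collections import defaultdict
from itertools import combinations


def frequent_patterns_in_poi(events, now, poi, min_support, max_len=3):
    """Naively mine sequential patterns from the events inside the POI.

    events: iterable of (sequence_id, timestamp, item) tuples (assumed layout).
    The POI is assumed to cover timestamps now - poi + 1 .. now, inclusive.
    Returns {pattern_tuple: support}, where support is the fraction of
    sequences with at least one in-window event that contain the pattern.
    """
    start = now - poi + 1

    # Keep only in-window events, in time order; older events are ignored,
    # which plays the role of deletion in the progressive model.
    window = defaultdict(list)
    for seq_id, ts, item in sorted(events, key=lambda e: e[1]):
        if start <= ts <= now:
            window[seq_id].append(item)

    if not window:
        return {}

    counts = defaultdict(int)
    for items in window.values():
        seen = set()
        for length in range(1, min(max_len, len(items)) + 1):
            # combinations() preserves input order, so every tuple it yields
            # is an ordered subsequence of this sequence's in-window items.
            seen.update(combinations(items, length))
        # Count each distinct pattern once per sequence.
        for pattern in seen:
            counts[pattern] += 1

    n = len(window)
    return {p: c / n for p, c in counts.items() if c / n >= min_support}
```

Under these assumptions, calling the function repeatedly with an increasing `now` advances the window. Setting `poi` large enough to cover every timestamp corresponds to the static or incremental special cases mentioned above. Enumerating all subsequences is exponential in sequence length, which is the kind of cost a dedicated structure is meant to avoid.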

Keywords

Progressive Sequential Pattern.

Authors

K. Venkatesh Sharma
Sri Indu College of Engineering & Technology, India
K. Hanumantha Rao
Nagarjuna University, Guntur, India
A. Shiva Kumar
Sri Indu College of Engineering & Technology, India
