To cope with the growing volume of data streams that online users interact with, a growing number of tools and models summarize and extract information. These tools rely on prediction models to personalize content and extract useful information. However, data streams are highly prone to concept drift, a phenomenon in which the data distribution changes over time. To maintain their performance, these models must adapt when a drift occurs. In this work, we present the Incremental Knowledge Concept Drift (IKCD) algorithm, an adaptive unsupervised learning algorithm for recommendation systems in news data streams. Data modelling in IKCD uses k-means clustering to determine the occurrence of a drift, which avoids any dependency on the availability of data labels. Once a drift is detected, new retraining data is composed from both the old and the new concept. IKCD is tested on synthetic and real benchmark datasets from various domains, which exhibit different drift types and different rates of change. Experimental results show improved performance with respect to (a) reducing model sensitivity to noise, (b) reducing model rebuilding frequency by up to 50% in the case of re-occurring drift, and (c) increasing model accuracy by about 10% relative to the confidence distribution batch detection algorithm.
Keywords
Concept Drift, Change Detection, Recommendation Systems.
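The abstract gives the idea behind IKCD (label-free drift detection via k-means, then retraining on a mix of old and new data) but not its details. The following is only a minimal illustrative sketch of that general idea, not the IKCD algorithm itself. The function names, the distance-ratio test, and the parameters `k`, `ratio`, and `keep_old` are all assumptions introduced here for illustration.

```python
import numpy as np


def fit_kmeans(X, k, n_iter=50, seed=0):
    """Plain Lloyd's k-means; returns the (k, d) centroid array."""
    rng = np.random.default_rng(seed)
    centroids = X[rng.choice(len(X), size=k, replace=False)].copy()
    for _ in range(n_iter):
        # Assign each point to its nearest centroid.
        dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)
        for j in range(k):
            members = X[labels == j]
            if len(members) > 0:  # keep the old centroid if a cluster empties
                centroids[j] = members.mean(axis=0)
    return centroids


def mean_nearest_distance(X, centroids):
    """Average distance from each point to its closest centroid."""
    dists = np.linalg.norm(X[:, None, :] - centroids[None, :, :], axis=2)
    return dists.min(axis=1).mean()


def drift_detected(reference, window, k=2, ratio=2.0):
    """Hypothetical test: flag a drift when the new window sits much
    farther from the reference centroids than the reference data does.
    No labels are used. The ratio threshold is an assumed heuristic."""
    centroids = fit_kmeans(reference, k)
    baseline = mean_nearest_distance(reference, centroids)
    current = mean_nearest_distance(window, centroids)
    return bool(current > ratio * baseline)


def compose_retraining_set(reference, window, keep_old=0.5, seed=0):
    """Hypothetical mix: a random fraction of old data plus all new data."""
    rng = np.random.default_rng(seed)
    n_old = int(keep_old * len(reference))
    idx = rng.choice(len(reference), size=n_old, replace=False)
    return np.vstack([reference[idx], window])
```

In this sketch, `drift_detected(reference, window)` would be called on each incoming batch, and `compose_retraining_set` only when it returns `True`. Keeping part of the old data is one simple way to retain knowledge of a concept that may re-occur.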