Graphgain: A Proposed Measure for Ranking Mined Subgraph
Frequent itemset discovery algorithms have been used to solve various interesting problems over the years. As data mining techniques are introduced and widely applied to non-traditional domains, existing approaches for finding frequent itemsets cannot be used, because they cannot model the requirements of these domains. An alternative is to model the database objects as graphs. Within that model, the problem of finding frequent patterns becomes that of finding subgraphs that occur frequently over the entire set of graphs. Modeling objects as graphs allows us to represent arbitrary relations among entities. In this paper we present a computationally efficient algorithm for ranking such frequent subgraphs. The subgraphs themselves may be found with any of the existing algorithms. To rank the subgraphs we present a new measure called “graphgain”. Graphgain is a normalization applied at each position to the Discounted Cumulative Gain (DCG) of a subgraph. DCG alone cannot be used to compare performance from one query to the next in a search engine’s ranking algorithm. To obtain the graphgain, the DCG of the ideal ordering (IDCG) must be found at each position. For this purpose, a Modified Discounted Cumulative Gain based on “lift” is introduced, and the IDCG is evaluated from it. The graphgain is then computed. Finally, the graphgain values for all rules can be averaged to obtain a measure of the average performance of a search engine’s ranking algorithm. The ordering of graphgain values also gives the order in which rules are evaluated, which in turn yields an efficient ranking of the subgraphs.
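The normalization described in the abstract can be illustrated with a minimal sketch. The abstract does not give the exact formulas, so the code below makes several assumptions: it uses the standard DCG with a log2 position discount, treats the lift of the rule associated with each subgraph as the gain at that position, and takes graphgain at a position to be DCG divided by IDCG. The function names `lift`, `dcg`, `graphgain`, and `mean_graphgain` are hypothetical and chosen only for illustration.

```python
import math


def lift(support_ab, support_a, support_b):
    """Lift of a rule A -> B from relative supports (fractions in (0, 1])."""
    return support_ab / (support_a * support_b)


def dcg(gains):
    """Cumulative discounted gain at every position (ranks start at 1).

    Assumption: standard DCG with a log2(rank + 1) discount; the paper's
    "Modified DCG" may use a different form.
    """
    total = 0.0
    cumulative = []
    for rank, gain in enumerate(gains, start=1):
        total += gain / math.log2(rank + 1)
        cumulative.append(total)
    return cumulative


def graphgain(gains):
    """Assumed graphgain: DCG divided by the ideal DCG at each position."""
    actual = dcg(gains)
    ideal = dcg(sorted(gains, reverse=True))  # best possible ordering
    return [a / i if i > 0 else 0.0 for a, i in zip(actual, ideal)]


def mean_graphgain(rankings):
    """Average of the final-position graphgain over several rankings."""
    finals = [graphgain(gains)[-1] for gains in rankings if gains]
    return sum(finals) / len(finals) if finals else 0.0


if __name__ == "__main__":
    # Hypothetical lift values for three mined subgraphs, in ranked order.
    lifts = [1.2, 2.5, 0.8]
    print(graphgain(lifts))
```

Under these assumptions a ranking that already lists subgraphs in decreasing order of lift scores 1.0 at every position, and any other ordering scores below 1.0 at one or more positions, which is what makes the values comparable across queries.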