Affiliation: School of Information, Wuyi University (五邑大学信息学院)
Source: 《计算机应用》 (Journal of Computer Applications), 2008, No. S2, pp. 219-222, 4 pages in total
Abstract: The unbounded, continuously flowing nature of data streams makes traditional frequent-item mining algorithms difficult to apply. To address these characteristics, a real-time algorithm for mining approximate frequent items from data streams is proposed. Within an allowed error bound, the new algorithm scans each data item only once, uses storage far smaller than the size of the stream, and dynamically mines all frequent items in the stream. Data items are stored in a new data structure that allows non-frequent items to be deleted quickly. Finally, theoretical analysis and experiments show that the method is effective.
Field: [Automation and Computer Technology]
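
Illustrative sketch: the abstract does not specify the paper's data structure or its deletion rule, so the code below does not reproduce the proposed algorithm. It shows Lossy Counting instead, a well-known one-pass technique with a similar goal: approximate frequent items under a user-chosen error bound, with periodic pruning of non-frequent entries. The names `lossy_count`, `frequent_items`, `epsilon`, and `support` are assumptions made for this example.

```python
from math import ceil


def lossy_count(stream, epsilon):
    """One pass over `stream` using Lossy Counting (not the paper's method).

    Returns (counts, n), where counts maps item -> [count, delta] and
    delta is the maximum number of occurrences that may have been missed.
    """
    width = ceil(1 / epsilon)  # number of items per bucket
    counts = {}
    n = 0
    for item in stream:
        n += 1
        bucket = ceil(n / width)  # id of the current bucket
        if item in counts:
            counts[item][0] += 1
        else:
            # A new entry may have been seen and pruned in earlier buckets.
            counts[item] = [1, bucket - 1]
        if n % width == 0:
            # Bucket boundary: drop entries that cannot be frequent.
            stale = [k for k, (c, d) in counts.items() if c + d <= bucket]
            for k in stale:
                del counts[k]
    return counts, n


def frequent_items(counts, n, support, epsilon):
    """Report items whose stored count is at least (support - epsilon) * n."""
    threshold = (support - epsilon) * n
    return {k: c for k, (c, d) in counts.items() if c >= threshold}


# Hypothetical stream: two heavy items followed by 20 one-off items.
stream = ["a"] * 50 + ["b"] * 30 + [f"x{i}" for i in range(20)]
counts, n = lossy_count(stream, epsilon=0.05)
result = frequent_items(counts, n, support=0.2, epsilon=0.05)
```

In this sketch the one-off items are removed at the bucket boundary, so only the two heavy items remain in memory at the end of the pass.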