Research on Algorithms for Frequent Subgraph Mining (频繁子图挖掘算法的研究)

Advisor: Guo Jingfeng (郭景峰)

Discipline code: 081202

Degree awarded: Master's

Author:

Institution: Yanshan University (燕山大学)

Abstract: Data mining grew out of long-term research and development in database technology. It queries, traverses, and accesses a database to uncover latent relationships among the data and thereby produce useful information. Graphs can model almost any relationship between things and are widely used in web mining, spatial data mining, protein structure mining in bioinformatics, and drug molecule design and function prediction. Frequent subgraph mining is an important branch of graph mining and the basis for graph classification, graph clustering, and other graph mining research, which gives work on it broader significance. The main content of this thesis is as follows.
First, it briefly introduces the definition of data mining and the techniques in common use, concentrates on the basic concepts and definitions of frequent subgraph mining, and reviews and analyzes several classical frequent subgraph mining algorithms.
Second, it proposes GAI, a frequent subgraph mining algorithm based on the Apriori idea. GAI improves the representation of graphs, the method for testing subgraph isomorphism, and the method for computing support.
Third, building on GAI, it proposes a graph clustering algorithm and explains how to construct the feature set and how to cluster.
Finally, experiments show that GAI has a clear advantage in running time and that it mines all frequent subgraph patterns, which verifies the correctness of the algorithm. GAI is more effective than the earlier algorithms AGM and FSG.
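The abstract does not describe GAI's internals, so the following is not GAI itself. It is a minimal Python sketch of the general Apriori-style, level-wise idea the abstract refers to: grow candidate subgraphs one edge at a time from patterns already known to be frequent, and count support as the number of database graphs that contain the pattern. To keep it short it assumes vertex labels are unique within each graph, so that a graph is just a set of edges and the subgraph isomorphism test reduces to a subset test. The names `edge`, `support`, and `mine_frequent_subgraphs` are hypothetical and chosen only for this illustration.

```python
from itertools import chain


def edge(u, v):
    """Undirected edge between two uniquely labelled vertices."""
    return frozenset((u, v))


def support(pattern, graphs):
    """Number of database graphs that contain every edge of `pattern`."""
    return sum(1 for g in graphs if pattern <= g)


def mine_frequent_subgraphs(graphs, min_support):
    """Level-wise (Apriori-style) search for frequent connected subgraphs.

    Simplifying assumption (not from the thesis): vertex labels are unique,
    so each graph is a frozenset of edges and containment is a subset test.
    General labelled graphs would need a real subgraph isomorphism check.
    """
    all_edges = set(chain.from_iterable(graphs))
    frequent_edges = {
        e for e in all_edges
        if support(frozenset([e]), graphs) >= min_support
    }
    # Level 1: every frequent single-edge pattern.
    level = {frozenset([e]) for e in frequent_edges}
    result = {}
    while level:
        for pattern in level:
            result[pattern] = support(pattern, graphs)
        candidates = set()
        for pattern in level:
            vertices = set(chain.from_iterable(pattern))
            for e in frequent_edges:
                # Extend by one frequent edge that touches the pattern,
                # so the candidate stays connected.
                if e not in pattern and e & vertices:
                    candidates.add(pattern | {e})
        # Keep only candidates that meet the support threshold.
        level = {c for c in candidates if support(c, graphs) >= min_support}
    return result


graphs = [
    frozenset({edge("A", "B"), edge("B", "C"), edge("C", "D")}),
    frozenset({edge("A", "B"), edge("B", "C")}),
    frozenset({edge("A", "B"), edge("C", "D")}),
]
frequent = mine_frequent_subgraphs(graphs, min_support=2)
for pattern, count in frequent.items():
    print(sorted(sorted(e) for e in pattern), count)
```

With a support threshold of 2, the three single edges and the two-edge path A-B-C are reported as frequent, while B-C-D is discarded because only one graph contains it. Because support can only shrink as a pattern grows, extending frequent patterns alone is enough to reach every frequent connected subgraph in this simplified setting.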

Keywords: data mining; frequent subgraph; graph clustering

Classification number: [TP301.6]

Field: [Automation and Computer Technology]
