Affiliation: School of Computer Science and Engineering, South China University of Technology
Source: Journal of Computer Research and Development (《计算机研究与发展》), 2004, No. 5, pp. 807-811 (5 pages)
Abstract: Classical hyperlink analysis algorithms (such as PageRank and HITS) focus on the authority of a Web page rather than on its topical relevance, so a crawler guided by them quickly drifts off topic. To address this, the paper builds a topic-correlation topology model and, on that basis, proposes the Inherit/Feedback method for Web topic mining. The key idea is that, along a search path, a node inherits topic relevance from its parent nodes and feeds its own topic relevance back to them; equivalently, a page inherits topic-specific correlation from its ancestors and receives feedback from its descendants. The method can enhance applications such as page ranking and topic-specific crawling. A topic-specific crawling algorithm based on Inherit/Feedback (IFC) is also proposed. Experiments show that IFC guides a topic-specific crawling agent effectively and can be applied to further discovery and mining of topic-specific websites.
Field: [Automation and Computer Technology]
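
The inherit-and-feedback idea described in the abstract can be illustrated with a small sketch. The abstract does not give the paper's actual formulas, so everything below is an assumption made for illustration: the function name `inherit_feedback`, the weights `inherit` and `feedback`, the use of simple averages over parents and children, and the toy link graph. It is not the IFC algorithm itself.

```python
from collections import defaultdict


def inherit_feedback(links, own_score, inherit=0.5, feedback=0.5, rounds=1):
    """Hypothetical relevance propagation over a link graph.

    links:     dict mapping each page to the list of pages it links to.
    own_score: dict mapping every page to a content-based relevance in [0, 1].
               Every page that appears in `links` must have an entry here.
    inherit:   assumed weight for relevance inherited from parent pages.
    feedback:  assumed weight for relevance fed back from child pages.
    """
    # Build the reverse index: child -> list of parents.
    parents = defaultdict(list)
    for parent, children in links.items():
        for child in children:
            parents[child].append(parent)

    score = dict(own_score)
    for _ in range(rounds):
        updated = {}
        for page in own_score:
            ps = parents.get(page, [])
            cs = links.get(page, [])
            # Average relevance of parents (inherit) and children (feedback).
            inherited = sum(score[p] for p in ps) / len(ps) if ps else 0.0
            fed_back = sum(score[c] for c in cs) / len(cs) if cs else 0.0
            updated[page] = own_score[page] + inherit * inherited + feedback * fed_back
        score = updated
    return score


# Toy graph: a weakly relevant hub linking to one relevant and one irrelevant page.
toy_links = {"hub": ["a", "b"], "a": [], "b": []}
toy_scores = {"hub": 0.2, "a": 0.9, "b": 0.1}
result = inherit_feedback(toy_links, toy_scores)
```

In this toy graph the hub's score rises because one of its children is relevant, while both children gain a little from the hub. A crawler could use such scores to order its frontier, which is the kind of guidance the abstract attributes to IFC.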