Document Details - Gdtheory 理论粤军网 | Guangdong Think Tank Information Platform

A Web Data Mining Technology for Small Texts and Its Application
(Original title: 针对小文本的Web数据挖掘技术及其应用)

Source: Microcomputer Information (《微计算机信息》), 2006, No. 07X, pp. 203-205 (3 pages)

Abstract: Existing search engines return too much information, much of it disorganized. To address this, the paper proposes a Web text mining technique for small texts based on an approximate-page clustering algorithm. The technique represents text features with the vector space model, builds a vocabulary from the user's degree of interest (users can initialize it as needed), and applies fuzzy clustering to that vocabulary to obtain a group of word-segmentation dictionaries. Duplicate pages are removed with the MD5 algorithm, the remaining pages are clustered with the approximate-page clustering algorithm, and the clusters are ordered with a Markov-chain-based Web access sequence mining algorithm. The result is a sequence of page clusters matching the user's interests, so the user can quickly find relevant pages. Experiments show that the algorithm greatly improves search efficiency while maintaining recall and precision. Because the mining targets small texts, the time and space complexity of the algorithms is low, so the approach is expected to become a practical and effective information retrieval technique.
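The abstract names three stages after the vocabulary is built: MD5 deduplication, clustering of the remaining pages, and Markov-based ordering of the clusters. The Python sketch below illustrates that flow under stated assumptions and is not the paper's implementation. All function names (`dedupe_by_md5`, `cluster_pages`, `rank_clusters`) are hypothetical, a greedy cosine-similarity pass stands in for the approximate-page clustering algorithm, a first-order Markov model estimated from a visit log stands in for the sequence mining step, and the fuzzy clustering of the vocabulary is omitted (the vocabulary is simply given as a set).

```python
import hashlib
import math
from collections import Counter, defaultdict


def dedupe_by_md5(pages):
    """Keep the first page for each MD5 digest of the normalized text."""
    seen, unique = set(), []
    for text in pages:
        # Assumption: normalization is lowercasing plus whitespace collapsing.
        normalized = " ".join(text.lower().split())
        digest = hashlib.md5(normalized.encode("utf-8")).hexdigest()
        if digest not in seen:
            seen.add(digest)
            unique.append(text)
    return unique


def term_vector(text, vocabulary):
    """Vector space model: term counts restricted to the interest vocabulary."""
    return Counter(w for w in text.lower().split() if w in vocabulary)


def cosine(a, b):
    """Cosine similarity of two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(
        sum(v * v for v in b.values())
    )
    return dot / norm if norm else 0.0


def cluster_pages(pages, vocabulary, threshold=0.5):
    """Greedy single pass: join the first cluster whose seed page is similar enough."""
    clusters = []  # each entry: (seed vector, list of page indices)
    for i, text in enumerate(pages):
        vec = term_vector(text, vocabulary)
        for seed, members in clusters:
            if cosine(vec, seed) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((vec, [i]))
    return [members for _, members in clusters]


def rank_clusters(n_clusters, visit_log, current):
    """Order cluster ids by the estimated probability of being visited after `current`."""
    transitions = defaultdict(Counter)
    for prev, nxt in zip(visit_log, visit_log[1:]):
        transitions[prev][nxt] += 1
    total = sum(transitions[current].values())
    probs = {
        c: (transitions[current][c] / total if total else 0.0)
        for c in range(n_clusters)
    }
    return sorted(range(n_clusters), key=lambda c: -probs[c])


# Toy data, invented purely for illustration.
pages = [
    "Python data mining tutorial",
    "python   data mining TUTORIAL",
    "data mining with python clustering",
    "football match results today",
]
vocabulary = {"python", "data", "mining", "clustering", "football", "match"}
unique = dedupe_by_md5(pages)
clusters = cluster_pages(unique, vocabulary)
order = rank_clusters(len(clusters), visit_log=[0, 1, 0, 1, 1, 0, 1], current=0)
```

Hashing catches only pages that are identical after normalization, which is why a separate similarity-based clustering stage is still needed for near-duplicates. The threshold of 0.5 is an arbitrary illustrative value.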

Keywords: intelligent search; data mining; small text; user interest

Field: [Automation and Computer Technology]
