Chinese Paragraph Similarity Computation Based on Weighted Bipartite Graph Matching (基于加权二部图匹配的中文段落相似度计算)

Source: Computer Engineering and Applications (《计算机工程与应用》), 2017, No. 18, pp. 95-101 (7 pages)

Abstract: Word-frequency methods, typified by the vector space model (VSM), give low accuracy when computing the similarity of Chinese paragraphs. To improve on them, this paper proposes a method for computing Chinese paragraph similarity based on weighted bipartite graph matching. The computation is split into two levels, paragraph and sentence; a sentence is treated as a simple paragraph, so its similarity is also computed by bipartite graph matching. First, a sentence backbone-word extraction algorithm extracts the backbone words of each sentence. These words serve as the vertices of a bipartite graph, the similarity between backbone words serves as the edge weights, and sentence similarity is computed from the matching. Second, sentences serve as the vertices of a weighted bipartite graph, the similarity between sentences serves as the edge weights, and paragraph similarity is computed in the same way. Experimental results show that, compared with VSM, the method is considerably more accurate, because it identifies synonyms accurately and automatically matches similar words that appear at different positions in the two paragraphs.
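The two-level matching described in the abstract can be illustrated with a minimal Python sketch. It is not the paper's implementation: the abstract does not give the word-similarity measure, the backbone-word extraction algorithm, or the normalization, so the synonym table `SYNONYMS`, the score 0.8 for synonyms, the division by the larger vertex count, and all function names are assumptions made for illustration. Sentences are assumed to be already segmented and reduced to backbone words, and English toy words stand in for Chinese ones.

```python
from itertools import permutations

# Hypothetical synonym groups standing in for a real word-similarity resource.
SYNONYMS = [{"car", "automobile"}, {"fast", "quick"}, {"buy", "purchase"}]


def word_sim(a, b):
    """Toy word similarity: 1.0 if identical, 0.8 if listed synonyms, else 0.0."""
    if a == b:
        return 1.0
    if any(a in group and b in group for group in SYNONYMS):
        return 0.8
    return 0.0


def best_matching_weight(left, right, sim):
    """Maximum total weight of a one-to-one matching between two vertex sets.

    Brute force over permutations, so only suitable for a handful of
    vertices. `sim` is assumed to be symmetric.
    """
    if not left or not right:
        return 0.0
    small, large = (left, right) if len(left) <= len(right) else (right, left)
    return max(
        sum(sim(s, l) for s, l in zip(small, perm))
        for perm in permutations(large, len(small))
    )


def match_sim(left, right, sim):
    """Assumed normalization: matching weight divided by the larger set size."""
    if not left or not right:
        return 0.0
    return best_matching_weight(left, right, sim) / max(len(left), len(right))


def sentence_sim(s1, s2):
    """Sentence level: vertices are backbone words, edge weights are word_sim."""
    return match_sim(s1, s2, word_sim)


def paragraph_sim(p1, p2):
    """Paragraph level: vertices are sentences, edge weights are sentence_sim."""
    return match_sim(p1, p2, sentence_sim)


# Each paragraph is a list of sentences; each sentence is a list of backbone words.
p1 = [["buy", "fast", "car"], ["weather", "good"]]
p2 = [["good", "weather"], ["purchase", "quick", "automobile"]]
print(round(paragraph_sim(p1, p2), 3))
```

In this toy input the two paragraphs share no sentence in the same position and one sentence pair shares no literal word, yet the matching pairs the synonymous sentences and the reordered ones, which is the behavior the abstract credits for the accuracy gain over VSM. For realistic paragraph sizes the brute-force search would need to be replaced by a polynomial-time assignment algorithm such as the Hungarian method.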

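Keywords: paragraph similarity; sentence backbone extraction; bipartite graph matching; vector space model; Chinese word segmentation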
