Affiliation: School of Computers, Guangdong University of Technology
Source: Computer Science (《计算机科学》), 2019, Issue 6, pp. 64-68 (5 pages)
Abstract: With the growth of big data applications, multi-type relational data sampled from nonlinear manifolds keep increasing in size, their geometric structure becomes more complex, and the heterogeneous relational data become extremely sparse. As a result, data mining becomes harder and less accurate. To address these problems, this paper proposes a manifold nonnegative matrix tri-factorization (MNMTF) method for co-clustering multi-type relational data. First, for the smaller-scale entity type, a correlation matrix is constructed from the entities' natural relationships or content relevance, and decomposing it yields the cluster indicator matrix for that entity type, which serves as input to the nonnegative matrix tri-factorization. Then, manifold regularization is added to fast nonnegative matrix tri-factorization (FNMTF) so that inter-type and intra-type relationships are clustered jointly, further improving clustering accuracy. Experiments show that MNMTF outperforms traditional co-clustering algorithms based on nonnegative matrix factorization in both accuracy and overall performance.
Keywords: multi-type relational data; manifold regularization; nonnegative matrix factorization; correlation matrix
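The idea summarized in the abstract, tri-factorizing a relation matrix while a manifold (graph) regularizer keeps neighbouring entities in the same cluster, can be illustrated with a small sketch. This is not the paper's MNMTF/FNMTF algorithm: it swaps in standard multiplicative updates for the objective ||R - G1 S G2^T||_F^2 + lam * tr(G1^T L1 G1) + lam * tr(G2^T L2 G2), where L = D - W is a graph Laplacian. It also builds k-nearest-neighbour graphs directly from the rows and columns of R instead of initializing from a decomposed correlation matrix. The names `knn_graph`, `graph_nmtf`, `lam`, and `n_neighbors` are hypothetical and chosen only for illustration.

```python
import numpy as np

def knn_graph(X, k=3):
    """Symmetric 0/1 k-nearest-neighbour affinity matrix over the rows of X."""
    X = np.asarray(X, dtype=float)
    sq = np.sum(X ** 2, axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)  # squared distances
    np.fill_diagonal(d2, np.inf)                      # exclude self-neighbours
    idx = np.argsort(d2, axis=1)[:, :k]
    W = np.zeros_like(d2)
    W[np.arange(X.shape[0])[:, None], idx] = 1.0
    return np.maximum(W, W.T)                         # symmetrize

def graph_nmtf(R, k1, k2, lam=0.1, n_neighbors=3, n_iter=200, seed=0, eps=1e-9):
    """Graph-regularized NMTF via multiplicative updates (illustrative sketch).

    Returns row factor G1, core S, column factor G2, and the reconstruction
    error before the first update and after each iteration.
    """
    R = np.asarray(R, dtype=float)
    rng = np.random.default_rng(seed)
    n, m = R.shape
    G1 = rng.random((n, k1)) + eps
    G2 = rng.random((m, k2)) + eps
    S = rng.random((k1, k2)) + eps
    # Assumption: intra-type structure is approximated by kNN graphs on R.
    W1 = knn_graph(R, n_neighbors)
    D1 = np.diag(W1.sum(axis=1))
    W2 = knn_graph(R.T, n_neighbors)
    D2 = np.diag(W2.sum(axis=1))
    errors = [np.linalg.norm(R - G1 @ S @ G2.T)]
    for _ in range(n_iter):
        # Each update multiplies by (negative gradient part) / (positive part),
        # which keeps every factor nonnegative.
        S *= (G1.T @ R @ G2) / (G1.T @ G1 @ S @ G2.T @ G2 + eps)
        G1 *= (R @ G2 @ S.T + lam * (W1 @ G1)) / (
            G1 @ S @ G2.T @ G2 @ S.T + lam * (D1 @ G1) + eps)
        G2 *= (R.T @ G1 @ S + lam * (W2 @ G2)) / (
            G2 @ S.T @ G1.T @ G1 @ S + lam * (D2 @ G2) + eps)
        errors.append(np.linalg.norm(R - G1 @ S @ G2.T))
    return G1, S, G2, errors

# Toy relation matrix with two row/column blocks plus small noise.
R_demo = np.kron(np.eye(2), np.ones((6, 5)))
R_demo = R_demo + 0.01 * np.random.default_rng(1).random((12, 10))
G1_demo, S_demo, G2_demo, err_demo = graph_nmtf(R_demo, k1=2, k2=2)
row_labels = G1_demo.argmax(axis=1)  # cluster label per row entity
col_labels = G2_demo.argmax(axis=1)  # cluster label per column entity
```

Taking the arg-max of each row of `G1` and `G2` turns the soft factors into hard cluster labels for the two entity types. Setting `lam=0` reduces the sketch to plain NMTF, which gives a simple baseline for checking what the graph term contributes.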