Affiliation: School of Computer Science and Technology, China University of Mining and Technology
Source: Pattern Recognition and Artificial Intelligence (《模式识别与人工智能》), 2011, No. 3, pp. 327-331 (5 pages)
Abstract: Most existing feature extraction algorithms use the variance contribution rate, computed from the eigenvalues of the sample correlation matrix, to assess the quality of the extracted features. However, the variance contribution rate reflects only the properties of those eigenvalues and does not take information measurement into account. This paper introduces Shannon information entropy into the extraction algorithm: a class probability and a class information function are defined, and the number of extracted feature dimensions is determined by a cumulative information contribution rate, so that extraction quality can be evaluated from an information-theoretic perspective. The theory is combined with factor analysis (FA) to build an entropy-based FA feature extraction algorithm, in which the number of principal factors to retain is determined by the information contribution rate. Finally, case studies verify the effectiveness of the theory.
Field: Automation and Computer Technology
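The selection rule described in the abstract can be sketched as follows. This is a minimal illustration, not the authors' exact method: the specific definitions assumed here are that the "class probability" of each factor is its eigenvalue's share of the eigenvalue sum of the sample correlation matrix, the "class information function" is the Shannon term -p·ln(p), and the number of retained factors is the smallest count whose cumulative information contribution rate reaches a chosen threshold.

```python
import numpy as np

def num_factors_by_information(X, threshold=0.85):
    """Choose how many factors to retain via an entropy-based cumulative
    information contribution rate (assumed definitions; see lead-in)."""
    # Correlation matrix of the sample (columns are variables)
    R = np.corrcoef(X, rowvar=False)
    # Eigenvalues in descending order; R is symmetric, so eigvalsh applies
    eigvals = np.sort(np.linalg.eigvalsh(R))[::-1]
    # "Class probability" (assumed): eigenvalue share of the total
    p = np.clip(eigvals / eigvals.sum(), 1e-12, None)
    # "Class information function" (assumed): Shannon term -p * ln(p)
    info = -p * np.log(p)
    # Cumulative information contribution rate over the leading factors
    cum_rate = np.cumsum(info) / info.sum()
    # Smallest k whose cumulative rate reaches the threshold
    k = int(np.searchsorted(cum_rate, threshold) + 1)
    return k, cum_rate
```

Unlike ranking by raw variance share, this weights each factor by its Shannon information term, so the retained dimension count is justified in information-theoretic rather than purely variance terms.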