Affiliation: School of Medical Information Engineering, Guangdong Pharmaceutical University (广东药学院医药信息工程学院)
Source: Journal of Biomedical Engineering (《生物医学工程学杂志》), 2011, No. 6, pp. 1213-1216 (4 pages)
Abstract: Gene expression profile data have few samples, high dimensionality, and heavy noise, so dimensionality reduction is essential. Because such data take the form of high-dimensional nonlinear vectors, traditional dimensionality reduction methods fail to project high-dimensional data of low intrinsic dimensionality into a low-dimensional space. This paper therefore applies a locally linear embedding (LLE) algorithm with an improved distance measure to reduce the dimensionality. The original LLE method is highly sensitive to the number of nearest neighbors. To make the algorithm more robust to this parameter, the paper proposes an improved distance for measuring the separation between sample points, which reduces the effect of unevenly distributed samples on the algorithm. Experimental results show that LLE with the improved distance effectively extracts features relevant to classification and greatly reduces the dimensionality of gene data while maintaining high classification accuracy.
Keywords: locally linear embedding; gene data classification; feature extraction; improved distance
Field: [Automation and Computer Technology]
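
The abstract describes LLE with a modified neighbor distance but does not state the formula. The sketch below is only an illustration under a loud assumption: it rescales the Euclidean distance between samples i and j by sqrt(M_i * M_j), where M_i is the mean distance from sample i to all other samples. That density-normalized form appears in the LLE literature, but nothing in this record confirms it is the distance used in the paper. The function names and default parameters are hypothetical.

```python
import numpy as np


def adjusted_distances(X):
    """Pairwise Euclidean distances rescaled by local sparsity.

    ASSUMPTION: d'(i, j) = d(i, j) / sqrt(M_i * M_j), with M_i the mean
    distance from sample i to every other sample. The record does not
    give the paper's actual formula.
    """
    sq = (X ** 2).sum(axis=1)
    d2 = sq[:, None] + sq[None, :] - 2.0 * (X @ X.T)
    D = np.sqrt(np.maximum(d2, 0.0))      # clip tiny negative round-off
    np.fill_diagonal(D, 0.0)
    n = X.shape[0]
    mean_dist = D.sum(axis=1) / (n - 1)   # diagonal is zero, so this averages the others
    return D / np.sqrt(np.outer(mean_dist, mean_dist))


def lle_adjusted(X, n_neighbors=5, n_components=2, reg=1e-3):
    """Standard LLE, except neighbors are chosen with adjusted_distances."""
    n = X.shape[0]
    D = adjusted_distances(X)
    np.fill_diagonal(D, np.inf)           # a sample is never its own neighbor
    neighbors = np.argsort(D, axis=1)[:, :n_neighbors]

    # Step 2: weights that best reconstruct each sample from its neighbors.
    W = np.zeros((n, n))
    for i in range(n):
        Z = X[neighbors[i]] - X[i]        # neighbors centered on sample i
        C = Z @ Z.T                       # local Gram matrix
        C += (reg * np.trace(C) + 1e-12) * np.eye(n_neighbors)  # regularize
        w = np.linalg.solve(C, np.ones(n_neighbors))
        W[i, neighbors[i]] = w / w.sum()  # weights sum to one

    # Step 3: bottom eigenvectors of (I - W)^T (I - W) give the embedding.
    I_W = np.eye(n) - W
    M = I_W.T @ I_W
    _, eigvecs = np.linalg.eigh(M)        # eigenvalues in ascending order
    return eigvecs[:, 1:n_components + 1] # skip the constant eigenvector
```

Calling `lle_adjusted(X, n_neighbors=6, n_components=2)` on a samples-by-genes matrix `X` returns one low-dimensional row per sample, which could then be passed to a classifier. Dividing by the mean distances shrinks separations in sparse regions and stretches them in dense ones, which is one way to make neighbor selection less dependent on uneven sampling.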