Institution: Foshan University
Source: Computer & Digital Engineering, 2013, No. 11, pp. 1725-1728 (4 pages)
Abstract: Building on meta-learning for heterogeneous classifier ensembles within the Stacking framework, this paper proposes a modified Stacking ensemble algorithm based on cluster analysis, which applies unsupervised K-means clustering to the classification process. Training instances are first classified by all base classifiers, and the classification results are then grouped into a number of clusters, so that each cluster contains instances that were classified, correctly or incorrectly, into the same class by the same group of base classifiers. The instances' feature attributes are also used in the clustering process to improve clustering quality. The C4.5 algorithm is then applied within each cluster to build a decision tree that refines the decision boundaries by learning the subgroups within the cluster. To classify a new instance, the method finds the cluster closest to it and uses that cluster's decision tree to make the final decision. Experimental results show that the proposed method outperforms individual classifiers, majority voting, and the classic Stacking method in classification accuracy.
Field: [Automation and Computer Technology]
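Illustration: the procedure outlined in the abstract (base classifiers, clustering of their outputs together with the features, one decision tree per cluster, nearest-cluster lookup at prediction time) can be sketched in plain Python as follows. This is only a minimal sketch under several assumptions that do not come from the paper: the two base classifiers (nearest centroid and k-NN) are stand-ins, the tree is a simplified entropy-based tree rather than full C4.5 (no gain ratio, no pruning), base predictions are made on the training set itself without cross-validation, the per-cluster trees are trained on the combined vector of base predictions and features, and all function names are hypothetical.

```python
import math
import random
from collections import Counter

def dist2(a, b):
    return sum((p - q) ** 2 for p, q in zip(a, b))

def mean(points):
    return [sum(col) / len(points) for col in zip(*points)]

# Two simple heterogeneous base classifiers (stand-ins, not from the paper).
def fit_centroid(X, y):
    cents = {c: mean([x for x, t in zip(X, y) if t == c]) for c in sorted(set(y))}
    return lambda x: min(cents, key=lambda c: dist2(x, cents[c]))

def fit_knn(X, y, k=3):
    def predict(x):
        near = sorted(range(len(X)), key=lambda i: dist2(x, X[i]))[:k]
        return Counter(y[i] for i in near).most_common(1)[0][0]
    return predict

def entropy(labels):
    n = len(labels)
    return -sum(c / n * math.log2(c / n) for c in Counter(labels).values())

# Simplified entropy-based tree: a leaf is a label, a node is (feature, threshold, left, right).
def fit_tree(X, y, depth=3):
    majority = Counter(y).most_common(1)[0][0]
    if depth == 0 or len(set(y)) == 1:
        return majority
    best = None
    for j in range(len(X[0])):
        for t in sorted(set(x[j] for x in X))[:-1]:
            left = [lab for x, lab in zip(X, y) if x[j] <= t]
            right = [lab for x, lab in zip(X, y) if x[j] > t]
            cost = len(left) * entropy(left) + len(right) * entropy(right)
            if best is None or cost < best[0]:
                best = (cost, j, t)
    if best is None:  # all rows identical, nothing to split on
        return majority
    _, j, t = best
    lo = [(x, lab) for x, lab in zip(X, y) if x[j] <= t]
    hi = [(x, lab) for x, lab in zip(X, y) if x[j] > t]
    return (j, t, fit_tree(*zip(*lo), depth - 1), fit_tree(*zip(*hi), depth - 1))

def tree_predict(node, x):
    while isinstance(node, tuple):
        j, t, left, right = node
        node = left if x[j] <= t else right
    return node

def nearest(centers, p):
    return min(range(len(centers)), key=lambda i: dist2(p, centers[i]))

def kmeans(points, k, iters=20, seed=0):
    centers = random.Random(seed).sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[nearest(centers, p)].append(p)
        # Keep the old center if a cluster received no points.
        centers = [mean(g) if g else c for g, c in zip(groups, centers)]
    return centers

def fit_cluster_stacking(X, y, k=2):
    bases = [fit_centroid(X, y), fit_knn(X, y)]
    # Meta vector: base-classifier outputs followed by the original features.
    meta = lambda x: [b(x) for b in bases] + list(x)
    Z = [meta(x) for x in X]
    centers = kmeans(Z, k)
    fallback = Counter(y).most_common(1)[0][0]
    trees = []
    for c in range(k):
        members = [(z, t) for z, t in zip(Z, y) if nearest(centers, z) == c]
        trees.append(fit_tree(*zip(*members)) if members else fallback)
    # Prediction: find the closest cluster, then use that cluster's tree.
    return lambda x: tree_predict(trees[nearest(centers, meta(x))], meta(x))

# Toy data: two well-separated groups with integer class labels.
X = [[0.0, 0.1], [0.2, 0.0], [0.1, 0.3], [0.3, 0.2],
     [2.0, 2.1], [2.2, 2.0], [2.1, 2.3], [2.3, 2.2]]
y = [0, 0, 0, 0, 1, 1, 1, 1]
model = fit_cluster_stacking(X, y)
print(model([0.1, 0.1]), model([2.1, 2.1]))
```

Encoding the base predictions as numeric class indices is a shortcut that suits this toy data; with more than two classes a one-hot encoding of each base output would likely be a better input to K-means.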