机构地区: 中国科学院计算技术研究所智能信息处理重点实验室
出 处: 《计算机学报》 2010年第4期666-671,共6页
摘 要: 分类超曲面算法是一种简单的基于覆盖的分类算法.实验证明该算法具有分类正确率高、速度快的优点.但是,关于该算法的相关理论问题需要深入研究.文中对该算法的几个相关理论问题进行了研究.首先给出并证明了在分割的最大层数给定时算法假设空间的VC维,在此基础上结合可能近似正确(Probably Approximately Correct,PAC)学习框架,得出了对算法样本复杂度的估计,使得分类超曲面算法保证可PAC学习到任意目标概念.其次,分析了算法的时间复杂度和空间复杂度.最后,给出了无矛盾样本集的概念,并证明当输入样本集是有限无矛盾样本集的条件下,算法一定是收敛的. Hyper Surface Classification (HSC) is a simple covering based classification algorithm.Experiments show that HSC can efficiently and accurately classify large-sized data in two-dimensional space and three-dimensional space.However,little research has been done on theoretical problems in HSC.This paper studies several theoretical problems in HSC.First,the paper shows that given the biggest dividing level l,the VC dimension of the hypothesis space is d2l.Under the PAC theory,it arrives at the conclusion on sample complexity,the algorithm probably learn a hypothesis that is approximately correct.Then,the time complexity and space complexity are analyzed.Finally,sample set without contradiction is defined,and show that if the inputting sample set is a finite sample set without contradiction,the algorithm must be convergent.
领 域: [自动化与计算机技术] [自动化与计算机技术]