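Advisor: 刘云生 (Liu Yunsheng)
Discipline: H1202
Degree awarded: Doctorate
Author:
Institution: 华中科技大学 (Huazhong University of Science and Technology)
Abstract: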
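In knowledge discovery, the data sets to be processed are sometimes noisy or incomplete, so theories and methods that can handle imprecise and uncertain data are needed. Rough set theory is a new mathematical tool that meets this requirement. Knowledge discovery based on rough sets is the process of using rough set theory and methods to mine novel, useful, non-trivial patterns from data. Centering on knowledge reduction as the core research problem, this dissertation studies knowledge reduction in depth from the perspectives of the discernibility matrix, heuristic information, and database systems. Rough sets are also introduced into vague objective information systems, and the knowledge reduction problem for such systems is discussed. The main contributions are as follows:
(1) Existing discernibility matrices apply only to consistent or partially consistent decision tables and do not yield correct results for completely inconsistent decision tables. An improved knowledge reduction algorithm based on the discernibility matrix is given.
(2) By letting equivalence classes rather than individual elements participate in constructing the discernibility matrix, a simplified discernibility matrix for algebraic reduction is obtained. The computation of core attributes for algebraic reduction and conditional information entropy reduction is discussed from the perspective of the discernibility matrix, and it is pointed out that the core attributes of algebraic reduction form a subset of the core attributes of information entropy reduction. It is proved that distribution consistent sets and assignment consistent sets must be algebraic consistent sets. However, there is no necessary inclusion relationship, in either direction, between algebraic reduction and distribution or assignment reduction; concrete examples are used to analyze and identify the cause of this result. Based on the idea that equivalent discernibility matrices have the same knowledge reductions and core attributes, the discernibility matrices corresponding to the various knowledge reductions are rewritten in a unified form, their inconsistency and intrinsic connections are analyzed, and a new method is given for converting distribution or assignment reduction into algebraic reduction, and distribution reduction into assignment reduction.
(3) A new approximation quality and a corresponding heuristic reduction algorithm are proposed. An analysis of attribute significance based on the positive region shows that equivalence classes in the universe that are correctly classified by the decision attribute, and equivalence classes made up entirely of contradictory objects, have no effect on attribute significance. They can therefore be deleted step by step, which reduces the search space of the reduction process. The dissertation also gives, based on the new approximation quality, an attribute significance ...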
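For readers unfamiliar with the terms in item (3), the following is a minimal sketch of the classical positive region, approximation quality, and positive-region-based attribute significance from standard rough set theory. It is not the new approximation quality proposed in the dissertation, whose definition is not given in this record. The function names and the toy decision table are assumptions made for illustration only.

```python
from collections import defaultdict


def partition(rows, attrs):
    """Group object indices into equivalence classes by their values on attrs."""
    blocks = defaultdict(list)
    for idx, row in enumerate(rows):
        blocks[tuple(row[a] for a in attrs)].append(idx)
    return list(blocks.values())


def positive_region(rows, cond_attrs, dec_attr):
    """Objects whose condition equivalence class carries a single decision value."""
    pos = set()
    for block in partition(rows, cond_attrs):
        if len({rows[i][dec_attr] for i in block}) == 1:
            pos.update(block)
    return pos


def approximation_quality(rows, cond_attrs, dec_attr):
    """Classical quality: fraction of objects lying in the positive region."""
    return len(positive_region(rows, cond_attrs, dec_attr)) / len(rows)


def significance(rows, cond_attrs, attr, dec_attr):
    """Drop in approximation quality when attr is removed from cond_attrs."""
    rest = [a for a in cond_attrs if a != attr]
    return (approximation_quality(rows, cond_attrs, dec_attr)
            - approximation_quality(rows, rest, dec_attr))


# Hypothetical decision table; objects 3 and 4 contradict each other.
table = [
    {"a": 0, "b": 0, "d": "no"},
    {"a": 0, "b": 1, "d": "yes"},
    {"a": 1, "b": 0, "d": "no"},
    {"a": 1, "b": 1, "d": "yes"},
    {"a": 1, "b": 1, "d": "no"},
]

print(sorted(positive_region(table, ["a", "b"], "d")))
print(significance(table, ["a", "b"], "a", "d"))
print(significance(table, ["a", "b"], "b", "d"))
```

In this toy table the class containing objects 3 and 4 consists entirely of contradictory objects and never enters the positive region, whichever attribute is removed. That is the kind of equivalence class item (3) describes as having no effect on attribute significance.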
In knowledge discovery, theories and methods are needed that can deal with the imprecise and uncertain information caused by noise or incompleteness. Rough set theory is a novel mathematical tool for handling such uncertain information. Knowledge discovery based on rough set theory is the process of finding new, applicable, and non-trivial patterns using rough set theory and methods. As one of the fundamental topics in rough set theory, knowledge reduction is studied intensively here in terms of discernibility matrices, heuristic information algorithms, and database systems. Knowledge reduction in vague objective information systems is also discussed by introducing rough set theory.
Because existing discernibility matrices apply only to consistent or partially consistent decision tables, they cannot produce correct reduction results for completely inconsistent ones. An improved knowledge reduction algorithm based on the discernibility matrix is proposed.
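As background for the paragraph above, here is a minimal sketch of the classical textbook discernibility matrix for a decision table, together with core extraction from its singleton entries. It is not the improved algorithm proposed in the dissertation, which this record does not spell out. The function names and the toy table are assumptions made for illustration only.

```python
from itertools import combinations


def discernibility_matrix(rows, cond_attrs, dec_attr):
    """Map each pair (i, j), i < j, with different decision values
    to the set of condition attributes on which the two objects differ."""
    matrix = {}
    for i, j in combinations(range(len(rows)), 2):
        if rows[i][dec_attr] != rows[j][dec_attr]:
            matrix[(i, j)] = frozenset(
                a for a in cond_attrs if rows[i][a] != rows[j][a]
            )
    return matrix


def core(matrix):
    """Core attributes: those that occur as singleton entries of the matrix."""
    return {next(iter(entry)) for entry in matrix.values() if len(entry) == 1}


# Hypothetical consistent decision table with condition attributes a, b, c.
table = [
    {"a": 0, "b": 0, "c": 0, "d": "no"},
    {"a": 0, "b": 1, "c": 0, "d": "yes"},
    {"a": 1, "b": 1, "c": 1, "d": "yes"},
    {"a": 1, "b": 0, "c": 1, "d": "no"},
]

matrix = discernibility_matrix(table, ["a", "b", "c"], "d")
for pair, entry in sorted(matrix.items()):
    print(pair, sorted(entry))
print("core:", core(matrix))
```

In this classical construction, two objects that agree on every condition attribute but differ in their decision produce an empty entry. Such entries appear only in inconsistent decision tables, which is the setting where the paragraph above says existing matrices give incorrect results.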
A simplified discernibility matrix can be constructed from equivalence classes instead of single elements. The calculation of core attributes for algebraic and conditional information entropy reductions is discussed from the viewpoint of the discernibility matrix. It is pointed out that the algebraic core of a decision table is a subset of its information core. It is proved that distribution and assignment consistent sets must be algebraic consistent sets. However, not every algebraic reduction has a corresponding distribution or assignment reduction that includes it, as illustrated by numerical examples. Based on the idea that equivalent discernibility matrices have the same attribute reductions and core, the existing discernibility matrices for distribution, maximum distribution, assignment, and algebraic reductions are rewritten in a uniform form. The inconsistency and intrinsic relations between the different types of knowledge reduction are analyzed for inconsistent decision tables. New methods of transforming distribution or assignment reductions into algebraic reduction, distribution r
Field: [Automation and Computer Technology] [Science]