Institution: Sichuan University
Source: 《管理科学学报》 (Journal of Management Sciences in China), 2015, No. 3, pp. 114-126 (13 pages)
Abstract: Real-world bank customer credit scoring data often contain many missing values, which substantially degrade the performance of credit scoring models. To address the shortcomings of existing models, this paper proposes a dynamic classifier ensemble selection model for missing data (DCESM). The model makes full use of the known information in the dataset and does not require the missing values to be pre-processed before the credit scoring model is trained, which reduces its dependence on assumptions about the missing-data mechanism and the data distribution model. Two credit scoring datasets on bank credit card business from the UCI database were selected for empirical analysis. The results show that DCESM achieves better customer credit scoring performance than four commonly used imputation-based multi-classifier ensemble models and one ensemble model that models missing data directly.
Field: [Economics and Management]