
Near-infrared Quantitative Analysis of Fishmeal Protein Based on Random Forest
(基于随机森林的鱼粉蛋白近红外定量分析)

Authors: ; ; ; (陈福);

Affiliation: College of Science, Guilin University of Technology (桂林理工大学理学院)

Source: Transactions of the Chinese Society for Agricultural Machinery (《农业机械学报》), 2015, No. 5, pp. 233-238 (6 pages)

Abstract: Random forest (RF) regression was applied to near-infrared (NIR) spectra to determine the protein content of feed fishmeal. Because RF models are inherently random, model selection was carried out by tuning two key modeling parameters: the number of decision trees (ntree) and the number of split variables (nsv). The decrease in the Gini coefficient (G) was used as the indicator of each NIR wavelength variable's importance to the model, and informative wavelengths for fishmeal protein analysis were selected on that basis to improve the accuracy of the quantitative models. Following statistical principles, an equivalent optimal model with relatively low computational complexity was chosen. The selected RF model builds 471 decision trees and randomly draws 103 wavelength variables for node splitting as each tree grows; 52 informative NIR wavelengths were selected according to the average decrease in G before and after node splitting, computed over the trees in the forest. This equivalent optimal model gave a root mean square error (RMSEv) of 3.970% and a correlation coefficient (Rv) of 0.943 for the validation set. The model was then tested on an independent prediction set excluded from the modeling process, giving an RMSEp of 5.271% and an Rp of 0.906. The results indicate that RF regression combined with G-based wavelength selection can effectively improve the predictive ability of NIR spectroscopy for the quantitative determination of fishmeal protein.
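The workflow described in the abstract (grow ntree trees, draw nsv candidate wavelengths at each node, average each wavelength's impurity decrease over the forest, keep the top-ranked wavelengths) can be illustrated with a minimal sketch. This is not the authors' implementation. It assumes variance reduction (sum of squared errors) as the node impurity, a common choice for regression trees, standing in for the G decrease named in the abstract. The data are synthetic, the depth limit and minimum node size are arbitrary, and all function names are hypothetical.

```python
import random
from statistics import mean

def sse(ys):
    """Node impurity (assumed): sum of squared deviations from the node mean."""
    m = mean(ys)
    return sum((v - m) ** 2 for v in ys)

def best_split(X, y, rows, features):
    """Return (impurity decrease, feature, threshold) for the best split, or None."""
    parent = sse([y[i] for i in rows])
    best = None
    for f in features:
        values = sorted({X[i][f] for i in rows})
        for lo, hi in zip(values, values[1:]):
            t = (lo + hi) / 2  # midpoint between neighbouring values
            left = [y[i] for i in rows if X[i][f] <= t]
            right = [y[i] for i in rows if X[i][f] > t]
            dec = parent - sse(left) - sse(right)
            if best is None or dec > best[0]:
                best = (dec, f, t)
    return best

def grow(X, y, rows, nsv, depth, importance, rng):
    """Grow one tree; add each split's impurity decrease to `importance`."""
    if depth == 0 or len(rows) < 4:
        return mean([y[i] for i in rows])  # leaf: mean response
    features = rng.sample(range(len(X[0])), nsv)  # nsv random candidates
    split = best_split(X, y, rows, features)
    if split is None or split[0] <= 0:
        return mean([y[i] for i in rows])
    dec, f, t = split
    importance[f] += dec
    left = [i for i in rows if X[i][f] <= t]
    right = [i for i in rows if X[i][f] > t]
    return (f, t,
            grow(X, y, left, nsv, depth - 1, importance, rng),
            grow(X, y, right, nsv, depth - 1, importance, rng))

def random_forest(X, y, ntree, nsv, max_depth=3, seed=0):
    """Return the trees and the per-variable mean impurity decrease."""
    rng = random.Random(seed)
    n = len(X)
    importance = [0.0] * len(X[0])
    trees = []
    for _ in range(ntree):
        rows = [rng.randrange(n) for _ in range(n)]  # bootstrap sample
        trees.append(grow(X, y, rows, nsv, max_depth, importance, rng))
    return trees, [v / ntree for v in importance]

def predict(tree, x):
    """Follow the splits of one tree down to a leaf value."""
    while isinstance(tree, tuple):
        f, t, left, right = tree
        tree = left if x[f] <= t else right
    return tree

def forest_predict(trees, x):
    """Forest prediction: average over the individual trees."""
    return mean([predict(t, x) for t in trees])

def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    return mean([(a - p) ** 2 for a, p in zip(actual, predicted)]) ** 0.5

# Synthetic stand-in for spectra: 6 "wavelengths", only index 2 is informative.
data_rng = random.Random(1)
X = [[data_rng.random() for _ in range(6)] for _ in range(40)]
y = [10 * row[2] + data_rng.gauss(0, 0.1) for row in X]

trees, imp = random_forest(X, y, ntree=25, nsv=3)
top = sorted(range(6), key=lambda j: imp[j], reverse=True)[:2]  # keep top 2
fit_rmse = rmse(y, [forest_predict(trees, row) for row in X])
print("selected wavelengths:", top, "in-sample RMSE:", round(fit_rmse, 3))
```

In the paper's setting, ntree and nsv would be tuned (471 and 103 in the reported model) and the top 52 wavelengths retained; the small values here only keep the sketch fast.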

Keywords: fishmeal protein; near-infrared spectroscopy; random forest; Gini coefficient; wavelength selection

Fields: [Mechanical Engineering] [Science] [Agricultural Science]
