
A new peptide retention time prediction method for mass spectrometry-based proteomic analysis using a serial and parallel support vector machine model


Affiliation: College of Mechatronics Engineering and Automation, National University of Defense Technology

Source: Chinese Journal of Chromatography (《色谱》), 2012, No. 9, pp. 857-863 (7 pages)

Abstract: Online reversed-phase liquid chromatography (RPLC) separation plays an important role in large-scale, mass spectrometry-based protein identification. Retention time (RT) carries important information for peptide identification and quantification, and can serve as evidence for distinguishing true-positive from false-positive peptide identifications. However, because the organic phase of the mobile phase follows a nonlinear concentration curve over the whole LC run, and because the peptides in a sample interact with one another, sequence-based RT prediction still suffers from low accuracy and poor generalization, which limits its usefulness in validating peptide identifications. This paper proposes an RT prediction method based on a serial and parallel support vector machine (SP-SVM) that characterizes both the nonlinear change in organic phase concentration during elution and the interactions among peptides. The SP-SVM contains one support vector regression (SVR) model used only during model training (named p-SVR) and four SVM models used for RT prediction (named C-SVM, l-SVR, s-SVR and n-SVR). C-SVM first distinguishes the chromatographic behavior of each peptide; l-SVR and s-SVR are then used to predict the peptide RT specifically, which improves accuracy; finally, n-SVR normalizes the peptide RT to characterize peptide interactions. Applied to a complex-sample dataset, the method improved prediction accuracy significantly: the coefficient of determination between predicted and experimental RTs reached 0.95, the prediction error range was less than 20% of the total LC run time for more than 95% of the identified peptides, and less than 10% of the total LC run time for more than 70% of them. The authors state that this performance matches the best known to date. More importantly, the SP-SVM method provides a framework for taking the interactions among peptides in chromatographic separation into account, and its performance could be improved further by introducing new data processing and experimental strategies.
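The accuracy figures quoted in the abstract can be illustrated with a minimal sketch. It is not the authors' evaluation code. It assumes that the coefficient of determination is computed as 1 - SS_res / SS_tot and that "prediction error range" means the absolute difference between predicted and experimental RT; the abstract does not spell out either definition. The function names and the sample retention times are hypothetical.

```python
def r_squared(observed, predicted):
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    return 1.0 - ss_res / ss_tot


def fraction_within(observed, predicted, run_time, tolerance):
    """Share of peptides with |predicted - observed| < tolerance * run_time.

    Assumption: "error range" is read as the absolute prediction error.
    """
    limit = tolerance * run_time
    hits = sum(1 for o, p in zip(observed, predicted) if abs(p - o) < limit)
    return hits / len(observed)


# Made-up retention times (minutes) for a hypothetical 100-minute LC run.
observed = [10.0, 20.0, 30.0, 40.0, 50.0]
predicted = [12.0, 19.0, 45.0, 41.0, 48.0]
run_time = 100.0

print("R^2:", r_squared(observed, predicted))
print("within 10% of run time:", fraction_within(observed, predicted, run_time, 0.10))
print("within 20% of run time:", fraction_within(observed, predicted, run_time, 0.20))
```

On real data, the same two functions would be applied to the experimental and predicted RTs of all identified peptides, with `run_time` set to the total LC run time.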

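Keywords: liquid chromatography; mass spectrometry; serial and parallel support vector machine; retention time; prediction accuracy; peptide identification; proteomics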

Field: [Science]
