
Regression and Dimension Reduction for Functional Data

Advisor: 张宝学

Discipline code: 070103

Degree: Doctorate

Author:

Institution: Northeast Normal University (东北师范大学)

Abstract: High-dimensional data are increasingly common in multivariate analysis, but they are difficult to analyze because of rising computational cost and the curse of dimensionality. Functional data analysis (FDA) assumes that the observed data are sampled from a smooth curve, which gives it distinctive advantages in handling high-dimensional data. With the development of online data collection and nonparametric techniques, FDA has become an active area of modern statistics, with wide applications in medicine, biology, microarray data, chemistry, and survival analysis. Many classical statistical methods have been extended to functional data.

This thesis consists of four parts. First, it proposes a new form of canonical correlation analysis, mixed data canonical correlation analysis (MDCCA), which studies the linear correlation between a random vector and a random function. The theoretical properties of MDCCA are also studied.

Second, functional data analysis treats an entire function as a single observation, so the dimension of each observation is infinite and dimension reduction becomes essential. This thesis focuses mainly on dimension reduction for functional data. It proposes a functional multi-index model, which treats a real-valued response as a function of a set of predictors called indices. These indices are random elements generated by a second-order stochastic process in a Hilbert space, and the index coefficients form the so-called effective dimension reduction (EDR) space. To make inference about the EDR space, three cases of the functional multi-index model are considered, each with a random curve as predictor: the response is a scalar, the response is binary, and the response is a p-dimensional random vector. The methods are applied to real data, including spectral data, coal quality analysis, and climate forecasting.

Third, the most studied functional linear model is the one with a scalar response and a functional predictor. This thesis proposes a new method, based on functional sufficient dimension reduction, for estimating the regression parameter function. The estimation procedure is described in detail and compared with the established functional principal component regression approach. A new method for estimating the EDR space that does not require the inverse of the covariance operator is also proposed.

Finally, the fourth part considers the functional transformation model. Functional linear regression has been widely used to model the relationship between a scalar response and functional predictors. If the original data do not satisfy the linearity assumption, an intuitive remedy is to transform the response so that the transformed data are linearly related to the predictors. The problem of estimating such a transformation has been rather neglected in the development of functional data analysis tools. This thesis proposes a nonparametric transformation model in which the transformation function is constructed from spline functions. The functional regression coefficients are then estimated by MDCCA, which is analogous to canonical correlation analysis between two multivariate samples but operates between a multivariate sample and a set of functional data. Here MDCCA is applied to the projection of the transformation function onto the B-spline space and to the functional predictors. The resulting estimates are shown to agree, up to a scale factor, with the regularized functional least squares estimate for the transformation model. The dimension of the spline space is determined by a model selection criterion; typically only a small number of B-spline knots is needed. Some general computational issues in functional data analysis are also discussed.
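The abstract names functional principal component regression as the established baseline for the scalar-response functional linear model, y = alpha + ∫ beta(t) X(t) dt + noise. The following is a minimal sketch of that baseline only, not of the thesis's dimension-reduction estimator. It assumes curves observed without error on a common, equally spaced grid; the function name `fpc_regression`, the simulated data, and the choice of four components are illustrative assumptions, not taken from the thesis.

```python
import numpy as np

def fpc_regression(X, y, grid, n_components):
    """Estimate (alpha, beta) in y = alpha + integral(beta * X) + noise
    by regressing y on the leading functional principal component scores.
    X has shape (n_curves, n_grid_points); grid is assumed equally spaced."""
    dt = grid[1] - grid[0]
    x_mean = X.mean(axis=0)
    Xc = X - x_mean                              # centered curves
    cov = Xc.T @ Xc / X.shape[0]                 # discretized covariance function
    # Multiplying by dt makes the matrix a Riemann approximation of the
    # covariance operator, so its eigenvalues approximate the operator's.
    evals, evecs = np.linalg.eigh(cov * dt)
    order = np.argsort(evals)[::-1][:n_components]
    phi = evecs[:, order] / np.sqrt(dt)          # eigenfunctions with unit L2 norm
    scores = Xc @ phi * dt                       # scores <X_i - mean, phi_k>
    coef, *_ = np.linalg.lstsq(scores, y - y.mean(), rcond=None)
    beta = phi @ coef                            # beta(t) = sum_k coef_k * phi_k(t)
    alpha = y.mean() - (x_mean @ beta) * dt
    return alpha, beta

# Simulated example (all values here are hypothetical).
rng = np.random.default_rng(0)
grid = np.linspace(0.0, 1.0, 101)
dt = grid[1] - grid[0]
basis = np.sqrt(2.0) * np.sin(np.pi * np.outer(np.arange(1, 5), grid))
xi = rng.normal(size=(500, 4)) / np.arange(1, 5)   # decaying score variances
X = xi @ basis
beta_true = basis[0] + 0.5 * basis[1]
y = 2.0 + X @ beta_true * dt + 0.1 * rng.normal(size=500)

alpha_hat, beta_hat = fpc_regression(X, y, grid, n_components=4)
l2_error = np.sqrt(np.sum((beta_hat - beta_true) ** 2) * dt)
print(alpha_hat, l2_error)
```

The truncation level `n_components` plays the role of the regularization parameter: the regression is carried out only in the span of the leading eigenfunctions, which avoids inverting the full covariance operator directly.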

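Keywords: functional data; sufficient dimension reduction; functional linear model; model selection; mixed canonical correlation; transformation model; effective dimension reduction space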

Classification number: [O212.4]

Field: [Science]
