2024-03-28T10:59:35Z
https://tsukuba.repo.nii.ac.jp/oai
oai:tsukuba.repo.nii.ac.jp:00027801
2022-04-27T08:55:47Z
117:1697
117:786
3:62:5601:1659
Two-Stage Procedures for High-Dimensional Data
青嶋, 誠
矢田, 和善
Aoshima, Makoto
Yata, Kazuyoshi
Asymptotic normality
Classification
Confidence region
HDLSS
Lasso
Pathway analysis
Regression
Sample size determination
Testing equality of covariance matrices
Two-sample test
Variable selection
In this article, we consider a variety of inference problems for high-dimensional data. The purpose of this article is to suggest directions for future research and possible solutions about p ≫ n problems by using new types of two-stage estimation methodologies. This is the first attempt to apply sequential analysis to high-dimensional statistical inference ensuring prespecified accuracy. We offer the sample size determination for inference problems by creating new types of multivariate two-stage procedures. To develop theory and methodologies, the most important and basic idea is the asymptotic normality when p → ∞. By developing asymptotic normality when p → ∞, we first give (a) a given-bandwidth confidence region for the square loss. In addition, we give (b) a two-sample test to assure prespecified size and power simultaneously together with (c) an equality-test procedure for two covariance matrices. We also give (d) a two-stage discriminant procedure that controls misclassification rates being no more than a prespecified value. Moreover, we propose (e) a two-stage variable selection procedure that provides screening of variables in the first stage and selects a significant set of associated variables from among a set of candidate variables in the second stage. Following the variable selection procedure, we consider (f) variable selection for high-dimensional regression to compare favorably with the lasso in terms of the assurance of accuracy and the computational cost. Further, we consider variable selection for classification and propose (g) a two-stage discriminant procedure after screening some variables. Finally, we consider (h) pathway analysis for high-dimensional data by constructing a multiple test of correlation coefficients.
Editor's Special Invited Paper.
This invited paper was awarded the Abraham Wald Prize in Sequential Analysis 2012.
journal article
Taylor & Francis
2011-11
application/pdf
application/pdf
Sequential analysis
4
30
356
399
http://hdl.handle.net/2241/117753
0747-4946
AA10538981
https://tsukuba.repo.nii.ac.jp/record/27801/files/SA_30-4-432.pdf
https://tsukuba.repo.nii.ac.jp/record/27801/files/SA_30-4-356.pdf
eng
10.1080/07474946.2011.619088
http://hdl.handle.net/2241/117756
© Taylor & Francis Group, LLC.
This is an Author's Accepted Manuscript of an article published in Sequential Analysis, Nov 2011, available online at: http://www.tandfonline.com/doi/full/10.1080/07474946.2011.619088