“Confidence interval of R-square”, but which one?

In linear regression, the confidence interval (CI) of the population mean of the DV is narrower than that of a predicted DV value. Assuming the model generalizes, the CI of \tilde{Y}_{\left[1\times1\right]} at x_{\left[1\times p\right]} is

\;\hat{Y}\pm\left(x\left(X^{\tau}X_{\left[N\times p\right]}\right)^{-1}x^{\tau}\right)^{\frac{1}{2}}\hat{\sigma}t_{\frac{\alpha}{2},N-p},

while the CI of Y\left(x\right)=\tilde{Y}\left(x\right)+\varepsilon (the prediction interval) is

\;\hat{Y}\pm\left(1+x\left(X^{\tau}X_{\left[N\times p\right]}\right)^{-1}x^{\tau}\right)^{\frac{1}{2}}\hat{\sigma}t_{\frac{\alpha}{2},N-p}.

The pivotal quantities behind the two intervals are quite similar, as follows.

\;\frac{\hat{Y}-\tilde{Y}}{s_{\hat{Y}}}\sim t_{df=N-p} ,

so \tilde{Y}_{critical}=\hat{Y}-s_{\hat{Y}}\times t_{critical} .

\;\frac{\hat{Y}-Y}{s_{\left(\hat{Y}-Y\right)}}\sim t_{df=N-p} ,

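so Y_{critical}=\hat{Y}-s_{\left(\hat{Y}-Y\right)}\times t_{critical}=\hat{Y}-s_{\left(\hat{Y}-\tilde{Y}-\varepsilon\right)}\times t_{critical} .

Because \varepsilon is independent of \hat{Y}, the variances add: s_{\left(\hat{Y}-Y\right)}^{2}=s_{\hat{Y}}^{2}+\hat{\sigma}^{2}. This is where the extra 1 under the square root of the second interval comes from.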
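The gap between the two intervals can be checked numerically. Below is a minimal sketch for the simplest case, one IV plus an intercept (p = 2), where x\left(X^{\tau}X\right)^{-1}x^{\tau} reduces to 1/N+(x_{0}-\bar{x})^{2}/S_{xx}. The function name `intervals_at`, the toy data, and the hard-coded t quantile are assumptions made for illustration only; in practice the quantile would come from a statistics library.

```python
import math

def intervals_at(xs, ys, x0, t_crit):
    """Return (fit, ci_half, pi_half) at x0 for the model y = b0 + b1 * x.

    ci_half is the half-width of the CI of the population mean at x0;
    pi_half is the half-width of the interval for a new observation at x0.
    """
    n = len(xs)
    p = 2  # intercept and slope
    x_bar = sum(xs) / n
    y_bar = sum(ys) / n
    sxx = sum((x - x_bar) ** 2 for x in xs)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    b1 = sxy / sxx
    b0 = y_bar - b1 * x_bar
    rss = sum((y - (b0 + b1 * x)) ** 2 for x, y in zip(xs, ys))
    sigma_hat = math.sqrt(rss / (n - p))
    # x (X'X)^{-1} x' for the design row x = (1, x0)
    h = 1 / n + (x0 - x_bar) ** 2 / sxx
    fit = b0 + b1 * x0
    ci_half = t_crit * sigma_hat * math.sqrt(h)
    pi_half = t_crit * sigma_hat * math.sqrt(1 + h)
    return fit, ci_half, pi_half

# Made-up data, for illustration only.
xs = [1, 2, 3, 4, 5, 6, 7, 8, 9, 10]
ys = [1.2, 1.9, 3.2, 3.8, 5.1, 6.1, 6.8, 8.3, 8.9, 10.2]
# 2.306 is the 97.5% quantile of t with N - p = 8 degrees of freedom.
fit, ci_half, pi_half = intervals_at(xs, ys, x0=5.5, t_crit=2.306)
print(fit, ci_half, pi_half)
```

At x_{0}=\bar{x} the ratio of the two half-widths is \sqrt{(1/N)/(1+1/N)}, so the interval for a new observation is wider whatever the data.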

R^{2} of linear regression is the point estimate of \eta^{2} for the fixed IV(s) model. Or, it is the point estimate of \rho^{2}, wherein \rho denotes the correlation of Y and X\beta, the linear combination of random IV(s). The CI of \rho^{2} is wider than that of \eta^{2} given the same R^{2} and confidence level.

[update] The CI of \rho^{2} clearly depends on the distribution assumed for the IV(s) and DV, since fixed IV(s) are just a special case of random IV(s). Usually, the assumption is that all IV(s) and the DV jointly follow a multivariate normal distribution.

In the bivariate normal case with a single random IV, a CI of the replicated R^{\prime2}=r^{\prime2} can also be constructed through Fisher's z-transform of Pearson's r. Intuitively, it should be wider than the CI of \rho^{2} .

\;\tanh^-\left(r\right)\equiv\frac{1}{2}\log\frac{1+r}{1-r}\;{appr\atop \sim}\; N\left(\tanh^-\left(\rho\right),\frac{1}{N-3}\right)


For a replication r^{\prime} from an independent sample of the same size N, the two variances add:

\;\tanh^-\left(r^{\prime}\right)-\tanh^-\left(r\right){appr\atop \sim}N\left(0,\frac{2}{N-3}\right)

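The CI of \tanh^-\left(r^{\prime}\right) can be constructed as \tanh^-\left(r\right)\pm\sqrt{\frac{2}{N-3}}z_{\frac{\alpha}{2}} . With the inverse transform \tanh\left(.\right), the CI bounds of R^{\prime2} are

\;\tanh^{2}\left(\tanh^-\left(r\right)\pm\sqrt{\frac{2}{N-3}}z_{\frac{\alpha}{2}}\right) ,

with the lower bound set to 0 when the interval for r^{\prime} contains 0.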
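As a rough illustration of the Fisher z construction, the sketch below computes both the interval for \rho (variance 1/(N-3)) and the interval for a replicated r^{\prime} (variance 2/(N-3)), then maps an interval for a correlation to one for its square. The function names `fisher_intervals` and `square_interval` are hypothetical, and the sketch assumes the replication uses an independent sample of the same size N.

```python
import math
from statistics import NormalDist

def fisher_intervals(r, n, alpha=0.05):
    """Return (ci_rho, ci_rep): approximate CIs for rho and for a replicated r'."""
    z = math.atanh(r)  # Fisher's z-transform of the observed r
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    se_rho = math.sqrt(1 / (n - 3))  # spread of z around atanh(rho)
    se_rep = math.sqrt(2 / (n - 3))  # spread of atanh(r') - atanh(r)
    ci_rho = (math.tanh(z - z_crit * se_rho), math.tanh(z + z_crit * se_rho))
    ci_rep = (math.tanh(z - z_crit * se_rep), math.tanh(z + z_crit * se_rep))
    return ci_rho, ci_rep

def square_interval(lo, hi):
    """Map an interval for a correlation to the interval for its square."""
    if lo <= 0 <= hi:
        # The square reaches its minimum 0 inside the interval.
        return 0.0, max(lo * lo, hi * hi)
    return tuple(sorted((lo * lo, hi * hi)))

# Illustrative values only: r = 0.5 observed on N = 50 cases.
ci_rho, ci_rep = fisher_intervals(0.5, 50)
print(square_interval(*ci_rho))  # interval for rho^2
print(square_interval(*ci_rep))  # interval for the replicated R'^2
```

The interval for the replication always contains the interval for \rho, because its standard error is larger by a factor of \sqrt{2}.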




In the case of multiple (p) IV(s), the generalized Fisher's z-transform is

\;\left(N-2-p\right)\left(\tanh^-\left(R\right)\right)^{2}\;{appr\atop \sim}\;\chi_{df=p,ncp=\left(N-2-p\right)\left(\tanh^-\left(\rho\right)\right)^{2}}^{2} .

Although it could also be used to construct the CI of \rho^{2} , it is inferior to the noncentral F approximation of R (Lee, 1971). The latter is the algorithm adopted by the MS-DOS software R2 (Steiger & Fouladi, 1992) and the R function ci.R2(...) in the package MBESS (Kelley, 2008).

In the literature, "CI(s) of R-square" are hardly ever the literal CI(s) of R^{2} in a replication. Most of them actually refer to the CI of \rho^{2} . Authors in social science who are unfamiliar with LaTeX hate to type \rho when it is more convenient to type r or R. Users of experimentally designed fixed IV(s) should report the CI of \eta^{2} . However, if they are so familiar with Steiger's software R2 that they overlook his series of papers on CIs of effect size, there is a considerable chance that they will report a loose CI of \rho^{2}, under the even looser name "CI of R^{2}".


Lee, Y. S. (1971). Some results on the sampling distribution of the multiple correlation coefficient. Journal of the Royal Statistical Society, B, 33, 117–130.

Kelley, K. (2008). MBESS: Methods for the Behavioral, Educational, and Social Sciences. R package version 1.0.1. [Computer software]. Available from http://www.indiana.edu/~kenkel

Steiger, J. H., & Fouladi, R. T. (1992). R2: A computer program for interval estimation, power calculation, and hypothesis testing for the squared multiple correlation. Behavior Research Methods, Instruments, & Computers, 4, 581–582.


Confidence Region and Not-reject Region

A Confidence Interval (CI) and a Null Hypothesis Significance Test (NHST) are in the same business: advising whether some sample X\equiv\left(X_{1},X_{2},\dots,X_{n}\right) is or is not disliked by some hypothesized parameter \vartheta.

NHST.com manages a database. For each Miss \vartheta, NHST spies out everyone she dislikes. Mr X logs in to NHST.com and enters a girl's name and his credit card number to try his luck, whispering: does she dislike me?

CI.com manages a database too. For each Mr X, CI only needs his credit card with his name X on it, then serves him a full list of available girls.

NHST.com has historically monopolized the market. Nevertheless, some people prefer visiting CI.com, and they find that the two sites share the same database in most cases.

Not-reject Region of \vartheta is defined as A\left(\vartheta\right)=\left\{ x:\vartheta\; doesn't\;dislike\;x\right\} .

Confidence Region of x is defined as S\left(x\right)\equiv\left\{ \vartheta:\vartheta\; doesn't\;dislike\;x\right\} .

\vartheta\in S\left(X\right)\Leftrightarrow \vartheta\,does\,not\,dislike\,X \Leftrightarrow\,X\in\,A\left(\vartheta\right)

So, Pr_{\vartheta}\left(\vartheta\in S\left(X\right)\right)\ge1-\alpha,\forall\vartheta\Longleftrightarrow Pr_{\vartheta}\left(X\notin A\left(\vartheta\right)\right)\le\alpha,\forall\vartheta
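The duality can be made concrete by inverting a test. The sketch below assumes the simplest setting, a normal mean with known \sigma and a two-sided z-test: A\left(\vartheta\right) is checked directly, and S\left(x\right) is built by collecting every candidate \vartheta on a grid whose test does not reject the observed sample mean. The function names, the grid, and the numbers are illustrative assumptions, not part of the original argument.

```python
import math
from statistics import NormalDist

def in_not_reject_region(x_bar, theta, n, sigma=1.0, alpha=0.05):
    """True if the sample mean x_bar lies in A(theta) for a two-sided z-test."""
    z_crit = NormalDist().inv_cdf(1 - alpha / 2)
    return abs(x_bar - theta) <= z_crit * sigma / math.sqrt(n)

def confidence_set(x_bar, theta_grid, n, sigma=1.0, alpha=0.05):
    """S(x_bar): every candidate theta that does not 'dislike' x_bar."""
    return [th for th in theta_grid
            if in_not_reject_region(x_bar, th, n, sigma, alpha)]

# Candidate parameter values on a grid from -2 to 2 in steps of 0.01.
theta_grid = [i / 100 for i in range(-200, 201)]
s = confidence_set(x_bar=0.3, theta_grid=theta_grid, n=25)
print(min(s), max(s))  # end points of the confidence set on the grid
```

The end points agree, up to the grid resolution, with the usual interval \bar{x}\pm z_{\frac{\alpha}{2}}\sigma/\sqrt{n}, which is the sense in which CI.com and NHST.com share one database.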


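If most students, having earnestly followed me through the material three times, still cannot understand interval estimation and hypothesis testing, I admit that this is a failure of my teaching. However, I do not mind explaining it a fourth or fifth time. (In fact, in the structural equation part, I repeated the relationship among \Sigma(\theta), the structure of the equations, and S at least five times. But surely understanding after five rounds beats still not understanding after three?) If any student is interested, you are welcome to contribute a questionnaire survey on how many people have finally understood interval estimation and hypothesis testing, and, among those who have not, how many still have enough interest to spend time trying to understand them. Running an online questionnaire requires only motivation, not the ability to write code. I very much hope that more students will practice this important skill of running online questionnaires.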