“Confidence interval of R-square”, but, which one?

In linear regression, confidence interval (CI) of population DV is narrower than that of predicted DV. With the assumption of generalizability, CI of \tilde{Y}_{\left[1\times1\right]} at x_{\left[1\times p\right]} is

\;\hat{Y}\pm\left(x\left(X^{\tau}X_{\left[N\times p\right]}\right)^{-1}x^{\tau}\right)^{\frac{1}{2}}\hat{\sigma}t_{\frac{\alpha}{2},N-p},

while CI of Y\left(x\right)=\tilde{Y}\left(x\right)+\varepsilon is

\;\hat{Y}\pm\left(1+x\left(X^{\tau}X_{\left[N\times p\right]}\right)^{-1}x^{\tau}\right)^{\frac{1}{2}}\hat{\sigma}t_{\frac{\alpha}{2},N-p}.

The pivot methods of both are quite similar as following.

\;\frac{\hat{Y}-\tilde{Y}}{s_{\hat{Y}}}\sim t_{df=N-p} ,

so \tilde{Y}_{critical}=\hat{Y}-s_{\hat{Y}}\times t_{critical} .

\;\frac{\hat{Y}-Y}{s_{\left(\hat{Y}-Y\right)}}\sim t_{df=N-p} ,

so Y_{critical}=\hat{Y}-s_{\left(\hat{Y}-Y\right)}\times t_{critical}=\hat{Y}-s_{\left(\hat{Y}-\tilde{Y}-\varepsilon\right)}\times t_{critical}

R^{2} of linear regression is the point estimate of


for fixed IV(s) model. Or, it is the point estimate of \rho^{2} wherein \rho denotes the correlation of Y and X\beta, the linear composition of random IV(s) . The CI of \rho^{2} is wider than that of \eta^{2} with the same R^{2} and confidence level.

[update] It is obvious that CI of \rho^{2} relies on the distribution presumption of IV(s) and DV, as fixed IV(s) are just special cases of generally random IV(s). Usually, the presumption is that all IV(s) and DV are from multivariate normal distribution.

In the bivariate normal case with a single random IV, through Fisher's z-transform of Pearson's r, CI of the re-sampled R^{\prime2}=r^{\prime2} can also be constructed. Intuitively, it should be wider than CI of \rho^{2} .

\;\tanh^-\left(r\right)\equiv\frac{1}{2}\log\frac{1+r}{1-r}\;{appr\atop \sim}\; N\left(\tanh^-\left(\rho\right),\frac{1}{N-3}\right)


\;\tanh^-\left(r^{\prime}\right)-\tanh^-\left(r\right){appr\atop \sim}N\left(0,\frac{2}{N-3}\right)

CI of \tanh^-\left(r^{\prime}\right) can be constructed as \tanh^-\left(r\right)\pm\sqrt{\frac{2}{N-3}}z_{\frac{\alpha}{2}} . With the reverse transform \tanh\left(.\right), the CI bounds of R^{\prime2} are




In multiple p IV(s) case, Fisher's z-transform is

\;\left(N-2-p\right)\left(\tanh^-\left(R\right)\right)^{2}\;{appr\atop \sim}\;\chi_{df=p,ncp=\left(N-2-p\right)\left(\tanh^-\left(\rho\right)\right)^{2}}^{2} .

Although it could also be used to construct CI of \rho^{2} , it is inferior to noncentral F approximation of R (Lee, 1971). The latter is the algorithm adopted by MSDOS software R2 (Steiger & Fouladi, 1992) and R-function ci.R2(...) within package MBESS (Kelley, 2008).

In literature, "CI(s) of R-square" are hardly the literal CI(s) of R^{2} in replication once more. Most of them actually refer to CI of \rho^{2} . Authors in social science unfamiliar to L^AT_EX hate to type \rho when they feel convenient to type r or R. Users of experimentally designed fixed IV(s) should have reported CI of \eta^{2} . However, if they were too familiar to Steiger's software R2 to ignore his series papers on CI of effect size, it would be significant chance for them to report a loose CI of \rho^{2}, even in a looser name "CI of R^{2}".


Lee, Y. S. (1971). Some results on the sampling distribution of the multiple correlation coefficient. Journal of the Royal Statistical Society, B, 33, 117–130.

Kelley, K. (2008). MBESS: Methods for the Behavioral, Educational, and Social Sciences. R package version 1.0.1. [Computer software]. Available from http://www.indiana.edu/~kenkel

Steiger, J. H., & Fouladi, R. T. (1992). R2: A computer program for interval estimation, power calculation, and hypothesis testing for the squared multiple correlation. Behavior research methods, instruments and computers, 4, 581–582.

R Code of Part I:

R Code of Part II:

WordPress (and WPMU) Plugin for R Web Interface

Download: RwebFriend.zip [Update] Including Chinese UTF8 Version
Plugin Name: RwebFriend
Plugin URL: http://xiaoxu.lxxm.com/RwebFriend
Description: Set Rweb url options and transform [rcode]...[/rcode] or <rcode>...</rcode> tag-pair into TEXTAREA which supports direct submit to web interface of R.

*Credit notes:codes of two relevant plugins are studied and imported. One of the plugins deals with auto html tags within TEXTAREA tag-pair, the other stops WordPress to auto-transform quotation marks.

Version: 1.0
Author: Xiaoxu LI
Author URI: http://xiaoxu.lxxm.com/
Setup:Wordpress 3.5



WordPress 3.4


[update] The free Chinese wordpress platform yo2.cn has installed this plugin.  See my demo.

More online demos -- http://wiki.qixianglu.cn/rwebfriendttest/

[update, June 2009] 72pines.com (here!) installed this plugin. Try

[update2009JUL18]Test installed packages of Rweb:


Type III ANOVA SS for factor A within interaction of factor B is defined as SS_{A:B+A+B}-SS_{A:B+B}, wherein A:B  is the pure interaction effect orthogonal to main effects of A, B, and intercept. There are some details in R to get pure interaction dummy IV(s).

Data is from SAS example PROC GLM, Example 30.3: Unbalanced ANOVA for Two-Way Design with Interaction

##Data from http://www.otago.ac.nz/sas/stat/chap30/sect52.htm
drug <- as.factor(c(t(t(rep(1,3)))%*%t(1:4))); ##Factor A
disease <- as.factor(c(t(t(1:3)) %*% t(rep(1,4))));##Factor B
y <- t(matrix(c(
42 ,44 ,36 ,13 ,19 ,22
,33 ,NA ,26 ,NA ,33 ,21
,31 ,-3 ,NA ,25 ,25 ,24
,28 ,NA ,23 ,34 ,42 ,13
,NA ,34 ,33 ,31 ,NA ,36
,3 ,26 ,28 ,32 ,4 ,16
,NA ,NA ,1 ,29 ,NA ,19
,NA ,11 ,9 ,7 ,1 ,-6
,21 ,1 ,NA ,9 ,3 ,NA
,24 ,NA ,9 ,22 ,-2 ,15
,27 ,12 ,12 ,-5 ,16 ,15
,22 ,7 ,25 ,5 ,12 ,NA
## verify data with http://www.otago.ac.nz/sas/stat/chap30/sect52.htm
## make a big table
y <- c(y);
drug <- rep(drug,6);
disease <- rep(disease,6);
## Design the PURE interaction dummy variables
m <- model.matrix(lm(rep(0,length(disease)) ~ disease + drug +disease:drug));
##! If lm(y~ ...) is used, the is.na(y) rows will be dropped. The residuals will be orthogonal to observed A, & B rather than designed cell A & B. It will be Type II SS rather than Type III SS.
c <- attr(m,"assign")==3;
(IV_Interaction <-residuals( lm(m[,c] ~ m[,!c])));
## verify data through type I & II ANOVA to http://www.otago.ac.nz/sas/stat/chap30/sect52.htm
## Type I ANOVA of A, defined by SS_A --
## Type II ANOVA of A, defined by SS_{A+B}-SS_B --
anova(lm(y~disease),lm(y~drug + disease))
## Type III ANOVA of A defined by SS_{A:B+A+B}-SS_{A:B+B}
t(t(c( anova(lm(y~IV_Interaction+disease),lm(y~disease * drug))$'Sum of Sq'[2]
,anova(lm(y~IV_Interaction+drug),lm(y~disease*drug))$'Sum of Sq'[2]
,anova(lm(y~disease+drug),lm(y~disease*drug))$'Sum of Sq'[2])))

Currently, Anova(...) of Prof John Fox's car package (V. 1.2-8 or 1.2-9) used "impure" interaction dummy IV(s), which made its type III result relying upon the order of factor levels. I think in its next version, the "pure" interaction dummy IV(s) will be adopted to give consistent type III SS.


In Prof John FOX's car package, with parameter contrasts in inputted lm object, Example(Anova) gave type III SS consistent to  other softwares. In this case, the code line should be --

Anova(lm(y~drug*disease, contrasts=list(drug=contr.sum, disease=contr.sum)),type='III');

Contrasts patterns are defined within lm(...) rather than Anova(...). An lm object with default contrasts parameter is inappropriate to calculate type III SS, or the result will rely on the level names in any nominal factor --

M2$f2<-as.factor(- as.integer(M2$fcategory));
mod1<-lm(formula = conformity ~ f1 * partner.status,data=M2);
mod2<-lm(formula = conformity ~ f2 * partner.status,data=M2);
c(Anova(mod1,type='III')$'Sum Sq'[3],Anova(mod2,type='III')$'Sum Sq'[3])

There was hot discussion of type III ANOVA on R-help newsgroup. Thomas Lumley thought Types of SS nowadays don't have to make any real sense --


This is one of many examples of an attempt to provide a mathematical answer to something that isn't a mathematical question.

As people have already pointed out, in any practical testing situation you have two models you want to compare. If you are working in an interactive statistical environment, or even in a modern batch-mode system, you can fit the two models and compare them. If you want to compare two other models, you can fit them and compare them.

However, in the Bad Old Days this was inconvenient (or so I'm told). If you had half a dozen tests, and one of the models was the same in each test, it was a substantial saving of time and effort to fit this model just once.

This led to a system where you specify a model and a set of tests: eg I'm going to fit y~a+b+c+d and I want to test (some of) y~a vs y~a+b, y~a+b vs y~a+b+c and so on. Or, I want to test (some of) y~a+b+c vs y~a+b+c+d, y~a+b+d vs y~a+b+c+d and so on. This gives the "Types" of sums of squares, which are ways of specifying sets of tests. You could pick the "Type" so that the total number of linear models you had to fit was minimized. As these are merely a computational optimization, they don't have to make any real sense. Unfortunately, as with many optimizations, they have gained a life of their own.

The "Type III" sums of squares are the same regardless of order, but this is a bad property, not a good one. The question you are asking when you test "for" a term X really does depend on what other terms are in the model, so order really does matter. However, since you can do anything just by specifying two models and comparing them, you don't actually need to worry about any of this.


DV predicted by two IVs, vs. triangular pyramid

-- Diagram from Wiki

It is easier to imagine relation in three spatial vectors by their angles, than by their correlations. For standardized DV Y=\left(y_{1},y_{2},\dots,y_{N}\right)^{\tau} and IVs X_{1}=\left(x_{1,1},x_{2,1},\dots,x_{N,1}\right)^{\tau}, X_{2}=\left(x_{1,2},x_{2,2},\dots,x_{N,2}\right)^{\tau}, cosines of three angles of the triangular pyramid determinate the correlation matrix, thus, all statistics of the regressions Y=\beta_{1}X_{1}+\beta_{2}X_{2}+\varepsilon and Y=\beta_{1}X_{1}+\varepsilon . Unexpected but imaginative results on the impact of introducing X_{2} are --

1. Both IVs are nearly independent of DV. Togethor they predict DV almost perfectly (\angle YX_{1}=\angle YX_{2}=89^{\circ} and \angle X_{1}X_{2}=177.9^{\circ}).

2. Both IVs are almost perfectly correlated with DV. Togethor, one of the regressive coefficient is significantly negative (1^{\circ}, 0.6^{\circ} and 0.5^{\circ} respectively).

3. Redundancy (Cohen, Cohen, West, & Aiken, 2003) increases to full and then decreases to zero and even negative (\angle YX_{1}=60^{\circ}, \angle YX_{2}=45{}^{\circ} and \angle X_{1}X_{2} closes from 90^{\circ} to 45^{\circ} then to 15^{\circ}+\epsilon ).

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S.  (2003). Applied multiple regression/correlation analysis for the behavioral sciences(3rd ed.) Mahwah, NJ: Lawrence Erlbaum Associates.