Reposted from: 泡网
Data Snoop, the amateur scientist's magical straight line (google for the originator):
R tip:
In linear regression, the confidence interval (CI) of the population mean of the DV is narrower than the prediction interval of a single predicted DV. Under the assumption of generalizability, the CI of \tilde{Y}_{\left[1\times1\right]} at x_{\left[1\times p\right]} is
\;\hat{Y}\pm\left(x\left(X^{\tau}X_{\left[N\times p\right]}\right)^{-1}x^{\tau}\right)^{\frac{1}{2}}\hat{\sigma}t_{\frac{\alpha}{2},N-p},
while the CI of Y\left(x\right)=\tilde{Y}\left(x\right)+\varepsilon is
\;\hat{Y}\pm\left(1+x\left(X^{\tau}X_{\left[N\times p\right]}\right)^{-1}x^{\tau}\right)^{\frac{1}{2}}\hat{\sigma}t_{\frac{\alpha}{2},N-p}.
The pivot methods of both are quite similar, as follows.
\;\frac{\hat{Y}-\tilde{Y}}{s_{\hat{Y}}}\sim t_{df=N-p} ,
so \tilde{Y}_{critical}=\hat{Y}\pm s_{\hat{Y}}\times t_{critical} .
\;\frac{\hat{Y}-Y}{s_{\left(\hat{Y}-Y\right)}}\sim t_{df=N-p} ,
so Y_{critical}=\hat{Y}\pm s_{\left(\hat{Y}-Y\right)}\times t_{critical}=\hat{Y}\pm s_{\left(\hat{Y}-\tilde{Y}-\varepsilon\right)}\times t_{critical}
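Both intervals can be checked numerically against R's predict(); the sketch below uses simulated, purely illustrative data and recomputes the two bounds by hand from the pivot formulas:

```r
## CI of the mean response vs. prediction interval of a new observation,
## verified against the pivot formulas above (simulated, illustrative data).
set.seed(1)
N <- 30
x <- rnorm(N)
y <- 1 + 2 * x + rnorm(N)
fit <- lm(y ~ x)
new <- data.frame(x = 1)
cint <- predict(fit, new, interval = "confidence")  # CI of E(Y|x): narrower
pint <- predict(fit, new, interval = "prediction")  # PI of a new Y: wider
## by hand: h = x0 (X'X)^{-1} x0'
X    <- model.matrix(fit)
x0   <- c(1, 1)                    # intercept and x = 1
h    <- drop(t(x0) %*% solve(crossprod(X)) %*% x0)
s    <- summary(fit)$sigma
tc   <- qt(0.975, df = N - 2)
yhat <- drop(coef(fit) %*% x0)
cint.hand <- yhat + c(-1, 1) * tc * s * sqrt(h)      # matches cint[, 2:3]
pint.hand <- yhat + c(-1, 1) * tc * s * sqrt(1 + h)  # matches pint[, 2:3]
```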
R^{2} of linear regression is the point estimate of
\;\eta^{2}\equiv\frac{SS\left(\tilde{Y}_{\left[N\times1\right]}\right)}{SS\left(\tilde{Y}_{\left[N\times1\right]}\right)+N\sigma^{2}} for the fixed-IV(s) model. Alternatively, it is the point estimate of \rho^{2}, where \rho denotes the correlation of Y and X\beta, the linear combination of random IV(s). The CI of \rho^{2} is wider than that of \eta^{2} given the same R^{2} and confidence level.
[update] Obviously, the CI of \rho^{2} relies on the distributional presumption about the IV(s) and DV, since fixed IV(s) are just special cases of generally random IV(s). Usually the presumption is that all IV(s) and the DV come from a multivariate normal distribution.
In the bivariate normal case with a single random IV, a CI of the resampled R^{\prime2}=r^{\prime2} can also be constructed through Fisher's z-transform of Pearson's r. Intuitively, it should be wider than the CI of \rho^{2}.
\;\tanh^{-1}\left(r\right)\equiv\frac{1}{2}\log\frac{1+r}{1-r}\;{appr\atop \sim}\; N\left(\tanh^{-1}\left(\rho\right),\frac{1}{N-3}\right). Thus,
\;\tanh^{-1}\left(r^{\prime}\right)-\tanh^{-1}\left(r\right)\;{appr\atop \sim}\;N\left(0,\frac{2}{N-3}\right), so a CI of \tanh^{-1}\left(r^{\prime}\right) can be constructed as \tanh^{-1}\left(r\right)\pm\sqrt{\frac{2}{N-3}}z_{\frac{\alpha}{2}} . With the reverse transform \tanh\left(\cdot\right), the CI bounds of R^{\prime2} are
\;\left(\max\left(0,\tanh\left(\tanh^{-1}\left(R\right)-\sqrt{\frac{2}{N-3}}z_{1-\frac{\alpha}{2}}\right)\right)\right)^{2} and
\;\left(\tanh\left(\tanh^{-1}\left(R\right)+\sqrt{\frac{2}{N-3}}z_{1-\frac{\alpha}{2}}\right)\right)^{2}.
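A minimal R sketch of both intervals, with hypothetical values of r and N (any concrete numbers below are illustrative only):

```r
## CI of rho^2 and the wider CI of a resampled R'^2 via Fisher's z-transform
## (single-IV, bivariate-normal case; r and N are hypothetical).
r <- 0.5; N <- 50; alpha <- 0.05
z  <- atanh(r)               # Fisher's z = (1/2) * log((1 + r)/(1 - r))
zc <- qnorm(1 - alpha/2)
## CI of rho^2 (z-scale variance 1/(N-3)):
b.rho   <- tanh(z + c(-1, 1) * zc / sqrt(N - 3))
ci.rho2 <- c(max(0, b.rho[1]), b.rho[2])^2
## CI of a replicated R'^2 (z-scale variance doubles to 2/(N-3)):
b.rep    <- tanh(z + c(-1, 1) * zc * sqrt(2/(N - 3)))
ci.R2rep <- c(max(0, b.rep[1]), b.rep[2])^2
rbind(ci.rho2, ci.R2rep)     # the second interval is wider
```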
In the case of p multiple IV(s), Fisher's z-transform is
\;\left(N-2-p\right)\left(\tanh^{-1}\left(R\right)\right)^{2}\;{appr\atop \sim}\;\chi_{df=p,ncp=\left(N-2-p\right)\left(\tanh^{-1}\left(\rho\right)\right)^{2}}^{2} .
Although this could also be used to construct a CI of \rho^{2}, it is inferior to the noncentral F approximation of R (Lee, 1971). The latter is the algorithm adopted by the MS-DOS software R2 (Steiger & Fouladi, 1992) and the R function ci.R2(...) in package MBESS (Kelley, 2008).
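The noncentral-F idea can be sketched in base R by inverting pf() with uniroot(). This is not the code of MBESS or of Lee (1971): it is the fixed-IV version, the values of R2, N, and p are hypothetical, and the final conversion \lambda/(\lambda+N) from the noncentrality parameter to the squared correlation is only one common convention.

```r
## CI for the squared correlation by inverting a noncentral-F distribution
## (base-R sketch of the fixed-IV case; not MBESS's implementation;
## R2, N, p are hypothetical; lambda/(lambda + N) is one common conversion).
R2 <- 0.30; N <- 100; p <- 5; alpha <- 0.05
F.obs <- (R2/p) / ((1 - R2)/(N - p - 1))
df1 <- p; df2 <- N - p - 1
ncp.solve <- function(prob) {  # solve pf(F.obs, df1, df2, ncp) = prob
  if (pf(F.obs, df1, df2, ncp = 0) < prob) return(0)
  uniroot(function(l) pf(F.obs, df1, df2, ncp = l) - prob,
          c(0, 1000))$root
}
ncp.low <- ncp.solve(1 - alpha/2)  # small ncp -> lower limit
ncp.up  <- ncp.solve(alpha/2)      # large ncp -> upper limit
ci <- c(ncp.low, ncp.up) / (c(ncp.low, ncp.up) + N)
round(ci, 3)
```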
In the literature, "CI(s) of R-square" are hardly ever literal CI(s) of R^{2} under one more replication. Most of them actually refer to the CI of \rho^{2}. Authors in social science unfamiliar with LaTeX hate to type \rho when it is convenient to type r or R. Users of experimentally designed fixed IV(s) should have reported the CI of \eta^{2}. However, if they were familiar with Steiger's software R2 yet ignored his series of papers on CIs of effect size, there would be a good chance they reported the looser CI of \rho^{2}, even under the looser name "CI of R^{2}".

Lee, Y. S. (1971). Some results on the sampling distribution of the multiple correlation coefficient. Journal of the Royal Statistical Society, B, 33, 117–130.
Kelley, K. (2008). MBESS: Methods for the Behavioral, Educational, and Social Sciences. R package version 1.0.1. [Computer software]. Available from http://www.indiana.edu/~kenkel
Steiger, J. H., & Fouladi, R. T. (1992). R2: A computer program for interval estimation, power calculation, and hypothesis testing for the squared multiple correlation. Behavior Research Methods, Instruments, & Computers, 4, 581–582.
Download: RwebFriend.zip [Update] Including Chinese UTF-8 Version
Plugin Name: RwebFriend 
Plugin URL: http://xiaoxu.lxxm.com/RwebFriend 
Description: Set Rweb URL options and transform [rcode]...[/rcode] or <rcode>...</rcode> tag pairs into a TEXTAREA that supports direct submission to the web interface of R.
*Credit notes: the code of two relevant plugins was studied and imported. One of the plugins deals with auto HTML tags within a TEXTAREA tag pair; the other stops WordPress from auto-transforming quotation marks.
Version: 1.0 
Author: Xiaoxu LI 
Author URI: http://xiaoxu.lxxm.com/ 
Setup: WordPress 3.5 / WordPress 3.4
Usage: 
[update] The free Chinese WordPress platform yo2.cn has installed this plugin. See my demo.
More online demos: http://wiki.qixianglu.cn/rwebfriendttest/
[update, June 2009] 72pines.com (here!) installed this plugin. Try
[update 2009-JUL-18] Test installed packages of Rweb:
http://pbil.univ-lyon1.fr/cgi-bin/Rweb/Rweb.cgi
https://rweb.stat.umn.edu/cgi-bin/Rweb/Rweb.cgi
Type III ANOVA SS for factor A within an interaction with factor B is defined as SS_{A:B+A+B}-SS_{A:B+B}, wherein A:B is the pure interaction effect orthogonal to the main effects of A, B, and the intercept. There are some details in R to get the pure interaction dummy IV(s).
The data are from SAS PROC GLM, Example 30.3: Unbalanced ANOVA for Two-Way Design with Interaction.
##
##Data from http://www.otago.ac.nz/sas/stat/chap30/sect52.htm
##
drug <- as.factor(c(t(t(rep(1,3)))%*%t(1:4))); ##Factor A
disease <- as.factor(c(t(t(1:3)) %*% t(rep(1,4))));##Factor B
y <- t(matrix(c(
42 ,44 ,36 ,13 ,19 ,22
,33 ,NA ,26 ,NA ,33 ,21
,31 ,-3 ,NA ,25 ,25 ,24
,28 ,NA ,23 ,34 ,42 ,13
,NA ,34 ,33 ,31 ,NA ,36
,3 ,26 ,28 ,32 ,4 ,16
,NA ,NA ,1 ,29 ,NA ,19
,NA ,11 ,9 ,7 ,1 ,-6
,21 ,1 ,NA ,9 ,3 ,NA
,24 ,NA ,9 ,22 ,-2 ,15
,27 ,12 ,12 ,-5 ,16 ,15
,22 ,7 ,25 ,5 ,12 ,NA
),nrow=6));
## verify data with http://www.otago.ac.nz/sas/stat/chap30/sect52.htm
(cbind(drug,disease,y));
##
## make a big table
y <- c(y);
drug <- rep(drug,6);
disease <- rep(disease,6);
##
## Design the PURE interaction dummy variables
m <- model.matrix(lm(rep(0,length(disease)) ~ disease + drug +disease:drug));
##! If lm(y ~ ...) were used, the is.na(y) rows would be dropped. The residuals would be orthogonal to the observed A & B rather than the designed cell A & B. That would give Type II SS rather than Type III SS.
c <- attr(m,"assign")==3;
(IV_Interaction <- residuals( lm(m[,c] ~ m[,!c])));
##
## verify data through type I & II ANOVA to http://www.otago.ac.nz/sas/stat/chap30/sect52.htm
## Type I ANOVA of A, defined by SS_A 
anova(lm(y~drug*disease));
##
## Type II ANOVA of A, defined by SS_{A+B}-SS_B 
require(car);
Anova(lm(y~drug*disease),type='II');
anova(lm(y~disease),lm(y~drug + disease))
##
##
## Type III ANOVA of A, defined by SS_{A:B+A+B}-SS_{A:B+B}
t(t(c( anova(lm(y~IV_Interaction+disease),lm(y~disease * drug))$'Sum of Sq'[2]
,anova(lm(y~IV_Interaction+drug),lm(y~disease*drug))$'Sum of Sq'[2]
,anova(lm(y~disease+drug),lm(y~disease*drug))$'Sum of Sq'[2])))
##
##
Currently, Anova(...) in Prof. John Fox's car package (v. 1.2-8 or 1.2-9) uses "impure" interaction dummy IV(s), which makes its Type III result depend on the order of factor levels. I think that in its next version the "pure" interaction dummy IV(s) will be adopted to give a consistent Type III SS.
[update:]
In Prof. John Fox's car package, with the contrasts parameter set in the input lm object, example(Anova) gives Type III SS consistent with other software. In this case, the code line should be
Anova(lm(y~drug*disease, contrasts=list(drug=contr.sum, disease=contr.sum)),type='III');
Contrast patterns are defined within lm(...) rather than Anova(...). An lm object with the default contrasts parameter is inappropriate for calculating Type III SS, or the result will depend on the level names of any nominal factor.
require(car);
M2 <- Moore;
M2$f1 <- M2$fcategory;
M2$f2 <- as.factor(as.integer(M2$fcategory));
mod1 <- lm(formula = conformity ~ f1 * partner.status, data=M2);
mod2 <- lm(formula = conformity ~ f2 * partner.status, data=M2);
c(Anova(mod1,type='III')$'Sum Sq'[3], Anova(mod2,type='III')$'Sum Sq'[3])
There was a hot discussion of Type III ANOVA on the R-help newsgroup. Thomas Lumley thought the "Types" of SS nowadays don't have to make any real sense:
http://tolstoy.newcastle.edu.au/R/help/05/04/3009.html
This is one of many examples of an attempt to provide a mathematical answer to something that isn't a mathematical question.
As people have already pointed out, in any practical testing situation you have two models you want to compare. If you are working in an interactive statistical environment, or even in a modern batchmode system, you can fit the two models and compare them. If you want to compare two other models, you can fit them and compare them.
However, in the Bad Old Days this was inconvenient (or so I'm told). If you had half a dozen tests, and one of the models was the same in each test, it was a substantial saving of time and effort to fit this model just once.
This led to a system where you specify a model and a set of tests: eg I'm going to fit y~a+b+c+d and I want to test (some of) y~a vs y~a+b, y~a+b vs y~a+b+c and so on. Or, I want to test (some of) y~a+b+c vs y~a+b+c+d, y~a+b+d vs y~a+b+c+d and so on. This gives the "Types" of sums of squares, which are ways of specifying sets of tests. You could pick the "Type" so that the total number of linear models you had to fit was minimized. As these are merely a computational optimization, they don't have to make any real sense. Unfortunately, as with many optimizations, they have gained a life of their own.
The "Type III" sums of squares are the same regardless of order, but this is a bad property, not a good one. The question you are asking when you test "for" a term X really does depend on what other terms are in the model, so order really does matter. However, since you can do anything just by specifying two models and comparing them, you don't actually need to worry about any of this.
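Lumley's advice translates directly into code: fit exactly the two models you want to compare and test them against each other (the data below are simulated and purely illustrative):

```r
## Compare two nested models directly, as Lumley suggests
## (simulated, illustrative data).
set.seed(2)
d <- data.frame(a = rnorm(40), b = rnorm(40))
d$y <- 1 + d$a + rnorm(40)       # in truth, b has no effect
m.small <- lm(y ~ a, data = d)
m.big   <- lm(y ~ a + b, data = d)
anova(m.small, m.big)            # F test of b, given a
```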
thomas
(Diagram from Wiki)
It is easier to imagine the relation of three spatial vectors by their angles than by their correlations. For standardized DV Y=\left(y_{1},y_{2},\dots,y_{N}\right)^{\tau} and IVs X_{1}=\left(x_{1,1},x_{2,1},\dots,x_{N,1}\right)^{\tau}, X_{2}=\left(x_{1,2},x_{2,2},\dots,x_{N,2}\right)^{\tau}, the cosines of the three angles of the triangular pyramid determine the correlation matrix and thus all statistics of the regressions Y=\beta_{1}X_{1}+\beta_{2}X_{2}+\varepsilon and Y=\beta_{1}X_{1}+\varepsilon . Unexpected but easily visualized results on the impact of introducing X_{2} are:
1. Both IVs are nearly independent of the DV, yet together they predict the DV almost perfectly (\angle YX_{1}=\angle YX_{2}=89^{\circ} and \angle X_{1}X_{2}=177.9^{\circ}).
2. Both IVs are almost perfectly correlated with the DV, yet together one of the regression coefficients is significantly negative (1^{\circ}, 0.6^{\circ} and 0.5^{\circ} respectively).
3. Redundancy (Cohen, Cohen, West, & Aiken, 2003) increases to full, then decreases to zero and even turns negative (\angle YX_{1}=60^{\circ}, \angle YX_{2}=45^{\circ} and \angle X_{1}X_{2} closes from 90^{\circ} to 45^{\circ} and then to 15^{\circ}+\epsilon).
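Case 1 above can be checked in numbers: turn the angles into correlations and plug them into the standard two-predictor R^2 formula, R^{2}=\frac{r_{1}^{2}+r_{2}^{2}-2r_{1}r_{2}r_{12}}{1-r_{12}^{2}}:

```r
## Case 1 in numbers: near-independent IVs that jointly predict the DV.
a <- cos(89    * pi / 180)  # cor(Y, X1) = cor(Y, X2): nearly zero
b <- cos(177.9 * pi / 180)  # cor(X1, X2): almost collinear, opposite signs
r2.single <- a^2                                 # X1 alone explains ~0.03%
R2.both   <- (a^2 + a^2 - 2*a*a*b) / (1 - b^2)   # two-predictor R^2 formula
c(r2.single, R2.both)                            # tiny vs. ~0.91
```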