The Type III ANOVA SS for factor A in the presence of an A-by-B interaction is defined as SS_{A:B+A+B}-SS_{A:B+B}, where A:B is the pure interaction effect, orthogonal to the main effects of A and B and to the intercept. Getting pure-interaction dummy IV(s) in R takes some care.

The data are from SAS PROC GLM, Example 30.3: Unbalanced ANOVA for Two-Way Design with Interaction.

##Data from http://www.otago.ac.nz/sas/stat/chap30/sect52.htm
drug <- as.factor(c(t(t(rep(1,3)))%*%t(1:4))); ##Factor A
disease <- as.factor(c(t(t(1:3)) %*% t(rep(1,4))));##Factor B
y <- t(matrix(c(
42 ,44 ,36 ,13 ,19 ,22
,33 ,NA ,26 ,NA ,33 ,21
,31 ,-3 ,NA ,25 ,25 ,24
,28 ,NA ,23 ,34 ,42 ,13
,NA ,34 ,33 ,31 ,NA ,36
,3 ,26 ,28 ,32 ,4 ,16
,NA ,NA ,1 ,29 ,NA ,19
,NA ,11 ,9 ,7 ,1 ,-6
,21 ,1 ,NA ,9 ,3 ,NA
,24 ,NA ,9 ,22 ,-2 ,15
,27 ,12 ,12 ,-5 ,16 ,15
,22 ,7 ,25 ,5 ,12 ,NA
),6));
## verify data with http://www.otago.ac.nz/sas/stat/chap30/sect52.htm
## make a big table
y <- c(y);
drug <- rep(drug,6);
disease <- rep(disease,6);
## Design the PURE interaction dummy variables
m <- model.matrix(lm(rep(0,length(disease)) ~ disease + drug + disease:drug));
##! If lm(y ~ ...) were used, the rows with is.na(y) would be dropped. The residuals would then be orthogonal to the observed A & B rather than the designed cell A & B, yielding Type II rather than Type III SS.
idx <- attr(m,"assign")==3; ## columns belonging to the interaction term
(IV_Interaction <- residuals(lm(m[,idx] ~ m[,!idx])));
## verify against the Type I & II ANOVA tables at http://www.otago.ac.nz/sas/stat/chap30/sect52.htm
## Type I ANOVA of A is defined by SS_A --
## Type II ANOVA of A is defined by SS_{A+B}-SS_B --
anova(lm(y~disease),lm(y~drug + disease))
## Type III ANOVA of A defined by SS_{A:B+A+B}-SS_{A:B+B}
t(t(c( anova(lm(y~IV_Interaction+disease),lm(y~disease * drug))$'Sum of Sq'[2]
,anova(lm(y~IV_Interaction+drug),lm(y~disease*drug))$'Sum of Sq'[2]
,anova(lm(y~disease+drug),lm(y~disease*drug))$'Sum of Sq'[2])))

Currently, Anova(...) in Prof. John Fox's car package (v. 1.2-8 or 1.2-9) uses "impure" interaction dummy IV(s), which makes its Type III result depend on the order of the factor levels. I expect that in its next version the "pure" interaction dummy IV(s) will be adopted to give consistent Type III SS.


In Prof. John Fox's car package, when the contrasts argument is supplied to the input lm object, example(Anova) gives Type III SS consistent with other software. In this case the code line should be --

Anova(lm(y~drug*disease, contrasts=list(drug=contr.sum, disease=contr.sum)),type='III');

Contrast patterns are defined within lm(...) rather than Anova(...). An lm object with the default contrasts is inappropriate for calculating Type III SS; otherwise the result will depend on the level names of any nominal factor --

library(car); ## assuming M2 is the Moore data shipped with car
M2 <- Moore;
M2$f1 <- as.factor(M2$fcategory);
M2$f2 <- as.factor(-as.integer(M2$fcategory)); ## same factor, reversed level order
mod1 <- lm(formula = conformity ~ f1 * partner.status, data = M2);
mod2 <- lm(formula = conformity ~ f2 * partner.status, data = M2);
c(Anova(mod1,type='III')$'Sum Sq'[3], Anova(mod2,type='III')$'Sum Sq'[3])

There was a heated discussion of Type III ANOVA on the R-help mailing list. Thomas Lumley argued that the Types of SS nowadays don't have to make any real sense --


This is one of many examples of an attempt to provide a mathematical answer to something that isn't a mathematical question.

As people have already pointed out, in any practical testing situation you have two models you want to compare. If you are working in an interactive statistical environment, or even in a modern batch-mode system, you can fit the two models and compare them. If you want to compare two other models, you can fit them and compare them.

However, in the Bad Old Days this was inconvenient (or so I'm told). If you had half a dozen tests, and one of the models was the same in each test, it was a substantial saving of time and effort to fit this model just once.

This led to a system where you specify a model and a set of tests: eg I'm going to fit y~a+b+c+d and I want to test (some of) y~a vs y~a+b, y~a+b vs y~a+b+c and so on. Or, I want to test (some of) y~a+b+c vs y~a+b+c+d, y~a+b+d vs y~a+b+c+d and so on. This gives the "Types" of sums of squares, which are ways of specifying sets of tests. You could pick the "Type" so that the total number of linear models you had to fit was minimized. As these are merely a computational optimization, they don't have to make any real sense. Unfortunately, as with many optimizations, they have gained a life of their own.

The "Type III" sums of squares are the same regardless of order, but this is a bad property, not a good one. The question you are asking when you test "for" a term X really does depend on what other terms are in the model, so order really does matter. However, since you can do anything just by specifying two models and comparing them, you don't actually need to worry about any of this.
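Lumley's point that "you can do anything just by specifying two models and comparing them" is a one-liner in R. A minimal sketch on simulated data (the data frame and variable names here are made up for illustration):

```r
## Compare two nested models directly instead of reasoning about "Types" of SS.
set.seed(1)
d <- data.frame(a = factor(rep(1:2, each = 20)),
                b = factor(rep(1:4, 10)),
                y = rnorm(40))
## "test b adjusting for a" is just a comparison of y~a against y~a+b
cmp <- anova(lm(y ~ a, data = d), lm(y ~ a + b, data = d))
cmp
```

The `Df` and `Sum of Sq` row of the output is exactly the extra-sum-of-squares test for b given a; any other "Type" is just a different pair of models.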


DV predicted by two IVs, vs. triangular pyramid

-- Diagram from Wiki

It is easier to imagine the relation among three spatial vectors by their angles than by their correlations. For a standardized DV Y=\left(y_{1},y_{2},\dots,y_{N}\right)^{\tau} and IVs X_{1}=\left(x_{1,1},x_{2,1},\dots,x_{N,1}\right)^{\tau}, X_{2}=\left(x_{1,2},x_{2,2},\dots,x_{N,2}\right)^{\tau}, the cosines of the three angles of the triangular pyramid determine the correlation matrix, and thus all statistics of the regressions Y=\beta_{1}X_{1}+\beta_{2}X_{2}+\varepsilon and Y=\beta_{1}X_{1}+\varepsilon. Unexpected but imaginable results on the impact of introducing X_{2} are --

1. Both IVs are nearly independent of the DV, yet together they predict the DV almost perfectly (\angle YX_{1}=\angle YX_{2}=89^{\circ} and \angle X_{1}X_{2}=177.9^{\circ}).

2. Both IVs are almost perfectly correlated with the DV, yet together one of the regression coefficients is significantly negative (1^{\circ}, 0.6^{\circ} and 0.5^{\circ} respectively).

3. Redundancy (Cohen, Cohen, West, & Aiken, 2003) increases to full, then decreases to zero and even turns negative (\angle YX_{1}=60^{\circ}, \angle YX_{2}=45^{\circ}, and \angle X_{1}X_{2} closing from 90^{\circ} to 45^{\circ} and then to 15^{\circ}+\epsilon).
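Case 1 can be checked numerically: rebuild the correlation matrix from the cosines of the three angles (89, 89, and 177.9 degrees) and solve the normal equations for the standardized coefficients and the squared multiple correlation. A sketch:

```r
## Correlations are cosines of the angles between the standardized vectors.
ang <- function(deg) cos(deg * pi / 180)
r_y1 <- ang(89); r_y2 <- ang(89); r_12 <- ang(177.9)
## standardized coefficients for Y ~ X1 + X2 from the 2x2 IV correlation matrix
b  <- solve(matrix(c(1, r_12, r_12, 1), 2), c(r_y1, r_y2))
R2 <- sum(b * c(r_y1, r_y2))  ## squared multiple correlation
c(r_y1^2, R2)  ## each IV alone explains almost nothing; together, most of Y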
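Case 1 can be checked numerically: rebuild the correlation matrix from the cosines of the three angles (89, 89, and 177.9 degrees) and solve the normal equations for the standardized coefficients and the squared multiple correlation. A sketch:

```r
## Correlations are cosines of the angles between the standardized vectors.
ang <- function(deg) cos(deg * pi / 180)
r_y1 <- ang(89); r_y2 <- ang(89); r_12 <- ang(177.9)
## standardized coefficients for Y ~ X1 + X2 from the 2x2 IV correlation matrix
b  <- solve(matrix(c(1, r_12, r_12, 1), 2), c(r_y1, r_y2))
R2 <- sum(b * c(r_y1, r_y2))  ## squared multiple correlation
c(r_y1^2, R2)  ## each IV alone explains almost nothing; together, most of Y
```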

Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

Correction: the convenient radius for 95% confidence interval of t-test

-- What do you call a tea party with more than 30 people? 
-- A Z party!!! 
Joke #123 on http://www.ilstu.edu/~gcramsey/Gallery.html

2*SE is a popular convenient radius for eyeballing the 95% CI of a t-test, since statisticians treat t with df >= 30 as z. However, I was wrong to teach that 1.96*SE could be the precise radius whenever df >= 30.

Guess the critical df above which 1.96*SE is correct to two decimals. It is larger than you might expect --
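A one-line search finds it (qt(0.975, df) decreases monotonically toward 1.959964 as df grows, so the first df whose quantile rounds to 1.96 is the critical one):

```r
## Smallest df for which qt(.975, df) rounds to 1.96 at two decimal places
df <- 1
while (round(qt(0.975, df), 2) != 1.96) df <- df + 1
df  ## far larger than the folklore cutoff of 30
```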

R: str(…) and getS3method(…, …)

Thanks to R expert XIE Yihui for answering my questions online.

me: Two technical questions about R: 1. Is there anything like an object browser in R, to view all the descendants of an object at once? 2. How can I read the deeper source code?
> prcomp
function (x, ...)
UseMethod("prcomp")
<environment: namespace:stats>

Yihui: 1. str() is a very commonly used function; it can fully display all the descendants of an object. 2. Many functions are either S3 methods or call C code, so usually you cannot read their source directly.

An S3 method can be viewed with getS3method(). For example, prcomp is an S3 generic, so you can look at what its default method is:

> getS3method('prcomp','default')
function (x, retx = TRUE, center = TRUE, scale. = FALSE, tol = NULL,
    ...)
{
    x <- as.matrix(x)
    x <- scale(x, center = center, scale = scale.)
    cen <- attr(x, "scaled:center")
    sc <- attr(x, "scaled:scale")
    if (any(sc == 0))
        stop("cannot rescale a constant/zero column to unit variance")
    s <- svd(x, nu = 0)
    s$d <- s$d/sqrt(max(1, nrow(x) - 1))
    if (!is.null(tol)) {
        rank <- sum(s$d > (s$d[1L] * tol))
        if (rank < ncol(x)) {
            s$v <- s$v[, 1L:rank, drop = FALSE]
            s$d <- s$d[1L:rank]
        }
    }
    dimnames(s$v) <- list(colnames(x), paste("PC", seq_len(ncol(s$v)),
        sep = ""))
    r <- list(sdev = s$d, rotation = s$v, center = if (is.null(cen)) FALSE else cen,
        scale = if (is.null(sc)) FALSE else sc)
    if (retx)
        r$x <- x %*% s$v
    class(r) <- "prcomp"
    r
}
<environment: namespace:stats>


me: Thanks a lot; that saves me plenty of searching time.

Yihui: Yeah, it also took me quite a while to understand what S3 methods mean.

me: And what if I want to see biplot.prcomp?

Yihui: The help will tell you that biplot is also a generic function applicable to a class such as prcomp, so:

> getS3method('biplot','prcomp')
function (x, choices = 1:2, scale = 1, pc.biplot = FALSE, ...)
{
    if (length(choices) != 2)
        stop("length of choices must be 2")
    if (!length(scores <- x$x))
        stop(gettextf("object '%s' has no scores", deparse(substitute(x))),
            domain = NA)
    if (is.complex(scores))
        stop("biplots are not defined for complex PCA")
    lam <- x$sdev[choices]
    n <- NROW(scores)
    lam <- lam * sqrt(n)
    if (scale < 0 || scale > 1)
        warning("'scale' is outside [0, 1]")
    if (scale != 0)
        lam <- lam^scale
    else lam <- 1
    if (pc.biplot)
        lam <- lam/sqrt(n)
    biplot.default(t(t(scores[, choices])/lam), t(t(x$rotation[,
        choices]) * lam), ...)
}
<environment: namespace:stats>


Unexpectedly, the theoretically optimal rejection region of the t-test is bounded.

f_{t}\left(x;\mu,df\right)\equiv C\left(df\right)\left(1+\frac{\left(x-\mu\right)^{2}}{df}\right)^{-\frac{df+1}{2}},\qquad\lambda\left(x;\mu_{0},\mu_{1},df\right)\equiv\frac{f_{t}\left(x;\mu_{1},df\right)}{f_{t}\left(x;\mu_{0},df\right)}=\left(\frac{df+\left(x-\mu_{1}\right)^{2}}{df+\left(x-\mu_{0}\right)^{2}}\right)^{-\frac{df+1}{2}}\longrightarrow1\;\text{as}\;x\rightarrow\infty

For the NHST H_{0}:T\sim t_{df} vs H_{1}:T-1\sim t_{df}, theoretically p\left(t\right)=\int_{\left\{ x:\lambda\left(x\right)\ge\lambda\left(t\right)\right\} }f_{t}\left(x;\mu_{0},df\right)dx satisfies \lim_{t\rightarrow\infty}p\left(t\right)=\Pr_{H_{0}}\left(T\ge\frac{1}{2}\right)>0 , rather than zero. Nevertheless, practically, a large t, implausible under both H_{0} and H_{1}, should not be counted as evidence either to retain or to reject H_{0}.

To verify the shape of \lambda\left(x\right) --
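A quick sketch (taking df = 5, mu0 = 0, mu1 = 1 as an illustrative choice): the likelihood ratio peaks at a finite x and returns to 1 as |x| grows, so any level set {lambda >= c} with c > 1 is bounded.

```r
## lambda(x) = f_t(x; mu1=1, df) / f_t(x; mu0=0, df)
lambda <- function(x, df = 5) dt(x - 1, df) / dt(x, df)
x <- seq(-20, 20, by = 0.1)
plot(x, lambda(x), type = "l", ylab = "likelihood ratio")
abline(h = 1, lty = 2)  ## lambda tends to 1 in both tails
```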

Confidence Region and Not-reject Region

Either the Confidence Interval (CI) or the Null Hypothesis Significance Test (NHST) is in the same business: to advise whether some sample X\equiv\left(X_{1},X_{2},\dots,X_{n}\right) is or is not disliked by some hypothesized parameter \vartheta.

NHST.com manages a database. For each Miss \vartheta, NHST spies out everyone she dislikes. Mr. X logs in to NHST.com and inputs a girl's name and his credit card number, betting his luck and whispering -- does she dislike me?

CI.com manages a database too. For each Mr. X, CI only needs his credit card with his name X on it, and then serves him a full list of available girls.

NHST.com has historically monopolized the market. Nevertheless, some people prefer visiting CI.com, and it turns out the two may share a database in most cases.

The not-reject region of \vartheta is defined as A\left(\vartheta\right)\equiv\left\{ x:\vartheta\;\text{does not dislike}\;x\right\} .

The confidence region of x is defined as S\left(x\right)\equiv\left\{ \vartheta:\vartheta\;\text{does not dislike}\;x\right\} .

\vartheta\in S\left(X\right)\Leftrightarrow\vartheta\;\text{does not dislike}\;X\Leftrightarrow X\in A\left(\vartheta\right)

So, Pr_{\vartheta}\left(\vartheta\in S\left(X\right)\right)\ge1-\alpha,\forall\vartheta\Longleftrightarrow Pr_{\vartheta}\left(X\notin A\left(\vartheta\right)\right)\le\alpha,\forall\vartheta
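The database-sharing can be watched in R: for the one-sample t-test, a hypothesized mean mu0 lies inside the 95% CI exactly when the test of H0: mu = mu0 is not rejected at alpha = .05. A sketch on simulated data:

```r
## Duality of the t CI and the t test, checked on a grid of mu0 values
set.seed(2)
x  <- rnorm(30, mean = 0.3)
ci <- t.test(x)$conf.int
inside     <- function(mu0) ci[1] <= mu0 && mu0 <= ci[2]      ## mu0 in S(x)?
not_reject <- function(mu0) t.test(x, mu = mu0)$p.value >= 0.05  ## x in A(mu0)?
agree <- sapply(seq(-1, 1, by = 0.05),
                function(m) inside(m) == not_reject(m))
all(agree)
```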

Automatize LISREL jobs

The LISREL routine can run in DOS or in the Windows command-line mode (Windows key + R -> CMD). The command line looks like --

D:\My Documents>"C:\Program Files\lisrel87\lisrel87.exe" "C:\Program Files\lisrel87\LS8EX\EX61.LS8" D:\myOutput.out

1. You only need to edit and input the bold part.
2. Quotation marks are needed wherever a path or filename includes blanks.
3. The 2nd argument is the output file. You can still specify more output options in your .ls8 file.
4. A .bat file can automate batches of such LISREL jobs.
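For point 4, a minimal sketch of such a .bat file, following the command-line form shown above (the job and output file names here are hypothetical; adjust the paths to your own installation):

```bat
REM run_lisrel_jobs.bat -- run several LISREL jobs in a row
SET LISREL="C:\Program Files\lisrel87\lisrel87.exe"
%LISREL% "D:\My Documents\job1.ls8" "D:\My Documents\job1.out"
%LISREL% "D:\My Documents\job2.ls8" "D:\My Documents\job2.out"
%LISREL% "D:\My Documents\job3.ls8" "D:\My Documents\job3.out"
```

Double-click the .bat file, or schedule it, and the jobs run unattended one after another.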

Developing normal pdf from symmetry & independence

When I was in the 3rd grade of middle school, I used my town's bookstore as a standing library. There, a series of six math-story books by Zhang Yuan-Nan impressed me a lot. I cited a case from one of them in my PPT when I taught the normal distribution -- the normal pdf can be derived from simple symmetry and independence conditions.

Today I can even google out an illegal pdf of its new edition to verify the case (2005, p. 89). Actually, I have bought the new edition of the series (now 3 books) and lent them to students. The conditions are as intuitive as --

1. For white noise errors on 2-D, the independence means pdf at (x,y) is the product of 1-D pdf, that is, \phi\left(x\right)\phi\left(y\right) .

2. The symmetry means pdf at (x,y) is just a function of x^{2}+y^{2}, nothing to do with direction. That is, \phi\left(x\right)\phi\left(y\right)=f\left(x^{2}+y^{2}\right).

So, f\left(x^{2}\right)f\left(y^{2}\right)=f\left(x^{2}+0\right)f\left(0+y^{2}\right)=\phi^{2}\left(0\right)f\left(x^{2}+y^{2}\right).

For middle school students, the book left a gap here before arriving at the final result f\left(x^{2}\right)=ke^{bx^{2}}, which gives \phi\left(x\right)=\frac{1}{\phi\left(0\right)}f\left(x^{2}+0\right)=\frac{k}{\phi\left(0\right)}e^{bx^{2}}.

I think interested non-math graduate students can close the gap by themselves with the following small hints.

Denote \alpha=x^{2},\beta=y^{2}.
We have
\log f\left(\alpha\right)+\log f\left(\beta\right)=\log\phi^{2}\left(0\right)+\log f\left(\alpha+\beta\right),
\;\;\left[\log f\left(\alpha\right)-\log\phi^{2}\left(0\right)\right]+\left[\log f\left(\beta\right)-\log\phi^{2}\left(0\right)\right]   =\left(\log f\left(\alpha+\beta\right)-\log\phi^{2}\left(0\right)\right).
Denote g\left(\alpha\right)=\log f\left(\alpha\right)-\log\phi^{2}\left(0\right) .
That is, g\left(\alpha\right)+g\left(\beta\right)=g\left(\alpha+\beta\right).

Now prove g\left(\frac{m}{n}\right)=\frac{m}{n}g\left(1\right),\forall m,n\in\mathbb{N}. With continuity, this gives g\left(\alpha\right)=\alpha g\left(1\right),\forall\alpha\ge0.
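One way to close that gap, sketched as a standard Cauchy-functional-equation argument (the final remark that b must be negative for integrability is an extra condition beyond the book's statement):

```latex
g(m)=g(\underbrace{1+\dots+1}_{m})=m\,g(1),\qquad
n\,g\!\left(\tfrac{m}{n}\right)=g\!\left(\tfrac{m}{n}+\dots+\tfrac{m}{n}\right)=g(m)
\;\Rightarrow\;g\!\left(\tfrac{m}{n}\right)=\tfrac{m}{n}\,g(1).
% Continuity extends linearity from the rationals to all alpha >= 0, so
g(\alpha)=\alpha\,g(1)
\;\Rightarrow\;f(\alpha)=\phi^{2}(0)\,e^{g(1)\alpha},
% i.e. k = phi^2(0) and b = g(1), with b < 0 so that phi integrates to 1.
```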


I was pleased to discover that my two most-used editing tools on WinXP, SciTE (current version 1.75) and LyX (current version 1.5.3), both support Unicode (that is, they support Chinese). I didn't know this before, simply because the default settings do not support Chinese; the settings must be changed by hand.

For SciTE, go to Options -> Open Global Options File, edit SciTEGlobal.properties, find the following section, and set the code page to 65001 (the UTF-8 code page):

# Unicode
# Required for Unicode to work on GTK+:
code.page=65001
output.code.page=65001

Save, then close and reopen SciTE, and Chinese characters will no longer be cut in half. If a document's encoding is UCS-2 rather than UTF-8, you can also select it temporarily under File -> Encoding.

[update] Besides UTF-8, SciTE also supports GBK, which is more commonly used in mainland China; the setting is as follows:

code.page=936
character.set=134
In addition, I recommend removing the comment '#' from the line.margin.visible=1 and wrap=1 lines in SciTEGlobal.properties; the effect is to show line numbers by default and to wrap over-long lines. SciTE has too many strengths to list: open-source and free; light and quick to start; Ctrl + mouse-wheel continuous zoom; Ctrl+Enter auto-completion from words already appearing in the text; Alt-key rectangular selection; ...

LyX (version >= 1.5.1) on WinXP can already accept Chinese input in the body text and in the math boxes of .lyx files. The trouble is producing a Chinese pdf. [UPDATED update] The latest LyX version (1.6.2) bundled with the MiKTeX installer already supports Chinese (Unicode) very well. Thanks to Mr. joomlagate downstairs for emailing me the tip: the second half of http://cohomo.blogbus.com/logs/31361739.html describes the simple setup for pdf output through XeTeX. I tried it today and the result is very satisfying.


The new LyX version has introduced document version control, comparable to Word's revision feature; I have yet to try it in depth. LyX still does not [update] now does support Ctrl + mouse-wheel continuous zoom; if formulas look too small, change the display zoom in the menu: Tools -> Preferences -> Look and feel -> Screen fonts -> Zoom %. This should be a feature fairly easy to implement in a later version.


SciTE home page: http://www.scintilla.org/SciTE.html

LyX home page: http://lyx.org/

Nankai MiKTeX Chinese add-on: http://miktex.math.nankai.edu.cn/

Introduction to setting up XeTeX Chinese support in LyX: http://cohomo.blogbus.com/logs/31361739.html

My small plugin supporting dark-background LaTeX for the Wordpress / WordPress MU platforms: http://lixiaoxu.lxxm.com/latex_math_cgi

My first wp plugin work: LaTeX_Math_cgi 1.0

Download: LaTeX_Math_cgi.zip

Install: download and unzip it into your wordpress /wp-content/plugins/ directory.

Activate it in Plugins menu. View its options in Options > LaTeX tab --

It is actually a mimeTeX plugin rather than just a L^{A}T_{E}X plugin, and my contribution is technically trivial. But you will need it if you switch from a light theme to a cool black one and find that the default mimeTeX images are black in the foreground and transparent in the background. This plugin provides mimeTeX's \reverse and \opaque options to tune your math images to your wp theme without editing them one by one.

The 2nd function is the option for your cgi URL. The default is http://www.forkosh.dreamhost.com/mimetex.cgi . You can build your own following Forkosh's instructions and then set it to something like /cgi-bin/mimetex.cgi . This helps your visitors avoid consuming Forkosh's unselfish bandwidth.