## Understanding QQ plots

Try distributions like `rchisq`, `rt`, `runif`, `rf` to view their heavy or light, left or right tails.

```r
n <- 30; ry <- rnorm(n)
qqnorm(ry); qqline(ry)
max(ry); min(ry)  ## view and guess what the x's and y's are
I <- rep(1, n)
## qr are the sample quantiles (fractional ranks, ties counted half)
qr <- ((ry %*% t(I) > I %*% t(ry)) + 0.5 * (ry %*% t(I) == I %*% t(ry))) %*% I * (1 / n)
points(qr, ry, col = "blue")
## to view the fact, try the following
points(qr, qr * 0, col = "green", pch = "|")
rx <- qnorm(qr)
points(rx, ry, col = "red", pch = "O")  ## red O's circle the black o's exactly
```

## 03DEC2007 R workshop sponsored by the Dept. of Psychology, ZSU (= SYSU, Guangzhou)

Here is the updated PPT for the afternoon talk; its third page includes the zipped example code and the set-up steps for the evening workshop. The anonymous on-line test (and its result statistics) on p-value interpretation listed there was cited indirectly from Gigerenzer, Krauss, & Vitouch (2004).

There is an advert at http://www.psy.sysu.edu.cn/detail_news.asp?id=258, and a formal CV of the speaker is available at http://lixiaoxu.googlepages.com

## Classic Neyman-Pearson approach demo

Note that the N-P approach does not use the information in the exact p value. At the time the N-P approach was first devised, exact p values were usually not available. Now almost all statistical software reports exact p values, and this aspect of the N-P approach has become obsolete. Wilkinson & the APA TFSI (1999) recommended reporting the exact p value rather than mere significance/insignificance, unless p is smaller than any meaningful precision.

--

Wilkinson, L. & APA TFSI (1999). Statistical methods in psychology journals: Guidelines and explanations. American Psychologist, 54, 594-604.

## Different correlations from different IV ranges with the same regression coefficient

$Y=\alpha+\beta X+\varepsilon,\;\varepsilon\sim N\left(0,\sigma^{2}\right)$

With $\alpha,\beta, \sigma$ known in the linear relationship, can the correlation in the scatter plot of Y against X be estimated from the linear formula?

You may recall from the Hierarchical Linear Model class that the range of W dramatically impacts the regression coefficients of F~W in the following R demo (hlm.jpg). This time, however, the regression coefficient is fixed at a known $\beta$, so the range of X can never affect it. Nevertheless, it turns out that the correlation r can range from zero to one (or -1) according to the variance of X, by the closed form $r=\frac{\beta\mbox{Var}\left(X\right)}{\mbox{Std}\left(Y\right)\mbox{Std}\left(X\right)}=\beta\frac{\mbox{Std}\left(X\right)}{\sqrt{\beta^{2}\mbox{Var}\left(X\right)+\sigma^{2}}}$.
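The closed form can be checked numerically; here is a minimal R sketch with hypothetical values $\beta=2$, $\sigma=1$:

```r
## r = beta*Std(X)/sqrt(beta^2*Var(X) + sigma^2), with beta and sigma fixed
beta <- 2; sigma <- 1                       # hypothetical known parameters
r_theory <- function(sdx) beta * sdx / sqrt(beta^2 * sdx^2 + sigma^2)
set.seed(1)
for (sdx in c(0.05, 0.5, 5)) {              # narrow to wide ranges of X
  x <- rnorm(1e5, sd = sdx)
  y <- 1 + beta * x + rnorm(1e5, sd = sigma)
  cat(sprintf("sd(X)=%4.2f  empirical r=%.3f  theory r=%.3f\n",
              sdx, cor(x, y), r_theory(sdx)))
}
```

With a narrow X range the correlation nearly vanishes even though $\beta$ is unchanged; with a wide range it approaches unity.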

Let me quote, as the final words, from Cohen (1994, p. 1001; where the role of the IV is replaced by that of the DV within typical contexts like ANOVA) --

... standardized effect size measures, such as d and f, developed in power analysis (Cohen, 1988) are, like correlations, also dependent on population variability of the dependent variable and are properly used only when that fact is kept in mind.

--

Cohen, J. (1994). The earth is round (p<.05). American Psychologist, 49, 997-1003.

--

Compare this to the following case: different correlations from different IV ranges with hierarchical regression coefficients --

## “Effect Size” — same data, different interpretations

Just a short R-script note to illustrate the three-page paper of Rosenthal & Rubin (1982).

Table 1 (p. 167) lays out a simple set-up with a between-subject treatment. The control group includes 34 alive cases and 66 dead cases; the treatment group includes 66 alive cases and 34 dead cases. The question is: what percentage of the variance is explained by the nominal IV indicating group membership?

The authors pointed out that one reader may interpret the result as a death rate reduced by 32 percentage points (from 66% to 34%), while another may describe the very same result as a mere 10.24% of variance explained. To demo it more dramatically: just 4% of explained variance corresponds to a 20-point reduction in death rate.
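A minimal R sketch of the arithmetic, coding the 2×2 table of Table 1 as two 0/1 variables:

```r
## control: 34 alive / 66 dead; treatment: 66 alive / 34 dead
group <- rep(c(0, 1), each = 100)       # 0 = control, 1 = treatment
alive <- c(rep(1, 34), rep(0, 66),      # control group
           rep(1, 66), rep(0, 34))      # treatment group
r <- cor(group, alive)                  # phi coefficient
c(phi = r, var_explained = r^2)         # 0.32 and 0.1024
## the same 0.32 read as rates: death rate by group
tapply(1 - alive, group, mean)          # 0.66 (control) vs 0.34 (treatment)
```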

--

Rosenthal, R. & Rubin, D. B. (1982). A simple, general purpose display of magnitude of experimental effect. Journal of Educational Psychology, 74, 166-169.

## Anscombe’s 4 Regressions — A Trivially Updated Demo

```r
## This is a trivially updated version based on the R document "?anscombe".
require(stats); require(graphics)
anscombe

## now some "magic" to do the 4 regressions in a loop:
ff <- y ~ x
for (i in 1:4) {
  ff[2:3] <- lapply(paste(c("y", "x"), i, sep = ""), as.name)
  assign(paste("lm.", i, sep = ""), lmi <- lm(ff, data = anscombe))
}

## See how close they are (numerically!)
sapply(objects(pattern = "lm\\.[1-4]$"), function(n) coef(get(n)))
lapply(objects(pattern = "lm\\.[1-4]$"), function(n) coef(summary(get(n))))

## Now, do what you should have done in the first place: PLOTS
op <- par(mfrow = c(4, 3), mar = .1 + c(4, 4, 1, 1), oma = c(0, 0, 2, 0))
for (i in 1:4) {
  ff[2:3] <- lapply(paste(c("y", "x"), i, sep = ""), as.name)
  plot(ff, data = anscombe, col = "red", pch = 21, bg = "orange",
       cex = 1.2, xlim = c(3, 19), ylim = c(3, 13))
  abline(get(paste("lm.", i, sep = "")), col = "blue")
  plot(lm(ff, data = anscombe), which = 1, col = "red", pch = 21,
       bg = "orange", cex = 1.2, sub.caption = "", caption = "")
  plot(lm(ff, data = anscombe), which = 2, col = "red", pch = 21,
       bg = "orange", cex = 1.2, sub.caption = "", caption = "")
}
mtext("Anscombe's 4 Regression data sets", outer = TRUE, cex = 1.5)
par(op)
```

--

Anscombe, F. J. (1973). Graphs in statistical analysis. American Statistician, 27, 17-21.

## R: Simulating a multivariate normal distribution with any given correlation matrix

For example, we have a correlation matrix for five standardized factors $\left[\begin{array}{ccccc}1.00 & 0.42 & 0.41 & 0.55 & 0.42\\ 0.42 & 1.00 & 0.48 & 0.47 & 0.46\\ 0.41 & 0.48 & 1.00 & 0.48 & 0.44\\ 0.55 & 0.47 & 0.48 & 1.00 & 0.50\\ 0.42 & 0.46 & 0.44 & 0.50 & 1.00\end{array}\right]$ (Hau, Chinese textbook, pp. 49-50).
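A minimal sketch of one standard recipe: pre-multiply i.i.d. standard normals by a Cholesky factor of the target matrix (`MASS::mvrnorm` packages the same idea):

```r
## target correlation matrix for the five standardized factors
R <- matrix(c(1.00, 0.42, 0.41, 0.55, 0.42,
              0.42, 1.00, 0.48, 0.47, 0.46,
              0.41, 0.48, 1.00, 0.48, 0.44,
              0.55, 0.47, 0.48, 1.00, 0.50,
              0.42, 0.46, 0.44, 0.50, 1.00), nrow = 5)
set.seed(1)
Z <- matrix(rnorm(1e5 * 5), ncol = 5)   # independent N(0,1) columns
X <- Z %*% chol(R)                      # each row ~ N(0, R), since t(U) %*% U = R
round(cor(X), 2)                        # approximately reproduces R
```

Because `chol(R)` returns the upper-triangular factor U with t(U) %*% U = R, the columns of X carry exactly the target correlations in population.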

## The tail(s) of p value

For any given $H_0$ vs. $H_1$, the p value of any given point x is $\underset{\theta\in H_{0}}{\sup}\,P\left(\left\{ z\,|\,L\left(z\right)\ge L\left(x\right)\right\} |\theta\right)$, where $L\left(x\right)\equiv\frac{\underset{\theta\in H_{1}}{\sup}\left[f\left(x|\theta\right)\right]}{\underset{\theta\in H_{0}}{\sup}\left[f\left(x|\theta\right)\right]}$

-- See R. Weber's Statistics Note (Chap 6.2 & 7.1)

I made a wrong comment on the PDF of the Null Ritual chapter (Gigerenzer, Krauss, & Vitouch, 2004), where three types of significance level (rather than p value) were discussed. My comment claimed the chapter had ignored the role of $H_1$ in the definition of the p value. In almost every textbook, two-tailed and one-tailed p values are distinguished; usually, the two-tailed p is defined by an $H_1$ like $\mu\neq0$.

Here I demonstrate a three-tail p value case on the R platform.

```r
z <- (-1000:1000) * 0.02
f <- 0.5 * dchisq(abs(z), df = 5)   ## H0 is chi^2(5) * binomial(-1 vs 1)
h <- dchisq(10, df = 5) * 0.5
plot(z, f, type = "h", col = c("black", "grey")[1 + (f > h)])
lines(c(-20, 20), c(h, h))
```

Do you agree that the region near zero under the "V" curve (the part below the horizontal line) should be the third tail? I think so, provided $H_1$ includes all other possible distributions of the same shape.

You'll also agree there will be two asymmetrical tails if $H_1$ includes just two asymmetrical curves, for example $\mu=-2$ and $\mu=1$ (with $\sigma^2\equiv1$), while $H_0$ is the standard normal distribution.
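For that two-point $H_1$ the likelihood ratio has a closed form, so the two unequal tails can be computed exactly; a sketch for an arbitrary observed point x = 2:

```r
## H0: N(0,1); H1 = {N(-2,1), N(1,1)}
## log L(z) = max(-2z - 2, z - 0.5): a V shape with slopes -2 and +1
logL <- function(z) pmax(dnorm(z, -2, log = TRUE), dnorm(z, 1, log = TRUE)) -
                    dnorm(z, log = TRUE)
x <- 2                           # an arbitrary observed point
left  <- -(logL(x) + 2) / 2      # solve -2z - 2 = logL(x) for the left cutoff
right <-  logL(x) + 0.5          # solve  z - 0.5 = logL(x) for the right cutoff
p <- pnorm(left) + pnorm(right, lower.tail = FALSE)
c(left = left, right = right, p = p)   # two tails of unequal length
```

The rejection region $\{z\,|\,L(z)\ge L(x)\}$ here is $(-\infty,-1.75]\cup[2,\infty)$: asymmetrical tails, as claimed.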

## The Popperian falsifiability behind the Regression Discontinuity Design (RDD)

Figure linked from http://www.socialresearchmethods.net/kb/statrd.php (Trochim, W., 2006, Figure 2). The red line is the fallacious treatment effect.

Causal analysis entails a counterfactual comparison between the treatment and the control conditions (Mark, 2003; Maris, 1998). To define a causal effect, two imaginary latent groups are introduced: the comparison is between identical subjects in the actual treatment group and in an imaginary control group, or vice versa. For example, student A registered her RSS online and missed the collective entertainment these days, while student B did not bother to register and took part in the entertainment. To ask whether RSS attendance caused the skipped entertainment, the causal statement compares the actual A with RSS attendance to an imaginary A without it, rather than the actual A to the actual B.

A full experimental design with randomization ensures that the two actual groups are identical in population before treatment. The identity covers both the pretest and the relationship between post-test and pretest, so the mean post-test of the imaginary control group can be unbiasedly estimated from, and then replaced by, that of the actually observed control group, or vice versa.

Nevertheless, RDD only assumes that the two actual groups are identical in the relationship between post-test and pretest, plus that this relationship is modeled appropriately. It usually also assumes the two groups were divided by a cutoff on the pretest, though this is not strictly necessary. In my opinion, RDD is a special instance of bi-group analysis. A typical RDD context is teaching students in accordance with their aptitude (in Chinese, 因材施教).

The critical difference between a full experimental design and RDD is that, in RDD, the identity and the model of the pre-post relationship between the two actual groups are just hypotheses to be tested by Popperian falsifiability, while the population identity between groups in a full experimental design is free of uncertainty thanks to manipulated randomization. If the relationship between pretest and post-test is curvilinear or otherwise non-linear, a linear regression analysis will report a fallacious treatment effect (Trochim, 2006, Figure 2).
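That failure mode is easy to reproduce. A sketch with hypothetical numbers: a cubic pre-post relationship, no true treatment effect at all, assignment by a pretest cutoff, then the usual linear RDD analysis:

```r
set.seed(1)
pre   <- rnorm(500, 50, 10)
## curvilinear (cubic) relationship, NO treatment effect anywhere
post  <- 0.5 * pre + 0.001 * (pre - 50)^3 + rnorm(500, 0, 3)
treat <- as.numeric(pre >= 50)      # assignment by the pretest cutoff
fit   <- lm(post ~ pre + treat)     # the mis-specified linear RDD analysis
coef(summary(fit))["treat", ]       # a sizable, "significant", fallacious effect
```

Refitting with the correct cubic term (`lm(post ~ poly(pre, 3) + treat)`) makes the spurious effect vanish, which is exactly the point about appropriate relationship modeling.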

If we had precision comparable to classic physics experiments, the relationship between pre- and post-tests could be established with high Popperian falsifiability; the true model would be recognized without uncertainty, and statistical hypothesis tests would be superfluous. In fact, we have only a typical .7 or .8 reliability in social science measurement, and usually an approximation to the true model (like RMSEA in SEM) is necessary. An RDD conclusion therefore relies critically on the assumption of appropriate relationship modeling.

There are two conventional models to compare two groups: gain scores (Gain) vs. residuals with covariate adjustment (Cov. Adj.). Maris (1998) discussed them in depth. The difference between them in the Lord's paradox context is well known to researchers. However, there are still many confusions, some of which Maris cleared up or attempted to. He asserted that regression toward the mean (RTM) and the biases of the Gain model do not imply one another, and that measurement errors need not be the reason for the biases of the Gain model. Note that Maris explicitly stated that his RTM definition differs from some versions in the earlier literature (p. 322); if ubiquitousness should be a feature of RTM, Maris's definition does not meet that criterion.

Maris pointed out that a sufficient condition for the Gain model to be unbiased is that the gain scores are independent of the groups (p. 320). A stronger sufficient condition is that the gain (= posttest - pretest) scores are independent of the pretests; in a figure, this amounts to a constant unit slope for each regression line. Such a relationship between posttest and pretest is more constrained than the general linear relationship of Cov. Adj., just as the latter is more constrained than a curvilinear relationship. Given the low level of Popperian falsifiability in the modeling, the constraints of the relationship will remain a source of controversy among researchers.
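A sketch of the Gain vs. Cov. Adj. contrast with hypothetical numbers: no true treatment effect, groups differing at pretest, and a true pre-post slope of 0.5 rather than the unit slope the Gain model needs:

```r
set.seed(2)
g    <- rep(0:1, each = 500)                      # two pre-existing groups
pre  <- rnorm(1000, mean = 50 + 10 * g, sd = 10)  # groups differ at pretest
post <- 10 + 0.5 * pre + rnorm(1000, 0, 5)        # same relationship, no effect
coef(lm(I(post - pre) ~ g))["g"]   # Gain model: biased (around -5 here)
coef(lm(post ~ pre + g))["g"]      # Cov. Adj.: approximately 0
```

With a true unit slope (replace 0.5 by 1) both estimates would be near zero, matching the constant-unit-slope condition above.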

--

Maris, E. (1998). Covariance Adjustment Versus Gain Scores – Revisited. Psychological Methods, 3, 309-327.

Mark, M. M. (2003). Program evaluation. In Schinka, J. A. & Velicer, W. F. (Eds.), Handbook of psychology. Vol. 2: Research methods in psychology. (pp. 323-347). New York: Wiley.

Trochim, W. (2006). Regression-Discontinuity Analysis. Retrieved Sep. 15, 2007, from http://www.socialresearchmethods.net/kb/statrd.php