## R: Simulating a multivariate normal distribution with any given correlation matrix

For example, suppose we have the correlation matrix of five standardized factors $\left[\begin{array}{ccccc}1.00 & 0.42 & 0.41 & 0.55 & 0.42\\ 0.42 & 1.00 & 0.48 & 0.47 & 0.46\\ 0.41 & 0.48 & 1.00 & 0.48 & 0.44\\ 0.55 & 0.47 & 0.48 & 1.00 & 0.50\\ 0.42 & 0.46 & 0.44 & 0.50 & 1.00\end{array}\right]$ (Hau, Chinese Textbook, pp. 49-50).
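One way to simulate such data is sketched below. The post does not name a method, so the Cholesky approach is an assumption here (`MASS::mvrnorm` would be an alternative): independent standard normal draws are multiplied by the Cholesky factor of the matrix. The names `R_target`, `n`, `U`, `Z` and `X` are illustrative only.

```r
# Target correlation matrix, copied from the text above
R_target <- matrix(c(
  1.00, 0.42, 0.41, 0.55, 0.42,
  0.42, 1.00, 0.48, 0.47, 0.46,
  0.41, 0.48, 1.00, 0.48, 0.44,
  0.55, 0.47, 0.48, 1.00, 0.50,
  0.42, 0.46, 0.44, 0.50, 1.00
), nrow = 5, byrow = TRUE)

set.seed(1)                      # arbitrary seed, for reproducibility only
n <- 10000                       # number of simulated cases (assumption)

U <- chol(R_target)              # upper triangular, t(U) %*% U equals R_target
Z <- matrix(rnorm(n * 5), nrow = n, ncol = 5)  # independent N(0, 1) columns
X <- Z %*% U                     # rows of X have covariance t(U) %*% U

round(cor(X), 2)                 # should be close to R_target
```

Because the factors are standardized, the covariance matrix and the correlation matrix coincide, so no rescaling of `X` is needed in this sketch.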

## The tail(s) of a p value

For any given $H_0$ vs. $H_1$, the p value of an observed point $x$ is $\sup_{\theta\in H_{0}}P\left(\left\{ z \mid L\left(z\right)\ge L\left(x\right)\right\} \mid \theta\right)$, where $L\left(x\right)\equiv\frac{\sup_{\theta\in H_{1}}f\left(x\mid\theta\right)}{\sup_{\theta\in H_{0}}f\left(x\mid\theta\right)}$ is the likelihood ratio.

-- See R. Weber's Statistics Note (Chap 6.2 & 7.1)

I made a mistaken comment on the PDF of Null Ritual (Gigerenzer, Krauss, & Vitouch, 2004), where three types of significance level (rather than p value) are discussed. I had written the comment to point out that the chapter ignores the role of $H_1$ in the definition of the p value. Almost every textbook distinguishes the two-tailed p from the one-tailed p. Usually, the two-tailed p is defined by an $H_1$ such as $\mu\neq0$.

Here I demonstrate a case of a three-tailed p value in R.

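    # H0: a chi-square(5) variable multiplied by a random sign (-1 vs 1)
    z <- (-1000:1000) * 0.02             # grid from -20 to 20
    f <- 0.5 * dchisq(abs(z), df = 5)    # density under H0
    h <- 0.5 * dchisq(10, df = 5)        # density at the reference point |z| = 10
    plot(z, f, type = "h", col = c("black", "grey")[1 + (f > h)])  # grey where f > h
    lines(c(-20, 20), c(h, h))           # horizontal line at height h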

Do you agree that the region near zero under the "V" part of the curve (which lies below the horizontal line) should be the third tail? I think so, provided that $H_1$ includes all other possible distributions of the same shape.
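As a rough numerical check, the three-tailed p value at $|x|=10$ can be sketched as follows. It assumes that the rejection region consists of all points whose $H_0$ density is no higher than at the observed point. The names `a`, `p_outer`, `p_inner` and `p_three` are introduced here for illustration.

```r
df0 <- 5
x_obs <- 10
h <- dchisq(x_obs, df = df0)     # chi-square density at |x| = 10

# The chi-square(5) density rises from 0 to its mode at df - 2 = 3,
# so there is a second point a < 3 with the same density as x_obs.
a <- uniroot(function(t) dchisq(t, df = df0) - h,
             lower = 1e-8, upper = df0 - 2, tol = 1e-10)$root

# The random sign splits the mass evenly between both sides, so
# probabilities about |Z| are plain chi-square(5) probabilities.
p_outer <- pchisq(x_obs, df = df0, lower.tail = FALSE)  # the two outer tails
p_inner <- pchisq(a, df = df0)                          # the tail around zero
p_three <- p_outer + p_inner

c(a = a, p_outer = p_outer, p_inner = p_inner, p_three = p_three)
```

The inner tail adds only a small amount to the outer tails here, but it is not zero.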

You'll also agree that there will be two asymmetrical tails if $H_1$ includes just two asymmetrical curves, for example $\mu=-2$ and $\mu=1$ (with $\sigma^2\equiv1$), while $H_0$ is the standard normal distribution.
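The asymmetry can be made concrete with a small sketch based on the likelihood-ratio definition of the p value given earlier. For a unit-variance normal with mean $\mu$ against the standard normal, the log likelihood ratio is $\mu x-\mu^{2}/2$, so here $\log L(x)=\max(-2x-2,\;x-0.5)$. The function names `log_lr` and `p_value` are hypothetical.

```r
# Log likelihood ratio of H1 = {mu = -2, mu = 1} against H0 = {mu = 0}
log_lr <- function(x) pmax(-2 * x - 2, x - 0.5)

# p value of x: probability under H0 that log_lr(Z) >= log_lr(x)
p_value <- function(x) {
  c0 <- log_lr(x)
  left  <- -(c0 + 2) / 2   # -2z - 2 >= c0  holds for  z <= left
  right <- c0 + 0.5        #  z - 0.5 >= c0 holds for  z >= right
  pnorm(left) + pnorm(right, lower.tail = FALSE)
}

# Points at the same distance from zero get different p values,
# and the two tail cutoffs are not mirror images of each other.
p_value(c(-2, 2))
```

The minimum of $L$ is at $x=-0.5$ rather than at zero, which is why the two tails are not placed symmetrically around the mean of $H_0$.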

## The Popperian falsifiability behind Regression Discontinuity Design (RDD)

Figure linked from http://www.socialresearchmethods.net/kb/statrd.php (Trochim, W., 2006, Figure 2). The red line is the spurious treatment effect.

Causal analysis entails a counterfactual comparison between the treatment and the control conditions (Mark, 2003; Maris, 1998). To define a causal effect, two imaginary latent groups are introduced, one for each actual group. The comparison is between identical subjects in the actual treatment group and in an imaginary control group, or vice versa. For example, student A registered her RSS online and has missed the group entertainment these days. Student B did not bother to register her RSS and took part in the group entertainment. To ask whether RSS attendance caused A to skip the entertainment is to compare the actual A, with RSS attendance, to an imaginary A without RSS attendance, rather than the actual A to the actual B.

A full experimental design with randomization ensures that the two actual groups are identical in population before treatment. The identity covers both the pretest and the relationship between post-test and pretest, so the mean post-test of the imaginary control group can be estimated without bias from, and then replaced by, that of the actually observed control group, or vice versa.

RDD, by contrast, only assumes that the two actual groups are identical in the relationship between post-test and pretest, and that this relationship is modeled appropriately. It usually also assumes that the two groups are divided by a cutoff on the pretest, although this is not necessary. In my opinion, RDD is a special instance of two-group analysis. A typical RDD context is teaching students in accordance with their aptitude (in Chinese, 因材施教).

The critical difference between a full experimental design and RDD is that, in RDD, the identity of the pre-post relationship across the two actual groups and the model chosen for it are only hypotheses to be tested in the sense of Popperian falsifiability, whereas in a full experimental design the population identity between groups is free of that uncertainty thanks to manipulated randomization. If the relationship between pretest and post-test is curvilinear or otherwise non-linear, a linear regression analysis will report a spurious treatment effect (Trochim, 2006, Figure 2).
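A small simulation can illustrate the point. This is a sketch under invented assumptions: the cutoff at -0.5, the quadratic pre-post relationship, the noise level and all variable names are made up for the illustration and are not taken from Trochim's figure. The data contain no treatment effect at all.

```r
set.seed(42)                      # arbitrary seed
n <- 5000
pre   <- rnorm(n)                 # standardized pretest
treat <- as.numeric(pre < -0.5)   # assignment by a cutoff on the pretest
post  <- pre + 0.5 * pre^2 + rnorm(n, sd = 0.5)  # curvilinear, true effect = 0

fit_linear    <- lm(post ~ pre + treat)              # misspecified model
fit_quadratic <- lm(post ~ pre + I(pre^2) + treat)   # matches the simulation

effect_linear    <- unname(coef(fit_linear)["treat"])
effect_quadratic <- unname(coef(fit_quadratic)["treat"])
c(linear = effect_linear, quadratic = effect_quadratic)
```

Under these assumptions the linear model attributes part of the curvature to the treatment, while the quadratic model gives an estimate near zero.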

If we had precision comparable to classical physics experiments, the relationship between pretest and post-test could be established with high Popperian falsifiability. The true model would then be recognized without uncertainty, and statistical hypothesis tests would be superfluous. In practice, social science measurement typically has a reliability of only .7 or .8, and the true model usually has to be approximated (as with RMSEA in SEM). An RDD conclusion therefore depends critically on the assumption that the relationship is modeled appropriately.

There are two conventional models for comparing two groups: gain scores (Gain) and residuals with covariate adjustment (Cov. Adj.). Maris (1998) discussed them in depth. The difference between them in the context of Lord's paradox is well known to researchers. However, a good deal of confusion remains, some of which Maris cleared up or tried to clear up. He asserted that regression toward the mean (RTM) and bias in the Gain model do not imply one another, and that measurement error need not be the reason for bias in the Gain model. Note that Maris explicitly stated that his definition of RTM differs from some versions in the earlier literature (p. 322). If ubiquity is supposed to be a feature of RTM, Maris's definition does not meet this criterion.

Maris pointed out that a sufficient condition for the Gain model to be unbiased is that the gain scores are independent of the groups (p. 320). A stronger sufficient condition is that the gain scores (= post-test - pretest) are independent of the pretest. Graphically, this amounts to a constant unit slope for every regression line. Such a relationship between post-test and pretest is more constrained than the general linear relationship assumed by Cov. Adj., just as the latter is more constrained than a curvilinear relationship. Given the low level of Popperian falsifiability in the modeling, these constraints on the relationship will remain a source of controversy among researchers.
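The role of the unit slope can be sketched with a simulation. The data-generating model is an assumption made for this illustration: a linear pre-post relationship with a common slope, a true group effect `tau` of 1, and groups that differ by 1 in their pretest means. The function `simulate_estimates` and its arguments are hypothetical names.

```r
set.seed(7)                       # arbitrary seed

simulate_estimates <- function(slope, n = 5000, tau = 1) {
  group <- rep(c(0, 1), each = n)
  pre   <- rnorm(2 * n, mean = ifelse(group == 1, 1, 0))  # groups differ at pretest
  post  <- slope * pre + tau * group + rnorm(2 * n, sd = 0.5)
  gain  <- post - pre
  c(gain    = unname(coef(lm(gain ~ group))["group"]),        # Gain model
    cov_adj = unname(coef(lm(post ~ pre + group))["group"]))  # Cov. Adj. model
}

unit_slope <- simulate_estimates(slope = 1)    # gain is independent of pretest
half_slope <- simulate_estimates(slope = 0.5)  # gain depends on pretest
rbind(unit_slope, half_slope)
```

In this setup both estimates should land near `tau` when the slope is one. With a slope of 0.5, the Gain estimate is shifted by (slope - 1) times the pretest difference between the groups, while Cov. Adj. still recovers `tau` because its linear model matches the simulated data.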

--

Maris, E. (1998). Covariance Adjustment Versus Gain Scores – Revisited. Psychological Methods, 3, 309-327.

Mark, M. M. (2003). Program evaluation. In Schinka, J. A. & Velicer, W. F. (Eds.), Handbook of psychology. Vol. 2: Research methods in psychology. (pp. 323-347). New York: Wiley.

Trochim, W. (2006). Regression-Discontinuity Analysis. Retrieved Sep. 15, 2007, from http://www.socialresearchmethods.net/kb/statrd.php