DV predicted by two IVs, vs. triangular pyramid

-- Diagram from Wiki

It is easier to picture the relations among three spatial vectors through their angles than through their correlations. For standardized DV Y=\left(y_{1},y_{2},\dots,y_{N}\right)^{\tau} and IVs X_{1}=\left(x_{1,1},x_{2,1},\dots,x_{N,1}\right)^{\tau}, X_{2}=\left(x_{1,2},x_{2,2},\dots,x_{N,2}\right)^{\tau}, the cosines of the three angles of the triangular pyramid determine the correlation matrix (each cosine is one pairwise correlation), and hence all statistics of the regressions Y=\beta_{1}X_{1}+\beta_{2}X_{2}+\varepsilon and Y=\beta_{1}X_{1}+\varepsilon . Some unexpected but easily visualized results on the impact of introducing X_{2} are:

1. Both IVs are nearly uncorrelated with the DV. Together they predict the DV very well, with R^{2}\approx 0.91 (\angle YX_{1}=\angle YX_{2}=89^{\circ} and \angle X_{1}X_{2}=177.9^{\circ}).

2. Both IVs are almost perfectly correlated with the DV. Together, one of the regression coefficients is clearly negative (\angle YX_{1}=1^{\circ}, \angle YX_{2}=0.6^{\circ} and \angle X_{1}X_{2}=0.5^{\circ}).

3. Redundancy (Cohen, Cohen, West, & Aiken, 2003) increases to full, then decreases to zero and even becomes negative (\angle YX_{1}=60^{\circ}, \angle YX_{2}=45^{\circ}, while \angle X_{1}X_{2} closes from 90^{\circ} to 45^{\circ} and then to 15^{\circ}+\epsilon ).
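The three cases above can be checked numerically. The following is a minimal sketch, not the author's own code: it assumes the standard formulas for standardized regression with two predictors, and the function names are hypothetical. The redundancy measure used here, r_{Y1}^{2}+r_{Y2}^{2}-R^{2}, is an assumption of this sketch rather than a definition quoted from Cohen et al., and 15.01^{\circ} stands in for 15^{\circ}+\epsilon.

```python
import numpy as np


def regression_from_angles(angle_y_x1, angle_y_x2, angle_x1_x2):
    """Return (beta1, beta2, R^2) for standardized Y regressed on X1 and X2.

    Angles are in degrees; each cosine is treated as a pairwise correlation.
    """
    r_y1, r_y2, r_12 = np.cos(np.radians([angle_y_x1, angle_y_x2, angle_x1_x2]))
    r_xx = np.array([[1.0, r_12], [r_12, 1.0]])  # correlations among the IVs
    r_xy = np.array([r_y1, r_y2])                # correlations of each IV with Y
    beta = np.linalg.solve(r_xx, r_xy)           # standardized coefficients
    r_squared = float(beta @ r_xy)               # R^2 = beta' r_xy
    return float(beta[0]), float(beta[1]), r_squared


def redundancy(angle_y_x1, angle_y_x2, angle_x1_x2):
    """Assumed measure for this sketch: r_Y1^2 + r_Y2^2 - R^2."""
    r_y1, r_y2 = np.cos(np.radians([angle_y_x1, angle_y_x2]))
    _, _, r_squared = regression_from_angles(angle_y_x1, angle_y_x2, angle_x1_x2)
    return float(r_y1 ** 2 + r_y2 ** 2 - r_squared)


if __name__ == "__main__":
    # Cases 1 and 2: (angle Y-X1, angle Y-X2, angle X1-X2) in degrees.
    for label, angles in [("case 1", (89.0, 89.0, 177.9)),
                          ("case 2", (1.0, 0.6, 0.5))]:
        b1, b2, r2 = regression_from_angles(*angles)
        print(f"{label}: beta1={b1:.3f}, beta2={b2:.3f}, R^2={r2:.3f}")

    # Case 3: close the angle between X1 and X2 while the other two stay fixed.
    for angle in (90.0, 45.0, 15.01):
        _, _, r2 = regression_from_angles(60.0, 45.0, angle)
        red = redundancy(60.0, 45.0, angle)
        print(f"case 3, X1-X2 angle {angle}: R^2={r2:.3f}, redundancy={red:.3f}")
```

Working from the correlation matrix rather than from simulated data keeps the sketch independent of N, which matches the point that the three angles alone fix these statistics.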




--
Cohen, J., Cohen, P., West, S. G., & Aiken, L. S. (2003). Applied multiple regression/correlation analysis for the behavioral sciences (3rd ed.). Mahwah, NJ: Lawrence Erlbaum Associates.

The geometry of degrees of freedom: the squared length of the residual vector after projecting onto the intercept

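This is the diagram from "The geometry of the correlation coefficient: the cosine of the angle between residual vectors after projecting onto the intercept". It happens to explain why the \chi^2 distribution followed by \sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2} has df n-1 rather than n.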

Here X_{i}\equiv\mu+\varepsilon_{i}, and \left[\begin{array}{c}\varepsilon_{1}\\\varepsilon_{2}\\\vdots\\\varepsilon_{n}\end{array}\right] is a standard normal random vector in n-dimensional space. Because \mu cancels in each difference, it is easy to see that \sum_{i=1}^{n}\left(X_{i}-\bar{X}\right)^{2}=\sum_{i=1}^{n}\left(\varepsilon_{i}-\bar{\varepsilon}\right)^{2}. This expression is the squared length of the vector \left[\begin{array}{c}\varepsilon_{1}\\\varepsilon_{2}\\\vdots\\\varepsilon_{n}\end{array}\right]-\left[\begin{array}{c}\bar{\varepsilon}\\\bar{\varepsilon}\\\vdots\\\bar{\varepsilon}\end{array}\right]. We already know that \left[\begin{array}{c}\bar{\varepsilon}\\\bar{\varepsilon}\\\vdots\\\bar{\varepsilon}\end{array}\right] is the projection of \left[\begin{array}{c}\varepsilon_{1}\\\varepsilon_{2}\\\vdots\\\varepsilon_{n}\end{array}\right] onto the intercept vector (the gnomon of the sundial) \left[\begin{array}{c}1\\1\\\vdots\\1\end{array}\right]. Naturally, \left[\begin{array}{c}\varepsilon_{1}\\\varepsilon_{2}\\\vdots\\\varepsilon_{n}\end{array}\right]-\left[\begin{array}{c}\bar{\varepsilon}\\\bar{\varepsilon}\\\vdots\\\bar{\varepsilon}\end{array}\right] is the residual vector of the projection onto the intercept, that is, the projection onto the dial plate of the sundial.
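For readers who want the projection step spelled out, here is a short worked derivation. The symbol \mathbf{1} for the intercept vector of ones is notation introduced only for this sketch; \tau denotes the transpose.

```latex
% Projection of \varepsilon onto the intercept vector \mathbf{1} = (1, 1, \dots, 1)^{\tau}
\frac{\mathbf{1}^{\tau}\varepsilon}{\mathbf{1}^{\tau}\mathbf{1}}\,\mathbf{1}
  = \frac{\sum_{i=1}^{n}\varepsilon_{i}}{n}\,\mathbf{1}
  = \bar{\varepsilon}\,\mathbf{1}

% The residual is orthogonal to the intercept vector
\mathbf{1}^{\tau}\left(\varepsilon-\bar{\varepsilon}\,\mathbf{1}\right)
  = \sum_{i=1}^{n}\varepsilon_{i}-n\bar{\varepsilon}
  = 0
```

The second line is what confines the residual to the (n-1)-dimensional subspace orthogonal to \mathbf{1}.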

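For the space the sundial lives in, n is 3. If we sample \left[\begin{array}{c}\varepsilon_{1}\\\varepsilon_{2}\\\varepsilon_{3}\end{array}\right] many times, we see a scatter plot of the standard normal distribution in three-dimensional space, symmetric in every direction. The projection of this scatter onto the dial plate is a scatter plot of the two-dimensional standard normal distribution. The squared lengths of the vectors corresponding to these points on the dial plate are therefore samples from \chi^2_{df=2}.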
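The sundial picture can be checked with a small simulation. This is a rough sketch under stated assumptions, not part of the original post: the function name, seed and number of draws are arbitrary choices, and the check relies only on the known facts that a \chi^2 variable with df degrees of freedom has mean df and variance 2df.

```python
import numpy as np


def residual_squared_lengths(n=3, draws=100_000, seed=0):
    """Squared lengths of standard normal vectors after removing their
    projection onto the intercept vector of ones (hypothetical helper)."""
    rng = np.random.default_rng(seed)
    eps = rng.standard_normal((draws, n))           # each row is one sampled vector
    projection = eps.mean(axis=1, keepdims=True)    # mean of each row, broadcast over the row
    residual = eps - projection                     # component lying in the dial plate
    return (residual ** 2).sum(axis=1), residual


if __name__ == "__main__":
    sq_len, residual = residual_squared_lengths()
    # Residuals are orthogonal to the intercept vector, so they live in n-1 dimensions.
    print("max |residual . ones|:", np.abs(residual @ np.ones(3)).max())
    # For chi-square with df = 2 the mean is 2 and the variance is 4;
    # df = 3 would give 3 and 6 instead.
    print("sample mean:", sq_len.mean())
    print("sample variance:", sq_len.var())
```

Changing n in the call should move the sample mean towards n-1, which is the degrees-of-freedom claim in numerical form.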