Tuesday, August 27, 2019

Are you still confused by demographic variables?


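The correlation coefficient between my independent and dependent variables is only -0.031 and is not significant. In this situation, can I add demographic control variables, then add a moderator, and run a regression analysis of the moderating effect? If not, how should I handle it?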


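  The correlation is already non-significant, so adding any control variables will only produce erroneous estimates. Considering a moderator is fine. Demographic variables should not be used as control variables, but they can be used as moderators.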
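
  To illustrate why a moderator is still worth considering when the overall correlation is near zero, here is a minimal sketch on made-up data. The variable names, the two groups, and the numbers are all hypothetical and are not from the reader's study. It only shows that slopes of opposite sign in two subgroups can cancel out in the pooled correlation.

```python
# Hypothetical toy data: x is the predictor, y the outcome,
# and the two groups stand for two levels of a moderator.

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / (sxx * syy) ** 0.5

def ols_slope(xs, ys):
    """Slope of the simple least-squares regression of y on x."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    return sxy / sxx

x_vals = [-2, -1, 0, 1, 2]
group_a = [(x, 1.0 * x) for x in x_vals]    # y rises with x in group A
group_b = [(x, -1.0 * x) for x in x_vals]   # y falls with x in group B
pooled = group_a + group_b

# Correlation ignoring the moderator, then the slope within each group.
overall_r = pearson_r([x for x, _ in pooled], [y for _, y in pooled])
slope_a = ols_slope([x for x, _ in group_a], [y for _, y in group_a])
slope_b = ols_slope([x for x, _ in group_b], [y for _, y in group_b])

print("pooled correlation:", overall_r)
print("slope in group A:", slope_a)
print("slope in group B:", slope_b)
```

  In real data the cancellation is rarely this exact, and a moderation analysis would normally test the interaction term in a single regression rather than compare subgroup slopes by eye.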


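Many articles that test mediation say they did so after controlling for age, gender, level, and similar control variables. How do I write the Mplus syntax to control for these variables? Or do I not need to do anything because they are controlled by default?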

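  This is a mistaken practice, and many people have asked about control variables. First, answer one question: do you know what a control variable is? I suspect the vast majority of people cannot say. So why do studies include control variables? Mostly the reason is "others used them, so I do too," and that has led to a great deal of misuse.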

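  Start from theory: gender and education level are measured on nominal and ordinal scales. Statistically, a control variable is a partial correlation in regression, and it is itself an independent variable. A basic assumption of regression is that the independent variables must be linearly related to the dependent variable. Gender is a categorical variable, so the value it produces is a difference in means, not a slope. Education level cannot have a linear relationship with Y, so entering it is not right either.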
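
  The point about gender producing a difference in means can be checked numerically. The sketch below uses invented scores and a hypothetical 0/1 group code. It fits a simple least-squares regression of Y on the dummy and compares the fitted coefficient with the difference between the two group means.

```python
# Hypothetical toy data: a 0/1 dummy (e.g. a two-category demographic code)
# and an outcome score for each respondent.
dummy = [0, 0, 0, 1, 1, 1]
score = [3.0, 4.0, 5.0, 6.0, 8.0, 10.0]

n = len(dummy)
mean_x = sum(dummy) / n
mean_y = sum(score) / n

# Simple least-squares regression of score on the dummy.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(dummy, score))
sxx = sum((x - mean_x) ** 2 for x in dummy)
slope = sxy / sxx
intercept = mean_y - slope * mean_x

# Group means computed directly.
group0 = [y for x, y in zip(dummy, score) if x == 0]
group1 = [y for x, y in zip(dummy, score) if x == 1]
mean_diff = sum(group1) / len(group1) - sum(group0) / len(group0)

print("coefficient on dummy:", slope)
print("difference in group means:", mean_diff)
print("intercept (mean of group 0):", intercept)
```

  The coefficient on a two-category dummy equals the gap between the group means, and the intercept equals the mean of the group coded 0.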

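  Moreover, a control variable is one that theory says should be significantly related to Y but that is not the focus of the study. So if, after the analysis, the control variable turns out not to be significantly related to Y, the study was wrong to include it as a control.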


Sources cited:
Use specific, well-explored theory to drive the inclusion of controls, which goes beyond simple statements like, “previous researchers used this control” or “this variable is correlated with my outcomes.” If you believe that a specific relationship may be contaminating your results, this may be justification for a control, but you should explicitly state why and defend this decision when describing your methods.

Don’t control for demographic variables, e.g. race, gender, sex, age. For example, if you find a gender difference in your outcome of interest, controlling for that variable may hide real variance in the outcome that could be explained by whatever real phenomenon is causing that difference. In my own research area, it is not uncommon to control for age when examining the effects of technology on outcomes of interest (e.g. learning). But age does not itself cause trouble with technology; instead, underlying differences like familiarity with technology or comfort with technology or other characteristics may be driving those differences. Simply controlling for age not only removes “real” variance that should remain in the equation but also camouflages a real relationship of interest.

Landers, R. N. (2011). Stats and Methods Urban Legend 2: Control Variables Improve Your Study.
Spector, P. E., & Brannick, M. T. (2011). Methodological Urban Legends: The Misuse of Statistical Control Variables. Organizational Research Methods, 14(2), 287-305.


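Can EFA and CFA be run on the same sample?

In an SEM model, one latent variable needs an EFA. Can I first run the EFA on all of the collected samples and then run the SEM on those same samples? Or should I use part of the sample for the EFA and then use the remaining part of the total sample for the SEM?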