Logistic Regression
Recipe 45: Logistic Regression
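1. When to use: to predict a binary dependent variable from several independent variables, and to assess the direction (trend) and strength of the association between the dependent variable and each independent variable.
2. Analysis type: multivariate statistical analysis.
3. Example data: R's built-in mtcars dataset
head(mtcars) # head() lists the first rows of mtcars; the result is shown below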
                   mpg cyl disp  hp drat    wt  qsec vs am gear carb
Mazda RX4         21.0   6  160 110 3.90 2.620 16.46  0  1    4    4
Mazda RX4 Wag     21.0   6  160 110 3.90 2.875 17.02  0  1    4    4
Datsun 710        22.8   4  108  93 3.85 2.320 18.61  1  1    4    1
Hornet 4 Drive    21.4   6  258 110 3.08 3.215 19.44  1  0    3    1
Hornet Sportabout 18.7   8  360 175 3.15 3.440 17.02  0  0    3    2
Valiant           18.1   6  225 105 2.76 3.460 20.22  1  0    3    1
# Note: the dependent variable in a (binary) logistic regression must be a 0/1 variable, such as vs and am in mtcars.
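To see which mtcars columns actually qualify as a binary response, each column can be tested for values other than 0 and 1. This is only a minimal sketch; the helper name is_binary is made up for illustration and does not come from the original post.

```r
# Hypothetical helper: TRUE when a column contains only 0s and 1s
is_binary <- function(x) all(x %in% c(0, 1))

# Apply the check to every column of the built-in mtcars data
binary_cols <- names(mtcars)[sapply(mtcars, is_binary)]
print(binary_cols)

# Cross-tabulate the two candidate response variables
print(table(vs = mtcars$vs, am = mtcars$am))
```

A check like this is useful before calling glm with family = binomial, since a numeric column with other values would not be a valid 0/1 response.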
4. Logistic regression in R, method 1:
Step 1: Fit the regression with glm and store the result in L_reg
L_reg <- glm(formula = vs ~ mpg + cyl + disp + hp + drat + wt + qsec + gear + carb,
             data = mtcars, family = binomial)
Step 2: Pass L_reg to the base R function summary to display the result.
summary(L_reg)
Step 3: Interpret the result
Call:
glm(formula = vs ~ mpg + cyl + disp + hp + drat + wt + qsec +
    gear + carb, family = binomial, data = mtcars)

Deviance Residuals:
       Min          1Q      Median          3Q         Max
-1.213e-05  -2.110e-08  -2.110e-08   2.110e-08   1.597e-05

Coefficients:
              Estimate Std. Error z value Pr(>|z|)
(Intercept) -7.678e+02  3.745e+06       0        1
mpg          5.129e+00  7.870e+04       0        1
cyl         -1.009e+01  1.378e+05       0        1
disp        -1.227e-01  1.036e+04       0        1
hp           3.940e-01  8.098e+03       0        1
drat        -1.611e+01  3.484e+05       0        1
wt           8.632e+00  8.681e+05       0        1
qsec         3.704e+01  2.977e+05       0        1
gear         8.156e+00  5.505e+05       0        1
carb         1.196e+01  2.033e+05       0        1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 4.386e+01  on 31  degrees of freedom
Residual deviance: 9.265e-10  on 22  degrees of freedom
AIC: 20

Number of Fisher Scoring iterations: 25
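# No usable model was found. The enormous standard errors, the near-zero residual deviance, and the 25 Fisher scoring iterations (glm's default maximum) are signs of complete separation: the predictors classify vs perfectly in this data, so the coefficient estimates are not reliable.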
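The symptoms of this failed fit can also be checked programmatically rather than read off the summary. The sketch below refits the same model and inspects three quantities; the variable names max_gap and max_se are illustrative, and the warnings are silenced only to keep the output short.

```r
# Refit the method 1 model, silencing the warnings glm() raises for this fit
L_reg <- suppressWarnings(
  glm(vs ~ mpg + cyl + disp + hp + drat + wt + qsec + gear + carb,
      data = mtcars, family = binomial)
)

# Did the iterative fitting algorithm report convergence?
print(L_reg$converged)

# Largest gap between a fitted probability and the observed 0/1 outcome;
# a value near zero means every car is classified perfectly
max_gap <- max(abs(fitted(L_reg) - mtcars$vs))
print(max_gap)

# Largest standard error among the estimated coefficients
max_se <- max(summary(L_reg)$coefficients[, "Std. Error"])
print(max_se)
```

A near-zero gap combined with very large standard errors suggests that the model has too many predictors for 32 observations, which is one motivation for the smaller model used in method 2.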
5. Logistic regression in R, method 2:
Step 1: Select a subset of the mtcars columns as a new dataset and store it in s_mtcars
s_mtcars <- mtcars[, c("am", "cyl", "hp", "wt", "mpg")]
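Step 2: Install (if needed) and load the MASS and magrittr packages
install.packages(c("MASS", "magrittr")) # only needed if the packages are not installed yet
library(magrittr)
library(MASS)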
Step 3: Fit the full model (all variables in s_mtcars) with glm and store the result in full_model
full_model <- glm(formula = am ~ ., data = s_mtcars, family = binomial)
Step 4: Pass full_model to the base R function summary to display the result.
summary(full_model)
Step 5: Interpret the result
Call:
glm(formula = am ~ ., family = binomial, data = s_mtcars)

Deviance Residuals:
     Min        1Q    Median        3Q       Max
-1.71610  -0.03708  -0.00170   0.00092   1.47854

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) -37.0580    66.4149  -0.558    0.577
cyl           1.1824     1.4395   0.821    0.411
hp            0.1005     0.1138   0.883    0.377
wt           -9.1639     5.3373  -1.717    0.086 .
mpg           2.1574     2.8385   0.760    0.447
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 43.2297  on 31  degrees of freedom
Residual deviance:  7.8182  on 27  degrees of freedom
AIC: 17.818

Number of Fisher Scoring iterations: 11
Step 6: Select the independent variables with stepAIC and store the result in step_model
step_model <- full_model %>% stepAIC(trace = FALSE)
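The effect of step 6 can be checked by comparing the AIC of the full and the selected model. The sketch below repeats steps 1, 3, and 6 so that it runs on its own; it calls stepAIC directly instead of through the pipe, which should be equivalent, and it silences warnings because fits on this small dataset may warn about fitted probabilities of 0 or 1. The name aic_table is illustrative.

```r
library(MASS)  # provides stepAIC()

# Same subset and full model as in steps 1 and 3
s_mtcars <- mtcars[, c("am", "cyl", "hp", "wt", "mpg")]
full_model <- suppressWarnings(
  glm(am ~ ., data = s_mtcars, family = binomial)
)

# Stepwise selection by AIC, without printing each intermediate step
step_model <- suppressWarnings(stepAIC(full_model, trace = FALSE))

# Which predictors survived, and how do the two models compare?
print(formula(step_model))
aic_table <- AIC(full_model, step_model)
print(aic_table)
```

A lower AIC indicates a better trade-off between fit and number of parameters, which is the criterion stepAIC uses when it adds or drops terms.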
Step 7: Pass step_model to the base R function summary to display the result.
summary(step_model)
Step 8: Interpret the result
Call:
glm(formula = am ~ hp + wt, family = binomial, data = s_mtcars)

Deviance Residuals:
    Min       1Q   Median       3Q      Max
-2.2537  -0.1568  -0.0168   0.1543   1.3449

Coefficients:
            Estimate Std. Error z value Pr(>|z|)
(Intercept) 18.86630    7.44356   2.535  0.01126 *
hp           0.03626    0.01773   2.044  0.04091 *
wt          -8.08348    3.06868  -2.634  0.00843 **
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

(Dispersion parameter for binomial family taken to be 1)

    Null deviance: 43.230  on 31  degrees of freedom
Residual deviance: 10.059  on 29  degrees of freedom
AIC: 16.059

Number of Fisher Scoring iterations: 8
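# Final selected model, on the log-odds (logit) scale: log(p / (1 - p)) = 18.86630 + 0.03626*hp - 8.08348*wt, where p = P(am = 1). The right-hand side is the log-odds of am = 1, not am itself.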
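To turn the fitted coefficients into a probability, the linear predictor has to be passed through the inverse logit. The sketch below refits the selected model directly and compares predict() with a hand calculation; the new car (120 hp, wt = 2.8) and the names final_model, new_car, p_predict, and p_manual are hypothetical values chosen for illustration.

```r
# Refit the selected model directly (same formula that stepAIC reported)
final_model <- glm(am ~ hp + wt, data = mtcars, family = binomial)
b <- coef(final_model)

# Hypothetical new car: 120 hp, wt = 2.8 (wt is in units of 1000 lb)
new_car <- data.frame(hp = 120, wt = 2.8)

# Probability of am = 1 from predict() ...
p_predict <- predict(final_model, newdata = new_car, type = "response")

# ... and the same probability computed by hand from the coefficients
eta <- b["(Intercept)"] + b["hp"] * new_car$hp + b["wt"] * new_car$wt
p_manual <- plogis(eta)  # inverse logit: 1 / (1 + exp(-eta))

print(c(predict = unname(p_predict), manual = unname(p_manual)))

# Odds ratios: multiplicative change in the odds of am = 1 per unit increase
print(exp(b))
```

The two probabilities should agree, which confirms that the coefficients describe the log-odds rather than am directly.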
Want to know more? Supplementary material (links):
2. R Tutorial: Logistic regression in R (https://www.tutorialspoint.com/r/r_logistic_regression.htm)
3. Quick introductions to R basics, R graphics, and statistics:
a. R Tutorial: https://www.tutorialspoint.com/r/index.htm
b. Cookbook for R: http://www.cookbook-r.com/
c. Quick-R: https://www.statmethods.net/
d. Statistical tools for high-throughput data analysis (STHDA): http://www.sthda.com/english/
e. The Handbook of Biological Statistics: http://www.biostathandbook.com/
f. An R Companion for the Handbook of Biological Statistics: http://rcompanion.org/rcompanion/index.html
4. Zar, JH. 2010. Biostatistical Analysis, Fifth Edition, Pearson.