Comparing Multiple Proportions

May 03, 2018

Recipe 32: Comparing Multiple Proportions

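1. When to use: to compare several observed proportions, that is, to analyze several groups of a dichotomous variable. A dichotomous variable is one whose outcome can take only two values.

2. Type of analysis: parametric analysis. Methods that compute the statistic directly from the data values are called parametric; methods that first rank the data and compute the statistic from the ranks are called non-parametric.

3. Assumptions: none on the distribution of the data. Note that the test relies on a chi-squared approximation, which is usually considered reliable when the expected counts are not too small (a common rule of thumb is at least 5 per cell).

4. Example data: 咪路 (Milu) surveyed the hair color of university students by sex. The data are: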

Hair color	Black	Brown	Blond	Red	Total
Male	32	43	16	9	100
Female	55	65	64	16	200
Total	87	108	80	25	300

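Question: is the proportion of male students the same for every hair color?

H₀: p₁ = p₂ = p₃ = p₄, where pᵢ is the proportion of male students among those with hair color i. The sex ratio is the same for every hair color.

H_A: at least one hair color has a different sex ratio.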

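5. Comparing multiple proportions in R:

Step 1: Read the documentation for the prop.test function in the stats package, which ships with base R.

help(prop.test)

Step 2: Pass the data to the prop.test function of the stats package.

prop.test(x = c(32, 43, 16, 9), n = c(87, 108, 80, 25), conf.level = 0.95)

# c(32, 43, 16, 9) are the male counts and c(87, 108, 80, 25) the total counts for each hair color, so the observed proportions are 32/87, 43/108, 16/80 and 9/25.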
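The x and n arguments can also be built from the raw counts in the data table instead of being typed in by hand. The following is a minimal sketch; the variable names hair, male and female are chosen here for illustration and are not part of the original post.

```r
# Counts copied from the data table in section 4.
hair   <- c("Black", "Brown", "Blond", "Red")
male   <- c(32, 43, 16, 9)
female <- c(55, 65, 64, 16)

# x: number of male students per hair color (the "successes").
x <- setNames(male, hair)
# n: total number of students per hair color (the column totals).
n <- setNames(male + female, hair)

# Observed proportion of male students for each hair color.
round(x / n, 4)

# Same test as in step 2, with x and n derived from the raw counts.
res <- prop.test(x = x, n = n)
res
```

Building n as male + female avoids copying the column totals separately, which is one less place for a typing error.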

Step 3: Interpret the output.

4-sample test for equality of proportions without continuity correction

data: c(32, 43, 16, 9) out of c(87, 108, 80, 25)

X-squared = 8.9872, df = 3, p-value = 0.02946

alternative hypothesis: two.sided

sample estimates:

prop 1 prop 2 prop 3 prop 4

0.3678161 0.3981481 0.2000000 0.3600000

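# p-value > 0.05: do not reject H₀: p₁ = p₂ = p₃ = p₄.

# p-value < 0.05: reject H₀: p₁ = p₂ = p₃ = p₄. Here the p-value is 0.02946, so H₀ is rejected.

# When the p-value is < 0.05, there are two possible follow-ups. The first is pairwise comparison, to find which groups differ. The second is a trend test, to check whether the proportions rise or fall with some factor.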
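To see where X-squared comes from, the statistic can be recomputed by hand and compared with what prop.test returns. This is a sketch under the assumption that no continuity correction is applied (which matches the output header above); the names p_pooled, x_squared, alpha and decision are illustrative only.

```r
x <- c(32, 43, 16, 9)
n <- c(87, 108, 80, 25)

# Pooled proportion under H0 (all groups share one proportion).
p_pooled <- sum(x) / sum(n)

# Chi-squared statistic: squared distance of each observed count from its
# expected count n * p_pooled, scaled by the binomial variance.
x_squared <- sum((x - n * p_pooled)^2 / (n * p_pooled * (1 - p_pooled)))
df <- length(x) - 1
p_value <- pchisq(x_squared, df = df, lower.tail = FALSE)

# The htest object returned by prop.test exposes the same quantities.
res <- prop.test(x = x, n = n)
c(by_hand = x_squared, prop_test = unname(res$statistic))

# Turn the p-value into a decision at the 0.05 level.
alpha <- 0.05
decision <- if (res$p.value < alpha) "reject H0" else "do not reject H0"
decision
```

Reading res$p.value from the result object is less error-prone than copying the number from the printed output.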

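6. A p-value below 0.05 when comparing multiple proportions means that at least one proportion differs from the others. To find out which one, continue with pairwise comparisons (multiple comparison of proportions). Pairwise comparison in R:

Step 1: Read the documentation for the pairwise.prop.test function in the stats package, which ships with base R.

help(pairwise.prop.test)

Step 2: Pass the data to the pairwise.prop.test function of the stats package.

pairwise.prop.test(x = c(32, 43, 16, 9), n = c(87, 108, 80, 25), p.adjust.method = "bonferroni")

# c(32, 43, 16, 9) and c(87, 108, 80, 25) are the observed data, so the observed proportions are 32/87, 43/108, 16/80 and 9/25.

# p.adjust.method = "bonferroni" adjusts the p-values with the Bonferroni correction for multiple comparisons.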

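Step 3: Interpret the output.

Pairwise comparisons using Pairwise comparison of proportions

data: c(32, 43, 16, 9) out of c(87, 108, 80, 25)

  1     2     3
2 1.000 -     -
3 0.157 0.037 -
4 1.000 1.000 1.000

P value adjustment method: bonferroni

# The table holds the adjusted p-values of the pairwise comparisons; p < 0.05 indicates that the two proportions differ.

# Groups 2 and 3 (brown and blond) have different proportions.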
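The numeric row and column labels are easier to read when the groups are named. The sketch below passes named vectors, so the result is labeled by hair color, and also reproduces the Bonferroni adjustment by hand. The names adjusted, raw, n_pairs and by_hand are illustrative assumptions, not part of the original post.

```r
x <- c(Black = 32, Brown = 43, Blond = 16, Red = 9)
n <- c(Black = 87, Brown = 108, Blond = 80, Red = 25)

# Adjusted and unadjusted pairwise p-values, labeled by hair color.
adjusted <- pairwise.prop.test(x = x, n = n, p.adjust.method = "bonferroni")
raw      <- pairwise.prop.test(x = x, n = n, p.adjust.method = "none")

# Bonferroni multiplies each raw p-value by the number of comparisons
# (6 pairs for 4 groups) and caps the result at 1.
n_pairs <- choose(length(x), 2)
by_hand <- pmin(raw$p.value * n_pairs, 1)

round(adjusted$p.value, 3)

# Row/column positions of the pairs whose adjusted p-value is below 0.05.
which(adjusted$p.value < 0.05, arr.ind = TRUE)
```

With these data the only pair flagged is Blond versus Brown, which agrees with the interpretation above.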

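7. Test of trend:

Step 1: Read the documentation for the prop.trend.test function in the stats package, which ships with base R.

help(prop.trend.test)

Step 2: Pass the data to the prop.trend.test function of the stats package.

prop.trend.test(x = c(32, 43, 16, 9), n = c(87, 108, 80, 25), score = c(2, 1, 3, 2))

# c(32, 43, 16, 9) and c(87, 108, 80, 25) are the observed data, so the observed proportions are 32/87, 43/108, 16/80 and 9/25.

# score = c(2, 1, 3, 2) assigns a score to each group, and the test checks for a linear trend in the proportions along these scores. Here the scores roughly follow the ordering of the observed proportions 32/87, 43/108, 16/80 and 9/25. In practice the scores usually come from a natural ordering of the groups (for example dose levels) that is fixed before looking at the data.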

Step 3: Interpret the output.

Chi-squared Test for Trend in Proportions

data: c(32, 43, 16, 9) out of c(87, 108, 80, 25) ,

using scores: 2 1 3 2

X-squared = 7.5761, df = 1, p-value = 0.005915

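# p-value > 0.05: do not reject H₀: there is no trend along the given scores.

# p-value < 0.05: reject H₀: there is no trend along the given scores.

# Here the p-value is 0.005915 < 0.05, so the proportions 32/87, 43/108, 16/80 and 9/25 show a trend consistent with the scores 2, 1, 3 and 2.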
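The trend statistic can also be recomputed by hand, which makes the role of the scores explicit. This is a sketch of the usual chi-squared statistic for a linear trend in proportions with 1 degree of freedom; the names p_pooled, score_mean, centered and x_squared are illustrative only.

```r
x     <- c(32, 43, 16, 9)
n     <- c(87, 108, 80, 25)
score <- c(2, 1, 3, 2)

# Pooled proportion and the mean score weighted by group size.
p_pooled   <- sum(x) / sum(n)
score_mean <- sum(n * score) / sum(n)
centered   <- score - score_mean

# Chi-squared statistic for a linear trend along the scores (df = 1).
x_squared <- sum(centered * x)^2 /
  (p_pooled * (1 - p_pooled) * sum(n * centered^2))
p_value <- pchisq(x_squared, df = 1, lower.tail = FALSE)

# Compare with the built-in function.
res <- prop.trend.test(x = x, n = n, score = score)
c(by_hand = x_squared, prop_trend_test = unname(res$statistic))
```

Because the statistic depends on the scores only through their centered values, changing the scores changes the hypothesis being tested, so the choice of scores matters.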

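8. The same data can also be analyzed with the chi-squared test of independence, although the hypotheses are phrased differently:

Question: are hair color and sex related (dependent)?

H₀: hair color and sex are independent.

H_A: hair color and sex are dependent.

Step 1: Read the documentation for the chisq.test function in the stats package, which ships with base R.

help(chisq.test)

Step 2: Enter the data.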

v1 <- c(32, 43, 16, 9)

v2 <- c(55, 65, 64, 16)

m <- matrix(c(v1, v2), nrow = 2, byrow = TRUE)

# Combine the two vectors into a 2 x 4 matrix, one row per sex.

m

# Print the matrix to make sure the values are laid out as in the data table.

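Step 3: Pass m to the chisq.test function of the stats package.

chisq.test(m, simulate.p.value = TRUE, B = 2000)

# simulate.p.value = TRUE, B = 2000 estimates the p-value by Monte Carlo simulation with 2000 replicates, so the p-value varies slightly from run to run.

Step 4: Interpret the output.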

Pearson's Chi-squared test with simulated p-value (based on 2000 replicates)

data: m

X-squared = 8.9872, df = NA, p-value = 0.03348

# p < 0.05: reject H₀: hair color and sex are independent. Here the p-value is 0.03348, so H₀ is rejected.

# p > 0.05: do not reject H₀: hair color and sex are independent.
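The link between the two approaches can be checked directly: without simulation, chisq.test on the 2 x 4 table gives the same X-squared, degrees of freedom and p-value as prop.test in section 5. The sketch below also fixes the random seed so that the simulated p-value is reproducible. The dimnames and the seed value 1 are arbitrary choices made here for illustration.

```r
# 2 x 4 table of counts with labeled rows and columns.
m <- matrix(c(32, 43, 16, 9,
              55, 65, 64, 16),
            nrow = 2, byrow = TRUE,
            dimnames = list(sex  = c("Male", "Female"),
                            hair = c("Black", "Brown", "Blond", "Red")))

# Asymptotic chi-squared test of independence (no simulation).
asym <- chisq.test(m)
asym$expected  # expected counts under independence

# The test of equal proportions from section 5, built from the same table.
prop <- prop.test(x = m["Male", ], n = colSums(m))
c(chisq_test = unname(asym$statistic), prop_test = unname(prop$statistic))

# Simulated p-value; the seed only makes the run repeatable.
set.seed(1)
sim <- chisq.test(m, simulate.p.value = TRUE, B = 2000)
sim$p.value
```

All expected counts are above 5 here, so the asymptotic p-value and the simulated one are expected to be close.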

Fired up? Want to know more? Supplementary material (links):

1. Population proportion (https://en.wikipedia.org/wiki/Population_proportion)

2. Statistical hypothesis testing (https://en.wikipedia.org/wiki/Statistical_hypothesis_testing)

3. Test statistic (https://en.wikipedia.org/wiki/Test_statistic)

4. Z-test (https://en.wikipedia.org/wiki/Z-test)

5. R basics, R graphics and quick introductions to statistics:

a. R Tutorial: https://www.tutorialspoint.com/r/index.htm

b. Cookbook for R: http://www.cookbook-r.com/

c. Quick-R: https://www.statmethods.net/

d. Statistical tools for high-throughput data analysis (STHDA): http://www.sthda.com/english/

e. The Handbook of Biological Statistics: http://www.biostathandbook.com/

f. An R Companion for the Handbook of Biological Statistics: http://rcompanion.org/rcompanion/index.html

6. Zar, JH. 2010. Biostatistical Analysis, Fifth Edition, Pearson.
