2019年5月のブログ記事一覧-裏 RjpWiki

brunnermunzel.test の異変（その3）

2019年05月31日 | ブログラミング

Brunner-Munzel 検定ッテ，もはや，参照する価値もない記事です。パチもん検定です。

1.3.4 で再度持ち込まれたバグだけを修正した 1.3.5 が発行された

今はまだソースしかない

頑張ってます。クリックお願いします。

brunnermunzel.test の異変（その2）

2019年05月26日 | ブログラミング

Brunner-Munzel 検定ッテ，もはや，参照する価値もない記事です。

まったくのところ，ストーカーに成り下がっています

さて，brunnermunzel を 1.3.3 にもどし，さらに permutationtest を t ではなく p に基づくように変更する。つまり，以下のようjに

calc_statistics.f において

stat = my - mx

bm_test.f において

nvx = n1 * vx; nvy = n2 * vy

以上2箇所変更により

brunnermunzel.test は 1.3.3 に戻り，

> brunnermunzel.test(x,y)

Brunner-Munzel Test

data: x and y
Brunner-Munzel Test Statistic = 3.1375, df = 17.683, p-value = 0.005786
95 percent confidence interval:
0.5952169 0.9827052
sample estimates:
P(X<Y)+.5*P(X=Y)
0.788961

brunnermunzel.permutation.test は t ではなく p に基づいて並べ替え検定を行うようになった。

> brunnermunzel.permutation.test(x, y)

permuted Brunner-Munzel Test

data: x and y
p-value = 0.00669

これでよいのかどうか？

頑張ってます。クリックお願いします。

brunnermunzel.test の異変

2019年05月26日 | ブログラミング

もはや，論評の必要もない，パチもん検定です。

紛らわしいですが，brunnermunzel パッケージの方ね。

今日，例題をやってみたら，結果が違う

> x = c(1,2,1,1,1,1,1,1,1,1,2,4,1,1)
> y = c(3,3,4,3,1,2,3,1,1,5,4)
>
> library(brunnermunzel)
> brunnermunzel.test(x, y)

Brunner-Munzel Test

data: x and y
Brunner-Munzel Test Statistic = 2.8728, df = 16.258, p-value = 0.01091
95 percent confidence interval:
0.5939688 0.9839533
sample estimates:
P(X<Y)+.5*P(X=Y)
0.788961

> sessionInfo()
R version 3.6.0 (2019-04-26)

other attached packages:
[1] lawstat_3.3 brunnermunzel_1.3.4

1.3.4 は 5月21日のアップデートだ。

1.3.3 とどこが違うか見てみると，一箇所だけ違う。

diff brunnermunzel_1.3.4/src/bm_test.f brunnermunzel_1.3.3/src/bm_test.f
103c103
< nvx = n1 * vx; nvy = n1 * vy # 1.3.4
---
> nvx = n1 * vx; nvy = n2 * vy

見た目の対称性からいうと，1.3.4 はおかしい。

ちなみに，対称性という面からは，x, y を逆にしても p 値は同じにならないとまずいが，以下のようになってしまうので，明らかにおかしい。

> brunnermunzel.test(y, x)

Brunner-Munzel Test

data: y and x
Brunner-Munzel Test Statistic = -3.241, df = 16.258, p-value = 0.005032
95 percent confidence interval:
0.01604674 0.40603118
sample estimates:
P(X<Y)+.5*P(X=Y)
0.211039

頑張ってます。クリックお願いします。

Brunnner-Munzel exact test

2019年05月24日 | ブログラミング

Brunner-Munzel 検定ッテ，もはや，参照する価値もない記事です。パチもん検定です。

lawstat や brunnermunzel パッケージの t 統計量を対象にするのはまずい

P(x＜y)+0.5*P(x=y) を対象にすべし（そうすれば，実行時間も短くなるし）

なにより，正しい答えが出るアルゴリズムを決定してから，コンパイラ言語を使って速くする方策を探るべし

> x = c(1,2,1,1,1,1,1,1,1,1,2,4,1,1)
> y = c(3,3,4,3,1,2,3,1,1,5,4)
> n1 = length(x)
> n2 = length(y)
> xandy = c(x, y)

> library(lawstat)
> system.time({
+ bm = brunner.munzel.test(x, y)$statistic
+ foo = function(X) {
+   brunner.munzel.test(xandy[X], xandy[-X])$statistic
+ }
+ z = combn(n1 + n2, n1, foo)
+ print(mean(abs(z) >= abs(bm)))
+ })
[1] 0.008037645 この答えは正しくない
   ユーザ   システム       経過
1173.610      2.509   1177.962

> system.time({
+ f = function(X) {
+   return((mean(rank(c(xandy[X], xandy[-X]))[1:n1]) - w)/n2)
+ }
+ w = (n1 + 1)/2
+ s0 = f(1:n1)
+ s = combn(n1 + n2, n1, f)
+ print(mean(abs(s-0.5) >= abs(s0-0.5)))
+ })
[1] 0.006690223　正しいし，速い
   ユーザ   システム       経過
   167.216      0.542    167.834

頑張ってます。クリックお願いします。

Behrens-Fisher problem 対応の Wilcoxon test

2019年05月24日 | ブログラミング

Wilcoxon 検定（マン・ホイットニーの U 検定）を漸近正規検定ではなく，ベーレンス・フィッシャー問題に対応した漸近 t 分布検定とする関数。

w1, w2 は Brunner-Munzel test からマルパクリ

それに基づく t, df もマルパクリ

ただし，t の分子は U - n1 * n2 / 2 である。

Brunner-Munzel test ではこれが n1 * n2 * (m2 - m1)/(n1 + n2) と書かれているが同じもの。

繰り返すが，検定統計量は本質的に同じなので，この検定は新しいものではない。

x = c(1,2,1,1,1,1,1,1,1,1,2,4,1,1)
y = c(3,3,4,3,1,2,3,1,1,5,4)

n1 = length(x)
n2 = length(y)
r1 = rank(x)
r2 = rank(y)
r = rank(c(x, y))
m2 = mean(r[n1 + 1:n2])
U = n2*(n1 + (n2+1)/2 - m2)
w1 = sum((r[1:n1] - r1 - mean(r[1:n1]) + (n1 + 1)/2)^2)*n1/(n1 - 1)
w2 = sum((r[n1 + 1:n2] - r2 - m2 + (n2 + 1)/2)^2)*n2/(n2 - 1)
t = (U - n1 * n2 / 2)/sqrt(w1 + w2)
df = (w1 + w2)^2 / (w1^2/(n1 - 1) + w2^2/(n2 - 1))
p.value = pt(-abs(t), df)*2
cat(sprintf("U = %.5g, t = %.5g, d.f. = %.5g, p value = %.5f\n", U, t, df, p.value))

実行結果

U = 32.5, t = -3.1375, d.f. = 17.683, p value = 0.00579

正規近似検定では，単純に

tie = table(r)
V = n1*n2*(n^3-n-sum(tie^3-tie))/12/(n^2-n)
Z = (U - n1 * n2 / 2)/sqrt(V)

となっている

頑張ってます。クリックお願いします。

Brunner-Munzel 検定（その2）

2019年05月24日 | ブログラミング

Brunner-Munzel 検定ッテ，もはや，参照する価値もない記事です。パチもん検定です。

Brunner, Munzel の原論文を読んでみた。

肝は，Nonparametric Behrens-Fisher problem　だ。つまり，普通の t 検定と Welch の t 検定との関係と同じことを論じている。

先に書いたように p = 1 - U / n1 /n2 なのだけど，Brunner-Munzel 検定は t 分布で近似するときに Behrens-Fisher 問題を気に留めているということ。

lawstat:::brunner.munzel.test の

    n1 = length(x)
    n2 = length(y)
    r1 = rank(x)
    r2 = rank(y)
    r = rank(c(x, y))
    m1 = mean(r[1:n1])
    m2 = mean(r[n1 + 1:n2])
    pst = (m2 - (n2 + 1)/2)/n1
    v1 = sum((r[1:n1] - r1 - m1 + (n1 + 1)/2)^2)/(n1 - 1)
    v2 = sum((r[n1 + 1:n2] - r2 - m2 + (n2 + 1)/2)^2)/(n2 - 1)
    statistic = n1 * n2 * (m2 - m1)/(n1 + n2)/sqrt(n1 * v1 + n2 * v2)
    dfbm = ((n1 * v1 + n2 * v2)^2)/(((n1 * v1)^2)/(n1 - 1) + ((n2 * v2)^2)/(n2 - 1))

v1，v2 ですな。statistic, dfbm いずれにもこれを使っている。

そりゃねえ，Wilcoxon 検定の正規近似は負けるに決まっています。

でも，どっちも近似なんだから，permutation 検定には負ける。

頑張ってます。クリックお願いします。

Brunner-Munzel 検定

2019年05月21日 | 統計学

Brunner-Munzel 検定ッテ，もはや，参照する価値もない記事です。パチもん検定です。

確かに，Brunner-Munzel 検定と Mann-Whitney 検定の検定統計量を B, U とすると

B =1-U/n1/n2 なので，両検定は同等（http://aoki2.si.gunma-u.ac.jp/lecture/Average/how2get-U.html を見れば一目瞭然）。

前者は正規近似，後者はt分布近似（サンプルサイズが多いときには正規近似も）するという違いだけ。並べ替え検定を行うと当然同じになるなあ。

とすると，Mann-Whitney 検定は不等分散に弱いが，Brunner-Munzel 検定はそんなことはないというのは真っ赤な嘘だなあ。

> library(lawstat)
> set.seed(777)
> n1 = 80
> n2 = 100
> x = rnorm(n1, mean=0, sd=1)
> y = rnorm(n2, mean=0, sd=10) 不等分散データを作る
> brunner.munzel.test(x, y)

   Brunner-Munzel Test

data: x and y
Brunner-Munzel Test Statistic = -0.010395, df = 100.89, p-value = 0.9917　t 分布の近似は十分
95 percent confidence interval:
0.404083 0.594917
sample estimates:
P(X<Y)+.5*P(X=Y)
          0.4995
> pt(-0.010395, 100.89)*2 t 分布の両側確率
[1] 0.9917267

> wilcox.test(x, y)

   Wilcoxon rank sum test with continuity correction

data: x and y
W = 4004, p-value = 0.992 この p 値は，正確な値。Brunner-Munzelの 0.9917 に近いことに気づくべし
alternative hypothesis: true location shift is not equal to 0

> library(coin)
> df = data.frame(d=c(x, y), g=factor(rep(1:2, c(n1, n2))))
> wilcox_test(d ~ g, data=df)

   Asymptotic Wilcoxon-Mann-Whitney Test 皮肉なことに wilcox_test は正規近似を行う羽目に

data: d by g (1, 2)
Z = 0.011515, p-value = 0.9908 正規近似も Brunner-Munzelの 0.9917 に近いことに気づくべし
alternative hypothesis: true mu is not equal to 0
> pnorm(0.011515, lower.tail=FALSE)*2　正規分布の両側確率
[1] 0.9908126

> brunner.munzel.test(x, y)$estimate
P(X<Y)+.5*P(X=Y)
          0.4995
> 1-wilcox.test(x, y)$statistic/n1/n2
     W W はマン・ホイットニーの U
0.4995　検定統計量は本質的に同じなのだ（だから，両方は同じ検定なのだ。どちらが優れているなんてことはない）

頑張ってます。クリックお願いします。

記事一覧 | 画像一覧 | フォロワー一覧 | フォトチャンネル一覧

PVアクセスランキングにほんブログ村

プロフィール

フォロー中フォローするフォローする

バックナンバー

2024年08月

2024年07月

2024年06月

2024年05月

2024年04月

2024年03月

2024年02月

2024年01月

2023年12月

2023年11月

2023年10月

2023年09月

2023年08月

2023年07月

2023年06月

2023年05月

2023年04月

2023年03月

2023年02月

2023年01月

2022年12月

2022年11月

2022年10月

2022年09月

2022年08月

2022年07月

2022年06月

2022年05月

2022年04月

2022年03月

2022年02月

2022年01月

2021年12月

2021年11月

2021年10月

2021年09月

2021年08月

2021年07月

2021年06月

2021年05月

2021年04月

2021年03月

2021年02月

2021年01月

2020年12月

2020年11月

2020年10月

2020年09月

2020年08月

2020年07月

2020年06月

2020年05月

2020年04月

2020年03月

2020年02月

2020年01月

2019年12月

2019年11月

2019年09月

2019年08月

2019年07月

2019年06月

2019年05月

2019年04月

2018年12月

2018年09月

2018年08月

2018年07月

2018年06月

2018年05月

2018年03月

2018年02月

2018年01月

2017年12月

2017年11月

2017年10月

2017年09月

2017年08月

2017年07月

2017年06月

2017年05月

2017年04月

2017年03月

2017年02月

2017年01月

2016年12月

2016年11月

2016年10月

2016年09月

2016年08月

2016年07月

2016年06月

2016年05月

2016年04月

2016年03月

2016年02月

2016年01月

2015年12月

2015年11月

2015年10月

2015年09月

2015年08月

2015年07月

2015年06月

2015年05月

2015年04月

2015年03月

2015年02月

2015年01月

2014年12月

2014年11月

2014年10月

2014年09月

2014年08月

2014年07月

2014年06月

2014年05月

2014年04月

2014年03月

2014年02月

2014年01月

2013年12月

2013年11月

2013年10月

2013年09月

2013年08月

2013年07月

2013年06月

2013年05月

2013年04月

2013年03月

2013年02月

2013年01月

2012年12月

2012年11月

2012年10月

2012年09月

2012年08月

2012年07月

2012年06月

2012年05月

2012年04月

2012年03月

2012年02月

2012年01月

2011年12月

2011年11月

2011年10月

2011年09月

2011年07月

2011年06月

2011年05月

2011年04月

2011年03月

2011年02月

2011年01月

2010年09月

2010年08月

2010年07月

カレンダー

2019年5月
日	月	火	水	木	金	土
			1	2	3	4
5	6	7	8	9	10	11
12	13	14	15	16	17	18
19	20	21	22	23	24	25
26	27	28	29	30	31

前月

次月

goo blog お知らせ

	ブログを読むだけ。毎月の訪問日数に応じてポイント進呈
	【コメント募集中】goo blogスタッフの気になったニュース
	gooブロガーの今日のひとこと
	訪問者数に応じてdポイント最大1,000pt当たる！