1. Julia's skewness and kurtosis follow the classical definition
1.1. skewness and kurtosis are provided by StatsBase
# Skewness
# This is Type 1 definition according to Joanes and Gill (1998)
"""
skewness(v, [wv::AbstractWeights], m=mean(v))
Compute the standardized skewness of a real-valued array `v`, optionally
specifying a weighting vector `wv` and a center `m`.
"""
# (excessive) Kurtosis
# This is Type 1 definition according to Joanes and Gill (1998)
"""
kurtosis(v, [wv::AbstractWeights], m=mean(v))
Compute the excess kurtosis of a real-valued array `v`, optionally
specifying a weighting vector `wv` and a center `m`.
"""
1.2. R returns values based on three definitions
Joanes and Gill (1998) [^1] discuss three methods for estimating kurtosis and skewness.
[^1]: D. N. Joanes and C. A. Gill (1998), Comparing measures of sample skewness and kurtosis. The Statistician, 47, 183–189.
1.2.1. Kurtosis
Type 1: This is the typical definition used in many older textbooks.
g2 = m4/m2^2 - 3
Type 2: Used in SAS and SPSS.
G2 = ((n + 1)*g2 + 6)*(n - 1)/((n - 2)*(n - 3))
Type 3: Used in MINITAB and BMDP.
b2 = m4/s^4 - 3 = (g2 + 3)*(1 - 1/n)^2 - 3
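The formulas above use symbols that this section does not define. Assuming the usual conventions (m_r is the r-th sample central moment with divisor n, and s is the sample standard deviation with divisor n - 1), the second equality in the Type 3 formula follows in one step:

```latex
\begin{align*}
m_r &= \frac{1}{n}\sum_{i=1}^{n}(x_i - \bar{x})^r, \qquad
s^2 = \frac{1}{n-1}\sum_{i=1}^{n}(x_i - \bar{x})^2 = \frac{n}{n-1}\,m_2,\\
b_2 &= \frac{m_4}{s^4} - 3
     = \frac{m_4}{m_2^2}\left(\frac{n-1}{n}\right)^2 - 3
     = (g_2 + 3)\left(1 - \frac{1}{n}\right)^2 - 3 .
\end{align*}
```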
Only G2 (corresponding to type = 2) is unbiased under normality.
The default is type = 3.
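The unbiasedness claim can be explored with a small simulation. The sketch below is only illustrative: the helper names (`g2_moment`, `G2_from`, `b2_from`), the seed, the sample size and the number of replications are all arbitrary choices made here, not part of any library.

```julia
using Random, Statistics

# Type 1 excess kurtosis from central moments with divisor n
function g2_moment(x)
    d = x .- mean(x)
    return mean(d .^ 4) / mean(d .^ 2)^2 - 3
end

# Type 2 and Type 3 derived from g2 by the formulas above
G2_from(g2, n) = ((n + 1) * g2 + 6) * (n - 1) / ((n - 2) * (n - 3))
b2_from(g2, n) = (g2 + 3) * (1 - 1 / n)^2 - 3

rng = MersenneTwister(2024)   # arbitrary seed, for reproducibility only
n, reps = 10, 100_000

# One g2 value per simulated standard-normal sample of size n
g2s = [g2_moment(randn(rng, n)) for _ in 1:reps]

avg_g2 = mean(g2s)
avg_G2 = mean(G2_from.(g2s, n))
avg_b2 = mean(b2_from.(g2s, n))

println((avg_g2, avg_G2, avg_b2))
```

If the claim holds, the average of G2 should be close to zero, while the averages of g2 and b2 should be clearly negative for a sample size this small.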
1.2.2. Skewness
Type 1: This is the typical definition used in many older textbooks.
g1 = m3/m2^(3/2)
Type 2: Used in SAS and SPSS.
G1 = g1*sqrt(n*(n - 1))/(n - 2)
Type 3: Used in MINITAB and BMDP.
b1 = m3/s^3 = g1*((n - 1)/n)^(3/2)
All three skewness measures are unbiased under normality.
The default is type = 3.
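The two expressions given for b1 can be checked numerically. This is a minimal sketch using only the `Statistics` standard library; it assumes m2 and m3 are central moments with divisor n and s is the standard deviation with divisor n - 1, and the variable names are chosen here for illustration.

```julia
using Statistics

x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1]
n = length(x)
d = x .- mean(x)

m2 = mean(d .^ 2)   # second central moment, divisor n
m3 = mean(d .^ 3)   # third central moment, divisor n
s  = std(x)         # sample standard deviation, divisor n - 1

g1 = m3 / m2^(3/2)                      # Type 1
G1 = g1 * sqrt(n * (n - 1)) / (n - 2)   # Type 2
b1 = m3 / s^3                           # Type 3, direct definition

# The direct definition and the g1-based expression should coincide
println(b1 ≈ g1 * ((n - 1) / n)^(3/2))
println((g1, G1, b1))
```

Since all three measures differ only by a factor that depends on n, they always share the same sign and converge as n grows.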
1.3. Defining Julia functions that let you choose the type
1.3.1. Kurtosis
using StatsBase

# Convenience wrappers, one per type
kurt1(x) = kurt(x, type=1)
kurt2(x) = kurt(x, type=2)
kurt3(x) = kurt(x)

function kurt(x; type=3)
    # Usage text, shown as the error message when `type` is invalid
    doc = """
    kurt(x; type=n)
    type: an integer between 1 and 3 selecting one of the algorithms for computing kurtosis detailed below.
    Type 1: g2. This is the typical definition used in many older textbooks.
    Type 2: G2. Used in SAS and SPSS.
    Type 3: b2. default. Used in MINITAB and BMDP.
    """
    type in 1:3 || error(doc)
    n = length(x)
    g2 = kurtosis(x)  # StatsBase returns the Type 1 excess kurtosis
    if type == 2
        return ((n + 1)*g2 + 6)*(n - 1)/((n - 2)*(n - 3))
    elseif type == 3
        return (g2 + 3)*(1 - 1/n)^2 - 3
    else
        return g2
    end
end;
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1];
kurt(x), kurt3(x)
(-1.6335639958376689, -1.6335639958376689)
kurt(x, type=1), kurt1(x)
(-1.3130419701699616, -1.3130419701699616)
kurt(x, type=2), kurt2(x)
(-1.356984911550468, -1.356984911550468)
kurt(x, type=3), kurt3(x)
(-1.6335639958376689, -1.6335639958376689)
1.3.2. Skewness
using StatsBase

# Convenience wrappers, one per type
skew1(x) = skew(x, type=1)
skew2(x) = skew(x, type=2)
skew3(x) = skew(x)

function skew(x; type=3)
    # Usage text, shown as the error message when `type` is invalid
    doc = """
    skew(x; type=n)
    type: an integer between 1 and 3 selecting one of the algorithms for computing skewness detailed below.
    Type 1: g1. This is the typical definition used in many older textbooks.
    Type 2: G1. Used in SAS and SPSS.
    Type 3: b1. default. Used in MINITAB and BMDP.
    """
    type in 1:3 || error(doc)
    n = length(x)
    g1 = skewness(x)  # StatsBase returns the Type 1 skewness
    if type == 2
        return g1*sqrt(n*(n - 1))/(n - 2)
    elseif type == 3
        return g1*((n - 1)/n)^(3/2)
    else
        return g1
    end
end;
x = [1, 2, 3, 4, 5, 6, 7, 8, 9, 1];
skew(x), skew3(x)
(0.10905343709973031, 0.10905343709973031)
skew(x, type=1), skew1(x)
(0.12772490663150174, 0.12772490663150174)
skew(x, type=2), skew2(x)
(0.15146310708295876, 0.15146310708295876)
skew(x, type=3), skew3(x)
(0.10905343709973031, 0.10905343709973031)