Compute z-statistic for Wilcoxon Signed-Rank test (ZScoreForWilcoxSignedRankTest)
- class cerebstats.stat_scores.zWilcoxSignedRankScore.ZScoreForWilcoxSignedRankTest(*args: Any, **kwargs: Any)
Compute the z-statistic for the Wilcoxon signed-rank test. Note that this is not the Wilcoxon rank-sum test.
Definitions | Interpretation
--- | ---
\(\eta_0\) | some specified value \(^{\dagger}\)
\(x_i\) | each data value
\(\lvert x_i-\eta_0 \rvert\) | absolute difference between data value and null value
\(T\) | ranks of the computed absolute differences (excluding differences equal to 0)
\(T^+\) | sum of the ranks of the values above \(\eta_0\); Wilcoxon signed-rank statistic
\(n_U\) | number of values in the sample not equal to \(\eta_0\); sample size
\(\mu_{T^+}\) | assuming \(H_0: \eta = \eta_0\) is true, \(\mu_{T^+} = \frac{ n_U(1+n_U) }{ 4 }\)
\(\sigma_{T^+}\) | assuming \(H_0\) is true, \(\sigma_{T^+} = \sqrt{ \frac{ n_U(1+n_U)(1+2n_U) }{24} }\)
z-statistic, z | \(z = \frac{ T^+ - \mu_{T^+} }{ \sigma_{T^+} }\)
\(^{\dagger}\) \(\eta_0\), the null value, is
the model prediction for one-sample testing
0 for testing with paired data (observation - prediction)
NOTE:
use this test only when the distribution is symmetric (not necessarily bell-shaped)
this test should not be used for skewed data
the test is often applied to paired data
\(\eta_0\) is the prediction if it is not a list of the same length as the observation data
for paired data, \(\eta_0 = 0\), i.e. zero population median difference
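The formulas in the definitions above can be checked with a few lines of plain Python. This is only a minimal sketch, not the class's implementation: the helper name `z_from_tplus` and the input numbers are hypothetical, chosen for illustration.

```python
import math


def z_from_tplus(t_plus, n_u):
    """z-statistic from T+ and n_U (number of values not equal to eta_0)."""
    # Mean and standard deviation of T+ assuming H0 is true.
    mu = n_u * (1 + n_u) / 4
    sigma = math.sqrt(n_u * (1 + n_u) * (1 + 2 * n_u) / 24)
    return (t_plus - mu) / sigma


# Hypothetical input: 10 non-zero differences whose positive ranks sum to 45.
z = z_from_tplus(45, 10)
print(z)
```

When \(T^+\) equals \(\mu_{T^+}\) the statistic is exactly 0, which is a quick sanity check on any implementation.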
Use Case:
x = ZScoreForWilcoxSignedRankTest.compute( observation, prediction )
score = ZScoreForWilcoxSignedRankTest(x)
Note: As part of the SciUnit framework, this custom TScore should have the following methods:
compute() (class method)
sort_key() (property)
__str__()
Additionally,
get_observation_rank() (instance method)
__orderdata_ranks() (private method)
- classmethod compute(observation, prediction)
Argument | Value type
--- | ---
first argument | dictionary; observation/experimental data
second argument | float or array; simulated data
Note:
observation must have the key “raw_data” whose value is the list of numbers
if the simulation, i.e. the model prediction, is not a float, it must also have the key “raw_data”
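As an illustration of the two input shapes described in the note, the sketch below builds one observation and both kinds of prediction. All values are made up, and `is_paired` is a hypothetical helper written for this example, not part of the library.

```python
# Hypothetical example inputs; the numbers are invented for illustration.
observation = {"raw_data": [65, 55, 60, 62, 70]}

# One-sample case: the prediction is a single float used as the null value.
prediction_scalar = 60.0

# Paired case: the prediction carries its own "raw_data" of the same length.
prediction_paired = {"raw_data": [63, 57, 60, 60, 66]}


def is_paired(prediction):
    """Return True when the prediction is a dict holding "raw_data"."""
    return isinstance(prediction, dict) and "raw_data" in prediction


print(is_paired(prediction_scalar), is_paired(prediction_paired))
```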
- classmethod get_Tplus(data, null_value)
Returns computed Wilcoxon signed-rank statistic, Tplus.
case1: data = observation[“raw_data”], null_value = prediction
case2: data = observation[“raw_data”] - prediction, null_value = 0
Example for describing what ‘ranking’ means:
\(data = [65, 55, 60, 62, 70]\)
\(null\_value = 60\)
Then,
\(ordered\_data = [55, 60, 62, 65, 70]\)
\(absolute\_difference = [5, 0, 2, 5, 10]\)
\(absolute\_difference\_without\_zeros = [5, 2, 5, 10]\)
\(ordered\_data\_without\_zeros = [55, 62, 65, 70]\)
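\(all\_ranks = [2.5, 1, 2.5, 4]\) (ranks of the non-zero absolute differences; the two tied values of 5 share the midrank \((2+3)/2 = 2.5\))
Therefore, \(T^+\), the Wilcoxon signed-rank statistic, is the sum of the ranks of the values above \(null\_value\) (62, 65 and 70):
\(Tplus = 1 + 2.5 + 4 = 7.5\)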
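A minimal sketch of the statistic, following the definitions of \(T\) and \(T^+\) given for this class (ranks of the non-zero absolute differences, summed over the values above the null value). The function name `t_plus` and the data are hypothetical, and the one-line ranking used here is a compact stand-in rather than the library's own `get_ranks`.

```python
def t_plus(data, null_value):
    """Return (T+, n_U) for a one-sample Wilcoxon signed-rank test."""
    # Signed differences, dropping values equal to the null value.
    diffs = [x - null_value for x in data if x != null_value]
    abs_d = [abs(d) for d in diffs]
    # Midrank of v: count of smaller values plus the mean position among ties.
    ranks = [sum(a < v for a in abs_d) + (abs_d.count(v) + 1) / 2 for v in abs_d]
    # T+ sums the ranks of the values above the null value only.
    total = sum(r for d, r in zip(diffs, ranks) if d > 0)
    return total, len(diffs)


# Hypothetical data with null value 10: differences are 2, -1, 5, -3.
tp, n_u = t_plus([12, 9, 15, 10, 7], 10)
print(tp, n_u)
```

For paired data the same function can be applied to the element-wise differences with `null_value = 0`.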
- static get_ranks(absdiff_without_zero)
Static function that orders the data and returns the appropriate rank of each value.
Step-1:
get the unique values in the ordered data
also get the frequency (count) of each unique value
Step-2:
construct raw ranks based on the ordered data
Step-3:
for each value in the ordered data, find its index in the unique values array
if the corresponding count is more than one, compute its midrank (sum of its raw ranks / its count)
set the ranks (in raw ranks) of the corresponding values to the computed midrank
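The steps above can be sketched in plain Python as follows. This is an illustrative approximation under the assumption that the input is already sorted; `midranks_from_sorted` is a hypothetical name, and the actual method may differ (for example, by using array operations).

```python
from collections import Counter


def midranks_from_sorted(ordered):
    """Ranks for already-sorted values; tied values share their midrank."""
    # Unique values and how often each one occurs.
    counts = Counter(ordered)
    # Raw ranks 1..n in sorted order.
    ranks = [float(r) for r in range(1, len(ordered) + 1)]
    # Replace the raw ranks of each tied group by their mean.
    start = 0
    for value in sorted(counts):
        c = counts[value]
        if c > 1:
            mid = sum(ranks[start:start + c]) / c
            ranks[start:start + c] = [mid] * c
        start += c
    return ranks


# Hypothetical sorted input with one tie.
print(midranks_from_sorted([1, 3, 3, 7]))  # [1.0, 2.5, 2.5, 4.0]
```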