Compute z-statistic for Wilcoxon Signed-Rank test (ZScoreForWilcoxSignedRankTest)

class cerebstats.stat_scores.zWilcoxSignedRankScore.ZScoreForWilcoxSignedRankTest(*args: Any, **kwargs: Any)

Compute the z-statistic for the Wilcoxon signed-rank test. Note that this is not the Wilcoxon rank-sum test.

Definitions

Interpretation

\(\eta_0\)

some specified value \(^{\dagger}\)

\(x_i\)

each data value

\(|x_i-\eta_0|\)

absolute difference between data value and null value

\(T\)

ranks of the computed absolute differences (excluding differences equal to 0)

\(T^+\)

sum of the ranks of the data values above \(\eta_0\); Wilcoxon signed-rank statistic

\(n_U\)

number of values in sample not equal to \(\eta_0\); sample size

\(\mu_{T^+}\)

assuming \(H_0: \eta = \eta_0\) is true, \(\mu_{T^+}\) = \(\frac{ n_U(1+n_U) }{ 4 }\)

\(\sigma_{T^+}\)

assuming \(H_0\) is true, \(\sigma_{T^+}\) = \(\sqrt{ \frac{ n_U(1+n_U)(1+2n_U) }{24} }\)

z-statistic, z

z = \(\frac{ T^+ - \mu_{T^+} }{ \sigma_{T^+} }\)

\(^{\dagger}\) \(\eta_0\), the null value, is

  • the model prediction for one sample testing

  • 0 for testing with paired data (observation - prediction)
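The formulas for \(\mu_{T^+}\), \(\sigma_{T^+}\) and z in the table above can be sketched in a few lines of Python. This is only an illustration of the arithmetic, not the library's implementation; the function name `z_from_tplus` and the input values are hypothetical.

```python
import math


def z_from_tplus(t_plus, n_u):
    """Return z = (T+ - mu) / sigma for a given T+ and n_U.

    Hypothetical helper: it only restates the formulas from the
    definitions table.
    """
    mu = n_u * (1 + n_u) / 4                                 # mean of T+ under H0
    sigma = math.sqrt(n_u * (1 + n_u) * (1 + 2 * n_u) / 24)  # std. dev. of T+ under H0
    return (t_plus - mu) / sigma


# Made-up inputs: T+ = 45 from n_U = 10 non-zero differences.
z = z_from_tplus(t_plus=45, n_u=10)
print(round(z, 3))
```

A value of \(T^+\) equal to \(\mu_{T^+}\) gives z = 0, meaning the ranks above and below \(\eta_0\) balance exactly.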

NOTE:

  • use this test only when the distribution is symmetric (not necessarily bell-shaped)

  • this test should not be used for skewed data

  • the test is often applied to paired data

  • \(\eta_0\) is the prediction itself when the prediction is not a list of the same length as the observation data

  • for paired data \(\eta_0 = 0\), i.e. the null hypothesis is a zero population median difference

Use Case:

from cerebstats.stat_scores.zWilcoxSignedRankScore import ZScoreForWilcoxSignedRankTest
x = ZScoreForWilcoxSignedRankTest.compute( observation, prediction )
score = ZScoreForWilcoxSignedRankTest(x)

Note: As part of the SciUnit framework this custom score should have the following methods,

  • compute() (class method)

  • sort_key() (property)

  • __str__()

Additionally,

classmethod compute(observation, prediction)

Argument

Value type

first argument

dictionary; observation/experimental data

second argument

float or array; simulated data

Note:

  • observation must have the key “raw_data” whose value is the list of numbers

  • if the simulation, i.e., the model prediction, is not a float, it must also have the key “raw_data”

classmethod get_Tplus(data, null_value)

Returns computed Wilcoxon signed-rank statistic, Tplus.

  • case1: data = observation[“raw_data”], null_value = prediction

  • case2: data = observation[“raw_data”] - prediction[“raw_data”] (element-wise paired differences), null_value = 0
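A minimal sketch of how the two cases could be told apart is given below. It assumes the paired difference is taken as observation minus prediction, as stated in the footnote to the definitions table. The function name `select_data_and_null` and the length check are assumptions made for illustration; they are not part of the documented API.

```python
def select_data_and_null(observation, prediction):
    """Return (data, null_value) for the one-sample and paired cases.

    Hypothetical helper, not the library's code.
    """
    obs = list(observation["raw_data"])
    if isinstance(prediction, (int, float)):
        # case 1: one-sample test, the prediction is the null value
        return obs, float(prediction)
    # case 2: paired data, test the differences against a null value of 0
    pred = list(prediction["raw_data"])
    if len(pred) != len(obs):
        # assumption: paired testing needs equally long samples
        raise ValueError("paired data must have the same length")
    return [o - p for o, p in zip(obs, pred)], 0.0


observation = {"raw_data": [65, 55, 60, 62, 70]}
print(select_data_and_null(observation, 60))
print(select_data_and_null(observation, {"raw_data": [60, 58, 60, 61, 66]}))
```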

Example for describing what ‘ranking’ means:

\(data = [65, 55, 60, 62, 70]\)

\(null\_value = 60\)

Then,

\(ordered\_data = [55, 60, 62, 65, 70]\)

\(absolute\_difference = [5, 0, 2, 5, 10]\)

\(absolute\_difference\_without\_zeros = [5, 2, 5, 10]\)

\(ordered\_data\_without\_zeros = [55, 62, 65, 70]\)

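\(all\_ranks = [2.5, 1, 2.5, 4]\) (ranks of the absolute differences; the two tied differences of 5 share the midrank \((2+3)/2 = 2.5\))

Therefore, \(T^+\), the Wilcoxon signed-rank statistic (the sum of the ranks of the values above \(null\_value\), here 62, 65 and 70), is

\(Tplus = 1 + 2.5 + 4 = 7.5\)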
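For comparison, the sketch below computes \(T^+\) directly from the definitions table: rank the non-zero absolute differences, give ties their midrank, and sum the ranks of the values above the null value. It uses a different, made-up data set, and the function name `t_plus` is hypothetical; it is not the library's `get_Tplus`.

```python
def t_plus(data, null_value):
    """Return (T+, n_U) following the definitions table.

    Hypothetical helper: ties among the absolute differences receive
    their midrank.
    """
    # signed differences, dropping values equal to the null value
    diffs = [x - null_value for x in data if x != null_value]
    abs_diffs = [abs(d) for d in diffs]

    def midrank(value):
        less = sum(a < value for a in abs_diffs)    # ranks taken by smaller values
        equal = sum(a == value for a in abs_diffs)  # size of the tie group
        return less + (equal + 1) / 2

    # sum only the ranks that belong to values above the null value
    total = sum(midrank(abs(d)) for d in diffs if d > 0)
    return total, len(diffs)


# abs. differences [2, 2, 5, 1] get ranks [2.5, 2.5, 4, 1];
# the values above 10 are 12 and 15, so T+ = 2.5 + 4
print(t_plus([12, 8, 10, 15, 9], 10))  # (6.5, 4)
```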

static get_ranks(absdiff_without_zero)

Static function that orders the data and returns its appropriate rank.

Step-1:

  • get unique values in the ordered data

  • also get the frequency (count) of each unique value

Step-2:

  • construct raw ranks based on the ordered data

Step-3:

  • for each value in the ordered data find its index in unique values array

  • if the corresponding count is more than one, compute its midrank (sum of its raw ranks divided by its count)

  • set ranks (in raw ranks) for the corresponding number of values with the computed midrank
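The steps above can be sketched as follows. This is an illustrative reimplementation using only the standard library, not the source of `get_ranks`; the function name `midranks` and the choice to return the ordered values alongside the ranks are assumptions.

```python
from collections import Counter


def midranks(absdiff_without_zero):
    """Return (ordered values, ranks), with ties replaced by midranks.

    Hypothetical helper, not the library's get_ranks.
    """
    ordered = sorted(absdiff_without_zero)

    # unique values and how often each one occurs
    counts = Counter(ordered)

    # raw ranks 1..n, following the ordered data
    raw_ranks = list(range(1, len(ordered) + 1))
    ranks = [float(r) for r in raw_ranks]

    # every tie group gets its midrank
    start = 0
    for value in sorted(counts):
        count = counts[value]
        if count > 1:
            tied = raw_ranks[start:start + count]
            ranks[start:start + count] = [sum(tied) / count] * count
        start += count
    return ordered, ranks


print(midranks([5, 2, 5, 10]))  # ([2, 5, 5, 10], [1.0, 2.5, 2.5, 4.0])
```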