Hypothesis testing about medians (HtestAboutMedians)

class cerebstats.hypothesis_testings.aboutmedians.HtestAboutMedians(observation, prediction, test={'name': 'sign_test', 'side': 'not_equal', 'z_statistic': 0.0})

Hypothesis Testing (significance testing) about medians.

This is a nonparametric test that does not assume a specific type of distribution. It is therefore robust (valid over a broad range of circumstances) and resistant (to the influence of outliers).

1. Verify necessary data conditions.

| Statistic | Interpretation |
|-----------|----------------|
| data | experiment/observed data array \(^{\dagger}\) |

\(^{\dagger}\) Here,

  • \(\overrightarrow{x} =\) experimental data for one sample testing

  • \(\overrightarrow{x} =\) (experimental - prediction) data for paired data testing

  • thus \(\eta =\) median of \(\overrightarrow{x}\)
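The data preparation described above can be sketched as follows. This is a minimal illustration using only the standard library; the helper name `prepare_data` is hypothetical and is not part of the class.

```python
from statistics import median


def prepare_data(experiment, prediction=None):
    """Return (x, eta) for one-sample or paired-data testing.

    Hypothetical helper for illustration only.
    """
    if prediction is None:
        # one sample testing: x is the experimental data itself
        x = list(experiment)
    else:
        # paired data testing: x is (experimental - prediction), element-wise
        if len(experiment) != len(prediction):
            raise ValueError("paired data must have equal lengths")
        x = [e - p for e, p in zip(experiment, prediction)]
    # eta is the median of x in both cases
    return x, median(x)


x_one, eta_one = prepare_data([2.1, 3.4, 1.9, 5.0, 2.8])
x_pair, eta_pair = prepare_data([2.1, 3.4, 1.9], [2.0, 3.0, 2.5])
```

For paired data the resulting median is compared against a null value of 0 rather than against the prediction itself.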

2. Define null and alternate hypotheses.

| Statistic | Interpretation |
|-----------|----------------|
| sample statistic, \(\eta\) | experiment/observed median \(^{\dagger}\) |
| null value/population parameter, \(\eta_0\) | prediction (specified value) \(^{\dagger}\) |
| null hypothesis, \(H_0\) | \(\eta = \eta_0\) |
| alternate hypothesis, \(H_a\) | \(\eta \neq \eta_0\), \(\eta < \eta_0\), or \(\eta > \eta_0\) |

Depending on whether testing is for a single sample or for paired data \(^{\dagger}\):

| Statistic | single sample | paired data |
|-----------|---------------|-------------|
| \(\eta\) | experiment/observed median | median of (experiment - prediction) |
| \(\eta_0\) | model prediction | 0 |

Two-sided hypothesis (default)

\(H_0\): \(\eta = \eta_0\) and \(H_a\): \(\eta \neq \eta_0\)

One-sided hypothesis (left-sided)

\(H_0\): \(\eta = \eta_0\) and \(H_a\): \(\eta < \eta_0\)

One-sided hypothesis (right-sided)

\(H_0\): \(\eta = \eta_0\) and \(H_a\): \(\eta > \eta_0\)
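The three pairs of hypotheses can be generated from the `side` value. The sketch below is only an illustration of that mapping: the function `hypotheses` and the plain-text symbols are assumptions, and it is not the implementation of the class's own `null_hypothesis()` and `alternate_hypothesis()` methods.

```python
# Relation used in the alternate hypothesis for each documented side value.
RELATIONS = {"not_equal": "!=", "less_than": "<", "greater_than": ">"}


def hypotheses(side="not_equal", sample_symbol="eta", null_symbol="eta_0"):
    """Return the statements (H0, Ha) for the given side (hypothetical helper)."""
    if side not in RELATIONS:
        raise ValueError(f"side must be one of {sorted(RELATIONS)}")
    h0 = f"H0: {sample_symbol} = {null_symbol}"
    ha = f"Ha: {sample_symbol} {RELATIONS[side]} {null_symbol}"
    return h0, ha


two_sided = hypotheses()
left_sided = hypotheses("less_than")
right_sided = hypotheses("greater_than")
```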

3. Assuming H0 is true, find the p-value.

If the data is skewed, the non-parametric z-score is computed for the sign test.

| Statistic | Interpretation |
|-----------|----------------|
| \(s_{+}\) | number of values in sample \(> \eta_0\) |
| \(s_{-}\) | number of values in sample \(< \eta_0\) |
| \(n_U = s_{+} + s_{-}\) | number of values in sample \(\neq \eta_0\) |
| z_statistic, z | \(z = \frac{s_{+} - \frac{n_U}{2}}{\sqrt{\frac{n_U}{4}}}\) |
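The sign-test quantities defined above translate directly into code. The following is a minimal sketch with the standard library; the function name `sign_test_z` and the sample values are made up for illustration and are not part of the class.

```python
import math


def sign_test_z(x, eta_0):
    """Return (s_plus, s_minus, n_u, z) for the sign test (hypothetical helper)."""
    s_plus = sum(1 for v in x if v > eta_0)   # values above the null value
    s_minus = sum(1 for v in x if v < eta_0)  # values below the null value
    n_u = s_plus + s_minus                    # values not equal to the null value
    if n_u == 0:
        raise ValueError("all values equal the null value; z is undefined")
    z = (s_plus - n_u / 2) / math.sqrt(n_u / 4)
    return s_plus, s_minus, n_u, z


# Ten made-up values tested against a null value of 3.
s_plus, s_minus, n_u, z = sign_test_z([1, 2, 3, 4, 5, 6, 7, 8, 9, 10], eta_0=3)
```

In this example one value equals the null value and is dropped, so \(n_U = 9\), \(s_{+} = 7\), and \(z = (7 - 4.5)/1.5\).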

If the data is not skewed, the non-parametric z-score is computed for the signed-rank test (the Wilcoxon signed-rank test, not the Wilcoxon rank-sum test).

| Statistic | Interpretation |
|-----------|----------------|
| \(\overrightarrow{x}\) | data \(^{\dagger}\) |
| \(|x_i-\eta_0|\) | absolute difference between data values and null value |
| \(T\) | ranks of the computed differences (excluding differences equal to 0) |
| \(T^+\) | sum of the ranks of values \(> \eta_0\); Wilcoxon signed-rank statistic |
| \(n_U\) | number of values in data not equal to \(\eta_0\) |
| z_statistic, z | \(z = \frac{T^+ - [n_U(n_U+1)/4]}{\sqrt{n_U(n_U+1)(2n_U+1)/24}}\) |
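The signed-rank computation can be sketched in the same way. The helpers `average_ranks` and `signed_rank_z` are hypothetical; giving tied absolute differences the average of their ranks is an assumption of this sketch, since the text does not say how ties are handled.

```python
import math


def average_ranks(values):
    """Rank values starting at 1; tied values share the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # extend j to the end of the run of tied values
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # average of the 1-based positions i+1 .. j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def signed_rank_z(x, eta_0):
    """Return (t_plus, n_u, z) for the Wilcoxon signed-rank test."""
    diffs = [v - eta_0 for v in x if v != eta_0]  # drop differences equal to 0
    n_u = len(diffs)
    if n_u == 0:
        raise ValueError("all values equal the null value; z is undefined")
    ranks = average_ranks([abs(d) for d in diffs])
    # T+ sums the ranks belonging to values above the null value
    t_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    mean = n_u * (n_u + 1) / 4
    sd = math.sqrt(n_u * (n_u + 1) * (2 * n_u + 1) / 24)
    return t_plus, n_u, (t_plus - mean) / sd


# Made-up values tested against a null value of 3.0.
t_plus, n_u, z = signed_rank_z([1.5, 2.5, 4.0, 5.5, 7.0, 3.0], eta_0=3.0)
```

Here the value 3.0 is excluded, the remaining absolute differences receive ranks 1 to 5, and the three values above the null value contribute ranks 2, 4, and 5 to \(T^+\).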

The p-value corresponding to z is then obtained from the standard normal curve (a z look-up table).
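In place of a printed table, the look-up can be sketched with the standard normal cumulative distribution function. The function names below are hypothetical, and the mapping from `side` to the tail of the curve is an assumption based on the hypotheses in step 2; it is not taken from the class's `_compute_pvalue()`.

```python
import math


def normal_cdf(z):
    """Cumulative probability of the standard normal curve up to z."""
    return 0.5 * math.erfc(-z / math.sqrt(2))


def p_value(z, side="not_equal"):
    """Return the p-value for a z-statistic and the side of the test."""
    if side == "less_than":
        return normal_cdf(z)             # left tail
    if side == "greater_than":
        return normal_cdf(-z)            # right tail, by symmetry
    if side == "not_equal":
        return 2 * normal_cdf(-abs(z))   # both tails
    raise ValueError("side must be 'not_equal', 'less_than' or 'greater_than'")


p_two_sided = p_value(1.96)
p_left = p_value(-1.645, side="less_than")
p_right = p_value(1.0, side="greater_than")
```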

4. Report and answer the question: based on the p-value, is the result (assuming H0 is true) statistically significant?

The answer is not provided by the class; it is up to the person viewing the reported result. The reports are obtained from the attributes .statistics and .outcome (the latter is typically assigned to the description of a score). This is illustrated below.

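# the third argument follows the constructor signature above;
# score.score is assumed here to hold the precomputed z-statistic
ht = HtestAboutMedians( observation, prediction,
                        test={"name": "sign_test", "side": "less_than",
                              "z_statistic": score.score} )
score.description = ht.outcome
score.statistics = ht.statistics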

Arguments

| Argument | Representation | Value type |
|----------|----------------|------------|
| first | experiment/observation | dictionary that must have the keys "median", "sample_size", "raw_data" |
| second | model prediction | float or Quantity array |
| third (keyword) | about test | dictionary with the keys "name": string ("sign_test", "signed_rank_test"); "z_statistic": float; "side": string ("not_equal", "less_than", "greater_than"); and any additional keys that are specific to the test |
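As an illustration of the argument shapes listed above, the three arguments could be assembled as follows. This sketch only builds plain Python objects and does not call the class; the values are made up, and the use of a plain list for "raw_data" is an assumption (the library may expect a NumPy or Quantity array).

```python
from statistics import median

# Made-up experimental values.
raw_data = [2.1, 3.4, 1.9, 5.0, 2.8]

# first argument: observation dictionary with the three required keys
observation = {
    "median": median(raw_data),
    "sample_size": len(raw_data),
    "raw_data": raw_data,
}

# second argument: model prediction (a float in this sketch)
prediction = 3.0

# third (keyword) argument: dictionary describing the test
test = {
    "name": "sign_test",     # or "signed_rank_test"
    "side": "less_than",     # or "not_equal", "greater_than"
    "z_statistic": 0.0,      # placeholder; the default value from the signature
}
```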

The constructor generates the statistics and the outcome (the latter is then assigned to description within the validation test class where this hypothesis test class is used).

static alternate_hypothesis(side, symbol_null_value, symbol_sample_statistic)

Returns the statement for the alternate hypothesis, Ha.

get_below_equal_above(data)

Sets the attributes .below, .equal, and .above, which describe the data values below, equal to, and above the null value, \(\eta_0\) = .specified_value.

static null_hypothesis(symbol_null_value, symbol_sample_statistic)

Returns the statement for the null hypothesis, H0.

test_outcome()

Puts together the returned values of null_hypothesis(), alternate_hypothesis(), and _compute_pvalue(). Then returns the string value for .outcome.