Can a Chi2 test be used on uniform p test results?

Paul Uszak · Aug 14, 2015

Not sure that I've phrased the question correctly. If you have a series of p values from a series of tests, and they're all meant to be uniformly distributed, why do you have to do a KS test on that, and not another Chi-squared test?

The following is an extract from a test program's output:-
Test no. 1 p-value .886973
Test no. 2 p-value .473563
Test no. 3 p-value .358962
Test no. 4 p-value .894858
Test no. 5 p-value .767457
Test no. 6 p-value .583446
Test no. 7 p-value .227626
Test no. 8 p-value .765091
Test no. 9 p-value .298747
Test no. 10 p-value .108371
Results of the OSUM test for pu256.bin
KSTEST on the above 10 p-values: .059581

The p-values are meant to be uniformly distributed across 0.0 to 1.0. This implies that they should all be 0.5ish. Why doesn't the program (Diehard randomness tester) perform a Chi-squared test on the ps? This happens several times in the complete report, so I take it to be deliberate. It's always a KS test on uniformly distributed ps. Isn't the Chi-squared test numerically simpler too?

FactChecker · Aug 14, 2015

Paul Uszak said:

The p-values are meant to be uniformly distributed across 0.0 to 1.0. This implies that they should all be 0.5ish.

This is a contradiction. If the values of p are uniformly distributed, they should not be clustered around 0.5.

Why doesn't the program (Diehard randomness tester) perform a Chi-squared test on the ps? This happens several times in the complete report, so I take it to be deliberate. It's always a KS test on uniformly distributed ps. Isn't the Chi-squared test numerically simpler too?

Here is my 2 cents. For Chi-squared, you have to divide the sample data into bins. That loses a lot of information and you do not have a lot of data (10 values of p). Since KS looks at the maximum difference between the cumulative distribution of the sample versus the theoretical cumulative distribution, it is more powerful for small data sets.
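To make the comparison concrete, here is a minimal sketch in plain Python that computes both statistics for the ten p-values quoted above. The function names and the choice of five equal-width bins are assumptions made for illustration; this is not Diehard's own KSTEST routine and it does not try to reproduce the .059581 figure in the report.

```python
# p-values copied from the Diehard output quoted in the question.
p_values = [0.886973, 0.473563, 0.358962, 0.894858, 0.767457,
            0.583446, 0.227626, 0.765091, 0.298747, 0.108371]


def ks_statistic_uniform(sample):
    """Largest gap between the empirical CDF and the uniform CDF F(x) = x."""
    xs = sorted(sample)
    n = len(xs)
    d = 0.0
    for i, x in enumerate(xs, start=1):
        # The empirical CDF jumps from (i - 1)/n to i/n at x,
        # so the gap is checked on both sides of the jump.
        d = max(d, i / n - x, x - (i - 1) / n)
    return d


def chi2_statistic_uniform(sample, n_bins):
    """Pearson chi-squared statistic for equal-width bins on [0, 1]."""
    counts = [0] * n_bins
    for x in sample:
        # min() keeps a value of exactly 1.0 in the last bin.
        counts[min(int(x * n_bins), n_bins - 1)] += 1
    expected = len(sample) / n_bins
    stat = sum((c - expected) ** 2 / expected for c in counts)
    return stat, counts


ks_d = ks_statistic_uniform(p_values)
chi2, counts = chi2_statistic_uniform(p_values, n_bins=5)  # 5 bins is an arbitrary choice

print(f"KS statistic D = {ks_d:.6f}")
print(f"bin counts = {counts}, chi-squared statistic = {chi2:.3f}")
```

With five bins, each bin expects only 2 of the 10 values, which is below the usual rule of thumb of about 5 per bin for the chi-squared approximation. The KS statistic uses every value at its exact position, which is the information that binning throws away.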

Paul Uszak · Aug 15, 2015

Hmm, sounds like I'm confusing discrete and continuous distributions... Again. I'd made the assumption that if the p-values were uniformly distributed between 0 and 1, then the expected p-values would be 0.5, hence a Chi-squared test could be attempted. As I write my assumption, I can see that it's logically flawed.

BWV · Aug 16, 2015

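The chi-squared test gets its name from its test statistic, which is computed from binned counts and, under the null hypothesis, approximately follows a chi-squared distribution (the distribution of a sum of squares of independent standard normal variables). The K-S test needs no binning, and its statistic has the same null distribution for any continuous hypothesized distribution, so for uniform variables it is a better choice.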

FactChecker · Aug 16, 2015

Paul Uszak said:

Hmm, sounds like I'm confusing discrete and continuous distributions... Again. I'd made the assumption that if the p-values were uniformly distributed between 0 and 1, then the expected p-values would be 0.5, hence a Chi-squared test could be attempted. As I write my assumption, I can see that it's logically flawed.

You are right that the expected value of p, uniform on [0,1], is 0.5. But that does not mean that the values will cluster around 0.5. In fact, they should be spread evenly at random between 0 and 1. As an extreme example, suppose you have a random variable X which is 0 half the time and 1 the other half. Then its expected value is 0.5, but it only ever takes the values 0 and 1.

PS: There can be more clustering in uniformly random data than you would expect, but there is no reason for the clustering to prefer 0.5.
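The difference between "expected value 0.5" and "values near 0.5" can be checked with a short simulation. This is only a sketch: the seed, the sample size, and the helper names are arbitrary choices made for illustration.

```python
import random

rng = random.Random(12345)  # arbitrary seed, only for repeatability
n = 100_000                 # arbitrary sample size

# Uniform on [0, 1): mean 0.5, values spread evenly over the interval.
uniform_draws = [rng.random() for _ in range(n)]

# The extreme example: X is 0 half the time and 1 the other half.
coin_draws = [rng.choice((0, 1)) for _ in range(n)]


def mean(values):
    return sum(values) / len(values)


def fraction_near_half(values, half_width=0.1):
    """Share of values that fall within 0.5 +/- half_width."""
    return sum(abs(v - 0.5) <= half_width for v in values) / len(values)


uniform_mean = mean(uniform_draws)
coin_mean = mean(coin_draws)
uniform_near = fraction_near_half(uniform_draws)
coin_near = fraction_near_half(coin_draws)

print(f"uniform: mean = {uniform_mean:.3f}, share in [0.4, 0.6] = {uniform_near:.3f}")
print(f"0/1 X:   mean = {coin_mean:.3f}, share in [0.4, 0.6] = {coin_near:.3f}")
```

Both means should come out close to 0.5. For the uniform draws only about a fifth of the values land in [0.4, 0.6], the same share as any other interval of width 0.2, and for the 0/1 variable none do.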
