Can a Chi2 test be used on uniform p test results?

In summary, the thread discusses why the Diehard tester applies a KS test, rather than a Chi-squared test, to p-values that are meant to be uniformly distributed. A Chi-squared test requires dividing the data into bins, which discards information, so the KS test is more powerful for small data sets such as 10 p-values. The expected value of p for a uniform distribution on [0, 1] is 0.5, but this does not mean that the values will cluster around 0.5. The thread also touches on the confusion between discrete and continuous distributions in this context.
  • #1
Paul Uszak
Not sure that I've phrased the question correctly. If you have a series of p values from a series of tests, and they're all meant to be uniformly distributed, why do you have to do a KS test on that, and not another Chi-squared test?

The following is an extract from a test program's output:-
Test no. 1 p-value .886973
Test no. 2 p-value .473563
Test no. 3 p-value .358962
Test no. 4 p-value .894858
Test no. 5 p-value .767457
Test no. 6 p-value .583446
Test no. 7 p-value .227626
Test no. 8 p-value .765091
Test no. 9 p-value .298747
Test no. 10 p-value .108371
Results of the OSUM test for pu256.bin
KSTEST on the above 10 p-values: .059581

The p-values are meant to be uniformly distributed across 0.0 to 1.0. This implies that they should all be 0.5ish. Why doesn't the program (Diehard randomness tester) perform a Chi-squared test on the ps? This happens several times in the complete report, so I take it to be deliberate. It's always a KS test on uniformly distributed ps. Isn't the Chi-squared test numerically simpler too?
 
  • #2
Paul Uszak said:
The p-values are meant to be uniformly distributed across 0.0 to 1.0. This implies that they should all be 0.5ish.
This is a contradiction. If the values of p are uniformly distributed, they should not be clustered around 0.5.
Why doesn't the program (Diehard randomness tester) perform a Chi-squared test on the ps? This happens several times in the complete report, so I take it to be deliberate. It's always a KS test on uniformly distributed ps. Isn't the Chi-squared test numerically simpler too?
Here is my 2 cents. For Chi-squared, you have to divide the sample data into bins. That loses a lot of information and you do not have a lot of data (10 values of p). Since KS looks at the maximum difference between the cumulative distribution of the sample versus the theoretical cumulative distribution, it is more powerful for small data sets.
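As a rough illustration of that point, here is a minimal Python sketch that computes the two-sided KS statistic D for the 10 p-values listed in post #1 against the Uniform(0, 1) CDF, and then shows what is left of the same data after binning. The helper names (`ks_statistic_uniform`, `bin_counts`) are made up for this example, and it does not try to reproduce Diehard's own KSTEST figure (.059581), since the thread does not say exactly how Diehard computes it.

```python
# p-values copied from the Diehard output in post #1
P_VALUES = [0.886973, 0.473563, 0.358962, 0.894858, 0.767457,
            0.583446, 0.227626, 0.765091, 0.298747, 0.108371]


def ks_statistic_uniform(values):
    """Two-sided KS statistic D against the Uniform(0, 1) CDF, F(x) = x."""
    xs = sorted(values)
    n = len(xs)
    # The empirical CDF jumps from (i - 1)/n to i/n at the i-th sorted value.
    d_plus = max(i / n - x for i, x in enumerate(xs, start=1))
    d_minus = max(x - (i - 1) / n for i, x in enumerate(xs, start=1))
    return max(d_plus, d_minus)


def bin_counts(values, n_bins):
    """Count values falling in n_bins equal-width bins covering [0, 1]."""
    counts = [0] * n_bins
    for v in values:
        # min() keeps v == 1.0 in the last bin
        counts[min(int(v * n_bins), n_bins - 1)] += 1
    return counts


if __name__ == "__main__":
    print("KS D:", round(ks_statistic_uniform(P_VALUES), 6))
    print("2 bins:", bin_counts(P_VALUES, 2))
    print("5 bins:", bin_counts(P_VALUES, 5))
```

With two bins the counts come out as 5 and 5, which matches the uniform expectation exactly, so a Chi-squared test on those bins would see no deviation at all, while D still measures how far the individual values sit from the ideal CDF. With five bins the expected count is only 2 per bin, which is very little for a Chi-squared test to work with.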
 
  • #3
Hmm, sounds like I'm confusing discrete and continuous distributions... Again. I'd made the assumption that if the p-values were uniformly distributed 0 - 1, then the expected p-values would be 0.5, hence a Chi-squared test could be attempted. As I write my assumption, I can see that it's logically flawed.
 
  • #4
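The Chi-squared test works on binned counts: its statistic only approximately follows a chi-squared distribution (the distribution of a sum of squares of independent standard normal variables), and that approximation needs reasonably large expected counts in every bin. The K-S test needs no binning and compares the sample directly against the hypothesized continuous distribution, so for uniform variables it is a better choice.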
 
  • #5
Paul Uszak said:
Hmm, sounds like I'm confusing discrete and continuous distributions... Again. I'd made the assumption that if the p-values were uniformly distributed 0 - 1, then the expected p-values would be 0.5, hence a Chi-squared test could be attempted. As I write my assumption, I can see that it's logically flawed.
You are right that the expected value of p, uniform in [0,1], would be 0.5. But that does not mean that the values will cluster around 0.5. In fact they should be randomly spread between 0 and 1. As an extreme example, suppose you have a random variable X which is 0 half the time and 1 the other half. Then its expected value is 0.5, but it is always at 0 or 1.

PS: There can be more clustering in uniformly random data than you would expect, but there is no reason for the clustering to prefer 0.5.
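A quick simulation makes the same point. This is only a sketch: the function names, the sample size, the seed and the 0.1 window around 0.5 are arbitrary choices for illustration. It draws uniform values and fair 0/1 values and reports, for each, the sample mean and the fraction of values that land close to 0.5.

```python
import random


def summarize(samples, half_width=0.1):
    """Return (sample mean, fraction of samples within half_width of 0.5)."""
    n = len(samples)
    mean = sum(samples) / n
    near_half = sum(1 for s in samples if abs(s - 0.5) < half_width) / n
    return mean, near_half


def simulate(n=100_000, seed=12345):
    """Draw n uniform values and n fair 0/1 values from a seeded generator."""
    rng = random.Random(seed)
    uniform = [rng.random() for _ in range(n)]
    coin = [float(rng.randrange(2)) for _ in range(n)]
    return summarize(uniform), summarize(coin)


if __name__ == "__main__":
    (u_mean, u_near), (c_mean, c_near) = simulate()
    print(f"uniform:  mean={u_mean:.3f}, within 0.1 of 0.5: {u_near:.3f}")
    print(f"0/1 coin: mean={c_mean:.3f}, within 0.1 of 0.5: {c_near:.3f}")
```

Both means should come out close to 0.5. Roughly 20% of the uniform values fall between 0.4 and 0.6, which is just the width of that interval, and none of the 0/1 values do.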
 

Related to Can a Chi2 test be used on uniform p test results?

1. Can a Chi2 test be used on uniform p test results?

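Yes, but with caveats. P-values are continuous, not categorical, so a Chi2 goodness-of-fit test can be applied only after dividing [0, 1] into bins and comparing the observed count in each bin with the count expected under uniformity. Binning discards information, and with only a few p-values the expected count per bin is too small for the Chi2 approximation to be reliable, which is why a KS test is usually preferred in that case.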

2. How does a Chi2 test work?

A Chi2 test works by comparing the observed frequencies of a categorical variable with the expected frequencies. The expected frequencies are calculated based on the null hypothesis, which states that there is no significant difference between the groups. The Chi2 test then calculates a test statistic and compares it to a critical value to determine if the null hypothesis can be rejected.
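As a small worked sketch of those steps, the Python below computes the Pearson statistic for a goodness-of-fit check and turns it into a p-value. The observed counts are what you get by placing the 10 p-values from post #1 into 5 equal-width bins on [0, 1]. The function names are invented for this example, and the p-value routine uses the closed form of the chi-squared survival function that holds only for an even number of degrees of freedom, so it is not a general-purpose replacement for a statistics library.

```python
import math


def chi2_statistic(observed, expected):
    """Pearson statistic: sum of (O - E)^2 / E over all categories."""
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


def chi2_sf_even_df(x, df):
    """P(X >= x) for a chi-squared variable with even degrees of freedom.

    For df = 2k the survival function has the closed form
    exp(-x/2) * sum_{i=0}^{k-1} (x/2)^i / i!.
    """
    if df <= 0 or df % 2 != 0:
        raise ValueError("this sketch only handles positive even df")
    half = x / 2.0
    return math.exp(-half) * sum(half ** i / math.factorial(i)
                                 for i in range(df // 2))


if __name__ == "__main__":
    observed = [1, 3, 2, 2, 2]   # the 10 p-values from post #1 in 5 equal bins
    expected = [2.0] * 5         # 10 values / 5 bins under uniformity
    stat = chi2_statistic(observed, expected)
    df = len(observed) - 1       # 4 degrees of freedom
    print("chi2 =", stat, "p =", round(chi2_sf_even_df(stat, df), 4))
```

Keep in mind that the expected count here is only 2 per bin, well under the usual rule of thumb of about 5, so the resulting p-value is illustrative rather than trustworthy.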

3. What is the null hypothesis in a Chi2 test?

The null hypothesis in a Chi2 test states that there is no difference between the groups being compared or, in a goodness-of-fit test, that the data follow the hypothesized distribution (for p-values, uniform on [0, 1]). It is the default assumption and is used to calculate expected frequencies for comparison with the observed frequencies.

4. When should a Chi2 test be used?

A Chi2 test should be used when analyzing categorical data and determining if there is a significant difference between two or more groups. It is commonly used in social sciences, medical research, and market research.

5. What are the limitations of a Chi2 test?

The Chi2 test works only on categorical (or binned) data, and its approximation is usually considered reliable only when the expected frequency in each category is at least about 5. It also does not provide information on the direction or strength of the relationship between variables. As a result, the Chi2 test may not be appropriate for small sample sizes, where binning leaves too few expected counts per category.
