Chi-Square Test Explained: When and How to Use It
The chi-square test is one of the most widely used statistical tests for categorical data. It allows you to determine whether observed frequencies differ significantly from what you would expect under a specific hypothesis. Whether you are testing whether a die is fair, checking if two survey variables are related, or validating a genetic model, the chi-square test provides a rigorous framework for answering these questions.
This guide covers the chi-square formula, the two main types of chi-square test (goodness of fit and independence), degrees of freedom, p-value interpretation, and three fully worked examples with every calculation shown.
The Chi-Square Formula
Both types of chi-square test use the same core formula. For each category, you compare the observed count (O) to the expected count (E):

χ² = Σ (O - E)² / E
The formula squares the difference between observed and expected values, divides by the expected value (to normalise), and sums across all categories. A larger value means the observed data deviates more from what was expected.
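The formula translates directly into a few lines of code. Here is a minimal Python sketch (the function name is ours, not from any library):

```python
def chi_square_statistic(observed, expected):
    """Return the chi-square statistic: the sum of (O - E)^2 / E over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Die example from later in this article: 120 rolls, 20 expected per face.
print(chi_square_statistic([25, 17, 22, 18, 14, 24], [20] * 6))  # ≈ 4.7
```

If the observed counts exactly equal the expected counts, the statistic is 0; larger deviations in either direction increase it.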
Two Types of Chi-Square Test
1. Goodness of Fit Test
The goodness of fit test checks whether a single categorical variable follows a hypothesised distribution. You have one set of observed counts and one set of expected counts based on your hypothesis.
Example questions: Is this die fair? Do customers visit equally on all days of the week? Does the distribution of blood types in a sample match the known population proportions?
2. Test of Independence
The test of independence checks whether two categorical variables are related. The data is organised in a contingency table (rows for one variable, columns for the other), and you test whether the variables are independent of each other.
Example questions: Is there a relationship between gender and voting preference? Does smoking status depend on age group? Is product preference related to geographic region?
Degrees of Freedom
The degrees of freedom (df) determine which chi-square distribution to use when finding your p-value. The formula depends on the type of test.
- Goodness of fit: df = k - 1, where k is the number of categories.
- Test of independence: df = (r - 1)(c - 1), where r is the number of rows and c is the number of columns in the contingency table.
Worked Example 1: Is This Die Fair? (Goodness of Fit)
You roll a six-sided die 120 times and record the following results:
| Face | 1 | 2 | 3 | 4 | 5 | 6 |
|---|---|---|---|---|---|---|
| Observed | 25 | 17 | 22 | 18 | 14 | 24 |
| Expected | 20 | 20 | 20 | 20 | 20 | 20 |
If the die is fair, each face should appear 120 / 6 = 20 times. The null hypothesis is that the die is fair (all faces equally likely).
Step 1: Calculate Each Component
| Face | O | E | O - E | (O - E)² | (O - E)² / E |
|---|---|---|---|---|---|
| 1 | 25 | 20 | 5 | 25 | 1.25 |
| 2 | 17 | 20 | -3 | 9 | 0.45 |
| 3 | 22 | 20 | 2 | 4 | 0.2 |
| 4 | 18 | 20 | -2 | 4 | 0.2 |
| 5 | 14 | 20 | -6 | 36 | 1.8 |
| 6 | 24 | 20 | 4 | 16 | 0.8 |
Step 2: Sum the Components

χ² = 1.25 + 0.45 + 0.2 + 0.2 + 1.8 + 0.8 = 4.7
Step 3: Find the Degrees of Freedom

df = k - 1 = 6 - 1 = 5
Step 4: Determine the P-Value
Using a chi-square distribution table or calculator with χ² = 4.7 and df = 5, the p-value is approximately 0.454.
Step 5: Interpret
At a significance level of α = 0.05, the p-value (0.454) is much greater than 0.05. We fail to reject the null hypothesis. There is no statistically significant evidence that the die is unfair. The observed deviations from 20 per face are well within the range of normal random variation.
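If you want to verify the arithmetic without a table, the whole example fits in a short pure-Python script. The p-value uses a standard closed-form expression for the chi-square tail probability when df = 5:

```python
import math

# Observed rolls of the die and the 20 expected per face under the null hypothesis.
observed = [25, 17, 22, 18, 14, 24]
expected = [120 / 6] * 6

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 5 the chi-square tail probability has a closed form:
# P(X > x) = erfc(sqrt(x/2)) + exp(-x/2) * sqrt(2x/pi) * (1 + x/3)
p_value = (math.erfc(math.sqrt(chi2 / 2))
           + math.exp(-chi2 / 2) * math.sqrt(2 * chi2 / math.pi)
           * (1 + chi2 / 3))

print(round(chi2, 2), round(p_value, 3))  # ≈ 4.7 and ≈ 0.454
```

The closed form exists because df = 5 is a small odd integer; for arbitrary degrees of freedom you would normally call statistical software instead.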
Try it yourself
Use our Chi-Square Calculator to compute χ² and the p-value for your own data.
Worked Example 2: Survey Independence (Test of Independence)
A researcher surveys 200 people about their preferred mode of transport (car, bus, or bicycle) and records their age group (under 30 or 30 and over). The question: is transport preference independent of age group?
| | Car | Bus | Bicycle | Row Total |
|---|---|---|---|---|
| Under 30 | 30 | 35 | 25 | 90 |
| 30 and over | 60 | 30 | 20 | 110 |
| Column Total | 90 | 65 | 45 | 200 |
Step 1: Calculate Expected Frequencies
For each cell, the expected frequency is:

E = (row total × column total) / grand total
| | Car (E) | Bus (E) | Bicycle (E) |
|---|---|---|---|
| Under 30 | 90 × 90 / 200 = 40.5 | 90 × 65 / 200 = 29.25 | 90 × 45 / 200 = 20.25 |
| 30 and over | 110 × 90 / 200 = 49.5 | 110 × 65 / 200 = 35.75 | 110 × 45 / 200 = 24.75 |
Step 2: Calculate Chi-Square

χ² = (30 - 40.5)²/40.5 + (35 - 29.25)²/29.25 + (25 - 20.25)²/20.25 + (60 - 49.5)²/49.5 + (30 - 35.75)²/35.75 + (20 - 24.75)²/24.75

χ² ≈ 2.72 + 1.13 + 1.11 + 2.23 + 0.92 + 0.91 ≈ 9.03
Step 3: Degrees of Freedom

df = (r - 1)(c - 1) = (2 - 1)(3 - 1) = 2
Step 4: P-Value and Interpretation
With χ² ≈ 9.03 and df = 2, the p-value is approximately 0.011.
At , the p-value (0.011) is less than 0.05. We reject the null hypothesis of independence. There is statistically significant evidence that transport preference is related to age group. Looking at the data, the under-30 group favours bus and bicycle more than expected, while the 30-and-over group favours cars more than expected.
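The full test of independence can be checked the same way. This sketch builds the expected counts from the table margins, then uses the fact that for df = 2 the chi-square tail probability reduces to exp(-χ²/2):

```python
import math

observed = [[30, 35, 25],   # under 30
            [60, 30, 20]]   # 30 and over

grand_total = sum(sum(row) for row in observed)
row_totals = [sum(row) for row in observed]
col_totals = [sum(col) for col in zip(*observed)]

# Expected count for each cell: (row total * column total) / grand total.
expected = [[r * c / grand_total for c in col_totals] for r in row_totals]

chi2 = sum((o - e) ** 2 / e
           for o_row, e_row in zip(observed, expected)
           for o, e in zip(o_row, e_row))

# For df = 2, P(X > x) = exp(-x/2) exactly.
p_value = math.exp(-chi2 / 2)

print(round(chi2, 2), round(p_value, 3))  # ≈ 9.03 and ≈ 0.011
```

The exp(-x/2) shortcut only holds for df = 2; it falls out of the incomplete gamma function at that value.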
Worked Example 3: Genetic Ratios (Goodness of Fit)
A biology student crosses two heterozygous pea plants and observes the offspring phenotypes. According to Mendelian genetics, the expected ratio for a monohybrid cross is 3:1 (dominant to recessive). Out of 160 offspring:
| Phenotype | Observed (O) | Expected (E) |
|---|---|---|
| Round (dominant) | 115 | 160 × 3/4 = 120 |
| Wrinkled (recessive) | 45 | 160 × 1/4 = 40 |
Calculate Chi-Square

χ² = (115 - 120)²/120 + (45 - 40)²/40 = 25/120 + 25/40 ≈ 0.208 + 0.625 = 0.833
Degrees of Freedom and P-Value

df = k - 1 = 2 - 1 = 1

With χ² ≈ 0.833 and df = 1, the p-value is approximately 0.361.
The p-value is well above 0.05, so we fail to reject the null hypothesis. The observed ratio of 115:45 is consistent with the expected 3:1 Mendelian ratio. The deviation from the perfect 120:40 split is not statistically significant.
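This example can also be verified in a few lines. For df = 1 the chi-square tail probability reduces to erfc(√(x/2)), which is available in Python's standard library as math.erfc:

```python
import math

total = 160
observed = [115, 45]
expected = [total * 3 / 4, total * 1 / 4]   # 3:1 Mendelian ratio -> 120 and 40

chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# For df = 1, P(X > x) = erfc(sqrt(x/2)).
p_value = math.erfc(math.sqrt(chi2 / 2))

print(round(chi2, 3), round(p_value, 3))  # ≈ 0.833 and ≈ 0.361
```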
Try it yourself
Need to look up a p-value? Use our P-Value Calculator to convert any test statistic into a p-value instantly.
Assumptions and Conditions
The chi-square test requires several conditions to produce valid results.
- Random sampling. The data should come from a random sample or a randomised experiment.
- Independence of observations. Each observation must be independent. One person (or item) should appear in only one cell of the table.
- Sufficiently large expected counts. A common rule of thumb is that all expected frequencies should be at least 5. If any expected count is below 5, consider combining categories or using Fisher's exact test instead.
- Categorical data. The chi-square test is designed for counts of categorical outcomes, not continuous measurements. Do not apply it to means or raw measurements.
Reading a Chi-Square Distribution Table
Chi-square distribution tables show critical values for different combinations of degrees of freedom and significance levels. To use the table:
- Find the row matching your degrees of freedom.
- Find the column matching your significance level (commonly 0.05 or 0.01).
- If your calculated χ² exceeds the critical value, reject the null hypothesis.
For example, with df = 5 and α = 0.05, the critical value is 11.070. Our die example gave χ² = 4.7, which is below 11.070, so we failed to reject. Using a calculator or software to find the exact p-value is more precise than using tables.
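As a sanity check on the table itself, the tail area above a critical value should equal the significance level. A short sketch, using the closed-form chi-square tail probability for df = 5:

```python
import math

def chi2_sf_df5(x):
    """Tail probability P(X > x) for a chi-square variable with df = 5."""
    return (math.erfc(math.sqrt(x / 2))
            + math.exp(-x / 2) * math.sqrt(2 * x / math.pi) * (1 + x / 3))

critical = 11.070                         # table entry for df = 5, alpha = 0.05
print(round(chi2_sf_df5(critical), 3))    # ≈ 0.05, as the table promises
print(4.7 < critical)                     # die example: fail to reject
```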
Chi-Square vs Other Tests
- Chi-square vs t-test: The t-test compares means of continuous data. The chi-square test compares frequencies of categorical data. They answer fundamentally different questions.
- Chi-square vs Fisher's exact test: Fisher's exact test is used when sample sizes are small (expected counts below 5). It gives an exact p-value rather than an approximation.
- Chi-square vs G-test: The G-test (likelihood ratio test) is an alternative to the chi-square test that uses logarithms. For large samples, both give similar results.
Common Mistakes
- Using percentages instead of counts. The chi-square formula requires raw counts (frequencies), not percentages or proportions.
- Ignoring the expected count rule. If any expected frequency is below 5, the chi-square approximation may be unreliable.
- Confusing statistical and practical significance. A statistically significant result (low p-value) does not necessarily mean the difference is practically important, especially with large sample sizes.
- Using the wrong degrees of freedom. Goodness of fit uses df = k - 1, while independence uses df = (r - 1)(c - 1). Mixing these up gives the wrong p-value.
Frequently Asked Questions
What does a chi-square value of 0 mean?
A chi-square value of exactly 0 means the observed frequencies perfectly match the expected frequencies in every category. In practice this almost never happens with real data. A very small chi-square value simply means the observed data is very close to what was expected.
Can chi-square be negative?
No. Because the formula squares the differences (O - E), every component is non-negative, and the sum must be zero or positive. If your calculation produces a negative value, there is an arithmetic error.
What if my expected count is less than 5?
The chi-square test relies on an approximation that works well when expected counts are reasonably large. If any expected count falls below 5, you have several options: combine adjacent categories to increase expected counts, use Fisher's exact test (for 2x2 tables), or use a Monte Carlo simulation approach. Many statistical software packages can perform exact tests automatically.
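For very small 2x2 tables, Fisher's exact test can be computed directly from hypergeometric probabilities. The sketch below is illustrative, not a library API; the two-sided p-value sums every table with the same margins that is no more likely than the observed one, which is one common convention (software packages may define the two-sided test slightly differently):

```python
from math import comb

def fisher_exact_2x2(table):
    """Two-sided Fisher's exact p-value for [[a, b], [c, d]]: sum the
    hypergeometric probabilities of every table with the same margins
    that is no more likely than the observed one."""
    (a, b), (c, d) = table
    r1, r2, c1, n = a + b, c + d, a + c, a + b + c + d

    def prob(x):  # probability of the table whose top-left cell is x
        return comb(r1, x) * comb(r2, c1 - x) / comb(n, c1)

    p_obs = prob(a)
    lo, hi = max(0, c1 - r2), min(r1, c1)
    return sum(p for x in range(lo, hi + 1)
               if (p := prob(x)) <= p_obs * (1 + 1e-9))

# A perfectly sorted table: only the two most extreme arrangements qualify.
print(fisher_exact_2x2([[3, 0], [0, 3]]))  # 0.1
```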
Is the chi-square test one-tailed or two-tailed?
The chi-square test is always one-tailed in the sense that you only look at the right tail of the distribution. Large values of χ² indicate poor fit, and you never reject based on a chi-square value being too small. However, the test itself detects deviations in any direction (observed counts can be higher or lower than expected in any category).
How is the chi-square test related to the p-value?
The chi-square statistic is converted to a p-value using the chi-square distribution with the appropriate degrees of freedom. The p-value tells you the probability of observing a chi-square value at least as extreme as the one you calculated, assuming the null hypothesis is true. A smaller p-value means stronger evidence against the null hypothesis. You can compute this using our P-Value Calculator or our Chi-Square Calculator.
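For integer degrees of freedom, this conversion needs nothing but the standard library: the chi-square tail probability follows a textbook recurrence for the regularized upper incomplete gamma function. The helper below is our own sketch, not a library API:

```python
import math

def chi2_p_value(x, df):
    """P(X > x) for a chi-square variable with integer df, built from the
    recurrence Q(a+1, z) = Q(a, z) + z**a * exp(-z) / gamma(a+1), z = x/2."""
    z = x / 2
    if df % 2 == 0:
        sf, nu = math.exp(-z), 2               # exact tail for df = 2
    else:
        sf, nu = math.erfc(math.sqrt(z)), 1    # exact tail for df = 1
    while nu < df:
        sf += z ** (nu / 2) * math.exp(-z) / math.gamma(nu / 2 + 1)
        nu += 2
    return sf

# The three worked examples in this article:
print(round(chi2_p_value(4.7, 5), 3))    # die: ≈ 0.454
print(round(chi2_p_value(9.03, 2), 3))   # transport survey: ≈ 0.011
print(round(chi2_p_value(0.833, 1), 3))  # peas: ≈ 0.361
```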
Related Articles
Probability Distributions Explained: Normal, Binomial, and More
Understand the most important probability distributions: normal, binomial, Poisson, and uniform. Learn when to use each and how to calculate probabilities.
GCD and LCM Explained: Methods, Formulas, and Applications
Learn how to find the greatest common divisor and least common multiple using prime factorisation, the Euclidean algorithm, and their key relationship.
ANOVA Explained: One-Way Analysis of Variance Guide
Learn how to perform one-way ANOVA to compare means across multiple groups. Covers the F-statistic, SS/MS calculations, assumptions, and post-hoc tests.