Animal Behavior Laboratory Manual

CHI-SQUARED TEST

The two tests described elsewhere, the Sign Test and the Wilcoxon Test, both focus on comparisons between paired observations. In both tests, the null hypothesis states that chance alone might reasonably explain the differences observed.

In some cases, you might want to compare your observations with a preconceived hypothesis, rather than to compare paired observations.   In our very first example, we considered whether a coin was fair or not. In this case, we have a preconceived hypothesis about how a fair coin should behave.   We of course expect a fair coin to land heads up 50% of the time.   This hypothesis is based on pure theory, so it is truly preconceived or a priori (Latin for "in advance").   Of course, we realize that we are not likely to get exactly 50% heads in a sample of tosses.   What we would like to know is whether or not it is reasonable to accept the null hypothesis that a coin is fair as opposed to the alternative hypothesis that something is influencing its behavior.

The Chi-squared Test is the one to use in this situation.   "Chi" is the third from last letter of the Greek alphabet, the origin of our letter X.   The test computes a number conventionally represented by X2.

Many students have encountered this test in other courses and know the formula for calculating X2:

X2 = (Observed - Expected)2 / Expected
[summed over all k categories of outcome]
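
For students who like to check their arithmetic with a computer, here is a minimal sketch of the calculation in Python (the function name chi_squared is our own label, not part of any standard library):

    def chi_squared(observed, expected):
        """Return the X2 statistic for matched lists of observed and expected counts."""
        if len(observed) != len(expected):
            raise ValueError("observed and expected must have the same length")
        # Sum (Observed - Expected)^2 / Expected over all k categories.
        return sum((o - e) ** 2 / e for o, e in zip(observed, expected))

    # Example: 9 heads and 6 tails in 15 tosses of a supposedly fair coin.
    print(chi_squared([9, 6], [7.5, 7.5]))   # prints 0.6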

This formula compares k observed values with their corresponding expected values.   There are several important points to notice.  

  1. The observed values are counts of independent events; so each observed value is always a whole number.   If we were testing whether or not a coin was fair, we would count the number of heads and the number of tails in a series of tosses.   In this case, k = 2, for the two possible outcomes, heads or tails.

  2. An independent event is an observation that is not influenced by the result of any other observation.   In our example of the coin, each toss is an independent event, if the outcome of any one toss does not influence the outcome of any other.   Simply turning a coin over does not fulfill this criterion, because the result would depend on the immediately preceding result.

  3. The expected values do not have to be whole numbers, because they are derived from theory. For instance, the null hypothesis that a coin is fair leads to an expected value of 7.5 heads on average in 15 tosses.

To interpret the value of X2 calculated from a series of observed and expected values, we need to know how many degrees of freedom the expected values represent.   This concept is easier to understand than it seems.   In the case of our coin, notice that we have an expected value of N/2 heads for N independent tosses.   We also have an expected value of N/2 tails.   If you stop to think about it, we really had only one free choice in calculating these two expected values, because as soon as we had calculated the number of heads, the number of tails was a foregone conclusion (it was simply all the results that were not heads).   We thus had only one degree of freedom in calculating our two expected values (df = 1).

The example of the coin illustrates a simple but very useful application of the Chi-squared Test.   It compares the observed counts of outcomes (the number of outcomes falling in each of several mutually exclusive categories) to the counts expected under the null hypothesis.   In such cases, df = k - 1, where k = the number of possible outcomes.   In the test of a fair coin, k = 2 for the two possible outcomes of a toss (heads or tails), so df = 2 - 1 = 1, as we just saw.
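
The same recipe applies when there are more than two categories. Here is a small sketch, using the chi_squared function above and three hypothetical, equally likely outcomes (the counts are invented purely for illustration):

    def expected_counts(proportions, n):
        """Expected count in each category: hypothesized proportion times total N."""
        return [p * n for p in proportions]

    observed = [10, 14, 6]                                       # invented counts for 3 categories
    expected = expected_counts([1/3, 1/3, 1/3], sum(observed))   # [10.0, 10.0, 10.0]
    df = len(observed) - 1                                       # df = k - 1 = 2
    x2 = chi_squared(observed, expected)                         # 3.2
    # 3.2 is below 5.99, the critical value for df = 2 in the table at the end,
    # so chance alone could reasonably explain these invented counts.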

Suppose red and green jellybeans are scattered in a grassy plot, and a student finds 15 red ones and 1 green one.   Could chance alone explain this result?   The answer depends on the proportions of reds and greens in the jellybeans scattered in the grass.   Suppose they included twice as many reds as greens.   Theory then leads us to expect that by chance the student should find twice as many reds as greens -- on the assumption (our null hypothesis) that color does not influence the chance of finding a jellybean.   In a total of 16 jellybeans, we would then expect to find 10.7 reds and 5.3 greens on average.   If the student actually found 15 reds and 1 green, we could use the above formula to calculate a value for X2 as follows:

X2 = (15 - 10.7)2/10.7 + (1 - 5.3)2/5.3 = 1.73 + 3.49 = 5.22.

In this case, k = 2 because there are two outcomes (red or green), and thus df = 1.
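
If the SciPy library is available, the same statistic and the corresponding probability can be obtained in a single call; this is just a cross-check of the arithmetic above, using the rounded expected values from the manual:

    from scipy.stats import chisquare

    # Jellybeans: 15 reds and 1 green observed; 10.7 reds and 5.3 greens expected.
    # (The unrounded expectations, 32/3 and 16/3, would give an X2 of about 5.28.)
    result = chisquare(f_obs=[15, 1], f_exp=[10.7, 5.3])
    print(result.statistic)   # about 5.22
    print(result.pvalue)      # about 0.022, i.e. less than 0.05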

As when we used the Sign Test and the Wilcoxon Test, we must next look up our calculated value of X2 in a table of critical values.   For each number of degrees of freedom, the table gives the value of X2 that chance alone would exceed only 5% of the time if the null hypothesis were true.   So any calculated value of X2 greater than the critical value means that chance alone would produce so large a discrepancy less than 5% of the time.   As before, scientists usually agree that a probability of 5% or less is low enough to reject the null hypothesis -- at least for the time being.

In the example with the jellybeans, the calculation above shows that X2 = 5.22.   This value is greater than 3.84, the critical value for df = 1.   So we could report,

"The subject found red jellybeans significantly more often and green ones less often than expected by chance (Chi-squared Test, X2 = 5.22, df = 1, N = 16)."

CRITICAL VALUES FOR X2 IN THE CHI-SQUARED TEST

df    Critical value of X2 (P = 0.05)
 1     3.84
 2     5.99
 3     7.82
 4     9.49
 5    11.07
 6    12.59
 7    14.07
 8    15.51
 9    16.92
10    18.31
11    19.68
12    21.03
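
For the curious, the entire table can be regenerated from the chi-squared distribution (again assuming SciPy); the printed values agree with those above, apart from rounding in the last digit:

    from scipy.stats import chi2

    # Print the 5% critical value for each number of degrees of freedom from 1 to 12.
    for df in range(1, 13):
        print(df, round(chi2.ppf(0.95, df), 2))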