Psychology 320: Psychological Statistics

Professor: Howard B. Lee

Lecture Notes

Week 14: Chapter 9

Comparing Frequencies Using Chi Square

With chi square, we are looking at frequency counts, not scores as we did in previous chapters.

Chi Square
The statistic that measures the discrepancy between the observed values and the expected values in a contingency table.

Ex. The following require frequency data:

Demographic Data:
Ethnic groups (%) in LA County, 1990 vs. 2000

Toss coin 100 times:
The frequencies that you have observed as a result of the 100 tosses:
H/57, T/43

These data are empirical: they were gathered from an experiment (like the kind of data you get in science when you perform an experiment and then analyze the results to draw a conclusion about what you observed).

In contrast to observed frequencies, expected or theoretical frequencies are what you think you will obtain when you conduct the experiment. They are the numbers we would observe on average if the null hypothesis is true.

ho: The coin is fair.
Ph = .5
Ph = Pt
The proportion of heads = the proportion of tails

h1: The coin is not fair.
Ph not = .5
Ph not = Pt

The theoretical or expected frequencies if the coin is in fact fair:
H/50, T/50
These expected frequencies are based on 100 tosses.

If the observed frequencies are not far away from the expected frequencies we can say that we have insufficient evidence of an unfair coin, and may need more information.

The test statistic for this problem:

  1. Take the observed frequency for each category and subtract the expected frequency.
  2. Square this difference.
  3. Divide by the expected frequency.
  4. Add all the numbers up.
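
The four steps above can be sketched in a few lines of Python. This is only an illustration, not part of the textbook; the function name `chi_square_statistic` is made up for this example, and the data are the coin-toss counts from these notes.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected over all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    total = 0.0
    for o, e in zip(observed, expected):
        difference = o - e          # step 1: observed minus expected
        squared = difference ** 2   # step 2: square the difference
        total += squared / e        # steps 3 and 4: divide by expected, add up
    return total


# Coin example: 57 heads and 43 tails observed, 50 and 50 expected.
coin_stat = chi_square_statistic([57, 43], [50, 50])
print(round(coin_stat, 2))  # 1.96
```

Because every difference is squared and every expected frequency is positive, the sum can never come out negative.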

Decision Rule:
alpha = .05
df = # of categories - 1 = "K - 1", where K is equal to the number of categories.
Use table D on page 406.
Similar to the F test, chi square is treated as a one-tailed (right-tailed) test.
The chi square value is NEVER negative.
For df = 1 and alpha = .05, the critical value is 3.84. So the decision rule is to reject ho if the Chi-Square test statistic is greater than 3.84, otherwise do not reject ho.
Decision: Since 1.96 is less than 3.84, do not reject ho.
Conclusion: There is insufficient evidence of an unfair coin.
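
As a rough check on the table values, the right-tail area beyond a chi square value has a simple closed form when df = 1 or df = 2, so the critical values 3.84 and 5.99 can be confirmed without the table. The sketch below uses only Python's standard `math` module; the function names are hypothetical, and it deliberately does not handle any other df.

```python
import math


def right_tail_probability(chi_square, df):
    """Area to the right of chi_square under the chi square curve (df = 1 or 2 only)."""
    if chi_square < 0:
        raise ValueError("chi square is never negative")
    if df == 1:
        return math.erfc(math.sqrt(chi_square / 2))
    if df == 2:
        return math.exp(-chi_square / 2)
    raise ValueError("this sketch only covers df = 1 and df = 2")


def decide(chi_square, critical_value):
    """Right-tailed decision rule: reject only when the statistic exceeds the critical value."""
    return "reject ho" if chi_square > critical_value else "do not reject ho"


print(round(right_tail_probability(3.84, 1), 3))  # close to alpha = .05
print(round(right_tail_probability(5.99, 2), 3))  # close to alpha = .05
print(decide(1.96, 3.84))                         # do not reject ho
```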

This use of chi square is called a "Goodness of Fit" test because we want to see whether the observed frequencies fit the theoretical frequencies.

Ex. Given 200 consumers:
Category   Theoretical frequencies (percentages)   Observed frequencies
A          38%                                     80
B          27%                                     50
C          35%                                     70
Total      100%                                    200

Question: Does the observed data fit the theoretical data?

Percentages must be converted to frequencies by multiplying the percentage by the total number of consumers.

Expected (theoretical) frequencies:
A = 38% of 200 = 76
B = 27% of 200 = 54
C = 35% of 200 = 70
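
The conversion from percentages to expected frequencies can be sketched as follows. The function name `expected_frequencies` is invented for this illustration, and the percentages are entered as whole numbers (38 rather than .38) to keep the arithmetic exact.

```python
def expected_frequencies(percentages, total):
    """Convert whole-number percentages into expected counts for `total` cases."""
    if sum(percentages.values()) != 100:
        raise ValueError("percentages must add up to 100")
    return {name: pct * total / 100 for name, pct in percentages.items()}


# Consumer example: 200 consumers, theoretical split of 38%, 27%, 35%.
expected = expected_frequencies({"A": 38, "B": 27, "C": 35}, total=200)
print(expected)  # {'A': 76.0, 'B': 54.0, 'C': 70.0}
```

The expected frequencies always add up to the same total as the observed frequencies (here 200).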

ho: Proportion of A = .38
ho: Proportion of B = .27
ho: Proportion of C = .35

h1: At least one of the proportions differs from its hypothesized value (A not = .38, B not = .27, or C not = .35).

Chi square value:
(80 - 76)^2/76 + (50 - 54)^2/54 + (70 - 70)^2/70 = 0.2105 + 0.2963 + 0 = 0.5068

Decision rule:
alpha = .05
df = K - 1 = 3 - 1 = 2 (there are 3 categories)
Using table D, find the critical value = 5.99
Reject ho if the chi square test statistic > 5.99, otherwise do not reject ho.

Decision: Since 0.5068 < 5.99, do not reject ho.

Conclusion: There is insufficient evidence of a lack of fit; there is not enough evidence to refute the researcher's claimed proportions.

Contingency Tables with Chi Square

With the "Pearson Product Moment Correlation," x and y are scores, not frequencies or categories.
With variables that are categorical, we need to use chi square to determine if they are related.

A t-test is used to determine if r is statistically significant; it is used with scored data.

Ex. Is gender related to political affiliation?

Gender   Political
Male     D
Female   R

These are categorical data.
You must use chi square for a Contingency Table.

Data Table:
People   Gender   Political
1        M        D
2        F        D
3        F        D
4        M        R
5        M        R
6        M        D
7        F        D
8        F        R
9        F        D
10       F        R

4 fold Contingency Table
                Political
Gender     Demo    Repub   Total
Male       2       2       4
Female     4       2       6
Total      6       4       10
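
Tallying the ten people into the four cells can be sketched with Python's standard `collections.Counter`. The variable names below are made up for this illustration; the records are the ten rows of the data table in these notes.

```python
from collections import Counter

# (gender, political affiliation) for people 1 through 10.
records = [
    ("M", "D"), ("F", "D"), ("F", "D"), ("M", "R"), ("M", "R"),
    ("M", "D"), ("F", "D"), ("F", "R"), ("F", "D"), ("F", "R"),
]

counts = Counter(records)  # how many people fall in each (gender, party) cell

genders = ["M", "F"]
parties = ["D", "R"]
table = [[counts[(g, p)] for p in parties] for g in genders]

row_totals = [sum(row) for row in table]        # totals for Male, Female
col_totals = [sum(col) for col in zip(*table)]  # totals for Demo, Repub

print(table)       # [[2, 2], [4, 2]]
print(row_totals)  # [4, 6]
print(col_totals)  # [6, 4]
```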

Expected frequencies in a contingency table are computed from the observed frequency data: for each cell, multiply its column total by its row total and divide by N.

OBSERVED VALUES
                Political
Gender     Demo      Repub     Total
Male       2         2         R1 = 4
Female     4         2         R2 = 6
Total      C1 = 6    C2 = 4    N = 10

EXPECTED VALUES
                Political
Gender     Demo               Repub              Total
Male       C1xR1/N = 2.4      C2xR1/N = 1.6      R1 = 4
Female     C1xR2/N = 3.6      C2xR2/N = 2.4      R2 = 6
Total      C1 = 6             C2 = 4             N = 10
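
The expected values in the table can be reproduced with a short sketch that applies "column total x row total / N" to every cell. The function name `expected_table` is hypothetical; the input is the observed gender-by-political table from these notes.

```python
def expected_table(observed):
    """Expected count for each cell: column total x row total / N."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    return [[c * r / n for c in col_totals] for r in row_totals]


# Rows: Male, Female. Columns: Demo, Repub.
gender_expected = expected_table([[2, 2], [4, 2]])
for row in gender_expected:
    print(row)
```

Each row and column of the expected table adds up to the same total as in the observed table.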

In the expected frequency table the cells represent counts one would expect if the two categorical variables are totally unrelated.

The chi square test asks whether the observed frequencies fit the expected frequencies. If they do, we have no evidence that the variables are related; the data are consistent with the variables being independent of one another.
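
For a contingency table the statistic is built the same way as in the goodness-of-fit case, except that the sum runs over every cell. A minimal sketch, using the observed and expected gender-by-political tables from these notes (the function name is invented for the example):

```python
def chi_square_from_tables(observed, expected):
    """Sum (observed - expected)^2 / expected over every cell of the table."""
    total = 0.0
    for observed_row, expected_row in zip(observed, expected):
        for o, e in zip(observed_row, expected_row):
            total += (o - e) ** 2 / e
    return total


observed = [[2, 2], [4, 2]]          # Male, Female by Demo, Repub
expected = [[2.4, 1.6], [3.6, 2.4]]  # from C x R / N
gender_stat = chi_square_from_tables(observed, expected)
print(round(gender_stat, 4))  # 0.2778
```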

Decision rule at alpha = .05.
df = (# rows - 1)(# columns - 1) = 1
Use table D for chi square critical value = 3.84
Reject ho if the chi square test statistic > 3.84, otherwise do not reject ho.

Decision: Since 0.2778 < 3.84, do not reject ho.
Conclusion: There is insufficient evidence that gender and political affiliation are related.


Are therapy and improvement related?

ho: Therapy and improvement are not related (independent).
h1: Therapy and improvement are related (dependent).

Observed data:

OBSERVED VALUES
                Improvement
Type       YES         NO         Total
Therapy    75          25         R1 = 100
Placebo    58          42         R2 = 100
Total      C1 = 133    C2 = 67    N = 200

If you are interested in a relationship, use a contingency table.

Expected data:

EXPECTED VALUES
                Improvement
Type       YES                 NO                  Total
Therapy    C1xR1/N = 66.5      C2xR1/N = 33.5      R1 = 100
Placebo    C1xR2/N = 66.5      C2xR2/N = 33.5      R2 = 100
Total      C1 = 133            C2 = 67             N = 200
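
The whole computation for this example, from observed counts to the chi square statistic and its df, can be sketched in one function. The name `chi_square_independence` is made up for this illustration; the counts are the therapy and placebo data above.

```python
def chi_square_independence(observed):
    """Return (chi square statistic, df) for a table of observed counts."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    n = sum(row_totals)
    statistic = 0.0
    for i, row in enumerate(observed):
        for j, o in enumerate(row):
            e = col_totals[j] * row_totals[i] / n  # expected = C x R / N
            statistic += (o - e) ** 2 / e
    df = (len(row_totals) - 1) * (len(col_totals) - 1)
    return statistic, df


# Rows: Therapy, Placebo. Columns: improved YES, NO.
therapy_stat, therapy_df = chi_square_independence([[75, 25], [58, 42]])
print(round(therapy_stat, 2), therapy_df)  # 6.49 1
```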

Decision rule:
alpha = .05
df = (# rows - 1)(# columns - 1) = 1
Table D, critical value = 3.84
Reject ho if the chi square statistic is greater than 3.84, otherwise do not reject ho.
Decision: Reject ho since 6.49 > 3.84
Conclusion: There is evidence that therapy is related to improvement.

What is the degree of this relationship?

Use Cramer's V

0 <= V <= 1
What is the magnitude of the relationship between therapy and improvement? V = .18
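
The value V = .18 can be reproduced from the chi square statistic. The sketch below assumes the standard definition of Cramer's V, the square root of chi square divided by N times the smaller of (# rows - 1) and (# columns - 1); the formula itself is not written out in these notes, and the function name is hypothetical.

```python
import math


def cramers_v(chi_square, n, n_rows, n_cols):
    """Cramer's V = sqrt(chi_square / (N x smaller of (rows - 1), (columns - 1)))."""
    smaller = min(n_rows - 1, n_cols - 1)
    return math.sqrt(chi_square / (n * smaller))


# Therapy example: chi square = 6.49, N = 200, 2 rows by 2 columns.
v = cramers_v(6.49, 200, 2, 2)
print(round(v, 2))  # 0.18
```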

This value is significantly different from zero, since we rejected ho in the hypothesis test.

