I. Norm-Referenced Measurement
1. Definition - In norm-referenced testing an examinee's
performance is compared with the performance of a specific group.
A raw score alone tells us nothing. What does it mean to say that
someone got 17 out of 20 right? Norms allow us to compare a person's
score with the typical performance of a group, usually one representative
of the population of the country. Then we know how someone is
performing compared to other children their age.
2. Representativeness - This is the extent to which the norm group
is characteristic of the population. It should match the demographics
of the population as closely as possible. The variables usually used
are age, grade level, gender, geographic region, ethnicity, race, and SES.
3. Size - The norm group should be large enough to ensure stability
of test scores and inclusion of all groups that are represented
in the population. It should include at least 100 subjects at each age
and grade level.
4. Relevance - Which norm group are you comparing against?
National norms are usually used in IQ tests. For educational
achievement, local norms may be more appropriate. You may want to
compare examinees only against others of the same race or ethnic
background. It depends on what you want to use the results for.
II. Derived Scores
1. Age and Grade Equivalent Scores - These are arrived at
by determining the average performance of children at each age
or grade. You find the age or grade at which the average
child earned the examinee's score, and that is the examinee's age or
grade score. It does not mean that a child is functioning at
that level. Grade scores should never be used. They are too misleading.
2. Ratio IQ - This is the first way that IQs were calculated.
It is MA/CA x 100, where MA is mental age and CA is chronological
age. This generally is not used anymore, and if
you see a test that uses this procedure you should question its
utility.
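A small worked example of the ratio IQ formula may help. This is only a sketch: the function name and the choice to express both ages in months are assumptions made for illustration, not part of any published test.

```python
def ratio_iq(mental_age_months, chronological_age_months):
    """Ratio IQ = MA / CA x 100, with both ages in the same unit (months here)."""
    if chronological_age_months <= 0:
        raise ValueError("chronological age must be positive")
    return mental_age_months / chronological_age_months * 100


# Hypothetical child: 8 years old (96 months) who performs like
# the average 10-year-old (120 months).
print(ratio_iq(120, 96))  # 125.0
```

A child whose mental age equals his or her chronological age gets exactly 100.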
3. Percentile Ranks - These allow us to determine an individual's
position relative to the standardization sample. A percentile rank is
the percentage of individuals in the distribution who score at or below
a given score. They are easy for people to understand, especially parents.
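The "at or below" definition can be sketched in a few lines. The norm-group scores here are made up for illustration, and the function name is hypothetical; real tests read percentile ranks from published norm tables.

```python
def percentile_rank(score, norm_scores):
    """Percentage of norm-group scores that fall at or below `score`."""
    if not norm_scores:
        raise ValueError("norm group must not be empty")
    at_or_below = sum(1 for s in norm_scores if s <= score)
    return 100.0 * at_or_below / len(norm_scores)


# Made-up raw scores (out of 20) for a tiny norm group of ten children.
norm_group = [8, 10, 11, 12, 12, 13, 14, 15, 17, 19]
print(percentile_rank(17, norm_group))  # 90.0
```

So 17 out of 20 means something once we know that 90 percent of this (hypothetical) norm group scored 17 or lower.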
4. Standard Scores - The deviation IQ is a version of
a standard score. These are raw scores that have been transformed
to have a given mean and standard deviation. They express how
far an individual's score lies from the mean in terms of the
standard deviation.
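The transformation can be illustrated as follows. This is a sketch under stated assumptions: the norm-group mean and standard deviation are made up, and the target scale of mean 100 and SD 15 is the one used for Wechsler deviation IQs (other tests use other values).

```python
def z_score(raw, norm_mean, norm_sd):
    """Distance of a raw score from the norm-group mean, in SD units."""
    if norm_sd <= 0:
        raise ValueError("standard deviation must be positive")
    return (raw - norm_mean) / norm_sd


def standard_score(raw, norm_mean, norm_sd, new_mean=100.0, new_sd=15.0):
    """Re-express a raw score on a scale with the chosen mean and SD."""
    return new_mean + new_sd * z_score(raw, norm_mean, norm_sd)


# Hypothetical norms: children this age average 14 items correct, SD = 3.
print(standard_score(17, norm_mean=14, norm_sd=3))  # 115.0
```

A raw score one standard deviation above the mean becomes 115 on this scale, whatever the raw-score units were.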
III. Reliability
1. Definition - Reliability refers to the stability of
measurement. That is, the scores are reproducible and consistent. It is
expressed by a reliability (correlation) coefficient (range 0 to 1.0).
A test should not be trusted if its reliability is low. Generally
a reliability of .80 or higher is considered acceptable.
2. Test-Retest Reliability - An index of stability. You
administer the same test to the same people on two different
occasions, usually a short period of time apart (2 weeks to a month).
The correlation between the two sets of scores represents the extent
to which the score is stable over time.
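As a rough illustration, the test-retest coefficient is just the Pearson correlation between the two administrations. The scores below are invented, and the helper name is an assumption made for this sketch.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need two lists of the same length (at least 2 scores)")
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        raise ValueError("scores must vary for a correlation to be defined")
    return sxy / sqrt(sxx * syy)


# Invented scores for six people tested twice, three weeks apart.
first_testing = [95, 102, 110, 88, 120, 105]
second_testing = [97, 100, 112, 90, 118, 103]
r = pearson_r(first_testing, second_testing)
print(round(r, 2))
```

If the printed coefficient is .80 or higher, the test meets the rule of thumb given in the definition above.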
3. Internal Consistency Reliability - Commonly estimated by split-half
reliability. Divide a test into two equivalent halves (usually
odd versus even items) and then obtain the correlation between scores
on the two halves.
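The odd-even procedure might be sketched like this. The item responses are invented (1 = right, 0 = wrong). The Spearman-Brown step at the end is not mentioned in these notes; it is a standard correction, added here as an assumption, that estimates the reliability of the full-length test from the half-test correlation.

```python
from math import sqrt


def pearson_r(x, y):
    """Pearson correlation between two equal-length lists of scores."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / sqrt(sxx * syy)


def split_half_reliability(item_scores):
    """Return (half-test correlation, Spearman-Brown corrected estimate)."""
    # Items 1, 3, 5, ... form one half; items 2, 4, 6, ... form the other.
    odd_totals = [sum(person[0::2]) for person in item_scores]
    even_totals = [sum(person[1::2]) for person in item_scores]
    r_half = pearson_r(odd_totals, even_totals)
    r_full = 2 * r_half / (1 + r_half)  # Spearman-Brown correction
    return r_half, r_full


# Invented responses: six examinees, eight items each.
responses = [
    [1, 1, 1, 1, 1, 1, 1, 0],
    [1, 1, 1, 0, 1, 1, 0, 0],
    [1, 0, 1, 1, 0, 1, 0, 0],
    [1, 1, 0, 0, 1, 0, 0, 0],
    [1, 1, 1, 1, 1, 1, 1, 1],
    [0, 1, 0, 0, 0, 0, 0, 0],
]
r_half, r_full = split_half_reliability(responses)
print(round(r_half, 2), round(r_full, 2))
```

The corrected value is higher than the raw half-test correlation because each half is only half as long as the real test.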
IV. Validity
1. Definition - Generally it refers to whether the test measures
what it is supposed to measure. How meaningful is your result?
Tests can be used for different purposes and therefore the validity
of a test can vary depending on its use. IQ tests are a valid
predictor of a child's success in school. They are not a valid predictor
of your ability to be a janitor. Tests should be used only for
those purposes for which their validity has been established. A
test must be reliable to be valid but it can be reliable without
being valid. Reliability is necessary but not sufficient.
2. Content Validity - Do the items on the test cover the
domain that the test purports to measure? If you are measuring
intelligence, does the test cover all aspects of intelligence? It
should not be confused with Face Validity, which
refers to what the test appears to measure. Having face validity does
not necessarily mean a test has content validity.
3. Criterion-Related Validity - Refers to the relationship
between the test score and some type of criterion or outcome such
as a rating or other test score. Two types: (a) Concurrent
Validity - How it relates to some other currently available
measure. If you came up with a new IQ test you would see how it relates
to the standard in the field, the WISC-R. (b) Predictive Validity
- Refers to the correlation between the test score and performance on
a relevant criterion after a period of time has passed. Is the
test a predictor of future performance? Is IQ a predictor of
future school performance?
4. Construct Validity - How does the test relate to the
theoretical constructs that the test purports to measure? For
example, does an IQ test measure the construct of intelligence
the way the author designed it? This is usually examined through
factor analysis. For the WISC-R, do the subtests break down into
verbal and performance factors?
V. Definitions of Intelligence
1. Binet (1916) - "...judgement, otherwise called
good sense, practical sense, initiative, the faculty of adapting
one's self to circumstances. To judge well, to comprehend well,
to reason well, these are the essential activities of intelligence."
2. Terman (1923) - The ability to carry on abstract thinking.
3. Stoddard (1943) - "...the ability to undertake
activities that are characterized by (1) difficulty, (2)
complexity, (3) abstractness, (4) economy, (5)
adaptedness to goal, (6) social value, and (7)
the emergence of originals, and to maintain such activities under
conditions that demand a concentration of energy and a resistance
to emotional forces."
4. Freeman (1955) - "...adjustment or adaptation
of the individual to his total environment, or limited aspects
thereof....the capacity to reorganize one's behavior patterns
so as to act more effectively and more appropriately in novel
situations...the ability to learn...the extent to which a person
is educable...the ability to carry on abstract thinking...the
effective use of concepts and symbols in dealing with a problem
to be solved..."
5. Wechsler (1958) - "The aggregate or global capacity
of the individual to act purposefully, to think rationally, and
to deal effectively with his environment."
6. Das (1973) - "...the ability to plan and structure
one's behavior with an end in view."
7. Humphreys (1979) - "...the resultant of the process
of acquiring, storing in memory, retrieving, combining, comparing,
and using in new contexts information and conceptual skills."
8. Conclusion - I don't care if you know all the different
theories I just gave or the ones given in the book. Just remember
that there are two major divisions in theories: those that view
intelligence as a unitary construct and those that see it as
having many components.
VI. History of Intelligence Testing
1. Roots - Interest in intelligence and intelligence testing
began in earnest during the latter part of the 19th century as
psychology began to emerge as a discipline of its own.
2. Galton (1869) - Some regard him as the father of the
testing movement. He was concerned with the study of individual
differences and began studying mental inheritance.
3. Binet (1905) - With Simon developed the Binet-Simon
Scale, the first test of intelligence to have valuable practical
application.
4. Terman (1916) - Published the Stanford Revision and
Extension of the Binet-Simon Intelligence Scale.
5. Wechsler - Published the Wechsler-Bellevue Intelligence
Scale in 1939. Since that time the Wechsler scales (WAIS, WISC,
WPPSI) have become the most frequently used tests.
VII. Intelligence Classifications
1. Profound Retardation - IQ < 20
2. Severe Retardation - IQ = 20-39
3. Moderate Retardation - IQ = 40-54
4. Mild Retardation - IQ = 55-69
5. Borderline Retardation - IQ = 70-79
6. Dull Normal - IQ = 80-89
7. Average - IQ = 90-109
8. Bright Normal - IQ = 110-119
9. Superior - IQ = 120-129
10. Very Superior - IQ = 130+
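The ranges above amount to a simple lookup. The sketch below just encodes the table from these notes; the function and variable names are made up for illustration.

```python
# (lower bound of range, label), from highest to lowest, as listed above.
CLASSIFICATIONS = [
    (130, "Very Superior"),
    (120, "Superior"),
    (110, "Bright Normal"),
    (90, "Average"),
    (80, "Dull Normal"),
    (70, "Borderline Retardation"),
    (55, "Mild Retardation"),
    (40, "Moderate Retardation"),
    (20, "Severe Retardation"),
]


def classify_iq(iq):
    """Return the classification label for an IQ score."""
    for lower_bound, label in CLASSIFICATIONS:
        if iq >= lower_bound:
            return label
    return "Profound Retardation"  # anything below 20


print(classify_iq(100))  # Average
print(classify_iq(69))   # Mild Retardation
```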
V. Intelligence Tests Used Today
1. WISC-III - Appropriate for children ages 6 through 17.
It is the most commonly used test today. Provides you with a
Verbal IQ (measure of verbal comprehension), Performance IQ (perceptual
organizational skills), and Full Scale IQ.
2. WAIS-R - Appropriate for individuals ages 16 through
74. Most commonly used adult test. Provides you with the same scores
as the WISC-III.
3. WPPSI-R - Appropriate for children 3 to 7 years of
age. Also gives VIQ, PIQ, and FSIQ.
4. Stanford-Binet - Revision just came out a little more
than a year ago. Appropriate for ages 2 to adult. Provides measures
of verbal reasoning, abstract/visual reasoning, quantitative
reasoning, short-term memory, and a composite IQ.
VIII. Stability of IQ
1. Infant Intelligence Tests - Tests of IQ during the first
year of life do not accurately predict IQ performance later in
childhood although they are useful in identifying motor problems
or extreme IQ deficiencies. These tests are made up primarily
of sensorimotor tasks.
2. Later Testing - From about age 8 on the tests become
very good predictors of adult intelligence although there is
still a lot of variability.
3. Prediction - IQ tests were generally developed to predict
how well someone will do in school. They do this very well.
IX. The Genetic Hypothesis
1. Controversial - This is a very controversial hypothesis, which
holds that differences in intelligence between groups (black and white)
are inherited.
2. Arthur Jensen - The strongest proponent of this position.
He argued that lower-class and black children do not perform as well
as middle-class and white children in abstract reasoning and
problem solving (higher level intellectual abilities).
3. Study - He was initially not looking to test a genetic
hypothesis but rather he was looking for a cumulative deficit.
This would be an IQ decrement that grows with age as a result
of environmental deprivation.
4. California Study - First attempt to find this. Looked
at children 5-10 years. Found age decrement for black children
on verbal IQ. Also found that overall, blacks scored about 1
standard deviation below whites.
5. Georgia Study - To further test the hypothesis of a
cumulative deficit he looked for the worst environment he
could find. He found it in rural Georgia. Tested children in grades K-12
in an integrated school (1/2 white, 1/2 black). Blacks were very
low SES, whites were low to low middle SES. Findings:
(a) IQ: White = 102, Black = 71. (b) A significant
age decrement in blacks but not whites in both verbal and performance
IQ. He concluded that the difference was due to heredity.
6. Results and Types of Intelligence - Level I
intelligence is rote memory, mostly short-term memory tasks like
digit span. Level II intelligence involves more complex
cognitive processing and problem solving skills. Jensen found
the Black-White difference on Level II only. He also found that Blacks
did the worst on the least culturally loaded tests and the best
on vocabulary and information, which are the most culturally loaded.
Other studies have not confirmed this, however.
7. Criticisms - If the hypothesis were true, you would expect mixed-race
persons to score lower than whites. When they were compared to white
children of similar backgrounds, however, there was no difference in IQ.
In general there is no evidence to confirm the hypothesis or to refute
it. That is why the debate continues.
X. Heredity - Environment Issue
1. Studies in Families - (a) In general, the more
the genetic similarity, the higher the correlation in IQ. (b)
The evidence does, however, support the role of both heredity and environment.
2. Study With Adopted Children - Looked at scores of children
adopted before 6 months into middle SES homes. Group I:
Mother had IQ<75. Group II: Biological father's occupation
was at a very low level. Group III: Biological parents met both
I and II. Results: Tested children at 8 years. I, IQ=105;
II, IQ=110; III, IQ=104.
3. Conclusion - Heredity may set limits but the rest is
determined by how much stimulation you get in your environment.