Minnesota Interdisciplinary Training in Education Research (MITER) Program
University Technology Center, 1313 SE Fifth St., Suite 118, Minneapolis, MN 55414
Presented by William Shadish, University of California, Merced, on February 22, 2008
VIEW THE PRESENTATIONS
Causal Conclusions from Quasi-Experimental Data?
Duration: 1 hour 7 minutes and 39 seconds
Abstract
Early social experiments in the 1960s encountered significant
technical and logistical problems, leading some researchers to
prefer other methodologies. During the last 10 years, however,
experiments have re-emerged as a more widely used methodology.
This talk will review the events that prompted this renaissance,
and then examine progress in the use of several different kinds
of designs: the randomized experiment, the regression
discontinuity design, and the simple nonequivalent comparison
group design with a pretest. For the two quasi-experimental designs,
empirical studies now suggest that they can provide estimates of
effects that are as good as those from randomized
experiments, although we still have much to learn about the
conditions under which this optimistic conclusion might hold.
Can Nonrandomized Experiments Yield Accurate Answers? A
Randomized Experiment Comparing Random to Nonrandom Assignment
MITER Brown Bag Talk given on February 22, 2008, by William
Shadish
Duration: 1 hour 19 minutes and 48 seconds
Abstract
This talk presents final analyses from a study with M.H.
Clark (Southern Illinois University, Carbondale) and Peter M.
Steiner (Institute for Advanced Studies, Vienna, Austria) in
which participants were randomly assigned to a randomized or a
nonrandomized experiment. In the randomized experiment,
participants were randomly assigned to mathematics or vocabulary
training; in the nonrandomized experiment, they chose their
training. The study held all other features of the experiment
constant; it carefully measured pretest variables that might
predict the condition that participants chose; and all
participants were measured on vocabulary and mathematics
outcomes. The analyses used covariates to create propensity
scores, used Rubin’s (2001) diagnostics for balance over
covariates after propensity score adjustment, used
covariate-adjusted randomized results as the benchmark, and
compared propensity score adjustments to ordinary linear
regression adjustments. Ordinary linear regression reduced bias
in the nonrandomized experiment by 84–94% using
covariate-adjusted randomized results as the benchmark.
Propensity score stratification, weighting, and covariance
adjustment reduced bias by about 58–96%, depending on the
outcome measure and adjustment method. Propensity score
adjustment performed poorly when the scores were constructed
from predictors of convenience (sex, age, marital status and
ethnicity) rather than from a broader set of predictors that
might include these. We present some results clarifying the
circumstances under which propensity scores might work better or
worse, and conclude with implications for practice.
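To make the terms in the abstract concrete, here is a minimal Python sketch of two ideas it mentions: propensity score stratification and percent bias reduction against a benchmark. It is an illustration under stated assumptions, not the analysis used in the study. The data are hypothetical, the propensity scores are assumed to have been estimated already, the function names are invented for this example, and the bias reduction formula (one minus the ratio of remaining to initial absolute bias) is one common definition that the abstract does not spell out.

```python
from statistics import mean

# Hypothetical data: (propensity score, treated flag, outcome).
# The scores are assumed to come from an earlier model fit, not shown here.
units = [
    (0.2, 0, 10.0), (0.2, 0, 12.0), (0.3, 1, 14.0), (0.3, 0, 11.0),
    (0.7, 1, 20.0), (0.8, 1, 22.0), (0.7, 0, 18.0), (0.8, 1, 21.0),
]


def naive_effect(units):
    """Unadjusted difference in mean outcome, treated minus control."""
    treated = [y for _, t, y in units if t == 1]
    control = [y for _, t, y in units if t == 0]
    return mean(treated) - mean(control)


def stratified_effect(units, n_strata):
    """Propensity score stratification: sort by score, cut into equal-size
    strata, take the treated-minus-control difference within each stratum,
    and average the differences weighted by stratum size."""
    ordered = sorted(units, key=lambda u: u[0])
    n = len(ordered)
    weighted_sum = 0.0
    used = 0
    for i in range(n_strata):
        stratum = ordered[i * n // n_strata:(i + 1) * n // n_strata]
        treated = [y for _, t, y in stratum if t == 1]
        control = [y for _, t, y in stratum if t == 0]
        if not treated or not control:
            continue  # a stratum without both groups gives no comparison
        weighted_sum += len(stratum) * (mean(treated) - mean(control))
        used += len(stratum)
    if used == 0:
        raise ValueError("no stratum contains both treated and control units")
    return weighted_sum / used


def percent_bias_reduction(benchmark, unadjusted, adjusted):
    """Percent of the initial absolute bias removed by the adjustment
    (assumed definition: 100 * (1 - |remaining bias| / |initial bias|))."""
    initial_bias = unadjusted - benchmark
    if initial_bias == 0:
        raise ValueError("unadjusted estimate equals the benchmark")
    remaining_bias = adjusted - benchmark
    return 100.0 * (1.0 - abs(remaining_bias) / abs(initial_bias))


benchmark = 2.5  # hypothetical estimate from the randomized arm
unadjusted = naive_effect(units)        # 6.5
adjusted = stratified_effect(units, 2)  # 3.0
print(percent_bias_reduction(benchmark, unadjusted, adjusted))  # 87.5
```

In this toy example the units with high propensity scores also have higher outcomes, so the unadjusted comparison overstates the effect. Comparing treated and control units only within strata of similar scores removes most of that gap.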
Dr. Shadish is the author of Experimental and Quasi-Experimental
Designs for Generalized Causal Inference (with T.D. Cook & D.T.
Campbell, 2002), Foundations of Program Evaluation (with T.D.
Cook & L.C. Leviton, 1991), and ES: A Computer Program and Manual
for Effect Size Calculation (with L. Robinson & C. Lu, 1997). He
is also co-editor of five other volumes, the author of over 100
articles and chapters, and a former President of the American
Evaluation Association.