Resampling: a marriage of computers and statistics. Rudner, Lawrence M. & Shafer, Mary Morello

A peer-reviewed electronic journal. ISSN 1531-7714

Permission is granted to distribute this article for nonprofit, educational purposes if it is copied in its entirety and the journal is credited. Please notify the editor if an article is to be used in a newsletter.

Rudner, Lawrence M. & Shafer, Mary Morello (1992). Resampling: a marriage of computers and statistics. Practical Assessment, Research & Evaluation, 3(5). Retrieved August 18, 2006 from http://edresearch.org/pare/getvn.asp?v=3&n=5 .

Resampling: A Marriage of Computers and Statistics

Lawrence M. Rudner
Clearinghouse on Assessment and Evaluation

Mary Morello Shafer
American Institutes for Research

Suppose your superintendent asked you to determine whether voucher students are doing better than non-voucher students in your district's elementary schools. You might perform a simple t test or an analysis of variance to find your answer. You would report mean differences and probability levels. And if your superintendent is like most non-statisticians, he or she would accept the magic of statistics without questioning the validity of the assumptions made to use the t test.

Thanks to advances in computer technology, educational researchers are beginning to use simpler statistical methods. These techniques let us empirically address a wider range of questions with smaller data sets and with fewer, less restrictive assumptions. Using such techniques, we can focus on reasoning and on understanding the data, not on complicated formulas and tables. The techniques promise to make statistics a useful, easily learned tool for educational policy makers and researchers.

This article introduces computationally intensive statistics, collectively called resampling techniques. After defining these statistics, we'll use one technique to answer our opening question. We'll then present the arguments for and against resampling.

RESAMPLING DEFINED

Resampling is simply a process for estimating probabilities by conducting vast numbers of numerical experiments. Today, resampling is done with the aid of high-speed computers.

In Science News, Peterson (1991) compares resampling techniques to the trial-and-error way gamblers once used to figure odds in card or dice games. Before the invention of probability theory, gamblers would deal out many hands of a card game to count the number of times a particular hand occurred. Thus, by experimentation, gamblers could figure the odds of getting a certain hand in their game.
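The gamblers' trial-and-error approach can be mimicked in a few lines of code. The sketch below deals many random five-card hands and counts how often a hand holds at least two cards of the same rank. The choice of event, the function names, and the number of trials are assumptions made for illustration; they do not come from Peterson.

```python
import random

# A 52-card deck as (rank, suit) pairs; only the rank matters for this event.
DECK = [(rank, suit) for rank in range(13) for suit in range(4)]

def has_repeated_rank(hand):
    """Return True if at least two cards in the hand share a rank."""
    ranks = [rank for rank, _ in hand]
    return len(set(ranks)) < len(ranks)

def estimate_probability(trials, seed=0):
    """Deal `trials` random five-card hands and return the share with a repeated rank."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        hand = rng.sample(DECK, 5)  # deal five cards without replacement
        if has_repeated_rank(hand):
            hits += 1
    return hits / trials

if __name__ == "__main__":
    print(estimate_probability(20_000))
```

Probability theory gives about 0.493 for this event, so the simulated proportion should settle near that value as the number of deals grows.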

Probability theory freed researchers from the drudgery of repeated experiments. With a few assumptions, researchers could address a wide range of topics. While the advances in statistics paved the way for elegant analysis, the costs were high:

We could analyze only certain types of statistics, such as the mean and standard deviation.
We had to make certain assumptions, like the normality assumption, about the underlying distribution.
And researchers needed specialized training to apply, understand, and appreciate statistics.

But resampling techniques overcome all these limitations today:

We can analyze virtually any statistic.
We don't have to make any assumptions about the distribution of the data.
And the techniques are easy to understand.

All resampling techniques rely on the computer to generate data sets from the original data. The techniques differ, however, in how they generate the data sets. Four techniques are important:

the bootstrap, invented by Bradley Efron;
the jackknife, invented by Maurice Quenouille and later developed by John W. Tukey;
cross-validation, developed by Seymour Geisser, Mervyn Stone, and Grace G. Wahba; and
balanced repeated replication, developed by Philip J. McCarthy.
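To make "how they generate the data sets" concrete, here is a minimal sketch of three of the four techniques as they are commonly described: the bootstrap draws a same-size sample with replacement, the jackknife leaves out one observation at a time, and cross-validation splits the data into training and held-out parts. The function names, the interleaved fold assignment, and the scores are assumptions for illustration. Balanced repeated replication depends on a stratified sample design and is not sketched here.

```python
import random

def bootstrap_sample(data, rng):
    """Draw len(data) observations from data with replacement."""
    return [rng.choice(data) for _ in data]

def jackknife_samples(data):
    """Return every data set formed by leaving out exactly one observation."""
    return [data[:i] + data[i + 1:] for i in range(len(data))]

def cross_validation_folds(data, k):
    """Split data into k (training, held_out) pairs using interleaved folds."""
    folds = [data[i::k] for i in range(k)]
    return [
        ([x for j, fold in enumerate(folds) if j != i for x in fold], folds[i])
        for i in range(k)
    ]

if __name__ == "__main__":
    scores = [72, 85, 90, 64, 78]  # made-up scores, for illustration only
    rng = random.Random(1)
    print(bootstrap_sample(scores, rng))
    print(jackknife_samples(scores))
    print(cross_validation_folds(scores, 2))
```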

A RESAMPLING EXAMPLE FOR EDUCATION

Back to the question comparing the grades of voucher and non-voucher students: Using the bootstrap technique, we can empirically construct the distribution of mean grade differences for students in these two groups. If the observed difference is unusual, then we would reject the null hypothesis that grades are unrelated to voucher status.

For simplicity, let's assume that the district has 13 voucher students and 39 non-voucher students, and the mean difference is 10 standard score units. To empirically construct the distribution, we'd follow these steps:

1. Create a data base with all the student grades.
2. Randomly sort the data base.
3. Compute the mean for the first 13 students.
4. Compute the mean for the other 39 students.
5. Record the test statistic--the absolute value of the mean difference.
6. Then repeat steps 2 through 5 many times.

That way, we'd get the distribution of mean differences when we randomly select students. The probability of observing a mean difference of at least 10 when everything is random is estimated by the proportion of experimental test statistics in step 5 that are 10 or greater.
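The steps above can be sketched in Python as follows. Because the procedure reshuffles the pooled grades without replacement, it is the form usually called a randomization (or permutation) test. The function name, the number of trials, and the synthetic scores are assumptions for illustration, and counting statistics that equal the observed difference as "extreme" is one common convention rather than something the article specifies.

```python
import random

def randomization_test(group_a, group_b, trials=10_000, seed=0):
    """Estimate how often a random split gives a mean difference
    at least as large as the observed one."""
    rng = random.Random(seed)
    n_a = len(group_a)
    observed = abs(sum(group_a) / n_a - sum(group_b) / len(group_b))
    pooled = list(group_a) + list(group_b)            # step 1: one data base
    n_rest = len(pooled) - n_a
    extreme = 0
    for _ in range(trials):                           # step 6: repeat many times
        rng.shuffle(pooled)                           # step 2: random sort
        mean_first = sum(pooled[:n_a]) / n_a          # step 3: first group mean
        mean_rest = sum(pooled[n_a:]) / n_rest        # step 4: remaining mean
        statistic = abs(mean_first - mean_rest)       # step 5: test statistic
        if statistic >= observed:                     # assumed convention: ties count
            extreme += 1
    return observed, extreme / trials

if __name__ == "__main__":
    data_rng = random.Random(42)
    # Synthetic, made-up scores purely for illustration (not district data).
    voucher = [data_rng.gauss(60, 10) for _ in range(13)]
    non_voucher = [data_rng.gauss(50, 10) for _ in range(39)]
    observed, p_value = randomization_test(voucher, non_voucher)
    print(f"observed |difference| = {observed:.2f}, estimated p = {p_value:.4f}")
```

A small estimated probability suggests that a difference as large as the observed one rarely arises when group labels are assigned at random.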

Noreen (1989) noted several striking aspects of this approach:

Researchers make no assumptions about the distribution of grades (for example, no normality assumption).
The data need not be a random sample from some population.

ARGUMENTS FOR RESAMPLING

Diaconis and Efron (1983) argue that the resampling method frees researchers from two limitations of conventional statistics: "the assumption that the data conform to a bell-shaped curve and the need to focus on statistical measures whose theoretical properties can be analyzed mathematically." Instead, Peterson says, this method "addresses a key problem in statistics: how to infer the 'truth' from a sample of data that may be incomplete or drawn from an ill-defined population."

The resampling method forces researchers to clarify the problem: With no formulas to fall back on, you have to explicitly define the question you want to answer. According to Simon and Bruce (1991), the method prevents researchers from "simply grabbing the formula for some test without understanding why they chose that test." As Peterson explains, instead of asking which formula to use, you "begin tackling such questions as what makes certain results statistically 'significant.'"

In Scientific American, Diaconis and Efron apply the bootstrap method to various types of problems and then compare the results from the bootstrap with the results from conventional statistical tests, including the correlation coefficient and principal components. Most of the time, the bootstrap method yielded the same answers that the more conventional methods did. Of course, the bootstrap may not give a true picture of every sample, just as conventional tests sometimes find deceptive answers to problems.
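As a rough illustration of applying the bootstrap to a correlation coefficient, the sketch below resamples (x, y) pairs with replacement, recomputes the correlation each time, and reads off a simple percentile interval. This is not Diaconis and Efron's analysis or data: the paired values, the function names, the 2,000 replicates, and the percentile method are all assumptions chosen for illustration.

```python
import math
import random

def pearson_r(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sxx = sum((x - mx) ** 2 for x in xs)
    syy = sum((y - my) ** 2 for y in ys)
    return sxy / math.sqrt(sxx * syy)

def bootstrap_correlations(xs, ys, replicates=2000, seed=0):
    """Recompute the correlation on resamples of the (x, y) pairs."""
    if len(set(xs)) < 2 or len(set(ys)) < 2:
        raise ValueError("both variables must vary")
    rng = random.Random(seed)
    pairs = list(zip(xs, ys))
    results = []
    while len(results) < replicates:
        sample = [rng.choice(pairs) for _ in pairs]  # draw pairs with replacement
        sx, sy = zip(*sample)
        # Skip rare degenerate resamples where a variable is constant.
        if len(set(sx)) > 1 and len(set(sy)) > 1:
            results.append(pearson_r(sx, sy))
    return results

def percentile_interval(values, lower=0.025, upper=0.975):
    """Return the values at the given lower and upper percentile positions."""
    ordered = sorted(values)
    last = len(ordered) - 1
    return ordered[int(lower * last)], ordered[int(upper * last)]

if __name__ == "__main__":
    # Made-up paired measurements, for illustration only.
    hours = [2, 4, 5, 7, 8, 10, 11, 13, 14, 16]
    grades = [55, 60, 58, 70, 66, 75, 80, 78, 88, 90]
    replicates = bootstrap_correlations(hours, grades)
    print("observed r:", round(pearson_r(hours, grades), 3))
    print("percentile interval:", percentile_interval(replicates))
```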

Because resampling techniques like the bootstrap are so easy to use and understand, Simon and Bruce advocate teaching these techniques to students first--that way, students learn how to translate their "scientific" question into a "statistical" question. By learning how to think clearly about their problem, students won't "select their methods blindly."

They cite a study in which one group of students learned resampling techniques and another learned conventional methods. The students taught resampling did much better at solving statistical problems than those taught conventional methods. Further, the students who learned the resampling techniques enjoyed statistics, and their attitudes toward math improved during the course, whereas the attitudes of the students who learned conventional techniques got worse.

ARGUMENTS AGAINST RESAMPLING

Critics question the resampling method itself. They argue, as Stephen E. Fienberg says, that "you're trying to get something for nothing. You use the same numbers over and over again until you get an answer that you can't get any other way. In order to do that, you have to assume something, and you may live to regret that hidden assumption later on" (Peterson, 1991, p. 57).

Other critics question the accuracy of the estimates that resampling yields--if, for example, the researcher doesn't make enough experimental trials. In some situations, resampling may be less accurate than conventional parametric methods.
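The concern about too few trials can be quantified under one simplifying assumption: if the trials are independent and the true probability is p, the proportion estimated from N trials is a binomial proportion, and its standard error shrinks only with the square root of N. The value p = 0.05 below is an illustrative assumption, not a figure from the article.

```latex
\[
\operatorname{SE}(\hat{p}) = \sqrt{\frac{p(1-p)}{N}}
\]
\[
p = 0.05,\; N = 1{,}000:\quad \operatorname{SE}(\hat{p}) \approx 0.0069
\qquad\qquad
p = 0.05,\; N = 10{,}000:\quad \operatorname{SE}(\hat{p}) \approx 0.0022
\]
```

Under these assumptions, a tenfold increase in trials cuts the simulation error by only about a factor of three, so an estimate near a decision threshold may need many thousands of trials.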

ADDITIONAL READING

The classic introduction to this field:

Diaconis, P., and B. Efron. (1983). Computer-intensive methods in statistics. Scientific American, May, 116-130.

Numerous examples, as well as Basic, Pascal, and Fortran source code for conducting several resampling experiments:

Noreen, Eric. (1989). Computer-intensive methods for testing hypotheses. New York: Wiley.

Discusses using resampling methods to teach statistical concepts:

Peterson, I. (July 27, 1991). Pick a sample. Science News, 140, 56-58.

Low-cost IBM PC software for learning and applying resampling:

Simon, J. L. (1990). Resampling stats: Probability and statistics, a radically different way. University of Maryland, College of Business and Management.

Arguments for using and teaching resampling:

Simon, J. L., and P. Bruce. (1991). Resampling: A tool for everyday statistical work. Chance, 4(1), 22-32.

Descriptors: Computer Oriented Programs; Computer Uses in Education; *Educational Research; Elementary Secondary Education; *Estimation (Mathematics); Nonparametric Statistics; *Probability; *Research Methodology; Sampling; Statistical Distributions; *Statistics; Tech
