Using Internet Datasets
Mathematics and science teachers frequently seek sources of data sets for classroom use. Data sets offer students experiences with reading tabular information, evaluating the validity of the source of the data, statistically analyzing data, and drawing implications about the context of the data. In addition, the real-life sources of these data sets are generally more interesting to students than are sets of numbers in textbook appendices. Teachers seeking data sets on the World Wide Web initially encounter a gold mine of potentially useful items. As in the mining process, teachers may discover a number of barriers to easy access.

The first barrier is the time required to access the data. Teachers must plan ahead to allow students sufficient access time, or arrange to download the data while the class is engaged in another activity.

A second potential hurdle arises when a site requires a particular software tool to extract the data. Normally, this specialized application is available for download at the site. However, students may encounter difficulties and become frustrated while trying to access these data sets, so teachers should be prepared to assist with the application program.

Finally, some data sets are readily accessible but, upon examination, appear to be a meaningless string of numbers. Users will need to find an accompanying document (often with a similar file name and the suffix .DOC) that describes the meaning of the numbers. Often the data resemble a spreadsheet with coded entries. Once students are made aware of these codes, they can readily interpret these data sets.


As an example of a data set and its accompanying document, examine the information about the Titanic's passengers and crew collected for the Journal of Statistics Education's Dataset Archive.

The data set, titled Population at Risk and Death Rates for an Unusual Episode, is found in the file [titanic.dat]. The data consist of four long columns of the digits 0 through 3.

For example:
1 1 1
1 1 1 0
2 1 1 0
3 0 0 1
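
A file in this layout can be read with a few lines of code. The following Python sketch is illustrative only: it assumes the values are separated by spaces and that every valid row holds exactly four whole numbers. The function name parse_rows is hypothetical. Lines that do not fit the pattern are set aside for inspection rather than guessed at.

```python
def parse_rows(text, n_columns=4):
    """Split whitespace-delimited text into rows of integers.

    Returns (rows, rejected): rows holds tuples with exactly n_columns
    integer values; rejected holds the raw lines that did not fit.
    """
    rows, rejected = [], []
    for line in text.splitlines():
        fields = line.split()
        if not fields:
            continue  # skip blank lines
        if len(fields) == n_columns and all(f.isdigit() for f in fields):
            rows.append(tuple(int(f) for f in fields))
        else:
            rejected.append(line)  # keep the raw line so a person can check it
    return rows, rejected


# The excerpt shown above, copied as printed.
sample = "1 1 1\n1 1 1 0\n2 1 1 0\n3 0 0 1"
rows, rejected = parse_rows(sample)
print("usable rows:", rows)
print("set aside:", rejected)
```

Setting aside short rows, instead of filling in a missing value, keeps students from analyzing numbers that were never in the source.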

Important information about the context of the data and the creation of the dataset is found in [titanic.doc]. This includes the size and source of the set, a descriptive abstract and the "story behind the data", a description of the variables by columns and values, notes on instructional uses of the dataset, a source for additional contextual information, and how to contact the person who submitted the set to the archive. This type of information is provided for each set of data in this archive. Because of its standardized, teacher-friendly format, this site is an excellent place for teachers to begin their use of Internet-accessible data sets.
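
Once the variable descriptions are in hand, translating coded rows into readable labels is mechanical. The Python sketch below shows one way to do it. The column order and the meaning of each digit in CODEBOOK are assumptions made for illustration and must be confirmed against [titanic.doc] before classroom use; the names CODEBOOK, COLUMNS, and decode_row are hypothetical.

```python
from collections import Counter

# ASSUMED codebook, for illustration only; confirm each code in titanic.doc.
CODEBOOK = {
    "class": {0: "crew", 1: "first", 2: "second", 3: "third"},
    "age": {1: "adult", 0: "child"},
    "sex": {1: "male", 0: "female"},
    "survived": {1: "yes", 0: "no"},
}
COLUMNS = ("class", "age", "sex", "survived")  # assumed column order


def decode_row(row):
    """Translate one coded row into a dictionary of readable labels."""
    return {name: CODEBOOK[name][value] for name, value in zip(COLUMNS, row)}


# Three coded rows in the four-column layout described in the text.
rows = [(1, 1, 1, 0), (2, 1, 1, 0), (3, 0, 0, 1)]
decoded = [decode_row(r) for r in rows]

# Count people by (class, survived) as a first step toward comparing rates.
tally = Counter((d["class"], d["survived"]) for d in decoded)

for person in decoded:
    print(person)
```

Keeping the codes in one table makes it easy to correct them after reading the accompanying document, and the tally gives students a starting point for comparing groups.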

The best news is that almost every major organization, from the National Oceanic and Atmospheric Administration (NOAA) and the National Aeronautics and Space Administration (NASA) to the National Association for Stock Car Auto Racing (NASCAR), has data at its Web site. With so many sources, students should be able to find a data set on a topic that interests them.

* Adapted with permission from Henry, M. A. (1998, April). Data Sets. Missouri Science News Notes, 4 (10), pp. 10-11.

