In a recent letter in Diabetes Care, Lockwood (1) pointed out that there was a strong correlation (r = 0.54, P = 0.000057) between the statewide self-reported diabetes prevalence in 2000 and the total statewide air toxic release inventory (TRI) in 1999 for the 50 states and Washington, D.C. He pointed out that “[although] […] the correlation between air emissions and the prevalence of diabetes does not prove a cause-and-effect relationship, the significance of the relationship demands attention.”

I agree that the correlation does not prove a cause-and-effect relationship, but the demand for attention is questionable. That demand rests on the magnitude of the observed correlation, yet attributing a possible cause requires at least a plausible mechanism and individual-level data (not statewide averages). Lockwood developed an impression that dioxins are the main culprit in the hypothetical exposure-response relationship, but it is difficult to see how the reported correlation helps establish that relationship, since dioxins are not, as he noted, among the chemicals inventoried in the TRI.

As an example of how looking at statewide averages (group data) can lead to questionable results, the self-reported diabetes data were downloaded from the CDC Behavioral Risk Factor Surveillance Systems Web site (2), as were the latitudes and longitudes of each of the state capitals and the state population sizes in 2000. Correlations among these variables were calculated using the same techniques that Lockwood (1) used.
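The calculation described above can be sketched in a few lines of Python. This is a minimal illustration only: the function names `pearson_r` and `alphabetical_rank` are hypothetical, and the code is not the software used for this letter. It shows how a Pearson coefficient is computed from paired state-level values and how an arbitrary variable, such as a state's position in an alphabetized list, can be constructed.

```python
import math


def pearson_r(xs, ys):
    """Pearson product-moment correlation of two equal-length sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length (at least 2)")
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Sums of cross-products and squared deviations about the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        raise ValueError("correlation is undefined for a constant variable")
    return sxy / math.sqrt(sxx * syy)


def alphabetical_rank(names):
    """Map each name to its 1-based position in the alphabetized list."""
    return {name: i + 1 for i, name in enumerate(sorted(names))}
```

Given a list of state names and a matching list of prevalence values, the rank variable would be built with `alphabetical_rank` and passed to `pearson_r` together with the prevalence values, one pair per state.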

The correlations were instructive. Table 1 shows the Pearson correlation between statewide diabetes prevalence and each selected state-level variable, along with the associated significance level of the correlation (P values available only to three significant figures).

The correlation between statewide diabetes prevalence and the latitude of the state capital is the same magnitude as that reported by Lockwood (1) for the correlation between statewide diabetes prevalence and statewide toxic air emissions. The correlations with the other variables are about the same size and are all statistically significant.
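For readers who want to see how a correlation of this size relates to a significance test, the sketch below converts each reported r into the usual t statistic, t = r·sqrt(n − 2)/sqrt(1 − r²), on n − 2 degrees of freedom. It assumes n = 51 (the 50 states and Washington, D.C.) with complete data for every variable; the function name `t_from_r` is hypothetical, and the exact test used to produce the reported P values is not stated in this letter.

```python
import math


def t_from_r(r, n):
    """t statistic for testing zero correlation, given Pearson r and sample size n."""
    if n < 3:
        raise ValueError("need at least 3 observations")
    if abs(r) >= 1:
        raise ValueError("r must lie strictly between -1 and 1")
    return r * math.sqrt(n - 2) / math.sqrt(1 - r * r)


# Assumption: 50 states plus Washington, D.C., with no missing values.
N_UNITS = 51

# Correlations as reported in Table 1 of this letter.
reported = [
    ("Latitude of the state capital", -0.54),
    ("Longitude of the state capital", -0.31),
    ("State population", 0.46),
    ("Alphabetical position of the state", 0.49),
]

for label, r in reported:
    t = t_from_r(r, N_UNITS)
    print(f"{label}: r = {r:+.2f}, t = {t:+.2f} on {N_UNITS - 2} df")
```

Because the statistic depends only on r and n, any variable that happens to correlate with prevalence at about 0.5 across 51 units yields a similarly large t, whatever its substantive meaning.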

The conclusion is that, to reduce the risk of diabetes, a person should move to a northwestern state with a small population and a name near the beginning of the alphabet. Alaska is a reasonable choice, based on an unreasonable application of statistics. However, this application is not very different from the methods used by Lockwood.

I hope that this demonstrates that a highly significant correlation between two variables based on statewide data does not show anything.

Table 1. The Pearson correlations between statewide diabetes prevalence and selected state-level variables, with associated significance levels

Variable                                                                                 r        P
Latitude of the state capital                                                            −0.54    <0.001
Longitude of the state capital                                                           −0.31    <0.02
State population                                                                         +0.46    <0.001
Numerical position in the alphabetized state list (i.e., Alabama = 1, Wyoming = 51)      +0.49    <0.001

1. Lockwood A: Diabetes and air pollution (Letter). Diabetes Care 25:1487–1488, 2002
2. CDC Behavioral Risk Factor Surveillance Systems [Diabetes Prevalence Data]. Available from http://apps.nccd.cdc.gov/brfss/list.asp?cat=DB&yr=2000&qkey=1364&state=US, 2000. Accessed 6 August 2002.

Address correspondence to Mark J. Nicolich, Statistician, 24 Lakeview Rd, Lambertville, NJ 08530. E-mail: [email protected].