Perry et al. (1) used a pathway-based approach to identify biological pathways associated with type 2 diabetes. They used genome-wide association (GWA) data from the type 2 diabetes study in the U.K. Wellcome Trust Case Control Consortium (WTCCC) for the initial analysis and validated the findings with data from the Diabetes Genetics Initiative (DGI) and Finland–United States Investigation of NIDDM Genetics (FUSION) studies. The Wnt signaling pathway was the most strongly associated, and they therefore postulated that it was the most interesting candidate pathway. However, after correcting for multiple testing, none of the top-ranking pathways reached statistical significance. Perry et al. concluded that type 2 diabetes genes are likely to reside in multiple pathways.

We recently performed a comparable genome-wide pathway analysis in two of the three GWA datasets used by Perry et al. (the WTCCC and DGI) and found results that overlapped with, but also differed from, theirs (2). However, we encountered several problems using these pathway methods. Our main conclusion is therefore that pathway-based approaches have many limitations that need to be addressed before these methods can provide accurate results from which conclusions can be drawn.

First, in classification systems like Kyoto Encyclopedia of Genes and Genomes (KEGG) or BioCarta, the majority of human genes are currently not assigned to any pathway. Of the 18 type 2 diabetes susceptibility loci recently identified, only 5 (CDKN2A-2B, PPARG, NOTCH2, VEGFA, and TCF7L2) could be assigned to known biological pathways. In addition, β-cell function, one of the mechanisms suggested to underlie type 2 diabetes, has not been specifically described as a pathway in either KEGG or BioCarta. Thus, although type 2 diabetes genes may well play a role in multiple pathways, we feel that this conclusion cannot be drawn based on the results from pathway-based analyses.

Second, as Perry et al. discuss, larger pathways are favored to become significantly overrepresented in pathway analysis. This follows from the statistical property that the power of a test increases with the numbers being compared, as is the case when analyzing larger pathways. One of the top associated pathways in both our study and that of Perry et al. is the Wnt signaling pathway, which comprises many genes. It is therefore highly likely to become statistically overrepresented in pathway analyses. We analyzed 30 randomly selected sets of genes, encompassing around 1,500 genes per set; in 16 of the 30 sets the Wnt signaling pathway was among the top 10 ranked pathways, and in 5 of the 30 sets it was even ranked in the top 3.
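The kind of random-set check described above can be illustrated with a small simulation. The following is a minimal sketch, not the procedure used in either study: it assumes a hypergeometric overrepresentation test, a hypothetical genome of 20,000 genes, and 100 hypothetical pathways (one of 150 genes, the rest of 10 to 60 genes) with randomly assigned members. Only the 30 sets, the roughly 1,500 genes per set, and the top-10 criterion are taken from the text; all function and parameter names are illustrative.

```python
import math
import random


def log_comb(n, k):
    """Natural log of the binomial coefficient C(n, k)."""
    return math.lgamma(n + 1) - math.lgamma(k + 1) - math.lgamma(n - k + 1)


def hypergeom_upper_tail(x, N, K, n):
    """P(X >= x) when n genes are drawn without replacement from N genes,
    K of which belong to the pathway (hypergeometric overrepresentation test)."""
    lo = max(x, 0, n - (N - K))   # smallest attainable overlap that is >= x
    hi = min(K, n)                # largest attainable overlap
    log_denom = log_comb(N, n)
    total = sum(
        math.exp(log_comb(K, k) + log_comb(N - K, n - k) - log_denom)
        for k in range(lo, hi + 1)
    )
    return min(1.0, total)


def simulate(n_genes=20000, set_size=1500, n_sets=30, top_k=10, seed=0):
    """Count in how many random gene sets the largest pathway ranks in the top_k.

    All sizes are assumptions for illustration; pathway membership is random,
    so no pathway has any true association with the drawn gene sets.
    """
    rng = random.Random(seed)
    genes = range(n_genes)
    # Hypothetical pathway sizes: index 0 is the single large pathway.
    sizes = [150] + [rng.randint(10, 60) for _ in range(99)]
    pathways = [set(rng.sample(genes, s)) for s in sizes]

    hits = 0
    for _ in range(n_sets):
        drawn = set(rng.sample(genes, set_size))
        pvals = [
            hypergeom_upper_tail(len(p & drawn), n_genes, len(p), set_size)
            for p in pathways
        ]
        # Rank of the large pathway; ties are resolved in its favor.
        rank = 1 + sum(pv < pvals[0] for pv in pvals[1:])
        hits += rank <= top_k
    return hits


if __name__ == "__main__":
    print("Large pathway in top 10 in", simulate(), "of 30 random sets")
```

If every pathway were equally likely to rank highly, a given pathway would be expected among the top 10 of 100 in about 3 of 30 random sets. A markedly higher count for the large pathway in this toy setup would reflect pathway size alone, since the gene sets carry no biological signal.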

We would like to emphasize that the limitations of pathway-based analyses in GWA data should be kept in mind when drawing conclusions based on overrepresented pathways.

No potential conflicts of interest relevant to this article were reported.

1. Perry JR, McCarthy MI, Hattersley AT, Zeggini E, the Wellcome Trust Case Control Consortium, Weedon MN, Frayling TM: Interrogating type 2 diabetes genome-wide association data using a biological pathway-based approach. Diabetes 2009; 58: 1463–1467
2. Elbers CC, van Eijk KR, Franke L, Mulder F, van der Schouw YT, Wijmenga C, Onland-Moret NC: Using genome-wide pathway analysis to unravel the etiology of complex diseases. Genet Epidemiol 2009; 33: 419–431
Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. See http://creativecommons.org/licenses/by-nc-nd/3.0/ for details.