Over the past 5 years, microarrays have greatly facilitated large-scale analysis of gene expression levels. Although these arrays were not specifically geared to represent tissues and pathways known to be affected by diabetes, they have been used in both type 1 and type 2 diabetes research. To prepare a tool that is particularly useful in the study of type 1 diabetes, we have assembled a nonredundant set of 3,400 clones representing genes expressed in the mouse pancreas or pathways known to be affected by diabetes. We have demonstrated the usefulness of this clone set by preparing a cDNA glass microarray, the PancChip, and using it to analyze pancreatic gene expression from embryonic day 14.5 through adulthood in mice. The clone set and corresponding array are useful resources for diabetes research.

Microarray analysis has been used in studies of both type 1 (1,2) and type 2 diabetes (3–8). These studies made use of commercial, high-density oligonucleotide arrays as well as cDNA glass microarrays and filter arrays. Because array design is often directed by clone availability without consideration of the tissues relevant to diabetes research, most of the elements did not show any differential expression in diabetes studies. To overcome these limitations, and to provide a cost-effective resource to the diabetes research community, we have developed a 3.4K clone set that contains 3,139 clones representing mRNAs expressed in the pancreas, 231 clones representing pathways affected by diabetes, and 30 clones representing housekeeping genes. The core pancreas clone set of 3,139 cDNAs was assembled using both experimental and computational approaches to select clones that were expressed in islets and pancreas. We developed a glass cDNA microarray, the “PancChip,” based on the 3.4K clone set. Subsequently, we used the PancChip to study gene expression in the developing mouse pancreas between embryonic day (E) 14.5 and adulthood. The 3.4K clone set used to prepare the PancChip is available to the diabetes research community through the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK)-funded biotechnology centers.

Glass microarray preparation.

The bacterial clones selected for the array were grown to confluence overnight in flat bottom 96-well square block plates with a volume of 1 ml LB-Amp medium per well. Plasmid DNA was prepared for each clone using a Qiagen 3000 robot, and the purified DNA was used as template to amplify the inserts with PCR. The clones were assigned a PCR score of “pass” if the gel showed a single strong band and “fail” if there were no bands or multiple bands. For the subgroup of genes named “select array,” primer pairs were synthesized for each gene and the cDNA amplified from a mixture of intestine, liver, and pancreas RNA using RT-PCR. All PCR products were purified using Millipore MultiScreen PCR Cleanup 96-well plates, resuspended in 50 μl deionized sterile water, diluted with an equal volume of DMSO (Sigma), and printed on poly-l-lysine–coated slides with an Affymetrix 417 arrayer. There are 3,840 elements on the PancChip (Table 1), including pancreas-specific clones, clones representing signal transduction pathways, controls, and blanks (9). Information about the PancChip (Version 2.0) is available at http://www.cbil.upenn.edu/EPConDB/pancChip.html.

Preparation of RNA.

Six adult CD1 female mice were killed, and the pancreas and heart were immediately homogenized in 10 ml denaturing solution (4 mol/l guanidinium thiocyanate, 0.1 mol/l Tris-Cl pH 7.5, 1% β-mercaptoethanol) per organ. Total RNA was extracted using the acid-phenol extraction method (10). Mouse islets, as well as embryonic and newborn pancreas and liver tissues, were immediately homogenized in 1 ml TRIzol Reagent (Gibco). RNA was purified following the manufacturer’s protocols with the exception that 20 μg glycogen (Roche) was added to each sample. Subsequently, RNA pellets were washed with 75% ethanol and resuspended in 300 μl TES (10 mmol/l Tris pH 7.5, 1 mmol/l EDTA, 0.1% SDS). The RNA was re-extracted with 600 μl phenol:chloroform:isoamyl alcohol (25:24:1), precipitated with 1/10 volume 3 mol/l sodium acetate and 3 volumes ethanol, and stored at −80°C until use. For the developmental time course, a common control was prepared by pooling samples. The common control pool consisted of 2.5 μg of each of the E16.5 and E18.5 samples for a total of 15 μg from each of these time points, as well as 5.3 μg of each of the newborn, P7, and adult samples for a total of 32 μg from each of these later time points. For our initial expression survey with Incyte mouse GEM 1 (1.12 for newborn pancreas) and human GEM arrays, labeling, hybridization, and signal quantification were performed by the manufacturer using 200 μg total RNA supplied by us. For the experiments performed using the PancChip, these steps were performed as described below.

RNA labeling.

cDNAs were labeled with the 3DNA Submicro Array kit (Genisphere) according to the manufacturer’s protocol and recommendations. Total RNA (2.0 μg) and 2 pmoles Cy3 capture sequence primer or Cy5 capture sequence primer were brought to 10 μl with diethyl pyrocarbonate (DEPC)-treated water and incubated for 10 min at 80°C. The RNA mixture was then cooled to 42°C. An equal volume of reaction mix (2× first-strand buffer [Invitrogen], 1 mmol/l dATP, 1 mmol/l dGTP, 1 mmol/l dCTP, 1 mmol/l dTTP, 20 mmol/l dithiothreitol [DTT], 40 units RNasin [Promega], and 200 units Superscript II reverse transcriptase [Invitrogen]) was added, and the reaction was incubated for 2 h at 42°C. The reaction was terminated by bringing it to 0.074N NaOH and 7.4 mmol/l EDTA and incubating it at 65°C for 10 min. Finally, the reaction was neutralized by adjusting it to 0.175 mol/l Tris-Cl, pH 7.5. The Cy3 and Cy5 reactions were combined and precipitated with 20 μg linear polyacrylamide (Ambion), 1 volume 7.5 mol/l ammonium acetate, and 9 volumes ethanol at −20°C for 30 min. Following precipitation, the pellet was air dried.

Prehybridization.

Prehybridization was performed for all arrays (9). A coplin jar containing 50 ml prehybridization buffer (5× SSC, 0.1% SDS, and 1% BSA) was brought to 45°C. The arrays were incubated for 45 min at 45°C, rinsed five times in deionized water at room temperature, rinsed once in isopropanol, and then placed into a 50-ml conical tube and centrifuged 1 min at 1,000 rpm. The prehybridization was done no more than 1 h before hybridization.

Hybridization.

In preparation for hybridization, the cDNA pellet was resuspended in 2 μl sterile deionized water. Then 3.0 μl oligo dT70 blocker (0.25 μg/μl), 2.5 μl of the Cy3 dendrimer, 2.5 μl of the Cy5 dendrimer, and 1 μl high-end differential enhancer (all from Genisphere) were added to the cDNA. Mouse Cot1 DNA, 2.5 μg (1 mg/ml, Gibco-BRL), and 1 μl anti-fade reagent (Genisphere) were added to 100 μl hybridization buffer (40% formamide, 4× SSC, 1% SDS; Genisphere), and the hybridization buffer was warmed to 45°C. The prepared hybridization buffer (19 μl) was added to the cDNA/dendrimer mix and incubated at 45°C for 15 min. This hybridization mix was added to a prehybridized glass microarray, covered with a glass coverslip (22 × 40 mm), and incubated in a Corning hybridization chamber overnight at 45°C. The labeled arrays were washed three times for 10 min each: once at 55°C in 2× SSC, 0.2% SDS; once in 2× SSC at room temperature; and once in 0.2× SSC at room temperature. They were then dried by centrifugation in a slide rack for 3 min at 1,000 rpm in a Sorvall SH-3000 rotor.

Scanning and image analysis.

All slides were scanned immediately following hybridization and washing using an Affymetrix 418 scanner. The laser power was set to 100%, and the gain of the photomultiplier tube was varied to avoid signal saturation in any spots. The image analysis was performed with ArrayVision 6.0 (Imaging Research). Signal and background pixel classification were determined by segmentation limited to 75–125% of the set spot diameter. Signal and background intensities were determined by the median pixel values. All of the array data are available through http://www.cbil.upenn.edu/EPConDB/query.html.

Data preprocessing and normalization.

In the arrays used for the developmental time course, 2,914 genes had a passing PCR score (see “Glass Microarray Preparation”) and were considered for further analysis. Genes with high variation among the six replicates (coefficient of variation [CV] > 0.7) were examined for outliers. The data were normalized using the statistical software package R (11). For the comparisons across time points, scaled print-tip group lowess normalization was used to remove spatial effects and other systematic variation (12). The local background intensities as calculated by ArrayVision were not subtracted from the data intensities. The subtraction of local background significantly increases data variance (E. Manduchi, L.M.S., J.E. Brestelli, G.R. Grant, K.H.K., C.J.S. Jr., Physiological Genomics, In Press) and is not necessarily a good control for nonspecific binding of the labeled nucleic acids.
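The replicate screen described above can be sketched as follows. This is only a minimal illustration and not the R code used in the study: the function names and the example intensities are hypothetical, and the sample standard deviation is assumed for the CV because the text does not state which estimator was used.

```python
import statistics

CV_THRESHOLD = 0.7  # threshold stated in the text


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed definition)."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("CV is undefined for a zero mean")
    return statistics.stdev(values) / mean


def flag_high_variation(replicates_by_gene, threshold=CV_THRESHOLD):
    """Return the genes whose replicate CV exceeds the threshold."""
    return [
        gene
        for gene, values in replicates_by_gene.items()
        if coefficient_of_variation(values) > threshold
    ]


# Hypothetical intensities for six replicates of two spots.
example = {
    "clone_a": [1000, 1100, 950, 1050, 980, 1020],
    "clone_b": [200, 210, 190, 2500, 205, 195],  # one aberrant replicate
}
flagged = flag_high_variation(example)
```

Spots flagged in this way would then be inspected for outlying replicates rather than discarded automatically, as the text describes.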

Vector projection.

The normalized data were analyzed for trends over time using vector projection. Vector projection is a method that allows identification of these trends in gene expression data and incorporates all of the information from all time points (Terry Speed, Department of Statistics, University of California at Berkeley, and Genetics and Bioinformatics, Walter and Eliza Hall Institute Australia; and Ingrid Lönnstedt, Department of Mathematics, Uppsala University, personal communication). Specifically, vector projection facilitates quick identification of genes that match the predetermined patterns of interest shown in Fig. 3 (e.g., “late” gene expression defined as peak expression during the adult time point). Each gene has a vector of its normalized expression values across time. These values are projected onto the space spanned by the pattern of interest (vector of coefficients or weightings for each time point). Examination of the Normal QQ-plot of all projection values allows identification of the extreme projection scores. Genes with strong patterns will have the largest (or smallest) values of the inner product.
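The projection step can be illustrated with a short sketch. The pattern weights, gene names, and log ratios below are hypothetical and chosen only for illustration; the actual weightings used in the study are not given in the text. The sketch scores each gene by the inner product of its expression vector with a unit-length pattern vector and ranks the genes by that score. It does not reproduce the Normal QQ-plot inspection.

```python
import math

TIME_POINTS = ["E14.5", "E16.5", "E18.5", "newborn", "P7", "adult"]

# Hypothetical weights for a "late" pattern: rewards high adult expression.
LATE_PATTERN = [-1, -1, -1, -1, -1, 5]


def unit_vector(vector):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vector))
    if norm == 0:
        raise ValueError("pattern vector must not be all zeros")
    return [x / norm for x in vector]


def projection_score(expression, pattern):
    """Inner product of an expression vector with the unit-length pattern."""
    if len(expression) != len(pattern):
        raise ValueError("expression and pattern must have the same length")
    return sum(e * p for e, p in zip(expression, unit_vector(pattern)))


def rank_by_projection(profiles, pattern):
    """Sort (gene, score) pairs from the largest to the smallest score."""
    scores = {gene: projection_score(v, pattern) for gene, v in profiles.items()}
    return sorted(scores.items(), key=lambda item: item[1], reverse=True)


# Hypothetical normalized log2 ratios, one value per entry in TIME_POINTS.
profiles = {
    "late_like": [-2.0, -1.5, -1.0, 0.0, 0.5, 2.5],
    "flat": [0.1, 0.0, -0.1, 0.0, 0.1, 0.0],
    "early_like": [2.0, 1.0, 0.5, -0.5, -1.0, -1.5],
}
ranking = rank_by_projection(profiles, LATE_PATTERN)
```

Genes at the top of the ranking follow the pattern most closely, and those at the bottom follow its mirror image, which corresponds to the largest and smallest inner products mentioned above.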

The pancreas clone set.

To produce a custom cDNA microarray, the PancChip, we first needed to identify a set of clones that represented genes expressed in the pancreas. Experimental and bioinformatics approaches were used to identify these clones. First, expression studies were performed with Incyte GEM arrays using RNA from newborn and adult mouse pancreas, newborn mouse liver, a mouse insulinoma cell line, and human islets. In each case, we identified the clones in the top 15% expression level as measured by fluorescence intensity, which given the random collection of cDNAs on the Incyte GEM array, represents genes expressed at low, moderate, and high levels. Second, we identified additional clones for the PancChip from dbEST cDNA libraries nos. 185 (13), 422 (14), 1,144, 1,880, and 2,712 (15). These libraries were chosen because they were all prepared from RNA isolated from pancreatic tissue. Library 1,870 was prepared with C57BL/6J adult mouse pancreas. All others were prepared from human pancreatic islets; no. 2,712 also included RNA from human total pancreas.

Genomics Unified Schema.

To identify and select a nonredundant set of clones, the expressed sequence tags (ESTs) and clones identified above were mapped to IMAGE clones using the Genomics Unified Schema (GUS) data system (16) accessible through AllGenes (http://www.allgenes.org). RNA entries contained within GUS are organized as Database of Transcribed Sequences (DoTS) assemblies; each entry or assembly represents a consensus of overlapping transcribed sequences, both confirmed and putative. The mouse clones identified by expression analysis described above could be directly mapped to DoTS assemblies. For human cDNA libraries, EST sequences over 100 bp were first retrieved and trailing polyA and leading polyT regions removed. Mouse orthologs for these ESTs and for the genes identified by the expression analysis of human islets were identified through stringent BLASTX similarity against the nonredundant protein database (NRDB) at the National Center for Biotechnology Information (NCBI) using a cutoff P value of 1 × 10⁻⁵⁰. A set of IMAGE clones was chosen from combined groups of nonoverlapping mouse DoTS assemblies. Finally, one IMAGE clone was chosen to represent each of the DoTS assemblies identified, with preference given to clones containing the 3′ end of the assembly.
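The EST preprocessing step (length filter followed by tail trimming) can be sketched as below. This is an assumption-laden illustration rather than the GUS pipeline itself: the minimum run length that counts as a polyA or polyT tail (MIN_TAIL) is hypothetical, as are the function names and example sequences.

```python
import re

MIN_LENGTH = 100  # "over 100 bp", as stated in the text
MIN_TAIL = 10     # hypothetical minimum run length treated as a tail

_LEADING_POLY_T = re.compile(r"^T{%d,}" % MIN_TAIL)
_TRAILING_POLY_A = re.compile(r"A{%d,}$" % MIN_TAIL)


def trim_tails(sequence):
    """Remove a leading polyT run and a trailing polyA run."""
    seq = sequence.upper()
    seq = _LEADING_POLY_T.sub("", seq)
    seq = _TRAILING_POLY_A.sub("", seq)
    return seq


def preprocess_ests(ests):
    """Keep ESTs longer than MIN_LENGTH, then trim their tails."""
    return {
        name: trim_tails(seq)
        for name, seq in ests.items()
        if len(seq) > MIN_LENGTH
    }


# Hypothetical input: one long EST with both tails, one short EST.
ests = {
    "est_1": "T" * 15 + "ACGT" * 30 + "A" * 20,
    "est_2": "ACGT" * 10,
}
cleaned = preprocess_ests(ests)
```

The trimmed sequences would then be passed to the similarity search; that search is not reproduced here.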

Using this combination of expression analysis and database mining, we obtained a nonredundant core clone set of 3,139 mouse IMAGE clones, each representing a unique assembly (“Pancreas clone set,” Table 1). Most of the genes represented in this set were identified by several paradigms; for example, hundreds were found in an expression array with mouse insulinoma RNA and were also present in an EST library derived from human pancreatic islets. Overall, ∼60% of the clones in the set, or 1,900 elements, were defined to be present in pancreatic islets or insulinoma cells. Of this core set, 2,369 clones showed >95% identity to known protein sequences, and 310 showed “no nonredundant (NR) protein similarities” and represent unique protein coding regions or untranslated regions. In addition, 1,898 of the clones were sequence verified at the Genome Sequencing Center at Washington University. The remaining 1,241 clones were not verified, as they did not match a parent sequence. These clones are not necessarily incorrectly identified, because many parent clones had previously been sequenced from only one end. Of the nonsequence verified clones, 830 or 69% were shown to be expressed during one of the stages of pancreatic development contained in this study, and, therefore, these clones were maintained as part of the 3.4K clone set. The corresponding PancChip information, including annotation, for the 3.4K clone set and the verification status of all clones is available at http://www.cbil.upenn.edu/EPConDB/pancChip.html.

Gene ontology functions.

The cDNA clones in the core pancreas clone set were categorized using a directed acyclic graphical (DAG) classification system defined by the Gene Ontology (GO) Consortium (17,18) (http://www.geneontology.org). The assignments of GO functions to the proteins represented by the cDNA clones in the core pancreas clone set (Fig. 1) were made computationally using an algorithm associating the translated protein domains with GO functions (25). Briefly, to assign GO function(s) to the translated sequences, it is assumed that the sequence of a previously characterized functional domain always functions as characterized. For example, a translated sequence having a domain with a BLAST similarity meeting a P value threshold with a previously characterized DNA binding domain is assigned the GO function “DNA binding.” Not every translated sequence will contain a domain meeting the criteria to be assigned a GO function. Thus, not every clone was assigned a GO function, and some sequences were assigned more than one top-level function. The distributions of the top-level GO function assignments for the core pancreas clone set are shown in Fig. 1 and Table 2.
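The assignment rule described above can be illustrated with a small sketch. It is not the published algorithm (25): the P value threshold, the function name, and the example hits are all hypothetical, and domain hits are assumed to be available already as (GO function, P value) pairs.

```python
P_VALUE_THRESHOLD = 1e-10  # hypothetical; the text does not give the cutoff


def assign_go_functions(domain_hits, threshold=P_VALUE_THRESHOLD):
    """Map each clone to the GO functions of its qualifying domain hits.

    domain_hits maps a clone name to a list of (go_function, p_value)
    pairs, one pair per similarity hit to a characterized domain.
    """
    assignments = {}
    for clone, hits in domain_hits.items():
        functions = {go for go, p_value in hits if p_value <= threshold}
        assignments[clone] = sorted(functions)
    return assignments


# Hypothetical hits for three clones.
hits = {
    "clone_1": [("DNA binding", 1e-30)],
    "clone_2": [("enzyme", 1e-25), ("transporter", 1e-15), ("enzyme", 1e-3)],
    "clone_3": [("signal transducer", 0.5)],
}
go_assignments = assign_go_functions(hits)
```

In this toy example one clone receives a single function, one receives two, and one receives none, which mirrors the three outcomes noted in the text.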

Additional clones.

To further increase the usefulness of the information obtained from the core pancreas clone set, we identified representative genes from various signal transduction pathways relevant to pancreatic development, which resulted in the identification of an additional 231 IMAGE clones (“Pathways” in Table 1). A group of 108 clones relevant to pancreatic development were added to the collection from laboratory stocks (“In-house” in Table 1). Using an alternative approach to obtain PCR products to spot onto microarrays, we designed primer pairs for a complementary group of 153 genes of importance to pancreatic development and amplified PCR products from a cDNA pool derived from intestinal, liver, and pancreas RNA (“Select” in Table 1). To control for the possibility of residual cDNA being contained in the purified PCR products, we performed PCR reactions without primers (cDNA controls). Finally, all PCR products were purified and spotted on the array. Additional controls included 30 housekeeping genes (19) and 8 yeast intergenic sequences in duplicate (Incyte Genomics).

The cDNA inserts of the clone sets were amplified by PCR and analyzed individually by agarose gel electrophoresis. As can be seen in Table 1, the overall rate of PCR success as defined by the presence of a single band on a gel was ∼80%. An additional 15% of the clones were amplified but contained more than one band. Therefore, of the 3,674 clones in the collection, 2,912 are represented as single-band products on the array. To demonstrate the basic characteristics of the PancChip, we labeled the same total pancreas RNA with both Cy3 and Cy5 using the dendrimer method (20) and hybridized both cDNAs on the same array. Idealized “same versus same” hybridization would show all the points falling perfectly on a line with the deviation of the slope from one reflecting the channel-to-channel differences. As can be seen in a scatter plot of the median intensities of this array (Fig. 2A), the intensities correlate closely, with a linear regression R² = 0.9809. Second, we compared two very different RNA samples on the array. In this experiment, islet RNA was labeled with Cy3 and heart RNA was labeled with Cy5. As we expected, the scatter plot of the median intensities clearly showed two populations (Fig. 2B). The larger population with 92% of the differentially expressed spots was labeled primarily with the islet RNA (Cy3 channel). The other population of only 8% of the differentially expressed spots showed hybridization primarily with the heart RNA (Cy5 channel), indicating that indeed our clone set is enriched for cDNAs expressed in pancreatic islets.
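For readers who wish to repeat the same-versus-same check on their own arrays, the regression statistics can be computed as sketched below. The intensities are hypothetical and the function names are illustrative; for a simple least-squares line, the coefficient of determination equals the squared Pearson correlation, which is what the sketch computes.

```python
def _sums_of_squares(x, y):
    """Return (sxx, syy, sxy) about the means of x and y."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("x and y must have equal length of at least 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    return sxx, syy, sxy


def regression_slope(x, y):
    """Least-squares slope of y on x."""
    sxx, _, sxy = _sums_of_squares(x, y)
    if sxx == 0:
        raise ValueError("x has no variance")
    return sxy / sxx


def r_squared(x, y):
    """Coefficient of determination for the least-squares line of y on x."""
    sxx, syy, sxy = _sums_of_squares(x, y)
    if sxx == 0 or syy == 0:
        raise ValueError("x and y must both vary")
    return (sxy * sxy) / (sxx * syy)


# Hypothetical median intensities for the same RNA in the two channels.
cy3 = [100, 200, 400, 800, 1600]
cy5 = [110, 190, 420, 790, 1650]
r2 = r_squared(cy3, cy5)
slope = regression_slope(cy3, cy5)
```

A slope that departs from one points to a channel-to-channel difference, whereas scatter about the line lowers R².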

Developmental time course of pancreatic gene expression.

To demonstrate the value of the 3.4K clone set and the PancChip, we used them to investigate gene expression patterns during pancreatic development in mice. We dissected the pancreata from mice representing developmental stages from E14.5 through adulthood, isolated total RNA from the pancreata, and labeled the RNA for hybridization using a dendrimer labeling system (Genisphere; E. Manduchi, L.M.S., J.E.B., G.R. Grant, K.H.K., C.J.S. Jr., Physiological Genomics, In Press). For E16.5 through adulthood, each sample represents a single individual. Six replicates (each with RNA from a distinct individual) were performed for each time point from E16.5 through adulthood. Due to the low amount of RNA from each pancreatic primordium in E14.5 samples, we pooled several pancreata from individual embryos to prepare one RNA sample. Four replicates of this time point were done (each with pools from different groups of individuals). The experimental samples were all labeled with Cy3. We used a pool of all the samples in the study (except samples from E14.5) as the reference RNA and labeled it with Cy5. The labeled RNA was hybridized to the arrays and scanned. The data were analyzed as the ratios of the intensities of the individual time points for each gene over the intensities of the common control for the same gene, as has been done for other two-channel microarray experiments with multiple comparisons (11,21).

Time series analysis.

In the developmental time course, only the 2,914 genes with a passing PCR score were considered for further analysis. Gene expression levels were normalized, and the normalized data were analyzed for trends over time as described in research design and methods. Three predetermined patterns were of interest: “late” expression (with peak expression at adult), “early” expression (peak is at E14.5), and genes that have peak expression at birth (Fig. 3). Genes with the most extreme vector projections for each trend were identified (see research design and methods). These genes are also listed in supplemental materials (http://www.cbil.upenn.edu/EPConDB/pancChip.html). We used the GO functions, discussed earlier, to categorize the genes with the top vector projection scores for each trend. There is a shift in the distribution of GO functions across the developmental stages examined (Fig. 4). During the early expression period, the predominant group of genes being expressed fall into the “Nucleic acid binding” category. At birth, the predominant group of genes being expressed has “No NR protein similarities.” Finally, in the late period of expression, the predominant group is “Enzyme,” as we would expect since our samples included the exocrine pancreas whose primary function in the adult is the secretion of digestive enzymes.

When we examined the range of intensities across time for a given trend, we found dramatic shifts in the intensities for a gene across the different time points (Fig. 5). The y-axis is the log2 ratio of the average median intensities of the given developmental time point versus the average median intensities of the pooled common control. Due to the log nature of the scale, the changes seen in the early period (panel A) relate to an eightfold scale in the expression of the identified genes when compared with the common control. Included among the early expression genes for nucleic acid binding proteins are 1) a clone with high identity to heterogeneous nuclear ribonucleoprotein A3 (hnrnpA3) and 2) nucleophosmin, as well as transcription factors. Both hnrnpA3 and nucleophosmin are thought to play roles in cell growth and proliferation. In addition to the nucleic acid binding proteins, genes active in membrane transport functions, including thyroid receptor activator molecule (TRAM1) and lysosomal-associated protein transmembrane 4α (LAPTM4A), are expressed in early pancreatic development.
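The relation between the log2 scale described above and fold change can be made concrete with a short sketch. The intensities are hypothetical, and the function names are illustrative rather than taken from the analysis software used in the study.

```python
import math
import statistics


def log2_ratio(sample_medians, control_medians):
    """log2 of mean(sample medians) over mean(control medians)."""
    sample_mean = statistics.mean(sample_medians)
    control_mean = statistics.mean(control_medians)
    if sample_mean <= 0 or control_mean <= 0:
        raise ValueError("mean intensities must be positive")
    return math.log2(sample_mean / control_mean)


def fold_change(log2_value):
    """Convert a log2 ratio back to a linear fold change."""
    return 2.0 ** log2_value


# Hypothetical median spot intensities from six replicate arrays.
sample = [8000, 8200, 7800, 8100, 7900, 8000]   # time point (Cy3)
control = [1000, 1000, 1000, 1000, 1000, 1000]  # common control (Cy5)
value = log2_ratio(sample, control)
```

With these example values the log2 ratio is 3, so a difference of three units on the y-axis corresponds to an eightfold difference in intensity.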

The scale of the graph for the expression-at-birth panel (Fig. 5B) shows an approximate eightfold difference in gene expression between birth and E14.5 and an approximate twofold difference between birth and adulthood. Fifty percent of the genes identified as highly expressed during the perinatal period are genes with no GO function (genes with no known protein similarities and clones with similarity to other uncharacterized ESTs). Other genes identified as highly expressed during the perinatal period include BAT2 (a voltage sensitive calcium channel) and proline oxidase 1.

The late expression proteins show up to a 20-fold change in expression between our earliest time point (E14.5) and adulthood (Fig. 5C). As can be seen on the graph, many of these genes are digestive enzymes one would expect to be expressed in the adult pancreas, including elastase-2 and amylase. Trefoil factor 2 (TFF2), which has previously been shown to be expressed in the pancreas (22–24), also increases expression in the adult versus the earlier stages.

We have assembled a set of clones that represent genes primarily expressed in the pancreas and have used this clone set to construct a glass cDNA microarray as a resource to the diabetes research community. Our core pancreas clone set has been extensively annotated and represents a varied spectrum of cell functions. All cDNAs of the core pancreas clone set have been resequenced, and the majority of the clones were sequence-verified. Most of the nonconfirmed clones are expressed at some point during the developmental stages studied. These clones will, however, be replaced with new cDNAs from Endocrine Pancreas Consortium libraries as they become available.

We have demonstrated the usefulness of the clone set by preparing a glass cDNA microarray, the PancChip, and by profiling pancreatic gene expression patterns from midgestation through adulthood with the PancChip. At the earliest stage in this study (E14.5), nucleic acid binding proteins were the largest group of differentially expressed genes as assigned by GO functions. The nucleic acid binding proteins of the GO function classification include transcription factors and other proteins involved in cell fate decisions. This result fits with expectations of gene expression during embryonic development. During the adult stage of pancreatic development, the largest group of genes being expressed encodes enzymes. Again, this is an expected result that confirms the validity of the PancChip, as we had used total pancreas for our time course, which in the adult consists largely of exocrine tissue whose main function is the secretion of digestive enzymes.

Previous gene profiling experiments made use of commercial arrays to study both type 1 and type 2 diabetes (1,2,5,6,8). None of these arrays were prepared with the intention to focus on diabetes or the pancreas. Because the pancreas clone set is enriched for genes that are expressed in the pancreas or were present in cDNA libraries prepared from pancreatic tissue, it offers the diabetes research community the opportunity to investigate genes of interest at a higher density. Ultimately, it will be desirable to utilize a complete genome-wide cDNA array representing all 20,000–40,000 mammalian genes for all expression profiling experiments, including those related to diabetes. However, until such a resource is available at a low cost, the PancChip will provide a valuable and affordable resource for the diabetes research community. The 3.4K pancreas clone set used to make the PancChip will be distributed through the NIDDK-funded biotechnology centers. Current efforts of the Endocrine Pancreas Consortium are centered on the generation and sequencing of pancreas-specific cDNA libraries from various stages of development from both mouse and human. These new libraries will allow us to dramatically increase the number of clones represented in the clone set and on the PancChip in the future.

FIG. 1.

The distribution of GO functions of proteins represented by the 3,139 core pancreas IMAGE cDNA clone set selected for the PancChip. The GO functions are classifications of cellular functions and were computationally assigned according to the guidelines of the GO Consortium.

FIG. 2.

Scatter plots of the median intensities of total RNA hybridized to the PancChip. A: For a same-versus-same comparison, 2.5 μg of mouse pancreas total RNA labeled with Cy5 (red) and 2.5 μg of mouse pancreas total RNA labeled with Cy3 (green) were hybridized to the PancChip. B: To compare total RNA from two different tissues, 2.5 μg heart RNA labeled with Cy5 and 2.5 μg islet RNA labeled with Cy3 were hybridized to the PancChip. The values for blank spots were removed and the median intensities plotted as measured by ArrayVision.

FIG. 3.

Models of the three predicted gene expression patterns used to predict genes of interest with vector projections. The “early expression” pattern is a pattern defined as a clone with the greatest expression at E14.5, the earliest data obtained in this study. The “expression at birth” is defined as a clone with a pattern of expression, which peaks in the neonate with lower expression in the adult and the embryo. The “late expression” is defined as any clone with the greatest expression in the adult.

FIG. 4.

The median intensity data from six individuals for each time point (four in the E14.5) were analyzed. The genes showing the greatest expression during the “early” period (E14.5), expression at “birth,” and late expression (at adult) were identified using vector projection. The functions of the proteins represented by differentially expressed genes in each stage were characterized using GO functions. The categories for “nucleic acid binding,” “no GO function,” “transporter,” “enzyme,” “ligand binding or carrier,” “structural proteins/molecular chaperones”, and “no NR protein similarities” are shown.

FIG. 5.

The log2 ratio (intensity of Cy3 channel/intensity Cy5 channel) of the mean of the median intensity data from six individuals for each time point (four in the E14.5) for the genes identified by vector projection as having greatest expression at E14.5, the “early” period (A), at “birth” (B), and during “adulthood” (C) are plotted against time. The developmental time points include E14.5, E16.5, E18.5, newborn, p7, and adult. The specific time course of expression for specific genes are labeled and identified by the arrows. A: Heterogeneous nuclear ribonucleoprotein A3 (hnrnp A3), lysosomal-associated protein transmembrane 4α (LAPTM4A), thyroid receptor activator molecule (TRAM1), and nucleophosmin (Npm1). B: BAT2 (a voltage-sensitive calcium channel) and proline oxidase 1. C: Trefoil factor 2, elastase, and amylase.

TABLE 1

The clone sets used to make the PancChip

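| Clone set | Clones | Source | % PCR success | Successes | Total reactions |
|---|---|---|---|---|---|
| Pancreas clone set | 3,139 | Incyte Genomics | 81 | 2,546 | 3,139 |
| Housekeeping | 30 | Research Genetics | 90 | 27 | 30 |
| Pathways | 231 | Incyte Genomics | 68 | 157 | 231 |
| Yeast controls | 16 | Incyte Genomics | 63 | 10 | 16 |
| In-house | 108 | In-house | 49 | 53 | 108 |
| Select | 153 | In-house | 79 | 119 | 150 |
| Blanks | 153 | | | | |
| Anchors | 10 | | | | |
| Total | 3,840 | | 79 | 2,912 | 3,674 |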
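
The percentages in Table 1 can be checked against the success and reaction counts. The following minimal sketch assumes that % PCR success is successes divided by total reactions, rounded half up to the nearest integer; the variable and function names are illustrative only.

```python
import math

# (successes, total reactions) for each clone set, copied from Table 1.
pcr_results = {
    "Pancreas clone set": (2546, 3139),
    "Housekeeping": (27, 30),
    "Pathways": (157, 231),
    "Yeast controls": (10, 16),
    "In-house": (53, 108),
    "Select": (119, 150),
}


def percent_success(successes, reactions):
    """Percent of reactions that succeeded, rounded half up (assumed rule)."""
    return math.floor(100.0 * successes / reactions + 0.5)


# Per-set percentages, then the overall figure from the column totals.
per_set = {name: percent_success(s, r) for name, (s, r) in pcr_results.items()}
total_successes = sum(s for s, _ in pcr_results.values())
total_reactions = sum(r for _, r in pcr_results.values())
overall = percent_success(total_successes, total_reactions)
```

Under this rounding assumption the computed values agree with the table, including the overall 79% from 2,912 successes in 3,674 reactions.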
TABLE 2

GO function assignments of the core pancreas clone set

| GO function | Number of clones | Percent of total |
|---|---|---|
| Enzyme | 815 | 26.9 |
| Ligand binding or carrier | 526 | 17.3 |
| Nucleic acid binding | 463 | 15.3 |
| Signal transducer | 367 | 12.1 |
| Cell adhesion molecule | 199 | 6.6 |
| Transporter | 197 | 6.5 |
| Structural protein | 194 | 6.4 |
| Motor | 86 | 2.8 |
| Chaperone | 53 | 1.8 |
| Microtubule binding | 33 | 1.1 |
| Enzyme activator | 21 | 0.7 |
| Cell cycle regulator | 20 | 0.6 |
| Enzyme inhibitor | 17 | 0.6 |
| Defense/immunity protein | 16 | 0.5 |
| Apoptosis regulator | 13 | 0.4 |
| Antioxidant | | 0.2 |
| Cytoskeletal regulator | | 0.1 |
| Protein tagging | | 0.1 |
| Total | 3,029 | 100 |

We gratefully acknowledge support in the form of NIDDK Grant 56947 to K.H.K. and NIDDK Grant 56954 to A.M.P. D.M. acknowledges support from the Juvenile Diabetes Research Foundation.

We thank Phillip Phuc Le for computer support and for the design of the web interfaces and Jian Wang for initial clone selection. We also thank Elisabetta Manduchi and members of the Computational Biology and Informatics Laboratory (CBIL) in the Center for BioInformatics for helpful discussions.

1. Cardozo AK, Heimberg H, Heremans Y, Leeman R, Kutlu B, Kruhoffer M, Orntoft T, Eizirik DL: A comprehensive analysis of cytokine-induced and nuclear factor-kappa B-dependent genes in primary rat pancreatic beta-cells. J Biol Chem 276:48879–48886, 2001
2. Zimmer Y, Milo-Landesman D, Svetlanov A, Efrat S: Genes induced by growth arrest in a pancreatic β cell line: identification by analysis of cDNA arrays. FEBS Lett 457:65–70, 1999
3. Joussen AM, Huang S: Möglichkeiten einer breitspektrumanalyse von genexpressionmustern mittels cDNA-arrays: gene expression profiling using cDNA microarrays. Der Ophthalmologe 98:568–573, 2001
4. Boel E, Albrektsen T, Fleckner J, Selmer J: Modulation of metabolism through transcriptional control has created new treatment opportunities for type 2 diabetes. Curr Pharm Biotechnol 1:63–71, 2000
5. Tobe K, Suzuki R, Aoyama M, Yamauchi T, Kamon J, Kubota N, Terauchi Y, Matsui J, Akanuma Y, Kimura S, Tanaka J, Abe M, Ohsumi J, Nagai R, Kadowaki T: Increased expression of the sterol regulatory element-binding protein-1 gene in insulin receptor substrate-2-/- mouse liver. J Biol Chem 276:38337–38340, 2001
6. Aitman T, Glazier A, Wallace C, Cooper L, Norsworthy P, Wahid F, Al-Majali K, Trembling P, Mann C, Shoulders C, Graf D, St Lezin E, Kurtz T, Kren V, Pravenec M, Ibrahimi A, Abumrad N, Stanton L, Scott J: Identification of Cd36 (Fat) as an insulin-resistance gene causing defective fatty acid and glucose metabolism in hypertensive rats. Nat Genet 21:76–83, 1999
7. Nadler ST, Stoehr JP, Schuler KL, Tanimoto G, Yandell BS, Attie AD: The expression of adipogenic genes is decreased in obesity and diabetes mellitus. Proc Natl Acad Sci U S A 97:11371–11376, 2000
8. Nadler ST, Attie AD: Please pass the chips: genomic insights into obesity and diabetes. J Nutr 131:2078–2081, 2001
9. Hegde P, Qi R, Abernathy K, Cheryl G, Dharap S, Gaspard R, Earle-Hughes J, Snesrud E, Lee N, Quackenbush J: A concise guide to cDNA microarray analysis. Biotechniques 29:548–550, 2000
10. Chomczynski P, Sacchi N: Single-step method of RNA isolation by acid guanidinium thiocyanate-phenol-chloroform extraction. Anal Biochem 162:156–159, 1987
11. Dudoit S, Yang YH, Callow MJ, Speed TP: Statistical methods for identifying differentially expressed genes in replicated cDNA microarray experiments [article online]. Available from http://www.stat.berkeley.edu/tech-reports/index.html. Accessed September 2000
12. Yang YH, Dudoit S, Luu P, Speed T: Normalization of cDNA microarray data. In Proceedings of SPIE, Microarrays: Optical Technologies and Informatics. Bittner ML, Chen Y, Dorsel AN, Dougherty ER, Eds. 4266:141–152, 2001
13. Permutt M, Koranyi L, Keller K, Lacy P, Scharp D, Mueckler M: Cloning and functional expression of a human pancreatic islet glucose-transporter cDNA. Proc Natl Acad Sci U S A 86:8688–8692, 1989
14. Takeda J, Yano H, Eng S, Zeng Y, Bell G: A molecular inventory of human pancreatic islets: sequence analysis of 1000 cDNA clones. Hum Mol Genet 2:1793–1798, 1993
15. Ferrer J, Wasson J, Schoor K, Mueckler M, Donis-Keller H, Permutt M: Mapping novel pancreatic islet genes to human chromosomes. Diabetes 46:386–392, 1997
16. Davidson SB, Crabtree J, Brunk BP, Schug J, Tannen V, Overton GC, Stoeckert CJ Jr: K2/Kleisli and GUS: experiments in integrated access to genomic data sources. IBM Systems Journal 40:512–531, 2001
17. Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology. Nat Genet 25:25, 2000
18. Ashburner M, Ball CA, Blake JA, Butler H, Cherry JM, Corradi J, Dolinski K, Eppig JT, Harris M, Hill DP, Lewis S, Marshall B, Mungall C, Reiser L, Rhee S, Richardson JE, Richter J, Ringwald M, Rubin GM, Sherlock G, Yoon J: Creating the gene ontology resource: design and implementation. Genome Res 11:1425–1433, 2001
19. Warrington J, Nair A, Mahadevappa M, Tsyganskaya M: Comparison of human adult and fetal expression and identification of 535 housekeeping/maintenance genes. Physiol Genomics 2:143–147, 2000
20. Stears RL, Getts RC, Gullans SR: A novel, sensitive detection system for high-density microarrays using dendrimer technology. Physiol Genomics 3:93–99, 2000
21. Fellenberg K, Hauser NC, Brors B, Neutzner A, Hoheisel JD, Vingron M: Correspondence analysis applied to microarray data. Proc Natl Acad Sci U S A 98:10781–10786, 2001
22. Tomasetto C, Rio M, Gautier C, Wolf C, Hareuveni M, Chambon P, Lathe R: hSP, the domain-duplicated homolog of pS2 protein, is co-expressed with pS2 in stomach but not in breast carcinoma. EMBO J 9:407–414, 1990
23. Lefèbvre O, Wolf C, Kedinger M, Chenard M, Tomasetto C, Chambon P, Rio MC: The mouse one P-domain (pS2) and two domain (mSP) genes exhibit distinct patterns of expression. J Cell Biol 122:191–198, 1993
24. Ribieras S, Lefèbvre O, Tomasetto C, Rio MC: Mouse trefoil factor genes: genomic organization, sequences and methylation analyses. Gene 266:67–75, 2001
25. Schug J, Diskins S, Mazzarelli J, Brunk BP, Stoeckert CJ: Predicting gene ontology functions from ProDom and CDD protein domains. Genome Res 12.

Address correspondence and reprint requests to Klaus H. Kaestner, Department of Genetics, University of Pennsylvania, 415 Curie Blvd., Philadelphia, PA 19104. E-mail: kaestner@mail.med.upenn.edu.

Received for publication 2 January 2002 and accepted in revised form 6 May 2002. Posted on the World Wide Web at http://diabetes.diabetesjournals.org/rapidpubs.shtml on 7 June 2002.

DoTS, Database of Transcribed Sequences; EST, expressed sequence tag; GO, gene ontology; GUS, Genomics Unified Schema; NIDDK, National Institute of Diabetes and Digestive and Kidney Diseases.