With rapid advances in genetic testing and in epigenomic and transcriptomic profiling, many landmark consortia have generated rich multimodal data on the human genome. Among them, consortia focused on human pancreas and islet research have greatly enhanced our understanding of different forms of diabetes. Despite providing new insights at unprecedented scale and depth, current islet and pancreas data are fragmented across different portals, lack common metadata and ontology standards, pass through quality control pipelines of varying maturity, and are disconnected from clinical research communities. To overcome these challenges, we developed Genomic Knowledgebase (GenomicKB), a graph database that integrates human genomic data and allows researchers to explore and investigate the human genome, epigenome, transcriptome, and 4D nucleome. The database uses a knowledge graph to consolidate genomic datasets and annotations from over 30 consortia and portals, resulting in 347 million entities, 1.36 billion relations, and 3.9 billion properties, and it covers extensive pancreas- and diabetes-related data, including GWAS, disease ontology, and eQTL. Compared with traditional tabular data stored in separate data portals, GenomicKB emphasizes the relations among genomic entities and intuitively connects isolated data entries. With GenomicKB, complicated analyses across multiple modalities are transformed into coding-free queries, facilitating data-driven discoveries. Our knowledge graph structure is machine learning-ready and can be coupled with evolving artificial intelligence algorithms. In the future, GenomicKB will include pancreas and islet imaging and physiology datasets to build a comprehensive resource, PanKGraph, that will accelerate progress toward understanding diabetes etiology and pathophysiology, contribute to the development of new diagnostic tools, and inform transformative changes in diabetes prevention and care.
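
To illustrate the contrast between separate tables and a graph of connected entities, the following is a minimal Python sketch of a property graph with a multi-hop lookup. It is not GenomicKB's schema, query language, or interface. The class `TinyGraph`, the relation names `has_gwas_variant` and `eqtl_of`, and all identifiers such as `SNP_A` and `GENE_X` are hypothetical placeholders chosen for illustration, not real data.

```python
from collections import defaultdict


class TinyGraph:
    """Toy in-memory property graph (hypothetical; not GenomicKB's schema)."""

    def __init__(self):
        self.nodes = {}                 # node id -> properties, including a label
        self.edges = defaultdict(list)  # source id -> [(relation, target id, properties)]

    def add_node(self, node_id, label, **props):
        self.nodes[node_id] = {"label": label, **props}

    def add_edge(self, src, relation, dst, **props):
        self.edges[src].append((relation, dst, props))

    def follow(self, start, relations):
        """Walk from `start` along the given sequence of relation names."""
        frontier = [start]
        for wanted in relations:
            frontier = [
                dst
                for node in frontier
                for (rel, dst, _props) in self.edges.get(node, [])
                if rel == wanted
            ]
        return frontier


# Placeholder entities: one trait, three variants, two genes.
g = TinyGraph()
g.add_node("TRAIT_T", "Trait")
for snp in ("SNP_A", "SNP_B", "SNP_C"):
    g.add_node(snp, "Variant")
for gene in ("GENE_X", "GENE_Y"):
    g.add_node(gene, "Gene")

# GWAS-style links from the trait to variants (placeholder relations).
g.add_edge("TRAIT_T", "has_gwas_variant", "SNP_A")
g.add_edge("TRAIT_T", "has_gwas_variant", "SNP_B")

# eQTL-style links from variants to genes (placeholder relations).
g.add_edge("SNP_A", "eqtl_of", "GENE_X")
g.add_edge("SNP_C", "eqtl_of", "GENE_Y")

# Two-hop question: which genes are reached from the trait via a shared variant?
genes = g.follow("TRAIT_T", ["has_gwas_variant", "eqtl_of"])
print(genes)
```

In a tabular setting the same question would require joining a GWAS table with an eQTL table on the variant column. In the graph form it is a single traversal along two named relations, which is the kind of connection between otherwise isolated entries that the abstract describes.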

Disclosure

F. Feng: None. F. Tang: None. Y. Gao: None. D. Zhu: None. T. Li: None. S. Yang: None. Y. Yao: None. Y. Huang: None. D.C. Saunders: None. S.C. Parker: Research Support; Pfizer Inc. J. Cartailler: None. M. Brissova: None. J. Liu: None.

Readers may use this article as long as the work is properly cited, the use is educational and not for profit, and the work is not altered. More information is available at http://www.diabetesjournals.org/content/license.