Graph databases in systems biology: a systematic review

Dagmar Waltemath Irina Balaur Reinhard Schneider + 10 authors

Abstract

Graph databases are becoming increasingly popular across scientific disciplines, being highly suitable for storing and connecting complex heterogeneous data. In systems biology, they are used as a backend solution for biological data repositories, ontologies, networks, pathways, and knowledge graph databases. In this review, we analyse all publications using or mentioning graph databases retrieved from PubMed and PubMed Central full-text search, focusing on the top 16 available graph databases. Publications are categorized according to their domain and application, focusing on pathway and network biology and relevant ontologies and tools. We detail different approaches and highlight the advantages of outstanding resources, such as UniProtKB, Disease Ontology, and Reactome, which provide graph-based solutions. We discuss ongoing efforts of the systems biology community to standardize and harmonize knowledge graph creation and the maintenance of integrated resources. Outlining prospects, including the use of graph databases as a way of communication between biological data repositories, we conclude that efficient design, querying, and maintenance of graph databases will be key for knowledge generation in systems biology and other research fields with heterogeneous data.

Related Scientific Articles

An overview of graph databases and their applications in the biomedical domain

Santiago Timón-Reina R. Martínez-Tomás M. Rincón

1 January 2021

Over the past couple of decades, the explosion of densely interconnected data has stimulated the research, development and adoption of graph database technologies. From early graph models to more recent native graph databases, the landscape of implementations has evolved to cover enterprise-ready requirements. Because of the interconnected nature of its data, the biomedical domain has been one of the early adopters of graph databases, enabling more natural representation models and better data integration workflows, exploration and analysis facilities. In this work, we survey the literature to explore the evolution and performance of graph databases and how the most recent solutions are applied in the biomedical domain, compiling a great variety of use cases. With this evidence, we conclude that the available graph database management systems are fit to support data-intensive, integrative applications, targeted at both basic research and exploratory tasks closer to the clinic.

neo4jsbml: import systems biology markup language data into the graph database Neo4j

T. Duigou Sandra Dérozier J. Faulon + 1 other

16 January 2024

Systems Biology Markup Language (SBML) has emerged as a standard for representing biological models, facilitating model sharing and interoperability. It stores many types of data and complex relationships, complicating data management and analysis. Traditional database management systems struggle to effectively capture these complex networks of interactions within biological systems. Graph-oriented databases perform well in managing interactions between different entities. We present neo4jsbml, a new solution that bridges the gap between the Systems Biology Markup Language data and the Neo4j database, for storing, querying and analyzing data. The Systems Biology Markup Language organizes biological entities in a hierarchical structure, reflecting their interdependencies. The inherent graph structure represents these hierarchical relationships, offering a natural and efficient means of navigating and exploring the model’s components. Neo4j is an excellent solution for handling this type of data. By representing entities as nodes and their relationships as edges, Cypher, Neo4j’s query language, efficiently traverses this type of graph representing complex biological networks. We have developed neo4jsbml, a Python library for importing Systems Biology Markup Language data into a Neo4j database using a user-defined schema. By leveraging Neo4j’s graph database technology, exploration of complex biological networks becomes intuitive and information retrieval efficient. Only the desired data is loaded into the Neo4j database. neo4jsbml is user-friendly and can become a useful new companion for visualizing and analyzing metabolic models through the Neo4j graph database. neo4jsbml is open source software and available at https://github.com/brsynth/neo4jsbml.
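The idea of storing model entities as nodes and their relationships as edges can be illustrated with a minimal in-memory sketch. It uses neither Neo4j nor neo4jsbml; the node ids, labels, relation names, and the `downstream_species` helper are hypothetical and chosen only for illustration, and the two glycolysis reactions are simplified (cofactors omitted). A breadth-first traversal stands in for the kind of path query a graph database would answer.

```python
from collections import deque

# Toy property graph. All ids, labels, and relation names are
# hypothetical; this is not the schema used by neo4jsbml.
nodes = {
    "glc": {"label": "Species", "name": "glucose"},
    "g6p": {"label": "Species", "name": "glucose 6-phosphate"},
    "f6p": {"label": "Species", "name": "fructose 6-phosphate"},
    "R_HEX1": {"label": "Reaction", "name": "hexokinase"},
    "R_PGI": {"label": "Reaction", "name": "phosphoglucose isomerase"},
}

# Directed edges stored as (source, relation, target) triples.
edges = [
    ("glc", "IS_REACTANT", "R_HEX1"),
    ("R_HEX1", "HAS_PRODUCT", "g6p"),
    ("g6p", "IS_REACTANT", "R_PGI"),
    ("R_PGI", "HAS_PRODUCT", "f6p"),
]


def neighbours(node_id):
    """Return the targets of all edges leaving node_id."""
    return [target for source, _, target in edges if source == node_id]


def downstream_species(start):
    """Breadth-first search for Species nodes reachable from start."""
    seen = {start}
    queue = deque([start])
    found = []
    while queue:
        current = queue.popleft()
        for nxt in neighbours(current):
            if nxt not in seen:
                seen.add(nxt)
                queue.append(nxt)
                if nodes[nxt]["label"] == "Species":
                    found.append(nxt)
    return found


print(downstream_species("glc"))  # ['g6p', 'f6p']
```

In a graph database the same question would typically be expressed as a path pattern in the query language rather than as hand-written traversal code.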

GREG—studying transcriptional regulation using integrative graph databases

Antonio Mora Xiaowei Huang Chengshu Xie + 1 other

1 January 2020

A gene regulatory process is the result of the concerted action of transcription factors, co-factors, regulatory non-coding RNAs (ncRNAs) and chromatin interactions. Therefore, the combination of protein–DNA, protein–protein, ncRNA–DNA, ncRNA–protein and DNA–DNA data in a single graph database offers new possibilities regarding generation of biological hypotheses. GREG (The Gene Regulation Graph Database) is an integrative database and web resource that allows the user to visualize and explore the network of all above-mentioned interactions for a query transcription factor, long non-coding RNA, genomic range or DNA annotation, as well as extracting node and interaction information, identifying connected nodes and performing advanced graphical queries directly on the regulatory network, in a simple and efficient way. In this article, we introduce GREG together with some application examples (including exploratory research of Nanog’s regulatory landscape and the etiology of chronic obstructive pulmonary disease), which we use as a demonstration of the advantages of using graph databases in biomedical research. Database URL: https://mora-lab.github.io/projects/greg.html, www.moralab.science/GREG/

AIMedGraph: a comprehensive multi-relational knowledge graph for precision medicine

Xueping Quan Linghua Yan W. Cai + 2 others

1 January 2023

The development of high-throughput molecular testing techniques has enabled the large-scale exploration of the underlying molecular causes of diseases and the development of targeted treatment for specific genetic alterations. However, knowledge to interpret the impact of genetic variants on disease or treatment is distributed in different databases, scientific literature studies and clinical guidelines. AIMedGraph was designed to comprehensively collect and interrogate standardized information about genes, genetic alterations and their therapeutic and diagnostic relevance and build a multi-relational, evidence-based knowledge graph. The graph database Neo4j was used to represent precision medicine knowledge as nodes and edges in AIMedGraph. Entities in the current release include 30 340 diseases/phenotypes, 26 140 genes, 187 541 genetic variants, 2821 drugs, 15 125 clinical trials and 797 911 supporting literature studies. Edges in this release cover 621 731 drug interactions, 9279 drug susceptibility impacts, 6330 pharmacogenomics effects, 30 339 variant pathogenicity and 1485 drug adverse reactions. The knowledge graph technique enables hidden knowledge inference and provides insight into potential disease or drug molecular mechanisms. Database URL: http://aimedgraph.tongshugene.net:8201

PloverDB: A high-performance platform for serving biomedical knowledge graphs as standards-compliant web APIs

Eric W. Deutsch Amy K. Glen Stephen A. Ramsey

10 March 2025

Knowledge graphs are increasingly being used to integrate heterogeneous biomedical knowledge and data. General-purpose graph database management systems such as Neo4j are often used to host and search knowledge graphs, but such tools come with overhead and leave biomedical-specific standards compliance and reasoning to the user. Interoperability across biomedical knowledge bases and reasoning systems necessitates the use of standards such as those adopted by the Biomedical Data Translator consortium. We present PloverDB, a comprehensive software platform for hosting and efficiently serving biomedical knowledge graphs as standards-compliant web application programming interfaces. In addition to fundamental back-end knowledge reasoning tasks, PloverDB automatically handles entity resolution, exposure of standardized metadata and test data, and multiplexing of knowledge graphs, all in a single platform designed specifically for efficient query answering and ease of deployment. PloverDB increases data accessibility and utility by allowing data providers to quickly serve their biomedical knowledge graphs as standards-compliant web services. Availability and Implementation: PloverDB’s source code and technical documentation are publicly available under an MIT License at github.com/RTXteam/PloverDB.
