Bringing Graph Databases and Network Visualization Together (Dagstuhl Seminar 22031)

Da Yan, Hsiang-Yun Wu, Juan Sequeda + 1 more author

Abstract

This report documents the program and the outcomes of Dagstuhl Seminar 22031 “Bringing Graph Databases and Network Visualization Together”. Due to the ongoing restrictions caused by the COVID-19 pandemic, this purely on-site seminar had a reduced number of participants. Twenty-two researchers and practitioners from the Network Visualization and Graph Database communities met to initiate collaborative work and exchange between the two communities. The seminar served to establish a common understanding of the state of the art and the terminology in both communities.

Related Scientific Articles

The World of Graph Databases from An Industry Perspective

Yuanyuan Tian

23 November 2022

Rapidly growing social networks and other graph data have created a high demand for graph technologies in the market. As a result, a plethora of graph databases, systems, and solutions have emerged. On the other hand, graphs have long been a well-studied area in the database research community. Despite the numerous surveys on various graph research topics, there is a lack of surveys on graph technologies from an industry perspective. The purpose of this paper is to provide the research community with an industrial perspective on the graph database landscape, so that graph researchers can better understand the industry trends and the challenges that the industry is facing, and work on solutions to help address these problems.

Understanding Graph Databases: A Comprehensive Tutorial and Survey

Oluwatosin Agbaakin, Sydney Anuyah, Victor Bolade

15 November 2024

This tutorial serves as a comprehensive guide for understanding graph databases, focusing on the fundamentals of graph theory while showcasing practical applications across various fields. It starts by introducing foundational concepts and delves into the structure of graphs through nodes and edges, covering different types such as undirected, directed, weighted, and unweighted graphs. Key graph properties, terminologies, and essential algorithms for network analysis are outlined, including Dijkstra's shortest path algorithm and methods for calculating node centrality and graph connectivity. The tutorial highlights the advantages of graph databases over traditional relational databases, particularly in efficiently managing complex, interconnected data. It examines leading graph database systems such as Neo4j, Amazon Neptune, and ArangoDB, emphasizing their unique features for handling large datasets. Practical instructions on graph operations using NetworkX and Neo4j are provided, covering node and edge creation, attribute assignment, and advanced queries with Cypher. Additionally, the tutorial explores common graph visualization techniques using tools like Plotly and Neo4j Bloom, which enhance the interpretation and usability of graph data. It also delves into community detection algorithms, including the Louvain method, which facilitates clustering in large networks. Finally, the paper concludes with recommendations for researchers interested in exploring the vast potential of graph technologies.

An overview of graph databases and their applications in the biomedical domain

Santiago Timón-Reina, R. Martínez-Tomás, M. Rincón

1 January 2021

Over the past couple of decades, the explosion of densely interconnected data has stimulated the research, development and adoption of graph database technologies. From early graph models to more recent native graph databases, the landscape of implementations has evolved to cover enterprise-ready requirements. Because of the interconnected nature of its data, the biomedical domain has been one of the early adopters of graph databases, enabling more natural representation models and better data integration workflows, exploration and analysis facilities. In this work, we survey the literature to explore the evolution and performance of graph databases and how the most recent solutions are applied in the biomedical domain, compiling a great variety of use cases. With this evidence, we conclude that the available graph database management systems are fit to support data-intensive, integrative applications, targeted at both basic research and exploratory tasks closer to the clinic.

Graph data warehousing

A. Ghrab

2020

Over the last decade, we have witnessed the emergence of networks in a wide spectrum of application domains, ranging from social and information networks to biological and transportation networks. Graphs provide a solid theoretical foundation for modeling complex networks and for revealing valuable insights from both the network structure and the data embedded within its entities. As business and social environments become increasingly complex and interconnected, graphs have become a widespread abstraction at the core of the information infrastructure supporting those environments. Modern information systems consist of a large number of sophisticated and interacting business entities that naturally form graphs. In particular, integrating graphs into data warehouse systems has received a lot of interest from both academia and industry. Indeed, data warehouses are the enterprise's central information repository, and are critical for proper decision support and future planning. Graph warehousing is emerging as the field that extends current information systems with graph management and analytics capabilities. Many approaches have been proposed to address the graph data warehousing challenge. These efforts laid the foundation for multidimensional modeling and analysis of graphs. However, most of the proposed approaches only partially tackle the graph warehousing problem, either by being restricted to simple abstractions such as homogeneous graphs or by ignoring important topics such as multidimensional integrity constraints and dimension hierarchies. In this dissertation, we conduct a systematic study of the graph data warehousing topic, and address the key challenges of database and multidimensional modeling of graphs. We first propose GRAD, a new graph database model specifically tuned for warehousing and OLAP analytics. GRAD aims to provide analysts with a set of simple, well-defined, and adaptable conceptual components to support rich semantics and perform complex analysis on graphs.
Then, we define the multidimensional concepts for heterogeneous attributed graphs and highlight the new types of measures that could be derived. We project this multidimensional model on property graphs and explore how to extract the candidate multidimensional concepts and build graph cubes. Then, we extend the multidimensional model by integrating GRAD and show how graph modeling based on GRAD facilitates multidimensional modeling, and enables supporting dimension hierarchies and building new types of OLAP cubes on graphs. Afterwards, we present TopoGraph, a graph data warehousing framework that extends current graph warehousing models with new types of cubes and queries combining graph-oriented and OLAP querying. TopoGraph goes beyond traditional OLAP cubes, which process value-based grouping of tables, by considering in addition the topological properties of the graph elements. And it goes beyond current graph warehousing models by proposing new types of graph cubes. These cubes embed a rich repertoire of measures that could be represented with numerical values, with entire graphs, or as a combination of them. Finally, we propose an architecture of the graph data warehouse and describe its main building blocks and the remaining gaps. The various components of the graph warehousing framework can be effectively leveraged as a foundation for designing and building industry-grade graph data warehouses. We believe that our research in this thesis brings us a step closer towards a better understanding of graph warehousing. Yet, the models and framework we proposed are the tip of the iceberg. The marriage of graph and warehousing technologies will bring many exciting research opportunities, which we briefly discuss at the end of the thesis.

Graph databases in systems biology: a systematic review

Dagmar Waltemath, Irina Balaur, Reinhard Schneider + 10 others

23 September 2024

Graph databases are becoming increasingly popular across scientific disciplines, being highly suitable for storing and connecting complex heterogeneous data. In systems biology, they are used as a backend solution for biological data repositories, ontologies, networks, pathways, and knowledge graph databases. In this review, we analyse all publications using or mentioning graph databases retrieved from PubMed and PubMed Central full-text search, focusing on the top 16 available graph databases. Publications are categorized according to their domain and application, focusing on pathway and network biology and relevant ontologies and tools. We detail different approaches and highlight the advantages of outstanding resources, such as UniProtKB, Disease Ontology, and Reactome, which provide graph-based solutions. We discuss ongoing efforts of the systems biology community to standardize and harmonize knowledge graph creation and the maintenance of integrated resources. Outlining prospects, including the use of graph databases as a way of communication between biological data repositories, we conclude that efficient design, querying, and maintenance of graph databases will be key for knowledge generation in systems biology and other research fields with heterogeneous data.
