Neural Graph Reasoning: Complex Logical Query Answering Meets Graph Databases

Mikhail Galkin Hongyu Ren J. Leskovec + 2 authors

Abstract

Complex logical query answering (CLQA) is a recently emerged task in graph machine learning that goes beyond simple one-hop link prediction to multi-hop logical reasoning over massive, potentially incomplete graphs in a latent space. The task has gained significant traction in the community; numerous works have expanded the field along theoretical and practical axes to tackle different types of complex queries and graph modalities with efficient systems. In this paper, we provide a holistic survey of CLQA with a detailed taxonomy studying the field from multiple angles, including graph types (modality, reasoning domain, background semantics), modeling aspects (encoder, processor, decoder), supported queries (operators, patterns, projected variables), datasets, evaluation metrics, and applications. Refining the CLQA task, we introduce the concept of Neural Graph Databases (NGDBs). Extending the idea of graph databases (graph DBs), an NGDB consists of a Neural Graph Storage and a Neural Query Engine. Inside the Neural Graph Storage, we design a graph store and a feature store, and further embed information in a latent embedding store using an encoder. Given a query, the Neural Query Engine learns how to perform query planning and execution in order to efficiently retrieve the correct results by interacting with the Neural Graph Storage. Compared with traditional graph DBs, NGDBs allow for flexible and unified modeling of features in diverse modalities using the embedding store. Moreover, when the graph is incomplete, they can provide robust retrieval of answers that a normal graph DB cannot recover. Finally, we point out promising directions, unsolved problems, and applications of NGDBs for future research.
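To make the contrast with a traditional graph DB concrete, the following is a minimal toy sketch, not the method of the paper. It answers the two-hop query "who works at a company based in Berlin?" twice: once over observed edges only, and once with the scores of a link predictor standing in for the learned component. The entity names, the `PREDICTED` scores, and the functions `edge_score`, `project`, and `answer` are all hypothetical; a real system would use learned embeddings rather than a lookup table.

```python
# Observed edges (head, relation, tail); the graph is deliberately incomplete:
# the fact that initech is based in berlin is missing.
OBSERVED = {
    ("alice", "works_at", "acme"),
    ("bob", "works_at", "acme"),
    ("acme", "based_in", "berlin"),
    ("carol", "works_at", "initech"),
}

# Hypothetical link-predictor output: plausibility of edges NOT in the graph.
PREDICTED = {("initech", "based_in", "berlin"): 0.8}

ENTITIES = sorted({e for h, _, t in OBSERVED for e in (h, t)})


def edge_score(head, relation, tail, use_predictions):
    """Return 1.0 for observed edges, a predicted score otherwise (or 0.0)."""
    if (head, relation, tail) in OBSERVED:
        return 1.0
    if use_predictions:
        return PREDICTED.get((head, relation, tail), 0.0)
    return 0.0


def project(fuzzy_set, relation, use_predictions):
    """Inverse relation projection over a fuzzy set of tail entities.

    Each candidate head gets the best product of (tail score * edge score).
    """
    out = {}
    for tail, tail_score in fuzzy_set.items():
        for head in ENTITIES:
            score = tail_score * edge_score(head, relation, tail, use_predictions)
            if score > out.get(head, 0.0):
                out[head] = score
    return out


def answer(use_predictions):
    """Two-hop query: ?person works_at ?company, ?company based_in berlin."""
    companies = project({"berlin": 1.0}, "based_in", use_predictions)
    return project(companies, "works_at", use_predictions)


symbolic_answers = answer(use_predictions=False)  # observed edges only
scored_answers = answer(use_predictions=True)     # also uses predicted edges
```

In this sketch the symbolic traversal returns only `alice` and `bob`, while the scored variant additionally ranks `carol` with a lower score because her path relies on a predicted edge. The product of scores along a path is one of several possible choices; taking the minimum would be an equally reasonable assumption here.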

Related Scientific Articles

Top Ten Challenges Towards Agentic Neural Graph Databases

Lei Chen Tianshi ZHENG Hang Yin + 16 others

24 January 2025

Graph databases (GDBs) like Neo4j and TigerGraph excel at handling interconnected data but lack advanced inference capabilities. Neural Graph Databases (NGDBs) address this by integrating Graph Neural Networks (GNNs) for predictive analysis and reasoning over incomplete or noisy data. However, NGDBs rely on predefined queries and lack autonomy and adaptability. This paper introduces Agentic Neural Graph Databases (Agentic NGDBs), which extend NGDBs with three core functionalities: autonomous query construction, neural query execution, and continuous learning. We identify ten key challenges in realizing Agentic NGDBs, including semantic unit representation, abductive reasoning, scalable query execution, and integration with foundation models such as large language models (LLMs). By addressing these challenges, Agentic NGDBs can enable intelligent, self-improving systems for modern data-driven applications, paving the way for adaptable and autonomous data management solutions.

Federated Neural Graph Databases

Qi Hu Jianxin Li Zihao Wang + 6 others

22 February 2024

The increasing demand for large-scale language models (LLMs) has highlighted the importance of efficient data retrieval mechanisms. Neural graph databases (NGDBs) have emerged as a promising approach to storing and querying graph-structured data in neural space, enabling the retrieval of relevant information for LLMs. However, existing NGDBs are typically designed to operate on a single graph, limiting their ability to reason across multiple graphs. Furthermore, the lack of support for multi-source graph data in existing NGDBs hinders their ability to capture the complexity and diversity of real-world data. In many applications, data is distributed across multiple sources, and the ability to reason across these sources is crucial for making informed decisions. This limitation is particularly problematic when dealing with sensitive graph data, as directly sharing and aggregating such data poses significant privacy risks. As a result, many applications that rely on NGDBs are forced to choose between compromising data privacy or sacrificing the ability to reason across multiple graphs. To address these limitations, we propose Federated Neural Graph Database (FedNGDB), a novel framework that enables reasoning over multi-source graph-based data while preserving privacy. FedNGDB leverages federated learning to collaboratively learn graph representations across multiple sources, enriching relationships between entities and improving the overall quality of the graph data. Unlike existing methods, FedNGDB can handle complex graph structures and relationships, making it suitable for various downstream tasks.

Query cost estimation in graph databases via emphasizing query dependencies by using a neural reasoning network

P. Li Jiong Yu Tiquan Gu + 3 others

26 May 2023

With the increasing complexity of graph queries, query cost estimation has become a key challenge in graph databases. Accurate estimation results are critical for database administrators or database management systems to perform query processing or optimization tasks. An efficient and accurate estimation model can improve the estimation quality and make the produced results credible. Although learning‐based methods have been applied in query cost estimation, most of them are directed at relational queries and cannot be directly used for graph queries. Furthermore, most estimation approaches focus on the correlations between predicates or columns. The dependencies between query schema and query filter conditions and the correlation between query schema are ignored. In this study, we construct a novel deep learning model composed of reasoning and retrieval processes that can accurately capture the potential logical relationships in graph queries. This solves the above problems to some extent. In addition, we propose a query estimation framework that divides the estimation task into query workload generation, training data collection, feature extraction and encoding, and estimation model construction. The results of the experiment on real‐world datasets show that our estimation model can improve the estimation quality and outperforms other compared deep learning models in terms of estimation accuracy.

Comparative Analysis of Logic Reasoning and Graph Neural Networks for Ontology-Mediated Query Answering With a Covering Axiom

Nikita Severin Ilya Makarov Olga Gerasimova

2023

The problem of query answering over incomplete attributed graph data is a challenging field of database management systems and artificial intelligence. When there are rules on data structure expressed in the form of an ontology, the theoretical complexity of finding an exact solution satisfying the ontology constraints increases. Logic-based methods use theoretical constructions to obtain efficient rewritings of the original queries with respect to the ontology and find an answer to the rewritten query over incomplete data. However, there is an opportunity to use faster machine learning methods to label all the data and query over the “most probable” data model without taking the ontology into account. This research paper investigates the effectiveness and trustworthiness of both approaches for answering ontology-mediated queries on graph databases that integrate an ontology with a covering axiom, which states that every node belongs to either of two classes. The first approach involves finding precise answers through logical reasoning and rewriting the problem into a datalog program, while the second approach employs a trained graph neural network to label data in a binary classification problem and leverages SQL for query answering. We conduct an in-depth analysis of the time performance of these approaches and evaluate the impact of training set selection on their ability to answer queries correctly. By comparing these approaches across various experiments, we provide insights into their strengths and limitations for answering ontology-mediated queries containing a Boolean conjunctive query. In particular, we show the importance of logic-based approaches for ontologies with a covering axiom and the inability of machine learning methods to find answers for ontology-mediated queries in large networks.

Graph-ToolFormer: To Empower LLMs with Graph Reasoning Ability via Prompt Augmented by ChatGPT

Jiawei Zhang

10 April 2023

In this paper, we aim to develop a large language model (LLM) with reasoning ability on complex graph data. Currently, LLMs have achieved very impressive performance on various natural language learning tasks, extensions of which have also been applied to vision tasks with multi-modal data. However, when it comes to graph learning tasks, existing LLMs present very serious flaws due to several inherited weaknesses in performing multi-step logic reasoning, precise mathematical calculation, and perception of spatial and temporal factors. To address such challenges, in this paper, we investigate the principles, methodologies and algorithms to empower existing LLMs with graph reasoning ability, which will have tremendous impacts on the current research of both LLMs and graph learning. Inspired by the latest ChatGPT and Toolformer models, we propose the Graph-ToolFormer (Graph Reasoning oriented Toolformer) framework to teach LLMs themselves, with prompts augmented by ChatGPT, to use external graph reasoning API tools. Specifically, we investigate teaching Graph-ToolFormer to handle various graph data reasoning tasks in this paper, including both (1) very basic graph data loading and graph property reasoning tasks, ranging from simple graph order and size to the graph diameter and periphery, and (2) more advanced reasoning tasks on real-world graph data, such as bibliographic networks, protein molecules, sequential recommender systems, social networks and knowledge graphs.
