MV4PG: Materialized Views for Property Graphs
Abstract
Graph databases are attracting growing attention for highly interconnected data, and demand for efficient querying of large datasets is increasing. We observe that graph database queries often contain recurring patterns, and that the results of these patterns can be stored as materialized views to speed up query processing. We therefore propose materialized views on property graphs, comprising three parts: view creation, view maintenance, and query optimization using views. We also propose, for the first time, an efficient templated view maintenance method for views containing variable-length edges, which can be applied to multiple graph databases. To verify the effect of materialized views, we implement a prototype on TuGraph and run experiments on both TuGraph and Neo4j. The experimental results show that the gain from query optimization on read statements far outweighs the additional view maintenance cost incurred by write statements. The speedup of the whole workload reaches up to 28.71x, and the speedup of a single query reaches nearly 100x.
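To make the idea of view creation and incremental view maintenance concrete, the following is a minimal Python sketch, not the paper's templated method. It assumes a simple directed graph without self-loops or parallel edges, and it materializes only a fixed two-hop pattern `(a)-[]->(b)-[]->(c)` rather than variable-length edges. The class `TwoHopView` and its methods are hypothetical names introduced for illustration.

```python
from collections import Counter, defaultdict


class TwoHopView:
    """Hypothetical materialized view of the pattern (a)-[]->(b)-[]->(c).

    Assumes a simple directed graph: no self-loops, no parallel edges.
    """

    def __init__(self):
        self.out = defaultdict(set)   # node -> successors
        self.inc = defaultdict(set)   # node -> predecessors
        self.paths = Counter()        # (a, c) -> number of two-hop paths

    def _delta(self, u, v):
        # Two-hop paths that use edge (u, v) as their second or first hop.
        delta = Counter()
        for p in self.inc[u]:
            delta[(p, v)] += 1        # p -> u -> v
        for s in self.out[v]:
            delta[(u, s)] += 1        # u -> v -> s
        return delta

    def add_edge(self, u, v):
        if v in self.out[u]:
            return                    # edge already present
        self.paths.update(self._delta(u, v))
        self.out[u].add(v)
        self.inc[v].add(u)

    def remove_edge(self, u, v):
        if v not in self.out[u]:
            return                    # nothing to remove
        self.out[u].discard(v)
        self.inc[v].discard(u)
        for pair, n in self._delta(u, v).items():
            self.paths[pair] -= n
            if self.paths[pair] == 0:
                del self.paths[pair]  # keep only live results in the view

    def pairs(self):
        # Read path: answer the pattern from the view, no graph traversal.
        return set(self.paths)


view = TwoHopView()
for edge in [("a", "b"), ("b", "c"), ("a", "d"), ("d", "c")]:
    view.add_edge(*edge)
```

Keeping a path count per result pair, rather than a plain set, lets an edge deletion decide locally whether a result is still supported by another path. Each write touches only the neighbors of the changed edge, which is the kind of maintenance cost that the read-side speedup has to outweigh.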
Related Scientific Articles
Mohanna Shahrad Yunjia Zheng Yu Ting Gu + 1 other
13 May 2024
Views are widely used in relational databases to facilitate query writing, give individualized abstractions to different user groups, and improve query execution time with materialization techniques. This paper explores how views could be defined and used in graph database systems (GDBS) with a similar purpose to what can be found in relational systems. We perform our analysis using Neo4j and its query language Cypher which has many of the features typically found in graph query languages, aiming to pave the way for integrating view management into a wider range of GDBS.
Michael J. Carey Glenn Galvizo
13 May 2024
The increasing prevalence of large graph data has produced a variety of research and applications tailored toward graph data management. Users aiming to perform graph analytics will typically start by importing existing data into a separate graph-purposed storage engine. The cost of maintaining a separate system (e.g., the data copy, the associated queries, etc.) just for graph analytics may be prohibitive for users with Big Data. In this paper, we introduce Graphix and show how it enables property graph views of existing document data in AsterixDB, a Big Data management system boasting a partitioned-parallel query execution engine. We explain a) the graph view user model of Graphix, b) $\text{gSQL}^{++}$, a novel query language extension for synergistic document-based navigational pattern matching, and c) how edge hops are evaluated in a parallel fashion. We then compare queries authored in $\text{gSQL}^{++}$ against versions in other leading query languages. Finally, we evaluate our approach against a leading native graph database, Neo4j, and show that Graphix is appropriate for operational and analytical workloads, especially at scale.
James Clarkson Georgios Theodorakis Jim Webber
2024
Modern graph database management systems (DBMSs) can process highly dynamic labeled property graphs (LPGs) with many billions of relationships comfortably, but those systems often ignore the temporal dimension of data, how a graph evolved over time. Temporal analytics allow users to query and compute over the graph throughout its history so that valuable line-of-business data is always accessible and never lost. However, existing approaches tend to be ad-hoc and vary in performance depending on the size of the effective graph workload, such as local pattern matching or global graph algorithms. In this work, we describe Aion, a transactional temporal graph DBMS that generalizes previous approaches for LPGs. Aion extends Neo4j, a modern graph DBMS, incurring minimal performance overhead by decoupling the graph’s history from the latest graph version. To support efficient temporal analytics independently of workload characteristics, Aion adopts a hybrid temporal storage approach: (i) for fast full graph restoration at arbitrary time points, it uses TimeStore that indexes updates by time; (ii) for fine-grained graph history accesses, it uses LineageStore that indexes updates by entity identifiers. To enable incremental graph computations for improved latency, Aion introduces a compute-efficient in-memory LPG representation. Our experiments show that Aion achieves comparable or better performance versus existing non-transactional temporal systems and provides up to an order of magnitude speedup over classic Neo4j.
Wei Lu Zhouyu Wang Guodong Jin + 4 others
24 April 2023
Real-world graphs are often dynamic and evolve over time. Storing and querying a graph's evolution is therefore crucial in graph databases. However, existing works either suffer from high storage overhead or lack efficient temporal query support, or both. In this paper, we propose AeonG, a new graph database with built-in temporal support. AeonG is based on a novel temporal graph model. To fit this model, we design a storage engine and a query engine. Our storage engine is hybrid, with one current storage to manage the most recent versions of graph objects, and another historical storage to manage the previous versions of graph objects. This separation keeps the performance degradation of querying the most recent graph object versions as slight as possible. To reduce the historical storage overhead, we propose a novel anchor+delta strategy, in which we periodically create a complete version (namely anchor) of a graph object, and maintain every change (namely delta) between two adjacent anchors of the same object. To boost temporal query processing, we propose an anchor-based version retrieval technique in the query engine to skip unnecessary historical version traversals. Extensive experiments are conducted on both real and synthetic datasets. The results show that AeonG achieves up to 5.73× lower storage consumption and 2.57× lower temporal query latency against state-of-the-art approaches, while introducing only 9.74% performance degradation for supporting temporal features.
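The anchor+delta idea described in this abstract can be illustrated with a small Python sketch. It is a simplified illustration under stated assumptions, not AeonG's actual storage engine: versions are numbered from 0, deltas are stored as forward property changes, property deletion is not modeled, and the class `VersionedObject` and the parameter `anchor_interval` are hypothetical.

```python
class VersionedObject:
    """Hypothetical history of one graph object's properties.

    Every `anchor_interval`-th version is stored in full (an anchor);
    the versions in between store only the changed properties (deltas).
    """

    def __init__(self, anchor_interval=3):
        self.anchor_interval = anchor_interval
        self.history = []   # list of ("anchor" | "delta", dict)
        self.current = {}   # most recent version, kept separately

    def update(self, changes):
        self.current = {**self.current, **changes}
        if len(self.history) % self.anchor_interval == 0:
            self.history.append(("anchor", dict(self.current)))
        else:
            self.history.append(("delta", dict(changes)))

    def version(self, n):
        # Jump to the nearest anchor at or before version n, then replay
        # only the deltas after it instead of the whole history.
        start = n - (n % self.anchor_interval)
        _, state = self.history[start]
        state = dict(state)
        for _, delta in self.history[start + 1 : n + 1]:
            state.update(delta)
        return state


obj = VersionedObject(anchor_interval=3)
for change in [{"name": "a"}, {"age": 1}, {"age": 2}, {"city": "x"}, {"age": 3}]:
    obj.update(change)
```

The interval trades storage for retrieval time: a larger interval stores fewer full copies but replays more deltas per lookup.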
Matus Stovcik Barbora Buhnova M. Mačák
2020
Digitalization of our society brings various new digital ecosystems (e.g., Smart Cities, Smart Buildings, Smart Mobility), which rely on the collection, storage, and processing of Big Data. One of the recently popular advancements in Big Data storage and processing is the graph database. A graph database is specialized to handle highly connected data, which can be found, for instance, in cross-domain settings where various levels of data interconnection take place. Existing works suggest that for data with many relationships, graph databases perform better than non-graph databases. However, it is not clear where the borders lie for specific query types, that is, up to what point it is still efficient to use a graph database. In this paper, we design and perform tests that examine these borders. We perform the tests in a cluster of three machines so that we explore the database behavior in Big Data scenarios concerning the query. We specifically work with Neo4j as a representative of graph databases and PostgreSQL as a representative of non-graph databases.