The Suitability of Graph Databases for Big Data Analysis: A Benchmark
Abstrak
: Digitalization of our society brings various new digital ecosystems (e.g., Smart Cities, Smart Buildings, Smart Mobility), which rely on the collection, storage, and processing of Big Data. One of the recently popular advancements in Big Data storage and processing are the graph databases. A graph database is specialized to handle highly connected data, which can be, for instance, found in the cross-domain setting where various levels of data interconnection take place. Existing works suggest that for data with many relationships, the graph databases perform better than non-graph databases. However, it is not clear where are the borders for specific query types, for which it is still efficient to use a graph database. In this paper, we design and perform tests that examine these borders. We perform the tests in a cluster of three machines so that we explore the database behavior in Big Data scenarios concerning the query. We specifically work with Neo4j as a representative of graph databases and PostgreSQL as a representative of non-graph databases.
Artikel Ilmiah Terkait
Robert Pavliš
20 Mei 2024
As the global volume of data continues to rise at an unprecedented rate, the challenges of storing and analyzing data are becoming more and more highlighted. This is especially apparent when the data are heavily interconnected. The traditional methods of storing and analyzing data such as relational databases often encounter difficulties when dealing with large amounts of data and this is even more pronounced when the data exhibits intricate interconnections. This paper examines graph databases as an alternative to relational databases in an interconnected Big Data environment. It will also show the theoretical basis behind graph databases and how they outperform relational databases in such an environment, but also why they are better suited for this kind of environment than other NoSQL alternatives. A state of the art in graph databases and how they compare to relational databases in various scenarios will also be presented in this paper.
Ioannis Ballas Vassilios Tsakanikas Evaggelos Pefanis + 1 lainnya
20 November 2020
Big Data paradigm has placed pressure on well-established relational databases during the last decade. Both academia and industry have proposed several alternative database schemes in order to model the captured data more efficiently. Among these approaches, graph databases seem the most promising candidate to supplement relational schemes. Within this study, a comparison is performed among Neo4j, one of the leading graph databases, and Apache Spark, a unified engine for distributed large-scale data processing environment, in terms of processing limits. The results reveal that Neo4j is limited to the physical RAM memory of the processing environment. Yet, until this limit is reached, the processing engine of Neo4j outperforms the Apache Spark engine.
Johan Sandell Einar Asplund Martin Duneld + 1 lainnya
30 Januari 2024
Choosing and developing performant database solutions helps organizations optimize their operational practices and decision-making. Since graph data is becoming more common, it is crucial to develop and use them in big data with complex relationships with high and consistent performance. However, legacy database technologies such as MySQL are tailored to store relational databases and need to perform more complex queries to retrieve graph data. Previous research has dealt with performance aspects such as CPU and memory usage. In contrast, energy usage and temperature of the servers are lacking. Thus, this paper evaluates and compares state-of-the-art graphs and relational databases from the performance aspects to allow a more informed selection of technologies. Graph-based big data applications benefit from informed selection database technologies for data retrieval and analytics problems. The results show that Neo4j performs faster in querying connected data than MySQL and ArangoDB, and energy, CPU, and memory usage performances are reported in this paper.
Michael J. Carey Glenn Galvizo
13 Mei 2024
The increasing prevalence of large graph data has produced a variety of research and applications tailored toward graph data management. Users aiming to perform graph analytics will typically start by importing existing data into a separate graph-purposed storage engine. The cost of maintaining a separate system (e.g., the data copy, the associated queries, etc …) just for graph analytics may be prohibitive for users with Big Data. In this paper, we introduce Graphix and show how it enables property graph views of existing document data in AsterixDB, a Big Data management system boasting a partitioned-parallel query execution engine. We explain a) the graph view user model of Graphix, b) $\text{gSQL}^{++}$, a novel query language extension for synergistic document-based navigational pattern matching, and c) how edge hops are evaluated in a parallel fashion. We then compare queries authored in $\text{gSQL}^{++}$ against versions in other leading query languages. Finally, we evaluate our approach against a leading native graph database, Neo4j, and show that Graphix is appropriate for operational and analytical workloads, especially at scale.
Yuanyuan Tian
23 November 2022
Rapidly growing social networks and other graph data have created a high demand for graph technologies in the market. A plethora of graph databases, systems, and solutions have emerged, as a result. On the other hand, graph has long been a well studied area in the database research community. Despite the numerous surveys on various graph research topics, there is a lack of survey on graph technologies from an industry perspective. The purpose of this paper is to provide the research community with an industrial perspective on the graph database landscape, so that graph researcher can better understand the industry trend and the challenges that the industry is facing, and work on solutions to help address these problems.
Daftar Referensi
0 referensiTidak ada referensi ditemukan.
Artikel yang Mensitasi
0 sitasiTidak ada artikel yang mensitasi.