Route planning on GTFS using Neo4j

Abstract

GTFS (General Transit Feed Specification) is a standard originating from Google for public transportation schedules. The specification describes the stops, routes, dates, trips, etc. of one or more public transportation companies for a city or a country. On examination, a GTFS feed can be considered a graph. In addition, in recent decades new database management systems have been developed to support the big data era and/or to make writing program code easier. They are collectively called NoSQL databases, a term that covers many types of database systems. One of these types is the graph database, of which Neo4j is the most widespread. In this paper I try to answer the question of how Neo4j can support the use of GTFS. The most obvious use of GTFS is route planning, for which Neo4j offers some algorithms. I built some storage structures on which the tools provided by Neo4j can be used effectively to plan routes on GTFS data.
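To make the "GTFS feed as a graph" idea concrete, here is a minimal in-memory sketch in Python. It is not the storage structure built in the paper, and it does not use Neo4j itself: it assumes a handful of hypothetical rows shaped like GTFS `stop_times.txt`, turns consecutive stops of a trip into weighted edges, and runs a plain Dijkstra search, the kind of shortest-path algorithm a graph database can offer. The stop and trip identifiers and the helper names (`build_graph`, `dijkstra`) are made up for illustration, and waiting time at transfers is deliberately ignored.

```python
import heapq
from collections import defaultdict

# Hypothetical toy rows in the shape of GTFS stop_times.txt:
# (trip_id, stop_sequence, stop_id, arrival_time, departure_time)
STOP_TIMES = [
    ("T1", 1, "A", "08:00:00", "08:00:00"),
    ("T1", 2, "B", "08:10:00", "08:11:00"),
    ("T1", 3, "C", "08:25:00", "08:25:00"),
    ("T2", 1, "B", "08:15:00", "08:15:00"),
    ("T2", 2, "D", "08:20:00", "08:20:00"),
    ("T2", 3, "C", "08:24:00", "08:24:00"),
]


def to_seconds(hhmmss):
    """Convert a GTFS HH:MM:SS string (hours may exceed 23) to seconds."""
    h, m, s = (int(part) for part in hhmmss.split(":"))
    return h * 3600 + m * 60 + s


def build_graph(stop_times):
    """Stops become nodes; consecutive stops of a trip become directed edges
    weighted by in-vehicle travel time in seconds (assumption: transfer and
    waiting times are ignored)."""
    by_trip = defaultdict(list)
    for trip_id, seq, stop_id, arr, dep in stop_times:
        by_trip[trip_id].append((seq, stop_id, to_seconds(arr), to_seconds(dep)))
    graph = defaultdict(dict)
    for rows in by_trip.values():
        rows.sort()  # order by stop_sequence
        for (_, u, _, dep_u), (_, v, arr_v, _) in zip(rows, rows[1:]):
            cost = arr_v - dep_u
            # Keep only the fastest edge between two stops.
            if cost < graph[u].get(v, float("inf")):
                graph[u][v] = cost
    return graph


def dijkstra(graph, source, target):
    """Return (total_seconds, path) or (None, []) if target is unreachable."""
    dist = {source: 0}
    prev = {}
    heap = [(0, source)]
    while heap:
        d, u = heapq.heappop(heap)
        if u == target:
            break
        if d > dist.get(u, float("inf")):
            continue  # stale heap entry
        for v, w in graph.get(u, {}).items():
            nd = d + w
            if nd < dist.get(v, float("inf")):
                dist[v] = nd
                prev[v] = u
                heapq.heappush(heap, (nd, v))
    if target not in dist:
        return None, []
    path = [target]
    while path[-1] != source:
        path.append(prev[path[-1]])
    return dist[target], path[::-1]


if __name__ == "__main__":
    print(dijkstra(build_graph(STOP_TIMES), "A", "C"))
```

A realistic planner also has to respect departure times and transfers, which is where the choice of storage structure in the graph database matters; this sketch only shows the static graph view of a feed.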

Related Scientific Articles

Framework for constructing multimodal transport networks and routing using a graph database: A case study in London

Tao Cheng Seula Park

5 June 2023

Most prior multimodal transport networks have been organized as relational databases with multilayer structures to support transport management and routing; however, database expandability and update efficiency in new networks and timetables are low due to the strict database schemas. This study aimed to develop multimodal transport networks using a graph database that can accommodate efficient updates and extensions, high relation‐based query performance, and flexible integration in multimodal routing. As a case study, a database was constructed for London transport networks, and routing tests were performed under various conditions. The constructed multimodal graph database showed stable performance in processing iterative queries, and efficient multi‐stop routing was particularly enhanced. By applying the proposed framework, databases for multimodal routing can be readily constructed for other regions, while enabling responses to diversified routings, such as personalized routing through integration with various unstructured information, due to the flexible schema of the graph database.

Performance Analysis of Neo4j and MySQL Databases using Public Policies Decision Making Data

Rahmatian Jayanty Sholichah A. Alamsyah Mahmud Imrona

24 September 2020

Currently, the volume of data is increasing rapidly, and solutions are needed to manage data efficiently; one such solution is to use a database. The biggest decision in selecting a database is choosing between SQL and NoSQL. MySQL is a database that uses SQL as its query language and consists of tables that store data in columns and rows. The newer database format, NoSQL, is suitable for handling large amounts of data in a variety of formats. Neo4j is one of the widely used NoSQL databases; it is a graph database that provides an easy way to visualize data by storing it as nodes connected by edges. In this paper, we compared the performance of MySQL and Neo4j databases in terms of memory usage and execution time; we also presented the flexibility of the databases using P. The results show that MySQL has a faster execution time than Neo4j, although both databases have the same time complexity. Neo4j also has a higher memory usage than MySQL, but it has better flexibility than MySQL.

Query Performance Comparison of PostgreSQL vs. Neo4j. A Basic Distributed Setup on OpenStack

Marin Fotache Nicoleta Teacă Ciprian Pinzaru + 3 others

19 September 2024

Among the NoSQL technologies, Neo4j is one of the most popular solutions for managing graph databases and an early adopter of transactions (contrary to other NoSQL systems). Neo4j also provides a powerful high-level data processing language, Cypher. Despite its popularity, there are few comprehensive studies about benchmarking the query performance of Neo4j relative to SQL or other NoSQL counterparts. In this paper, we present a module for converting the TPC-H benchmark database from PostgreSQL to Neo4j, and we built a set of 110 SQL queries that were translated into Cypher. For both database servers, the queries were executed with a 10-minute timeout on OpenStack setups following nine scenarios by combining three database scale factors (1 GB, 5 GB, and 10 GB) with three data distribution variants (with 1, 5, and 10 nodes). Results provide support for query performance assessment of these two big data products.

Graphix: “One User's JSON is Another User's Graph”

Michael J. Carey Glenn Galvizo

13 May 2024

The increasing prevalence of large graph data has produced a variety of research and applications tailored toward graph data management. Users aiming to perform graph analytics will typically start by importing existing data into a separate graph-purposed storage engine. The cost of maintaining a separate system (e.g., the data copy, the associated queries, etc.) just for graph analytics may be prohibitive for users with Big Data. In this paper, we introduce Graphix and show how it enables property graph views of existing document data in AsterixDB, a Big Data management system boasting a partitioned-parallel query execution engine. We explain a) the graph view user model of Graphix, b) $\text{gSQL}^{++}$, a novel query language extension for synergistic document-based navigational pattern matching, and c) how edge hops are evaluated in a parallel fashion. We then compare queries authored in $\text{gSQL}^{++}$ against versions in other leading query languages. Finally, we evaluate our approach against a leading native graph database, Neo4j, and show that Graphix is appropriate for operational and analytical workloads, especially at scale.

The Suitability of Graph Databases for Big Data Analysis: A Benchmark

Matus Stovcik Barbora Buhnova M. Mačák

2020

Digitalization of our society brings various new digital ecosystems (e.g., Smart Cities, Smart Buildings, Smart Mobility), which rely on the collection, storage, and processing of Big Data. One of the recently popular advancements in Big Data storage and processing are the graph databases. A graph database is specialized to handle highly connected data, which can be, for instance, found in the cross-domain setting where various levels of data interconnection take place. Existing works suggest that for data with many relationships, the graph databases perform better than non-graph databases. However, it is not clear where are the borders for specific query types, for which it is still efficient to use a graph database. In this paper, we design and perform tests that examine these borders. We perform the tests in a cluster of three machines so that we explore the database behavior in Big Data scenarios concerning the query. We specifically work with Neo4j as a representative of graph databases and PostgreSQL as a representative of non-graph databases.
