Leveraging Neo4j for Data Science: Evaluating Traversal Efficiency in GDS and APOC for Directed Acyclic Graphs

Abstract

This paper presents a benchmark study of Breadth-First Search (BFS) and Depth-First Search (DFS) traversal algorithms applied to complex Directed Acyclic Graphs (DAGs) within Neo4j, using the Graph Data Science (GDS) and Awesome Procedures on Cypher (APOC) libraries. DAGs are widely used in fields such as data science, project management, software engineering, and bioinformatics to model dependencies that contain no cycles. Our experiments evaluate the performance of GDS and APOC on DAGs generated from Feature Models representing dependencies in Software Product Lines (SPL). Results indicate that GDS consistently outperforms APOC, particularly for large and intricate graph structures. These findings highlight the importance of optimized traversal techniques for managing complex DAGs efficiently, offering insights into scalability and performance improvements for real-world applications.
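For readers unfamiliar with the two traversal orders compared in the abstract, the following is a minimal, library-independent sketch in Python. It is not the GDS or APOC implementation and does not reproduce the paper's benchmark; the graph `DAG` and its node names are hypothetical stand-ins for a small feature-dependency structure.

```python
from collections import deque

# Hypothetical feature-dependency DAG (assumption, not from the paper):
# each key maps a feature to the features it depends on.
DAG = {
    "root": ["a", "b"],
    "a": ["c"],
    "b": ["c", "d"],
    "c": [],
    "d": [],
}

def bfs(graph, start):
    """Visit nodes level by level, using a FIFO queue."""
    order = [start]
    seen = {start}
    queue = deque([start])
    while queue:
        node = queue.popleft()
        for nxt in graph[node]:
            if nxt not in seen:
                seen.add(nxt)
                order.append(nxt)
                queue.append(nxt)
    return order

def dfs(graph, start):
    """Follow each branch to its end before backtracking, using a LIFO stack."""
    order = []
    seen = set()
    stack = [start]
    while stack:
        node = stack.pop()
        if node in seen:
            continue  # a shared dependency may be pushed more than once
        seen.add(node)
        order.append(node)
        # Push children in reverse so they are popped in their listed order.
        stack.extend(reversed(graph[node]))
    return order

print(bfs(DAG, "root"))
print(dfs(DAG, "root"))
```

Because node `c` is a dependency of both `a` and `b`, both functions track visited nodes so that shared dependencies are reported only once, a situation that arises often in DAGs derived from feature models.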

Department(s)

Computer Science

Document Type

Conference Proceeding

DOI

10.1109/DSIT61374.2024.10880896

Keywords

APOC, BFS, DFS, directed acyclic graphs, feature models, GDS, graph databases, Neo4j, performance benchmarking, traversal algorithms

Publication Date

1-1-2024

Journal Title

Proceedings 2024 7th International Conference on Data Science and Information Technology, DSIT 2024
