Leveraging Neo4j for Data Science: Evaluating Traversal Efficiency in GDS and APOC for Directed Acyclic Graphs
Abstract
This paper presents a benchmark study of Breadth-First Search (BFS) and Depth-First Search (DFS) traversal algorithms applied to complex Directed Acyclic Graphs (DAGs) in Neo4j, using the Graph Data Science (GDS) and Awesome Procedures on Cypher (APOC) libraries. DAGs are widely used in fields such as data science, project management, software engineering, and bioinformatics to model dependencies that contain no cycles. Our experiments evaluate the performance of GDS and APOC on DAGs generated from Feature Models, which represent dependencies in Software Product Lines (SPLs). The results indicate that GDS consistently outperforms APOC, particularly on large and intricate graph structures. These findings highlight the importance of optimized traversal techniques for managing complex DAGs efficiently and offer insights into scalability and performance improvements for real-world applications.
Department(s)
Computer Science
Document Type
Conference Proceeding
DOI
10.1109/DSIT61374.2024.10880896
Keywords
APOC, BFS, DFS, directed acyclic graphs, feature models, GDS, graph databases, Neo4j, performance benchmarking, traversal algorithms
Publication Date
1-1-2024
Recommended Citation
Saquer, Jamil M. and Shatnawi, Hazim, "Leveraging Neo4j for Data Science: Evaluating Traversal Efficiency in GDS and APOC for Directed Acyclic Graphs" (2024). Faculty Scholarship. 457.
https://bearworks.missouristate.edu/articles00/457
Journal Title
Proceedings 2024 7th International Conference on Data Science and Information Technology, DSIT 2024