
Batch neighbour retrieval in single server case #21862


Open
jvolmer wants to merge 17 commits into devel from feature/batch-neighbour-retrieval-single-server-case

Conversation

@jvolmer (Contributor) commented Jul 16, 2025

This is the first PR to retrieve neighbours in batches in traversals. This can drastically improve the runtime of limited traversals on graphs with supernodes.

In the graph's SingleServerProvider, this PR adapts how the neighbours of a specific vertex are read in the expand function. It introduces a neighbour provider that is responsible for providing the neighbours in batches (which moves a lot of code out of the SingleServerProvider). For now, expand loops over all batches to get all neighbours. In a later PR (when the cluster case is also implemented), expand should return one batch per call.

The neighbour provider is set to a specific vertex and provides one batch of neighbours per call to its next function. Internally, it saves the neighbour batches it has read to a cache. If the neighbour provider is set to a vertex for which the cache already contains all neighbours, it provides these cached batches instead of reading them again from memory.
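A minimal sketch of this interface, assuming hypothetical names (NeighbourProvider, setVertex, next) and a simplified cache keyed by vertex; the real classes introduced by this PR may differ:

```cpp
// Minimal sketch of the batch-wise neighbour provider described above, with a
// per-vertex cache. All names here are illustrative stand-ins, not the actual
// classes from this PR.
#include <algorithm>
#include <cstddef>
#include <iostream>
#include <optional>
#include <string>
#include <unordered_map>
#include <vector>

using VertexId = std::string;
using Batch = std::vector<VertexId>;  // one batch of neighbour vertex ids

class NeighbourProvider {
 public:
  explicit NeighbourProvider(std::size_t batchSize) : _batchSize(batchSize) {}

  // Point the provider at a vertex; subsequent next() calls yield that
  // vertex's neighbours batch by batch.
  void setVertex(VertexId vertex) {
    _current = std::move(vertex);
    _offset = 0;
  }

  // Return the next batch of neighbours, or std::nullopt when exhausted.
  // Once a vertex's neighbours are in the cache, batches are served from
  // there instead of being read again.
  std::optional<Batch> next() {
    auto& all = neighboursOf(_current);
    if (_offset >= all.size()) {
      return std::nullopt;
    }
    std::size_t end = std::min(_offset + _batchSize, all.size());
    Batch batch(all.begin() + _offset, all.begin() + end);
    _offset = end;
    return batch;
  }

 private:
  // Stand-in for the actual edge-index read: fills the cache lazily with
  // dummy neighbours so the sketch stays self-contained.
  std::vector<VertexId>& neighboursOf(VertexId const& v) {
    auto [it, inserted] = _cache.try_emplace(v);
    if (inserted) {
      for (int i = 0; i < 5; ++i) {
        it->second.push_back(v + "/neighbour" + std::to_string(i));
      }
    }
    return it->second;
  }

  std::size_t _batchSize;
  std::size_t _offset = 0;
  VertexId _current;
  std::unordered_map<VertexId, std::vector<VertexId>> _cache;
};

int main() {
  // As in the current state of the PR, the caller loops over all batches.
  NeighbourProvider provider(/*batchSize=*/2);
  provider.setVertex("v/1");
  while (auto batch = provider.next()) {
    std::cout << "batch of " << batch->size() << " neighbours\n";
  }
}
```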

@jvolmer jvolmer self-assigned this Jul 16, 2025
@cla-bot cla-bot bot added the cla-signed label Jul 16, 2025
@jvolmer jvolmer force-pushed the feature/batch-neighbour-retrieval-single-server-case branch from 6b721a2 to b3150f6 on July 17, 2025 12:51
@mchacki (Member) left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good code in general, I think this is a step in the right direction.
But as you noted, it will not yet solve the problem: it still iterates over all neighbours and for now just adds context switches. (You want to use those to return, so that is good 👍)

I am starting to think about this at a higher level.
Is the intermediate copy into the std::vector actually necessary?
Or should we aim for an iterator approach directly? We could just keep the iterator open and only consume the batch we want.

If we cache, the iterator can then be fed from the cache on the next round.
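A hedged sketch of this iterator idea, with illustrative names (NeighbourCursor, nextBatch) that are not taken from the PR: the cursor stays open across calls and hands out only one batch per call, so no intermediate std::vector copy is needed. A cached variant could feed the same callback from stored batches instead of the simulated index read.

```cpp
// Sketch only: a cursor that is kept open and consumed one batch at a time.
#include <algorithm>
#include <cstddef>
#include <functional>
#include <string>

class NeighbourCursor {
 public:
  using Callback = std::function<void(std::string const& neighbour)>;

  NeighbourCursor(std::string vertex, std::size_t total)
      : _vertex(std::move(vertex)), _total(total) {}

  // Push at most batchSize neighbours into the callback; return true while
  // more neighbours remain, so the caller can resume later from the same
  // position instead of re-reading from the start.
  bool nextBatch(std::size_t batchSize, Callback const& cb) {
    std::size_t end = std::min(_pos + batchSize, _total);
    for (; _pos < end; ++_pos) {
      cb(_vertex + "/neighbour" + std::to_string(_pos));  // stand-in for an index read
    }
    return _pos < _total;
  }

 private:
  std::string _vertex;
  std::size_t _total;  // stand-in for "the underlying iterator is exhausted"
  std::size_t _pos = 0;
};

int main() {
  NeighbourCursor cursor("v/1", /*total=*/5);
  std::size_t count = 0;
  // Consume only the first batch now; the cursor stays open for later calls.
  cursor.nextBatch(2, [&](std::string const&) { ++count; });
}
```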

jvolmer added 2 commits July 28, 2025 15:13
- add a use-cache plug to SingleServerNeighbourProvider, because for
some usages (e.g. a simple tree traversal) the cache will not be
useful and would just waste memory; the plug is not yet used
- reserve batch-size memory for the batch vector for more efficient
memory usage
- get batch size from ExecutionBlock::DefaultBatchSize instead of
using a "magic number" (see the sketch below)