Skip to content

Optimize IPFS retries #5698

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 2 commits into from
Nov 8, 2024
Merged

Optimize IPFS retries #5698

merged 2 commits into from
Nov 8, 2024

Conversation

isum
Copy link
Member

@isum isum commented Nov 6, 2024

This PR reduces the number of unnecessary IPFS retries by requiring an explicit retry policy, so that different components of the graph-node can decide to use different types of retries, or not use retries at all.

Closes #5696

Other improvements:

IPFS client pool now reuses the initial response instead of doubling the number of requests.

Future improvements:

This PR includes the first steps in defining deterministic IPFS errors so that some retries can be eliminated altogether. This functionality is not part of this PR as it requires more thought and testing and will be added in a future PR.

@isum isum added the area/ipfs label Nov 6, 2024
@isum isum self-assigned this Nov 6, 2024
@isum isum marked this pull request as ready for review November 6, 2024 19:37
#[tokio::test]
async fn new_unchecked_creates_the_client_without_checking_the_gateway() {
let server = mock_server().await;

IpfsGatewayClient::new_unchecked(server.uri(), &discard()).unwrap();
}

#[tokio::test]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

did you replace these with something else?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

These tests were for the .can_provide() method used by the IPFS client pool to select the appropriate client for requests, but this is no longer needed, so the method and its tests have been removed.

I also removed tests for non-retriable errors, because from the perspective of IPFS clients, all errors are now retriable, except for deterministic errors that are not implemented yet.

&path,
self.max_file_size,
Some(self.timeout),
RetryPolicy::None,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Any way we can make sure this is tested to prevent regression?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The IPFS client's internal retries caused the high number of IPFS requests here, because the polling monitor has it's own infinite retries with priorities. By disabling the internal retries of the IPFS client, we allow the polling monitor to behave as expected and rely on it's retry functionality.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Understood. It's kind of hard to differentiate the different layers of retries. I'm happy to leave it as is since now we know it won't retry on the client level.

Copy link
Contributor

@mangas mangas left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See comments. Like the fact you simplified some of the traits and cleared a good amount of code.

@isum isum force-pushed the ion/optimize-ipfs-retries branch from b2aee12 to d56ce13 Compare November 8, 2024 12:01
@isum isum merged commit c215d5e into master Nov 8, 2024
6 checks passed
@isum isum deleted the ion/optimize-ipfs-retries branch November 8, 2024 12:09
encalypto pushed a commit that referenced this pull request Jan 10, 2025
* ipfs: optimize retries

* ipfs: add more tests
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Reduce unnecessary IPFS retries
2 participants