Skip to content

fetch: pass negotiation data around as a struct #5105

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from

Conversation

tiennou
Copy link
Contributor

@tiennou tiennou commented Jun 10, 2019

This is extracted from the shallow support transport level PR, and since it very clearly has an impact on API compatibility, we should arrange to be ready for anything. For example, shallow clones need to keep negotiated "things" (the list of server-received shallow OIDs, that should technically only be enforced if the fetch actually unpacks or something).

As the public interface changes to support some options to the negotiation step (like depth level for shallow clones), the underlying transport will need to be updated as well, so let's be prepared.

We will need to pass some options to the negotiation step (like depth level for shallow clones)
typedef struct {
const git_remote_head * const *refs;
size_t count;
} git_fetch_negotiation;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is somewhat weird because, for the first time, we're the ones passing "sys-level" structs to "userspace", expecting that they might (or might not) be able to understand some of those options. I've refrained from adding a version field, because it's "feature-based", hence a monotonic number might not be sufficient.

@ethomson
Copy link
Member

Hmm. This would break the public API. I'm wondering if there's a way that we can do this without breaking the API. In #4747, instead of changing the negotiate_fetch callback, we could keep negotiate_fetch as-is, and add a new endpoint negotiate_fetch_shallow.

Existing consumers that do not supply the negotiate_fetch_shallow pointer will simply not support shallow fetching (which is the same behavior they have today). But consumers may now opt-in to shallow support by providing the new endpoint.

@tiennou
Copy link
Contributor Author

tiennou commented Jun 11, 2019

I want a future-proof solution: adding a new callback might work once or twice. In that case, I'm the first with #4747, but the 2nd is whoever implements protocol-v2.

My concern is that I have a hunch I need some kind of "feature" system to be able to say that a given custom transport cannot do shallow (or anything else that our default transports handle). Maybe I should even be passing the actual opts requested down to the transport, so that it can fail based on its own terms, but how can it detect that we've asked for a shallow clone (or a v2 transport feature) if it hasn't been updated ? Or is that out of scope anyway, and we state that the API cannot guarantee that a shallow clone (really, anything that is not basic) is possible if a custom transport is involved ?

To be clear, I'd prefer to not break the API, I could implement shallow given more brain-power, and have a PR. But I'd still be worried that this interface is a trap to us: I fail to see how we would not end up in a situation with a PR postponed to 2.0 because we cannot touch 1.0 APIs, but I'd happily be proven wrong. Hence, I'm ok breaking it for/on 1.0, as long as it frees us from future extensibility concerns: I don't plan on adding anything else on top of it, I see it as an interface problem that would be preferable to have fixed.

@ethomson
Copy link
Member

I'm not sure that a struct is really the right way forward here. But either way: we made a guarantee that we'd stop breaking API, and we need to do that. So we need a way to support new stuff with the old stuff.

We've been really successful in making a lot of API changes while staying backward compatible. I don't think this is any different. So the new API doesn't need to be negotiate_fetch_shallow, necessarily, but it does need to be negotiate_fetch_something_else and we do need to support falling back to the existing negotiate_fetch when we need to.

I'm still not excited about dumping versioned structs at end-users and making them deal with that. We don't even have a compelling story for how we're going to deal with it, and we opted ourselves into it. Pushing this unknown onto users, too, seems unfair.

@ethomson
Copy link
Member

Re-reading what I wrote: to be more constructive (or to try), I'm saying something like

negotiate_fetch_ext which takes the new arguments (be those regular arguments or a struct). And if it's not available then whatever functionality simply isn't supported.

@pks-t
Copy link
Member

pks-t commented Jun 13, 2019

I feel like we might want to introduce a new callback or transport field features, as @tiennou seems to imply. We'll definitely have to be aware of what specific features a given transports supports going forward. So what about a new int git_transport_register_ext(git_transport_cb cb, git_transport_options *opts), where the git_-ransports_options would have a bitfield listing all supported features?

For old transports that use the pre-existing git_transport_register, we will assume that they do support everything that's currently required. For transports declaring that some newer features are supported, we might re-route to call "modern" functions instead of the old ones. E.g. instead of calling negotiate_fetch, we might expect such a transport to supply a new negotiate_fetch_ext.

@tiennou
Copy link
Contributor Author

tiennou commented Jun 13, 2019

My thinking was going along :

  • we define a "featurelist" object: that can be as simple as stashing a git_strarray features in some of our "subsystem configuration" structs (this one, the refdb one, the repository "version/extension" interface has a need for something along those exact same lines, a (potential) cert "provider" interface), but could be hidden behind a typedef so we're not exposing that (undecided).
  • we provide a helper function to do some kind of feature intersection check, à la GIT_ERROR_CHECK_VERSION.
  • we can now union individual callbacks that change signature instead of the whole struct, and document accordingly that you get the extended signature if you support a given feature (I can hear docurium dying a little here 😅).

I kinda like how it can be made pretty generic and be generally useful. I'm not sure how it would react to real usecases though, but let's discuss something.

In our current case, there's many of those "featurelists" : what the transport is capable of, what we're asking it to do, what the remote is capable of. Ideally I'd bump the transport-level version to add a "featurelist" member, make all the necessary callbacks changes (since you'd have to touch the code anyway, so either the negotiate callback gets that responsibility, or we add another "can you do that for me" callback (I'm leaning toward that, as I'm not sure how often this kind of thing really takes place at the transport level). That's still in flux though.

EDIT After re-reading @pks-t's comment above, I favored a stringly-based approach over a bitfield for namespacing reasons. That's debatable as well, but I feel that it would make backward-compatibility more complex to handle (eg. once you define that 1 is this fancy "ymmv" feature, you have to carefully track in which version it gets deprecated/removed).

@pks-t
Copy link
Member

pks-t commented Jul 11, 2019

Mh. I'm not particular sure I like having strings in there, though. In effect, to the caller it's kind of the same whether he has to know a certain string or a certain define for a bit, but the latter compresses a lot better and is a lot easier to check for (for f in features: if !strmcmp(f, "foo") found = 1 && break vs. features & FEATURE_FOO).

And I don't quite understand the deprecation/removal thing. We have to track that regardless of whether it is implemented as a string list or a bitfield, wouldn't we?

@tiennou
Copy link
Contributor Author

tiennou commented Jul 11, 2019

And I don't quite understand the deprecation/removal thing. We have to track that regardless of whether it is implemented as a string list or a bitfield, wouldn't we?

It's easier to handle deprecation of a mostly-decentralized string list than the indices taken in a bitfield (case in point, BitTorrent's BEP10). I've just noticed that the (IMHO needed) decentralization is actually a consequence of trying to having generic featurelists for things like git's own repository extension system. This specific example of "fetch/transport" features could do with bitfields, if we're careful.

@pks-t
Copy link
Member

pks-t commented Aug 9, 2019

Huh. I don't know, to be honest, I just feel like it puts additional burden on the user of such APIs. Let's get a tie-breaker in :) @ethomson, you've got any opinion on this?

@ethomson
Copy link
Member

ethomson commented Aug 9, 2019

Huh. I don't know, to be honest, I just feel like it puts additional burden on the user of such APIs. Let's get a tie-breaker in :) @ethomson, you've got any opinion on this?

I admit that I've been a little checked out for the last week or two, but I'm not sure what I'm weighing in on. I'm having a little trouble following the conversation ☝️ since it feels a little theoretical.

I'd love to see proposed code - even pseudocode - for what is being talked about, because I keep looking at the code in this PR and I think that I'm missing the bigger picture.

@tiennou
Copy link
Contributor Author

tiennou commented Aug 12, 2019

Here's the "last" version of that struct, after some prodding onto the semantics of shallow clones, for a quick summary of what the "complete" feature would need.

https://github.com/tiennou/libgit2/blob/32035359613eaf1a524290fc355213adea10e461/include/git2/sys/transport.h#L37

This PR is about making that struct explicit, so that further > 1.0 changes can be made without touching the function signature (my hunch is that backward compatibility issues with those are going to be painful, especially when packed into callback-structs, but this is a hunch).

The rest is (unhealthy ? 😉) speculation/discussion about the similarities between the "repository extension system", and the concept of "what this transport is capable of understanding". This issues can be seen internally between the local and smart transport, as both will usually have support added "in sync", but user-provided ones would not at first and either behave unexpectedly, or have to use the version field, which wraps around to the "repository extension" conundrum.

I think I'd go with something like this (misnaming is likely to have occurred):

/* sys/features.h */
#define GIT_FEATURE_BUILD(featurelist, …) /* static char wrangler used for registration */
static git_feature__check(const char *featurelist, const char *feature);

/* sys/transport.h */
#define GIT_TRANSPORT_FEATURES__SHALLOW "shallow"

typedef struct {
    unsigned int version;
    const char *featurelist;
    /* ... */
} git_transport;

/* smart.c */
static int smart_negotiate_whatever(…)
{
    git_transport transport = /* */;
    if (git_feature__check(transport->featurelist, GIT_TRANSPORT_FEATURES__SHALLOW) < 0 &&
        opts->shallow_depth > 0 /* version, potentially */) {
        return GIT_ENOTSUP;
    }
}

static git_transport smart_transport;
static int register_smart(…)
{
    smart_transport.version = /* 1 */;
    GIT_FEATURE_BUILD(&smart_transport.featurelist,
        GIT_TRANSPORT_FEATURES__SHALLOW, /* more and more */, NULL);
    git_register_transport(&smart_transport);
}

It seems to me that something like this could be useful for the repository extension support as well (which, per the switch to SHA256 is likely to be made use of soon). Sorry for being too theorical sometimes 😉.

@tiennou tiennou mentioned this pull request Oct 3, 2019
@tiennou tiennou mentioned this pull request Nov 5, 2019
2 tasks
Base automatically changed from master to main January 7, 2021 10:09
@ethomson
Copy link
Member

I still like this idea, but don't want to break the API. 🙏

@ethomson ethomson closed this Oct 19, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants