Skip to content

Shallow support #4331

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 10 commits into
base: main
Choose a base branch
from
Open

Shallow support #4331

wants to merge 10 commits into from

Conversation

tiennou
Copy link
Contributor

@tiennou tiennou commented Aug 22, 2017

This PR enhance our shallow object support in the following ways :

  • we behave better when revwalking (we don't error with GIT_ENOTFOUND).
  • we don't return objects with known-missing parents (since our repository is shallow and thus doesn't have those objects).

Supersedes #3853, depends on #4445 and #4446.

This PR adds graft support at the ODB layer, with git_repository being responsible for telling git_odb about known grafts. Shallow object support is then grafted (heh) on top of that.

Notes :
- I'm not that convinced that having grafts at the ODB layer makes sense. FWIW, the crux of the issue is that object parsing is not done when objects are returned by the ODB, but by the caller (usually the object parse machinery, but revwalk also perform crude parsing for obvious reasons).

  • @carlosmn pointed out the preferred approach (ie. that's what git does) is to have grafts be "transparent". IMHO, reporting an error in that case might help in the following cases :
    • every object parsed is checked against the list of known grafts (so perf). The error code would just be a subset of GIT_ENOTFOUND.
    • as a library user, I would have to recheck every object to know if I'm looking at a graft (with the caveat that this PR doesn't actually exposes grafts, only shallow commits). Having the error code makes me automatically aware (and I can just treat it as GIT_EITEROVER if I don't actually care).

@tiennou
Copy link
Contributor Author

tiennou commented Aug 22, 2017

I'll keep working on handling protocol part (eg. git clone --depth and git fetch --unshallow), which doesn't really matter for correct handing of grafts & shallow commits (at least locally).

@tiennou
Copy link
Contributor Author

tiennou commented Nov 17, 2017

Just realized I should ping #3058 so we don't get duplicated work. Also, I have a local branch with the beginning of the protocol support which I'll happily publish if someone plans to work on that.

@justinfx
Copy link

Is there any progress on this merge? Looking to see a fix for shallow clones!

@tiennou tiennou force-pushed the shallow/shallow-support branch 3 times, most recently from 38d217d to a9d8e11 Compare November 26, 2017 19:55
@tiennou
Copy link
Contributor Author

tiennou commented Nov 27, 2017

I'd appreciate a quick technical "looks good" review, since actually shallow support at the protocol level will depend on some of the infrastructure I laid out here — except I'll need an atomic way of updating the shallow file which wasn't needed by the local-only fix.

So, here are the questions :

  • ⚠️ GitHub is utterly confused about the commit ordering in this PR.
  • commits 9150c16, 2197072, 5f00ac7 are about merging the commit parsing done when revwalking with the "standard" object-lookup one, and could be submitted separately.
  • commit d057213 adds grafts registration support at the git_odb level, which is used by 8a76d68 aea2e72 c54ea4c and 82c3a5f. I'm thinking this could be moved up to the git_repository level, opinions ?
  • 974d720 and 7df1483 handle shallow objects on top of the aforementioned graft API.

Copy link
Member

@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So first the usual apology that we've taken so long to review. I now finally got some time to read up to commit c54ea4c. For all my comments, keep in mind I'm not familiar with the grafting/replace mechanism.

So one thing I'm a bit confused about is how grafting and replace work with each other or against each other. As far as I understand it, a graft is a new commit with the exact same content as another commit, but with parents replaced. This allows you to create for example shallow histories by just replacing a cutoff-point commit by a graft with zero commits. The thing to note here is that this graft is part of the ODB, that is it is a commit object part of the ODB with its own OID.

Afterwards, there is the replace-mechanism coming into the picture. This is a refdb-level mechanism in that it creates references in refs/replace/<OID>, where each OID in that namespace points directly at another OID. The meaning is then to replace the OID of the reference's name with the OID that is being pointed to.

A common usecase is then to use grafts for those replaces. Given a commit c with cid=OID(c), you simply first create a graft commit g with gid=OID(g). It holds that CONTENTS(c)=CONTENTS(g), where contents is the root tree, commit message etc. except for the parents. You then create a ref refs/replace/$cid with contents $gid to put that graft into place.

If I'm not mistaken, you are replacing the old logic used by grafts, which weren't using the new replacement logic. Was this a knowing decision, and if so why?

One error I can see is in how you create the grafts. Instead of creating new graft commits which have the same contents but different parents and inserting them into the ODB, you create them ad-hoc and just replace existing commit's parents if they're grafted. I think this is wrong. Especially the replace mechanism makes use of the graft being present in the ODB, so your implementation does not allow for that usecase.

So what I'd have expected is a function git_odb__create_graft(&graft, odb, commit) which creates a new grafting object. I'm not yet decided if the ODB is the right place to put it or whether that should be somewhere else, but we definitly need such a function. After the user has put the graft (which is simply a git_commit) into the ODB, you would then have an API in place to enable the user to put those grafts into place with the refdb. I guess this should be an API of the repository, as we need to access both the refdb to create the replace-reference as well as the ODB to locate the grafting and grafted commits. Something like git_repository__replace, but I'm also undecided on that one.

Please let me know if you think my reasoning makes any sense. Even so, I'm glad somebody dared to tackle that topic! It's been on my own todo list for too long, so please keep up that good work! 👍

src/object.c Outdated
@@ -51,14 +51,39 @@ static git_object_def git_objects_table[] = {
{ "REF_DELTA", 0, NULL, NULL },
};

int git_object__init_with_odb(git_object **object_out, git_odb_object *odb_obj, git_repository *repo)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would really love some explanation inside of the commit message detailing why exactly you do this step :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have loved too 😆. IIRC it's a side-effect of the parse merging. After a more thorough look, it might just be confusion : the revwalk-parsing works on git_commit, and the normal parsing did it, so I made them both do it. Will investigate if it's really necessary (I remember a crash when freeing though).

src/commit.c Outdated
@@ -476,6 +496,17 @@ int git_commit__parse(void *_commit, git_odb_object *odb_obj)
return -1;
}

int git_commit__parse(void *_commit, git_odb_object *odb_obj)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, so it seems you created the new function to be able to parse commits from raw data. This looks a lot like what I have been doing in PR #4374, but I went even further and actually decoupled the commits completely from the ODB. Seeing that your approach wouldn't solve my particular use case (as I cannot use the ODB to parse the object), would my approach be able to solve your use case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I think you saw below, this is all in the name of merging the parsing done when quickly revwalk with the "standard" way. After a quick look, I don't think your approach allows me to do that (it's just decoupling IIUC).

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yup, you're right. I tend to review by commits, so I didn't see this until later.

GIT_COMMIT_PARSE_RAW_HEADER = (1 << 5),
GIT_COMMIT_PARSE_RAW_MESSAGE = (1 << 6),
GIT_COMMIT_PARSE_ALL = 0xFFFF,
} git_commit__parse_flags;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What's the purpose of those flags? Is it an optimization or is it that you cannot parse some stuff because of certain conditions? If it's for the purpose of optimization, I'd like to avoid doing that. We currently don't have any notion of "partial" objects, and I bet this could theoretically lead to subtle bugs when passing around those partial objects.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem, revwalk parse merge.

buffer = parents_start;
for (i = 0; i < parents; ++i) {
git_oid oid;
if ((error = git_commit__parse_commit(commit, obj, GIT_COMMIT_PARSE_PARENTS|GIT_COMMIT_PARSE_COMMITTER)) < 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ok, I guess that answers my question. Nice to see some reduction in duplicated code.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

src/commit.c Outdated
/* Perform necessary grafts */
git_repository_odb__weakptr(&odb, repo);
if (odb)
git_odb__graft_object(commit, odb);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Huh? Either it is GitHub which confused the commit ordering or this commit is too early in that it introduces using git_odb__graft_object previous to it being implemented.

src/repository.c Outdated
{
git_buf odb_path = GIT_BUF_INIT;
git_odb *odb;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not your fault, but god is that diff awful to read.

src/repository.c Outdated
(error = git_odb__add_default_backends(odb, odb_path.ptr, 0, 0)) < 0) {
git_buf_free(&odb_path);
git_odb_free(odb);
return error;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A goto out with common cleanup would be nice here.

src/repository.c Outdated
assert(repo && out);
if ((error = git_repository_item_path(&odb_path, repo,
GIT_REPOSITORY_ITEM_OBJECTS)) < 0 ||
(error = git_odb_new(&odb)) < 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I know you're just using the preexisting code, but we actually have a memory leak here whenever git_odb_new errors. I mean in that case we probably have other problems because that indicates memory allocation errors, but still. Would be cool if you could refactor this a bit to not have multiple statements in a single if-condition and use goto cleanup-style to avoid those memory leaks.

src/buffer.h Outdated
@@ -215,4 +215,8 @@ int git_buf_splice(
const char *data,
size_t nb_to_insert);

#define git_buf_foreach_line(line_start, line_end, line_num, buf) \
while ((line_start = git__strsep(buf.ptr, "\n")) != NULL && \
(line_end = line_start + strlen(line_start)) != NULL && ++line_num)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please indent those lines part of the macro

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Furthermore, please wrap all uses of the macro parameters in braces.

src/repository.c Outdated
return error;

if (git_buf_joinpath(&graft_path, graft_path.ptr, "grafts")) {
git_buf_free(&graft_path);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Maybe have a common goto cleanup again? It's becoming hard to grasp what needs cleaning at which part of the code with those interspersed frees and returns.

@tiennou
Copy link
Contributor Author

tiennou commented Dec 1, 2017

I'll repeat what I said on Slack for the record : since I'm coming at this from the shallow-support angle, which cgit uses grafts for, I've tried to keep it logical and do the same thing. This means I'm clueless about the git-replace stuff, apart from what I've read from the manual : it "grafts" complete objects, and uses refs to keep track of them, hence those "replacement objects" can be pushed/pulled around, while grafts (and thus shallow objects) can't.

This PR only exists because of shallow-support, hence I've checked how cgit does it, and it does not use git-replace for that, so I considered git-replace support "out of scope".

So one thing I'm a bit confused about is how grafting and replace work with each other or against each other. As far as I understand it, a graft is a new commit with the exact same content as another commit, but with parents replaced.

"replaces" are real objects and AFAIU work like you expected. "grafts" are just some OIDs loaded from .git/info/grafts, which are used to override a commit's parents…

The thing to note here is that this graft is part of the ODB, that is it is a commit object part of the ODB with its own OID.

…thus, no. "grafts" have no real existence (IOW object form). I located my graft function at the ODB level because 1) I got confused thinking that .git/info/ was part of the odb, 2) it felt the most obvious place I could tack them on and impact both revwalking and object lookups.

If I'm not mistaken, you are replacing the old logic used by grafts, which weren't using the new replacement logic. Was this a knowing decision, and if so why?

"replacing" I don't understand. If you meant "implementing", then yes, because that's how cgit does it.

We seem to agree on how git-replace works. I'll just add that I'm not sure cgit "grafts" anything when it's used, since I didn't see anything related to replaces in the graft-handling code (% overlook probability), hence why I consider git-replace out of scope.

I'd be happy to discuss what final API we want : for shallow support, I only need "internal" things, so I don't particularly care (I mean, I could not provide any public API apart from the list of shallow roots and be done with it 😉). Right now, that's handled by git_odb__graft_register, which could be made public.

I'll reinvestigate the layering I chose : because of my 2 points above, I might not need to go near the odb at all, but I have to consider a .git/shallow/.git/info/grafts file appearing "somehow" (IOW caching, fun things™)

If it's warranted I can split out that PR in 3 pieces (already have them locally, and if it fixes GitHub's confusion that's be even better) : the commit parser merging (because of #4374 and being generally cleanup), the graft support itself (this), and shallow support (which requires this).

src/commit.c Outdated
@@ -476,6 +496,17 @@ int git_commit__parse(void *_commit, git_odb_object *odb_obj)
return -1;
}

int git_commit__parse(void *_commit, git_odb_object *odb_obj)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As I think you saw below, this is all in the name of merging the parsing done when quickly revwalk with the "standard" way. After a quick look, I don't think your approach allows me to do that (it's just decoupling IIUC).

GIT_COMMIT_PARSE_RAW_HEADER = (1 << 5),
GIT_COMMIT_PARSE_RAW_MESSAGE = (1 << 6),
GIT_COMMIT_PARSE_ALL = 0xFFFF,
} git_commit__parse_flags;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Idem, revwalk parse merge.

buffer = parents_start;
for (i = 0; i < parents; ++i) {
git_oid oid;
if ((error = git_commit__parse_commit(commit, obj, GIT_COMMIT_PARSE_PARENTS|GIT_COMMIT_PARSE_COMMITTER)) < 0)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

src/odb.c Outdated
return 0;
}

int git_odb__graft_object(void *obj, git_odb *odb)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Was expecting a "layering violation", because of git_odb_object. Also, I think I saw a precedent, hence thought it would be fine. If it's not, it should be changed to git_commit * because you only graft onto commits.

src/odb.c Outdated
GITERR_CHECK_ALLOC(id);

git_oid_cpy(id, oid);
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's the cgit graft approach, replacing parents on object parse.

src/object.c Outdated
@@ -51,14 +51,39 @@ static git_object_def git_objects_table[] = {
{ "REF_DELTA", 0, NULL, NULL },
};

int git_object__init_with_odb(git_object **object_out, git_odb_object *odb_obj, git_repository *repo)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would have loved too 😆. IIRC it's a side-effect of the parse merging. After a more thorough look, it might just be confusion : the revwalk-parsing works on git_commit, and the normal parsing did it, so I made them both do it. Will investigate if it's really necessary (I remember a crash when freeing though).

@@ -196,8 +158,7 @@ int git_commit_list_parse(git_revwalk *walk, git_commit_list_node *commit)
} else
error = commit_quick_parse(
walk, commit,
(const uint8_t *)git_odb_object_data(obj),
git_odb_object_size(obj));
obj);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent

const uint8_t *buffer,
size_t buffer_len)
git_commit_list_node *node,
git_odb_object *obj)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Indent

@tiennou tiennou force-pushed the shallow/shallow-support branch 2 times, most recently from 9dc2027 to d6198bf Compare December 5, 2017 22:33
@tiennou
Copy link
Contributor Author

tiennou commented Dec 6, 2017

Rebased. I managed to remove most of the "weirdness" by moving the grafts at the repo level.

size_t i;
git_oid *parent_oid;
git_commit *commit = git__calloc(1, git_odb_object_size(obj));
commit->object.repo = walk->repo;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note : that was what my old __init_from_odb function was about. Right now I'm cheating by providing only what's necessary for the parser to work (access to the repo for applying the grafts). I can either bring back the __init_from_odb function, or make the parser not crash if there's no repo around. Suggestions welcome !

src/commit.h Outdated
GIT_COMMIT_PARSE_ALL = (1 << 1),
} git_commit__parse_flags;

int git_commit__parse_commit(git_commit *commit, git_odb_object *obj, git_commit__parse_flags flags);
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've cut down on the number of flags here because of a bug parsing parents. I think it's better this way, as it makes it less likely you ask for an invalid combination, and we can always split those apart if the need arises.

@pks-t
Copy link
Member

pks-t commented Dec 8, 2017

Okay, so I think I've been confused a bit by grafts and replaces. What makes me curious is the following part from git-replace(1):

--graft [...]
Create a graft commit. A new commit is created with the same content as except that its
parents will be [...] instead of 's parents. A replacement ref is then created to
replace with the newly created commit. See contrib/convert-grafts-to-replace-refs.sh for an
example script based on this option that can convert grafts to replace refs.

This paragraph implies that the grafted commit is stored inside of the ODB, as otherwise it would be impossible to put the reference into place. From what you say, it looks to me like there's two different things here:

  • an in-memory grafted commit which is not written into the ODB, used by the old grafts
  • an in-ODB grafted commit which is written into the ODB, used by the new replace mechanism

So I guess both grafting and replaces use the same underlying mechanism of grafting, but differ only in how they are being referenced and persisted. While old-style grafts make use of out-of-band information, the new replace mechanism is simpler by making use of the ODB and refdb. It would be nice though if we could also check how git.git does it, instead of relying on another third-party implementation.

Anyway. Seeing that both replaces and grafts make use of the same mechanisms, it would be nice if they shared some common logic with each other later on.

@tiennou
Copy link
Contributor Author

tiennou commented Dec 8, 2017

Your interpretation of git-replace looks correct to me. That --graft option creates a "new style" replace object that works the same as the "old style" graft mechanism (same name, different implementation).

What I'm sure of is that cgit shallow objects use old-style grafts, which don't appear in the ODB at all (or you'd have strange things happen when pushing a shallow clone back to its origin). This means that, even though the old graft mechanism could be considered "deprecated" in the face of "git-replacing", which is superior, it's still used by the shallow code because you don't want the user to mess up its shallow clones.

I think that because of that specific point, both "object replacement methods" cannot share the same implementation, but I'll have to investigate how git-replace(1) actually works and what kind of functionality it uses to be sure.

@pks-t
Copy link
Member

pks-t commented Dec 8, 2017

Okay, that's most welcome, thanks a lot.

By the way, I'm all in favor of splitting this up, as you proposed. Especially if you've already got the parts ready it's always easier to get smaller batches reviewed and merged than one big batch.

@tiennou tiennou force-pushed the shallow/shallow-support branch from d6198bf to 6f3e6f9 Compare December 13, 2017 00:32
This was referenced Dec 13, 2017
@tiennou tiennou changed the title Shallow commits & grafts support Shallow support Dec 13, 2017
@tiennou
Copy link
Contributor Author

tiennou commented Dec 13, 2017

@pks-t Okay, I've split the 3 logical parts off. Sadly I can't really base them on top of one another because they're in my fork (and opening PRs against that would be silly), so it's not as clean as we might have wanted.

One thing I'd like to discuss still is how we report those shallow objects. Right now, this is completely "transparent" (as per spec, may I say), but you, as a user, would have no way of knowing whether this is a "pristine commit" or a shallow one, without suspiciously checking the OID of every "root" commit against the list of shallow roots. For example, in GitX, I'd be able to open a prompt asking whether the user wants a few more commits loaded when revwalking (or something). Arguably, I might be overthinking it though 😉.

@tiennou tiennou force-pushed the shallow/shallow-support branch from 6f3e6f9 to 3d9668e Compare December 15, 2017 22:59
@tiennou tiennou force-pushed the shallow/shallow-support branch 2 times, most recently from d7d149e to fb472e9 Compare February 6, 2018 23:15
@tiennou tiennou force-pushed the shallow/shallow-support branch from fb472e9 to 220610f Compare April 22, 2018 13:32
@tiennou tiennou force-pushed the shallow/shallow-support branch 2 times, most recently from e39e357 to 27ed7ce Compare August 2, 2018 21:15
@tiennou tiennou force-pushed the shallow/shallow-support branch 2 times, most recently from 38e0e57 to cf8a290 Compare October 25, 2018 21:45
@tiennou
Copy link
Contributor Author

tiennou commented Oct 25, 2018

This full PR/patch set has been rebased, and should be ready for review.

I'm drawing a line in the sand as to what this is about: this only makes detection of shallow repositories possible. I'll do another PR about the "protocol" parts of shallow when I feel it's ready, because it's (arguably) harder. Hence, some safety measures are missing: I think a fetch done by us from a shallow repo would unshallow it, since we can't tell the remote about what we're purposefully not interested in (though it could also confuse the server 😬 ?). Same for pushing, though I'm clueless about what's involved there. It might be worthwhile to prevent misbehaving by erroring when those operations are tried on a shallow repository.

My point about checking for a grafted commits when a revwalk ends is also still undecided. I like "erroring differently" better, but I understand that might be considered an API-break. Maybe git_commit_is_grafted or something like it…

This allows us to pick which data from a commit we're interested in. This will be used by the revwalk code, which is only interested in parents' and committer data.
This represents (old-style) grafted commits, a.k.a an array of overridden parents for a commit's OID.
This wires git_repository to open the .git/info/grafts file and load its contents as git_commit_grafts objects.
@tiennou tiennou force-pushed the shallow/shallow-support branch from cf8a290 to a165a6f Compare February 3, 2019 09:22
@r-cheologist
Copy link

What's the status of this?
Really looking forward to shallow clone support ...

@GreatBahram
Copy link

GreatBahram commented May 11, 2019

Is there any progress on this problem? There are lots of issues that are demand on this feature.

@Qix-
Copy link
Contributor

Qix- commented May 11, 2019

Ping @pks-t.

@kalkin
Copy link

kalkin commented Sep 7, 2019

As far as I understand this stops git revwalk from exiting with an error if a local commit is missing.
@carlosmn @ethomson @pks-t What needs to be done to get this PR merged?

Copy link
Member

@pks-t pks-t left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So. I do think that by now, this really is a very important feature and the one that's most commonly requested by users of libgit2, which is why I've decided to finally take the bitter pill and review this. But I'm pleasantly surprised, this really is a lot more straight forward than I initially expected. Good job, @tiennou!

Please don't let yourself be discouraged by all the different requests for changes, most of it really is quite trivial. Let me know if you don't expect to be able to work on this anytime soon, then I will help you out to get this landed more quickly.

}

git_oid_cpy(&graft->oid, oid);
k = git_oidmap_put(grafts, &graft->oid, &res);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Well, this really shows how old the PR is as it still uses the old-style map interfaces 🙄


#include "graft.h"

int git__graft_register(git_graftmap *grafts, const git_oid *oid, git_array_oid_t parents)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd like to avoid the term "register" here. It really sounds a lot like our global registration API for certain subsystems, which caused some confusion for me. I'd thus prefer more generic "add"/"remove" functions.
Also, the double underscore for private functions is usually done after the subsystem. So it should rather be "git_graft__add".
And a third thing: as we're really talking about grafts here and not about a graft, maybe the final version should read "git_grafts__add"?

git_oidmap_set_value_at(grafts, idx, NULL);
git__free(graft);
}
git_oidmap_delete_at(grafts, idx);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is hopefully all going to look a lot lett confusing with the new map APIs. It really has been a pain to use the old macro-based functions.

git_oidmap_clear(grafts);
}

int git__graft_for_oid(git_commit_graft **out, git_graftmap *grafts, const git_oid *oid)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What about git_grafts__get or git_grafts__lookup?

} git_commit_graft;

/* A special type of git_oidmap with git_commit_grafts as values */
typedef git_oidmap git_graftmap;
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good enough for the time being, I guess. How about we directly use a structure typedef struct { git_oidmap map; } git_graftmap? This would make it a lot easier to extend in the future, for example to use things like file caches if one had the need to refresh grafts or store things like the source of the graft if I were to add new graft entries?

if ((error = git_commit__parse_internal(commit,
git_odb_object_data(odb_obj),
git_odb_object_size(odb_obj),
flags)) < 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Alignment is a bit off here.

git_graftmap *grafts = git_oidmap_alloc();
git_array_oid_t parents = GIT_ARRAY_INIT;

git_oid *oid1 = git_array_alloc(parents);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This needs to be checked for allocation errors

cl_git_pass(git_oid_fromstr(&oid_src, "2f3053cbff8a4ca2f0666de364ddb734a28a31a9"));
git_oid_cpy(oid1, &oid_src);

git_oid_fromstr(&oid_src, "f503807ffa920e407a600cfaee96b7152259acc7");
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

cl_git_pass

* @param repo The repository
* @return 0 on success, an error otherwise.
*/
GIT_EXTERN(int) git_repository_shallow_roots(git_oidarray *out, git_repository *repo);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We're probably unable to mark git_oidarray as const, right? Maybe we should instead just return a git_grafts structure? After all, that's how shallow roots are implemented.

cl_assert_equal_s(git_oid_tostr_s(&oid_2), "be3563ae3f795b2b4353bcce3a527ad0a4f7f644");

git_revwalk_free(w);
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This test's commit can probably be amended to the other commit that introduced shallow tests.

@Qix-
Copy link
Contributor

Qix- commented Sep 27, 2019

If OP has gone inactive since submitting this let me know and I'll be happy to take over with a new PR.

@pks-t
Copy link
Member

pks-t commented Sep 27, 2019

@tiennou is one of the most active people nowadays in the project, so I'd be surprised if he was inactive now. But open source being what it is it might still take some time ;)

@tiennou
Copy link
Contributor Author

tiennou commented Sep 28, 2019

It's likely I'll only able to take a look at it in a couple of days. No sweat for the "late" review @pks-t, better late than never ! Frankly, if you feel up for it @Qix- I'll gladly defer the PR/polishing part, as I'd personally prefer to keep working on the ssh backend issue — as the review shows it's been a little while.

Just a quick discussion point : as the overarching goal is to be git-like, I don't really feel compelled to provide anything more featureful than what's needed to support shallow repositories, with the rationale that git has gained git-replace (a.k.a. the refdb-based grafts, which do look nicer). I am thus confused — to me grafts are a low-level detail of a since-replaced-by-a-better-version system, and don't really warrant a more precise cache, or more attention than that. Maybe I'm missing something though, so I'm open to usecases 😉.

@pks-t
Copy link
Member

pks-t commented Oct 3, 2019

I am thus confused — to me grafts are a low-level detail of a since-replaced-by-a-better-version system, and don't really warrant a more precise cache, or more attention than that. Maybe I'm missing something though, so I'm open to usecases .

I fully agree that we shouldn't blow up the grafts mechanism more than needed. But my intention is to have it as self-contained as possible to not spill grafts-related logic into other places. Thus the standalone git_grafts struct, git_grafts__refresh, etc. This will make it more maintainable and help us if there is ever a request to expose some of its functions it to users.

I'll create a separate PR that builds on yours to show what I imagine it to look like :)

@pks-t pks-t mentioned this pull request Oct 3, 2019
Base automatically changed from master to main January 7, 2021 10:09
@Qix-
Copy link
Contributor

Qix- commented Dec 19, 2023

@pks-t This can be closed, I believe.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants