Skip to content

Conversation

arrbee
Copy link
Member

@arrbee arrbee commented Feb 8, 2013

This is an effort to refactor the iterator and diff code to make certain things more consistent and to fix certain corner cases that came up during the implementation of checkout. There are a number of things going on here...

  • Make all iterators support both auto-expansion of directories and explicit expansion
  • Make iterators support including tree entries during iteration
  • Rewrite diff typechange handling using iterators that include tree entries
  • Separate workdir diffs which require special ignore/untracked handling and prefer non auto-expanding iterators from other diffs which can use much simpler code and all auto-expanding iterators.
  • Make any final public diff API cleanups (binary flags, non-mutating transforms)

Once this is done, it should be possible to fix checkout one more time to simplify the code there and repair the few remaining corner cases where the old iterators made it very hard to do the right thing.

The updated git_iterator_advance_into for working directories
will return GIT_ENOTFOUND when the directory is empty.  This
updates diff to take that error into account.
Previously the git_diff_delta recorded if the delta was binary.
This replaces that (with no net change in structure size) with
a full set of flags.  The flag values that were already in use
for individual git_diff_file objects are reused for the delta
flags, too (along with renaming those flags to make it clear that
they are used more generally).

This (a) makes things somewhat more consistent (because I was
using a -1 value in the "boolean" binary field to indicate unset,
whereas now I can just use the flags that are easier to understand),
and (b) will make it easier for me to add some additional flags to
the delta object in the future, such as marking the results of a
copy/rename detection or other deltas that might want a special
indicator.

While making this change, I officially moved some of the flags that
were internal only into the private diff header.

This also allowed me to remove a gross hack in rename/copy detect
code where I was overwriting the status field with an internal
value.
@nulltoken
Copy link
Member

Separate workdir diffs which require special ignore/untracked handling and prefer non auto-expanding iterators from other diffs which can use much simpler code and all auto-expanding iterators.

Some time ago, I was working on some diff enhancements to the LibGit2Sharp API and had the opportunity to have a friendly fight with the diff implementation.

My feeling at the time was that when comparing two hierarchies only ADDED, MODIFIED and DELETED should be considered at the core. Untracked and Ignored only being some special cases of ADDED when the left tree is the working directory.

While working on libgit2/libgit2sharp#196 also libgit2/libgit2sharp#278, I also played a little bit with TypeChanged, Renamed and Copied, which I now see as special cases of MODIFIED (TypeChanged and Renamed) and ADDED (Copied)

I was wondering how you would feel about changing the git_delta_t enum so that we could combine base changes (ADDED, MODIFIED, DELETED) with "decorations" (Renamed, ...). This would allow a delta to be MODIFIED and TypeChanged. Another one could be ADDED, Untracked and Copied....

@arrbee
Copy link
Member Author

arrbee commented Feb 11, 2013

@nulltoken That's an interesting thought. I'm open to it. It definitely feels good to me for typechange, rename, and copy. For untracked and ignored it still feels a little odd. The idea that an untracked file is "added" with just a special modifier doesn't completely match my intuition, but it's growing on me. Unlike the copy/rename annotations which are added with a second pass, I don't foresee adding the untracked/ignored annotations in a second pass because the ignored/not-ignored is cheaper to compute during iteration (at least with the current implementation of ignores -- and I'm not inclined to change it right now).

My one concern is what happens to code like the checkout stuff that currently does a switch statement based on the diff delta status enumeration. If we go to bit flags (and ones in which only certain combinations are ever legal to appear together), I'm slightly worried about what happens with the code readability. For the short term, I guess we can just create compound bit combinations and use them as selectors in the switch statement, so maybe this concern is misplaced.

@arrbee
Copy link
Member Author

arrbee commented Mar 12, 2013

Superceded by #1408. @nulltoken I still want to discuss further your ideas about diff API simplification and cleanup, but for now, this PR as is has outlived its usefulness, I think. Once the iterator refactoring is merged, we can pick up the discussion about where we should go with diff.

@arrbee arrbee closed this Mar 12, 2013
@nulltoken
Copy link
Member

Once the iterator refactoring is merged, we can pick up the discussion about where we should go with diff.

@arrbee 👍

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants