Skip to content

Calculate the default for index relative to git_dir instead of work_tree #490

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
wants to merge 1 commit into from
Closed

Conversation

jcouball
Copy link
Member

@jcouball jcouball commented Sep 9, 2020

Your checklist for this pull request

🚨Please review the guidelines for contributing to this repository.

  • Ensure all commits include DCO sign-off.
  • Ensure that your contributions pass unit testing.
  • Ensure that your contributions contain documentation if applicable.

Description

index when not given, defaults to "#{work_tree}/.git/index". This only works as long as git_dir is "#{work_tree}/.git". In the general case, the index default should be #{git_dir}/index.

This PR adds two tests:

  • TestGitDir#test_index_calculated_from_git_dir which tests that index is calculated correctly when git_dir is not "#{work_tree}/.git/index"
  • TestGitDir#test_git_dir_outside_work_tree which tests that when git_dir is in a non-standard location, Git.open and option functions work.

Fix the case in Base#open where the --git-dir is a file as is the case with Submodules and (non-main) Worktrees. This case was handled for Base.init but not Base.open.

tests/test_helper.rb was changed so that MiniTest classes that do not call set_file_paths do not output a warning because @tmp_path is referenced without being defined.

@jcouball jcouball mentioned this pull request Sep 9, 2020
@hatkyinc2
Copy link
Contributor

I want to thank you for taking the time and trying to resolve this, but with all due respect, your creating a synthetic test based on your thinking about the topic which might not correlate to a real world one, while also I provided for you 2 real world test cases of a submodule and a worktree.

This code https://github.com/ruby-git/ruby-git/blob/master/lib/git/base.rb#L58-L65 was way closer to understanding the core issue, when these features are used, the .git is a file that points to the right place which can not be guessed unless you want to recreate the different cases, in the worktree situation for example the index is to be under ${git_dir}/.git/worktrees/${worktree_shortname]
While the submodule seems to be under
${git_dir}/.git/modules/${module_name}
This current suggested code doesn't address either so I'm not sure what this synthetic test represents besides a manual constructed situation.

Also unless adding these special parameters, under normal git CLI, this construct of a git inside a git that is not a submodule produces issues, both a warning and just generally doesn't work... (the submodules also have an extra submodules file...)

Is it reasonable for me to claim that for a solution to be correct it should pass both real examples for a submodule and a worktree? (I don't know how partial steps fwd look like, maybe at least one should work?)

@hatkyinc2
Copy link
Contributor

hatkyinc2 commented Sep 9, 2020

Well I guess if the whole context you had in mind is of manual usage of these switches you'd probably be right with this...
My mind is on the simple default use cases ...

Maybe that's step one 🤔

@jcouball
Copy link
Member Author

jcouball commented Sep 9, 2020

Is it reasonable for me to claim that for a solution to be correct it should pass both real examples for a submodule and a worktree? (I don't know how partial steps fwd look like, maybe at least one should work?)

I agree.

The tests in this PR were not meant to address the submodule / worktree issues you brought up in #488 and #489 . I was only trying to demonstrate that using -C in place of --work-tree and --git-dir won't work in all cases.

I have really enjoyed interacting with you on these PRs. I will get to the submodule / worktree issues but I just haven't had enough time to get to it yet. I have confidence that we will make it work.

I'd like to merge this PR to master because it tests the public interface as it currently exists and how it is currently documented. I think that is important. Would you 👍 this PR?

@hatkyinc2
Copy link
Contributor

I think I'm digesting in waves...
So why have all of these features/constructs, they aren't explained here, nor on git docs... Is it some legacy from before the new features where added to git?
When in a real world example should someone init a Git with a repository and/or index set? and how does he initially hear about it? only read in code? I guess my simplify PR came from, like what is all of this? I can't find any info about it..

@hatkyinc2
Copy link
Contributor

hatkyinc2 commented Sep 9, 2020

As a type of documentation for what currently exists it's an ok patch. 👍
I do wonder if simplifying and removing all of this would make sense... if it's needed would just allowing the equivalent of the switches directly make more sense? (didn't check if that's already there or not)..

hatkyinc2
hatkyinc2 previously approved these changes Sep 9, 2020
Copy link
Contributor

@hatkyinc2 hatkyinc2 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

tiny step fwd, document with tests what's there 🤷

@hatkyinc2
Copy link
Contributor

It might make sense to also document this somehow in words... like the README as part of this patch..

@jcouball
Copy link
Member Author

jcouball commented Sep 9, 2020

It might make sense to also document this

I've been wanting to add comprehensive Yard documentation but it doesn't seem to be top priority.

@jcouball
Copy link
Member Author

jcouball commented Sep 9, 2020

Ok, I think I see the issue with the test_worktree_branch_checkout_test. I believe it is telling me that there should not be a default for --git-dir. It should be an option that may be set (via the :repository option to open or init). However, if the repository option is not given, then --git-dir should not be passed to the resulting git command.

I'll have to test some. It seems like a lot of code relies on this value being set to something non-nil.

I think this will preserving existing functionality (allowing separate --work-tree and --git-dir settings) along with working trees and submodules.

I plan on making those changes in this PR.

@jcouball
Copy link
Member Author

jcouball commented Sep 9, 2020

I think I'm digesting in waves...
So why have all of these features/constructs, they aren't explained here, nor on git docs... Is it some legacy from before the new features where added to git?
When in a real world example should someone init a Git with a repository and/or index set? and how does he initially hear about it? only read in code? I guess my simplify PR came from, like what is all of this? I can't find any info about it..

One real world use I know of for having the --git-dir live outside the working tree is when you are managing bare repositories like you would on a SCM server. In this scenario, only the bare repository is stored on the server. If you want to use git to look at or make changes to one of bare repositories, you'd have to reconstitute a working directory like I have done in TestGitDir#test_git_dir_outside_work_tree.

Another real world example where this might prove useful is this: https://www.atlassian.com/git/tutorials/dotfiles

@hatkyinc2
Copy link
Contributor

hatkyinc2 commented Sep 9, 2020

OK, another layer of the git onion I guess... just when you think you half get it.

So basically I should think of what I know as a "git directory" as 3 components

  1. The bare git - what's in the .git, which is kinda a transaction log / raw ...
  2. My working directory, which starts as the up to date resulting representation, commonly the current file contents... which is updateable, and 'workable' as the 'work directory', which lets me start setting up changes ...
  3. ? The actual change area / change staging.... so this is where the index file and lockfile and possibly a few other items I don't yet understand come in... So I'm not sure what goes on here, but it's sort of another bare but 'local' to this worktree? or just the transactions since the main git directory / the diff between my local and the origin...

Does this kinda align to the git_dir, work_dir and index we try to manage in the Git object? (not sure about the 3rd part and what about other things that aren't the index, or what exactly is the index)


So for the example you mention and if my understanding is correct, could I not just have one directory setup as bare git,
and the worktree having a remote with a file pointer to that bare git?
Would that not just act the same but "be simpler" meaning would work with default git CLI commands without me managing these low level parameters of git_dir, work_dir and index pointer?
Or what understanding do I lack on the differences between what I describe and what you describe? or the simple usage with remote vs the advance usage with these paramters?

Or the main difference is that we actually share that one transaction log, anything I'd do doesn't need a push, it's just changed for everyone basically if we use some share fs thingy.

I think my main point for simplifying is not using these parameters if they weren't explicitly requested by the user, so if people don't use these advance setups of manually managing the special directories, the git shell command used shouldn't add them. So advance use cases should be the special use cases not the normal because they are versatile. I'm sure most users won't set these parameters. The common use of Git.open(dir).branch(master).checkout should equal as close to cd dir; git checkout master (or a better shell workdir option) or git -C dir checkout master, while currently it generates git --git-dir=xxx --work-tree=yyy -c color.ui=false checkout master (it's weird to me to delegate directory information to git CLI... like I don't do it when I work with the git CLI directly, I just change dir)


So basically the worktree feature extends the work directory feature which used to be 1:1, or allowed multiple with these advance parameter and just offers it in simpler commands? while adding central tracking I guess that didn't exist...

Signed-off-by: James Couball <jcouball@yahoo.com>
@jcouball
Copy link
Member Author

@hatkyinc2 with these changes, I believe this gets past the problem in #489 of calling Git.open. I do still get an error:

Error: test_worktree_bracnh_checkout_test(TestWorktree):
  Git::GitExecuteError: git '--git-dir=/private/tmp/git_test1599863666260/worktree/.git/worktrees/git_test1599863667217' '--work-tree=/tmp/git_test1599863667217' '-c' 'color.ui=false' checkout 'new_branch'  2>&1:error: The following untracked working tree files would be overwritten by checkout:
  	ex_dir/ex.txt
  	example.txt
  	scott/newfile
  	scott/text.txt
  Please move or remove them before you switch branches.
  Aborting

@hatkyinc2
Copy link
Contributor

I've had these errors as well plenty..
I think they are related to the parts I don't understand with how git is handling the 'staging area'... somehow related to the .git/index I believe...
When I looked into these it's a silly situation, when all files are marked change, but if you do git add there is 0 changes... so it's something related to the reference data...

@jcouball
Copy link
Member Author

So basically I should think of what I know as a "git directory" as 3 components

Here is how I think of it:

working tree
The tree of actual checked out files. The working tree normally contains the contents of the HEAD of the branch you have checked out plus any local changes that you have made but not yet committed.

repository
Also called bare repository. A bare repository contains all of the Git administrative and control files that would normally be present in the hidden .git sub-directory. This includes a database of all versions of all files, all branches and tags, a log of all commits, etc. in a repository.

This is, for example, what GitHub stores on it's servers. When you git clone from GitHub, you are copying the bare repository to your computer and then creating a working copy by checking out a version of the work tree.

Many companies use Git internally and serve the bare repositories via ssh. This is probably a big use case of using --git-dir without a working tree.

index
The index is where you stage changes that you made in the working copy (using git add) in order to commit them to the respository.

@hatkyinc2
Copy link
Contributor

hatkyinc2 commented Sep 11, 2020

is the index fully self contained file?
I've had errors referencing a lock file..
Also when I do git add, a commit hash is created, that's somehow separated from the main repository?... or is it at that point already totally equal to my local repository even without a commit?

or is index concept actually a directory that looks a bit like a repository?

@jcouball
Copy link
Member Author

@hatkyinc2 do you Slack? I just created ruby-git.slack.com. Join if you want to chat.

@hatkyinc2
Copy link
Contributor

can't join that slack, can you invite me to my main email, without the . in it

@hatkyinc2
Copy link
Contributor

Should the git commands send some --index of sorts? like we make a fuss about calculating it, but I didn't see it used..

@jcouball
Copy link
Member Author

I lost my working copy of this change so I created PR #499 for this change.

@jcouball jcouball closed this Dec 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants