-
Notifications
You must be signed in to change notification settings - Fork 788
*: Initial support for opening sha256 repositories using ObjectID
#1527
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: v6-transport
Are you sure you want to change the base?
Conversation
Signed-off-by: Paulo Gomes <paulo.gomes@suse.com>
Signed-off-by: Paulo Gomes <paulo.gomes@suse.com>
Signed-off-by: Paulo Gomes <paulo.gomes@suse.com>
The support for SHA256 will require the mapping between ObjectFormat and the correct hash.Hash to be used. Signed-off-by: Paulo Gomes <paulo.gomes@suse.com>
The introduction of both new types are a stepping stone to enable SHA256 support concurrently - without the need for a build tag. ImmutableHash provides a way to represent varied sized hashes (for SHA1 or SHA256) with a single type, while keeping similar performance of the existing plumbing.Hash. The names were picked in order to provide a clearer distinction on what they do. ImmutableHash is the result of a hash operation and can't be changed once calculated. ObjectHasher, adds the Git object header before a normal hash operation that hash.Hash or plumbing/hash.Hash do. The performance results shows that SHA1 operations could be slower, however for SHA256 it can be over 3 times faster: ~Hasher/hasher-sha1-100B-16 2202226 574.7 ns/op 174.00 MB/s 272 B/op 7 allocs/op ~Hasher/objecthash-sha1-100B-16 1511851 772.6 ns/op 129.43 MB/s 272 B/op 9 allocs/op ~Hasher/objecthash-sha256-100B-16 5057584 247.4 ns/op 404.21 MB/s 96 B/op 7 allocs/op ~Hasher/hasher-sha1-5000B-16 96476 12523 ns/op 399.25 MB/s 5536 B/op 7 allocs/op ~Hasher/objecthash-sha1-5000B-16 105376 10983 ns/op 455.27 MB/s 272 B/op 9 allocs/op ~Hasher/objecthash-sha256-5000B-16 420741 2828 ns/op 1767.77 MB/s 96 B/op 7 allocs/op ~HashFromHex/hasher-parse-sha1-valid-16 23243548 64.65 ns/op 618.69 MB/s 48 B/op 1 allocs/op ~HashFromHex/objecthash-fromhex-sha1-valid-16 18120699 79.62 ns/op 502.36 MB/s 72 B/op 2 allocs/op ~HashFromHex/objecthash-fromhex-sha256-valid-16 8969871 124.2 ns/op 515.22 MB/s 96 B/op 2 allocs/op Signed-off-by: Paulo Gomes <paulo.gomes@suse.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
The binary.ReadHash is now replaced with ReadFrom on the ObjectID struct. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
With the addition of multi-hash support, there is no longer need to have a sha256 specific test that relies on build tags. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
The new ObjectID replaces plumbing.Hash and supports both SHA1 and SHA256. The backing array is sized after SHA256, resulting in a larger memory footprint on a per hash basis. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
The previous example of Blame, started from a clone, instead of opening an existing repository. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The plumbing/hash
package feels redundant and can be replaced with Go's crypto
and hash
packages.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The only reason why plumbing.hash
needs to exist currently is to support hash.RegisterHash
within the explicit context of go-git. If we were to remove that, and lean on upstream's instead, then we would need to rethink on the UX.
I'm not sure we should force users to replace their sha1
implementation across their app, if they only want to use it in the context of go-git
.
IIRC, upstream had a recent performance optimisations whereby they use both hashes: sha1cd
when interacting with remote repositories, and sha1
for all local operations.
This feels like an orthogonal topic which could be dealt with as a separate issue/PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Actually, don't we already do that when sha1cd
gets imported? It replaces upstream's sha1
in the registry.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You are absolutely right. That will need to change upstream depending on the approach we take.
I'll make a few tests around this, which may result in either the removal of plumbing/hash
or the landing of the hybrid approach (if the performance gains justifies it).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removal of the upstream behaviour will be handled separately, the same applies to the changes in go-git
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Perhaps we can remove plumbing/hash
and combine the implementations of sha1 and sha256 hasher and object id. Also, instead of using interfaces for ObjectHasher
, we can use a concrete type *ObjectHasher
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I've opened pjbgf#27 against your branch 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Removal of the upstream behaviour will be handled separately, the same applies to the changes in
go-git
.
I think that's fine and is expected. Ideally, go-git shouldn't directly import crypto hashers to avoid overriding the user's registered hashers. Users should be able to use custom hashers using the crypto
register and interface, while go-git could provide a default import for sha1cd
and crypto/sha256
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added your commits into the PR, PTAL.
There are some additional changes required around hashing and potentially the removal of our plumbing/hash
, but I'd rather do that in a follow-up PR.
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
ObjectID differs from the previous Hash as it has a new format field. For the purposes for equality checking, both values 'sha1' and '' should be perceived as the same, so to avoid cases where the equality check is not being done internally, the value 'sha1' is no longer used. The field format should only be populated when its value is 'sha256'. Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
This commit changes the ObjectFormat type from a string to an int. This would make a default ObjectFormat declaration default to SHA1. A string function is defined to return the string representation of the ObjectFormat.
This combines the ObjectHasher implementation to be hash function agnostic. The SHA1 and SHA256 implementations are now combined into a single implementation that uses the given format to determine the hash function to use.
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Users can opt-out of sha1cd by adding an empty import for the upstream implementation: _ "crypto/sha1" Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Signed-off-by: Paulo Gomes <pjbgf@linux.com>
Introduces a new
plumbing.ObjectID
which supports either SHA1 and SHA256 object formats, removing the need of using build tags for the sha256 support.Follow-up PRs will expand support for other git features.
Relates to #899 #706.