Skip to content

Stricter object dependency checking during creation #3633

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 9 commits into from
Mar 1, 2016

Conversation

ethomson
Copy link
Member

At present, we are very loose about the creation of objects that are based on other objects; for example, when calling git_commit_create, we do not check that the dependent objects (its tree, its commits) are actually valid objects, or that they are the requisite type.

This PR adds a new option GIT_OPT_ENABLE_STRICT_OBJECT_CREATION, that will turn on validation of the referenced objects: that they exist, and that they are the required type.

This validates:

  1. git_commit_create: that the tree is a tree, and exists, and that the parents are commits and exist
  2. git_index_add: that the index entry object exists and is of the specified type, and
  3. git_treebuilder_insert: that the specified entry exists and is of the specified type

These checks are still optional, and default to off, for performance.

/cc @vmg

@ethomson
Copy link
Member Author

Questions:

  1. Are we cool to do this at the git_libgit2_opts level? Do we ever want to only validate these things sometimes?
  2. Ref creation already validates the targets. Is there any reason that validation shouldn't get folded into this logic? (I actually suspect that most people are driving ref creation based on bindings where they have a pretty good idea that the objects they're pointing to exist, and that this check, by default, is probably not super helpful.)
  3. Is there a less terrible name for this option?
  4. Should this get turned on by default?

bool valid = true;

if (git_object__strict_input_validation) {
if (git_object_lookup(&obj, repo, id, type) < 0)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we want to maybe git_odb_read_header instead, to check for the type? Seems much more cheap than a full object lookup.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, that's true, but what would be really nice is to have an object_lookup_type or something that would use the cached object if it was in the cache and then did a odb_read_header if it wasn't. One sec.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Okay, I did indeed add git_object__lookup_type which will use the cache when possible and then go to git_odb_read_header when the requested object is not in the cache. Thanks, @vmg !

Edward Thomson added 4 commits February 28, 2016 12:38
When `GIT_OPT_ENABLE_STRICT_OBJECT_CREATION` is turned on, validate
the tree and parent ids given to commit creation functions.
When `GIT_OPT_ENABLE_STRICT_OBJECT_CREATION` is turned on, validate
the tree and parent ids given to treebuilder insertion.
Edward Thomson added 5 commits February 28, 2016 18:54
When `GIT_OPT_ENABLE_STRICT_OBJECT_CREATION` is turned on, validate
the index entries given to `git_index_add`.
This allows lighter weight validation in `git_object__is_valid` that
does not require reading the entire object.
Use legitimate (existing) object IDs in tests so that we have the
ability to turn on strict object validation when running tests.
@ethomson
Copy link
Member Author

Okay, after some thought, I think that it should be turned on by default. I think that most consumers are willing to sacrifice a quick lookup for safety. Those of us that are in an environment where you're confident about the dependencies of the objects that you're writing then you can turn this setting off. As a result, I pulled reference creation target validation into the same check.

A lot of tests assumed that they could just spam random (or cleverly named) object IDs into the index. I updated these tests to use legitimate object IDs so that we can turn this off by default.

Any objections to this being on by default?

/cc @vmg @carlosmn

ethomson pushed a commit that referenced this pull request Mar 1, 2016
Stricter object dependency checking during creation
@ethomson ethomson merged commit edaffe2 into libgit2:master Mar 1, 2016
ethomson added a commit that referenced this pull request Dec 31, 2017
The documentation for `git_treebuilder_insert` erroneously states that
we do not validate that the entry being inserted exists.  We do, as of
#3633.  Update the documentation
to reflect the new reality.
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants