Don't remove quotes if \ or " are present inside #2048

Open
wants to merge 3 commits into base: main

Conversation

EliahKagan
Member

@EliahKagan EliahKagan commented Jun 8, 2025

Background

#2035 fixed issue #1923, where the ConfigParser would not remove the quotes around single-line values. As discussed in comments there:

  • That improved the common cases where quote removal is all that is needed, in particular where no escape sequences are present.
  • When there are escape sequences within a single-line quoted value, they were already not handled correctly: before #2035, no single-line quoted value (other than a quoted empty string) was handled correctly, because quote removal was never performed for them.
  • Nonetheless, #2035 may have made the situation worse for some applications if they handled the quoted values themselves, or if the quoted values were more readily recognized to mean parsing had not succeeded.

Let's take the best of both worlds (so far)

This PR keeps the changes from #2035 in the cases where they work, namely when the text strictly between the opening and closing " characters contains neither a \ nor another ". This both:

  • Preserves the benefit of #2035 for #1923. (The only exception is cases where a \ is meant to be preserved rather than treated as an escape character. This is presumably rare--if it ever happens--since that's not the syntax of double-quoted values in Git config files.)
  • Keeps the old, potentially safer behavior of doing no transformations, not even quote removal, in cases where quote removal alone would clearly not produce the correct result.
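The rule described above can be sketched as a standalone function. This is only an illustration: `strip_simple_quotes` is a hypothetical name, not a function in GitPython, and it ignores everything else the config parser does (such as multi-line values).

```python
def strip_simple_quotes(optval: str) -> str:
    """Hypothetical helper illustrating the rule; not GitPython's actual code."""
    # Not a "..."-enclosed value at all: leave it unchanged.
    if len(optval) < 2 or optval[0] != '"' or optval[-1] != '"':
        return optval

    inner = optval[1:-1]

    # A backslash or another quote inside means quote removal alone would
    # give a wrong result, so keep the old behavior and return it as-is.
    if "\\" in inner or '"' in inner:
        return optval

    # Simple case: removing the outer quotes is all that is needed.
    return inner
```

With this sketch, `"hello world"` loses its quotes, while a value such as `"a\\b"` or `"a"b"` is returned with its quotes intact.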

Changes

  1. c8e4aa0 refactors to prepare for the other changes.
  2. 7bcea08 adds a test for the behavior described above.
  3. f2b8041 makes those tests pass.

But it can get better than this

This is not intended as a long-term alternative to parsing escape sequences. The idea in #2035 (comment) of handling them is good, and this is not meant to discourage or interfere with that. The new test fixture and test can be modified accordingly. See the docstring and comments in test_config_with_quotes_containing_escapes.

For review

It seems to me that the idea here is sound, since it restores the main branch to a state where no changes are expected to produce problems for programs and libraries that use GitPython, if a patch release were to be made.

But even if I am right to think that, there are a few reasons it may be useful to have a review here before merging:

  • I'm less familiar with the config parser than most of the rest of the code of GitPython, so there could be design subtleties I am missing.
  • There are multiple reasonable ways to make this change, and you may have input, stylistically or otherwise.
  • It's easy to introduce bugs when making nontrivial (even if small) changes to parsing logic.

(This follows #2046 and #2047, which followed #2035 and #2036.)

This refactors ConfigParser double-quote parsing near the
single-line double-quoted value parsing code, so that:

- Code that parses the name is less intermixed with code that
  parses the value.

- Conditional logic is less duplicated.

- The `END` comment notation appears next to the code it describes.

- The final `else` can be turned into one or more `elif` followed
  by `else` to cover different cases of `"..."` differently. (But
  those are not added here. This commit is purely a refactoring.)

(The `pass` suite when `len(optval) < 2 or optval[0] != '"'` is
awkward and not really justified right now, but it looks like it
may be able to help with readability and help keep nesting down
when new `elif` cases are added.)

These are cases where just removing the outer quotes without doing
anything to the text inside does not give the correct result, and
where keeping the quotes may be preferable, in that it was the
long-standing behavior of `GitConfigParser`.

That this was the long-standing behavior may justify bringing it
back when the `"`-`"`-enclosed text contains such characters, but
it does not justify preserving it indefinitely: it will still be
better to parse the escape sequences, at least in the typical case
that all of them in a value's representation are well-formed.

This is for single-line quoting in the ConfigParser.

This leaves the changes in gitpython-developers#2035 (as adjusted in gitpython-developers#2036) intact for
the cases where it addressed gitpython-developers#1923: when the `...` in `"..."`
(appearing in the value position on a single `{name} = {value}`
line) has no occurrences of `\` or `"`, quote removal is enough.

But when `\` or `"` does appear, this suppresses quote removal.
This is with the idea that, while it would be better to interpret
such lines as Git does, we do not yet do that, so it is preferable
to return the same results we have in the past (which some programs
may already be handling themselves).

This should make the test introduced in the preceding commit pass.
But it will be even better to support more syntax, at least
well-formed escapes. As noted in the test, both the test and the
code under test can be adjusted for that.

(See comments in gitpython-developers#2035 for context.)
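
As a rough idea of what supporting well-formed escapes might later look like (this PR does not do it), here is a minimal sketch. It assumes the escape set given in the git-config documentation for double-quoted values: `\\`, `\"`, `\n`, `\t`, and `\b`. The name `unescape_quoted` and the choice to return `None` for malformed input are hypothetical, and the sketch only covers the text strictly between the outer quotes.

```python
from typing import Optional

# Escapes the git-config documentation lists as valid inside double quotes.
_ESCAPES = {"\\": "\\", '"': '"', "n": "\n", "t": "\t", "b": "\b"}


def unescape_quoted(inner: str) -> Optional[str]:
    """Hypothetical sketch: unescape the text between the outer quotes.

    Returns None if the text has an unescaped quote or a malformed escape,
    so a caller could fall back to leaving the value untouched.
    """
    out = []
    chars = iter(inner)
    for ch in chars:
        if ch == '"':
            return None  # Unescaped quote inside the quoted text.
        if ch != "\\":
            out.append(ch)
            continue
        nxt = next(chars, None)  # Character after the backslash, if any.
        if nxt not in _ESCAPES:
            return None  # Unknown escape or trailing backslash.
        out.append(_ESCAPES[nxt])
    return "".join(out)
```

A caller could try this on the inner text and keep the original quoted value whenever `None` comes back, which would preserve the conservative behavior for anything not well-formed.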
@EliahKagan EliahKagan marked this pull request as ready for review June 8, 2025 05:40
@EliahKagan EliahKagan requested a review from Byron June 8, 2025 05:40