
string.Formatter.parse does not handle auto-numbered positional fields #89867

Open
SDesch mannequin opened this issue Nov 3, 2021 · 11 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@SDesch
Mannequin

SDesch mannequin commented Nov 3, 2021

BPO 45704
Nosy @ericvsmith

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2021-11-03.11:55:47.784>
labels = ['type-bug', '3.8', '3.9', '3.10', '3.11', '3.7']
title = 'string.Formatter.parse does not handle auto-numbered positional fields'
updated_at = <Date 2021-11-05.18:06:11.745>
user = 'https://bugs.python.org/SDesch'

bugs.python.org fields:

activity = <Date 2021-11-05.18:06:11.745>
actor = 'eric.smith'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = []
creation = <Date 2021-11-03.11:55:47.784>
creator = 'SDesch'
dependencies = []
files = []
hgrepos = []
issue_num = 45704
keywords = []
message_count = 9.0
messages = ['405610', '405619', '405624', '405627', '405636', '405757', '405796', '405802', '405814']
nosy_count = 2.0
nosy_names = ['eric.smith', 'SDesch']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue45704'
versions = ['Python 3.6', 'Python 3.7', 'Python 3.8', 'Python 3.9', 'Python 3.10', 'Python 3.11']


@SDesch
Mannequin Author

SDesch mannequin commented Nov 3, 2021

It appears that when auto-numbered positional fields were added in Python 3.1, Formatter.parse was not updated to handle them. It currently returns an empty string as the field name.

list(Formatter().parse('hello {}'))  # [('hello ', '', '', None)]

This does not align with Formatter.get_field which according to the docs: "Given field_name as returned by parse() (see above), convert it to an object to be formatted."

When supplying an empty string to .get_field() you get a KeyError

Formatter().get_field("", [1, 2, 3], {})  # raises KeyError
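
For illustration, the two snippets above can be combined into one self-contained sketch that feeds the field name returned by parse() straight into get_field(). The variable names are only for this example.

```python
from string import Formatter

fmt = Formatter()

# parse() yields (literal_text, field_name, format_spec, conversion) tuples.
parsed = list(fmt.parse('hello {}'))
literal_text, field_name, format_spec, conversion = parsed[0]

# The auto-numbered field comes back with an empty field name ...
# ... and get_field() treats '' as a keyword lookup, which fails.
try:
    fmt.get_field(field_name, [1, 2, 3], {})
    error = None
except KeyError as exc:
    error = exc
```

The empty name is looked up in the keyword arguments rather than the positional ones, which is why the failure is a KeyError and not an IndexError.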

@SDesch SDesch mannequin added 3.7 (EOL) end of life 3.8 (EOL) end of life 3.9 only security fixes 3.10 only security fixes 3.11 only security fixes type-bug An unexpected behavior, bug, or error labels Nov 3, 2021
@ericvsmith
Member

For reference, the documentation is at https://docs.python.org/3/library/string.html#custom-string-formatting

I guess in your example it should return:
[('hello ', '0', '', None)]

@SDesch
Mannequin Author

SDesch mannequin commented Nov 3, 2021

Yes, it should return a string containing the index of the positional argument, i.e. "0", so that it is compatible with .get_field(). Side note: It's somewhat weird that .get_field expects a string while .get_value expects an int for positional arguments.

@ericvsmith
Member

Side note: It's somewhat weird that .get_field expects a string while .get_value expects an int for positional arguments.

.parse is just concerned with parsing, so it works on and returns strings. .get_field takes strings because it is the thing that's trying to determine whether or not a field name looks like an integer. At least that's how I remember it.
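
A small sketch of the string/int split described here, using only the documented get_field() and get_value() methods. The sample arguments are made up for the example.

```python
from string import Formatter

fmt = Formatter()
args, kwargs = ['a', 'b', 'c'], {'name': 'x'}

# get_field() takes the field name as a string and decides whether it is
# an integer index or a keyword; it returns (object, key actually used).
positional = fmt.get_field('1', args, kwargs)
keyword = fmt.get_field('name', args, kwargs)

# get_value() expects that decision to be made already: an int indexes
# args, a str indexes kwargs.
direct = fmt.get_value(1, args, kwargs)

# Passing the string '1' to get_value() is a keyword lookup and fails.
try:
    fmt.get_value('1', args, kwargs)
    lookup_error = None
except KeyError as exc:
    lookup_error = exc
```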

@SDesch
Mannequin Author

SDesch mannequin commented Nov 3, 2021

Another thing that occurred to me is the question of what .parse() should do when a mix of auto-numbered and manually numbered fields is supplied e.g. {}{1}. As of now .parse() happily processes such inputs and some other piece of code deals with this and ultimately raises an exception that mixing manual with automatic numbering is not allowed. If .parse() supported automatic numbering it would have to be aware of this too I guess?
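
To make the current behaviour concrete, here is a minimal sketch: parse() accepts the mixed template without complaint, and the error only appears once the template is actually formatted. The exact wording of the error message is deliberately not checked here.

```python
from string import Formatter

fmt = Formatter()
template = '{}{1}'

# parse() reports what is in the string and does not validate numbering.
tokens = list(fmt.parse(template))

# The mix of automatic and manual numbering is rejected during formatting.
try:
    fmt.format(template, 'a', 'b')
    mixing_error = None
except ValueError as exc:
    mixing_error = exc
```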

@ericvsmith
Member

The more I think about this, the more I think it's not .parse's job to fill in the field numbers, it's the job of whoever is calling it.

Just as it's not .parse's job to give you an error if you switch back and forth between numbered and un-numbered fields.

It's literally just telling you what's in the string as it breaks it apart, not assigning any further meaning to the parts. I guess I should have called it .lex, not .parse.
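
As a rough illustration of the "lexer" view, the sketch below shows that parse() only splits the string into its parts. The field name keeps its attribute access as plain text, and a nested replacement field inside a format spec is returned raw rather than parsed.

```python
from string import Formatter

fmt = Formatter()

# Field name, conversion and format spec are returned as written.
tokens = list(fmt.parse('{0.name!r:>10} end'))

# A nested field in the format spec is left untouched inside the spec string.
nested = list(fmt.parse('{:{}}'))
```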

@SDesch
Mannequin Author

SDesch mannequin commented Nov 5, 2021

That definition of .parse() definitely makes sense. Do you then think this is out of scope for Formatter in general or just for .parse()? Just for reference, this is what I currently use to get automatic numbering to work for my use case.

from string import Formatter

def parse_command_template(format_string):
    # Wraps Formatter.parse() and replaces empty field names with '0', '1', ...

    auto_numbering_error = ValueError(
        'cannot switch from automatic field numbering to manual field specification')

    index = 0
    # None until the first positional field decides the numbering mode.
    auto_numbering = None

    for literal_text, field_name, spec, conversion in Formatter().parse(format_string):
        if field_name is not None:
            # Manually numbered field such as '{0}'.
            if field_name.isdigit():
                if auto_numbering is True:
                    raise auto_numbering_error
                auto_numbering = False

            # Auto-numbered field '{}': assign the next index.
            if field_name == '':
                if auto_numbering is False:
                    raise auto_numbering_error
                auto_numbering = True
                field_name = str(index)
                index += 1

        yield literal_text, field_name, spec, conversion

@ericvsmith
Member

I think your code is rational. But since string.Formatter gets so little use, I'm not sure it's worth adding this to the stdlib. On the other hand, it could be used internally by string.Formatter.

We'd need to pick a better name, though. And maybe it should return the field_name as an int.

@ericvsmith
Member

That is, return field_name as an int if it's an int, otherwise as a string.
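
A minimal sketch of what that could look like, assuming a hypothetical wrapper named parse_typed (no such function exists in the stdlib). It leaves empty and non-numeric field names as strings and does not attempt auto-numbering.

```python
from string import Formatter

def parse_typed(format_string):
    """Hypothetical wrapper: like Formatter.parse(), but integer field
    names are yielded as int and everything else stays a str."""
    for literal_text, field_name, format_spec, conversion in Formatter().parse(format_string):
        # isascii() guards against digit-like characters that int() rejects.
        if field_name is not None and field_name.isascii() and field_name.isdigit():
            field_name = int(field_name)
        yield literal_text, field_name, format_spec, conversion

typed = list(parse_typed('{0} {name} {}'))
```

With this shape, the resulting field name could be handed to get_value() directly for simple fields, though compound names such as '0.attr' would still need get_field().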

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
@iritkatriel iritkatriel added the interpreter-core (Objects, Python, Grammar, and Parser dirs) label Nov 28, 2023
@dg-pb
Contributor

dg-pb commented Jan 31, 2025

The more I think about this, the more I think it's not .parse's job to fill in the field numbers, it's the job of whoever is calling it.

I agree with this. .parse correctly returns an empty string for an empty field name. This is factorised well, providing a one-to-one relationship between input and output.


However, the main issue is that auto-numbering inside .parse cannot work correctly for nested fields.

When parsing only one level, the suggested change to parse would return correct results, e.g.:

list(string.Formatter().parse('{}{}'))
[('', '0', '', None), ('', '1', '', None)]

However, it would not return things correctly for:

list(string.Formatter().parse('{:.{}f}{}'))
[('', '0', '.{}f', None), ('', '1', '', None)]

The equivalent format string with manually numbered fields would look like:

fmt = '{0:.{1}f}{2}'
# And correct parse output should be:
list(string.Formatter().parse('{:.{}f}{}'))
[('', '0', '.{}f', None), ('', '2', '', None)]

Auto-numbering is very easy to do outside of .parse, especially for the flat case. I suggest closing this as "Not planned".
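
For the nested case, one possible sketch of doing the numbering outside of .parse is shown below. The helper name explicit_numbering and its _counter parameter are made up for this example. It only handles fields whose name is completely empty and does not check for mixing of manual and automatic numbering.

```python
from string import Formatter

def explicit_numbering(format_string, _counter=None):
    """Hypothetical helper: rewrite '{}' fields as '{0}', '{1}', ...,
    recursing into format specs so nested fields are numbered too."""
    counter = [0] if _counter is None else _counter  # shared across recursion
    parts = []
    for literal, field_name, spec, conversion in Formatter().parse(format_string):
        # parse() unescapes '{{' and '}}', so escape them again when rebuilding.
        parts.append(literal.replace('{', '{{').replace('}', '}}'))
        if field_name is None:
            continue  # literal text only, no replacement field
        if field_name == '':
            field_name = str(counter[0])
            counter[0] += 1
        # The outer field is numbered first, then the fields nested in its spec.
        spec = explicit_numbering(spec, counter)
        field = field_name
        if conversion:
            field += '!' + conversion
        if spec:
            field += ':' + spec
        parts.append('{' + field + '}')
    return ''.join(parts)

numbered = explicit_numbering('{:.{}f}{}')
```

For the template discussed above, this produces '{0:.{1}f}{2}', which formats the same arguments to the same result as the auto-numbered original.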

@encukou
Member

encukou commented Feb 3, 2025

That's reasonable! Looks like all that's left is to mention this in the parse/get_field docs.

@picnixz picnixz removed interpreter-core (Objects, Python, Grammar, and Parser dirs) 3.11 only security fixes 3.10 only security fixes labels Feb 3, 2025
@picnixz picnixz added stdlib Python modules in the Lib dir and removed 3.9 only security fixes 3.8 (EOL) end of life 3.7 (EOL) end of life labels Feb 3, 2025