pygettext: Add `--omit-header` option #130647

StanFromIreland · 2025-02-27T18:50:33Z

Feature or enhancement

Proposal:

From gettext:

> `--omit-header`
>
> Don't write header with `msgid ""` entry. Note: Using this option may lead to an error in subsequent operations if the output contains non-ASCII characters.
>
> This is useful for testing purposes because it eliminates a source of variance for generated .gmo files. With --omit-header, two invocations of xgettext on the same files with the same options at different times are guaranteed to produce the same results.
>
> Note that using this option will lead to an error if the resulting file would not entirely be in ASCII.

Will be useful for our tests. PR soon

Has this already been discussed elsewhere?

This is a minor feature, which does not need previous discussion elsewhere

Links to previous discussion of this feature:

No response

Linked PRs

gh-130647: Add --omit-header option to pygettext #130650

tomasr8 · 2025-02-27T19:19:45Z

This flag is universally useful but to expand on this:

> Will be useful for our tests. PR soon

We currently have i18n tests for argparse, optparse and getopt. The snapshots are in Lib/test/translationdata.
The current format of the snapshots is just a list of msgids, which is not great because it doesn't include for example msgid_plural or msgctxt.

With --omit-header (and --no-location) we can use .po files for the snapshots instead. With the header removed, the snapshots will not change unless the source strings change.

encukou · 2025-02-28T09:44:33Z

How important is it to match xgettext behaviour?
A new user-visible option adds some maintenance overhead. This one doesn't look useful to users.
I haven't seen the tests this is useful for, but I imagine we'd still want to test the behaviour without --omit-header.

Would it suffice to patch time.strftime in the tests? Or delete the line?
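For illustration, patching `time.strftime` in a test could look roughly like the sketch below. `render_header` is a hypothetical stand-in for whatever code writes the `POT-Creation-Date` line, and the format string and fixed date are assumptions, not taken from pygettext itself.

```python
import time
from unittest import mock


def render_header():
    # Hypothetical stand-in for the extractor's header writer.
    timestamp = time.strftime('%Y-%m-%d %H:%M%z')
    return f'"POT-Creation-Date: {timestamp}\\n"'


def render_header_frozen(fixed='2000-01-01 00:00+0000'):
    # Replace time.strftime for the duration of the call so the
    # header no longer depends on the wall clock.
    with mock.patch('time.strftime', return_value=fixed):
        return render_header()
```

This keeps the header in the output, so its other fields are still covered by the snapshot; only the date is frozen.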

StanFromIreland · 2025-02-28T17:27:06Z

It's not about matching behavior; there are tens of missing options.

It can be used by users (smaller mo files I guess would be a use case), but it is mostly for tests.

Almost everything is possible, but .mo files are binary, so removing the ~15 header lines (as they appear in the .pot, and we can't be sure exactly what they contain) from the file after it's been compiled will not be simple. Adding the option, on the other hand, is much simpler.

tomasr8 · 2025-02-28T21:02:18Z

> How important is it to match xgettext behaviour?

If you're worried about adding lots of CLI options, I actually don't want to add any more options (maybe besides setting the input/output encoding but I want to do more research on that). I'm mostly concerned with making pygettext work correctly and fixing the existing options.

I proposed to add this option for a few reasons which I should've elaborated more on. First, it's useful when you often deal with .po files (part of my work is managing translations for our project) - you often want to have predictable output from the extractor and don't really care about the header. --omit-header lets you easily do that. Code search on GH also reveals quite a few instances of xgettext and pybabel being called with --omit-header inside Python scripts so it indeed seems to be a relatively common thing.

> I haven't seen the tests this is useful for, but I imagine we'd still want to test the behaviour without --omit-header.

Yes, in fact for pygettext tests we already patch the date in an ugly way using regex. My comment was about tests that merely use pygettext to extract strings. For example, there is a test for argparse that does this (and optparse and getopt). Currently, we use a not-so-great format where we just dump the msgids in a text file which has its own issues. Having --omit-header would allow us to do this in a nice way.

> A new user-visible option adds some maintenance overhead.

I think the maintenance overhead is quite small for such a simple option, but I don't want to presume too much, you obviously have more experience in that regard 🙂
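As a rough illustration of the regex-based date patching mentioned above, a snapshot test could normalize the creation date before comparing. This is a sketch, not the code in the CPython test suite: `normalize_pot` and the placeholder date are hypothetical names chosen for the example.

```python
import re

# Matches a whole header line such as
#   "POT-Creation-Date: 2025-02-27 18:50+0000\n"
# where the trailing \n is the literal two-character escape in the .pot file.
DATE_RE = re.compile(r'^"POT-Creation-Date: .*\\n"$', re.MULTILINE)


def normalize_pot(text, placeholder='2000-01-01 00:00+0000'):
    # Hypothetical helper: swap the real creation date for a fixed one.
    replacement = '"POT-Creation-Date: ' + placeholder + '\\n"'
    # A callable avoids backslash processing in the replacement string.
    return DATE_RE.sub(lambda match: replacement, text)
```

Two extractions of the same sources made at different times should then compare equal after normalization, at the cost of every test that reads a .pot file having to remember to call the helper.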

encukou · 2025-03-03T11:40:04Z

That makes sense. Thanks for elaborating!

encukou · 2025-04-30T16:03:23Z

Discussing this a bit more:

> Note that using this option will lead to an error if the resulting file would not entirely be in ASCII.

This limitation would make the option unsuitable for our tests.
It's probably better to do something other than what gettext does.

Would it make sense to support SOURCE_DATE_EPOCH instead? It's a pretty standard way to "freeze" the date for all kinds of build tools.
There was opposition to it in gettext, but the arguments I could find seem outdated.
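A minimal sketch of what honoring SOURCE_DATE_EPOCH could look like, assuming the tool formats its creation date with `time.strftime`. `creation_timestamp` is a hypothetical helper, not an existing pygettext function; the variable is read as an integer number of seconds since the Unix epoch, in UTC.

```python
import os
import time


def creation_timestamp(environ=os.environ):
    # Hypothetical helper: pick the timestamp for the POT-Creation-Date field.
    epoch = environ.get('SOURCE_DATE_EPOCH')
    if epoch is not None:
        # Frozen build date: interpret the value as UTC seconds.
        return time.strftime('%Y-%m-%d %H:%M+0000', time.gmtime(int(epoch)))
    # Default: current local time with its UTC offset.
    return time.strftime('%Y-%m-%d %H:%M%z')
```

With this approach the header is still written, so non-ASCII output keeps its charset declaration, and a test only needs to set the environment variable to get stable output.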

AA-Turner · 2025-04-30T16:09:28Z

Seconding SOURCE_DATE_EPOCH, it seems to make the most sense to me.

A

StanFromIreland added the type-feature A feature request or enhancement label Feb 27, 2025

bedevere-app bot mentioned this issue Feb 27, 2025

gh-130647: Add --omit-header option to pygettext #130650

Open

encukou added this to Gettext issues Feb 28, 2025

picnixz added the triaged The issue has been accepted as valid by a triager. label Feb 28, 2025
