Skip to content

Freeze the encodings module. #89816

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
ericsnowcurrently opened this issue Oct 28, 2021 · 8 comments
Closed

Freeze the encodings module. #89816

ericsnowcurrently opened this issue Oct 28, 2021 · 8 comments
Labels
3.11 only security fixes build The build process and cross-build type-bug An unexpected behavior, bug, or error

Comments

@ericsnowcurrently
Copy link
Member

BPO 45653
Nosy @malemburg, @gvanrossum, @tiran, @ericsnowcurrently, @FFY00, @kumaraditya303
PRs
  • bpo-45653: partially freeze encodings package #29331
  • bpo-45653: Freeze encodings package modules #29788
  • bpo-45653: fix test_embed on windows #29814
  • bpo-45653: Freeze parts of the encodings package #30030
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2021-10-28.18:11:52.435>
    labels = ['type-bug', 'build', '3.11']
    title = 'Freeze the encodings module.'
    updated_at = <Date 2021-12-13.18:01:38.922>
    user = 'https://github.com/ericsnowcurrently'

    bugs.python.org fields:

    activity = <Date 2021-12-13.18:01:38.922>
    actor = 'christian.heimes'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Build']
    creation = <Date 2021-10-28.18:11:52.435>
    creator = 'eric.snow'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 45653
    keywords = ['patch']
    message_count = 8.0
    messages = ['405211', '405213', '405247', '405372', '405383', '405390', '407323', '408474']
    nosy_count = 6.0
    nosy_names = ['lemburg', 'gvanrossum', 'christian.heimes', 'eric.snow', 'FFY00', 'kumaraditya']
    pr_nums = ['29331', '29788', '29814', '30030']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue45653'
    versions = ['Python 3.11']

    @ericsnowcurrently
    Copy link
    Member Author

    Currently we freeze all the modules imported during runtime initialization, except for the encodings module. It has a lot of submodules and this results in a lot of extra noise in builds. We hadn't frozen it yet because we were still ironing out changes related to frozen modules and the extra noise was a pain. We also waited because we weren't sure if we should freeze all the submodules or just the most likely ones to be used during startup. In the case of the latter, we were also blocked on having __path__ set on the package.

    At this point there are no blockers. So we should freeze the encodings modules with either all submodules or the most commonly used subset.

    @ericsnowcurrently ericsnowcurrently added 3.11 only security fixes build The build process and cross-build type-bug An unexpected behavior, bug, or error labels Oct 28, 2021
    @malemburg
    Copy link
    Member

    encodings is a package. I think you first have to check whether mixing
    frozen and non-frozen submodules are even supported. I've never tried
    having only part of a package frozen.

    Freezing the whole package certainly works.

    @ericsnowcurrently
    Copy link
    Member Author

    On Thu, Oct 28, 2021 at 12:15 PM Marc-Andre Lemburg
    <report@bugs.python.org> wrote:

    encodings is a package. I think you first have to check whether mixing
    frozen and non-frozen submodules are even supported. I've never tried
    having only part of a package frozen.

    It works as long as __path__ is set properly, which it is now. FWIW,
    I tested freezing only some of the submodules a while back and it
    worked fine. That was using a different branch that I never merged
    but it should be fine with the different change that got merged. Of
    course, we'd need to verify that if we went that route.

    @FFY00
    Copy link
    Member

    FFY00 commented Oct 30, 2021

    I just tested partially freezing the package, and it seems to working fine :)

    @malemburg
    Copy link
    Member

    On 30.10.2021 17:54, Filipe Laíns wrote:

    I just tested partially freezing the package, and it seems to working fine :)
    FWIW: I think it's best not bother and simply freeze the whole thing.

    It's mostly char mappings which compress well and there's a benefit
    in sharing these using mmap (which the OS does for you with static
    C data).

    @FFY00
    Copy link
    Member

    FFY00 commented Oct 31, 2021

    I have already opened up the PR, but I can change if desired.

    @gvanrossum
    Copy link
    Member

    New changeset 02b5ac6 by Kumar Aditya in branch 'main':
    bpo-45653: fix test_embed on windows (GH-29814)
    02b5ac6

    @tiran
    Copy link
    Member

    tiran commented Dec 13, 2021

    Eric, I have a simple reproducer for the issue:

    This works:

    $ LC_ALL=en_US.utf-8 TESTPATH=$(pwd)/Lib:$(pwd)/build/lib.linux-x86_64-3.11 ./Programs/_testembed test_init_setpath_config

    This fails because it cannot load ISO-8859-1 / latin-1 codec

    $ LC_ALL=en_US.latin1 TESTPATH=$(pwd)/Lib:$(pwd)/build/lib.linux-x86_64-3.11 ./Programs/_testembed test_init_setpath_config
    Python path configuration:
      PYTHONHOME = (not set)
      PYTHONPATH = (not set)
      program name = 'conf_program_name'
      isolated = 0
      environment = 1
      user site = 1
      import site = 1
      is in build tree = 0
      stdlib dir = ''
      sys._base_executable = 'conf_executable'
      sys.base_prefix = ''
      sys.base_exec_prefix = ''
      sys.platlibdir = 'lib'
      sys.executable = 'conf_executable'
      sys.prefix = ''
      sys.exec_prefix = ''
      sys.path = [
        '/home/heimes/dev/python/cpython/Lib',
        '/home/heimes/dev/python/cpython/build/lib.linux-x86_64-3.11',
      ]
    Fatal Python error: init_fs_encoding: failed to get the Python codec of the filesystem encoding
    Python runtime state: core initialized
    LookupError: unknown encoding: ISO-8859-1

    Current thread 0x00007f9c42be6740 (most recent call first):
    <no Python frame>

    With this patch I'm seeing that encodings.__path__ is not absolute and that __spec__ has an empty submodule_search_locations.

    --- a/Lib/encodings/__init__.py
    +++ b/Lib/encodings/__init__.py
    @@ -98,9 +98,12 @@ def search_function(encoding):
                 # module with side-effects that is not in the 'encodings' package.
                 mod = __import__('encodings.' + modname, fromlist=_import_tail,
                                  level=0)
    -        except ImportError:
    +        except ImportError as e:
                 # ImportError may occur because 'encodings.(modname)' does not exist,
                 # or because it imports a name that does not exist (see mbcs and oem)
    +            sys.stderr.write(f"exception: {e}\n")
    +            sys.stderr.write(f"encodings.__path__: {__path__}\n")
    +            sys.stderr.write(f"encodings.__spec__: {__spec__}\n")
                 pass
             else:
                 break
    $ LC_ALL=en_US.latin1 TESTPATH=$(pwd)/Lib:$(pwd)/build/lib.linux-x86_64-3.11 ./Programs/_testembed test_init_setpath_config
    exception: No module named 'encodings.latin_1'
    encodings.__path__: ['encodings']
    encodings.__spec__: ModuleSpec(name='encodings', loader=<class '_frozen_importlib.FrozenImporter'>, origin='frozen', submodule_search_locations=[])
    exception: No module named 'encodings.iso_8859_1'
    encodings.__path__: ['encodings']
    encodings.__spec__: ModuleSpec(name='encodings', loader=<class '_frozen_importlib.FrozenImporter'>, origin='frozen', submodule_search_locations=[])

    It should have this search location:

    >>> import encodings
    >>> encodings.__spec__
    ModuleSpec(name='encodings', loader=<class '_frozen_importlib.FrozenImporter'>, origin='frozen', submodule_search_locations=['/home/heimes/dev/python/cpython/Lib/encodings'])

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    @kumaraditya303 kumaraditya303 closed this as not planned Won't fix, can't repro, duplicate, stale Aug 20, 2023
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.11 only security fixes build The build process and cross-build type-bug An unexpected behavior, bug, or error
    Projects
    Development

    No branches or pull requests

    6 participants