-
-
Notifications
You must be signed in to change notification settings - Fork 590
python_repository: Exclude pycache files #907
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
python_repository: Exclude pycache files #907
Conversation
what version of rules_python? I made a recent fix 3f0d62d that's in 0.14.0 |
Are you running as |
I have repro'd in 0.15.0 - this is around the python interpreter's distro itself, rather than than when using pip.
I am. I could make the globbing conditional on |
When you set that attribute, it's assumed that you stumbled across #713. rules_python is not supported running as root. I'm hesitant to start making exceptions as it involves many side effects (and I'm afraid we don't even know all of them). |
Apologies - I'm an infrastructure provider for dozens of teams who are using Bazel, many of whom are for sure running bazel as root inside docker containers - realistically to use hermetic python interpreters we need a patch like this, and we would really like to not have to keep an internal fork of What's the reasoning behind making the directory readonly? Presumably part of the point is to avoid generating these pyc files, in which case, excluding them from the glob feels like it's working towards the same aim? Are these other reasons to do so too? I've added a new commit to https://github.com/illicitonion/rules_python/tree/exclude-pycache-in-python-repository which only excludes these files if you're running as root, to minimise the surface area of the change to "people who have already opted in to ignoring reasonable advice", but I do think it may be worth merging this without the only-when-root patch... |
I noticed that people are running as root by accident, not intentionally. Making the pyc read-only prevents the interpreter from updating them while taking advantage of the improved performance. Why isn't it realistic to stop building as root? As I said, we (rules_python maintainers) don't understand all of the side effects that running as root brings. This was only one of the side effects we caught. Speaking more broadly, and outside Bazel, running as root defeats the whole concept of file permissions on UNIX-like systems, so if we want to keep the "correct" part of |
(context: I work on the same team as @illicitonion ) If running in Docker using User Namespaces, then you really want to be running as root inside the container so you end up mapping to a specific user outside the container. This makes e.g. cleanup tasks outside the container more performant. When running inside a single-use docker container as root, where docker is configured with user namespaces, I don't think it's the case that that "running as root defeats the whole concept of file permissions on UNIX-like systems". We've been running builds (using a variety of build systems) as root for several years now without issue. |
@gibfahn thank you for the extra information on how you use Docker. Let's schedule a call to discuss this better. |
hey, similar situation here, curious if consensus was reached. |
We had a really useful call, and think the underlying issue here is that https://github.com/indygreg/python-build-standalone includes a somewhat arbitrary subset of |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@illicitonion could you update this PR? I'll merge as is and create a separate issue to address this upstream.
b63e665
to
64b0bdf
Compare
Some of these are missing from the upstream python distributions, and when they are, if running as root, this causes the files to be generated the first time their associated .py file is evaluated, causing analysis and execution cache misses due to extra files matching the repository glob.
64b0bdf
to
738add7
Compare
@f0rmiga done - thanks! |
bazel-contrib#907 ended up accidentally adding excludes to the includes
Currently, by default, rules_python immediately fails when Bazel is run as root. The reasoning behind this involves .pyc files being generated for hermetic toolchains when they're first used, causing cache misses; to work around this, rules_python opts to make the toolchain installation directory read-only, but running Bazel as root would circumvent this. So rules_python actively detects if the current user is root, and hard fails. This check can be disabled by the root module by setting `python.override(ignore_root_user_error=True)`. (See more context in the linked issues/PRs.) This causes a reverberating effect across the Bazel ecosystem, as rules_python is essentially a dependency of every single Bazel project through protobuf. Effectively, any Bazel project wishing to run as root need to add the override tag above, even if they don't have anything to do with Python at all. This PR changes the default value of the `ignore_root_user_error` to True instead. Besides, it now unconditionally tries to make the toolchain installation directory read-only, and only outputs a warning if it's detected that the current user is root. See previous discussions at #713, #749, #907, #1008, #1169, etc. Fixes #1169.
Currently, by default, rules_python immediately fails when Bazel is run as root. The reasoning behind this involves .pyc files being generated for hermetic toolchains when they're first used, causing cache misses; to work around this, rules_python opts to make the toolchain installation directory read-only, but running Bazel as root would circumvent this. So rules_python actively detects if the current user is root, and hard fails. This check can be disabled by the root module by setting `python.override(ignore_root_user_error=True)`. (See more context in the linked issues/PRs.) This causes a reverberating effect across the Bazel ecosystem, as rules_python is essentially a dependency of every single Bazel project through protobuf. Effectively, any Bazel project wishing to run as root need to add the override tag above, even if they don't have anything to do with Python at all. This PR changes the default value of the `ignore_root_user_error` to True instead. Besides, it now unconditionally tries to make the toolchain installation directory read-only, and only outputs a warning if it's detected that the current user is root. See previous discussions at #713, #749, #907, #1008, #1169, etc. Fixes #1169.
Currently, by default, rules_python immediately fails when Bazel is run as root. The reasoning behind this involves .pyc files being generated for hermetic toolchains when they're first used, causing cache misses; to work around this, rules_python opts to make the toolchain installation directory read-only, but running Bazel as root would circumvent this. So rules_python actively detects if the current user is root, and hard fails. This check can be disabled by the root module by setting `python.override(ignore_root_user_error=True)`. (See more context in the linked issues/PRs.) This causes a reverberating effect across the Bazel ecosystem, as rules_python is essentially a dependency of every single Bazel project through protobuf. Effectively, any Bazel project wishing to run as root need to add the override tag above, even if they don't have anything to do with Python at all. This PR changes the default value of the `ignore_root_user_error` to True instead. Besides, it now unconditionally tries to make the toolchain installation directory read-only, and only outputs a warning if it's detected that the current user is root. See previous discussions at #713, #749, #907, #1008, #1169, etc. Fixes #1169.
Currently, by default, rules_python immediately fails when Bazel is run as root. The reasoning behind this involves .pyc files being generated for hermetic toolchains when they're first used, causing cache misses; to work around this, rules_python opts to make the toolchain installation directory read-only, but running Bazel as root would circumvent this. So rules_python actively detects if the current user is root, and hard fails. This check can be disabled by the root module by setting `python.override(ignore_root_user_error=True)`. (See more context in the linked issues/PRs.) This causes a reverberating effect across the Bazel ecosystem, as rules_python is essentially a dependency of every single Bazel project through protobuf. Effectively, any Bazel project wishing to run as root need to add the override tag above, even if they don't have anything to do with Python at all. This PR changes the default value of the `ignore_root_user_error` to True instead. Besides, it now unconditionally tries to make the toolchain installation directory read-only, and only outputs a warning if it's detected that the current user is root. See previous discussions at #713, #749, #907, #1008, #1169, etc. Fixes #1169.
Currently, by default, rules_python immediately fails when Bazel is run as root. The reasoning behind this involves .pyc files being generated for hermetic toolchains when they're first used, causing cache misses; to work around this, rules_python opts to make the toolchain installation directory read-only, but running Bazel as root would circumvent this. So rules_python actively detects if the current user is root, and hard fails. This check can be disabled by the root module by setting `python.override(ignore_root_user_error=True)`. (See more context in the linked issues/PRs.) This causes a reverberating effect across the Bazel ecosystem, as rules_python is essentially a dependency of every single Bazel project through protobuf. Effectively, any Bazel project wishing to run as root need to add the override tag above, even if they don't have anything to do with Python at all. This PR changes the default value of the `ignore_root_user_error` to True instead. Besides, it now unconditionally tries to make the toolchain installation directory read-only, and only outputs a warning if it's detected that the current user is root. See previous discussions at #713, #749, #907, #1008, #1169, etc. Fixes #1169.
Currently, by default, rules_python immediately fails when Bazel is run as root. The reasoning behind this involves .pyc files being generated for hermetic toolchains when they're first used, causing cache misses; to work around this, rules_python opts to make the toolchain installation directory read-only, but running Bazel as root would circumvent this. So rules_python actively detects if the current user is root, and hard fails. This check can be disabled by the root module by setting `python.override(ignore_root_user_error=True)`. (See more context in the linked issues/PRs.) This causes a reverberating effect across the Bazel ecosystem, as rules_python is essentially a dependency of every single Bazel project through protobuf. Effectively, any Bazel project wishing to run as root need to add the override tag above, even if they don't have anything to do with Python at all. This PR changes the default value of the `ignore_root_user_error` to True instead. Besides, it now unconditionally tries to make the toolchain installation directory read-only, and only outputs a warning if it's detected that the current user is root. See previous discussions at #713, #749, #907, #1008, #1169, etc. Fixes #1169. --------- Co-authored-by: Richard Levasseur <rlevasseur@google.com>
…-contrib#2636) Currently, by default, rules_python immediately fails when Bazel is run as root. The reasoning behind this involves .pyc files being generated for hermetic toolchains when they're first used, causing cache misses; to work around this, rules_python opts to make the toolchain installation directory read-only, but running Bazel as root would circumvent this. So rules_python actively detects if the current user is root, and hard fails. This check can be disabled by the root module by setting `python.override(ignore_root_user_error=True)`. (See more context in the linked issues/PRs.) This causes a reverberating effect across the Bazel ecosystem, as rules_python is essentially a dependency of every single Bazel project through protobuf. Effectively, any Bazel project wishing to run as root need to add the override tag above, even if they don't have anything to do with Python at all. This PR changes the default value of the `ignore_root_user_error` to True instead. Besides, it now unconditionally tries to make the toolchain installation directory read-only, and only outputs a warning if it's detected that the current user is root. See previous discussions at bazel-contrib#713, bazel-contrib#749, bazel-contrib#907, bazel-contrib#1008, bazel-contrib#1169, etc. Fixes bazel-contrib#1169. --------- Co-authored-by: Richard Levasseur <rlevasseur@google.com>
After the first time the python_repository is used, files are created in
__pycache__
directories. These invalidate the configured target forthe python_repository target, and cause spurious rebuilds.
Here's an example of the set of pycache files I've seen in one
repository.
PR Checklist
Please check if your PR fulfills the following requirements:
PR Type
What kind of change does this PR introduce?
What is the current behavior?
If you cquery twice in a row, you get different output, because extra files were created in the repository by the first cquery invocation.
Issue Number: N/A
What is the new behavior?
The consecutive cqueries now produce identical results.
Does this PR introduce a breaking change?