gh-137242: Add Android CI job #137186

Open · wants to merge 18 commits into main

Conversation

@mhsmith (Member) commented Jul 28, 2025

See the linked issue, gh-137242, for details.

@StanFromIreland added the infra label (CI, GitHub Actions, buildbots, Dependabot, etc.) Jul 28, 2025
@mhsmith (Member, Author) commented Jul 30, 2025

The logs are showing a number of "Scudo OOM" warnings similar to those discussed in #121595. However, none of them have been fatal yet.

In API level 35, the memory allocator no longer prints such an alarming message when this happens. Instead it just prints a one-line message like this (llvm/llvm-project#68444):

I/scudo   : Can't populate more pages for size class 8720.

Luckily, in API level 35 they've also added the ability to print the allocator statistics at any time, with code like this (llvm/llvm-project@5759e3c):

import ctypes

# mallopt(-205, 0) asks the allocator to dump its statistics to the log;
# -205 appears to be the value of M_LOG_STATS in the API level 35 headers.
ctypes.CDLL(None).mallopt(-205, 0)

Based on those statistics, it doesn't look like Python has a memory leak, as the "inuse" column after the test is roughly the same as it was before. So it's probably just temporary high memory use during the test, causing allocations to be directed to the next highest size class.

On the buildbot about a year ago, these warnings were often associated with crashes. As described in Android/README.md, we worked around this by running the emulator once to create it, then editing its configuration files to increase the emulator RAM from 2 GB to 4 GB. After that, the warnings continued, but the crashes stopped.

But that isn't such a good option on GitHub Actions, because the runner is not persistent, so the extra emulator restart would add about a minute to every CI run. In theory you should be able to avoid this by creating the emulator from a custom hardware profile, but in my experiments, the emulator always ended up with 2 GB of RAM regardless of what it said in the profile.
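
(For reference, the RAM setting lives in the AVD's config.ini; the path here is illustrative, e.g. ~/.android/avd/<name>.avd/config.ini, and this sketch assumes the standard AVD key applies:)

hw.ramSize=4096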

The tests that trigger this are:

  • test.test_ast.test_ast.AST_Tests.test_ast_recursion_limit
  • test.test_compile.TestSpecifics.test_compiler_recursion_limit
  • test.test_compile.TestSpecifics.test_stack_overflow

These tests all involve parsing an extremely long Python statement. (They're all skipped on WASM because of its limited stack space.) Reducing the recursion limit does not prevent the warning; what does prevent it is shrinking the Python statement itself, e.g. from 500,000 repetitions to 200,000.

Since I haven't seen these tests crash on GitHub Actions yet, I'm not going to change them just now. If they do start crashing, we can change them to reduce the statement size on Android alone, or even on all platforms if it doesn't significantly reduce the strength of the test.
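
For illustration only, an Android-specific reduction could look like the sketch below; test.support.is_android and support.infinite_recursion are real, but the repetition counts and expression shape here are placeholders rather than the tests' actual code:

from test import support

# Hypothetical: keep the full-size stress test everywhere except Android,
# where high temporary memory use triggers the Scudo warnings.
repetitions = 200_000 if support.is_android else 500_000
source = "-" * repetitions + "1"  # a deeply nested unary expression
with support.infinite_recursion():  # the real tests raise the limit similarly
    compile(source, "<stress>", "eval")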

mhsmith added a commit to mhsmith/release-tools that referenced this pull request Jul 30, 2025
@mhsmith changed the title from "Add Android CI job" to "gh-137242: Add Android CI job" Jul 30, 2025
@mhsmith added the needs backport to 3.13 and needs backport to 3.14 labels Jul 30, 2025
@mhsmith marked this pull request as ready for review July 30, 2025 16:30
@freakboy3742 (Contributor) left a comment:

One fairly minor comment inline; my other comment is an extension of that one.

Although the action is broken up into steps, the context of those steps is lost when it's embedded in the workflow. Aside from the general ergonomics of being able to follow what stage the workflow is up to, there's very little visibility of the fact that some steps aren't being invoked - unless you pay close attention, it's not clear that the arm64 run isn't running the tests. I know it's not, and why - but there's no external visibility of that fact (like a clearly skipped build step).

If the constraint against using a workflow is that we would need to reference a specific commit - isn't that commit hash one of the inputs to the release workflow? Does uses not accept variable expansions in workflow references?

# This is coded as an action rather than a workflow so the release-tools
# repository can load it from a dynamically-chosen commit. Cross-repository
# workflow calls must have a Git reference hard-coded in the calling workflow,
# but actions can be run dynamically from the runner's filesystem.
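
For context, the constraint described in this comment is that a cross-repository reusable-workflow call must pin a literal Git reference inside the uses: string; expressions are not allowed there. A hypothetical example (the workflow filename and tag are made up):

jobs:
  android:
    # The ref after "@" must be a literal; ${{ ... }} is not accepted here,
    # so release-tools couldn't choose the commit dynamically.
    uses: python/cpython/.github/workflows/reusable-android.yml@v3.14.0rc1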
A Contributor commented:

This makes sense; but it also raises an eyebrow because this is something that no other platform has needed. I presume this is because no other platform that is generating binary artefacts is doing so with the tooling in release-tools?

A Member commented:

@webknjaz, do you have any thoughts here?

A Contributor commented:

I'll take a look, thanks for tagging.

A Contributor commented:

@mhsmith so I've tried to understand the context, and this justification didn't make sense to me.

It is perfectly possible to call reusable workflows from other repositories (in fact, this is what I'm building my reusable-tox.yml ecosystem on).

My understanding is that this is meant to be used in https://github.com/python/release-tools/blob/698deaf2ebff433a6ab9d4b5ded97a40fce109a1/.github/workflows/source-and-docs-release.yml, right?

In any case, I've been moving away from composite actions in favor of reusable workflows. This is because composite actions (or any actions for that matter) are represented as a single step in job runs. And it's rather difficult to inspect what actions are doing. So from the troubleshooting perspective, I'd strongly advise against composite actions.

It is important to make every step visible and transparent. And if you follow the practice I established with reusable workflows as modules in here, this is definitely possible.

I started with in-repo "modules" two years ago because I was solving the problem of making it possible to collect all the job statuses in a single check (for branch protection). This wasn't because it's somehow impossible to host them externally. This was just not something necessary for that purpose.

@encukou I've actually been meaning to ask if there are any workflows being duplicated in the python org. If yes, it'd make sense to host them in a shared repository. This could be a .github repo or even that release-tools one (although I don't know if it makes semantic sense). This is a separate question, though.

That said, if you've faced any confusion or need help adapting this to be a reusable workflow, just let me know where you're stuck. I can help you navigate this or just rewrite it for you.

A Member commented:

> This makes sense; but it also raises an eyebrow because this is something that no other platform has needed. I presume this is because no other platform that is generating binary artefacts is doing so with the tooling in release-tools?

Right now, release-tools only creates source zips and docs artifacts. See https://github.com/python/release-tools/actions/runs/16450411678 for 3.14 RC2.

The Windows artifacts are built in Azure Pipelines; here's RC2. And Ned builds the macOS artifacts. (release-tools later takes these Windows and macOS artifacts and signs and uploads them.)

We're hoping to build the macOS artifacts using CI in the near future, so what we decide here may help inform how to do that too :)

> My understanding is that this is meant to be used in python/release-tools@698deaf/.github/workflows/source-and-docs-release.yml, right?

Yes, see https://github.com/python/release-tools/pull/265/files#diff-4d14704b6b88fb06db888f96c03a8e9b3a5e07a4ee566d97d4111b2c05210e84R220.

A Member commented:

One thing I'd like to see is CI in this repo to check the Android build is OK, so we don't get caught out on release day because we only build in release-tools.

This happened in 3.14 RC1 with the plaintext docs, which had broken back in April or so; RC1 was the first prerelease to build docs, so we hadn't caught it here.

One deciding factor for whether we have stuff over here (via composite actions or something else) might be how much difference there'll be between versions (3.14, 3.15, etc.). If a lot, we might not want it all in release-tools, and would benefit from versioning things in branches over here.

On the other hand, the docs build is also versioned in branches here; release-tools CI calls a make dist command in this repo.

A Contributor commented:

> On the other hand, the docs build is also versioned in branches here; release-tools CI calls a make dist command in this repo.

That's more or less what I've been trying to suggest.

@bedevere-app (bot) commented Jul 31, 2025

A Python core developer has requested some changes be made to your pull request before we can consider merging it. If you could please address their requests along with any other requests in other reviews from core developers that would be appreciated.

Once you have made the requested changes, please leave a comment on this pull request containing the phrase I have made the requested changes; please review again. I will then notify any core developers who have left a review that you're ready for them to take another look at this pull request.

@webknjaz (Contributor) commented:

> Does uses not accept variable expansions in workflow references?

@freakboy3742 it does not. But I don't think that it's necessary, or that composite actions would be any different in this regard.

@mhsmith (Member, Author) commented Aug 3, 2025

> Does uses not accept variable expansions in workflow references?

> @freakboy3742 it does not. But I don't think that it's necessary, or that composite actions would be any different in this regard.

You may not be able to use variable expansions directly in an action reference, but you can use a variable in a checkout step, and then run the action from the checkout. This is how python/release-tools#265 works.

> It'd be a bit decoupled, but this is not necessarily a bad thing.

I'm not sure about that. The workflows to build and test Python are closely coupled to the scripts, makefiles, etc., which they call, all of which are in the cpython repository. And the command lines may vary from one version of Python to another. So cpython would seem like the appropriate place to maintain them.

> Although the action is broken up into steps, the context of those steps is lost when it's embedded in the workflow.

I agree this wasn't ideal. I've just pushed an alternative approach which uses ::group:: markers to split up the output, which results in a much better UI.
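
A minimal sketch of that checkout-then-run pattern, with an assumed input name and action path (not the actual release-tools code):

steps:
  - uses: actions/checkout@v4
    with:
      repository: python/cpython
      ref: ${{ inputs.git_ref }}  # dynamically chosen commit
  # Run the action from the checked-out working tree instead of a
  # hard-coded remote reference:
  - uses: ./.github/actions/build-android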

@freakboy3742 (Contributor) commented:

> Does uses not accept variable expansions in workflow references?

> @freakboy3742 it does not. But I don't think that it's necessary, or that composite actions would be any different in this regard.

> You may not be able to use variable expansions directly in an action reference, but you can use a variable in a checkout step, and then run the action from the checkout. This is how python/release-tools#265 works.

Sure - I understand how this works when used as an action; I was driving more towards the "can we use this as a workflow" approach. But if a variable isn't valid as an action reference, then it's a moot point.

> It'd be a bit decoupled, but this is not necessarily a bad thing.

> I'm not sure about that. The workflows to build and test Python are closely coupled to the scripts, makefiles, etc., which they call, all of which are in the cpython repository. And the command lines may vary from one version of Python to another. So cpython would seem like the appropriate place to maintain them.

I agree with @mhsmith here - I don't think decoupling the build scripts from the code repo would be a net win. It's already complicated enough having to update two repositories (the development guide and the code repo) whenever the build script is updated; adding a third repository that requires an update isn't going to make things any clearer.

> Although the action is broken up into steps, the context of those steps is lost when it's embedded in the workflow.

> I agree this wasn't ideal. I've just pushed an alternative approach which uses ::group:: markers to split up the output, which results in a much better UI.

This is definitely better - still not as good as "true" action steps, but it's a lot more legible.

@freakboy3742 (Contributor) left a comment:

The use of ::group:: is a significant improvement. A true workflow would still be my preferred outcome - but I also don't have any better options for how to reference/include it.

I've flagged a couple of other minor issues.

@AlexWaygood removed their request for review August 4, 2025 10:26
Comment on lines 25 to 71
run: |
  # Build and test (Android)
  echo "::group::Configure build Python"
  ./Android/android.py configure-build
  echo "::endgroup::"
  echo "::group::Compile build Python"
  ./Android/android.py make-build
  echo "::endgroup::"
  echo "::group::Configure host Python"
  ./Android/android.py configure-host $triplet
  echo "::endgroup::"
  echo "::group::Compile host Python"
  ./Android/android.py make-host $triplet
  echo "::endgroup::"
  echo "::group::Make release package"
  ./Android/android.py package $triplet
  echo "::endgroup::"
  if [ "$RUNNER_OS" = "Linux" ] && [ "$RUNNER_ARCH" = "X64" ]; then
    # https://github.blog/changelog/2024-04-02-github-actions-hardware-accelerated-android-virtualization-now-available/
    echo "::group::Enable KVM for Android emulator"
    echo 'KERNEL=="kvm", GROUP="kvm", MODE="0666", OPTIONS+="static_node=kvm"' \
      | sudo tee /etc/udev/rules.d/99-kvm4all.rules
    sudo udevadm control --reload-rules
    sudo udevadm trigger --name-match=kvm
    echo "::endgroup::"
    echo "::group::Unpack release artifact"
    mkdir $RUNNER_TEMP/android
    tar -C $RUNNER_TEMP/android -xf cross-build/$triplet/dist/*
    echo "::endgroup::"
    echo "::group::Tests"
    # Arguments are similar to --fast-ci, but in single-process mode.
    $RUNNER_TEMP/android/android.py test --managed maxVersion -v -- \
      --single-process --fail-env-changed --rerun --slowest --verbose3 \
      -u "all,-cpu" --timeout=600
    echo "::endgroup::"
  else
    echo "Skipping test: GitHub Actions currently only supports the" \
      "Android emulator on Linux x86_64."
  fi
A Contributor commented:

Regardless of the action vs reusable workflow discussion, it's best to move this entire script into a dedicated file (.sh or even .py). Calling it could then be wrapped as a GNU/make target, unifying things around the Makefile as a central place for a number of processes. It is one already, after all, so why deviate from that?

A Contributor commented:

So... we have that script already - it's the Android/android.py that is being invoked here.

There are multiple invocations of the script because there are multiple "substeps" inside it - and the workflow calls each of those substeps individually so that the log has clear visibility of which step fails (if and when it fails). It's the need to visualise those individual workflow steps (vs the ::group:: solution currently in the PR) that has been the bulk of discussion on this PR so far.

There's also one step - the KVM configuration - that is GitHub Actions-specific.

I guess one option would be to encode this sequence as a make github-build-test-android target (or similar) that executes all these steps, including the GitHub Actions-specific logging group statements and KVM controls... but it feels a little weird to encode GitHub Actions-specific calls into the Makefile.

That said - given that we can't easily have a reusable workflow here, and a Makefile target with ::group:: output would effectively give us the same end-user experience, and would also remove the need for an action (as the release workflow would also be make github-build-test-android)... maybe that is the best approach?

A Contributor commented:

Yeah, kind of. This is something that I'm exploring in my reusable-tox.yml thing. There, I encourage moving the customizations into tox land. Though, my workflow also has a mechanism for GHA-specific hooks that do indeed use in-tree actions for customizing things.

Still, I've grown to like moving more things into the unified workflow tool that a given project has (tox/nox/make/whatever). This makes it possible to reuse things outside a single platform.

It is possible to detect running under GHA (https://github.com/ansible/awx-plugins/blob/c4180cd/toxfile.py#L16). And this allows having grouping set up in the Python script that would also work locally (https://github.com/deadsnakes/action/blob/5344dcb/bin/install-python#L22-L33).

So I wouldn't try to put echo "::group::..." into the Makefile directly. Instead, it's possible to have make build-n-test-cycle call ./Android/android.py build-n-test-cycle "${triplet}", which in turn would implement nice logging (and conditional KVM setup) when under GHA, for example.
That would be an entry point to be called under GHA, but it could also be exercised locally or called under some other CI if need be.
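
A minimal sketch of that pattern in Python, relying on the standard GITHUB_ACTIONS environment variable; the helper name is illustrative, not android.py's actual code:

import os
from contextlib import contextmanager

IS_GHA = os.environ.get("GITHUB_ACTIONS") == "true"

@contextmanager
def log_group(title):
    # Emit ::group:: markers only under GitHub Actions, so the same
    # script stays readable locally or on another CI.
    print(f"::group::{title}" if IS_GHA else f"===== {title} =====", flush=True)
    try:
        yield
    finally:
        if IS_GHA:
            print("::endgroup::", flush=True)

with log_group("Configure build Python"):
    ...  # run the configure step here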

A Contributor commented:

Of course, ./Android/android.py build-n-test-cycle "${triplet}" could be implemented as a shell script, but that might be an unnecessary layer of indirection in this case. Plus, coding in Python is more pleasant, predictable, lintable, and well-understood.

@mhsmith (Member, Author) commented Aug 9, 2025

Thanks for your ideas, everyone. I've now added an android.py ci command which performs the build, package and test sequence. This has several advantages:

  • The command is so simple that there's no need to wrap it in a reusable action or workflow; the cpython and release-tools repositories can invoke it directly.
  • It's much easier to test locally than a script in a workflow file.
  • It avoids the circular dependency we'd have with make, because the Makefile doesn't exist until android.py runs configure.
  • cibuildwheel uses the same android.py test script, so handling the GitHub Actions KVM setup at this level will save all the package maintainers from needing to do it manually.

I have made the requested changes; please review again.

I've also made the corresponding changes to python/release-tools#265.
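
For illustration, the resulting workflow step can be as small as this sketch (the step name and triplet argument are assumptions, not copied from the final workflow):

- name: Build and test (Android)
  run: ./Android/android.py ci aarch64-linux-android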

@bedevere-app (bot) commented Aug 9, 2025

Thanks for making the requested changes!

@freakboy3742: please review the changes made to this pull request.

@freakboy3742 (Contributor) left a comment:

This looks pretty good to me - the CI configuration is about as simple as it could get; the output in CI is very navigable; and end users testing under GitHub will get the added benefit of not needing explicit CI configuration.

I'm happy to approve as is; holding off merging until @webknjaz has had a chance to take a look.

@webknjaz (Contributor) left a comment:

I would still like this job to be wrapped as a reusable workflow:

  1. for in-repo use — I'd like to eventually get to a point where build.yml would only declare relationships between the jobs and matrices, keeping any steps in reusable-*.yml workflows; this is to unify the structure better + have cleaner separation of concerns/abstraction layers
  2. this would allow having a visually collapsible group of jobs in the UI, consistent with how other groups look
  3. with 3-4 inputs in it, it would allow relying on the same steps in python/release-tools (if an optional actions/upload-artifact is added)
  4. the benefits of having the step wrapped in android.py would remain

However, I think it's okay to merge the PR as is, and I could send a follow-up with such a change. I only ask that the matrix definition be fixed first.

@mhsmith requested a review from webknjaz August 11, 2025 13:20
@webknjaz (Contributor) left a comment:

Let's get this in and I'll look into wrapping it as a reusable workflow later.

@hugovk (Member) left a comment:

Thank you!

Labels
awaiting merge · infra (CI, GitHub Actions, buildbots, Dependabot, etc.) · needs backport to 3.13 · needs backport to 3.14 · OS-android · skip news
7 participants