CI add ccache for GitHub Actions #31895

StefanieSenger · 2025-08-07T18:45:04Z

Reference Issues/PRs

Follow up on #31832

What does this implement/fix? Explain your changes.

This PR adds a step to use a cache for ccache on the GitHub Actions job(s) in the unit-tests.yml.

Thanks for your support @lesteve and @adrinjalali! 🫶

Comments

From my experiments on another branch, this seems to cut the scikit-learn build time on ARM by about 1 min (1 min instead of 2 mins).

This won't work on Windows.

github-actions · 2025-08-07T18:46:06Z

✔️ Linting Passed

All linting checks passed. Your pull request is in excellent shape! ☀️

Generated for commit: bbf689b. Link to the linter CI: here

StefanieSenger · 2025-08-07T19:15:05Z

Three comments for reviewers:

1. The "build scikit-learn" step prints the ccache statistics at the very end of its log.
2. The Set compiler ID step is an experiment, and I am happy to remove or replace it if it's not adding enough value. The key would then be shorter and less specific. Reasoning:

   I have looked at how we define key and restore-keys in build_tools/azure/posix.yml and found that doing it the same way is not so useful here. Using something like Agent.JobName doesn't help, because we don't have several jobs that build (the lint job doesn't build). Build.BuildNumber doesn't seem to have an equivalent in GitHub Actions(?).
   Instead, I have tried to use the compiler as part of the key, so that there is one cache per OS-compiler combination.
   The Set compiler ID step extracts the compiler from the runner, and it gets re-used in the key later. The alternative, using ${{ matrix.compiler }}, would mean that we would have to define a compiler on the matrix. We have done that nowhere else, so I assume we don't want that, to keep it more flexible?

3. The cache size is pretty low (I had 6MB after multiple experiments on the other branch), but we have set the maximum size to 256M (ccache -M 256M) in install.sh. Should it be decreased, or should we wait for the other jobs to be added and then re-evaluate?
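
For illustration, a Set compiler ID step along the lines described in the second comment could look roughly like the sketch below. The step id get-compiler and the output name compiler match the key shown in this PR (steps.get-compiler.outputs.compiler). How the compiler is actually detected is an assumption here (the CC environment variable with a fallback to cc) and is not necessarily what the PR does.

```yaml
# Hypothetical sketch, not the exact steps from this PR.
- name: Set compiler ID
  id: get-compiler
  shell: bash
  run: |
    # Assumption: take the compiler from $CC, falling back to "cc".
    compiler="$(basename "${CC:-cc}")"
    # Expose the value as steps.get-compiler.outputs.compiler.
    echo "compiler=${compiler}" >> "$GITHUB_OUTPUT"

- name: Cache ccache directory
  uses: actions/cache@v4
  with:
    path: ${{ github.workspace }}/ccache
    key: ccache-${{ runner.os }}-${{ steps.get-compiler.outputs.compiler }}
```

Deriving the value at run time avoids adding a compiler entry to the job matrix, which is the trade-off raised in the comment.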

adrinjalali

Pretty cool!

adrinjalali · 2025-08-08T08:49:46Z

.github/workflows/unit-tests.yml

+        uses: actions/cache@v4
+        with:
+          path: ${{ github.workspace }}/ccache
+          key: ccache-${{ runner.os }}-${{ steps.get-compiler.outputs.compiler }}


do we want to make sure this is only within each PR, and not shared between PRs, to make sure there's no cache poisoning happening?

StefanieSenger · 2025-08-08

@lesteve and I had talked about this. If I understood correctly, it would be preferable to share the ccache across PRs, because ccache is smart enough to compare the contents of the C and C++ source files before reusing their build. I also think it would quickly get cluttered if we had a cache for each PR.

Or are you referring to security questions?

Another thought that came to my mind:
Would it make sense to have an action (here is an actions workflow that would do that) that deletes the ccache, to clean out old min-dependency builds that are no longer used, or when we need a fresh start? It could run regularly or on demand.
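
A clean-up workflow of this kind could be sketched as follows. This is not the workflow linked in the comment: the workflow name, the monthly schedule, and the ccache- key prefix filter are assumptions. It relies on the GitHub REST API endpoints for listing and deleting Actions caches, called through the gh CLI available on GitHub-hosted runners.

```yaml
# Hypothetical sketch of a cache clean-up workflow.
name: Clear ccache
on:
  workflow_dispatch:       # run on demand
  schedule:
    - cron: "0 3 1 * *"    # assumption: 03:00 UTC on the first day of each month
permissions:
  actions: write           # needed to delete caches
jobs:
  clear-ccache:
    runs-on: ubuntu-latest
    steps:
      - name: Delete caches whose key starts with "ccache-"
        env:
          GH_TOKEN: ${{ github.token }}
        run: |
          # List the ids of all caches matching the key prefix, then delete each one.
          gh api --paginate \
            "repos/${GITHUB_REPOSITORY}/actions/caches?key=ccache-" \
            --jq '.actions_caches[].id' |
          while read -r cache_id; do
            gh api -X DELETE "repos/${GITHUB_REPOSITORY}/actions/caches/${cache_id}"
          done
```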

lesteve · 2025-08-08

My understanding is that the GHA cache already has some built-in protections, see doc.

adrinjalali · 2025-08-08T09:45:49Z

can we test this by having another commit and see if cache works?

lesteve · 2025-08-08T11:35:33Z

Nice, it seems to work fine. There may be some tweaks that we can do which I will suggest soon:

first commit build log, building from scratch takes ~2 minutes:

+ ccache -s
Cacheable calls:    73 / 162 (45.06%)
  Hits:              0 /  73 ( 0.00%)
    Direct:          0
    Preprocessed:    0
  Misses:           73 /  73 (100.0%)
Uncacheable calls:  89 / 162 (54.94%)
Local storage:
  Cache size (GB): 0.0 / 0.3 ( 2.71%)
  Hits:              0 /  73 ( 0.00%)
  Misses:           73 /  73 (100.0%)

second commit build log: the build reuses the ccache and takes ~1 minute (71 of the 73 cacheable calls are hits):

+ ccache -s
Cacheable calls:    73 / 162 (45.06%)
  Hits:             71 /  73 (97.26%)
    Direct:         71 /  71 (100.0%)
    Preprocessed:    0 /  71 ( 0.00%)
  Misses:            2 /  73 ( 2.74%)
Uncacheable calls:  89 / 162 (54.94%)
Local storage:
  Cache size (GB): 0.0 / 0.3 ( 2.81%)
  Hits:             71 /  73 (97.26%)
  Misses:            2 /  73 ( 2.74%)

adrinjalali

LGTM. I'll let @lesteve decide if improvements need to be done here or other PRs.

.github/workflows/unit-tests.yml

Co-authored-by: Loïc Estève <loic.esteve@ymail.com>

lesteve · 2025-08-08T14:58:05Z

Here is the thinking behind the cache key:

- You cannot update an existing cache entry (see doc), so you need to create a cache entry with a different name, and use restore-keys to allow a partial match that reuses a previous cache entry.
- We chose to create a new cache entry if there is a "good reason" to, so we added the following files to the hash:
  - Cython (including .tp files), C, and C++ files (that seems a no-brainer)
  - lock-file. I am not 100% sure about this one, but I guess a lock-file change may be a good enough reason to generate a new cache entry, e.g. on a compiler or numpy version change? The ccache doc seems to say that the cache will not be reused with a different compiler version (by default it uses the size and the mtime of the compiler in the hash key).
- Azure is creating a cache entry in every build. This was done in MAINT Increase ccache hit rate on Azure Pipelines #22213 to maximize cache reuse. We can always go back to this simpler strategy if we think it's better.
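
The key strategy described above can be sketched as follows. The glob patterns and the matrix.LOCK_FILE variable are assumptions for illustration; the exact expression in unit-tests.yml may differ.

```yaml
# Hypothetical sketch of the cache step.
- name: Cache ccache directory
  uses: actions/cache@v4
  with:
    path: ${{ github.workspace }}/ccache
    # A new cache entry is created whenever a hashed source file or the lock-file changes.
    key: ccache-${{ runner.os }}-${{ hashFiles('**/*.pyx*', '**/*.pxd*', '**/*.pxi*', '**/*.c', '**/*.cpp', matrix.LOCK_FILE) }}
    # Prefix match: with no exact hit, restore the most recent entry for this OS.
    restore-keys: ccache-${{ runner.os }}
```

With this layout, an unchanged source tree gives an exact key hit and no new entry is saved, while a change to any hashed file restores the latest matching entry and saves a new one at the end of the job.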

lesteve · 2025-08-11T08:25:04Z

Let's merge this one and adapt if we notice possible improvements to the cache key strategy.

CI add ccache for GitHub Actions

a94c582

github-actions bot added the Build / CI label Aug 7, 2025

StefanieSenger added the No Changelog Needed label Aug 7, 2025

adrinjalali reviewed Aug 8, 2025

empty commit

672380f

adrinjalali approved these changes Aug 8, 2025

lesteve reviewed Aug 8, 2025

StefanieSenger and others added 9 commits August 8, 2025 14:11

Update .github/workflows/unit-tests.yml

b2dc698

Co-authored-by: Loïc Estève <loic.esteve@ymail.com>

change key to include hashfiles

00680d7

empty commit

bb57d97

test change c file

484ddec

add more files to hashfiles

fefa2b6

add LOCK_FILE to hashfile

a4fdb5d

try fix

c8168c6

remove experiments

f369920

add accidentally removed line

7dcff0f

lesteve reviewed Aug 8, 2025

Update .github/workflows/unit-tests.yml

6684d96

Co-authored-by: Loïc Estève <loic.esteve@ymail.com>

Put ccache after checkout and before conda

bbf689b

lesteve merged commit a9a7b7d into scikit-learn:main Aug 11, 2025
36 checks passed

StefanieSenger deleted the ccache_for_gha branch August 11, 2025 09:24
