Skip to content

Conversation

DvdGiessen
Copy link
Contributor

@DvdGiessen DvdGiessen commented Aug 4, 2025

Summary

The Git hash is embedded in the version number. The hash is abbreviated by Git. This commit changes the length of the Git hash abbreviation to a fixed number, so that the length of the version string no longer varies based on external factors.

Made this change because builds of the same MicroPython commit on multiple machines were sometimes giving a version string with different lengths. This change may also help the code size report to be more consistent, because it will less often be impacted by random changes in the version string length, at the cost of always being a few bytes longer.

Testing

By default Git chooses the length "based on the approximate number of packed objects in your repository" (man page). This was causing the version string length to be unstable depending on how big your local repository is.

At the moment of writing, the master branch on GitHub contains 17463 commits. Based on the statistics from my own repository, I set the value to 10, which is one higher than the highest value I saw required for a unique commit abbreviation in my local repository checkout:

$ git rev-list --all | wc -l
38059
$ git rev-list --all | xargs -P0 -I{} sh -c 'git rev-parse --short=4 {} | tr -cd 0-9a-f | wc -m' | sort -n | uniq -c
    345 4
  28172 5
   8855 6
    652 7
     33 8
      2 9

Trade-offs and Alternatives

Alternative is to not do this. Then we don't get the consistency benefits, but will keep semi-randomly saving a few bytes for years to come (until the default gets increased by the Git heuristic).

Note that Git will still use more characters if it required to make the hash unique. So all this setting does is increase the minimum slightly; Git will still use a longer hash if it already knows about another commit with the same abbreviation.

We can always safely increase this number later should the need arise.

Note: Presumably the users with the largest repositories are the core maintainers, so perhaps they could check the statistics for their repository and report if they think a length of 10 is the right choice.

Signed-off-by: Daniël van de Giessen <daniel@dvdgiessen.nl>
Copy link

github-actions bot commented Aug 4, 2025

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:    +0 +0.000% standard
      stm32:    +4 +0.001% PYBV10
     mimxrt:    +0 +0.000% TEENSY40
        rp2:    +8 +0.001% RPI_PICO_W
       samd:    +4 +0.001% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:    +4 +0.001% VIRT_RV32

Copy link

codecov bot commented Aug 4, 2025

Codecov Report

✅ All modified and coverable lines are covered by tests.
⚠️ Please upload report for BASE (master@255d74b). Learn more about missing BASE report.

Additional details and impacted files
@@            Coverage Diff            @@
##             master   #17832   +/-   ##
=========================================
  Coverage          ?   98.38%           
=========================================
  Files             ?      171           
  Lines             ?    22283           
  Branches          ?        0           
=========================================
  Hits              ?    21924           
  Misses            ?      359           
  Partials          ?        0           

☔ View full report in Codecov by Sentry.
📢 Have feedback on the report? Share it here.

🚀 New features to boost your workflow:
  • ❄️ Test Analytics: Detect flaky tests, report on failures, and find test suite problems.
  • 📦 JS Bundle Analysis: Save yourself from yourself by tracking and limiting bundle sizes in JS merges.

@Josverl
Copy link
Contributor

Josverl commented Aug 5, 2025

Do you need this as part of the production firmware binary , or in CI / testing ?
the reason I ask is that there appears to be a size increase of (?some? of ) the firmwares

if its just about CI/testing then adding a build_info file with git hash and build / compiler options might be a better solution as that would not affect firmware size.

@dpgeorge dpgeorge added the py-core Relates to py/ directory in source label Aug 5, 2025
@DvdGiessen
Copy link
Contributor Author

DvdGiessen commented Aug 5, 2025

I personally don't need this to be in upstream, I already have it in my tree. :) But created the PR since I figured it might be worthwhile to have upstream. Also OK with me if we don't merge this.

Clarified the tradeoff section. Indeed, it will increase code size at this moment (namely, due to the longer version string). But, it will then be more consistent: We are vastly less likely to see random +X sizes on PR's because the commit hash happened to have a collision that prevented it from being shorter, or because the build node had a larger number of references in it's repository (tbh, I don't know how GitHub does that internally, so can't really comment on how likely that is), or because the heuristic Git used changed between versions, et cetera.

@Josverl
Copy link
Contributor

Josverl commented Aug 5, 2025

vastly less likely to see random +X sizes on PR's because the commit hash happened to have a collision that prevented it from being shorter, or because the build node had a larger number of references in it's repository

Ah, that is a quite good point

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
py-core Relates to py/ directory in source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants