Skip to content

py/py.mk: Build some core C files as single compilation units. #12714

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Conversation

projectgus
Copy link
Contributor

@projectgus projectgus commented Oct 17, 2023

(Reviewing #12644 reminded me that I meant to try this experiment.)

Motivation

Building some source files as single compilation units allows the compiler to do some whole-of-source-file optimizations that aren't possible with -fgc-sections, or to emit slightly tighter assembly (even after linker relaxations are applied.)

LTO provides very similar benefits, but is not enabled by default in most builds (and not possible at all for some ports like esp32).

Method

Looking at the "Discarded input sections" of firmware.map for a build with -fgc-sections shows a lot of entries like this:

 .text          0x00000000        0x0 build-PYBV11/py/emitglue.o
 .data          0x00000000        0x0 build-PYBV11/py/emitglue.o
 .bss           0x00000000        0x0 build-PYBV11/py/emitglue.o
 .text          0x00000000        0x0 build-PYBV11/py/persistentcode.o
 .data          0x00000000        0x0 build-PYBV11/py/persistentcode.o
 .bss           0x00000000        0x0 build-PYBV11/py/persistentcode.o
 .text.mp_raw_code_load_mem
                0x00000000       0x1c build-PYBV11/py/persistentcode.o
 .text          0x00000000        0x0 build-PYBV11/py/runtime.o
 .data          0x00000000        0x0 build-PYBV11/py/runtime.o
 .bss           0x00000000        0x0 build-PYBV11/py/runtime.o
 .rodata.mp_raise_OSError_with_filename.str1.1
                0x00000000        0xe build-PYBV11/py/runtime.o
 .text.mp_raise_OSError_with_filename
                0x00000000       0x40 build-PYBV11/py/runtime.o

From the above you can see that all symbols from emitglue.o were linked to the final binary, but both persistentcode.o and runtime.o contained symbols not referenced in the final binary that could be removed by the GC linker pass.

Code Size

PYBV11 binary size (without cache):

  • 390496 on master
  • 391196 (+700) if the entire py/ directory was built without -ffunction-sections -fdata-sections
  • 390472 (-24) with only the changes in this PR.

Performance

Performance of this PR versus master (cache disabled):

❯ ./run-perfbench.py -s pyb-run1.txt pyb-run2.txt
diff of scores (higher is better)
N=168 M=100                pyb-run1.txt -> pyb-run2.txt         diff      diff% (error%)
bm_chaos.py                    232.97 ->     237.43 :      +4.46 =  +1.914% (+/-0.02%)
bm_fannkuch.py                  48.76 ->      48.63 :      -0.13 =  -0.267% (+/-0.01%)
bm_fft.py                     1467.10 ->    1514.54 :     +47.44 =  +3.234% (+/-0.00%)
bm_float.py                   3983.10 ->    4065.98 :     +82.88 =  +2.081% (+/-0.02%)
bm_hexiom.py                    30.25 ->      31.29 :      +1.04 =  +3.438% (+/-0.00%)
bm_nqueens.py                 2847.52 ->    2840.84 :      -6.68 =  -0.235% (+/-0.00%)
bm_pidigits.py                 344.66 ->     351.33 :      +6.67 =  +1.935% (+/-0.31%)
core_import_mpy_multi.py       186.48 ->     211.61 :     +25.13 = +13.476% (+/-0.01%)
core_import_mpy_single.py       26.76 ->      30.82 :      +4.06 = +15.172% (+/-0.03%)
core_qstr.py                    19.67 ->      24.29 :      +4.62 = +23.488% (+/-0.01%)
core_yield_from.py             220.02 ->     219.72 :      -0.30 =  -0.136% (+/-0.00%)
misc_aes.py                    265.58 ->     276.24 :     +10.66 =  +4.014% (+/-0.01%)
misc_mandel.py                2073.09 ->    2085.69 :     +12.60 =  +0.608% (+/-0.01%)
misc_pystone.py               1371.56 ->    1426.80 :     +55.24 =  +4.028% (+/-0.00%)
misc_raytrace.py               248.11 ->     251.70 :      +3.59 =  +1.447% (+/-0.01%)
viper_call0.py                 339.03 ->     329.69 :      -9.34 =  -2.755% (+/-0.00%)
viper_call1a.py                330.36 ->     321.47 :      -8.89 =  -2.691% (+/-0.01%)
viper_call1b.py                253.91 ->     251.99 :      -1.92 =  -0.756% (+/-0.01%)
viper_call1c.py                255.86 ->     253.90 :      -1.96 =  -0.766% (+/-0.00%)
viper_call2a.py                325.22 ->     316.61 :      -8.61 =  -2.647% (+/-0.01%)
viper_call2b.py                222.02 ->     221.41 :      -0.61 =  -0.275% (+/-0.01%)

Remaining Work

Currently this is a quick PoC based on PYBV11 only. If the approach seems worthwhile, the next step would be to identify more closely which core files are always 100% linked to the build (or linked depending on make-level options). And/or the cases where some functions can be #ifdef-ed out to make this true for a file.

This work was funded through GitHub Sponsors.

Allows the compiler to do some whole-of-source-file
optimizations that aren't possible with -fgc-sections.

This work was funded through GitHub Sponsors.

Signed-off-by: Angus Gratton <angus@redyak.com.au>
@github-actions
Copy link

Code size report:

   bare-arm:  +176 +0.310% 
minimal x86: +1408 +0.754% 
   unix x64:   +96 +0.012% standard
      stm32:   -24 -0.006% PYBV10
     mimxrt:   -40 -0.011% TEENSY40
        rp2:    +0 +0.000% RPI_PICO
       samd:   -28 -0.011% ADAFRUIT_ITSYBITSY_M4_EXPRESS

@codecov
Copy link

codecov bot commented Oct 17, 2023

Codecov Report

Merging #12714 (cf2fa83) into master (d2a9d70) will increase coverage by 0.00%.
Report is 22 commits behind head on master.
The diff coverage is n/a.

@@           Coverage Diff           @@
##           master   #12714   +/-   ##
=======================================
  Coverage   98.39%   98.39%           
=======================================
  Files         158      158           
  Lines       20940    20953   +13     
=======================================
+ Hits        20603    20616   +13     
  Misses        337      337           

see 6 files with indirect coverage changes

@dlech
Copy link
Contributor

dlech commented Oct 17, 2023

What caused the big increase in the minimal build?

@projectgus
Copy link
Contributor Author

What caused the big increase in the minimal build?

Fair question, @dlech! Some of the core files built as single units in this PR are are only partially linked to those builds (which makes sense as the minimal builds use less features.) When I figured out what to put in the Draft I was only looking at the linker map file for PYBV11.

To make this work properly across ports and feature levels then it will need more to become more fine grained. For example, could remove some object files from the "build as single units" list entirely, or add some conditionals either in the Makefile or in the source files so the minimal builds don't end up growing.

The question I have for now is whether that additional complexity and effort is worth the potential performance/size improvement, or if this is better left as an "interesting, probably not worth it" exercise.

@projectgus
Copy link
Contributor Author

(I edited the last paragraph of the PR post to make that a bit clearer.)

@dpgeorge dpgeorge added the py-core Relates to py/ directory in source label Nov 3, 2023
@projectgus
Copy link
Contributor Author

Closing this out for now, it seems like any improvement in size/performance is marginal and there would be a more noticeable increase in complexity to avoid code size increases for minimal build configs.

@projectgus projectgus closed this Jan 10, 2024
@projectgus projectgus deleted the experiment/no_function_sections branch November 1, 2024 05:11
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
py-core Relates to py/ directory in source
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants