From 9e5532738f51e8c16ece14018bec454f25de0f55 Mon Sep 17 00:00:00 2001
From: yangfl <yangfl@users.noreply.github.com>
Date: Fri, 1 May 2020 14:59:57 +0800
Subject: [PATCH] py/mkrules.mk: work around fused multiply-add inaccuracy

On arm64, `print(float('.' + '9' * 70))` resulted in `1.000000000000001`
rather than `1.0`.

This is because

  dec_val = 10 * dec_val + dig;

is compiled into a single fused multiply-add instruction (fmadd) rather than
two separate instructions (mulsd and addsd on x86). The fused form rounds once
instead of twice, which causes a slightly different rounding error when
`dec_val` is about to overflow.

The parsing process of `'.' + '9' * 70` would look like this on x86

  dec_val                              *(uint64_t*)&dec_val  exp_extra
  ...
  10000000000000003053453312.000000    45208b2a2c280292      -25
  100000000000000021944598528.000000   4554adf4b7320336      -26
  1000000000000000288165462016.000000  4589d971e4fe8404      -27
  ...

and on arm64

  ...
  10000000000000003053453312.000000    45208b2a2c280292      -25
  100000000000000039124467712.000000   4554adf4b7320337      -26
  1000000000000000425604415488.000000  4589d971e4fe8405      -27
  ...

Note the difference when `exp_extra` = -26.
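
The divergence at that step can be checked without FMA hardware. The following
is a minimal sketch, not part of this patch: the helper names (`bits_of`,
`from_bits`, `step_separate`, `product_is_tie`) are made up for illustration,
and `product_is_tie` assumes a positive, normal `dec_val`. It tests whether the
exact product `10 * dec_val` lies exactly halfway between two adjacent doubles.
When it does, rounding the product first resolves the tie to even and the
small `dig` is then lost, whereas a fused operation sees `dig` push the exact
value above the tie and rounds up.

```c
#include <stdint.h>
#include <string.h>

/* Reinterpret a double as its IEEE 754 bit pattern. */
static uint64_t bits_of(double d) {
    uint64_t u;
    memcpy(&u, &d, sizeof u);
    return u;
}

/* Build a double from an IEEE 754 bit pattern. */
static double from_bits(uint64_t u) {
    double d;
    memcpy(&d, &u, sizeof d);
    return d;
}

/* Two roundings: the product is rounded to double, then the sum is rounded.
 * The volatile store keeps the compiler from contracting this into an FMA. */
static double step_separate(double dec_val, double dig) {
    volatile double prod = 10.0 * dec_val;
    return prod + dig;
}

/* Returns 1 if the exact value of 10 * dec_val is a rounding tie.
 * Assumption: dec_val is positive, finite and normal. */
static int product_is_tie(double dec_val) {
    /* 53-bit significand with the implicit leading one restored. */
    uint64_t m = (bits_of(dec_val) & ((1ULL << 52) - 1)) | (1ULL << 52);
    uint64_t p = 10 * m;              /* exact; needs 56 or 57 bits */
    int drop = (p >> 56) ? 4 : 3;     /* low bits that do not fit in 53 */
    uint64_t rem = p & ((1ULL << drop) - 1);
    return rem == (1ULL << (drop - 1));
}
```

With `dec_val = from_bits(0x45208b2a2c280292)` (the -25 row) and `dig = 9`,
`product_is_tie` returns 1 and `step_separate` yields 4554adf4b7320336, the
x86 value; the fused form rounds the same inputs to 4554adf4b7320337.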

Unfortunately, there is no easy way to fix this problem [1]. Furthermore, GCC
does not recognize `#pragma STDC FP_CONTRACT OFF`, yet it enables FP_CONTRACT
as soon as -O2 (or higher). The only feasible solution appears to be
`-ffp-contract=off`.

[1] https://stackoverflow.com/a/42134261
[2] https://lists.freedesktop.org/archives/libreoffice/2017-December/079046.html
---
 py/mkrules.mk | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/py/mkrules.mk b/py/mkrules.mk
index c37c25db4bd2f..508269cbceb60 100644
--- a/py/mkrules.mk
+++ b/py/mkrules.mk
@@ -14,6 +14,9 @@ OBJ_EXTRA_ORDER_DEPS += $(HEADER_BUILD)/compressed.data.h
 CFLAGS += -DMICROPY_ROM_TEXT_COMPRESSION=1
 endif
 
+# Disable fused multiply-add contraction: it changes float parsing results.
+CFLAGS += -ffp-contract=off
+
 # QSTR generation uses the same CFLAGS, with these modifications.
 # Note: := to force evalulation immediately.
 QSTR_GEN_CFLAGS := $(CFLAGS)