Skip to content

test_notify_all() of test_multiprocessing_forkserver failed on Python on x86 Tiger 3.6 with "ValueError: cannot convert float NaN to integer" #76300

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed
vstinner opened this issue Nov 23, 2017 · 7 comments
Labels
tests Tests in the Lib/test dir

Comments

@vstinner
Copy link
Member

BPO 32119
Nosy @mdickinson, @vstinner

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

Show more details

GitHub fields:

assignee = None
closed_at = <Date 2017-12-15.15:10:40.217>
created_at = <Date 2017-11-23.08:32:46.401>
labels = ['tests']
title = 'test_notify_all() of test_multiprocessing_forkserver failed on Python on x86 Tiger 3.6 with "ValueError: cannot convert float NaN to integer"'
updated_at = <Date 2017-12-15.15:10:40.215>
user = 'https://github.com/vstinner'

bugs.python.org fields:

activity = <Date 2017-12-15.15:10:40.215>
actor = 'vstinner'
assignee = 'none'
closed = True
closed_date = <Date 2017-12-15.15:10:40.217>
closer = 'vstinner'
components = ['Tests']
creation = <Date 2017-11-23.08:32:46.401>
creator = 'vstinner'
dependencies = []
files = []
hgrepos = []
issue_num = 32119
keywords = []
message_count = 7.0
messages = ['306787', '306788', '306789', '306790', '306792', '306793', '308400']
nosy_count = 2.0
nosy_names = ['mark.dickinson', 'vstinner']
pr_nums = []
priority = 'normal'
resolution = 'out of date'
stage = 'resolved'
status = 'closed'
superseder = None
type = None
url = 'https://bugs.python.org/issue32119'
versions = ['Python 3.6']

@vstinner
Copy link
Member Author

test_notify_all() of test_multiprocessing_forkserver failed on Python on x86 Tiger 3.6 with "ValueError: cannot convert float NaN to integer":

http://buildbot.python.org/all/#/builders/46/builds/60

0:52:21 load avg: 1.46 [378/406/1] test_multiprocessing_forkserver crashed (Exit code 1)

Traceback (most recent call last):
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/forkserver.py", line 196, in main
    _serve_one(s, listener, alive_r, old_handlers)
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/forkserver.py", line 231, in _serve_one
    code = spawn._main(child_r)
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/spawn.py", line 114, in _main
    prepare(preparation_data)
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/spawn.py", line 223, in prepare
    _fixup_main_from_name(data['init_main_from_name'])
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/spawn.py", line 249, in _fixup_main_from_name
    alter_sys=True)
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/runpy.py", line 201, in run_module
    mod_name, mod_spec, code = _get_module_details(mod_name)
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/runpy.py", line 153, in _get_module_details
    code = loader.get_code(mod_name)
  File "<frozen importlib._bootstrap_external>", line 762, in get_code
ValueError: cannot convert float NaN to integer

Timeout (0:25:00)!
Thread 0xa000d000 (most recent call first):
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/connection.py", line 379 in _recv
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/connection.py", line 407 in _recv_bytes
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/connection.py", line 250 in recv
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/managers.py", line 757 in _callmethod
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/multiprocessing/managers.py", line 987 in acquire
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/test_multiprocessing.py", line 1044 in test_notify_all
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/case.py", line 605 in run
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/case.py", line 653 in __call
_
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/suite.py", line 122 in run
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/suite.py", line 84 in __call__
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/suite.py", line 122 in run
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/suite.py", line 84 in __call__
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/suite.py", line 122 in run
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/suite.py", line 84 in __call__
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/unittest/runner.py", line 176 in run
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/support/init.py", line 1888 in _run_suite
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/support/init.py", line 1932 in run_unittest
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/runtest.py", line 172 in test_runner
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/runtest.py", line 173 in runtest_inner
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/runtest.py", line 127 in runtest
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/runtest_mp.py", line 71 in run_tests_slave
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/main.py", line 516 in _main
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/main.py", line 509 in main
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/libregrtest/main.py", line 584 in main
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/regrtest.py", line 46 in _main
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/regrtest.py", line 50 in <module>
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/runpy.py", line 85 in _run_code
File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/runpy.py", line 193 in _run_module_as_main

Moreover, when test_multiprocessing_forkserver was run a second time, a different test failed, test_forkserver_sigkill():

Re-running test 'test_multiprocessing_forkserver' in verbose mode
(...)
test_forkserver_sigint (test.test_multiprocessing_forkserver.WithProcessesTestProcess) ... ok
test_forkserver_sigkill (test.test_multiprocessing_forkserver.WithProcessesTestProcess) ... FAIL
test_lose_target_ref (test.test_multiprocessing_forkserver.WithProcessesTestProcess) ... ok
(...)
======================================================================
FAIL: test_forkserver_sigkill (test.test_multiprocessing_forkserver.WithProcessesTestProcess)
----------------------------------------------------------------------

Traceback (most recent call last):
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/_test_multiprocessing.py", line 496, in test_forkserver_sigkill
    self.check_forkserver_death(signal.SIGKILL)
  File "/Users/db3l/buildarea/3.6.bolen-tiger/build/Lib/test/_test_multiprocessing.py", line 486, in check_forkserver_death
    self.assertTrue(evt.is_set())
AssertionError: False is not true

@vstinner vstinner added the tests Tests in the Lib/test dir label Nov 23, 2017
@vstinner
Copy link
Member Author

"ValueError: cannot convert float NaN to integer": this error occurred on int(float), where the float comes from os.stat()

File "<frozen importlib._bootstrap_external>", line 762, in get_code

Lib/importlib/_bootstrap_external.py:

def get_code(self, fullname):
    ...
    st = self.path_stats(source_path)
    ...
    source_mtime = int(st['mtime']) <~~~~~~ HERE
    ...

def path_stats(self, path):
    st = _path_stat(path)
    return {'mtime': st.st_mtime, 'size': st.st_size}

def _path_stat(path):
    return _os.stat(path)

=> The float NaN comes from os.stat().st_mtime.

Implementation of os.stat().st_mtime in Modules/posixmodule.c:

static PyObject*
_pystat_fromstructstat(STRUCT_STAT *st)
{
    ...
#if defined(HAVE_STAT_TV_NSEC)
    ansec = st->st_atim.tv_nsec;
    mnsec = st->st_mtim.tv_nsec;
    cnsec = st->st_ctim.tv_nsec;
#elif defined(HAVE_STAT_TV_NSEC2)
    ansec = st->st_atimespec.tv_nsec;
    mnsec = st->st_mtimespec.tv_nsec;
    cnsec = st->st_ctimespec.tv_nsec;
#elif defined(HAVE_STAT_NSEC)
    ansec = st->st_atime_nsec;
    mnsec = st->st_mtime_nsec;
    cnsec = st->st_ctime_nsec;
#else
    ansec = mnsec = cnsec = 0;
#endif
    fill_time(v, 7, st->st_atime, ansec);
    fill_time(v, 8, st->st_mtime, mnsec);
    fill_time(v, 9, st->st_ctime, cnsec);
    ...
}

static void
fill_time(PyObject *v, int index, time_t sec, unsigned long nsec)
{
    ...
    float_s = PyFloat_FromDouble(sec + 1e-9*nsec);  <~~~~~~ HERE
    ...
    PyStructSequence_SET_ITEM(v, index+3, float_s);
    ...
}

Extract of the ./configure script output:

checking for tv_nsec in struct stat... no
checking for tv_nsec2 in struct stat... yes

=> configure defines HAVE_STAT_TV_NSEC2

We need sec and nsec values from fill_time() to debug this issue.

@vstinner
Copy link
Member Author

It may also be compiler issue.

  • CPU: Intel x86
  • compiler: GCC 4.0.1 (Apple Computer, Inc. build 5367)
  • CFLAGS: -fno-strict-aliasing -Wsign-compare -g -O0 -Wall -Wstrict-prototypes
  • OS: macOS Tiger
  • Kernel: Darwin Kernel Version 8.11.1: Wed Oct 10 18:23:28 PDT 2007; root:xnu-792.25.20~1/RELEASE_I386

Extract of pythoninfo:

os.uname: posix.uname_result(sysname='Darwin', nodename='osx', release='8.11.1', version='Darwin Kernel Version 8.11.1: Wed Oct 10 18:23:28 PDT 2007; root:xnu-792.25.20~1/RELEASE_I386', machine='i386')
platform.architecture: 32bit
platform.platform: Darwin-8.11.1-i386-32bit
platform.python_implementation: CPython
sys.version: 3.6.3+ (heads/3.6:ae3c5c7, Nov 22 2017, 20:39:07) [GCC 4.0.1 (Apple Computer, Inc. build 5367)]
sysconfig[CFLAGS]: -fno-strict-aliasing -Wsign-compare -g -O0 -Wall -Wstrict-prototypes

@vstinner
Copy link
Member Author

This issue is very similar, likely the same bug, than bpo-13986.

@vstinner
Copy link
Member Author

Similar error: bpo-14028, "Using a 32-bit Python 2.6.5 on a Linux system", reported at 2012-02-16.

@vstinner
Copy link
Member Author

Interesting info about NaN on Intel x86:

https://stackoverflow.com/questions/14021763/c-multiplication-or-addition-floats-results-nan

"The multiplication is being performed in the x87 registers, and a floating-point stack overflow has occurred due to (possibly unrelated) earlier operations in your program's execution. When the processor is in this failure state, all computations performed on the x87 registers produce NaN results."

--

For FreeBSD, Python main() starts with:

	/* 754 requires that FP exceptions run in "no stop" mode by default,
	 * and until C vendors implement C99's ways to control FP exceptions,
	 * Python requires non-stop mode.  Alas, some platforms enable FP
	 * exceptions by default.  Here we disable them.
	 */
#ifdef __FreeBSD__
	fedisableexcept(FE_OVERFLOW);
#endif

@vstinner
Copy link
Member Author

I really have no clue on the root bug, and the bug only occurred once, so I choose to close it.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
tests Tests in the Lib/test dir
Projects
None yet
Development

No branches or pull requests

1 participant