TST: add macos azure testing to CI #12051

tylerjereddy · 2018-09-28T23:17:22Z

Related to #12036.

I figured I'd start this with Mac OS for Azure because that platform is absent from our current github-integrated CI, which perhaps reduces the pressure to replicate a current matrix entry & allows restoration of a common platform to the test matrix with apparently short queues at the moment.

I tried to add detailed notes to the yml file. I think the discussion surrounding test & support for accelerate on mac is tricky, perhaps in no small part to the opacity of the resolution of bugs there and so on.

If we want to expand the matrix or try an x86 arch target or whatever else I suppose that may be something to discuss. My impression is that we may want to focus the 10 free parallel jobs on Azure mostly to relieve current Windows bottlenecks.

All tests are passing on my fork with this PR branch. You can see some helpful stats on the test data page, including the drop in test time from the previous run when a single core was used for pytest.

As usual, getting to "all green" on the tests with a new platform was a bit painful -- my compromise was to add 1 more skipif decorator on a single f2py test. f2py is always a "treat" to work with, but homebrew gfortran mostly seems to play ok with it.

If core devs / steering committe would like, I can look into trying to activate the github hook for azure so we can see it on this PR for NumPy repo proper -- otherwise, one of the "owners" may have to do that.

cc Microsoft folks who commented elsewhere: @ericsciple @chrisrpatterson

charris · 2018-09-29T01:07:13Z

numpy/f2py/tests/test_semicolon_split.py

 from . import util
 from numpy.testing import assert_equal

+import pytest


Normally try to keep this import up in front of the numpy/local imports.

charris · 2018-09-29T01:08:07Z

numpy/f2py/tests/test_semicolon_split.py

+    @pytest.mark.skipif(platform.system() == 'Darwin',
+                        reason="Prone to error when run with "
+                               "numpy/f2py/tests on mac os, "
+                               "but not when run in isolation")


What does this mean?

It means that running python runtests.py -t "numpy/f2py/tests/test_semicolon_split.py" does not cause issues, but the decorated test will error with python runtests.py -t "numpy/f2py/tests", or the entire test suite. So, context-dependent behavior in the test container.

charris · 2018-09-29T01:37:03Z

Can you try hitting the settings button and then webhooks?

tylerjereddy · 2018-09-29T05:37:51Z

Don't have access to settings. I created a numpy project on azure & triggered a manual build of master: https://dev.azure.com/numpy/numpy/_build/results?buildId=1&view=logs

Of course, master has no .yml so that doesn't do much. I always find the initial seed event for CI set up confusing. For my fork I temporarily switched the default branch, pushed a .yml into that test branch, and then opened a PR against the branch with a yml in it.

I think it may be possible to manually trigger the PR if the feature branch has been pushed to NumPy proper, but I think we try to keep our branch space clean so not doing that without consent.

Other option is the github app, which I think an owner can do--not sure if that'll help trigger builds off PRs when master doesn't have the yml, maybe. I should figure out how to add core devs to the config page, etc.

charris · 2018-09-29T14:30:55Z

Don't worry about the webhook, I can add it if you post it here, I just wondered if you had sufficient permissions for that. The thing about the test succeeding when run individually, but not as part of the package, is disturbing and is something we should understand, it indicates something going wrong. I don't understand the Fortran code there.

charris · 2018-09-29T14:32:06Z

I'm going to merge this just so we can all experiment with it.

charris · 2018-09-29T14:33:17Z

Thanks Tyler.

charris · 2018-09-29T15:04:39Z

Could you add me to this project? My user name is "charlesrharris"

tylerjereddy · 2018-09-29T17:20:25Z

The user name didn't seem to work -- I tried to invite using email "charlesr.harris@gmail.com"

It shows you as a member at: https://dev.azure.com/numpy/numpy

Not sure if that works -- I think you'll have to be the one initiate the webhook on master.

charris · 2018-09-29T17:46:36Z

Interesting, but I was just able to trigger a build, so I must have some permissions, but the tests did not run, maybe just need to find the right button.

It shows you as a member at

Was I before? We need someone to sign up to see how that works. Maybe @eric-wieser .

Not sure if that works

All I need is the webhook address, which should be floating around somewhere. I can paste it in on github. At least that worked for appveyor. Should probably be located in the project somewhere, I can also look there.

charris · 2018-09-29T18:21:43Z

The Microsoft documentation still rates as awful. I note that with both your and mine manual builds the repo is checked out, but no tests are run.

charris · 2018-09-29T18:27:10Z

Looks like webhooks need to be created, and I only see them for some slangy microsoft thingie, a "runbook".

tylerjereddy · 2018-09-29T19:30:52Z

Yeah, the docs are pretty bad

charris · 2018-09-29T23:21:05Z

Maybe you need to add a trigger: section to the yml file, as the actual build and tests are not running.

tylerjereddy · 2018-09-30T02:14:23Z

This got me to thinking that it seems unlikely that my fork and NumPy proper would require a different yml, given that both pushes to and PRs to my fork work just fine with the yml merged to NumPy proper. Indeed, by having the owner of my fork (me) set up Azure, I guess the permissions were correct for injecting the webhooks automatically to the github repo.

Here are the URLs used for my azure webhooks in my settings:
https://dev.azure.com/tylerjereddy/_apis/public/hooks/externalEvents (push)
https://dev.azure.com/tylerjereddy/_apis/public/hooks/externalEvents (pull_request)

One had just the push event selected, and the other has pull requests checked off.

So, by analogy, perhaps you can try:
https://dev.azure.com/numpy/_apis/public/hooks/externalEvents (push)
https://dev.azure.com/numpy/_apis/public/hooks/externalEvents (pull_request)

I see that my repo also has some authentication (secret field?) set up for Azure -- not sure if that will work automagically if you fill in those URLs and select push / pull request, etc.

charris · 2018-09-30T03:17:23Z

Ping returns 404 error

chrispat · 2018-10-01T13:43:47Z

You won't be able to just paste in the web hook URLs as there is a shared secret generated by the setup process. The preferred workflow is to enable install the Github App but you can also do Oauth or PAT if necessary.

tylerjereddy · 2018-10-01T17:01:43Z

@charris Do you want to try the Github app approach or just have an owner create a pipelines account? Someone with the appropriate permissions would need to enable the Github App if we go that route. Seems a little tricky!

charris · 2018-10-01T17:10:44Z

I'd be happy to try the App route, I think travisCI works that way. We can try the owner approach if all else fails -- appveyor runs on my account -- but it would be preferable to have an organization account if that is possible.

charris · 2018-10-01T17:21:57Z

@chrisrpatterson So the next question is how to install the Github app?

charris · 2018-10-01T17:27:19Z

So I'm in the process of installing the app, which microsoft account to use?

charris · 2018-10-01T17:32:22Z

I don't have the permissions on azure to attach the numpy account. How did you set up numpy/numpy?

charris · 2018-10-01T17:36:49Z

I'm thinking I'll need to attach my own MS account. Note this was all much easier with travis as it is integrated with github.

charris · 2018-10-01T17:39:32Z

So now I am logged in on github as numpy, let's try that again.

tylerjereddy · 2018-10-01T17:40:37Z

Yeah, this is messy, let me know what I can do--I just clicked "new organization" to create numpy on dev.azure.com & then "create project" so that we have numpy/numpy

chrispat · 2018-10-01T17:45:55Z

@charris yes integrating with github identities is ongoing work and that will make this much easier.

I would think you guys would want to have an Azure DevOps org for numpy and then you can create projects under that as necessary. @tylerjereddy you would need to add @charris MSA as a project administrator in numpy/numpy in order for him to install the app and configure the pipeline.

Here are some docs that might help https://docs.microsoft.com/en-us/azure/devops/organizations/security/set-project-collection-level-permissions?view=vsts&tabs=new-nav

charris · 2018-10-01T17:48:53Z

So adding the app is easy, but to change azure accounts, uninstall/reinstall.

charris · 2018-10-01T18:53:56Z

Travis does the merge commits also, sends me email on failure. There seem to be a significant number of bogus failures these days...

charris · 2018-10-01T19:01:15Z

@rgommers @pv One of you might want to set up an azure scipy account to make sure you have the name.

tylerjereddy · 2018-10-01T22:51:48Z

Windows Azure build log link with a small number of failures / errors reported in case people are curious how that is going. Starting with Python 3.6 there.

charris · 2018-10-01T23:12:29Z

I think it is skipping the fortran tests in any case, might not be worth loading the fortran compiler. Looks like more tempfile problems. Might try NamedTemporaryFile for some of those.

tylerjereddy · 2018-10-01T23:19:32Z

I don't think I load the fcompiler in Windows build? Some apparent precision issues maybe.

charris · 2018-10-01T23:48:57Z

You're right, apparently I was looking at the Mac process.

charris · 2018-10-02T00:15:20Z

Hmm, I don't think the f2py tests are thread safe in general. Does the mac testing run multiple threads?

charris · 2018-10-02T00:19:39Z

lets leave out xdist for the moment. Probably others as well, I don't think we can rely on thread safety, and some of the tests were written many years ago.

tylerjereddy · 2018-10-02T00:22:16Z

Yeah, I set it to auto but we can just turn it off; shippable is using 2 processes to run tests -- we could revert that too, although seems to be fairly stable lately after the warnings patch.

charris · 2018-10-02T00:31:31Z

How do you set the number of threads? I note that following are installed for appveyor.

cython
nose
pytest-timeout
pytest-xdist
pytest-env
pytest-faulthandler

But I don't know that it runs multiple threads.

tylerjereddy · 2018-10-02T02:27:24Z

Our Appveyor config uses a single process for tests; add more processes by specifying -n 2 for two processes and so on, and -n auto for automatic detection (isn't always perfect). That flag comes after the -- fed to runtests.py so that the args are fed directly to pytest.

Still, if it was well-known that our test suite wasn't thread safe, it wouldn't make sense to install that dependency.

charris · 2018-10-02T02:36:51Z

I haven't studied the problem, but there is this in numpy/f2py/tests/util.py

def get_temp_module_name():
    # Assume single-threaded, and the module dir usable only by this thread

And a global directory stored in the module. I'm guessing some other tests might have problems. It is tricky to debug intermittent failures due to threading, so it would probably be good to start with the simple bugs before escalating.

EDIT: That function looks like a good place for a threading context manager

tylerjereddy · 2018-10-02T02:47:08Z

Yeah, I noticed that global in the distutils work! I suspect we agree there's likely a better way

charris · 2018-10-02T18:43:48Z

I think the windows test that is failing is a bad test. Maybe use os.path.samefile? The test (and others) is also not safe against failure, which I think is the cause of the teardown error.

tylerjereddy · 2018-10-02T19:15:01Z

Yeah, I have a local branch with a fix I'm testing -- os.path.realpath does not resolve symlinks on windows so that test was never safe and Azure is correct to fail it.

tylerjereddy · 2018-10-02T22:08:18Z

The remaining 3 Windows Azure failures are all in TestComplexFunctions. Since not everything fails there, my current suspicion is either that our fall back code for when C99 complex.h isn't supported has some issues, or the Microsoft-specific implementation of complex support is behaving slightly differently.

We clearly try to handle this MSVC-specific complex number stuff in numpy/core/src/npymath/npy_math_private.h. Since loss of precision is one of the failures, I do wonder if it is just an alternative implementation type issue.

More recent versions of MSVC apparently do support the majority of C99, but our appveyor builds still try to use MinGW, though perhaps not for that specific reason (there's also gfortran reasons I suppose).

Maybe we're just going to have to add MinGW to the path and / or install if needed?

tylerjereddy · 2018-10-02T22:20:02Z

I note that the Microsoft repo shows that for the Widows image, MinGW bin/ should be on the PATH.

charris · 2018-10-02T22:29:17Z

My understanding was that mingw was not actually used for the appveyor build, although I couldn't tell you exactly why it is there. I think xoviat (disappeared by Github) added it, although that may not have an accurate recollection. What are the failures?

charris · 2018-10-02T22:35:08Z

Note that once we drop 2.7, we will be using VS 2015+, so the StackOverflow comment may be out of date. OTOH, having some problems with the cuts for complex functions would be normal, unfortunately.

tylerjereddy · 2018-10-02T22:37:11Z

We certainly seem to have access to gfortran for the 3.x Windows builds on appveyor, though not sure that is intentional.

The 3 fails from my notes:

TestComplexFunctions.test_branch_cuts
- numpy\core\tests\test_umath.py:2716: AssertionError
- f = <ufunc 'arcsin'>,
TestComplexFunctions.test_branch_cuts_complex64
- numpy\core\tests\test_umath.py:2716: AssertionError
- f = <ufunc 'arcsin'>,
TestComplexFunctions.test_loss_of_precision
- numpy\core\tests\test_umath.py:2602: AssertionError

I'm currently trying a debug run where I force CC=gcc, which should hopefully redirect to the mingw toolchain instead of the msvc stuff, for diagnostic purposes.

tylerjereddy · 2018-10-02T23:13:32Z

Forcing CC: gcc in the env directive didn't seem to help avoid usage of MSVC for NumPy compilation; that debugging route may be tricky.

mattip · 2018-10-03T05:51:09Z

CPython has extensive test cases for complex math, which are used in tests for cpython complex. They pass on python2 with MSVC 2008. It seems the branch cut test is different, rather than checking returned values against a table it checks continuity. I wonder where the difference lies?

tylerjereddy · 2018-10-03T22:00:21Z

Looks like @mattip will try debugging the residual Windows issues. I can add him to our NumPy Azure account once he has a suitable Microsoft account, I think. Matti can use the numpy_azure_win branch on my NumPy fork as a starting point, if desired. Setting up his own fork for Azure triggers may reduce noise a bit.

There was some discussion that we might want to proceed with temporarily xfailing the offending Windows tests so I can proceed with building up the matrix in the yml file, but I think they'd get xfailed on Appveyor too then, so that may not be desirable.

@stefanv suggested we may want to link against openblas for the Azure MacOS build since that's apparently what is done for wheels.

charris · 2018-10-04T01:08:11Z

Linking against OpenBLAS is certainly something to look at, but I think the first thing to do is get up a complete matrix and track down the remaining problems. If we need to blacklist some functions, knowing the MSVC compiler version will be necessary, so that should be printed out somewhere. Do you know what library is being used?

mattip · 2018-10-04T06:29:32Z

knowing the MSVC compiler version will be necessary, so that should be printed out somewhere

@tylerjereddy added the --show-build-log option to runtests.py on his branch

Since this PR is closed, maybe we should continue the discussion on new issue/pr

charris · 2018-10-04T15:09:56Z

Since this PR is closed, maybe we should continue the discussion on new issue/pr

See #12078.

tylerjereddy · 2018-10-24T12:35:41Z

Would also be useful if Azure & Shippable obeyed [ci skip] in commit message--Shippable is documented to (but maybe we have extra triggers overriding or something), and Azure seems to require ***NO_CI***

tylerjereddy added the component: numpy.testing label Sep 28, 2018

charris reviewed Sep 29, 2018

View reviewed changes

TST: add macos azure testing to CI.

fd053f6

tylerjereddy force-pushed the numpy_azure branch from 642a42c to fd053f6 Compare September 29, 2018 05:14

charris merged commit a000144 into numpy:master Sep 29, 2018

tylerjereddy mentioned this pull request Oct 1, 2018

TST: Add Mac testing #9421

Closed

charris mentioned this pull request Oct 4, 2018

MSVC Test failures #12078

Closed

tylerjereddy deleted the numpy_azure branch January 30, 2019 17:31

TST: add macos azure testing to CI #12051

TST: add macos azure testing to CI #12051

Conversation

tylerjereddy commented Sep 28, 2018 • edited Loading

charris Sep 29, 2018

Choose a reason for hiding this comment

charris Sep 29, 2018

Choose a reason for hiding this comment

tylerjereddy Sep 29, 2018

Choose a reason for hiding this comment

charris commented Sep 29, 2018

tylerjereddy commented Sep 29, 2018

charris commented Sep 29, 2018

charris commented Sep 29, 2018

charris commented Sep 29, 2018

charris commented Sep 29, 2018

tylerjereddy commented Sep 29, 2018

charris commented Sep 29, 2018

charris commented Sep 29, 2018

charris commented Sep 29, 2018

tylerjereddy commented Sep 29, 2018

charris commented Sep 29, 2018

tylerjereddy commented Sep 30, 2018

charris commented Sep 30, 2018

chrispat commented Oct 1, 2018

tylerjereddy commented Oct 1, 2018

charris commented Oct 1, 2018

charris commented Oct 1, 2018

charris commented Oct 1, 2018

charris commented Oct 1, 2018

charris commented Oct 1, 2018

charris commented Oct 1, 2018

tylerjereddy commented Oct 1, 2018

chrispat commented Oct 1, 2018

charris commented Oct 1, 2018

charris commented Oct 1, 2018

charris commented Oct 1, 2018

tylerjereddy commented Oct 1, 2018

charris commented Oct 1, 2018

tylerjereddy commented Oct 1, 2018

charris commented Oct 1, 2018

charris commented Oct 2, 2018

charris commented Oct 2, 2018

tylerjereddy commented Oct 2, 2018

charris commented Oct 2, 2018

tylerjereddy commented Oct 2, 2018

charris commented Oct 2, 2018 • edited Loading

tylerjereddy commented Oct 2, 2018

charris commented Oct 2, 2018

tylerjereddy commented Oct 2, 2018

tylerjereddy commented Oct 2, 2018

tylerjereddy commented Oct 2, 2018

charris commented Oct 2, 2018

charris commented Oct 2, 2018

tylerjereddy commented Oct 2, 2018

tylerjereddy commented Oct 2, 2018

mattip commented Oct 3, 2018

tylerjereddy commented Oct 3, 2018

charris commented Oct 4, 2018

mattip commented Oct 4, 2018

charris commented Oct 4, 2018

tylerjereddy commented Oct 24, 2018

tylerjereddy commented Sep 28, 2018 •

edited

Loading

charris commented Oct 2, 2018 •

edited

Loading