SVD fails to converge (Trac #990) #1588


Closed
thouis opened this issue Oct 19, 2012 · 18 comments

Comments

@thouis
Contributor

thouis commented Oct 19, 2012

Original ticket http://projects.scipy.org/numpy/ticket/990 on 2009-01-31 by trac user MikeTrumpis, assigned to @charris.

I'm running into a "LinAlgError: SVD did not converge" in both numpy and scipy for a given matrix (npy file linked below). The matrix is very badly conditioned, but SVD is meant to handle that.

np version 1.3.0.dev6083
mac os 10.4, lapack_lite built with -Wl,-framework -Wl,Accelerate

I can get results for this matrix using the SVD routines of either Matlab or Octave.

This is similar to ticket #1304, but the solution there (r4914) can't help me, since my build doesn't use dlapack_lite.c. Since I can get solutions in Matlab and Octave, perhaps there is some post-LAPACK solution for numpy too?

https://cirl.berkeley.edu/twiki/pub/User/MikeTrumpis/sinc_operator.npy

Mike

In [156]: [u,s,vt] = np.linalg.svd(snc_op, 1, 1)
---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)

/Users/miket/sandbox/trunk/testdata/siemens/<ipython console> in <module>()

/opt/local/lib/python2.5/site-packages/numpy/linalg/linalg.pyc in svd(a, full_matrices, compute_uv)
   1024                                  work, lwork, iwork, 0)
   1025     if results['info'] > 0:
-> 1026         raise LinAlgError, 'SVD did not converge'
   1027     s = s.astype(_realType(result_t))
   1028     if compute_uv:

LinAlgError: SVD did not converge

In [157]: [u,s,vt] = sp.linalg.svd(snc_op, full_matrices=1, compute_uv=1)
---------------------------------------------------------------------------
LinAlgError                               Traceback (most recent call last)

/Users/miket/sandbox/trunk/testdata/siemens/<ipython console> in <module>()

/opt/local/lib/python2.5/site-packages/scipy/linalg/decomp.pyc in svd(a, full_matrices, compute_uv, overwrite_a)
    919     else: # 'clapack'
    920         raise NotImplementedError,'calling gesdd from %s' % (gesdd.module_name)
--> 921     if info>0: raise LinAlgError, "SVD did not converge"
    922     if info<0: raise ValueError,\
    923        'illegal value in %-th argument of internal gesdd'%(-info)

LinAlgError: SVD did not converge
@numpy-gitbot

trac user MikeTrumpis wrote on 2009-02-01

update-- I can't reproduce the error in a Linux environment on the same machine. The lapack_lite module is built with:

numpy version 1.3.0.dev6336
-llapack -lblas -lcblas -latlas
lapack/blas atlas 3.9.2

Looks like the difference is in Mac's Accelerate framework


@numpy-gitbot

@rgommers wrote on 2010-09-21

I can't reproduce this with 1.5.0 on OS X 10.6 either. So problem solved either in numpy or in Accelerate. Can this be closed?

@numpy-gitbot

Attachment added by trac user jgarcke on 2010-11-11: svd_dgesvd.py

@numpy-gitbot

Attachment added by trac user jgarcke on 2010-11-11: svd_error.mat

@numpy-gitbot

trac user jgarcke wrote on 2010-11-11

I also have a problem with some of my matrices and the algorithm numpy uses to compute the SVD. The current algorithm does not work for these and some other matrices.

Somewhat longish explanation and background: numpy uses the following LAPACK routine:
http://www.netlib.org/lapack/double/dgesdd.f

It says for its error message:

> 0:  DBDSDC did not converge, updating process failed.

Which numpy 'converts' into 'SVD did not converge'. That message is misleading: an SVD cannot converge, since it is not an algorithm but a matrix factorization.

Looking further into
http://www.netlib.org/lapack/double/dbdsdc.f we find for its info return:

> 0:  The algorithm failed to compute a singular value.
>     The update process of divide and conquer failed.

That makes a lot more sense! The algorithm (which uses a divide-and-conquer approach) failed, which can happen for some matrices. But this algorithm is typically several times faster than the other LAPACK routine that computes an SVD.

That other one is:
http://www.netlib.org/lapack/double/dgesvd.f

Note that in principle it can also fail, or at least it has an error message if something isn't fully working.

> 0:  if DBDSQR did not converge, INFO specifies how many
>     superdiagonals of an intermediate bidiagonal form B
>     did not converge to zero. See the description of WORK
>     above for details.

Further, and important for comparison, this is the algorithm used in Matlab (not sure about Octave, but I would assume so).

But, as said, this algorithm is typically much slower than the other one.
Some more on that here in a lapack working paper:
http://www.netlib.org/lapack/lawnspdf/lawn88.pdf

I'll attach code of mine that uses the other (slower) routine on an example matrix where the currently used algorithm fails.

Solving this problem in numpy needs a design decision. One can find a few reports (like this one) of 'SVD did not converge' in numpy; if it happened regularly, such reports would be far more common...

It might be fine to keep the current, fast routine as the default, but the error message needs to change in any case.

The other, slower, method should also be included, at least as a 'slow' or 'safe' version of the SVD computations.

One could also think about falling back to the slow one if the current one fails (with a WARNING output of course).

Or make the safe one the default, and rename the current one to 'fast' or something.

But this is a numpy design decision.
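As a sketch of the fallback idea only (this uses scipy's `lapack_driver` keyword, which was added to `scipy.linalg.svd` in scipy 0.18, long after this comment; the function name `safe_svd` is hypothetical, not numpy API):

```python
import warnings

import numpy as np
from scipy import linalg


def safe_svd(a):
    """Try the fast divide-and-conquer driver (gesdd, numpy's default);
    on non-convergence, fall back to the slower but more robust
    QR-iteration driver (gesvd, the one Matlab uses), with a warning."""
    try:
        return linalg.svd(a, lapack_driver="gesdd")
    except np.linalg.LinAlgError:
        warnings.warn("gesdd failed to converge; falling back to gesvd")
        return linalg.svd(a, lapack_driver="gesvd")
```

The trade-off is exactly the one described above: gesdd stays the fast path, and the slow path only pays its cost when the fast one fails.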

@numpy-gitbot

@charris wrote on 2010-11-11

Numpy needs translated C versions of LAPACK routines for lapack_lite as a fallback; we can't count on LAPACK being available. Look at the source code in numpy/linalg and numpy/linalg/lapack_lite/ to see how this is done. There would probably be some resistance to including more LAPACK routines in numpy as opposed to scipy, but if you are interested in doing that work you can raise the issue on the mailing list.

@numpy-gitbot

trac user jgarcke wrote on 2010-11-11

A C version would be a totally different question, though. First it has to be decided how to handle the situation of the two different SVD routines in LAPACK, and the fact that the one used in numpy is slightly less robust.

I assume the C version is more or less CLAPACK? Or, in other words, why isn't CLAPACK used from the start?

@numpy-gitbot

@charris wrote on 2010-11-11

The fallback versions in numpy were produced by f2c, as is CLAPACK. You can find the scripts for that in numpy/linalg/lapack_lite/. Note that lapack_lite only contains a small subset of the available LAPACK routines and is intended to supply a minimal linear algebra capability; the full (or at least fuller) interface to LAPACK is in scipy. That said, there are some functions in the current lapack_lite that aren't made available to numpy, LU decomposition for instance, and that could be fixed.

@numpy-gitbot

trac user jgarcke wrote on 2010-11-12

Crossref with scipy ticket:
http://projects.scipy.org/scipy/ticket/957

@numpy-gitbot

Milestone changed to Unscheduled by @mwiebe on 2011-03-24

seberg pushed a commit to seberg/numpy that referenced this issue Oct 23, 2012
This patch enforces a strict dichotomy for the variables 'indescr'
and 'newdescr', so they are either NULL, or they own a reference.
Following the consequences of this allowed the reference error
to be tracked down.
seberg pushed a commit to seberg/numpy that referenced this issue Oct 23, 2012
The bug was fixed by the previous patch.
@yarikoptic
Contributor

OK -- this was detected with older versions of numpy, but now testing with the current 1.7.x maintenance branch (v1.7.0rc1-16-gec58fab).

On an exemplar matrix http://www.onerussian.com/tmp/a.npy it pukes... a dirty workaround is to add a minuscule amount of noise -- apparently that helps it converge here:

$> PYTHONPATH=/home/yoh/proj/numpy python -c 'import numpy as np; print np.__version__; a=np.load("a.npy"); out_=np.linalg.svd(a+np.random.normal(size=a.shape)*1e-19); print "ok with noise"; np.linalg.svd(a)'                             
1.7.0.dev-ec58fab
ok with noise
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/home/yoh/proj/numpy/numpy/linalg/linalg.py", line 1323, in svd
    raise LinAlgError('SVD did not converge')
numpy.linalg.linalg.LinAlgError: SVD did not converge

It is kinda sad because our (PyMVPA) users run into such cases quite often.
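The jitter workaround above can be wrapped up like this (a sketch only: the function name, the noise scale, and the retry count are arbitrary choices for illustration, not numpy API or a recommendation):

```python
import numpy as np


def svd_with_jitter(a, tries=3, scale=1e-10, rng=None):
    """Attempt np.linalg.svd; on non-convergence, retry with tiny
    additive Gaussian noise, escalating the perturbation each time."""
    rng = np.random.default_rng() if rng is None else rng
    try:
        return np.linalg.svd(a)
    except np.linalg.LinAlgError:
        pass
    # Scale the perturbation relative to the largest matrix entry.
    eps = scale * max(np.abs(a).max(), 1.0)
    for _ in range(tries):
        try:
            return np.linalg.svd(a + rng.normal(size=a.shape) * eps)
        except np.linalg.LinAlgError:
            eps *= 10
    raise np.linalg.LinAlgError("SVD did not converge, even with jitter")
```

Note that the perturbed factorization is of a slightly different matrix, so singular values are only accurate to roughly the noise level.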

@argriffing
Contributor

For what it's worth, my statsmodels PR fails on Travis because of this error but the PR does not fail on my local system which uses openblas LAPACK.

@charris
Member

charris commented Feb 15, 2014

@argriffing Is this still a problem?

@charris
Member

charris commented Feb 15, 2014

@yarikoptic That matrix factors without problem here.

@argriffing
Contributor

@charris That earlier comment probably does not apply to the numpy development branch.

Right now the most annoying thing I see with np.linalg.svd on the development branch is that the following numpy code hangs until I kill -9 it.

$ python runtests.py --shell
$ python
Python 2.7.5+
[GCC 4.8.1] on linux2
>>> import numpy as np
>>> np.__version__
'1.9.0.dev-e3f0f53'
>>> A = np.array([[1e3, 0], [0, 1]])
>>> B = np.array([[1e300, 0], [0, 1]])                                          
>>> C = np.array([[1e3000, 0], [0, 1]])                                         
>>> np.linalg.svd(A)
(array([[ 1.,  0.],
       [ 0.,  1.]]), array([ 1000.,     1.]), array([[ 1.,  0.],
       [ 0.,  1.]]))
>>> np.linalg.svd(B)                                                            
(array([[ 1.,  0.],
       [ 0.,  1.]]), array([  1.00000000e+300,   1.00000000e+000]), array([[ 1.,  0.],
       [ 0.,  1.]]))
>>> np.linalg.svd(C)        
[hangs forever]

One argument could be that numpy.linalg is low level, and that if you don't want your code to hang or crash or begin running nethack or do other undefined things, then you should either make sure the input doesn't contain inf/nan or use scipy.linalg instead. On the other hand, SVD is something like O(min(N, M) * N * M) and checking isfinite is only O(N * M), so the check might not be such an unbearable cost. Maybe the default could be to check, with a flag to disable the check if you want your process to hang when you give svd a big number.
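The proposed check can be sketched as a thin wrapper (a hypothetical helper, not numpy API; the `check_finite` flag name borrows the scipy.linalg convention):

```python
import numpy as np


def checked_svd(a, check_finite=True, **kwargs):
    """np.linalg.svd with an O(N*M) finiteness check up front, to fail
    fast with a clear error instead of hanging on inf/nan input. The
    check can be skipped via check_finite=False."""
    a = np.asarray(a)
    if check_finite and not np.isfinite(a).all():
        raise ValueError("array must not contain infs or NaNs")
    return np.linalg.svd(a, **kwargs)
```

With this, the `C = [[1e3000, 0], [0, 1]]` example above (where 1e3000 overflows to inf) raises a ValueError immediately rather than hanging.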

@charris
Member

charris commented Feb 15, 2014

Sounds like a reasonable suggestion. Might post it to the list to see if anyone objects.

@yarikoptic
Contributor

FWIW -- I should have noted which box/distro exactly that happened on. I can't reproduce it now with the available versions of numpy.

@seberg
Member

seberg commented May 22, 2019

Looking through this, it seems like a very old linear algebra problem with Accelerate, so closing.
