Skip to content

ENH: add hermitian=False kwarg to np.linalg.matrix_rank #8557

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 1 commit into from
Sep 16, 2017

Conversation

perimosocordiae
Copy link
Contributor

@perimosocordiae perimosocordiae commented Feb 2, 2017

With a symmetric matrix, the more efficient eigvalsh method can be used to find singular values.

This PR also make some minor docstring fixes, and uses the more efficient count_nonzero method.

None, and ``S`` is an array with singular values for `M`, and
``eps`` is the epsilon value for datatype of ``S``, then `tol` is
set to ``S.max() * max(M.shape) * eps``.
symmetric : bool, optional
If True (default), `M` is assumed to be square and symmetric, enabling
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

its not the default.

@seberg
Copy link
Member

seberg commented Feb 2, 2017

More tests could be nice, such as also for the tolerance. Which also makes me wonder, the singular values are not quite the eigenvalues, no? So if you would change to symmetric, the difference, with respect to the tolerance setting, should be more then just numerical accuracy?

@perimosocordiae
Copy link
Contributor Author

Should I make a new method for the symmetric option tests or keep them where they are now?

My last commit fixes the discrepancy by using the absolute value of the eigenvalues, which (as I understand the math) exactly matches the singular values for a symmetric matrix:

  • svd(X) = sqrt(eigvals(X.T.dot(X))) in the general case
  • X.T.dot(X) = X.dot(X) = matrix_power(X, 2) in the symmetric case
  • eigvals(matrix_power(X, k)) = eigvals(X)**k in general

Combining these equalities gives us svd(X) = sqrt(eigvals(X)**2) = abs(eigvals(X)).

@seberg
Copy link
Member

seberg commented Feb 2, 2017

Yeah, probably right for symmetry. Do whatever you feel looks nicer, my guess is, it would look nicer in different functions ;) and making tests shorter makes it easier to find the code that broke if it gets triggered.

@charris
Copy link
Member

charris commented Feb 2, 2017

What is the speed/accuracy advantage?

@charris
Copy link
Member

charris commented Feb 2, 2017

Isn't the title of this PR incorrect? Looks like matrix_rank is the relevant function. Same with the summary line of the first commit.

@perimosocordiae
Copy link
Contributor Author

@charris: Oops, you're correct on the title mixup. My bad.

Here's some unscientific testing on my Ubuntu machine with master-branch numpy:

In [1]: a = np.random.random((1000, 1000))

In [2]: a = (a + a.T)/2

In [3]: %timeit np.linalg.svd(a, compute_uv=False)
1 loop, best of 3: 218 ms per loop

In [4]: %timeit np.abs(np.linalg.eigvalsh(a))
10 loops, best of 3: 128 ms per loop

@perimosocordiae perimosocordiae changed the title ENH: add symmetric=False kwarg to np.linalg.matrix_power ENH: add symmetric=False kwarg to np.linalg.matrix_rank Feb 2, 2017
@perimosocordiae
Copy link
Contributor Author

I'm not sure how to tell which method is more numerically accurate. They do produce very slightly different results on large badly-scaled matrices, but it's unclear which introduces more floating-point error.

@charris
Copy link
Member

charris commented Feb 10, 2017

How about the name ishermitean instead?

@perimosocordiae
Copy link
Contributor Author

@charris I changed it to is_hermitian.

@homu
Copy link
Contributor

homu commented Mar 4, 2017

☔ The latest upstream changes (presumably #8368) made this pull request unmergeable. Please resolve the merge conflicts.

@eric-wieser
Copy link
Member

Would be better as a rebase rather than a merge. Also, this will conflict with #8682 anyway

@eric-wieser
Copy link
Member

Given we already have eig + eigh, eigvals + eigvalsh, I think we should either:

  • Call the hermitian version matrix_rankh
  • Deprecate eigh and eigvalsh and add eig(..., hermitian=True), eigvals(..., hermitian=True)

@perimosocordiae
Copy link
Contributor Author

Yeah, I was trying out Github's web-based conflict resolution tool. It's nice, but a rebase would indeed be cleaner. If the Github folks are watching, maybe that would be a nice feature to add? ;-) I'll rebase and squash when/if this PR is accepted for merging, especially because I need to fix my commit message.

I'm -1 for matrix_rankh, and in general I found the -h suffix versions pretty inscrutable back when I was first learning the API. Actually deprecating eigh and friends seems like it would affect a lot of code in the wild, though...

@perimosocordiae
Copy link
Contributor Author

One benefit of adding a hermitian kwarg would be to eventually enable eig(..., hermitian='autodetect') or similar, which could be convenient for some users and also matches the Matlab behavior.

@homu
Copy link
Contributor

homu commented Apr 13, 2017

☔ The latest upstream changes (presumably #8682) made this pull request unmergeable. Please resolve the merge conflicts.

@eric-wieser
Copy link
Member

Needs rebase

@perimosocordiae perimosocordiae changed the title ENH: add symmetric=False kwarg to np.linalg.matrix_rank ENH: add hermitian=False kwarg to np.linalg.matrix_rank Jun 7, 2017
@perimosocordiae
Copy link
Contributor Author

Rebased and squashed, and updated the title.

if hermitian:
S = eigvalsh(M)
else:
S = abs(svd(M, compute_uv=False))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why did this abs appear? Should it be in the other branch?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

See this comment for the derivation.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, I think you should have abs(eigvalsh(X)) on the line above. Why are you changing what happens for hermitian = False? Was there a bug?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oof, my bad! Fixed now.

hermitian : bool, optional
If True, `M` is assumed to be Hermitian (symmetric if real-valued),
enabling the use of a more efficient method for finding singular values.
Defaults to False.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Needs .. versionadded:: 1.14

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done.

@eric-wieser
Copy link
Member

eric-wieser commented Jul 13, 2017

Do we want to consider allowing hermitian='U', which forwards to eigvalsh(UPLO='U')?

@perimosocordiae
Copy link
Contributor Author

I'm not sure how useful allowing 'U' and 'L' would be, so I'll defer to the consensus.

@eric-wieser
Copy link
Member

eric-wieser commented Jul 16, 2017

I'm not sure how useful allowing 'U' and 'L' would be,

I agree, but we already expose it on eigvals, so there's an argument to keep it. I only mention it as this is probably going to trigger a PR to move all the functions to hermitian=... at some point, and it would be nice to have a roadmap now.

No need to change this PR further

@perimosocordiae
Copy link
Contributor Author

I've opened gh-9436 to get that roadmap out explicitly.

@eric-wieser
Copy link
Member

Needs a release note.

Might be nice to add a benchmark comparing hermitian with not, but not necessary.

@perimosocordiae perimosocordiae force-pushed the patch-3 branch 3 times, most recently from e90561b to 4539ca9 Compare July 22, 2017 19:50
@perimosocordiae
Copy link
Contributor Author

Added a release note and rebased. I didn't add an asv benchmark because those are more useful for tracking efficiency improvements for the same codepaths. I did run a final sanity check using runtests.py -i, though:

Python 3.5.3 (default, Jan 19 2017, 14:11:04) 
Type 'copyright', 'credits' or 'license' for more information
IPython 6.1.0 -- An enhanced Interactive Python. Type '?' for help.

In [1]: x = np.random.random((500, 500))

In [2]: x = (x + x.T) / 2

In [3]: %timeit np.linalg.matrix_rank(x)
157 ms ± 194 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

In [4]: %timeit np.linalg.matrix_rank(x, hermitian=True)
46.7 ms ± 264 µs per loop (mean ± std. dev. of 7 runs, 10 loops each)

@perimosocordiae
Copy link
Contributor Author

Rebased to fix merge conflicts.

@eric-wieser
Copy link
Member

More conflicts, I'm afraid. I'll try and merge this before it conflicts again

@perimosocordiae
Copy link
Contributor Author

Conflicts resolved.

@eric-wieser
Copy link
Member

I don't agree with that last commit - that doesn't seem to match the style of other files.

Can you push again without that commit? Also, I don't think I can restart appveyor with a ci-skip commit last.

With a symmetric matrix, the more efficient `eigvalsh` method
can be used to find singular values.
@perimosocordiae
Copy link
Contributor Author

Ok, I reverted that change and squashed the commits. Hopefully appveyor will work this time!

@eric-wieser
Copy link
Member

Oh I see - the last commit brought it in line with what it was before. Whatever, let's just put this in, since the tests now run. Thanks @perimosocordiae!

@eric-wieser eric-wieser merged commit 55a3abb into numpy:master Sep 16, 2017
@perimosocordiae perimosocordiae deleted the patch-3 branch September 16, 2017 20:20
@perimosocordiae
Copy link
Contributor Author

Thanks for sticking with this, @eric-wieser !

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants