[WIP] PERF Replace np.dot with higher level BLAS _gemv #20396
Conversation
```diff
@@ -506,7 +506,10 @@ def enet_coordinate_descent_gram(floating[::1] w,
     cdef unsigned int n_features = Q.shape[0]

     # initial value "Q w" which will be kept up to date in the iterations
-    cdef floating[:] H = np.dot(Q, w)
+    # cdef floating[:] H = np.dot(Q, w)
+    cdef floating[:] H = np.zeros(n_features, dtype=dtype)
```
Would suggest making this empty, since it doesn't contribute to the computation below and is overwritten:

```diff
-cdef floating[:] H = np.zeros(n_features, dtype=dtype)
+cdef floating[:] H = np.empty(n_features, dtype=dtype)
```
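To make the reviewer's point concrete, here is a minimal NumPy sketch (the names are illustrative, not the actual Cython code): H is fully overwritten before it is ever read, so zero-initialization is wasted work and np.empty suffices.

```python
import numpy as np

n_features = 5
rng = np.random.default_rng(0)
Q = rng.standard_normal((n_features, n_features))
w = rng.standard_normal(n_features)

# H is allocated and then completely overwritten by the matrix-vector
# product, so pre-filling it with zeros buys nothing; np.empty skips it.
H = np.empty(n_features, dtype=np.float64)
np.dot(Q, w, out=H)  # writes every element of H
```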
Thank you @melissawm! Could someone remind me what the motivation is for replacing np.dot with the BLAS call in this particular case? I understand the use case for combining multiple operations in a single BLAS call as in #11507, but that's not the case here. Is it that it would release the GIL, or that there is less pre-processing?
Actually, should we check that the array is C contiguous before applying BLAS functions, since the code can change in the future?
The contiguity is already enforced by the typed declarations floating[::1] w and np.ndarray[floating, ndim=2, mode='c'] Q. That being said, I am not sure we would get a performance improvement in this specific situation. Maybe @jeremiedbb knows? In any case, a small benchmark would give us a definitive answer.
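For illustration, a runtime version of such a guard could look like the sketch below; as_c_contiguous is a hypothetical helper, and in the Cython code the typed signatures above already enforce this at call time.

```python
import numpy as np

def as_c_contiguous(a):
    # Hypothetical guard: BLAS routines assume a contiguous memory layout,
    # and np.ascontiguousarray copies only when the input doesn't qualify.
    return a if a.flags["C_CONTIGUOUS"] else np.ascontiguousarray(a)

Q = np.asfortranarray(np.eye(3))   # Fortran-ordered, hence not C-contiguous
assert not Q.flags["C_CONTIGUOUS"]
assert as_c_contiguous(Q).flags["C_CONTIGUOUS"]
```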
BTW, the GIL is already released at line 540.
The motivation of the mentioned issue is to use higher-level BLAS functions than the ones currently used when possible, i.e. to use gemv instead of dots in a loop for instance, not to replace numpy's dot by directly calling the corresponding BLAS function. I would not expect much speed-up in that case, although a benchmark is welcome because I could be wrong.
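A sketch of the pattern the issue is actually about, using SciPy's Python-level BLAS wrapper dgemv in place of scikit-learn's internal Cython helper: many dots in a loop (level-1 BLAS) collapse into a single level-2 gemv call.

```python
import numpy as np
from scipy.linalg.blas import dgemv

rng = np.random.default_rng(0)
n = 1_000
Q = rng.standard_normal((n, n))
w = rng.standard_normal(n)

# The pattern the issue targets: n small level-1 BLAS calls, one per row.
H_loop = np.empty(n)
for i in range(n):
    H_loop[i] = np.dot(Q[i], w)

# The level-2 replacement: a single gemv computing alpha * (Q @ w).
H_gemv = dgemv(1.0, Q, w)

assert np.allclose(H_loop, H_gemv)
```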
Yeah, I would agree with that. It's possible the list I made (a long time ago) included this line erroneously.
Got it - this is all helpful information, so I'll review all this with benchmarks. I'll mark this one as a draft for completeness and post the benchmarking results here, but you are right that there may not be an improvement. In any case, I can go through the list and find other candidates, and now I know what to look for. Thanks!
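For reference, a rough micro-benchmark of the kind discussed here could look like the following; it uses SciPy's dgemv rather than scikit-learn's Cython _gemv, so it only approximates the in-tree situation.

```python
import timeit

import numpy as np
from scipy.linalg.blas import dgemv

rng = np.random.default_rng(0)
n = 2_000
Q = rng.standard_normal((n, n))
w = rng.standard_normal(n)

# Both calls dispatch to the same underlying BLAS gemv kernel, so any
# difference measured here is mostly Python-level call overhead.
t_dot = timeit.timeit(lambda: np.dot(Q, w), number=200)
t_gemv = timeit.timeit(lambda: dgemv(1.0, Q, w), number=200)
print(f"np.dot: {t_dot:.3f}s   dgemv: {t_gemv:.3f}s")
```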
+1 to implement the other cases from #13210 (comment), except for this one.
#DataUmbrella LATAM sprint |
Sent @melissawm a reminder on the PR.
This PR is stalled and is open to a contributor.
I have relabeled this PR as per this comment.
I see 2 possible ways forward: benchmark this change to see whether it actually helps, or close the PR.
@melissawm IMO, it's up to you if you are still interested - and apologies for the long decision process here :blush:
If there is no measurable improvement, then I prefer to close.
I agree, and as a general rule of thumb, directly calling BLAS should be done only where we don't want any Python interaction at all. Elsewhere, numpy should be preferred, unless a convincing benchmark shows otherwise. I'm also in favor of closing, since it's not benchmarked (and very unlikely to give a speed-up).
So let's close.
Reference Issues/PRs
Partially addresses #13210
What does this implement/fix? Explain your changes.
Replaces np.dot with a BLAS _gemv call.
Any other comments?
This is my first PR so any feedback is welcome, thanks!