Skip to content

MAINT Use memoryviews in _expected_mutual_info_fast #25575

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Closed

Conversation

OmarManzoor
Copy link
Contributor

Reference Issues/PRs

Towards: #25484

What does this implement/fix? Explain your changes.

  • Replaced cnp.ndarray with memory views in _expected_mutual_info_fast.pyx

Any other comments?

CC: @Vincent-Maladiere @jjerphan @adam2392

Copy link
Contributor

@Vincent-Maladiere Vincent-Maladiere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hi @OmarManzoor, thanks for the PR :)
In this case, I wonder if memoryviews are justified since we only use the underlying buffers and return np.array through np.log or gammaln. WDYT?

@OmarManzoor
Copy link
Contributor Author

OmarManzoor commented Feb 10, 2023

Hi @OmarManzoor, thanks for the PR :) In this case, I wonder if memoryviews are justified since we only use the underlying buffers and return np.array through np.log or gammaln. WDYT?

I think we can just make use of cnp arrays here. However we are indexing arrays later in the code.

@OmarManzoor
Copy link
Contributor Author

Hi @OmarManzoor, thanks for the PR :) In this case, I wonder if memoryviews are justified since we only use the underlying buffers and return np.array through np.log or gammaln. WDYT?

I think we can just make use of cnp arrays here. However we are indexing arrays later in the code.

This is what I am referring to.

cdef Py_ssize_t i, j, nij
for i in range(R):
for j in range(C):
for nij in range(start[i,j], end[i,j]):
term2 = log_Nnij[nij] - log_a[i] - log_b[j]
# Numerators are positive, denominators are negative.
gln = (gln_a[i] + gln_b[j] + gln_Na[i] + gln_Nb[j]
- gln_N - gln_nij[nij] - lgamma(a[i] - nij + 1)
- lgamma(b[j] - nij + 1)
- lgamma(N - a[i] - b[j] + nij + 1))
term3 = exp(gln)
emi += (term1[nij] * term2 * term3)

@Vincent-Maladiere
Copy link
Contributor

I don't understand your point; could you give more details?
After discussion with @jjerphan, memoryviews allow decoupling our implementations from numpy, so even using .base is still a feature that brings value.
However, in this specific case, there is currently a significant revamp work of _expected_mutual_info_fast by @glemaitre, so I wonder if we should wait before merging this PR.

@jjerphan
Copy link
Member

#24794 is the PR @Vincent-Maladiere mentioned.

@OmarManzoor
Copy link
Contributor Author

#24794 is the PR @Vincent-Maladiere mentioned.

Looking at the PR I think that if there is work being done in this file then we can revisit this PR after that work is finalized. What do you think?

@jjerphan
Copy link
Member

jjerphan commented Feb 14, 2023

This is the best approach I think. I would prioritize making all the C extensions use the latest NumPy C API as mentioned in #24875.

@OmarManzoor OmarManzoor deleted the cython_expected_mutual_info branch March 7, 2023 08:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants