Skip to content

Add from_dense and to_dense methods #382

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Merged
merged 15 commits into from
Feb 17, 2023
Merged

Conversation

eriknw
Copy link
Member

@eriknw eriknw commented Feb 4, 2023

Closes #378.

@jim22k, I think we discussed to/from dense years ago, and I think we punted on methods like in this PR b/c it's surprisingly awkward to do in GraphBLAS. This PR shows that, indeed, this is difficult to do well, so I think they're good methods to add.

I also added sort=True to e.g. to_csr, because the C spec says column indices within a row may be unsorted, but sometimes we want them to be sorted. to_coo already has sort=True, so adding sorting more places seems natural.

Currently, from_dense is also used to create a fully dense Vector/Matrix from a scalar, which requires the shape to be provided. I think I like this better than a separate method such as Matrix.from_scalar(1, nrows=3, ncols=4).

This also improves some handling of sub-array dtypes.

I still need to add documentation. I think the APIs and behaviors need the most careful review.

@eriknw eriknw added feature Something is missing io Data input, output, and conversions labels Feb 4, 2023
@coveralls
Copy link

coveralls commented Feb 4, 2023

Coverage Status

Coverage: 99.411% (-0.08%) from 99.495% when pulling b4a80a4 on eriknw:from_dense into 1bbce69 on python-graphblas:main.

@eriknw eriknw marked this pull request as draft February 5, 2023 05:07
@eriknw eriknw marked this pull request as ready for review February 5, 2023 17:56
@eriknw
Copy link
Member Author

eriknw commented Feb 8, 2023

missing_value= is a nice addition to from_dense. However, missing_value= doesn't make any sense for building a dense object from a scalar, so I feel like this is now squeezing too much functionality into from_dense.

Compare:

def from_dense(cls, value, missing_value=None, dtype=None, *, nrows=None, ncols=None, name=None):
    # handle scalars and arrays

vs

    def from_dense_scalar(cls, value, nrows, ncols, dtype=None, *, name=None):
    def from_dense_array(cls, values, missing_value=None, dtype=None, *, name=None):

I prefer two methods. But, what should we name them?

@eriknw
Copy link
Member Author

eriknw commented Feb 8, 2023

I prefer two methods. But, what should we name them?

Perhaps:

  • from_dense for arrays
  • Add fill_value=None to the main constructor to fill with a scalar
    • e.g., Vector(size=5, fill_value=10)
  • or from_iso_value for scalar

@eriknw eriknw added the deprecation Something is being removed label Feb 9, 2023
@eriknw
Copy link
Member Author

eriknw commented Feb 9, 2023

Splitting from_dense into two functions (one for "dense from scalar") simplified the implementation, so +1 from me.

I still need to update documentation. Do we like from_iso_value?

@SultanOrazbayev
Copy link
Member

Splitting from_dense into two functions (one for "dense from scalar") simplified the implementation, so +1 from me.

Yes, it is more readable now.

I still need to update documentation. Do we like from_iso_value?

Looks good, I'm not sure how frequently this specific function is used (e.g. in algorithms).

@eriknw
Copy link
Member Author

eriknw commented Feb 10, 2023

Updated and ready for review.

@eriknw
Copy link
Member Author

eriknw commented Feb 15, 2023

We discussed this in our meeting today, and we settled on from_scalar instead of from_iso_value. We should also add a couple more notes such as this being cheap / low storage in SuiteSparse:GraphBLAS, and maybe show an alternative example for how to create a new matrix with a scalar value and a mask.

We could consider adding a mask= keyword to from_scalar, but this implies a shape, which makes things a little awkward.

@eriknw
Copy link
Member Author

eriknw commented Feb 15, 2023

from_iso_value is renamed to from_scalar, and I made a note about SuiteSparse:GraphBLAS being efficient.

@jim22k, what note about masks did you want to add?

I think this PR is ready to merge, but I'll wait for an approval (for a couple days).

@jim22k
Copy link
Member

jim22k commented Feb 15, 2023

The note could say that if you want to create an iso-valued Matrix or Vector with the same structure as an existing object, use obj.apply(binary.second, right=scalar).

…calar

A more flexible way with any mask is e.g.:
```python
w = Vector(v.dtype, size=v.size)
w(~v.S) << value
```
@eriknw
Copy link
Member Author

eriknw commented Feb 17, 2023

Thanks @jim22k and @SultanOrazbayev for your input on this PR! Even though I'm getting credit for commits, I know y'all deserve credit for helping too, because your engagement does make things better ❤️

This is going in! 🚀

@eriknw eriknw merged commit d226a51 into python-graphblas:main Feb 17, 2023
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deprecation Something is being removed feature Something is missing io Data input, output, and conversions
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Add easier way to create dense vectors and matrices
4 participants