Question: Most efficient way to set or remove the diagonal in large matrices #487

q-aaronzolnailucas · 2023-07-31T19:12:55Z

Hi, what's the best way to set or remove the diagonal (self-loops) from a graphblas Matrix?

eriknw · 2023-07-31T19:46:44Z

Good questions, thanks for asking! Here are a few examples:

>>> import graphblas as gb
>>> import numpy as np

>>> A = gb.Matrix.from_dense(np.arange(9).reshape(3, 3))
>>> A
"M_0"      nvals  nrows  ncols  dtype  format
gb.Matrix      9      3      3  INT64   fullr
---------------------------------------------
   0  1  2
0  0  1  2
1  3  4  5
2  6  7  8

>>> # Diagonal of a Matrix as a Vector
>>> A.diag()
"v_0"      nvals  size  dtype  format
gb.Vector      3     3  INT64    full
-------------------------------------
index  0  1  2
value  0  4  8

>>> # Off-diagonal of a Matrix as a Vector
>>> A.diag(1)
A"v_1"      nvals  size  dtype  format
gb.Vector      2     2  INT64    full
-------------------------------------
index  0  1
value  1  5

>>> # Select diagonal of a Matrix as a Matrix
>>> A.select("diag")
gb.MatrixExpression                         nrows  ncols  dtype
M_0.select(select.diag[BOOL], thunk=False)      3      3  INT64

"Result"   nvals  nrows  ncols  dtype
gb.Matrix      3      3      3  INT64
-------------------------------------
   0  1  2
0  0
1     4
2        8

Do expr.new() or other << expr to calculate the expression.

>>> # Select off-diagonal of a Matrix as a Matrix (using alternative API)
>>> gb.select.diag(A, 1)
gb.MatrixExpression                     nrows  ncols  dtype
M_0.select(select.diag[BOOL], thunk=1)      3      3  INT64

"Result"   nvals  nrows  ncols  dtype
gb.Matrix      2      3      3  INT64
-------------------------------------
  0  1  2
0    1
1       5
2

Do expr.new() or other << expr to calculate the expression.

>>> # Remove diagonal elements of a Matrix using select
>>> A << gb.select.offdiag(A)
>>> A
"M_0"      nvals  nrows  ncols  dtype   format
gb.Matrix      6      3      3  INT64  bitmapr
----------------------------------------------
   0  1  2
0     1  2
1  3     5
2  6  7

>>> # Assign a Vector to the diagonal of a Matrix
>>> v = gb.Vector.from_dense([1, 2, 3])
>>> v
A(Diag.S) << Diag"v_2"      nvals  size  dtype  format
gb.Vector      3     3  INT64    full
-------------------------------------
index  0  1  2
value  1  2  3

>>> A(gb.op.second) << v.diag()

>>> # or
>>> Diag = v.diag()
>>> A(Diag.S) << Diag
>>> A
"M_0"      nvals  nrows  ncols  dtype  format
gb.Matrix      9      3      3  INT64   fullr
---------------------------------------------
   0  1  2
0  1  1  2
1  3  2  5
2  6  7  3

Does this help? Is there any way we can make this better?

eriknw · 2023-07-31T19:51:50Z

I would be curious to see which of these is faster in general to assign a Vector to a diagonal:

>>> A(gb.op.second) << v.diag()

or

>>> Diag = v.diag()
>>> A(Diag.S) << Diag

I would expect and hope they are comparable.

q-aaronzolnailucas · 2023-08-01T10:05:41Z

@eriknw thank you for the detailed response - and apologies if this would have fit better in 'discussions', will keep in mind next time.

I will try both

>>> A(gb.op.second) << v.diag()

and

>>> Diag = v.diag()
>>> A(Diag.S) << Diag

and report back if I see significant differences.

As to improvements - I think adding/removing self loops is quite a common operation - e.g. removing loops before pagerank or adding them before GNN message passing.

Scipy has a setdiag method that is quite useful, maybe wrapping some of your answer into something like that (but that fits the lazy execution model).

I think particularly

A(gb.op.second) << v.diag()

is quite hard to read. It's also more verbose to set the diagonal to a scalar.

I'd be happy to have a go at implementing a setdiag method if it fits in the paradigm

jim22k · 2023-08-02T15:29:04Z

Removing the diagonal is easy.

gb.select.offdiag(A)

Setting the diagonal is necessarily more complex due to the interactions with the existing diagonal.

Replace every element of the diagonal with a Scalar value
Replace existing elements of the diagonal with a Scalar value
Accumulate a Vector into the existing diagonal
Replace the existing diagonal with values from a Vector (leaving untouched the values on the diagonal where the Vector has missing values)
Fully replace the existing diagonal with a Vector (dropping values on the diagonal where the Vector has missing values)

I worry that a setdiag method would need to be quite complex to capture these different situations.

q-aaronzolnailucas · 2023-08-03T08:29:24Z

I see your point @jim22k, but wouldn't the complex nature be a good reason to have an implementations, so that users don't have to write it each time? I think like the nuances in the most efficient approach for each of the above requires a deep understanding of graphblas

eriknw · 2023-08-12T04:32:42Z

I'd like to entertain the possibility of a simpler A.setdiag method.

You both make good points. I agree this may be a common and useful operation. It doesn't need to do every option that Jim listed--and probably shouldn't--but its docstring should discuss how to do things "manually" if one needs accumulation, masking, etc. This is a reasonable method to look for--especially since scipy has it--so we can make it findable, useful, and educational.

Firstly, I don't think A.setdiag needs to be lazy. A.diag and v.diag both return new objects, not expressions. Calling A.setdiag can update A.

Here's what I think it could look like:

def setdiag(self, values, k=0):

where values can be:

scalar (Python scalar or gb.Scalar)
- replace all diagonal elements with the value
np.ndarray
- replace all diagonal element with the values
Vector
- replace diagonal values where the vector has values, and leave other diagonal values alone
Matrix (maybe)
- replaces diagonal values with the diagonal values of the input matrix, and leave missing diagonal elements alone

Right now, I'm leaning towards adding something like this, and am interested in hearing what other people think. Arguments for adding it are:

it has a good name
it is useful
it (probably) reads better (for most people)
implementating different cases can be non-obvious / non-trivial for some
- in particular, the best/fastest way may not be obvious (we should benchmark and choose)
the implementation can be simple (for us experts) and fast
people may look for it and expect it (as @q-aaronzolnailucas did)
it can be educational

Some arguments for not adding it are:

API pressure to add mask (vector or matrix?) and accumulator as keywords, which is not a common pattern for us
explicit recipes exist to do the operation you want
may be ambiguous when reading code whether missing values from vector will remove diagonal elements from A
- for example, is it reasonable to expect that A.setdiag(v) ; v.isequal(A.diag()) is always true?
- API pressure to add an option to control this behavior (i.e., clear_missing= or whatever)
- this will need to be clearly documented
should setdiag return self so it can be used in method chaining, or should it return None so there is no confusion about return value (since it doesn't return an expression)?

I'll retort to myself that API pressure isn't necessarily a bad thing. It offers possible future enhancements if there is sufficient need.

Thanks for the suggestion and engaging with us @q-aaronzolnailucas! I really appreciate ideas to improve usability.

eriknw · 2023-09-05T21:12:20Z

We've discussed various ways to set the diagonal during our weekly meetings. Here are some options as I understand them:

# 1
A.setdiag(v)
A.setdiag(1, k=1)
A.setdiag([1, 2, 3])
A('+').setdiag(v)
A(v.S).setdiag(v)

# 2
A.setdiag(v)
A.setdiag(v, accum="+")
A.setdiag(v, mask=v.S)

# 3
A[diag] = v
A[diag(0)] = v
A(...)[diag(0)] = v
A[diag(0)](...) << v

# 4
A[diag, 0] = v
A['diag', 0] = v

# 5
#A.diag[0].new()  # extract diagonal--maybe--not in GraphBLAS (diag is a constructor)
A.diag[0] = v
# Note we already have A.diag() -> Vector
v = A.diag(k)
A.diag[k] = v

I think the top two options are 2 and 3. Maybe we can begin with option 2?

eriknw · 2023-09-22T15:46:29Z

Closing. A.setdiag(k=k, mask=mask, accum=accum) was added in #493, and extracting the diagonal is best done with A.diag(k) or gb.select.diag(A, k).

Thanks again for the questions and suggestions @q-aaronzolnailucas! I think A.setdiag is a nice addition to the GraphBLAS API.

q-aaronzolnailucas changed the title ~~Question: Most efficient way to set the diagonal in large matrices~~ Question: Most efficient way to set or remove the diagonal in large matrices Jul 31, 2023

eriknw added the discussion Discussing a topic with no specific actions yet label Jul 31, 2023

eriknw mentioned this issue Aug 23, 2023

Add A.setdiag(x, k) #493

Merged

eriknw closed this as completed Sep 22, 2023

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Question: Most efficient way to set or remove the diagonal in large matrices #487

Question: Most efficient way to set or remove the diagonal in large matrices #487

q-aaronzolnailucas commented Jul 31, 2023

eriknw commented Jul 31, 2023

Uh oh!

eriknw commented Jul 31, 2023

Uh oh!

q-aaronzolnailucas commented Aug 1, 2023 •

edited

Loading

Uh oh!

jim22k commented Aug 2, 2023

Uh oh!

q-aaronzolnailucas commented Aug 3, 2023

Uh oh!

eriknw commented Aug 12, 2023

Uh oh!

eriknw commented Sep 5, 2023

Uh oh!

eriknw commented Sep 22, 2023

Uh oh!

Question: Most efficient way to set or remove the diagonal in large matrices #487

Question: Most efficient way to set or remove the diagonal in large matrices #487

Comments

q-aaronzolnailucas commented Jul 31, 2023

eriknw commented Jul 31, 2023

Uh oh!

eriknw commented Jul 31, 2023

Uh oh!

q-aaronzolnailucas commented Aug 1, 2023 • edited Loading Uh oh! There was an error while loading. Please reload this page.

Uh oh!

Uh oh!

jim22k commented Aug 2, 2023

Uh oh!

q-aaronzolnailucas commented Aug 3, 2023

Uh oh!

eriknw commented Aug 12, 2023

Uh oh!

eriknw commented Sep 5, 2023

Uh oh!

eriknw commented Sep 22, 2023

Uh oh!

q-aaronzolnailucas commented Aug 1, 2023 •

edited

Loading