@thread17 Agree!
To be honest, it feels like scan is pretty much always used to compute a sum of elements with no regard to the partial sums you're left with. I'm sure there are other great uses of sum that require the partial sums, but I'm wondering why we don't just use a sum algorithm in situations like this one.
Does the overhead of having to determine which indices in the values array need to be multiplied incur a significant performance hit when doing sparse matrix arithmetic? I read a bit about optimizing sparse matrix operations, but I've been a bit confused as to how optimal it can be given the fact that we now have to load a bunch of different arrays and index into each for any arithmetic. Is the main benefit saving space in memory?
Can anyone please explain how flag works here? I got the row_starts part, but I'm not sure how it transfer to [1,0,1,1,0,0]. Thanks!
nvm, I think I figured it out if it's [1,0,1,1,1,0,0]. Thanks @thread17 and @icebear101 !
@swkonz I feel like the main benefit would indeed be saving space in memory. I think this example in particular is pretty small but you could imagine having a 1mil x 1mil sparse matrix with only say 25 elements set, and this alternate approach would be much more memory efficient for both storing in memory and when it comes to loading the data necessary to perform the arithmetic. Also, I think in terms of performing the arithmetic, it does have to figure out which indices in the values array need to be multiplied, but the regular version would multiply every row with every column, including the 0 elements, so maybe there are way less math ops in the optimized algorithm which could hide the performance hit from determining indices?
Can someone explain how the row_starts = [0,2,3,4] was computed a head of time using only the matrix given in the picture? I understand how one can come up with the vector [1,0,1,1,1,0,0] by looking at the matrix, but I don't get what the numbers mean?
nvm, I realize that the indexes the [1,0,1,1,1,0,0] is a vector that is 1 precisely at the locations of [0,2,3,4].
Could someone explain where row_starts/flags comes from? I see their relationship with each other but how can one see this from the matrix itself?
The value at index i
in the row_starts
array is equal to the index of the subarray representing row i
within the (flattened) values
array. For index 0, the row is [3, 1], whose index in values
is 0, so we get row_starts[0] == 0
. For index 1, the row is [2], whose index in values
is 2 (since it's represented as a flat array), so we get row_starts[1] == 2
. flags
, of course, is generated straightforwardly from row_starts
.
Please log in to leave a comment.
Should flags = [1,0,1,1,1,0,0] here because there are 4 rows in the matrix?