Message Boards Message Boards

4
|
7218 Views
|
10 Replies
|
12 Total Likes
View groups...
Share
Share this post:

Possible bug: Map puts SparseArray in inconsistent state

Posted 7 years ago
POSTED BY: Szabolcs Horvát
10 Replies
POSTED BY: Szabolcs Horvát
POSTED BY: Ilian Gachevski

Okay, "promises" was too strong a word.

The documentation is somewhat vague on ArrayRules, so one is left to make some guesses.

To interpret the documentation properly, it is also critical to distinguish between arguments described as list, which refer to dense arrays, and arguments described as s or SparseArray[...], which refer only to SparseArrays. I did not pay sufficient attention to this distinction.

The "NonzeroPositions" are not recomputed automatically, as doing so would be unreasonably expensive.

This is completely reasonable.

What ArrayRules actually does is, in my opinion, not so reasonable from a user-friendliness perspective.

Let me describe what it does precisely (as I understand it now).

ArrayRules[x] will:

  • If x is a SparseArray, then return the background element (which may be nonzero!) and explicit values (which may contain the background) in a specific format.
  • Otherwise, return positions/values that are nonzero.

The documentation says, "ArrayRules[list] gives rules for SparseArray[list]", which is a completely reasonable concise explanation, but of course it should not be taken literally, as evidenced by the difference between ArrayRules[{{1} -> 1, {4} -> 1}] and ArrayRules@SparseArray[{{1} -> 1, {4} -> 1}].

There is also a line saying "ArrayRules[list] assumes a default value of 0." Notice that this is only true for dense arrays (which is implied by the name list, but that is too easy to overlook).

ArrayRules[x, b] will:

  • If x is a SparseArray with background element b, then return the background element and explicit values (which may contain the background) in a specific format
  • If x is a SparseArray with background element other than b, then return the positions/values different from b
  • If x is not a SparseArray then return positions/values different from b.

This may be okay from the implementor's perspective, but it is disturbingly inconsistent from a user's perspective. Why is there a special casing for whether the second argument is equal to the background element of the first one?

  1. ArrayRules[x] could be a function that just reveals the structure of a SparseArray: explicit values/positions and background element. In this case it should not work on a dense array and should not have a second argument. There is no need for a second argument because that background element is what it is. This would be a simple to understand function that operates on a sparse array data structure.

  2. Otherwise ArrayRules[x, b] could be a function which returns the values/positions of elements different from b. The default b could be taken to be 0 (for the one-argument form). This ArrayRules would not care about what data structure x is (dense or sparse). It would operate at a higher, conceptual level. It is also simple to understand.

Instead, we have an ArrayRules which is a confusing mix of these two things.

All these things are not precisely stated in the documentation. The behaviour is sufficiently complicated that it would be hard to describe it concisely. So I am left to guess. Given that ArrayRules works on dense arrays and that it has a second argument, guessing that the two-argument form does what I described in 2. above was completely reasonable.

POSTED BY: Szabolcs Horvát

ArrayRules[arr, x] promises to return the positions of only those elements of arr which are different from x.

Out of curiosity, where exactly is this promise stated?

POSTED BY: Ilian Gachevski

In tutorial/SparseArrays-LinearAlgebra it says: "the rules for nonzero elements in an array". But it is not said in the documentation of ArrayRules itself I think..

POSTED BY: Sander Huisman

I am retyping this response because after I typed up most of it, it simply vanished with no warning (and no interaction on my part) ... not sure why.

I've always wondered if AdjacencyLists, Background, ColumnIndices, Density, MatrixColumns, MethodInformation, Methods, NonzeroPositions, NonzeroValues, PatternArray, Properties, and RowPointers are actually officially supported and should be used.

These are commonly used, and shown in some documentation examples (just search for them). You will also find them in a lot of source code that comes with Mathematica. I really doubt that they will go away.

But here I only used them to pinpoint the problem. The problem manifests itself in a practical way during the use of ArrayRules.

But yeah, seems to be a 'bug', though not a critical one, as your matrix is fine...

I think this is a critical problem because it leads to incorrect results. I am not sure that we can say that the matrix is fine if the various documented functions no longer work correctly with it. See the ArrayRules example.

I used ArrayRules to replace all zero elements of a sparse or dense matrix by something else. The implementation needed to work on huge sparse arrays, so converting to a dense format was not an option. This bug broke it.

Specifically, I needed to replace all zeros in a sparse adjacency matrix by Infinity, so I could use WeightedAdjacencyGraph to convert back the output of WeightedAdjacencyMatrix to a graph. The former uses Infinity to represent nonexistent edges, while the latter uses zero (0 or 0.0). I am not sure why there is a mismatch between the two. It is only an inconvenience.

I Mapped a cutoff function onto the weighted adjacency matrix to replace some small weights with zeros (and thus remove those edges). That is how the inconsistent sparse matrix was created.

POSTED BY: Szabolcs Horvát

What I meant was, is that when you construct an array from the array rules the matrix is not 'wrong', it is however not stored most efficiently... Someone should have a look at it though...

POSTED BY: Sander Huisman

My point is that ArrayRules[arr, x] promises to return the positions of only those elements of arr which are different from x. This is regardless of whether arr is sparse or dense or what its background element is when it is sparse.

This is not what it does here. ArrayRules[arr, 0] is giving me the positions of some zeros.

That's a wrong result, in other words: a bug. This is not about inefficient storage. It is about a function giving an incorrect result.

POSTED BY: Szabolcs Horvát
POSTED BY: Sander Huisman

I've always wondered if AdjacencyLists, Background, ColumnIndices, Density, MatrixColumns, MethodInformation, Methods, NonzeroPositions, NonzeroValues, PatternArray, Properties, and RowPointers are actually officially supported and should be used. I've used some of these myself, but I always have in the back of my mind the idea that some day it might not work anymore... I could not find any of the properties in the documentation...

But yeah, seems to be a 'bug', though not a critical one, as your matrix is fine...

POSTED BY: Sander Huisman
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract