Message Boards Message Boards

1
|
5820 Views
|
7 Replies
|
3 Total Likes
View groups...
Share
Share this post:

Why does the amount of memory for similar matrices vary so much?

Posted 8 years ago

I noticed that similar matrices require vastly different amounts of memory to store. For instance a certain calculated matrix of 1060x2 integer values between 1 and 112 requires about 48 bytes per element to store , while another one with similar integer values generated with RandomInteger requires only about 8 bytes per element. Why is that? Is there a way to reduce the memory necessary to store a matrix? Another thing. I noticed that if I transpose the first matrix the amount of memory necessary for storage decreases to about half. However the transposition does not affect the matrix generated with RandomInteger.

POSTED BY: Petre Logofatu
7 Replies

I said rectangular matrices because they were rectangular in a algebraic sense, although not in a programming sense.

Indeed it works, that is packing matrices the way you showed me saves a lot of memory, although slows the program. I was trying to build a matrix of array rules for a sparse matrix. I could not build a packed array of positions but I broke the positions array in 4 packed arrays and I added the corresponding sparse matrices. I noticed that the array with positions is taking more memory than the sparse matrix itself. This is why I was trying to minimize the memory necessary for those matrices. Also, I thought about building many sparse matrices with only parts of the total elements and then to add them. It saves even more memory, indeed, but then the program runs very slow. Thanks again.

If you know any tricks about working fast with sparse arrays, please let me know.

POSTED BY: Petre Logofatu

I forgot to add:

If an expression is rectangular (test with ArrayQ), and all numeric of the same type, than it can be converted to a packed array. Note that I said can, the two criteria are prerequisites for it to be able to become a packed array. Functions like RandomInteger know it will always output an rectangular ('full') array, and there will output a packed array by default. However table can't do that:

Table[{i, j}, {i, 2}, {j, 3}]

Though this is both a full array (ArrayQ => True), and all of the same type, it is NOT automatically converted because it is beforehand not known what elements it can expect during the evaluation of your 'table loop'.

Functions like RandomInteger, ConstantArray, Tuples among others (can!) directly give packed arrays, and I think Table never returns packed arrays.

N.B. You could use On["Packing"], to track if expression get unpacked during evaluation...

POSTED BY: Sander Huisman

Thank you. I will try to digest and to put in practice what you just told me. I will get back to you later to tell you if I was successful.

POSTED BY: Petre Logofatu

Without explicit code it is hard to say what is happening exactly....

But I'd bet that:

Developer`PackedArrayQ[array1]
Developer`PackedArrayQ[array2] 

will give True and False for those two matrices.

Some functions return so-called packed arrays, while others are not (because they are not rectangular (in any dimensions), or contains symbols, or mix floats and integers et cetera). These packed arrays can only be rectangular (so M or MxN or MxNxO or MxNxOxP.... et cetera) and must have the same numerical datatype.

Look at the following functions for more info:

ToPackedArray
FromPackedArray
PackedArrayForm
PackedArrayQ
POSTED BY: Sander Huisman

Thank you. Indeed it is a problem of packing. However the two matrices do not differ in the sense that one has symbols, or a mix of floats and integers. They are both rectangular and all their elements are positive integers. For illustration I added the sample code below. Indeed, after packing pd has the same byte count as m. After transposition the byte count of pd decreases to almost half, although the transposed pd is still unpacked. My real problem is, I see now, that, during construction, before packing, matrices like pd become very large and the computer runs out of memory. What I really need is pd to be small, packed even during construction. It still puzzles me why there are are necessary 40-50 bytes per element for storage.

In[1]:= Needs["Developer`"]
m = 5; n = 7; r = 4;
pl = Flatten[Table[{i, j}, {i, r}, {j, i}], 1];
pu = Flatten[Table[{i, j}, {i, r}, {j, i + 1, r}], 1];
nl = r (r + 1)/2; nu = r (r - 1)/2;
pdd = Flatten[ 
   Table[ConstantArray[{(i - 1) r, (j - 1) r}, nl] + pl, {i, r}, {j, 
     i}], 2];
pds = Flatten[ 
   Table[ConstantArray[{(i - 1) r, (j - 1) r}, nu] + pu, {i, r}, {j, 
     i}], 2];
pd = Flatten[{Table[r^2 (j - 1) + pdd, {j, n}], 
    Table[ConstantArray[{r^2 j, r^2 (j - 1)}, nl nu] + pds, {j, 
      n - 1}]}, 2];

In[9]:= Dimensions[pd]

Out[9]= {1060, 2}

In[10]:= ByteCount[pd]

Out[10]= 80936

In[11]:= 80936/(1060 2) // N

Out[11]= 38.1774

In[12]:= Max[pd]

Out[12]= 112

In[13]:= m = Table[RandomInteger[111] + 1, {i, 1060}, {j, 2}];
ByteCount[m]

Out[14]= 9304

In[15]:= 9304/(1060 2) // N

Out[15]= 4.38868

In[16]:= PackedArrayQ[pd]

Out[16]= False

In[17]:= PackedArrayQ[m]

Out[17]= True

In[18]:= pdpack = ToPackedArray[pd];
ByteCount[pdpack]

Out[19]= 9304

In[25]:= boo = True;
Do[boo = boo \[And] Head[pd[[i, j]]] == Integer, {i, 1060}, {j, 2}]
boo

Out[27]= True

In[28]:= boo = True;
Do[boo = boo \[And] Head[m[[i, j]]] == Integer, {i, 1060}, {j, 2}]
boo

Out[30]= True

In[31]:= pdtr = Transpose[pd];
ByteCount[pdtr]

Out[32]= 43192

In[33]:= PackedArrayQ[pdtr]

Out[33]= False
POSTED BY: Petre Logofatu

The main reason is that:

Table[{i, j}, {i, r}, {j, i}]

is not rectangular! So it can't be packed. Similarly, for all your other table-creations; they are not rectangular!

For packed arrays transposing arrays does not change there size, they are stored as a single 'line' of memory, and the 'dimensions' (and thus the 'ordering') are stored separately. Transposing just changes the latter, the former stays untouched.

For expression that are not packed, think of it as the following:

test = {{1, 2, 3, 4}, {5, 6, 7, 8}};
TreeForm[test]
TreeForm[test\[Transpose]]

Both have the same amount of information on first sight. However, Mathematica stores the numbers 1...8 but also the heads for each subexpression. You can see in the TreeForm that Mathematica has to store more 'List' heads in the transposed case. This explained the difference in ByteCount (among others I guess).

You can manually pack your expression after flattening them using:

Developer`ToPackedArray
POSTED BY: Sander Huisman

You could create your packed arrays like this but it will be slower (I guess) for larger matrices. You have to start with a packed expression, and then carefully replace values:

r=5;
pl=ConstantArray[100,{r (r+1)/2,2}];
Developer`PackedArrayQ[pl]
pos=1;
Do[pl[[pos,1]]=i;pl[[pos++,2]]=j,{i,r},{j,i}]
pl
Developer`PackedArrayQ[pl]
POSTED BY: Sander Huisman
Reply to this discussion
Community posts can be styled and formatted using the Markdown syntax.
Reply Preview
Attachments
Remove
or Discard

Group Abstract Group Abstract