# How I saved Gigabytes of memory or the Discovery of FAT and thin lists

Posted 9 years ago
I've been running Mathematica with large amounts of data, where the ByteCount of a list can exceed a gigabyte. Sometimes a seemingly small code change sends Mathematica into a virtual memory fit that locks up the system. (A related problem: Mathematica can get into a VM fit just trying to DumpSave a large list, whereas it can Put (">>") the same list successfully.)

The problem is that Mathematica has at least two ways of storing lists, and one method takes 3 times the storage of the other. Here is code to demonstrate the problem (scaled down). We'll start with a simple list of lists.

    myList = Table[Table[RandomReal[], {1000000}], {10}];
    ByteCount /@ myList
    ByteCount@myList

Output:

    {8000144, 8000144, 8000144, 8000144, 8000144, 8000144, 8000144, 8000144, 8000144, 8000144}
    80001560

Then we Flatten the list or Join the sublists to produce a simple flat result. The ByteCounts differ between the two functions.

    myFlatList = Flatten@myList;
    ByteCount@myFlatList

    240000080

    myJoinList = Join @@ myList;
    ByteCount@myJoinList

    80000144

But the lists are equal and the same! In memory usage one is FAT and the other is thin.

    Equal[myJoinList, myFlatList] && SameQ[myJoinList, myFlatList]

    True

The difference extends to saving the lists as MX.

    SetDirectory["D:\\"];
    DumpSave["flatlist.mx", myFlatList];
    DumpSave["joinlist.mx", myJoinList];
    FileByteCount@"flatlist.mx"
    FileByteCount@"joinlist.mx"

    108640546
    80000252

When the MX files are read back, the FAT and thin storage is preserved.

    << "flatlist.mx";
    << "joinlist.mx";
    ByteCount@myFlatList
    ByteCount@myJoinList

    240000080
    80000144

What about text output?

    Put[myFlatList, "flatlist.txt"];
    Put[myJoinList, "joinlist.txt"];
    FileByteCount@"flatlist.txt"
    FileByteCount@"joinlist.txt"

    125000000
    123333334

There is a slight difference in size: Mathematica seems to vary the number of values it writes per line. The good news is that the values are correct; the bad news is that lists read back from text are always FAT.

    join2 = << "joinlist.txt";
    flat2 = << "flatlist.txt";
    join2 == flat2
    ByteCount@flat2
    ByteCount@join2

    True
    240000080
    240000080

Operations can make FAT lists or thin lists. The simplest way (so far) of thinning a list (by 2/3) is this:

    newFlat = Map[Identity, myFlatList];
    ByteCount@newFlat

    80000144

Mathematica 9 has the same results.
5 Replies
Posted 9 years ago
 Yes, Map uses autocompilation for lists over a certain size, and the result will be packed if possible. The direct way would be packed = Developer`ToPackedArray[unpacked];
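To make the FAT/thin distinction visible, here is a minimal sketch that checks the storage with Developer`PackedArrayQ and packs explicitly with Developer`ToPackedArray. The sizes are scaled down from the original post and the variable names simply mirror it; outputs are left out on purpose, since byte counts and packing behavior can vary between versions.

```mathematica
(* Scaled-down version of the thread's example: 10 sublists of 10^5 reals *)
myList = Table[Table[RandomReal[], {100000}], {10}];
myFlatList = Flatten[myList];

(* PackedArrayQ reports whether a list uses the compact packed representation *)
Developer`PackedArrayQ[myFlatList]

(* ToPackedArray packs when the elements are uniform machine numbers;
   otherwise it returns its argument unchanged *)
packed = Developer`ToPackedArray[myFlatList];
Developer`PackedArrayQ[packed]

(* Compare the memory footprint before and after packing *)
{ByteCount[myFlatList], ByteCount[packed]}

(* Packing changes only the storage, not the value *)
packed === myFlatList
```

If the second check gives True and the second byte count is roughly a third of the first, the list has gone from FAT to thin.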
Posted 9 years ago
 myList is not a packed array, and I guess Flatten therefore decides the result should be unpacked. Had the thing been packed to begin with, there would be no unpacking by Flatten. Try it e.g. with myList2 = RandomReal[1, {1000000, 10}]; As for Join, it is in fact seeing packed arrays, since the component sublists of myList are packed. Ergo, a packed result.
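The explanation above can be checked directly. This is a sketch under the assumption that the behavior matches the versions discussed in this thread; the sizes are scaled down, and On["Packing"] is used only as a diagnostic that asks the kernel to report when it unpacks an array.

```mathematica
(* The outer list is built from only 10 iterations; check whether it is
   packed as a whole, and whether each sublist is packed on its own *)
myList = Table[Table[RandomReal[], {100000}], {10}];
Developer`PackedArrayQ[myList]
Developer`PackedArrayQ /@ myList

(* Building the whole matrix in one call, as suggested above *)
myList2 = RandomReal[1, {100000, 10}];
Developer`PackedArrayQ[myList2]
Developer`PackedArrayQ[Flatten[myList2]]

(* Join applied to the packed sublists *)
Developer`PackedArrayQ[Join @@ myList]

(* Diagnostic: report unpacking while experimenting, then switch it off *)
On["Packing"];
Flatten[myList2];
Off["Packing"];
```

If the reasoning in this reply holds, myList as a whole should test as unpacked while each of its sublists tests as packed, and Flatten[myList2] should stay packed.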
Posted 9 years ago
 And to be different, Map decides the result should be packed. I'll be using this: packed = Map[Identity, unpacked];
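One caveat worth testing before relying on Map[Identity, ...]: since Map only autocompiles above a length threshold, short lists may come back unpacked. The sketch below reads that threshold from the "CompileOptions" system options (the value is version dependent) and compares the two approaches on a short and a long list. The names short and long are just for illustration.

```mathematica
(* Length above which Map tries to autocompile; version dependent *)
"MapCompileLength" /. ("CompileOptions" /. SystemOptions["CompileOptions"])

(* Deliberately unpacked test lists, one short and one long *)
short = Developer`FromPackedArray[RandomReal[1, 10]];
long = Developer`FromPackedArray[RandomReal[1, 100000]];

(* Packing via Map[Identity, ...] *)
Developer`PackedArrayQ /@ {Map[Identity, short], Map[Identity, long]}

(* Packing via the explicit function *)
Developer`PackedArrayQ /@ {Developer`ToPackedArray[short], Developer`ToPackedArray[long]}
```

If the two rows differ for the short list, Developer`ToPackedArray is the safer choice when list lengths are not known in advance.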
Posted 9 years ago
Posted 9 years ago
 Very interesting! I'm interested to know the cause of this. I sometimes also work with very big files...