I've been running Mathematica on large amounts of data, where the ByteCount of a single list can exceed a gigabyte. Sometimes a seemingly small software change sends Mathematica into a virtual-memory fit that locks up the system. (A related problem: Mathematica can get into a VM fit just trying to DumpSave a large list, yet it can Put (">>") the same list successfully.)
The problem is that Mathematica has at least two ways of storing a list, and one takes three times the memory of the other. Here is scaled-down code that demonstrates it. We start with a simple list of lists.
myList = Table[Table[RandomReal[], {1000000}], {10}];
ByteCount /@ myList
ByteCount@myList
Output
{8000144, 8000144, 8000144, 8000144, 8000144, 8000144, 8000144, 8000144, 8000144, 8000144}
80001560
Next we either Flatten the list or Join its sublists to produce a single flat list of 10,000,000 values. The two functions give results with very different ByteCounts.
myFlatList = Flatten@myList;
ByteCount@myFlatList
240000080
myJoinList = Join @@ myList;
ByteCount@myJoinList
80000144
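Dividing each ByteCount by the number of elements makes the factor of three explicit. This is only a quick arithmetic check on the two lists defined above; with the figures shown, it works out to roughly 24 bytes per value for the Flatten result and roughly 8 bytes per value for the Join result.

```mathematica
(* Both lists hold the same 10,000,000 values, so the per-element cost
   is just the total ByteCount divided by the length. *)
N[ByteCount@myFlatList/Length@myFlatList]
N[ByteCount@myJoinList/Length@myJoinList]
```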
Yet the two lists are both Equal and SameQ! Only their memory usage differs: one is FAT and the other is thin.
Equal[myJoinList, myFlatList] && SameQ[myJoinList, myFlatList]
True
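One possible explanation, offered here as an assumption rather than something the tests above establish, is that the thin list is stored as a packed array and the FAT one is not. Developer`PackedArrayQ reports how a given list is stored, so a sketch of the check would be:

```mathematica
(* Assumption: the FAT/thin difference is packed vs. unpacked storage. *)
Developer`PackedArrayQ /@ myList    (* each of the ten inner lists *)
Developer`PackedArrayQ@myList       (* the outer list of lists *)
Developer`PackedArrayQ@myFlatList   (* the FAT result of Flatten *)
Developer`PackedArrayQ@myJoinList   (* the thin result of Join *)
```

If the thin list reports True and the FAT one False, that would account for two lists that compare as identical yet differ threefold in ByteCount.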
The difference carries over when the lists are saved as MX files, although the gap on disk is smaller than in memory.
SetDirectory["D:\\"];
DumpSave["flatlist.mx", myFlatList];
DumpSave["joinlist.mx", myJoinList];
FileByteCount@"flatlist.mx"
FileByteCount@"joinlist.mx"
108640546
80000252
When the MX files are read back, each list keeps its FAT or thin storage.
<< "flatlist.mx";
<< "joinlist.mx";
ByteCount@myFlatList
ByteCount@myJoinList
240000080
80000144
What about text output?
Put[myFlatList, "flatlist.txt"];
Put[myJoinList, "joinlist.txt"];
FileByteCount@"flatlist.txt"
FileByteCount@"joinlist.txt"
125000000
123333334
The sizes differ slightly: Mathematica seems to vary the number of values it writes per line. The good news is that the values read back correctly; the bad news is that lists read from text come back FAT in both cases.
join2 = << "joinlist.txt";
flat2 = << "flatlist.txt";
join2 == flat2
ByteCount@flat2
ByteCount@join2
True
240000080
240000080
So some operations produce FAT lists and others produce thin ones. The simplest way I have found so far to thin a list (cutting its size by two thirds) is this:
newFlat = Map[Identity, myFlatList];
ByteCount@newFlat
80000144
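If packed-array storage is what separates thin from FAT (an assumption, not something confirmed above), then Developer`ToPackedArray asks for the compact form directly. It returns its argument unchanged when the list cannot be packed, so it should be safe to try. The name packedFlat is just illustrative.

```mathematica
(* Sketch under the packed-array assumption: request compact storage directly. *)
packedFlat = Developer`ToPackedArray@myFlatList;
Developer`PackedArrayQ@packedFlat   (* did packing succeed? *)
ByteCount@packedFlat                (* compare with ByteCount@myFlatList *)
packedFlat === myFlatList           (* the values themselves are untouched *)

(* The same call could be applied to a list read back from text. *)
ByteCount@Developer`ToPackedArray@join2
```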
Mathematica 9 gives the same results.