Error using SmoothKernelDistribution with a large number of elements

Posted 2 years ago

Hello everyone. I want to save Gaussian random points in a text file, load them again, and compute the empirical PDF. However, I get an error when the number of elements is large enough.

To keep it simple, I have prepared a small example.

d1 = RandomVariate[NormalDistribution[0, 1], 100000];

Export["d1TEST.txt", d1];

d2 = Flatten[Import["d1TEST.txt", "Table"]];

\[ScriptCapitalD]1 = SmoothKernelDistribution[d1];

\[ScriptCapitalD]2 = SmoothKernelDistribution[d2];

SmoothKernelDistribution::invldd: The input data SmoothKernelDistribution[{-0.850889,1.1616,0.696657,1.78897,-0.504029,0.713672,-0.415394,-1.02279,-0.575213,-0.0739832,<<31>>,-1.67304,-0.0996146,0.262648,1.21812,-1.34935,0.0263795,0.714684,-0.67401,-0.498297,<<99950>>}] should be a vector or a matrix of real numbers or a valid TemporalData object.

d3 = RandomVariate[NormalDistribution[0, 1], 1000];

\[ScriptCapitalD]3 = SmoothKernelDistribution[d3];

Can someone please tell me why I am getting the error under D2? It cannot be the size of the vector, since D1 works fine.

Any help is appreciated. Best regards. Jaime.

POSTED BY: Jaime de la Mota
2 Replies
Posted 2 years ago

The problem occurs when some of the values are very small. Export writes them to the text file in *^ scientific notation, and Import with "Table" then reads those entries back as String instead of Real.

d2 // Map[Head] // DeleteDuplicates
(* {Real, String} *)

d2 // Select[Head[#] === String &]
(* {"-5.846412131032388*^-6", "-9.827288974460845*^-6"} *)

Best to use Save or DumpSave and Get as suggested by Neil, and as I suggested here.
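
If the text file has to stay, one possible workaround is to convert the stray strings back to numbers after importing. The following is only a sketch: the short list below is a stand-in for the imported d2 from the question, and d2Fixed is a hypothetical name. Note that ToExpression evaluates whatever the string contains, so this is only reasonable for files you wrote yourself.

```wolfram
(* Stand-in for the imported data: mostly reals, plus two entries that came back as strings *)
d2 = {0.5, "-5.846412131032388*^-6", -1.2, 0.3, "-9.827288974460845*^-6", 1.7};

(* Convert every string at level 1 back into a number; reals are left untouched *)
d2Fixed = Replace[d2, s_String :> ToExpression[s], {1}];

(* Check that the list is now a plain numeric vector *)
VectorQ[d2Fixed, NumberQ]

(* The distribution can then be built from the repaired list *)
\[ScriptCapitalD]2 = SmoothKernelDistribution[d2Fixed];
```

This keeps the original file format, at the cost of an extra pass over the data after every import.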

POSTED BY: Rohit Namjoshi

Jaime,

I would report this as a bug -- sometimes it works and sometimes it does not. It must depend on the data. If you avoid the text file, it works at any size (I tried up to 10,000,000). Use

d1 = RandomVariate[NormalDistribution[0, 1], 10000000];

Save["d1TEST", d1];

d2 = Get["d1TEST"];

\[ScriptCapitalD]1 = SmoothKernelDistribution[d2];

Or do binary read/writes to save space.
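
A minimal sketch of the binary route, assuming the "Real64" Import/Export format (8 bytes per value) is acceptable; the file name d1TEST.dat is only an illustration:

```wolfram
(* Generate the sample as in the original question *)
d1 = RandomVariate[NormalDistribution[0, 1], 100000];

(* Write the values as raw IEEE double-precision numbers, with no text conversion *)
Export["d1TEST.dat", d1, "Real64"];

(* Read them back as a flat list of machine reals *)
d2 = Import["d1TEST.dat", "Real64"];

(* Compare the round-tripped data with the original *)
d1 == d2

\[ScriptCapitalD]2 = SmoothKernelDistribution[d2];
```

Because nothing is ever formatted as text, no value can come back as a string.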

Regards,

Neil

POSTED BY: Neil Singer