# Analyse a large data set using the function "HistogramDistribution"?

When I analyze a large data set (see the attachment) with the function `HistogramDistribution`:

`dataout = Import["/Users/apple/Desktop/dataOut.txt", "Table"];`
`HistogramDistribution[dataout[[All, {1, 2}]], "Scott"]`

an error occurs:

`Thread::tdlen: Objects of unequal length in (1/99953){0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,<<42>>} {<<1>>} cannot be combined.`

and the resulting distribution is a mess. The strange thing is that when I change the bspec from "Scott" to "Sturges" or "Automatic", everything works:

`HistogramDistribution[dataout[[All, {1, 2}]], "Sturges"]`

For practical reasons I have to use "Scott" to analyze the data. Could anyone help me solve this? Thanks a lot.
18 days ago
7 Replies
Stefan Ragnarsson: This appears to work for me in Mathematica 11.3. What version are you using?
18 days ago
I have tried 11.0 and 11.1.1.0; both failed. Could you upload your code and the output? Thank you.
17 days ago
It looks like there was a bug in the `HistogramDistribution` code that was introduced in version 10.4 and fixed in version 11.2. I'm afraid I don't know of a workaround for the affected versions.

The code I used was:

`data = Import["~/Downloads/dataOut_2.txt", "TSV"];`
`HistogramDistribution[data[[2 ;;, {1, 2}]], "Scott"]`
17 days ago
Right, solved! Thank you!
Henrik Schachner: As far as I can see, the problem comes from the very first line of your data. So instead you might try:

`HistogramDistribution[dataout[[2 ;;, {1, 2}]], "Scott"]`

Hope that helps. Regards, Henrik
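
For anyone landing here with a similar file, the suggestion above can be made a little more defensive. The following is only a sketch, assuming the offending first line is a header or some other non-numeric row (the thread does not show it); the names `clean` and `dropped` are illustrative. Rather than hard-coding `2 ;;`, it keeps only the rows whose first two entries are numeric. It may not help on versions affected by the bug mentioned earlier in the thread.

```mathematica
(* Import as in the original post; adjust the path to your own file *)
dataout = Import["/Users/apple/Desktop/dataOut.txt", "Table"];

(* Keep the first two columns of every row in which both entries are numeric.
   A header row of strings, or a short row, does not match and is skipped. *)
clean = Cases[dataout, {x_?NumericQ, y_?NumericQ, ___} :> {x, y}];

(* Number of rows that were discarded, as a sanity check *)
dropped = Length[dataout] - Length[clean]

(* Build the distribution with Scott's binning rule, as required in the question *)
dist = HistogramDistribution[clean, "Scott"]
```

If `dropped` is 1, this is equivalent to taking `dataout[[2 ;;, {1, 2}]]`; a larger number would suggest that there are further malformed rows worth inspecting.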