Avoid MMA 11.2 Import Statement Produces Errors & Truncates CSV Data?

Posted 7 years ago

https://mathematica.stackexchange.com/questions/155908/mathematica-11-2-import-statement-produces-errors-truncates-csv-data-created-i

In Mathematica 11.1 I've created many CSV files from StarData:

sunMass = StarData["Sun", "Mass"];
sunLuminosity = StarData["Sun", "Luminosity"];
sunTemperature = StarData["Sun", "EffectiveTemperature"];
sunGravity = StarData["Sun", "Gravity"];
sunDensity = StarData["Sun", "Density"];
sunVolume = StarData["Sun", "Volume"];
sunDiameter = StarData["Sun", "Diameter"];

SetDirectory[$UserDocumentsDirectory]

listData1 = Take[StarData[EntityClass["Star", All]], {1, 10000}];

CloseKernels[]; LaunchKernels[4]
AbsoluteTiming[
 Length[
  data =
   Transpose[
    ParallelMap[
     StarData[listData1, #] &,
     {"Name", "Metallicity", "SpectralClass", "BVColorIndex", 
      "EffectiveTemperature",
      "Mass", "Luminosity", "AbsoluteMagnitude", "Gravity", "Density",
       "Diameter",
      "DistanceFromEarth", "MainSequenceLifetime", "Parallax",
      "RadialVelocity", "Radius", "StarEndState", "StarType", 
      "SurfaceArea",
      "VariablePeriod", "Volume", "HDName"}]]]]



zeroData = data /. {Missing["NotAvailable"] -> 0}; 

noUnitsData = 
  zeroData /. {c1_, c2_, c3_, c4_, c5_, c6_, c7_, c8_, c9_, c10_, 
     c11_, c12_, c13_, c14_, c15_, c16_, c17_, c18_, c19_, c20_, c21_,
      c22_} -> {c1, c2, c3, c4, QuantityMagnitude[c5], 
     QuantityMagnitude[c6/sunMass], 
     QuantityMagnitude[c7/sunLuminosity], c8, 
     QuantityMagnitude[c9/sunGravity], QuantityMagnitude[c10], 
     QuantityMagnitude[c11/sunDiameter], QuantityMagnitude[c12], 
     QuantityMagnitude[c13], QuantityMagnitude[c14]
     , QuantityMagnitude[c15], QuantityMagnitude[c16], c17, c18, 
     QuantityMagnitude[c19], QuantityMagnitude[c20], 
     QuantityMagnitude[c21/sunVolume], c22};

Length[noUnitsData]

prePendData = 
  Prepend[noUnitsData, {"Name", "Metallicity", "SpectralClass", 
    "BVColorIndex", "EffectiveTemperature",
    "Mass", "Luminosity", "AbsoluteMagnitude", "Gravity", "Density", 
    "Diameter",
    "DistanceFromEarth", "MainSequenceLifetime", "Parallax",
    "RadialVelocity", "Radius", "StarEndState", "StarType", 
    "SurfaceArea",
    "VariablePeriod", "Volume", "HDName"}];

TableForm[Take[prePendData, 5]]

Export["allStars1.csv", prePendData, "CSV"]

This creates a CSV file with 10,000 rows of comma-separated data. It works well for all 108,939 rows of StarData when split across 11 CSV files.
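
The count of 11 files follows from the row count: 108,939 rows in blocks of 10,000 gives ten full blocks plus a final block of 8,939. A minimal sketch of that chunking, assuming a stand-in list `rows` (not the real StarData entities) and an illustrative file-name pattern:

```mathematica
(* Stand-in for the full list of star entities; only the length matters here. *)
rows = Range[108939];

(* Split into blocks of at most 10,000; the last block may be shorter. *)
chunks = Partition[rows, UpTo[10000]];

Length[chunks]         (* 11 *)
Length[Last[chunks]]   (* 8939 *)

(* Hypothetical file names, one per block. *)
fileNames = Table["allStars" <> ToString[i] <> ".csv", {i, Length[chunks]}]
```

In the real workflow each `chunks[[i]]` would take the place of `listData1` in the extract code above.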

Importing each CSV file in a new notebook is pretty straightforward:

(* As written, the replacement rule returns each 22-column row unchanged. *)
  Length[data1 = 
  Import["allStarData1.csv", {"Data", {All}}, 
    "HeaderLines" -> 1] /. {c1_, c2_, c3_, c4_, c5_, c6_, c7_, c8_, 
     c9_, c10_, c11_, c12_, c13_, c14_, c15_, c16_, c17_, c18_, c19_, 
     c20_, c21_, c22_} -> {c1, c2, c3, c4, c5, c6, c7, c8, c9, c10, 
     c11, c12, c13, c14
     , c15, c16, c17, c18, c19, c20, c21, c22}
 ]

This worked fine across different releases of Mathematica 11 for a large number of CSV files.

I upgraded today to Mathematica 11.2, and none of the existing notebooks work.

Import statements now take forever, generate error messages, and truncate large numbers of rows from existing CSV files.

One workaround that I'm currently testing is to run the StarData extract code listed above under Mathematica 11.2 to create new CSV files, and then import the new files.

This worked for the first 10,000-row StarData extract: no errors, no truncation. But it still runs very slowly. I will have to run the other 10 extracts and create a new 10,000-row CSV file for each.

This feels like a bug in the internal code of the Mathematica 11.2 Import statement, where new internal data verification checks are incompatible with previously created CSV files.

Has anyone else run into this issue?

Also, I turned off the error messages to try to get through the existing code, but I don't know how to turn them back on so that I can include them in this post. Does anyone know how?

Thanks

[Image: JPEG of the Import errors]

The Mathematica 11.2 documentation points to updates in the CSV Import and Export functions.

Sample: a 10,000-record CSV file created by Export in Mathematica 11.1 is truncated when imported under Mathematica 11.2.

The same files created by Export under Mathematica 11.2 and imported under Mathematica 11.2 are not truncated and have the full 10,000 records per file.
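
One way to measure the truncation is to compare the number of text lines in the file with the number of rows Import returns. This is only a sketch: it assumes the file has no blank lines and no quoted fields that span several lines, and it uses the file name from the Import call above.

```mathematica
file = "allStarData1.csv";  (* file name taken from the Import call in this post *)

(* Count the text lines in the file, header line included. *)
lineCount = Length[ReadList[file, String]];

(* Count the rows Import returns, header included (no "HeaderLines" option here). *)
rowCount = Length[Import[file, "Data"]];

(* A positive difference means Import returned fewer rows than the file has lines. *)
lineCount - rowCount
```

Running the same check on a file exported under 11.1 and on one exported under 11.2 would show whether only the older files lose rows.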

Setting "TextDelimiters" -> "" fixed the problem and eliminated the row truncation. Import is still horribly slow under Mathematica 11.2: a 10,000-row CSV file takes over 250 seconds. The other workaround I tested was to create all files with Export under Mathematica 11.2 and then Import them; no truncation. The Part::partw warning messages are new under 11.2; they did not show up under Mathematica 11.1. Turning them off via Off[Part::partw] did not improve the elapsed time of the Import.
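
A minimal sketch of the import with text delimiters disabled, timed with AbsoluteTiming and with the Part::partw message switched off only for the duration of the call. The file name is the one from the Import call earlier in this post; the variable names are illustrative. Note the assumption that no field relies on double quotes to protect an embedded comma, since with "TextDelimiters" -> "" quotes should no longer be treated as field delimiters.

```mathematica
file = "allStarData1.csv";  (* file name from the Import call earlier in this post *)

Off[Part::partw];           (* silence the warning while importing *)

{seconds, data1} = AbsoluteTiming[
   Import[file, "Data", "HeaderLines" -> 1, "TextDelimiters" -> ""]];

On[Part::partw];            (* switch the warning back on afterwards *)

{seconds, Length[data1]}    (* elapsed time in seconds and number of data rows *)
```

On[...] is the counterpart of Off[...], so a message that was turned off earlier in a session can be restored the same way.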

TextDelimiters Fixed the Problem

Falling back to Mathematica 11.1. Importing CSV files under Mathematica 11.2 is 100 times slower.

POSTED BY: Joseph Karpinski

This issue is now recognized as a bug in Mathematica 11.2.

See the following link:

Bug Introduced In Mathematica11.2

POSTED BY: Joseph Karpinski