Improve Wolfram StarData performance?

Posted 8 years ago

In my tests, StarData is about 15 times slower than AstronomicalData for the exact same request (roughly 702 seconds versus 46 seconds in the sample below). Can anyone recommend a way to speed up the StarData version of this test?

Length[m1 = AstronomicalData["ClassMStar"]]

(* Out: 4258 *)

Timing[
 Length[
  listClassMStar =
   {AstronomicalData[#, "Name"], AstronomicalData[#, "HDNumber"],
     AstronomicalData[#, "SpectralClass"], AstronomicalData[#, "BVColorIndex"],
     AstronomicalData[#, "EffectiveTemperature"], AstronomicalData[#, "Mass"],
     AstronomicalData[#, "Luminosity"], AstronomicalData[#, "AbsoluteMagnitude"],
     AstronomicalData[#, "ApparentMagnitude"], AstronomicalData[#, "ConstellationName"]} & /@ m1]]

(* Out: {45.5534, 4258} *)

Timing[
 Length[
  listClassMStar =
   StarData[
    m1, {"Name", "HDNumber", "SpectralClass", "BVColorIndex", "EffectiveTemperature", 
         "Mass", "Luminosity", "AbsoluteMagnitude", "ApparentMagnitude", 
         "ConstellationName"}
   ]
  ]
 ]

(* Out: {702.228, 4258} *)

POSTED BY: Joseph Karpinski
3 Replies

Parallelisation?

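LaunchKernels[8]
vars = {"Name", "HDNumber", "SpectralClass", "BVColorIndex", "EffectiveTemperature", "Mass",
   "Luminosity", "AbsoluteMagnitude", "ApparentMagnitude", "ConstellationName"}; (* the ten properties from the question *)
AbsoluteTiming[Length[listClassMStar = ParallelMap[StarData[m1, #] &, vars]]]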

This takes 188 seconds for me. Apart from that, I don't think you can really speed these calls up, I'm afraid...
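A possible variation, offered only as an untested sketch: instead of parallelising over the ten properties, split the star list into chunks and let each kernel request all properties for its chunk. The names props, chunkSize and chunks are made up for illustration, the chunk size of 500 is an arbitrary guess, and m1 is the list from the question. Whether this is any faster depends on how the data servers handle the requests, so it needs to be timed.

```wolfram
(* Sketch only: not timed; chunkSize = 500 is an arbitrary assumption. *)
props = {"Name", "HDNumber", "SpectralClass", "BVColorIndex",
   "EffectiveTemperature", "Mass", "Luminosity", "AbsoluteMagnitude",
   "ApparentMagnitude", "ConstellationName"};
chunkSize = 500;

(* UpTo lets the final chunk be shorter than chunkSize. *)
chunks = Partition[m1, UpTo[chunkSize]];

(* Each kernel fetches every property for one chunk of stars;
   Join @@ stitches the per-chunk results back into a single list. *)
AbsoluteTiming[
 Length[
  listClassMStar =
   Join @@ ParallelMap[StarData[#, props] &, chunks]]]
```

Because each chunk comes back as one row per star, the joined result should already have one entry per star, with no Transpose needed.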

POSTED BY: Sander Huisman

That worked!

Thanks!

The Length value on the output was coming out as 10, which really threw me. But Transpose corrected that. My computer is older with only 2 CPUs so my final query was:

CloseKernels[]; LaunchKernels[2]
AbsoluteTiming[
 Length[
  Transpose[
   list4ClassMStar = 
    ParallelMap[
     StarData[m1, #] &, {"Name", "HDNumber", "SpectralClass", 
      "BVColorIndex", "EffectiveTemperature", "Mass", "Luminosity", 
      "AbsoluteMagnitude", "ApparentMagnitude", "Constellation"}]]]]

{182.006, 4258}

Still about 4 times slower than the same query using AstronomicalData (182 seconds versus 46 seconds).
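For anyone else thrown by the Length of 10: mapping over the property names returns one row per property, so Length counts properties rather than stars. A toy example with made-up values (byProperty and byStar are illustrative names, not part of the query above) shows what Transpose does to the shape:

```wolfram
(* Toy stand-in for the ParallelMap result: one row per property, one column per star. *)
byProperty = {{"StarA", "StarB", "StarC"}, {3500, 3200, 3900}};
Length[byProperty]   (* 2: the number of properties *)

(* Transpose flips it to one row per star. *)
byStar = Transpose[byProperty];
Length[byStar]       (* 3: the number of stars *)
First[byStar]        (* {"StarA", 3500} *)
```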

What worries me is that StarData is a replacement for AstronomicalData, and ClassMStar is one of the small categories, at about 4,000 members. Many other star class categories hold 18,000 to 26,000 stars. So if a StarData group of 4,000 stars takes 3 minutes while an AstronomicalData group takes under 60 seconds, we are still looking at long access times for the common star classes {A, B, F, G, K} etc. And the newer class data in StarData has even more stars in each class category.

The amount of data being transferred is actually pretty small by today's standards, and the fact that the older AstronomicalData format is 3 to 4 times faster still points to the StarData format as the performance issue. It could be as simple as StarData residing on newer data servers that throttle requests by resource usage, pushing them into lower-priority work queues. Or StarData may be stored in a different database format that does not perform as well as the AstronomicalData database format. Only Wolfram would know. In either case, Wolfram should fix this issue, as it may also be shared across other curated data groups, not just StarData.

POSTED BY: Joseph Karpinski

I think they expanded their databases with a lot of new data and split them up, with separate ones for planets and for stars (I guess). These calls are basically database requests, so that might take additional time? It does seem a bit long, though. The data is on the order of a megabyte, which takes about a second to transfer nowadays. What is happening the rest of the time? Not sure...

You can send them a 'bug' report and mark it as a (big) speed regression, attaching a notebook with both examples. They might look into it...

POSTED BY: Sander Huisman