Computing with Entity classes involving large amounts of data

Posted 10 years ago

Continuing along the lines of the question here

http://community.wolfram.com/groups/-/m/t/367014

but within Mathematica rather than Wolfram|Alpha, and using it as a prototype of a more general question.

How does one efficiently compute the list of stars within a particular distance from the sun? (This is a prototype of the more general question of how to use curated Entity data when a large quantity of data has to be downloaded before it can be processed.)

One can get a list of all stars with the following (there appear to be about 107000 of them, so don't execute this unless you are prepared to wait):

EntityList[EntityClass["Star", "Star"]]

And one can get the stars' names along with their distances (and thus select the stars that are within a desired maximum distance from the Earth) using

EntityValue[
 EntityClass["Star", "Star"], {EntityProperty["Star", "Name"], 
  EntityProperty["Star", "DistanceFromEarth"]}]

But, again, one has to download all 107000 items and then perform the calculation.

So my question is: is there a syntax that solves this problem (say, asking for the stars that are within 10 light-years of the Earth) without first having to download the data for all 107000 stars? The download takes an excruciatingly long time; in fact, it is not clear that the computation will complete properly.

POSTED BY: David Reiss
2 Replies
POSTED BY: David Reiss

I think the short answer to your question is basically "No".

Filtering entities by a property is kind of hard. For a query to evaluate efficiently, the entities would have to be stored in a data structure designed for that kind of query. But to handle a generic query, what other choice is there but to go through every possible entity? The developers I've talked to are, of course, interested in making common or likely queries easier. Maybe we will see some new tools in future versions.
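To make the point about data structures concrete, here is a minimal sketch that uses synthetic numbers rather than real star data. The list `distances` and the cutoff of 10 are made up purely for illustration and say nothing about how the entity store is actually organized. A generic predicate has to test every element, whereas a list kept sorted by the queried property lets the search stop as soon as the cutoff is passed.

```wolfram
(* Synthetic stand-in for a large table of distances; not real star data *)
SeedRandom[1];
distances = RandomReal[{0, 1000}, 100000];

(* Generic query: the predicate is applied to all 100000 elements *)
scan = Select[distances, # < 10 &];

(* Specialized structure: sort once by the queried property, then stop
   at the first element that fails the test *)
sorted = Sort[distances];
indexed = TakeWhile[sorted, # < 10 &];

(* Both approaches select the same elements *)
Sort[scan] === indexed
```

The sort is only worth paying for if that particular property is queried often, which is the trade-off described above: no single layout can be fast for every possible property.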

For some entities there are tools that make these kinds of queries easier. GeoEntities is an example of such a function. Entities that have positions on the Earth are likely to be queried by their distance or position in some way, so GeoEntities provides a tool for such queries. GeoEntities doesn't do stars, though. I bring it up as an example of a function that allows a more advanced query without doing an exhaustive search.

"Star" entities, like many others, have EntityClasses. These are predefined groups of stars that might be useful in some cases:

EntityClassList["Star"]

One of the groups is the nearest 100 stars, for example.
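As a rough sketch of how that group could narrow the search, the following fetches properties for only the nearest-stars class and then filters locally. The canonical class name "StarNearest100" is an assumption on my part; check it against the output of EntityClassList["Star"] before relying on it. The result is complete only if the class extends beyond the 10 light-year cutoff.

```wolfram
(* Assumed canonical name; confirm that it appears in EntityClassList["Star"] *)
nearest = EntityClass["Star", "StarNearest100"];

(* Download name/distance pairs for this class only, not the full catalog *)
data = EntityValue[nearest, {"Name", "DistanceFromEarth"}];

(* Keep rows whose distance is an actual Quantity below the cutoff;
   rows with a Missing[...] distance are dropped *)
Select[data, QuantityQ[Last[#]] && Last[#] < Quantity[10, "LightYears"] &]
```

This only helps when a predefined class happens to cover the region of interest; for an arbitrary cutoff the exhaustive approach is still needed.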

For a brute-force search such as the one you are describing, I would use StarData as a function instead:

Select[#, Last[#] < Quantity[10, "LightYears"] &] &@
StarData[StarData[], {"Name", "DistanceFromEarth"}]

This doesn't seem to work just yet, but CloudEvaluate might be helpful, since the servers likely have quicker read access to the star data:

CloudEvaluate[
 Select[#, Last[#] < Quantity[10, "LightYears"] &] &@
  StarData[StarData[], {"Name", "DistanceFromEarth"}]
]
POSTED BY: Sean Clarke