How to speed up entity queries?

Posted 8 years ago
POSTED BY: Szabolcs Horvát
4 Replies
Posted 8 years ago

Never display entities:

t = AbsoluteTime[];
AbsoluteTiming[
  ent = Entity["Plant", "Species:GlycineMax"]["TaxonomyGraph"]][[1]]
AbsoluteTime[] - t

2.05579

2.057142

AbsoluteTiming[ToBoxes[ent]][[1]]

79.7398

A previous spelunking session showed me that EntityValue makes calls to Internal`MWACompute, which, if I remember correctly, just calls the Wolfram|Alpha API. (I believe you can spelunk exactly how it makes these calls; I haven't figured out how to abuse that yet.)

The display call clearly asks for way too much data, which it stores in $UserBaseDirectory/Knowledgebase. So I think that, for this plant dataset, the first evaluation downloads a bunch of data, which causes the slowdown you see.

I tried to illustrate that:

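(* entNames: a cached copy of EntityValue[], the list of all entity types *)
entNames = EntityValue[];

retDat =
  AssociationMap[
   With[{
      (* {time, entity}: fetch one random entity of this type *)
      ent = AbsoluteTiming[RandomEntity[#]], 
      (* {time, count}: number of entities of this type *)
      size = AbsoluteTiming[EntityValue[#, "EntityCount"]]},
     <|
      "Size" -> size[[2]],
      (* time needed to typeset the entity, without the retrieval time *)
      "DisplayTime" -> AbsoluteTiming[ToBoxes[ent[[2]]]][[1]],
      "RetrievalTime" -> ent[[1]],
      "SizeRetrievalTime" -> size[[1]]
      |>
     ] &,
   entNames
   ];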

ListPlot[
 KeyValueMap[
  Callout[{#2["Size"], #2["DisplayTime"]}, #] &,
  retDat
  ]
 ]

[Image: plot of display time against entity count, one labeled point per entity type]

But this seems pretty random, so I don't really know. Maybe RandomEntity is messing with things. More likely I'm just wrong about that. The data may also be thrown off if EntityValue retrieves its data via a paclet mechanism.

Unfortunately I've got no way to really work around this slow-down, except for never displaying an Entity (which I try to never do).
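
One way to read "never display an Entity" in practice is sketched below: end each evaluation with a semicolon and, when you need to look at a result, go through CanonicalName or InputForm so the front end is not asked to typeset the entity. This is only an illustration under the assumption that typesetting is the slow step, not a confirmed fix; the variable names (soy, soyName, soyText, graph) are placeholders.

```wolfram
(* Assumption: the slow step is typesetting, so the entity is never shown formatted. *)
(* Constructing an Entity is purely symbolic; the trailing ; suppresses the output cell. *)
soy = Entity["Plant", "Species:GlycineMax"];

(* Inspect the entity as plain text instead of as a formatted box. *)
soyName = CanonicalName[soy]
soyText = ToString[soy, InputForm]

(* For a property value (this call needs a network connection),
   assign with ; and look at it only through InputForm. *)
graph = soy["TaxonomyGraph"];
Head[graph]
Short[InputForm[graph], 5]
```

Whether InputForm avoids every slow path is not something this thread establishes, so it is worth timing it on your own machine before relying on it.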

POSTED BY: b3m2a1

I'm also curious about which step takes how much time. Is it setting up the connection? Is it the (large) size of the data? Is it the interpretation? Is it the database lookup?

2 minutes 50 seconds on my machine, btw…

POSTED BY: Sander Huisman

Here I get 8.49393 s for the Entity[] call and an absolute time difference of 127.83453 s (2 min 7 s), with Mathematica 10.4 on Windows 10 Professional 64-bit, update 1709. Doing the same thing again gives 0.0114 s vs. 10.7457 s (computers are obviously meant to do things twice).

But now, hold on to your socks: if the notebook is closed, Mathematica quits as well, and the notebook is then opened again, it once more shows 'Initializing Knowledge Base Connection' but returns in 4.68854 s vs. 6.31252 s. The question then is: what has been returned?

Usually the solution is to localize or cache the data you need, if that is possible. Then you only need to check from time to time whether the curators have changed something.
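
A minimal sketch of such a local cache, using only Put and Get on files, could look like the following. The names cacheDir, cachedEvaluate and refreshCache are hypothetical helpers made up for this illustration, not built-in functions, and the cache location under $TemporaryDirectory is an assumption (pick a permanent folder if the cache should survive a cleanup of temporary files).

```wolfram
(* Hypothetical helpers, not built-ins: cache the result of a slow evaluation on disk. *)
(* Assumption: a folder under $TemporaryDirectory is an acceptable cache location. *)
cacheDir = FileNameJoin[{$TemporaryDirectory, "EntityCacheSketch"}];
If[! DirectoryQ[cacheDir], CreateDirectory[cacheDir]];

(* HoldRest keeps expr unevaluated until we know the cache has no entry for key. *)
SetAttributes[cachedEvaluate, HoldRest];
cachedEvaluate[key_String, expr_] :=
 Module[{file = FileNameJoin[{cacheDir, key <> ".wl"}], val},
  If[FileExistsQ[file],
   Get[file],                            (* cache hit: read the stored expression *)
   val = expr;                           (* cache miss: evaluate once *)
   If[! FailureQ[val], Put[val, file]];  (* store the result unless it failed *)
   val]]

(* Drop one cached entry, e.g. for the occasional check for curated changes. *)
refreshCache[key_String] :=
 With[{file = FileNameJoin[{cacheDir, key <> ".wl"}]},
  If[FileExistsQ[file], DeleteFile[file]]]
```

With these definitions, something like graph = cachedEvaluate["GlycineMax-TaxonomyGraph", Entity["Plant", "Species:GlycineMax"]["TaxonomyGraph"]]; would query the knowledge base only on the first call, and refreshCache["GlycineMax-TaxonomyGraph"] would force a new download on the next one. Note the trailing semicolon: caching the retrieval does nothing about the time spent displaying the result.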

POSTED BY: Udo Krause