I can't write a simple algorithm because 'ForEach' does not exist

Posted 9 years ago

I know that Select[] is some sort of Mathematica function that works with lists. However, I need to be able to grab each element by name into a variable so that I can use it just like in a ForEach loop. Below, you can see that I converted one of my loops into a Select, but I have two more nested ones. Mathematica claims to have full procedural capability, but it does not have a ForEach loop, which makes it much more difficult to operate on (apply more than one function or action to) each element in a loop. Even if this could be converted into a functional form that builds a list to print afterwards, I'd rather have something that prints each applicable graph as it is found.

For[x = 0, x < 5, x = x + 1,
     n = RandomInteger[{1, 10}];
     m = RandomInteger[{n - 1, n * (n - 1) / 2}];
     G = RandomGraph[{n, m}];
     R = Radius[G];
     V = VertexList[G];
     P = Select[V, R + 1 == VertexEccentricity[G, #]];
     ForEach [p, P,
      N = AdjacencyList[G, p];
      test = true;
      ForEach[n, N,
       If[VertexEccentricity[G, n] == R,
         test = false;
         ];
       ];
      If [test == true,
       Print[G]
       ]
      ]
     ]
POSTED BY: Timothy Swan
9 Replies

I'm happy I could hook you up with some food for thought (or research). I'm afraid I can't help you with the questions you raise, since computer science is definitely not my field of expertise. The Wolfram language is probably a hard nut to crack in terms of performance prediction. So much is automated and an expression may or may not be passed on to more or less optimized implementations in the background, whether that's a different algorithm to solve a mathematical problem, or compilation, or parallelization, or maybe just the use of sparse / packed array structures or floats instead of integers... There's just so much that could happen, and while it's all traceable, it's not something the language communicates to a user who doesn't ask for it. Much of the performance tuning advice you'll get is based largely on experience.

By the way, here's an impressively simple example to highlight the role of duck typing. Just use more iterations if your computer is too fast to notice the difference.

AbsoluteTiming[Do[Exp[2], 10^6];]
{0.411852, Null}
AbsoluteTiming[Do[Exp[2.0], 10^6];]
{0.198113, Null}

So far, so good, and you're probably not surprised, but now look at these guys (let's ignore that they also gather the results in lists):

AbsoluteTiming[Array[Exp[2.0] &, 10^6];]
{0.0573083, Null}
AbsoluteTiming[Table[Exp[2.0], 10^6];]
{0.023209, Null}
AbsoluteTiming[ConstantArray[Exp[2.0], 10^6];]
{0.00207215, Null}

Relative speed is a bit different if you use RandomInteger[] instead of 2 and RandomReal[] instead of 2.0 (and obviously ConstantArray will not do the same thing in the random case), but the trend still holds, which means that something similar will happen for different sorts of real-life datasets as well.
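For instance, here's a sketch of the same comparison with random input (I'll leave out the timings, since they're machine-dependent anyway):

AbsoluteTiming[Do[Exp[RandomInteger[10]], 10^6];]
AbsoluteTiming[Do[Exp[RandomReal[]], 10^6];]
AbsoluteTiming[Table[Exp[RandomReal[]], 10^6];]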

... and this, essentially, is the kind of example that gives loops in Wolfram a bad name. Things get much worse once Append enters the stage, and generally speaking, people who use Do indiscriminately often also like Append a lot. You can imagine the near-magical speed-up going from Do[Append[...]...] to Map or Table, and the resulting shouts of "I'll never Do again!".

Here's a scary Append example. Let's say we'd like to find all primes up to 10^6. I'd consider the first example the intuitive Wolfram-esque approach, and I'm glad to see it is in fact the fastest one:

AbsoluteTiming[Select[Range[10^6],PrimeQ];]
{0.40464, Null}
AbsoluteTiming[Reap[Do[If[PrimeQ[ii],Sow[ii]],{ii,1,10^6}]][[2,1]];]
{0.651222, Null}
AbsoluteTiming[list={};Do[If[PrimeQ[ii],AppendTo[list,ii]],{ii,1,10^6}];list;]
{15.224, Null}

... even these not-so-straightforward solutions are faster than Do-Append:

AbsoluteTiming[Select[Prime[Range[10^6]],#<10^6&];]
{3.01896, Null}
AbsoluteTiming[Pick[Range[10^6],PrimeQ[Range[10^6]]];]
{0.517066, Null}
AbsoluteTiming[ii=Prime[1];Reap[While[ii<10^6,Sow[ii];ii=NextPrime[ii];]][[2,1]];]
{1.11314, Null}
POSTED BY: Bianca Eifert

Just a few remarks here... You should definitely take a look at the different varieties of Map in the Documentation, and there's also ParallelMap since you mentioned parallelization. Other functions also have parallel versions, like ParallelDo.
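A few of those varieties, as a quick sketch (not exhaustive):

MapIndexed[f, {a, b, c}]                 (* each element together with its position *)
MapThread[f, {{1, 2, 3}, {x, y, z}}]     (* corresponding elements of several lists *)
ParallelMap[PrimeQ, Range[10^5]];        (* like Map, but spread over parallel subkernels *)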

Also, I know you said that speed isn't your primary concern, but just in case you're interested... Here's a Stackexchange question on the topic (item 1.4 in the accepted answer briefly addresses loop constructs), and here's a Wolfram Blog post that goes in the same general direction.

I realize that a lot of Mathematica users are allergic to loop constructs, and I know that seems strange (borderline infuriating) to anyone who is used to other programming languages. Ideology aside, the core of that allergy is that there are so many different approaches available that you can solve a given problem in a variety of ways. That means you can usually find a solution where the code has the same structure as the problem itself. Now we all know there's no guarantee that that would be the most efficient implementation, but it's certainly the one that's fastest to write. Sometimes your problem really is a loop, but maybe it's more like a matrix multiplication or a rearranging of lists or strings, and it just seems unnecessary to re-formulate the problem into a loop structure.

On top of that, Frank is also right: for many problems that are not "naturally" loops, other approaches are a lot faster because they're already optimized inside of specialized functions, or because Table will be able to compile or pack something behind the scenes when Do for some reason can't. And here's the bad news: actually optimizing computation time can be a tiresome exercise in Mathematica, because it's not straightforward and it's not immediately clear whether a noticeable speed-up is possible or not. The two sources linked above will give you an idea and a list of things to consider. If you want a rule of thumb: "Use the highest-level built-in meta-function possible" tends to be good advice, and short code has decent odds of being fast code. (Take that with a huge grain of salt.)
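One concrete thing you can check yourself is packing; here's a quick sketch using the Developer` context (the exact behaviour may depend on your version and the auto-packing thresholds):

Developer`PackedArrayQ[Table[i^2, {i, 10^5}]]    (* True: Table hands back a packed array *)
list = {}; Do[AppendTo[list, i^2], {i, 10^5}];
Developer`PackedArrayQ[list]                     (* False: the list built by AppendTo stays unpacked *)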

By the way, for your test, you use test=true and test=false. If you want the Booleans, you have to use True and False (all built-in symbols in Mathematica are capitalized). Your code very likely still works anyway, because you're essentially creating new undefined symbols true and false and symbolically comparing them with test.
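A quick sketch of what actually happens with the lowercase symbols:

test = true;
test == true                              (* True, because identical symbolic expressions compare equal *)
If[test == true, Print["branch taken"]]   (* so the If still fires *)
TrueQ[test]                               (* False: test holds the undefined symbol true, not the built-in True *)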

POSTED BY: Bianca Eifert

Bianca, this is a very nice answer. The line

"you can usually find a solution where the code has the same structure as the problem itself"

is phrasing I've reached for, and failed to find, on many occasions!

POSTED BY: David Gathercole
Posted 9 years ago

That article is pretty interesting. I'm glad it appears as though they utilize pure functions. I did notice a part under the kernel section about procedural programming: "Avoid using procedural programming (loops etc), because this programming style tends to break large structures into pieces (array indexing etc). This pushes larger part of the computation outside of the kernel and makes it slower." Studying time complexity is one of the core focuses of my field, and I'm curious what this part means; it's got me confused. That advice appeared under the following main point: "Push as much work into the kernel at once as possible, work with as large chunks of data at a time as possible, without breaking them into pieces."

That string isn't searchable online, but I'm wondering whether it relates at all to using a graphics card for parallelization. If so, the language shouldn't have a problem doing that automatically for loops, especially with pure functions enforced. I suppose a map might remove the opportunity to share memory across iterations, while a loop would not. The compiler/interpreter could technically be designed to detect something like that, but since there is no editor feedback, a programmer wouldn't know when adding a variable slowed down his code. Ahh, this is all so interesting. I haven't done too much with parallelization, so maybe that's why I'm not familiar with the practices here.
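If I'm reading it right, the advice probably means something like the following sketch: do one operation on the whole array at once instead of indexing element by element.

data = RandomReal[1, 10^6];
AbsoluteTiming[Total[data^2]]                                         (* one listable operation on the whole array *)
AbsoluteTiming[s = 0.; Do[s += data[[i]]^2, {i, Length[data]}]; s]    (* the same sum, element by element *)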

POSTED BY: Timothy Swan

You can also use Map:

In[3]:= Map["a" <> # &, {"b", "c", "d"}]

Out[3]= {"ab", "ac", "ad"}

In[4]:= "a" <> # & /@ {"b", "c", "d"}

Out[4]= {"ab", "ac", "ad"}
POSTED BY: Frank Kampas

By inefficient, I mean slow.

POSTED BY: Frank Kampas
Posted 9 years ago

Except for library code being developed for parallelization, or optimizations that convert inner loops to machine code, changes in running time by small constant factors are considered negligible in complexity analysis. https://en.wikipedia.org/wiki/Time_complexity In other words, I'm not really interested in cutting operation time in half. I'm just interested in being able to quickly encode or transform an algorithm. Languages use lambdas to remove code repetition, which speeds up the encoding process, and to lazily perform functions, which can potentially change the time complexity if it prevents unnecessary operations from being done. I suppose it's not yet common to specify parallelization manually, so I'm wondering whether those functions parallelize the calls. That might be unnecessary if I only search graphs of size 10. Could you explain the syntax of the Map function? It might be easier to code that, if I remember how, than to use loop variables.

POSTED BY: Timothy Swan

Standard looping constructions, such as For loops and Do loops, are inefficient in Mathematica. It's better to use Table, which allows you to loop over lists.

In[1]:= Table["a" <> s, {s, {"b", "c", "d"}}]

Out[1]= {"ab", "ac", "ad"}
POSTED BY: Frank Kampas
Posted 9 years ago

What do you mean by inefficient? A loop takes O(n) time to compute and so does generating a table, so they have the same time complexity. A table may take linear space, though, which is less efficient. So, the ForEach loops that I intended may be done with tables so that I don't have to implement indices? I already implemented a For-loop version of the ForEach, which uses index and length accesses on the list.
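For instance, something like this should express the intended logic without any explicit indices (a sketch based on the code in the original post: Scan plays the role of ForEach, GraphRadius is the built-in name for the graph radius, and NoneTrue replaces the test flag, which also avoids assigning to the protected symbol N):

Do[
 n = RandomInteger[{1, 10}];
 m = RandomInteger[{n - 1, n (n - 1)/2}];
 g = RandomGraph[{n, m}];
 r = GraphRadius[g];
 outer = Select[VertexList[g], VertexEccentricity[g, #] == r + 1 &];
 Scan[
  Function[p,
   (* print the graph if no neighbour of p has eccentricity equal to the radius *)
   If[NoneTrue[AdjacencyList[g, p], VertexEccentricity[g, #] == r &],
    Print[g]]],
  outer],
 {5}]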

POSTED BY: Timothy Swan