Your summary of the case of Logo seems interesting and somewhat discouraging, although I don't have time to read the full paper. Anyway, that was over 30 years ago, and I'd expect the increased accessibility of technology in everyday life to play a big role here. I think it's actually getting easier to get people interested in "making the computer do stuff", but that's (1) just my guess and (2) not even a well-founded guess considering I'm younger than that paper.
You also mention the changing perception of what constitutes programming. I think you're overestimating this, or at least it's only true for people who are already immersed in Wolfram technology. It's worth noting that the Wolfram Language is fighting two separate battles at the moment. (Sorry about the anthropomorphization of WL; I think this is not exclusively on WRI as a company, so I'll just pretend that the wolf is conscious.) On the one hand, WL wants to be counted as a real and proper programming language, and on the other hand, it wants to change the definition of "programming language" at the same time. Outside the wolfish ecosystem, "interacting with the computer by talking normal English" is most certainly not considered "programming"; it's "using a nifty (wannabe-)AI program someone else wrote". (Just like "I'm programming my video tape recorder" was never really "programming" - not to a programmer at least.) Also, I believe a lot of people still don't usually consider "using Mathematica" a form of programming, or only in the sense that using a graphing calculator is "programming". (And I still maintain the opinion that this is one of the intrinsic problems proprietary languages will never shake entirely, but that's neither here nor there.) It's comparatively easy to be an established programming language and then try to broaden the scope of the term. It's also not so hard to establish yourself as a new programming language if you already look like one. Trying both at the same time is ambitious, and I like it, but the process is only just beginning. And it won't bring about a paradigm shift if either one of the two aspects falls behind. ("Mathematica can do natural language processing" and "WL is a programming language", each taken just on its own, are both no big deal conceptually.)
Anyway, where I was going with that last paragraph is this: If we use Mathematica in school, will students even think that they've learned to program, will they be able to transfer their knowledge to other languages easily, and will other people (particularly potential employers) acknowledge that they can program if they do things like (Ctrl+=)NYC["population"] and call it a web app? Or is the more realistic approach that we try to enable computational problem solving in everyone, especially people who don't want to be anywhere near the technology sector, perhaps even well outside the STEM fields altogether? That seems somewhat more feasible at the moment, and STEM-phobic students might actually learn more maths this way than through a traditional approach. I'm not trying to argue for or against anything here, I genuinely don't know what the bigger picture could be expected to look like a few years down the road.
As for this statement: "Maybe it is naïve to say this, but I would claim that rigorous thinking is not the exclusive domain of mathematics. I am not sure, but it seems to me that different programming languages promote different types of thinking. (Anyone disagree?)" Totally agree with all of this.
And finally, you say that "This is the main claim (I think it is) that learning this language will let you explore solutions to problems you could not solve before." I think I know what you mean, but I had to chuckle a bit. Isn't that the main claim of all forms of education? What else is education supposed to do? But you're right, the ideal measure of the benefits of WL in education must be something along the lines of "what does this enable people to do further down the line". I think that's actually a reasonable measure for education in general (if you want to go beyond employment rate and average salary as indicators of success). But I don't have any concrete ideas on how to do this either (considering I still don't want to actually dig into the literature on this).
Just as an aside, you also said that "One thing this means is the notion that bugs in a program is not really relevant like it was (or like it is with other programming languages)." This is interesting; in which way are bugs in WL less relevant than in other languages? Because I've been hunting some of these critters just last week, and they sure looked relevant enough to me!