Help in getting started with data reduction

Posted 10 years ago

Hi,

I am looking for someone to help me get started with visualizing some collected data. The data is stored in a nested JSON file. At this stage I know how to import the JSON into a rather large associative array, and now I need to know how to extract, arrange, group, and sum the data to present insights using 2D and 3D graphs.

I am a programmer and have some appreciation for functional programming, but I am very new to Mathematica and need some help to get started. What I need is several explained examples tailored to my needs, and I think I can then continue on my own.

thanks,

Dan

POSTED BY: Dan G
9 Replies
Posted 10 years ago

Thank you.

Yes. I am a programmer.

I think the key challenge in getting started is the wealth of possibilities, and the need to get into the right mindset about how problems are solved with the language tools at one's disposal. It's frankly overwhelming.

I think a key challenge I currently perceive in the documentation is that the examples are "simple islands", and it's not obvious how to stitch language constructs together to solve problems that require multiple steps (apply /., then create a "part" function, then use Cases to pattern match, all wrapped up in one expression, that kind of thinking).

I think it would be best if there were more coarse-grained, end-to-end problem-solving examples that tackle typical problems and show "patterns" of using language constructs in combination.

hope this makes sense,

Dan

P.S. I am starting to see the utility of /. -- it's like a key-value lookup: given a "key pattern" it returns a (list of fitting) value(s).

POSTED BY: Dan G

Great!

The /. operator is shorthand for the ReplaceAll function (and it is almost always used in this shorthand form):

http://reference.wolfram.com/language/ref/ReplaceAll.html

So a comparative example might be:

In[1]:= b /. {a -> 1, b -> 2, c -> 3}

Out[1]= 2

In[2]:= ReplaceAll[b, {a -> 1, b -> 2, c -> 3}]

Out[2]= 2
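A few more cases may help show how /. behaves when used as a key lookup. This is only a small sketch; the rule list `rules` and its string keys are made up for illustration.

```mathematica
(* Hypothetical rule list with string keys, like the ones a JSON import produces. *)
rules = {"a" -> 1, "b" -> 2, "c" -> 3};

(* A key that is present is replaced by its value. *)
"b" /. rules                       (* 2 *)

(* A key that is absent comes back unchanged; no error is raised. *)
"z" /. rules                       (* "z" *)

(* ReplaceAll looks at every subexpression, so several keys can be looked up at once. *)
{"a", {"b", "z"}} /. rules         (* {1, {2, "z"}} *)

(* When two rules share a left-hand side, the first one in the list is used. *)
"a" /. {"a" -> 1, "a" -> 2}        (* 1 *)
```

The second case is worth remembering: a misspelled key does not fail, it simply returns the key itself.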

By the way, since I suspect that you have considerable experience with other programming languages, it may help you to go through the following rapid tutorial to get an overview of things:

http://www.wolfram.com/language/fast-introduction-for-programmers/

It's rather like drinking from a firehose, but it can orient you to the overall way of thinking and give you a high-level conceptual sense of the architecture of the Wolfram Language.

And a book that I often recommend is this (current through Mathematica 9 functionality):

http://www.amazon.com/Programming-Mathematica-Introduction-Paul-Wellin/dp/1107009464/

POSTED BY: David Reiss
Posted 10 years ago

Mathematica is magic!

The simplicity and elegance of solutions is why I really want to become proficient in Mathematica. It's an amazing tool.

What is the name of the /. operator/function? Can you write this out in a non-shorthand form, where I can see what functions are used and applied?

thank you,

Dan

POSTED BY: Dan G

Just a first point about nomenclature: what you get when you import a JSON file is not a Mathematica Association: it is a nested list of Rules. Association (http://reference.wolfram.com/language/ref/Association.html) in Mathematica is a very specific data construct that encapsulates key-value pairs. And even though the list of rules returned from the JSON import is generally a nested list of rules in key-value form--it is not an Association in the Mathematica meaning of the Association function.
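To make the distinction concrete, here is a minimal sketch that converts a nested list of rules into nested Associations. The sample data and the helper name `toAssociation` are assumptions made for illustration; they are not part of the JSON importer.

```mathematica
(* Hypothetical sample in the same shape as an imported JSON object. *)
imported = {"first" -> "John",
   "favorites" -> {"color" -> "Blue", "sport" -> "Soccer"},
   "skills" -> {{"category" -> "JavaScript"}, {"category" -> "CouchDB"}}};

(* The imported expression is an ordinary List whose elements are Rules. *)
Head[imported]                         (* List *)

(* Hypothetical helper: turn every list of rules, at any depth, into an
   Association. Replace with a level specification matches deeper parts first. *)
toAssociation[expr_] :=
  Replace[expr, r : {__Rule} :> Association[r], {0, Infinity}];

assoc = toAssociation[imported];
Head[assoc]                            (* Association *)

(* An Association is indexed by key, much like a function call. *)
assoc["favorites"]["color"]            (* "Blue" *)
#["category"] & /@ assoc["skills"]     (* {"JavaScript", "CouchDB"} *)
```

Either form can hold the same data; the difference is in how it is queried (/. and pattern matching for rules, key lookup for Associations).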

That said... each particular analysis that you want to do will involve writing a specific function to grab the data you desire from this list of rules. In some cases it may use the Cases function, but more generally it will involve using things like Part and ReplaceAll.

So here are a couple of examples. (If you could post a truncated example of one of your JSON data files, I could show a couple of examples from that.) Attached is a JSON file (a very simple one) that I found on the web. If I Import it I get:

In[1]:= jsonImport = Import["/Users/dreiss/Desktop/json.json"]

Out[1]= {"sex" -> "M", "first" -> "John", 
 "favorites" -> {"color" -> "Blue", "sport" -> "Soccer", 
   "food" -> "Spaghetti"}, "last" -> "Doe", 
 "interests" -> {"Reading", "Mountain Biking", "Hacking"}, 
 "age" -> 39, "salary" -> 70000, "registered" -> True, 
 "skills" -> {{"tests" -> {{"name" -> "One", 
       "score" -> 90}, {"name" -> "Two", "score" -> 96}}, 
    "category" -> 
     "JavaScript"}, {"tests" -> {{"name" -> "One", 
       "score" -> 79}, {"name" -> "Two", "score" -> 84}}, 
    "category" -> 
     "CouchDB"}, {"tests" -> {{"name" -> "One", 
       "score" -> 97}, {"name" -> "Two", "score" -> 93}}, 
    "category" -> "Node.js"}}}

So let's find the "skills" data:

In[3]:= "skills" /. jsonImport

Out[3]= {{"tests" -> {{"name" -> "One", 
     "score" -> 90}, {"name" -> "Two", "score" -> 96}}, 
  "category" -> 
   "JavaScript"}, {"tests" -> {{"name" -> "One", 
     "score" -> 79}, {"name" -> "Two", "score" -> 84}}, 
  "category" -> 
   "CouchDB"}, {"tests" -> {{"name" -> "One", 
     "score" -> 97}, {"name" -> "Two", "score" -> 93}}, 
  "category" -> "Node.js"}}

And, instead of this let's see the "tests" directly:

In[4]:= testData = "tests" /. ("skills" /. jsonImport)

Out[4]= {{{"name" -> "One", "score" -> 90}, {"name" -> "Two", 
   "score" -> 96}}, {{"name" -> "One", 
   "score" -> 79}, {"name" -> "Two", 
   "score" -> 84}}, {{"name" -> "One", 
   "score" -> 97}, {"name" -> "Two", "score" -> 93}}}

And now, knowing this structure, we can get the scores for the tests whose "name" has the value "One":

In[5]:= Cases[testData, ({"name" -> "One", "score" -> value_}) :> 
  value, \[Infinity]]

Out[5]= {90, 79, 97}

And this result can be analyzed and visualized...
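As one possible next step, the sketch below groups the scores by category, averages them, and draws a bar chart. The "skills" data is copied inline from the import above so the block runs on its own; the variable names are made up, and the choice of BarChart is just one option among many.

```mathematica
(* The "skills" part of the imported data, repeated here so the example is self-contained. *)
skills = {
   {"tests" -> {{"name" -> "One", "score" -> 90}, {"name" -> "Two", "score" -> 96}},
    "category" -> "JavaScript"},
   {"tests" -> {{"name" -> "One", "score" -> 79}, {"name" -> "Two", "score" -> 84}},
    "category" -> "CouchDB"},
   {"tests" -> {{"name" -> "One", "score" -> 97}, {"name" -> "Two", "score" -> 93}},
    "category" -> "Node.js"}};

(* One category name per skill entry. *)
categories = "category" /. skills;

(* For each skill entry, collect every "score" value found at any depth. *)
scoresByCategory = Cases[#, ("score" -> s_) :> s, Infinity] & /@ skills;
(* {{90, 96}, {79, 84}, {97, 93}} *)

(* Average score per category. *)
meanScores = Mean /@ scoresByCategory;
(* {93, 163/2, 95} *)

(* A simple labelled bar chart of the averages. *)
BarChart[meanScores, ChartLabels -> categories, AxesLabel -> {None, "mean score"}]
```

The pattern ("score" -> s_) :> s plays the same role as the Cases call above: the left side describes the shape to look for, and the right side says what to keep from each match.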

POSTED BY: David Reiss
Posted 10 years ago

Thank you for all the replies. Here is the structure (mappings) I need to analyze and graph:

AllUsers(user)->UserSessions

UserSessions(DateTime)->OneUserSessionUserData

OneUserSessionUserData(sessionProperty) -> SessionPropertyValue

OneUserSessionScreen(Session, Screen, Property) -> ScreenPropertyValue

there is a bit more, but that's enough for now.

In words the structure means the following:

We have a collection of users, known by their names, who repeatedly play games. A user plays a game on a particular date/time, usually only one game a day. We call this one game session.

For each game we collect some overall data, such as the duration of the game and the number of screens played. Each game is made up of several screens. For each screen we also collect data, such as the time taken to complete the screen and whether the screen was completed successfully or not, or was skipped.

I now need to analyze this data from various points of view. For example, I want to know how often each person has played the game so far, on what dates/times the game was played, and how long it took to play on each date (or any of the other properties we measure per game).

We also want to know on average how long a screen took.

Finally, we partition all users into two groups, so we want to compare between groups also.

All the data is exported into a nested JSON array as per the structure above, and when imported into Mathematica (using a URL), it generates nested Association lists.

I think that Mathematica is ideal for this task, but I don't know yet how to get started.

For now I wrote a small program that traverses the structure, and whenever it hits a mapping it triggers an event with the data. The event routines then do the counting and calculation work.

I need to do something similar in Mathematica: some kind of Select or Apply, and one or more functions that are applied, and perhaps also composed, to yield the results my event routines now calculate.
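To illustrate the kind of thing meant here, the following is a rough sketch on made-up data in the shape user -> {dateTime -> {property -> value}}. The user names, dates, property names ("duration", "screens"), and helper names are all invented for this example; the real export will differ.

```mathematica
(* Made-up data: each user maps to sessions, each session maps to its properties. *)
allUsers = {
   "ann" -> {"2015-03-01T10:00" -> {"duration" -> 120, "screens" -> 5},
             "2015-03-02T09:30" -> {"duration" -> 90, "screens" -> 4}},
   "bob" -> {"2015-03-01T18:15" -> {"duration" -> 150, "screens" -> 6}}};

(* How often each user played: count the sessions stored under each user. *)
sessionCounts = Map[First[#] -> Length[Last[#]] &, allUsers]
(* {"ann" -> 2, "bob" -> 1} *)

(* Hypothetical helper: pair each session date/time of a user with its duration. *)
durationsByDate[user_] :=
  Map[{First[#], "duration" /. Last[#]} &, user /. allUsers];

durationsByDate["ann"]
(* {{"2015-03-01T10:00", 120}, {"2015-03-02T09:30", 90}} *)

(* Average duration over every session of every user. *)
Mean[Cases[allUsers, ("duration" -> d_) :> d, Infinity]]
(* 120 *)

(* Number of sessions per user as a labelled bar chart. *)
BarChart[Last /@ sessionCounts, ChartLabels -> First /@ sessionCounts]
```

Instead of a traversal that fires events, each question becomes one expression that maps a small function over the relevant level of the structure.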

any help how to get started would be much appreciated,

Dan

POSTED BY: Dan G

Bruce and I were neck-and-neck in this one ;-)

POSTED BY: David Reiss

http://www.wolfram.com/broadcast/

http://www.wolfram.com/broadcast/c?c=108

http://reference.wolfram.com/language/, especially

(Note the links to other Guide pages and tutorials.)

Videos of on-line mini courses are at http://www.wolfram.com/training/special-event/

If you have focused, specific questions, try the Community. Show the code that you tried.

If you want a lot of project-related tutoring, consider Technical Services. http://www.wolfram.com/support/technical-services/

POSTED BY: Bruce Miller

A couple of quick comments to get you started though.

When you import from a JSON file the result is a list (perhaps nested in various ways reflecting the structure of the JSON file) of replacement rules. (FYI: be careful of calling it an associative array, even though in essence it is one, since that might confuse folks with the Association construct in Mathematica, which is somewhat different from a list of rules.)

Since, once you've imported your JSON file, you now have a list of rules, you can manipulate it in any way you wish using list manipulation commands as well as various other Mathematica programming constructs (pattern matching, for example, to pull out particular items from within the list) including replacements using the rules.
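As a small illustration of those three approaches (list manipulation, pattern matching, and replacement), here is a sketch on a hand-typed list of rules. The name `person` and its contents are just sample data in the style of a JSON import.

```mathematica
(* Hypothetical sample: a flat list of rules as a JSON import might return. *)
person = {"first" -> "John", "last" -> "Doe", "age" -> 39, "salary" -> 70000,
   "interests" -> {"Reading", "Mountain Biking", "Hacking"}};

(* Part: the third rule, and then its right-hand side. *)
person[[3]]                               (* "age" -> 39 *)
person[[3, 2]]                            (* 39 *)

(* List manipulation: the keys are the first part of each rule. *)
First /@ person
(* {"first", "last", "age", "salary", "interests"} *)

(* List manipulation: keep only the rules whose value is a number. *)
Select[person, NumberQ[Last[#]] &]
(* {"age" -> 39, "salary" -> 70000} *)

(* Pattern matching: keys whose value is itself a list. *)
Cases[person, (key_ -> _List) :> key]     (* {"interests"} *)

(* Replacement: look up several keys at once. *)
{"first", "last"} /. person               (* {"John", "Doe"} *)
```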

Here are some tutorials:

Working with Rules: http://reference.wolfram.com/language/tutorial/ApplyingTransformationRules.html

At the top of this page under "Learning Resources" are links to tutorials on working with Lists: http://reference.wolfram.com/language/guide/ListManipulation.html

Looking at the documentation of the various functions for plotting will help you get started there: http://reference.wolfram.com/language/tutorial/PlottingListsOfData.html

and going through some of the on-line videos can be useful (some are better than others, so use what helps and skip what doesn't), e.g.: http://www.wolfram.com/training/courses/graphics-visualization/

And if you can provide a simple example of what you want to do that will help with being able to provide you with more targeted advice.

POSTED BY: David Reiss

Perhaps post a small example of your JSON file along with an example of what you want to plot? Otherwise your question is extremely general...

POSTED BY: David Reiss