Sorting data within an array

Posted 10 years ago

Hello,

I am a beginner with Mathematica. I work with a dataset of logs in the form user/itemid/rating, and I am trying to extract the logs whose rating exceeds a given value. Here is my attempt:

In[10]:= DataSetOutTime[[1]]

Out[10]= {196, 242, 3}

In[20]:= Select[DataSetOutTime, DataSetOutTime[[All, 3]] > 3 &]

Out[20]= {}

Is there something simple, like what is possible in Python: "Extract = [x for x in DataSetOutTime if x[2] > 3]"?

BR

2 Replies

Thanks a lot: your suggestions work!

Now, from the previous set (logs with ratings above 3), I'd like to extract the logs belonging to users who have submitted at least 10 ratings (I will compute accuracy and recall and want to work only with users who have at least 10 ratings). In other words, I keep a userid only if it appears at least 10 times.

I can build the list of these userids:

In[196]:= TableUserIDAuMoins10 = Select[Counts[CorpusReferenceu1test[[All, 1]]], # > 9 &]

Out[196]= <|1 -> 79, 2 -> 12, 5 -> 25, 6 -> 64, 7 -> 135, 8 -> 22, ...|>

In[197]:= Keys[TableUserIDAuMoins10]

Out[197]= {1, 2, 5, 6, 7, 8, 10, 11, 12, 13, ...}

But now I don't see how to write "select logs from CorpusReferenceu1test where userid is within Keys[TableUserIDAuMoins10]" (the logs are {userid, itemid, rating}, so Keys[TableUserIDAuMoins10] can be seen as the list of allowed userids).

BR
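One possible way to express this filter, as a sketch rather than a definitive answer: test each row's userid for membership in the list of keys with MemberQ. The names logs, minCount and frequentUsers and the sample rows below are hypothetical stand-ins so the example runs on its own; the threshold is lowered to 3 only so that the toy data shows an effect.

```wolfram
(* Hypothetical stand-in for CorpusReferenceu1test: {userid, itemid, rating} rows *)
logs = {{1, 10, 4}, {1, 11, 5}, {2, 10, 4}, {1, 12, 5}, {3, 11, 4}};

(* Users with at least minCount ratings (3 here instead of 10, for the toy data) *)
minCount = 3;
frequentUsers = Keys[Select[Counts[logs[[All, 1]]], # >= minCount &]];

(* Keep a row only when its first element (the userid) is in that list *)
Select[logs, MemberQ[frequentUsers, First[#]] &]
(* {{1, 10, 4}, {1, 11, 5}, {1, 12, 5}} *)
```

With the names from the post, the same idea would read Select[CorpusReferenceu1test, MemberQ[Keys[TableUserIDAuMoins10], First[#]] &].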

Dataset offers a syntax somewhat like this. Dataset is roughly comparable to DataFrames from Python's pandas package.
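As a rough illustration of the Dataset route, assuming the rows are first converted to associations: the column names "user", "item" and "rating" and the sample rows are assumptions made for this sketch, not part of the original data.

```wolfram
(* Hypothetical sample rows in {user, item, rating} form *)
sample = {{196, 242, 3}, {186, 302, 4}, {22, 377, 1}, {244, 51, 5}};

(* Turn each row into an association so columns can be referred to by name *)
ds = Dataset[AssociationThread[{"user", "item", "rating"}, #] & /@ sample];

(* Keep the rows whose rating is greater than 3 *)
ds[Select[#rating > 3 &]]
```

The result is again a Dataset, here holding the two rows with ratings 4 and 5.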

To do it simply, however, I would just write the following:

Select[DataSetOutTime, #[[3]] > 3 &]

or maybe

Select[DataSetOutTime, Last[#] > 3 &]

This says "Select every element from DataSetOutTime where the third (or last) item of that element is greater than three".
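A small sketch of why this works while the attempt in the question returned {}. The sample rows are hypothetical, apart from {196, 242, 3}, which appears in the question.

```wolfram
(* Hypothetical sample rows in {user, item, rating} form *)
sample = {{196, 242, 3}, {186, 302, 4}, {22, 377, 1}, {244, 51, 5}};

(* The predicate is applied to one row at a time; # stands for that row *)
Select[sample, #[[3]] > 3 &]
(* {{186, 302, 4}, {244, 51, 5}} *)

(* The original attempt compares a whole column with a number. Greater does
   not thread over lists, so the test never evaluates to True and Select
   keeps nothing *)
Select[sample, sample[[All, 3]] > 3 &]
(* {} *)
```

Select keeps only the elements for which the test returns exactly True, which is why an unevaluated comparison yields an empty list rather than an error.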

POSTED BY: Sean Clarke