WOLFRAM COMMUNITY

GROUPS:

Mathematica Statistics and Probability

How can I select a subset of information with Mathematica 9?

Oscar Rodriguez, corporación universitaria iberoamericana

Posted 10 years ago

Hello everyone, I am learning how to use Mathematica (version 9). My problem is that when I have several groups in one column, I have to create several spreadsheets to separate the groups before I can perform operations between them. I would like to know how to select the subgroups without creating several spreadsheets. I have attached a spreadsheet and a Mathematica notebook with some of the data I am working on. Thank you very much.

POSTED BY: Oscar Rodriguez

Oscar Rodriguez, corporación universitaria iberoamericana

Posted 10 years ago

Thanks Otto, I tried the code, but for some reason it is not working; I don't know if I am doing something wrong. Thank you very much.

POSTED BY: Oscar Rodriguez

Otto Linsuain, Westinghouse Electric Company

Posted 10 years ago

Oscar,

You probably want:

    Part[Select[data, And[#[[1]] == "1 VTS", #[[7]] == "Female"] &], All, 3]

This gives you only the values in column 3 for the entries that correspond to 1 VTS and females. If you also want to see the word "Female" in those entries, use:

    Part[Select[data, And[#[[1]] == "1 VTS", #[[7]] == "Female"] &], All, {3, 7}]

You could also extract columns 3 and 7 for both males and females (for 1 VTS) and then segregate males and females into two sublists:

    GatherBy[Part[Select[data, #[[1]] == "1 VTS" &], All, {3, 7}], #[[2]] &]

If after this segregation you no longer want to keep the values "Male" and "Female", keep only the first element of each pair:

    Part[GatherBy[Part[Select[data, #[[1]] == "1 VTS" &], All, {3, 7}], #[[2]] &], All, All, 1]

This is what I meant when I said you could use this to plot both datasets in the same graph (with different plot markers). The output of the command above should be a list consisting of two sublists (probably of different lengths), each of which is just a list of numbers. If you pass that output to ListPlot, you should get such a plot. The x-axis will just be a counter.

Hope it helps, OL.

POSTED BY: Otto Linsuain
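
A possible reason the "Female" test above finds nothing: in the output posted earlier in the thread, the sex values print as "Female " and "Male ", with a trailing space, so an exact comparison with "Female" would fail. The sketch below illustrates this on made-up rows that only imitate the shape of the real data (protocol in column 1, a score in column 3, sex in column 7). The rows are assumptions, not the attached spreadsheet.

```mathematica
(* Hypothetical rows shaped like the thread's data; the filler columns are invented. *)
data = {
   {"1 VTS", 0, 2., 0, 0, 0, "Female "},
   {"1 VTS", 0, 1., 0, 0, 0, "Male "},
   {"2 VTS", 0, 4., 0, 0, 0, "Female "},
   {"1 VTS", 0, 3., 0, 0, 0, "Female "}
   };

(* Exact comparison: "Female " is not the same string as "Female", so no row is selected. *)
Select[data, #[[1]] == "1 VTS" && #[[7]] == "Female" &]
(* {} *)

(* Trimming whitespace before comparing makes the test tolerant of stray spaces. *)
Select[data, #[[1]] == "1 VTS" && StringTrim[#[[7]]] == "Female" &][[All, 3]]
(* {2., 3.} *)
```

StringTrim is available in Mathematica 9. If the real data has no trailing spaces, the original comparison is already sufficient and the cause lies elsewhere.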

Oscar Rodriguez, corporación universitaria iberoamericana

Posted 10 years ago

Hello Otto,

I was about to say that it is great to have the option of getting help in Spanish... So I tried this:

In[49]:= Part[Select[data, #[[1]] == "1 VTS" &], All, 3] (* selects column 3 associated with 1 VTS *)

Out[49]= {2., 1., 1., 3., 2., 3., 2., 5., 4.}

This allowed me to get the data in column 3 associated with 1 VTS. Then I tried to get the data in column 3 together with the sex of the participants:

In[30]:= Part[Select[data, #[[1]] == "1 VTS" &], All, {3, 7}]

Out[30]= {{2., "Female "}, {1., "Male "}, {1., "Female "}, {3., "Female "}, {2., "Female "}, {3., "Male "}, {2., "Female "}, {5., "Male "}, {4., "Female "}}

But I am not able to get the data in column 3 associated with a specific sex, for example only the data for females. Could you show me how to do that?

POSTED BY: Oscar Rodriguez

Otto Linsuain, Westinghouse Electric Company

Posted 10 years ago

Hello Oscar,

I think the moderators have a point: their job is harder if they do not understand the posts.

What you are trying to do only requires that you specify more parts in the command:

    Part[Select[data, #[[1]] == "1 VTS" &], All, {3, n}]

where n is the column that specifies the participants' gender. The arguments of Part (after the first one) specify the indexes you want at each depth. To request several indexes at one given depth, wrap them in {} (otherwise they would be read as indexes at successively deeper levels, and you would get an error saying that the part specification is longer than the depth of the object, or something similar). To skip entries, use a span such as 1 ;; -1 ;; 2 (from the first to the last, taking every other one). A list such as {1, 4, 2, 6, 3} picks entry numbers that do not follow a simple pattern. Notice that the [[ ]] notation is just shorthand for Part (and ;; is shorthand for Span); it is the same function. Also note that the specification is understood a bit differently in Take and Drop than in Part: there, {1, -1, 2} means from the first to the last in steps of 2.

Best, OL.

POSTED BY: Otto Linsuain
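
The part specifications discussed in the reply above can be tried on a small table. This is only a sketch: the matrix m is made up so that each entry shows its own row and column. Note that in Part, stepping is written with a span (;;), while the list form {start, end, step} is what Take accepts.

```mathematica
(* A made-up 4 x 5 table: the entry in row r, column c is 10 r + c. *)
m = Table[10 r + c, {r, 4}, {c, 5}];

m[[All, 3]]            (* column 3 as a flat list: {13, 23, 33, 43} *)
m[[All, {3, 5}]]       (* columns 3 and 5 of every row *)
m[[All, {5, 1, 3}]]    (* arbitrary positions, returned in the order listed *)
m[[1 ;; -1 ;; 2, 2]]   (* every other row of column 2: {12, 32} *)
Take[m, {1, -1, 2}]    (* rows 1 and 3, using Take's {start, end, step} form *)

Part[m, All, 3] === m[[All, 3]]   (* True: [[ ]] is shorthand for Part *)
```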

EDITORIAL BOARD, WOLFRAM

Posted 10 years ago

This forum currently supports only the English language. Please use only English in your discussions so that moderators and other members can understand the content.

POSTED BY: EDITORIAL BOARD

Otto Linsuain, Westinghouse Electric Company

Posted 10 years ago

Oscar and Luis,

I think what Oscar wants can be done like this:

    data = Import["/home/bird/Documents/ejemplo1.csv", "CSV"];
    Part[Select[data, #[[1]] == "1 VTS" &], All, 3]

    {2, 1, 1, 3, 2, 3, 2, 5, 4}

I also do a fair amount of work in Excel, but with large lists it becomes very sluggish and the functionality is much more limited (or so it seems to me). I do a lot of work involving fairly large arrays using only the ordinary list functions (Select, Map, MapThread, Thread, MapAt, MapIndexed, Apply, Part, Take, Drop, Join, Insert, Append, Prepend, Flatten, Partition, etc.). These functions solve most problems of manipulating lists (or lists of lists of lists). Mathematica also has somewhat more sophisticated machinery (which I have not used much) for handling databases. If you work with databases a lot, look up the Dataset function in the documentation. But I think it makes sense to become familiar first with the basic structures, like the ones I mention above.

Regards, OL.

POSTED BY: Otto Linsuain

Oscar Rodriguez, corporación universitaria iberoamericana

Posted 10 years ago

Hi Otto,

Thank you very much for helping me understand even better how these tools work; everyone is very kind. The command you showed me lets me select what I need, that is, only column 3 associated with 1 VTS. I would like to know whether the same command can be used to select column 3, but this time associated with the sex of the participants. I tried this, but it does not work:

    Part[Select[data, #[[1]] == "1 VTS" &], All, 3]

Thank you very much.

POSTED BY: Oscar Rodriguez

Luis Ledesma

Posted 10 years ago

I did not fully understand your question, but if what you are trying to do is simply take the elements of the third column of the variable data without "1 VTS", you could use the same idea I shared in my previous comment. Another possibility is that, once the pairs {"1 VTS", column-3 element} have been selected, you want only the elements of column 3. You could do it like this:

    Cases[data, {"1 VTS", _, _, _, _, _, _}][[;; , 1 ;; 3 ;; 2]][[;; , 2]]

Or, if you need to keep the pairs {"1 VTS", column-3 element} stored and use column 3 afterwards, I think you could use a variable so that the data you need remain available:

    hu = Cases[data, {"1 VTS", _, _, _, _, _, _}][[;; , 1 ;; 3 ;; 2]]

In[9]:= hu[[;; , 2]]

Out[9]= {2., 1., 1., 3., 2., 3., 2., 5., 4.}

I hope this has helped in some way, but if you still have problems please let me know; perhaps between the two of us we can find a solution.

POSTED BY: Luis Ledesma

Oscar Rodriguez, corporación universitaria iberoamericana

Posted 10 years ago

Hi Luis, you are very kind. I have now managed to isolate individual vectors and perform operations between them (as seen in the image), which was my goal. I have not yet mastered the technique, but I think it is a matter of practice. Thank you very much for the guidance you have given me. Regards

POSTED BY: Oscar Rodriguez

Luis Ledesma

Posted 10 years ago

Hi Oscar, I hope I understood your question correctly: do you want to have, in a single output, the elements that have "1 VTS" together with the third column of the variable data? If so, I did it as follows, using the xlsx file you attached:

    data = Import["C:\\Users\\miriam\\Downloads\\para wolfram comunity.xlsx", {"XLSX", "Data", 1}];

and then ran this:

    Grid[Cases[data, {"1 VTS", _, _, _, _, _, _}]]

which gives the same result you obtained with Select. Then, for the third column together with "1 VTS", I do the following, which as you will see is simpler than I expected:

    Grid[Cases[data, {"1 VTS", _, _, _, _, _, _}][[;; , 1 ;; 3 ;; 2]]]

I am sharing an image so you can see what I did and tell me whether it is what you are looking for. Regards.

POSTED BY: Luis Ledesma
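
As a rough illustration of the Cases approach in the reply above, the sketch below runs it on made-up rows (the values are assumptions, not the attached file). It also uses the pattern {"1 VTS", ___}, which matches rows of any length, as a possible alternative to writing one blank per column.

```mathematica
(* Hypothetical rows with seven columns; only columns 1, 3 and 7 carry meaning here. *)
data = {
   {"1 VTS", "a", 2., "x", "x", "x", "Female "},
   {"2 VTS", "b", 4., "x", "x", "x", "Male "},
   {"1 VTS", "c", 1., "x", "x", "x", "Male "}
   };

(* ___ (three underscores) matches any sequence of remaining columns. *)
rows = Cases[data, {"1 VTS", ___}];

rows[[All, 1 ;; 3 ;; 2]]   (* columns 1 and 3: {{"1 VTS", 2.}, {"1 VTS", 1.}} *)
rows[[All, 3]]             (* column 3 only: {2., 1.} *)
```

The span 1 ;; 3 ;; 2 means "from 1 to 3 in steps of 2", which is why it returns columns 1 and 3.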

Oscar Rodriguez, corporación universitaria iberoamericana

Posted 10 years ago

Hi Luis, yes, thank you very much. I have just one remaining question: is there some way to select only the data in that third column, that is, to select all the "1 VTS" rows and column 3, but without showing the column that contains 1 VTS? Thank you very much.

POSTED BY: Oscar Rodriguez

Otto Linsuain, Westinghouse Electric Company

Posted 10 years ago

Oscar,

There is a lot of information in those notebooks, which makes the question hard to understand and to answer. I think what you want can be done with the GatherBy function, for example GatherBy[tutabla, #[[1]]&] (assuming the protocol is the first element in each row), and then picking the group that contains 1 VS. Perhaps it is easier to use Select[tutabla, #[[1]]=="1 VS" &]. I use GatherBy when I want to see all the groups, but separated (the output of GatherBy can be passed to ListPlot, for example, to see the plots with different PlotMarkers). If I do not need the other groups, I use Select. Judging by the notebooks, you are at a fairly advanced level.

Regards, OL.

POSTED BY: Otto Linsuain
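
To make the difference between the two functions concrete, here is a minimal sketch on an invented two-column table (protocol label, value). The name tutabla follows the reply above; its contents are assumptions.

```mathematica
(* Made-up table: protocol in column 1, a measurement in column 2. *)
tutabla = {
   {"1 VS", 2.}, {"2 VS", 5.}, {"1 VS", 3.}, {"3 VAS", 1.}, {"2 VS", 4.}
   };

(* GatherBy keeps every row and splits the table into one sublist per protocol. *)
GatherBy[tutabla, #[[1]] &]
(* {{{"1 VS", 2.}, {"1 VS", 3.}}, {{"2 VS", 5.}, {"2 VS", 4.}}, {{"3 VAS", 1.}}} *)

(* Select keeps only the rows of the one protocol asked for. *)
Select[tutabla, #[[1]] == "1 VS" &]
(* {{"1 VS", 2.}, {"1 VS", 3.}} *)
```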

Oscar Rodriguez, corporación universitaria iberoamericana

Posted 10 years ago

Hi Otto,

Thank you very much for helping me. I have been using Mathematica for some time and, honestly, I have always had trouble selecting subgroups. So when I want to compare groups, for example with a difference of means or something similar, my only option has been to create two spreadsheets and import them, so that I have the groups separated and can then compare them. I have always wanted to be able to separate the subgroups from a single spreadsheet and perform operations between them, so I am very grateful. However, I am very much a beginner in Mathematica and I find some things hard to understand.

I tried to replicate what you told me (Select[tutabla, #[[1]]=="1 VTS" &]) and it worked very well: I was able to select all the rows that had 1 VTS in the first column. Now I would like to know how, in addition to selecting all the data (that is, all the columns associated with 1 VTS), I can select one column in particular; for example, select all the 1 VTS data and also column 3. It would help me a great deal if you could put your example directly into the Mathematica notebook with my data; that way it would be easier for me to understand it. I am attaching the notebook and the spreadsheet here. Thank you very much.

POSTED BY: Oscar Rodriguez

Oscar Rodriguez, corporación universitaria iberoamericana

Posted 10 years ago

Thank you very much; now I will be able to work more fluently with Mathematica.

POSTED BY: Oscar Rodriguez

David Keith

Posted 10 years ago

Hi Oscar,

Attached is a notebook example which imports your data. It then uses the Mathematica function GatherBy to group the data into sublists, where each sublist contains the records for a given protocol. At this point, a separate but identical analysis can be run on each of these sublists to get a separate analysis for each protocol. I would usually do that by making a function that performs an analysis and then mapping it onto the data collection, but that is a bit advanced. You can also copy the analysis with new input, where each input is one of the sublists.

Best, David

POSTED BY: David Keith
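
The "function mapped onto the groups" idea mentioned in the reply above might look roughly like the following. This is a sketch under assumptions: the rows are invented, and analyze is a hypothetical helper name, not something from the attached notebook.

```mathematica
(* Made-up rows: protocol in column 1, a score in column 2. *)
data = {
   {"1 VTS", 2.}, {"2 VTS", 5.}, {"1 VTS", 4.}, {"2 VTS", 3.}, {"1 VTS", 3.}
   };

(* One sublist per protocol, in order of first appearance. *)
groups = GatherBy[data, First];

(* Hypothetical analysis: protocol label, number of rows, mean of column 2. *)
analyze[group_] := {group[[1, 1]], Length[group], Mean[group[[All, 2]]]};

(* The same analysis applied to every group. *)
summary = Map[analyze, groups]
(* {{"1 VTS", 3, 3.}, {"2 VTS", 2, 4.}} *)

(* Difference between the two group means, without separate spreadsheets. *)
summary[[1, 3]] - summary[[2, 3]]
(* -1. *)
```

Any other per-group computation can be substituted for the body of analyze; the grouping and mapping steps stay the same.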

Oscar Rodriguez, corporación universitaria iberoamericana

Posted 10 years ago

I really appreciate your help. I hope this new file is simpler.

POSTED BY: Oscar Rodriguez

David Keith

Posted 10 years ago

In your data I see some date objects in column 1, not the "1 VS" and the other things you describe. Where are these group designators? This notebook is very complicated. Is it possible you could provide a simpler example? (I will be gone tomorrow, so there will be a delay unless someone else can contribute.)

POSTED BY: David Keith

Oscar Rodriguez, corporación universitaria iberoamericana

Posted 10 years ago

I know that it is probably an annoyance, but I am learning Mathematica and I cannot follow you. Could you please make the example directly in the Mathematica notebook with my data? I would appreciate it.

POSTED BY: Oscar Rodriguez

David Keith

Posted 10 years ago

I checked -- it was not new syntax in 10. I think it's been available for quite a while.

You can use Select together with a selection condition to extract rows by content. For example, the code below uses a pure function that returns True for a particular match. A row is selected if its 2nd column contains precisely "B".

In[6]:= set = {"A", "B", "C", "D"};

In[7]:= data = RandomChoice[set, {9, 5}]

Out[7]= {{"D", "D", "B", "C", "A"}, {"A", "C", "A", "A", "C"},
  {"C", "B", "D", "B", "C"}, {"B", "A", "A", "C", "C"},
  {"B", "B", "A", "B", "C"}, {"B", "C", "B", "B", "C"},
  {"D", "D", "B", "D", "A"}, {"D", "C", "D", "C", "D"},
  {"A", "A", "C", "B", "A"}}

In[8]:= Select[data, #[[2]] === "B" &]

Out[8]= {{"C", "B", "D", "B", "C"}, {"B", "B", "A", "B", "C"}}

POSTED BY: David Keith

Oscar Rodriguez, corporación universitaria iberoamericana

Posted 10 years ago

Thanks. At the moment I only have access to version 9, and I do not know whether your code works in V9. And yes, as you can see, column 1, for example, contains several groups (1 VS, 2 Vs, 3 VAS, etc.), so I need to select all the rows with, for example, 3 VAS. At the moment the only thing I can do is create another spreadsheet in which I separate out 3 VAS, in order to perform operations with other groups.

POSTED BY: Oscar Rodriguez

David Keith

Posted 10 years ago

Oscar,

I'm not sure I understand, but if you want only a particular range of rows for a given column, it can be done like this in V10:

In[3]:= d = Table[10 j + i, {j, 9}, {i, 5}]

Out[3]= {{11, 12, 13, 14, 15}, {21, 22, 23, 24, 25}, {31, 32, 33, 34, 35},
  {41, 42, 43, 44, 45}, {51, 52, 53, 54, 55}, {61, 62, 63, 64, 65},
  {71, 72, 73, 74, 75}, {81, 82, 83, 84, 85}, {91, 92, 93, 94, 95}}

In[4]:= d[[3 ;; 5, 2]]

Out[4]= {32, 42, 52}

POSTED BY: David Keith
