Greetings, everyone. I hope you are doing ok during these strange times.
My question, "How do you modify and overwrite a Dataset?", is motivated by the following problem:
I have imported a dataset (.csv file attached) using SemanticImport. I then added two columns to it with the Append function, creating a new dataset at each step. As you can see, these columns were computed from data already in the dataset (the procedure is explained in this post: https://community.wolfram.com/groups/-/m/t/313491).
d2sv6 = Append[#,
"theta40" -> #["Theta_0i"] + #["m0i"]*(40 - #["A"])] & /@ d1sv6
d3sv6 = Append[#,
"theta80" -> #["Theta_0i"] + #["m0i"]*(80 - #["A"])] & /@ d2sv6
Now I need to add a large number of columns (ranging from 20 to 100) to this dataset. Unlike the examples above, the columns I need to add require data that is not already in the dataset. For instance, I need one column for each of the 50 values in the list "test80":
test80 = {-4.465980214, -4.300742326, -4.135504438, -3.97026655, \
-3.805028661, -3.639790773, -3.474552885, -3.309314997, -3.144077109, \
-2.978839221, -2.813601333, -2.648363445, -2.483125556, -2.317887668, \
-2.15264978, -1.987411892, -1.822174004, -1.656936116, -1.491698228, \
-1.32646034, -1.161222452, -0.995984563, -0.830746675, -0.665508787, \
-0.500270899, -0.335033011, -0.169795123, -0.004557235, 0.160680653,
0.325918542, 0.49115643, 0.656394318, 0.821632206, 0.986870094,
1.152107982, 1.31734587, 1.482583758, 1.647821647, 1.813059535,
1.978297423, 2.143535311, 2.308773199, 2.474011087, 2.639248975,
2.804486863, 2.969724751, 3.13496264, 3.300200528, 3.465438416,
3.630676304}
The calculation takes each value in the column "theta40" or "theta80" (each one represents a scenario) and computes a number from it and each of the 50 values in the list "test80". For instance, taking just the "theta40" scenario, 50 columns should be created, one per value in "test80"; each cell in a new column combines that row's "theta40" value with the corresponding value from the list.
Of course, it is impractical to create a separate object for each added column, as I did in the first part of the problem. Ideally, after all the columns are created, there should be a single new dataset for each scenario. I have two main questions about this problem:
- How do I append and overwrite the dataset so it includes the values from the list?
- How do I get each new column to have a different name? Should I perhaps use an array instead of a list?
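To make the two questions concrete, here is a minimal sketch of the kind of thing I am after, not a tested solution. The names rasch, addItemColumns, d40 and d80 are hypothetical, and the formula inside rasch assumes the usual logistic form of the Rasch model mentioned further down, with the "theta40" or "theta80" value as the ability and each entry of "test80" as an item difficulty. Column names are generated from the scenario name and the position of the value in the list.

```wolfram
(* Hypothetical helper: assumed Rasch probability of a correct response
   for ability theta and item difficulty b *)
rasch[theta_, b_] := 1/(1 + Exp[-(theta - b)])

(* Hypothetical helper: returns a copy of ds with one extra column per
   element of difficulties, named "<scenario>_item001", "<scenario>_item002", ... *)
addItemColumns[ds_Dataset, scenario_String, difficulties_List] :=
 Map[
  Function[row,
   Join[row,
    Association@MapIndexed[
      (* #1 is the difficulty, First[#2] is its position in the list *)
      (scenario <> "_item" <> IntegerString[First[#2], 10, 3]) ->
        rasch[row[scenario], #1] &,
      difficulties]]],
  ds]

(* One new dataset per scenario *)
d40 = addItemColumns[d3sv6, "theta40", test80];
d80 = addItemColumns[d3sv6, "theta80", test80];

(* Or overwrite the existing symbol instead of creating a new one *)
(* d3sv6 = addItemColumns[d3sv6, "theta40", test80]; *)
```

If something like this is reasonable, all 50 columns would be added in one pass per row, and "overwriting" would just mean assigning the result back to the same symbol, since a Dataset is not modified in place.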
For those of you interested, this problem is relevant to the fields of psychometrics and education. The numbers to be calculated are probabilities of a correct response (following what is known as a Rasch model). The values in the list "test80" can be thought of as item difficulties. The data to be generated simulate performance on a test. The need to add columns to a dataset is common in education; in fact, the problem is trivial in a spreadsheet. The trouble is that calculating huge numbers of formulas in a spreadsheet seems to be very inefficient.
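For reference, the standard dichotomous Rasch model gives the probability that person n answers item i correctly as shown below. Mapping the ability onto the "theta40" or "theta80" value of a row, and the difficulty onto an entry of "test80", is my assumption about how the symbols correspond to the columns here.

$$
P(X_{ni} = 1) = \frac{e^{\theta_n - b_i}}{1 + e^{\theta_n - b_i}}
$$

Here \theta_n is the person's ability and b_i is the item's difficulty; when the two are equal the probability is exactly 1/2.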