Binning data with associated data
A common 'problem' is that you want to bin data by (e.g.) the x coordinate, and then have the associated y values grouped along with it. To do this, I often do:
x={24,19,49,5,27,100,18,28,77,38,82,22,2,13,12,32,69,72,52,90,16,9,63,64,10,31,51,14,80,70,21,30,71,79,37,65,84,47,33,81,40,94,68,58,11,15,97,88,1,99,74,78,91,93,89,26,45,98,95,67,4,92,29,43,85,39,73,23,8,62,83,57,35,41,17,34,75,25,66,53,44,36,50,60,3,46,86,42,20,56,6,87,55,76,54,48,7,96,59,61};
y={6,64,21,13,34,100,7,89,83,50,19,32,43,38,60,14,1,31,99,40,80,78,68,95,55,72,63,65,91,71,9,51,70,97,37,25,20,52,88,22,62,81,66,69,35,75,29,4,26,27,41,33,93,18,42,98,77,44,85,17,11,12,57,94,61,54,23,28,30,67,10,3,46,45,87,79,96,16,24,73,58,53,48,36,90,76,92,2,82,39,5,59,49,15,74,8,47,84,56,86};
ybinnedbyx=BinLists[{x,y}\[Transpose],{0,100,10},{-10000,10000,20000}][[All,1,All,2]]
I.e. I turn it into 2D data and then make the bin size in the y direction very large, so that all the data falls into a single y bin. Of course this works with numerical data, but it doesn't work if the data in the y direction are strings, lists, or other non-numeric expressions.
I see four possible solutions (the 3rd being the neatest):
1
A new binning function that returns lists of indices (BinIndices or BinPositions are good candidate names), so that it can be used with Extract on other data. Still a little fiddly, because it (presumably) returns a list of lists of indices, and Extract does not handle that directly, so you would probably need some combination of Map, Part, and a pure function.
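To make the idea concrete, here is a rough sketch of what such a function could look like. The name binPositions is hypothetical (it is not a built-in), and I assume half-open bins [lo, hi) with out-of-range values dropped, which is how I understand BinLists to behave. The small x and labels lists are made-up sample data.

```mathematica
(* binPositions is a hypothetical helper, not a built-in function. *)
(* For each bin of width dx between min and max it returns the positions *)
(* of the elements of data that fall in that bin. *)
binPositions[data_List, {min_, max_, dx_}] :=
 Module[{nbins, idx},
  (* number of bins; assumes whole bins of width dx cover the range *)
  nbins = Ceiling[(max - min)/dx];
  (* 1-based bin number of every element, bins being half-open [lo, hi) *)
  idx = Floor[(data - min)/dx] + 1;
  (* positions per bin number; empty bins and out-of-range values are handled by the default {} *)
  Lookup[PositionIndex[idx], Range[nbins], {}]]

(* made-up sample data: numeric x, non-numeric associated data *)
xs = {1, 12, 5, 25, 11};
labels = {"a", "b", "c", "d", "e"};

pos = binPositions[xs, {0, 30, 10}]

(* Part accepts a list of indices, so mapping it over pos picks out the associated data *)
labelsBinned = labels[[#]] & /@ pos
```

Since the associated data is only ever accessed through Part, it can be strings, lists, or anything else.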
2
A new option to BinLists, for which I have no good name yet, but let's call it BinFunctions for now. By default, if I give some 2D data:
{{1,2},{2,4},{3,1.5},{4,8}....}
it will bin it first by Part[#,1]& of each element and then by Part[#,2]&. That is, first by the first element, then by the second element... If we could supply our own BinFunctions, then we could give something like Part[#,1]& and 1&, such that all the y data goes into one bin.
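A one-dimensional version of this idea can be imitated today with GroupBy. The sketch below is only an approximation of the proposal: binListsBy is a hypothetical name, it takes a single bin function instead of one per dimension, and it again assumes half-open bins with out-of-range elements dropped.

```mathematica
(* binListsBy is a hypothetical helper, not a built-in function. *)
(* It bins the elements of data by the numeric value f[element]. *)
binListsBy[data_List, f_, {min_, max_, dx_}] :=
 Module[{nbins, groups},
  (* number of bins; assumes whole bins of width dx cover the range *)
  nbins = Ceiling[(max - min)/dx];
  (* group whole elements by their 1-based bin number *)
  groups = GroupBy[data, Floor[(f[#] - min)/dx] + 1 &];
  (* return the bins in order, with {} for empty bins *)
  Lookup[groups, Range[nbins], {}]]

(* made-up sample data: bin {x, label} pairs by x only *)
pairs = {{1, "a"}, {12, "b"}, {5, "c"}, {25, "d"}};
binned = binListsBy[pairs, First, {0, 30, 10}]
```

Only f[element] has to be numeric here, so the rest of each element is carried along untouched.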
3
Remove the need to supply n binspecs when we have "vectors" of length n. So:
BinLists[{x,y}\[Transpose],{0,100,10}]
would bin it by the first element of each vector only, ignoring the rest of the vector...
This will keep it backward compatible, so that is very good!
4
A new option for BinLists, called something like AssociatedData (and a combiner function?). It would first bin the data, and then combine the results with the associated data using the combiner function (List by default).
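As a rough illustration of how this might behave, here is a sketch written as an ordinary function. The name binWithAssociated and its argument order are made up for this example, and the same assumptions apply as before: half-open bins, out-of-range values dropped.

```mathematica
(* binWithAssociated is a hypothetical helper, not a built-in function. *)
(* It bins data and pairs every element with its associated datum via combiner. *)
binWithAssociated[data_List, assoc_List, {min_, max_, dx_}, combiner_ : List] :=
 Module[{nbins, idx, pos, combined},
  (* number of bins; assumes whole bins of width dx cover the range *)
  nbins = Ceiling[(max - min)/dx];
  (* 1-based bin number of every element of data *)
  idx = Floor[(data - min)/dx] + 1;
  (* positions of the elements belonging to each bin *)
  pos = Lookup[PositionIndex[idx], Range[nbins], {}];
  (* combine each datum with its associated datum *)
  combined = MapThread[combiner, {data, assoc}];
  (* pick out the combined values bin by bin *)
  combined[[#]] & /@ pos]

(* made-up sample data *)
withDefault = binWithAssociated[{1, 12, 5, 25}, {"a", "b", "c", "d"}, {0, 30, 10}]

(* a combiner that keeps only the associated data *)
onlyAssoc = binWithAssociated[{1, 12, 5, 25}, {"a", "b", "c", "d"}, {0, 30, 10}, #2 &]
```

With the default List combiner each bin holds {x, associated} pairs, while #2 & gives just the associated data per bin, which is what ybinnedbyx above is after.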