**Goal:** Mutual information is an interesting quantity because it measures correlations in data that cannot be captured by the correlation function. The goal of this project is to develop a sufficiently general implementation of mutual information that can be applied to any kind of data, and then to explore how to extract meaningful measurements from different types of data.

**Results:** We developed a function that calculates the mutual information between two sets of data, relying only on a distance function to do so. Since any data can be considered to be embedded in a metric space, this implementation works in principle for any kind of data, be it images, videos or plain time series. We tested it, then explored how it can be used to extract meaningful measurements from audio and video of well-known systems, and ultimately applied it to a more general example.

**Further developments:** The function uses a k-th nearest neighbours approach to estimate mutual information, so it relies heavily on built-in functions like Nearest and DistanceMatrix. Efficiency is certainly an issue that must be addressed, especially when using user-defined distances rather than built-in ones. Furthermore, exactly which distance or preprocessing is useful for extracting meaningful measurements is an open-ended question that needs much more investigation.

Imagine you're sitting by a fountain, watching the calming, steady movement of the water stream going up and down. Then a thought strikes you: is it just a periodic movement, or is there something more? Say you only have a camera, and you record the movement to inspect it later out of curiosity. Extracting anything even remotely meaningful would require quite a bit of image processing on the video frames: an expert's work. What if you had a function that let you just plug in the video and get an estimate, even if a rough one, of how much information is "common" to the frames?
And what about doing the same with, say, the frequency spectra or the amplitude of an audio file? The approach we explored in this project is to implement a function that estimates a quantity called mutual information. Consider two random variables X and Y. The mutual information I(X, Y) measures how much knowing Y reduces our uncertainty about X:

I(X, Y) = H(X) - H(X|Y),

where H(X) is the entropy of X and H(X|Y) is the conditional entropy of X given Y. But what has this to do with media files? Well, pretty much any media file can be thought of as lying in some kind of metric space, i.e. a space with a distance function. Once one has a distance between two objects, one can devise a procedure to "bin" the space they lie in and look at them as probability distributions. More in detail, we implemented the estimator of the mutual information between X and Y published by Kraskov et al. [1] (Phys. Rev. E 69, 066138, 2004). In the following, we will sometimes refer to the mutual information as MI.
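As a quick illustrative aside (in Python rather than the Wolfram Language, purely for exposition), the definition above can be checked on a small discrete joint distribution, using the equivalent identity I(X, Y) = H(X) + H(Y) - H(X, Y):

```python
import math

def entropy(p):
    """Shannon entropy (in bits) of a probability vector."""
    return -sum(q * math.log2(q) for q in p if q > 0)

def mutual_information(joint):
    """I(X,Y) = H(X) + H(Y) - H(X,Y) for a joint probability table,
    where joint[i][j] = P(X = i, Y = j)."""
    px = [sum(row) for row in joint]              # marginal of X
    py = [sum(col) for col in zip(*joint)]        # marginal of Y
    pxy = [q for row in joint for q in row]       # flattened joint
    return entropy(px) + entropy(py) - entropy(pxy)

# Perfectly correlated bits: knowing Y fully determines X, so I = H(X) = 1 bit
print(mutual_information([[0.5, 0.0], [0.0, 0.5]]))   # -> 1.0

# Independent bits: the joint factorizes, so I = 0
print(mutual_information([[0.25, 0.25], [0.25, 0.25]]))  # -> 0.0
```

The second case previews the independence criterion discussed next: the MI vanishes exactly when the joint distribution factorizes into the marginals.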
## Example: the logistic map

But why should we care about the mutual information? There are many other quantities that estimate relationships among variables and are easier to compute, such as the widely used correlation function. The point of using the mutual information is that it can detect relationships of higher order than the correlation function does. In particular, going briefly back to random variables, the mutual information vanishes if and only if the two variables X and Y are independent:

I(X, Y) = 0 ⇔ p_XY(x, y) = p_X(x) p_Y(y),

where p_XY is the joint distribution of X and Y, and p_X, p_Y are the marginals.

To better illustrate the concept, let's consider a well-known system: the logistic map, a 1D system which for some parameter values exhibits chaotic motion. Its evolution is given by iterating (from an initial condition x_0)

x_{n+1} = r x_n (1 - x_n).

It is clear that the function above has a quadratic form:

```mathematica
Manipulate[
 Plot[r x (1 - x), {x, 0, 1},
  PlotLabel ->
   Subscript[x, Style["n", Italic] + 1] ==
    r Subscript[x, Style["n", Italic]] (1 - Subscript[x, Style["n", Italic]])],
 {r, 0, 4}]
```

If we iterate the function, it will eventually end up cycling among a set of values, or even just a single one. This set is called an attractor. In the plot below we show the attractor of the logistic map as a function of the parameter r, for a fixed starting point x_0 = 0.234:

```mathematica
(* long-run values of the map for each r, after dropping a transient *)
limits = Compile[{r},
   ({r, #1} &) /@ Union[Drop[NestList[#1 r (1 - #1) &, 0.234, 1000], 300]]];
ListPlot[Join @@ Table[limits[r], {r, 2.5, 4, 0.005}],
 PlotStyle -> PointSize[Tiny],
 AxesLabel -> {"r", Subscript[x, Style["n", Italic]]}]
```

What about the correlation between consecutive points in the evolution of the map?
In the plot below, we show how the correlation function rapidly plunges to 0 after just one step, whereas the mutual information decreases much more slowly (here for r = 4):

```mathematica
ClearAll[lm];
lm = Compile[{{x, _Real}, {r, _Real}}, NestList[r*# (1. - #) &, x, 10000]];
Block[{mutinfo, logdata, corr},
 logdata = lm[RandomReal[], 4.];
 {mutinfo, corr} = Transpose@Table[
    {KraskovI1[logdata[[;; 1000]], logdata[[(T + 1) ;; (1000 + T)]], 4],
     CorrelationFunction[logdata[[;; 1000]], T]}, {T, 0, 20}];
 ListLinePlot[{mutinfo, corr}, PlotRange -> All,
  PlotLabel -> "Mutual Information vs Correlation",
  PlotLegends -> {"Mutual Information", "Correlation"},
  AxesLabel -> {"Time"}, ImageSize -> Large]]
```

This means that the mutual information reveals a relationship between consecutive points that cannot be seen by the correlation function.
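For readers who want the estimator spelled out, here is a compact Python sketch of the nearest-neighbour estimator of Kraskov et al. (algorithm 1 of [1]; a brute-force illustration, not the project's actual Wolfram Language implementation), applied to exactly this logistic-map experiment:

```python
import math
import random

def digamma(x):
    """Digamma function via recurrence plus an asymptotic expansion."""
    r = 0.0
    while x < 7:
        r -= 1.0 / x
        x += 1.0
    return r + math.log(x) - 1.0 / (2.0 * x) - 1.0 / (12.0 * x * x)

def ksg_mi(xs, ys, k=4):
    """Kraskov-Stoegbauer-Grassberger (algorithm 1) kNN estimate of
    I(X, Y) in nats. Brute-force O(N^2); fine for small samples."""
    n = len(xs)
    acc = 0.0
    for i in range(n):
        # eps: distance to the k-th nearest neighbour in the joint space,
        # under the max-norm (the "chessboard" distance)
        d = sorted(max(abs(xs[i] - xs[j]), abs(ys[i] - ys[j]))
                   for j in range(n) if j != i)
        eps = d[k - 1]
        # neighbour counts strictly within eps in each marginal space
        nx = sum(1 for j in range(n) if j != i and abs(xs[i] - xs[j]) < eps)
        ny = sum(1 for j in range(n) if j != i and abs(ys[i] - ys[j]) < eps)
        acc += digamma(nx + 1) + digamma(ny + 1)
    return digamma(k) + digamma(n) - acc / n

def pearson(xs, ys):
    """Sample Pearson correlation coefficient."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(xs, ys))
    vx = sum((a - mx) ** 2 for a in xs)
    vy = sum((b - my) ** 2 for b in ys)
    return cov / math.sqrt(vx * vy)

# Logistic map at r = 4: consecutive points are deterministically related,
# yet their linear correlation essentially vanishes.
random.seed(1)
x, traj = random.random(), []
for _ in range(600):
    traj.append(x)
    x = 4.0 * x * (1.0 - x)
mi = ksg_mi(traj[:-1], traj[1:])
corr = pearson(traj[:-1], traj[1:])
print(mi, corr)
```

With a few hundred points the kNN estimate is clearly positive while the sample correlation of consecutive points stays near zero, which is the same contrast the plot above shows.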
## Application: delay embedding of the Rössler attractor

In order to test our implementation of the mutual information, we applied it to another system. We considered the Rössler attractor, defined by the equations

x' = -(y + z),  y' = x + a y,  z' = b + z (x - c).
The solutions of the above system trace out the attractor shown below:

```mathematica
ClearAll[rosslervalues];
Module[{x, y, z},
 rosslervalues = With[{a = 0.2, b = 0.2, c = 8.0},
   NDSolveValue[{x'[t] == -(y[t] + z[t]), y'[t] == x[t] + a y[t],
     z'[t] == b + x[t] z[t] - c z[t], x[0] == 1, y[0] == 1, z[0] == 1},
    {x, y, z}, {t, 0, 10000}]];
 Manipulate[
  Show[ParametricPlot3D[Through[rosslervalues[t]], {t, 0, 100},
    PlotRange -> All, PlotLabel -> Style["Rössler attractor", 16, Bold],
    AxesLabel -> {"x", "y", "z"}, PlotStyle -> {Thin}, ImageSize -> Large],
   Graphics3D[{Red, PointSize[Medium], Point[Through[rosslervalues[time]]]}]],
  {{time, 0, "Time"}, Range[0, 100, 0.1], ControlType -> Slider},
  SaveDefinitions -> True, SynchronousUpdating -> False]]
```

Now we want to use the mutual information to perform delay embedding. Basically, say we have the above system but can only measure its first coordinate, x(t), at certain time intervals. Can we reconstruct the system's properties from that alone? The answer is yes, thanks to Takens' theorem: choose a time delay T and build the vectors (x(t), x(t+T), ..., x(t+mT)); for an embedding dimension m > 2n, with n the dimension of the original system, the reconstruction preserves the properties of the attractor. But which delay T is best? A standard prescription is to compute I(Δt) = I(x(t), x(t+Δt)) and take the Δt at its first minimum; here that gives T = 1.52 s:

```mathematica
ClearAll[tembed, firstc];
tembed = 1.52;
firstc = Function[{x}, rosslervalues[[1]][x], Listable];
ParametricPlot3D[{firstc[t], firstc[t + tembed], firstc[t + 2 tembed]}, {t, 0, 100},
 AxesLabel -> {"x[t]", "x[t+T]", "x[t+2T]"},
 PlotLabel -> Style["Delay embedding of Rössler attractor with T=" <> ToString[tembed], 15, Bold],
 ImageSize -> Large]
```

We see that the shape above is quite close to the original one, up to a smooth geometrical transformation. In the plot below, we show our estimation of the mutual information for this system.
The minimum is close to ≈ 1.5 s (the time scale in the plot is in tenths of a second):

```mathematica
{mutinforossler, errorrossler} = Transpose@MonitoredMap[
    Function[{T},
     {KraskovI1[firstc[Range[0, 1000]], firstc[Range[0, 1000] + T], 4],
      miError[KraskovI1, firstc[Range[0, 1000]], firstc[Range[0, 1000] + T], 4]}],
    Range[0, 20, .1]];

corrrossler = CorrelationFunction[firstc[Range[0, 1200, 0.1]], {0, 200, 1}];
Rasterize[ListLinePlot[
  {mutinforossler, corrrossler, mutinforossler + errorrossler, mutinforossler - errorrossler},
  PlotRange -> All, Frame -> True, FrameLabel -> {"Time (tenths of seconds)"},
  PlotStyle -> {ColorData[97][1], ColorData[97][2],
    Directive[Red, Opacity[0.2]], Directive[Red, Opacity[0.2]]},
  PlotLegends -> Placed[{"Mutual information", "Correlation", "Error"}, Top],
  PlotTheme -> "Detailed", ImageSize -> Large]]
```

We also computed the correlation, using the built-in function CorrelationFunction, and compared it with the mutual information in the plot. Notice that their oscillations are similar, but the peaks in the mutual information are a clearer indication of the periodic behaviour of the Rössler attractor. In particular, the second peak corresponds to a full period, and the first minimum is expected to lie at roughly a quarter of the period. The behaviour we get is quite close to the expected one, but now another question arises: does this estimator for MI depend on the particular distance used?
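The "first minimum of the lagged MI" prescription works because the MI between a signal and a delayed copy of itself decays once the delay exceeds the system's memory. A minimal stand-alone sketch of that decay (Python, using a hypothetical AR(1) signal and a crude binned estimator instead of the project's kNN one):

```python
import math
import random
from collections import Counter

def binned_mi(xs, ys, bins=8):
    """Simple histogram (binned) estimate of I(X, Y) in nats -- rougher
    than the kNN estimator, but enough to show the trend with the lag."""
    n = len(xs)
    lox, hix = min(xs), max(xs)
    loy, hiy = min(ys), max(ys)
    bx = [min(int((v - lox) / (hix - lox) * bins), bins - 1) for v in xs]
    by = [min(int((v - loy) / (hiy - loy) * bins), bins - 1) for v in ys]
    cxy, cx, cy = Counter(zip(bx, by)), Counter(bx), Counter(by)
    # I = sum p(x,y) log[ p(x,y) / (p(x) p(y)) ]
    return sum((c / n) * math.log(c * n / (cx[i] * cy[j]))
               for (i, j), c in cxy.items())

def lagged_mi(sig, lag):
    """MI between the signal and a copy of itself delayed by `lag`."""
    return binned_mi(sig[:-lag], sig[lag:])

# AR(1) process: memory (and hence the lagged MI) decays with the delay.
random.seed(0)
x, sig = 0.0, []
for _ in range(5000):
    x = 0.95 * x + random.gauss(0.0, 1.0)
    sig.append(x)
mis = {lag: lagged_mi(sig, lag) for lag in (1, 5, 20, 80)}
print(mis)
```

For a periodic system like the Rössler attractor the lagged MI also oscillates, which is why the first *local* minimum, rather than the global decay, is used to pick the embedding delay.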
We talked before about using different distances. Until now we've used the so-called ∞-norm under the hood (ChessboardDistance in Mathematica), but any distance function can easily be plugged into the MI estimator via the DistanceFunction option, providing either a built-in or a user-defined function (here, for simplicity, we plot on a shorter timescale):

```mathematica
Block[{miaux},
 miaux = MonitoredMap[
     Function[{T}, KraskovI1[firstc[Range[1, 1000]], firstc[Range[1, 1000] + T], 4,
       DistanceFunction -> #]], Range[0, 10, 0.1]] &;
 rosslerMIassociation = AssociationMap[miaux,
   {ChessboardDistance, EuclideanDistance, SquaredEuclideanDistance}];]

distancesplots = ListLinePlot[#, PlotRange -> All,
    Frame -> {{True, False}, {True, False}},
    FrameLabel -> {"Time (tenths of seconds)", "Mutual Information"},
    PlotTheme -> "Detailed", ImageSize -> Large] & /@ rosslerMIassociation;
TabView[distancesplots]
```

We can clearly see from the plots above that, for such a simple system, the behaviour is the same no matter which distance function is used. This is in fact expected for scalar data: in one dimension, ChessboardDistance, EuclideanDistance and SquaredEuclideanDistance are all monotone functions of |x - y|, so they induce exactly the same nearest-neighbour ranking.
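That distance-independence can be checked directly: a kNN-based estimator only uses the *ordering* of neighbours, and monotone transforms of a distance preserve that ordering. A small illustrative Python sketch:

```python
import random

# In 1D these three distances are all monotone in |a - b| ...
chebyshev = lambda a, b: abs(a - b)       # max-norm collapses to |a - b|
euclidean = lambda a, b: abs(a - b)       # 2-norm collapses to |a - b|
sq_euclid = lambda a, b: (a - b) ** 2     # monotone transform of |a - b|

def knn_order(points, i, dist):
    """Indices of the other points, ranked by distance from points[i]."""
    others = [j for j in range(len(points)) if j != i]
    return sorted(others, key=lambda j: dist(points[i], points[j]))

# ... so any kNN estimator sees the identical neighbour ranking under all three.
random.seed(2)
pts = [random.random() for _ in range(50)]
orders = [knn_order(pts, 0, d) for d in (chebyshev, euclidean, sq_euclid)]
print(orders[0] == orders[1] == orders[2])  # -> True
```

For higher-dimensional data (spectra, reduced video frames) the distances are no longer monotone transforms of each other, so agreement there is an empirical observation rather than a mathematical necessity.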
## Applications: audio and video

OK, everything seems to work well on well-known systems, but why not move on to something a bit more interesting? Say we have an audio or video file of any kind: can we extract some information from it in a way we can make sense of? In principle, any list of "stuff" can be plugged into the MI estimator. But in order to extract useful information, we must be careful about what we actually feed it.
### Audio

Let's consider the Rössler system again, but this time in a slightly different setup. We encode its first coordinate in an audio file using MIDI tones:

```mathematica
Block[{midiToFreq},
 midiToFreq[m_] := 2^((m - 69)/12)*440.;
 audioross = LowpassFilter[
   AudioGenerator[{"Sin",
     TimeSeries[midiToFreq /@ (50 + firstc[Range[0, 100, 0.1]]), {0, 3}]}, 3],
   Quantity[1000, "Hertz"]]]
```
The audio is funny, but the spectrogram is quite interesting too (we applied a low-pass filter to remove noise at higher frequencies):

```mathematica
Spectrogram[audioross, ImageSize -> Medium]
```

It would be nice now to recover the behaviour we saw above for the Rössler attractor just from the frequency spectra of this audio file. But first we need some kind of distance on frequency spectra! We'll use two different ones:

- Euclidean: d(x, y) = Sqrt[Σ_i (x_i - y_i)^2]
- Slope: d(x, y) = Sqrt[Σ_i (x'_i - y'_i)^2], where x'_i = x_{i+1} - x_i

Now we just plug the frequency spectra and the distances into the MI estimator:

```mathematica
audiorossspectra = Abs[SpectrogramArray[audioross, Automatic, Automatic]];

ClearAll[spectralDistanceEuclidean, spectralDistanceSlope];
spectralDistanceEuclidean = Sqrt@Total[(#1 - #2)^2] &;
spectralDistanceSlope = Sqrt@Total[(Differences[#1] - Differences[#2])^2] &;

{eucliddistM, euclidError} = Transpose@With[{TMax = 100, l = Length[audiorossspectra]},
    MonitoredMap[Function[{T},
      {KraskovI1DM[audiorossspectra[[;; (l - TMax)]], audiorossspectra[[(T + 1) ;; (T + l - TMax)]], 4,
        DistanceFunction -> spectralDistanceEuclidean],
       miError[KraskovI1DM, audiorossspectra[[;; (l - TMax)]], audiorossspectra[[(T + 1) ;; (T + l - TMax)]], 4,
        DistanceFunction -> spectralDistanceEuclidean]}], Range[0, TMax]]];
{slopedist, slopedistError} = Transpose@With[{TMax = 100, l = Length[audiorossspectra]},
    MonitoredMap[Function[{T},
      {KraskovI1DM[audiorossspectra[[;; (l - TMax)]], audiorossspectra[[(T + 1) ;; (T + l - TMax)]], 4,
        DistanceFunction -> spectralDistanceSlope],
       miError[KraskovI1DM, audiorossspectra[[;; (l - TMax)]], audiorossspectra[[(T + 1) ;; (T + l - TMax)]], 4,
        DistanceFunction -> spectralDistanceSlope]}], Range[0, TMax]]];

ListLinePlot[{eucliddistM + euclidError, eucliddistM, eucliddistM - euclidError}, PlotRange -> All]
```

The behaviour obtained is strikingly similar to the original Rössler one! Also, once again we get similar results for two different distances. The exact values obtained depend on the sampling rate of the audio and on how we encode the values into frequencies, but the functional form doesn't lie.
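The two spectral distances are easy to prototype outside the notebook as well. This Python sketch (illustrative only, on a made-up spectrum) highlights the design difference: the slope distance, computed on consecutive differences, ignores constant offsets between spectra, while the Euclidean one does not:

```python
import math

def spectral_euclidean(x, y):
    """Plain Euclidean distance between two spectra."""
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(x, y)))

def spectral_slope(x, y):
    """Euclidean distance between the consecutive differences
    ("slopes") of two spectra: sensitive to shape, not to offset."""
    dx = [b - a for a, b in zip(x, x[1:])]
    dy = [b - a for a, b in zip(y, y[1:])]
    return math.sqrt(sum((a - b) ** 2 for a, b in zip(dx, dy)))

spec = [0.0, 1.0, 0.25, 0.75, 0.5]
shifted = [v + 3.0 for v in spec]            # same shape, constant offset
print(spectral_euclidean(spec, shifted))     # sqrt(5 * 3^2): offset counts
print(spectral_slope(spec, shifted))         # -> 0.0: only the shape matters
```

Which of the two is "right" depends on whether an overall level shift in the spectrum should count as a change; for the audio experiment above, both led to similar MI curves.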
### Video

Since we've briefly touched on audio, the next step is video files! Let's consider this animation of a double pendulum:

```mathematica
Module[{deqns, aeqns, ics, params},
 deqns = {Subscript[m, 1] x1''[t] == (λ1[t]/Subscript[l, 1]) x1[t] - (λ2[t]/Subscript[l, 2]) (x2[t] - x1[t]),
   Subscript[m, 1] y1''[t] == (λ1[t]/Subscript[l, 1]) y1[t] - (λ2[t]/Subscript[l, 2]) (y2[t] - y1[t]) - Subscript[m, 1] g,
   Subscript[m, 2] x2''[t] == (λ2[t]/Subscript[l, 2]) (x2[t] - x1[t]),
   Subscript[m, 2] y2''[t] == (λ2[t]/Subscript[l, 2]) (y2[t] - y1[t]) - Subscript[m, 2] g};
 aeqns = {x1[t]^2 + y1[t]^2 == Subscript[l, 1]^2,
   (x2[t] - x1[t])^2 + (y2[t] - y1[t])^2 == Subscript[l, 2]^2};
 ics = {x1[0] == 1, y1[0] == 0, x1'[0] == 0, y1'[0] == 0,
   x2[0] == 1, y2[0] == -1, x2'[0] == 0, y2'[0] == 0};
 params = {g -> 9.81, Subscript[m, 1] -> 1, Subscript[m, 2] -> 1,
   Subscript[l, 1] -> 1, Subscript[l, 2] -> 1};
 soldp = First[NDSolve[{deqns, aeqns, ics} /. params,
    {x1, y1, x2, y2, λ1, λ2}, {t, 0, 1000},
    Method -> {"IndexReduction" -> {Automatic, "IndexGoal" -> 0}}]];
 animationdoublependulum = ListAnimate[frames, AnimationRunning -> False]]
```

What's its information content? We can compute the mutual information from its components:

```mathematica
doublependulum1mi = MonitoredMap[Function[{T},
    KraskovI1DM[soldp[[1, 2]][Range[0, 30, 0.025]], soldp[[1, 2]][Range[0, 30, 0.025] + T], 4]],
   Range[0, 7.5, 0.025]];
doublependulum2mi = MonitoredMap[Function[{T},
    KraskovI1DM[soldp[[3, 2]][Range[0, 30, 0.025]], soldp[[3, 2]][Range[0, 30, 0.025] + T], 4]],
   Range[0, 7.5, 0.025]];
{doublependulum1cr, doublependulum2cr} = {
  CorrelationFunction[soldp[[1, 2]][Range[0, 37.5, 0.025]], {300}],
  CorrelationFunction[soldp[[3, 2]][Range[0, 37.5, 0.025]], {300}]};

Rasterize@ListLinePlot[{doublependulum1mi, doublependulum2mi}, PlotRange -> All]
```

In the above plot, confidence regions (the red bands seen in the previous plots) are omitted for simplicity. Also, the time samples are spaced 1/40 s apart. How can we extract the same estimate from the animation itself? We attempted this in many different ways. In particular, we noticed that simply applying DimensionReduce in 3 dimensions gives an interesting result:

```mathematica
doublependulumreduced = DimensionReduce[frames, 3, Method -> "TSNE"];
ListPointPlot3D[doublependulumreduced, Boxed -> False, Axes -> False]
```

Using the "TSNE" method just on the list of frames, we get quite a nice trajectory.
If we plug the data points obtained into our estimator:

```mathematica
{dpframesmi, dpframeserror} = Transpose@With[{TMax = 300, l = Length[doublependulumreduced]},
    MonitoredMap[Function[{T},
      {KraskovI1DM[doublependulumreduced[[;; (l - TMax)]], doublependulumreduced[[(T + 1) ;; (T + l - TMax)]], 4],
       miError[KraskovI1DM, doublependulumreduced[[;; (l - TMax)]], doublependulumreduced[[(T + 1) ;; (T + l - TMax)]], 4]}],
     Range[0, TMax]]];

Rasterize@ListLinePlot[{dpframesmi + dpframeserror, dpframesmi, dpframesmi - dpframeserror},
  PlotRange -> {All, {0, 2}},
  PlotStyle -> {Directive[Red, Thin, Opacity[0.3]], Directive[ColorData[97][1], Thick],
    Directive[Red, Thin, Opacity[0.3]]},
  Filling -> {1 -> {3}}, Frame -> True,
  FrameLabel -> {"Time delay (frames of video)", "Mutual Information"},
  PlotLegends -> Placed[{{"Error"}, {}}, Above], PlotTheme -> "Detailed"]
```

Through this procedure, we get a behaviour quite similar to the MI estimated on the first pendulum's coordinate! Since the frames contain both pendulums, a small contribution from the second pendulum shows up in the fact that the valleys and the sixth peak are a bit higher than in the first pendulum's MI.
## Kármán vortex street

So far we've seen examples of systems we could reproduce entirely through a system of differential equations. What about a system whose behaviour we do not know beforehand? An interesting one is the Kármán vortex street: imagine a fluid flowing in one direction with a fixed obstacle in the way. At low speeds the fluid will mostly just go around the obstacle without much hassle, but as the speed increases (typically at Reynolds numbers ⩾ 90) vortices start forming behind the obstacle and are transported along with the flow, forming a kind of path. The video below shows a (simulated) example of what we just described:

```mathematica
karmanVortexVideoCF = Import["https://upload.wikimedia.org/wikipedia/commons/b/b3/Karman_Vortex_Street_Off_Cylinder.ogv"];
```

We can see that this particular video shows quite repetitive, regular behaviour. We followed the same procedure as for the double pendulum, reducing the frames to 3 dimensions with DimensionReduce and the TSNE method (we also trim the video to the interesting part):

```mathematica
karmanVortexVideoCFframes = ImageResize[#, 240] & /@ VideoFrameList[VideoTrim[karmanVortexVideoCF, {5, 16}], All];
karmanfeatures = DimensionReduce[ColorConvert[karmanVortexVideoCFframes, "Grayscale"], 3, Method -> "TSNE"];

ListPointPlot3D[karmanfeatures, Boxed -> False, Axes -> False]
```

The results are quite interesting: there is a clear periodicity in the plot, with just a bit of noise. Notice that we had to convert the frames to grayscale in order to use DimensionReduce.
Let's see whether we can extract some indication of this behaviour from the MI estimator:

```mathematica
{resultskarmanDR, errorskarmanDR} = Transpose@With[{TMax = 100, l = Length[karmanfeatures]},
    MonitoredMap[Function[{T},
      {KraskovI1DM[karmanfeatures[[;; (l - TMax)]], karmanfeatures[[(T + 1) ;; (T + l - TMax)]], 4,
        DistanceFunction -> EuclideanDistance],
       miError[KraskovI1DM, karmanfeatures[[;; (l - TMax)]], karmanfeatures[[(T + 1) ;; (T + l - TMax)]], 4,
        DistanceFunction -> EuclideanDistance]}], Range[0, TMax]]];

Rasterize@ListLinePlot[{resultskarmanDR + errorskarmanDR, resultskarmanDR, resultskarmanDR - errorskarmanDR},
  PlotRange -> All]
```

We see that the mutual information initially goes down, but then plateaus at a value that is quite high compared with the other plots we've seen so far. We interpret this as an indication of the highly regular behaviour of the system shown in the video above: a high value of the mutual information indicates high predictability of the system's evolution. But one might still expect to see, every few frames, prominent peaks indicating the period of the system. We weren't able to see any clear peak with the method above, so we attempted another approach: since the vortices create a distortion in the background colours, following that distortion across the image might work. We used ImageKeypoints with a maximum of 30 markers:

```mathematica
keypoints = ImageKeypoints[#, "Position", MaxFeatures -> 30] & /@ karmanVortexVideoCFframes;
Show[karmanVortexVideoCFframes[[1]], Graphics[Table[{{White, Circle[p, 1]}}, {p, keypoints[[1]]}]]]
```

Now we have what can be seen as a distribution of points in 2D, for every frame.
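Before the keypoint sets can be compared by a simple vector distance, each frame's keypoints are flattened into a single coordinate vector and standardized column-wise, which is what the Standardize[Flatten /@ keypoints] step in the Mathematica code does. Here is a small Python sketch of that preprocessing, on hypothetical toy keypoints:

```python
import math

def standardize(vecs):
    """Column-wise standardization of equal-length vectors: each
    coordinate gets mean 0 and sample standard deviation 1."""
    cols = list(zip(*vecs))
    out = []
    for col in cols:
        n = len(col)
        m = sum(col) / n
        sd = math.sqrt(sum((v - m) ** 2 for v in col) / (n - 1))
        out.append([(v - m) / sd if sd else 0.0 for v in col])
    return [list(row) for row in zip(*out)]

# Toy stand-in for per-frame keypoints: 3 frames, 2 keypoints each,
# flattened to one (x1, y1, x2, y2) vector per frame.
frames_kp = [[(10.0, 5.0), (20.0, 7.0)],
             [(12.0, 6.0), (21.0, 8.0)],
             [(11.0, 4.0), (19.0, 9.0)]]
flat = [[c for pt in frame for c in pt] for frame in frames_kp]
z = standardize(flat)
print(z)
```

Standardizing matters here because raw pixel coordinates have arbitrary scale: without it, coordinates with larger ranges would dominate the Euclidean distance fed to the estimator. One caveat this sketch glosses over: flattening assumes a consistent keypoint count and ordering per frame, which ImageKeypoints does not guarantee in general.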
If we apply the MI estimator to this distribution:

```mathematica
{karmankeypoints, errorkarmankeypoints} = Transpose@With[{TMax = 100, l = Length[keypoints]},
    MonitoredMap[Function[{T},
      {KraskovI1DM[Standardize[Flatten /@ keypoints[[;; (l - TMax)]]],
        Standardize[Flatten /@ keypoints[[(T + 1) ;; (T + l - TMax)]]], 4,
        DistanceFunction -> EuclideanDistance],
       miError[KraskovI1DM, Standardize[Flatten /@ keypoints[[;; (l - TMax)]]],
        Standardize[Flatten /@ keypoints[[(T + 1) ;; (T + l - TMax)]]], 4,
        DistanceFunction -> EuclideanDistance]}], Range[0, TMax]]];

karmanpeaks = FindPeaks[karmankeypoints];
karmanpeaksframes = Extract[karmanVortexVideoCFframes, Transpose[{First /@ Rest[karmanpeaks]}]];
Rasterize@Show[ListLinePlot[{karmankeypoints + errorkarmankeypoints, karmankeypoints, karmankeypoints - errorkarmankeypoints},
   PlotRange -> All]]
```