Community RSS Feed
http://community.wolfram.com
RSS Feed for Wolfram Community showing any discussions in tag Wolfram Cloud sorted by active[WSC18] Classifying & Converting the Major Currencies By Using Neural Nets
http://community.wolfram.com/groups/-/m/t/1382598
![enter image description here][1]
#Introduction
Throughout the years, I've traveled around the world experiencing new places, eating great food, and learning about other cultures. In order to do some of these things, I needed to use the national currency of the country I was in. Whether it was using Euros in France to buy some macarons, or using the Pound in England to buy tea, I noticed that each currency is unique and its design is a part of the country's identity. My project allows a user to input as many images of different coins, and select a currency to convert them all to. The Microsite identifies what country the coin is from, the name of the currency, the value in native currency, and the converted value.
#Gathering Data
Since the Wolfram database didn't have a large dataset of foreign coin images, we had to gather some data manually. We started out by using WebImageSearch and keywords to pull images off of Bing. Since coin designs change every couple years we used en.ucoin.com as a reference as to what the current design was. Then, we edited the set of images that the WebImageSearch gave me. We then exported the relevant images to a folder named after the coin's country of origin, and named the file after its name and value. The files were saved as .wxf. We did this process for all 42 coins and additionally pulled more data from Google Images.
![enter image description here][2]
#Different Approaches to Feeding the Neural Network
Creating the right data set and matching it with a network took 4 attempts
##Attempt 1
For the first try, we created training data by joining all of the data saved in the folders (at the time this was just Euros and Canadian Dollars) and randomized it. We threaded each folder and gave the images labels of the country. We tried to use the Classify function but found that it was not complex enough.
fullEUData =
Thread[Flatten[
Map[ImageResize[#, {50, 50}] &, Import[#]] & /@
FileNames["*", "coin_data/euro"]] -> "EU"];
fullCanadianData =
Thread[Flatten[
Map[ImageResize[#, {50, 50}] &, Import[#]] & /@
FileNames["*", "coin_data/canadian"]] -> "Canadian"];
trainingData = RandomSample@Join[fullEUData, fullCanadianData];
cf = Classify[trainingData]
![enter image description here][3]
##Attempt 2
Since I didn't have the skills to build my own neural network, my mentor suggested we use the ImageIdentify net. We were able to modify the net to fit the goal of the project and had great results with the net. However, during this attempt we found that the net was depending on the transparent or white backgrounds of the pure images.
![enter image description here][4]
##Attempt 3
After analyzing the results of the 2nd attempt, we decided to process and format the images. We removed the backgrounds from each image, randomized the brightness, contrast, and angles, and layered the coins onto a background of blended color and noise.
preprocess[image_ (*Removes background*)
] :=
ImagePad[RemoveBackground@image, 5, Padding -> None]
background // Clear;
background := (*Creates a random background*)
Blend[{ConstantImage[RandomColor[], {224, 224}],
RandomImage[1, {224, 224}, ColorSpace -> "RGB"]}, RandomReal[]];
randomLighting[
image_] := (*randomizes brightness and contrast of image*)
ImageAdjust[image, RandomReal[{-.25, .25}, 2]];
overlay[coin_Image, background_Image] := (*Layers everything together*)
ImageCompose[
ImageCrop[background, {224, 224}]
,
ImagePerspectiveTransformation[
randomLighting@ImageResize[coin, RandomReal[{60, 260}]],
IdentityMatrix[2] + RandomReal[{-.75, .75}, {2, 2}],
DataRange -> Full,
Background -> White
]
,
{RandomReal[{60, 190}], RandomReal[{70, 180}]}
]
For this attempt we also created data using a generator rather than having a pre made dataset. This helped the net train much quicker and was great at identifying 3 currencies. However, when we increased the number of currencies to 7, it struggled.
![enter image description here][5]
##Attempt 4 / Final Attempt
In the last 3 attempts we had assigned the images from each country the same label, ex: "USA" or "UK". We realized that the net could get confused due to thinking up to 7 different coins were all the same thing. So, we changed the labels from country, to country & value. Ex: "USA_0.25". Then, we joined all of the countries data and that was the fullOriginalData set. This was the final attempt and worked the best because each image of a specific value and country had its own label, therefore avoiding confusion for the neural net.
fullOriginalData =
RandomSample[
Join[fullCanadianData, fullEUData, fullUKData, fullChinaData,
fullJapanData, fullSwissData, fullUSData, fullUSData]];
We used the same process to format the images with backgrounds, but instead of using a generator to create data, we used a pre-made set. In order to create test data we split the fullData 80%, 20%.
createRandomData // ClearAll;
createRandomData[coin_ -> label_, background_] :=
Thread[overlay[preprocess@coin, background] -> label];
data1 = Table[createRandomData[RandomChoice[fullOriginalData], background],
Length[fullOriginalData]];
data2 = Table[createRandomData[RandomChoice[fullOriginalData], background],
Length[fullOriginalData]];
(*Random group of data*)
fullData = RandomSample@Flatten@Join[data1, data2];
Later on, we increased the data set from 900 to 5,000.
trained =
NetTrain[new, trainingData, ValidationSet -> testdata,
BatchSize -> 20, MaxTrainingRounds -> 30]
##Accuracy
The results of the neural net were extremely accurate. With a 99.58% accuracy rate on training data, and a 99.52% accuracy rate on test data, the program is able to make an accurate guess every time.
![enter image description here][6]
##Assigning the Output with a List of Characteristics
The net outputs the label which gives the country and value, ex: "USA_0.25". In order to have the microsite's output look better, we created a list of lists that included written out characteristics.
assignments =
Association[ {"CAN_1" -> {"Canada", "CanadianDollars", 1.00},
"CAN_2" -> {"Canada", "CanadianDollars", 2.00},
"CAN_0.50" -> {"Canada", "CanadianDollars", 0.50},
"CAN_0.25" -> {"Canada", "CanadianDollars", 0.25},
"CAN_0.10" -> {"Canada", "CanadianDollars", 0.10},
"CAN_0.05" -> {"Canada", "CanadianDollars", 0.05},
"CAN_0.01" -> {"Canada", "CanadianDollars", 0.01},
"EU_1" -> {"European Union" , "Euros", 1.00},
"EU_0.50" -> {"European Union", "Euros", 0.50},
"EU_0.20" -> {"European Union", "Euros", 0.20},
"EU_2" -> {"European Union", "Euros", 2.00},
"EU_0.10" -> {"European Union", "Euros", 0.10},
"EU_0.05" -> {"European Union", "Euros", 0.05},
"EU_0.02" -> {"European Union", "Euros", 0.02},
"EU_0.01" -> {"European Union", "Euros", 0.01},
"UK_1" -> {"United Kingdom", "BritishPounds", 1.00},
"UK_2" -> {"United Kingdom", "BritishPounds", 2.00},
"UK_0.50" -> {"United Kingdom", "BritishPounds", 0.50},
"UK_0.20" -> {"United Kingdom", "BritishPounds", 0.20},
"UK_0.10" -> {"United Kingdom", "BritishPounds", 0.10},
"UK_0.05" -> {"United Kingdom", "BritishPounds", 0.05},
"UK_0.02" -> {"United Kingdom", "BritishPounds", 0.02},
"UK_0.01" -> {"United Kingdom", "BritishPounds", 0.01},
"CHN_1" -> {"China", "ChineseYuan", 1.00},
"CHN_0.50" -> {"China", "ChineseYuan", 0.50},
"CHN_0.10" -> {"China", "ChineseYuan", 0.10},
"JPN_1" -> {"Japan", "Yen", 1}, "JPN_5" -> {"Japan", "Yen", 5},
"JPN_10" -> {"Japan", "Yen", 10},
"JPN_50" -> {"Japan", "Yen", 50},
"JPN_100" -> {"Japan", "Yen", 100},
"JPN_500" -> {"Japan", "Yen", 500},
"SUI_5" -> {"Switzerland", "SwissFrancs", 5.00},
"SUI_2" -> {"Switzerland", "SwissFrancs", 2.00},
"SUI_1" -> {"Switzerland", "SwissFrancs", 1.00},
"SUI_0.50" -> {"Switzerland", "SwissFrancs", 0.50},
"SUI_0.20" -> {"Switzerland", "SwissFrancs", 0.20},
"SUI_0.05" -> {"Switzerland", "SwissFrancs", 0.05},
"USA_0.25" -> {"United States", "USDollars", 0.25},
"USA_0.10" -> {"United States", "USDollars", 0.10},
"USA_0.05" -> {"United States", "USDollars", 0.05},
"USA_0.01" -> {"United States", "USDollars", 0.01}
}];
The second element in the sublist is the name of the currency saved in Mathematica. This string is an input for the function CurrencyConvert. Not only does the Microsite list this value as the name of the currency, but it also fetches it to use in the conversion feature.
##Deploying the Microsite
Using CloudDeploy and FormFunction, we made a microsite that allows the user to input as many images as they'd like from the 7 currencies into the drop box. The site also allows the user to select one of the 7 currencies in the dropdown menu to covert all of the coins to. The output gives the image, country, name of currency, value in native currency, and value in converted currency all in table form. At the bottom of the page the output also neatly displays the total in original currencies, and the total in the converted currency.
![enter image description here][7]
![enter image description here][9]
##Future Work
In the future, this project can be expanded by:
Adding all of the currencies in the world
, Since the design changes often, date the currencies by years
, Use this to create a mobile app that allows the user to easily take a photo of the coins they have.
##Acknowledgements
This project could not have been completed without the help and insight from my mentor Rick Hennigan
[Click Here to view the Microsite][10]
##Computational Essay Is Down Below
[1]: http://community.wolfram.com//c/portal/getImageAttachment?filename=ScreenShot2018-07-13at1.11.38PM.png&userId=1371841
[2]: http://community.wolfram.com//c/portal/getImageAttachment?filename=ScreenShot2018-07-13at1.22.37PM.png&userId=1371841
[3]: http://community.wolfram.com//c/portal/getImageAttachment?filename=ScreenShot2018-07-13at1.35.35PM.png&userId=1371841
[4]: http://community.wolfram.com//c/portal/getImageAttachment?filename=ScreenShot2018-07-13at1.38.33PM.png&userId=1371841
[5]: http://community.wolfram.com//c/portal/getImageAttachment?filename=ScreenShot2018-07-13at1.45.54PM.png&userId=1371841
[6]: http://community.wolfram.com//c/portal/getImageAttachment?filename=ScreenShot2018-07-13at1.59.43PM.png&userId=1371841
[7]: http://community.wolfram.com//c/portal/getImageAttachment?filename=ScreenShot2018-07-12at7.01.17PM.png&userId=1371841
[8]: http://community.wolfram.com//c/portal/getImageAttachment?filename=ScreenShot2018-07-12at7.01.35PM.png&userId=1371841
[9]: http://community.wolfram.com//c/portal/getImageAttachment?filename=ScreenShot2018-07-13at2.09.09PM.png&userId=1371841
[10]: https://www.wolframcloud.com/objects/kennethleeny/CoinIdentifierMorgan Lee2018-07-13T18:12:01Z[WSC18] Predicting the Halting Problem with Machine Learning
http://community.wolfram.com/groups/-/m/t/1384403
# A Machine Learning Analysis of the Halting Problem over the SKI Combinator Calculus
![A rasterised SK combinator with length 50, evaluated to 5 steps][16]
## Abstract
Much of machine learning is driven by the question: can we learn what we cannot compute? The learnability of the halting problem, the canonical undecidable problem, to an arbitrarily high accuracy for Turing machines was proven by Lathrop. The SKI combinator calculus can be seen as a reduced form of the untyped lambda calculus, which is Turing-complete; hence, the SKI combinator calculus forms a universal model of computation. In this vein, the growth and halting times of SKI combinator expressions is analysed and the feasibility of a machine learning approach to predicting whether a given SKI combinator expression is likely to halt is investigated.
## 1. SK Combinators
What we will refer to as 'SK Combinators' are expressions in the SKI combinator calculus, a simple Turing-complete language introduced by Schönfinkel (1924) and Curry (1930). In the same way that NAND gates can be used to construct any expression in Boolean logic, SK combinators were posed as a way to construct any expression in predicate logic, and being a reduced form of the untyped lambda calculus, any functional programming language can be implemented by a machine that implements SK combinators. While implementations of this language exist, these serve little functional purpose - instead, this language, a simple idealisation of transformations on symbolic expressions, provides a useful tool for studying complex computational systems.
### 1.1 Rules and Expressions
Formally, SK combinator expressions are binary trees whose leaves are labelled either '*S*', '*K*' or '*I*': each tree *(x y)* represents a function *x* applied to an argument *y*. When the expression is evaluated (i.e. when the function is applied to the argument), the tree is transformed into another tree, the 'value'. The basic 'rules' for evaluating combinator expressions are given below:
*k[x][y] := x*
The K combinator or 'constant function': when applied to *x*, returns the function *k[x]*, which when applied to some *y* will return *x*.
*s[x][y][z] := x[z][y[z]]*
The S combinator or 'fusion function': when applied to *x, y, z*, returns *x* applied to *z*, which is in turn applied to the result of *y* applied to *z*.
*i[x] := x*
The I combinator or 'identity function': when applied to *x*, returns *x*.
Note that the I combinator *I[x]* is equivalent to the function *S[K][a][x]*, as the latter will evaluate to the former in two steps:
*S[K][a][x]*
*= K[x][a[x]]*
*= x*
Thus the I combinator is redundant as it is simply 'syntactic sugar' - for the purposes of this exploration it will be ignored.
These rules can be expressed in the Wolfram Language as follows:
SKRules={k[x_][y_]:> x,s[x_][y_][z_]:> x[z][y[z]]}
### 1.2 Evaluation
The result of applying these rules to a given expression is given by the following functions:
SKNext[expr_]:=expr/.SKRules;
Returns the next 'step' of evaluation of the expression *expr* - evaluating all functions in *expr* according to the rules above without evaluating any 'new'/transformed functions.
SKEvaluate[expr_,n_]:=NestList[#1/.SKRules&,expr,n];
Returns the next *n* steps of evaluation of the expression *expr*
SKEvaluateUntilHalt[expr_,n_] := FixedPointList[SKNext,expr,n+1];
Returns the steps of evaluation of *expr* until either it reaches a fixed point or it has been evaluated for n steps, whichever comes first.
Note that, due to the Church-Rosser theorem (Church and Rosser, 2018), the order in which the rules are applied does not affect the final result, as long as the combinator evaluates to a fixed point / 'halts'. For combinators with no fixed point, which do not halt, the behaviour demonstrated as they evaluate could change based on the order of application of the rules - this is not explored here and is a topic for potential future investigation.
### 1.3 Examples
The functions above can be used to evaluate a number of interesting SK combinator expressions:
Column[SKEvaluateUntilHalt[s[k][a][x],10][[1;;-2]]]
[//]: # (No rules defined for Output)
The *I* combinator
Column[SKEvaluateUntilHalt[s[k[s[i]]][k][a][b],10][[1;;-2]]]
[//]: # (No rules defined for Output)
The reversal expression - *s[k][s[i]][k][a][b]* takes two terms, *a* and *b*, and returns *b[a]*.
## 2. Growth and Halting
### 2.1 Halting and Related Works
We will define a combinator expression to have halted if it has reached a fixed point - i.e. if no combinators in the expression can be evaluated, or if evaluating any of the combinators in the expression returns the original expression. As SK combinators are Turing-complete and so computationally universal, it is evident that the halting problem - determining whether or not a given SK combinator expression will halt - is undecidable for SK combinators. There are, however, patterns and trends in the growth of SK combinators, and it is arguably possible to speak of the probability of a given SK combinator expression halting.
Some investigations (Lathrop 1996) and (Calude and M. Dumitrescu 2018) have been made into probabilistically determining the halting time of Turing machines, with [2] proving that it is possible to compute some value K where for some arbitrary predetermined confidence *(1-\[Delta])* and accuracy *(1-\[Epsilon]),* a program that does
A. Input a Turing machine M and program I.
B. Simulate M on I for K steps.
C. If M has halted then print 1, else print 0.
D. Halt.
has a probability greater than *(1-δ)* of having an accuracy (when predicting whether or not a program will halt) greater than *(1-ε).* The key result of this is that, in some cases 'we can learn what we cannot compute' - 'learning' referring to Valiant's formal analysis as 'the phenomenon of knowledge acquisition in the absence of specific programming' (Valiant 1984).
### 2.2 Definitions and Functions
The size of a combinator expression can either be measured by its length (total number of characters including brackets) or by its leaf size (number of 's' and 'k' characters). We use the former in most cases, and the latter when randomly generating combinator expressions.
The number of possible combinator expressions with leaf size *n* is given by
SKPossibleExpressions[n_]:=(2^n)*Binomial[2*(n-2),n-1]/n
(Wolfram, 2002), which grows exponentially.
#### 2.2.1 Visualisation
We define a function to visualise the growth of a combinator, *SKRasterize*:
SKArray[expr_,n_]:=Characters/@ToString/@SKEvaluate[expr,n];
SKArray[expr_]:=SKArray[expr,10];
Generates a list of the steps in the growth of a combinator, where each expression is itself a list of characters ('s', 'k', '[', ']')
SKGrid[exp_,n_]:=ArrayPlot[SKArray[exp,n],{ColorRules->{"s"->RGBColor[1,0,0],"k"->RGBColor[0,1,0],"["->RGBColor[0,0,1],"]"->RGBColor[0,0,0]},PixelConstrained->True,Frame->False,ImageSize->1000}];
SKGrid[exp_]:=SKGrid[exp,10];
Generates an ArrayPlot of a list given by SKArray, representing the growth of a combinator in a similar manner to that of cellular automata up to step n. The y axis represents time - each row is the next expression in the evaluation of an SK combinator. Red squares indicate 'S', green squares indicate 'K', blue squares indicate '[' and black squares indicate ']'.
SKRasterize[func_,n_]:=Image[SKGrid[func,n][[1]]];
SKRasterize[func_]:=SKRasterize[func,10];
Generates a rasterized version of the ArrayPlot.
A visualisation of a given combinator can easily be produced, as follows:
SKRasterize[s[s[s]][s][s][s][k],15]
[//]: # (No rules defined for Output)
![The longest running halting expression with leaf size 7, halting in 12 steps (Wolfram, 2002)][1]
The longest running halting expression with leaf size 7, halting in 12 steps (Wolfram, 2002)
#### 2.2.2 Halting graphs
We can create a table of the length (string length) of successive combinator expressions as they evaluate as follows:
SKLengths[exp_,n_]:=StringLength/@ToString/@SKEvaluate[exp,n];
Returns a list of the lengths of successive expressions until step *n*
These can be plotted as a graph (x axis number of steps, y axis length of expression):
SKPlot[expr_,limit_]:=ListLinePlot[SKLengths[expr,limit],AxesOrigin->{1,0},AxesLabel->{"Number of steps","Length of expression"}];
Thus, a graph of the above combinator can be produced:
SKPlot[s[s[s]][s][s][s][k],15]
[//]: # (No rules defined for Output)
![A graph of the above combinator][2]
It is evident from the graph that this combinator halts at 12 steps.
#### 2.2.3 Random SK combinators
To empirically study SK combinators, we need a function to randomly generate them. Two methods to do this were found:
RecursiveRandomSKExpr[0,current_]:=current;
RecursiveRandomSKExpr[depth_,current_]:=
RecursiveRandomSKExpr[depth-1,
RandomChoice[{
RandomChoice[{s,k}][current],
current[RecursiveRandomSKExpr[depth-1,RandomChoice[{s,k}]]]
}]
];
RecursiveRandomSKExpr[depth_Integer]:=RecursiveRandomSKExpr[depth,RandomChoice[{s,k}]];
A recursive method, repeatedly appending either a combinator to the 'head' of the expression or a randomly generated combinator expression to the 'tail' of the expression. (Hennigan)
replaceWithList[expr_,pattern_,replaceWith_]:=ReplacePart[expr,Thread[Position[expr,pattern]->replaceWith]];
treeToFunctions[tree_]:=ReplaceRepeated[tree,{x_,y_}:>x[y]];
randomTree[leafCount_]:=Nest[ReplacePart[#,RandomChoice[Position[#,x]]->{x,x}]&,{x,x},leafCount-2];
RandomSKExpr[leafCount_]:=treeToFunctions[replaceWithList[randomTree[leafCount],x,RandomChoice[{s,k},leafCount]]];
Random combinator generation based on generation of binary trees - each combinator can be expressed as a binary tree with leaves 'S' or 'K'. (Parfitt, 2017)
While the first method gives a large spread of combinators with a variety of lengths, and is potentially more efficient, for the purposes of this exploration the second is more useful, as it limits the combinators generated to a smaller, more controllable sample space (for a given leaf size).
### 2.3 Halting Graphs
All combinators of leaf sizes up to size 6 evolve to fixed points (NKS):
exprs = Table[RandomSKExpr[6],10];
ImageCollage[Table[ListLinePlot[SKLengths[exprs[[n]],40]],{n,10}],Background->White]
[//]: # (No rules defined for Output)
![10 randomly generated combinators of size 6, with their lengths plotted until n=40.][3]
10 randomly generated combinators of size 6, with their lengths plotted until n=40.
As the leaf size increases, combinators take longer to halt, and some show exponential growth:
exprs = Table[RandomSKExpr[10],10];
ImageCollage[Table[ListLinePlot[SKLengths[exprs[[n]],40]],{n,10}],Background->White]
[//]: # (No rules defined for Output)
![10 randomly generated combinators of size 10, with their lengths plotted until n=20.][4]
10 randomly generated combinators of size 10, with their lengths plotted until n=20.
exprs = Table[RandomSKExpr[30],10];
ImageCollage[Table[ListLinePlot[SKLengths[exprs[[n]],40]],{n,10}],Background->White]
[//]: # (No rules defined for Output)
![10 randomly generated combinators of size 30, with their lengths plotted until n=40.][5]
10 randomly generated combinators of size 30, with their lengths plotted until n=40.
CloudEvaluate[exprs = Table[RandomSKExpr[50],10];
ImageCollage[Table[ListLinePlot[SKLengths[exprs[[n]],40]],{n,10}],Background->White]]
[//]: # (No rules defined for Output)
![10 randomly generated combinators of size 50, with their lengths plotted until n=40.][6]
10 randomly generated combinators of size 50, with their lengths plotted until n=40.
After evaluating a number of these combinators, it appears that they tend to either halt or grow exponentially - some sources (Parfitt, 2017) reference linear growth combinators, however none of these have been encountered as yet.
### 2.4 Halting Times
With a random sample of combinators, we can plot a cumulative frequency graph of the number of combinators that have halted at a given number of steps:
SKHaltLength[expr_,n_]:=Module[{x},
x=Length[SKEvaluateUntilHalt[expr,n+1]];
If[x>n,False,x]
]
Returns the number of steps it takes the combinator *expr* to halt; if *expr* does not halt within n steps, returns *False*.
GenerateHaltByTable[depth_,iterations_,number_]:=Module[{exprs,lengths},
exprs = Monitor[Table[RandomSKExpr[depth],{n,number}],n];
lengths = Monitor[Table[SKHaltLength[exprs[[n]],iterations],{n,number}],n];
Return[lengths]
]
Generates a table of the halt lengths of *number* random combinator expressions (*False* if they do not halt within *iterations* steps) with leaf size *depth*.
GenerateHaltData[depth_,iterations_,number_]:=Module[{haltbytable,vals},
haltbytable = GenerateHaltByTable[depth,iterations,number];
vals = BinCounts[Sort[haltbytable],{1,iterations+1,1}];
Table[Total[vals[[1;;n]]],{n,1,Length[vals]}]
]
Generates a table of the number of *number* random combinator expressions (*False* if they do not halt within *iterations* steps) with leaf size *depth* that have halted after a given number of steps
GenerateHaltGraph[depth_,iterations_,number_]:=Module[{cumulative,f},
cumulative=GenerateHaltData[depth,iterations,number];
f=Interpolation[cumulative];
{ListLinePlot[cumulative,PlotRange->{Automatic,{0,number}},GridLines->{{},{number}},Epilog-> {Red,Dashed,Line[{{0,cumulative[[-1]]},{number,cumulative[[-1]]}}]},AxesOrigin->{1,0},AxesLabel->{"Number of steps","Number of combinators halted"}],cumulative[[-1]]}
]
Plots a graph of the above data.
#### 2.4.1 Halting Graphs
We analyse halt graphs of random samples of 1000 combinators (to leaf size 30):
CloudEvaluate[GenerateHaltGraph[10,30,1000]]
[//]: # (No rules defined for Output)
![Leaf size 10: almost all combinators in the sample (997) have halted (99.7%).][7]
Leaf size 10: almost all combinators in the sample (997) have halted (99.7%).
CloudEvaluate[GenerateHaltGraph[20,30,1000]]
[//]: # (No rules defined for Output)
![Leaf size 20: 979 combinators in the sample have halted (97.9%).][8]
Leaf size 20: 979 combinators in the sample have halted (97.9%).
CloudEvaluate[GenerateHaltGraph[30,30,1000]]
[//]: # (No rules defined for Output)
![Leaf size 30: 962 combinators in the sample have halted (96.2%).][9]
Leaf size 30: 962 combinators in the sample have halted (96.2%).
CloudEvaluate[GenerateHaltGraph[40,30,1000]]
[//]: # (No rules defined for Output)
![Leaf size 40: 944 combinators in the sample have halted (94.4%).][10]
Leaf size 40: 944 combinators in the sample have halted (94.4%).
CloudEvaluate[GenerateHaltGraph[50,30,1000]]
[//]: # (No rules defined for Output)
![Leaf size 50: 889 combinators in the sample have halted (88.9%).][11]
Leaf size 50: 889 combinators in the sample have halted (88.9%).
Evidently, the rate of halting of combinators in the sample decreases as number of steps increases - the gradient of the graph is decreasing. As the graph levels out at around 30 steps, we will assume that the number of halting combinators will not increase significantly beyond this point.
As the leaf size increases, fewer combinators in the sample have halted by 30 steps - however, the graph still levels out, suggesting most of the combinators which have not halted by this point will never halt.
#### 2.4.2 Halting Times and Leaf Size
We can plot a graph of the number of halted combinators against leaf size:
CloudEvaluate[ListLinePlot[Table[{n,GenerateHaltGraph[n,30,1000][[2]]},{n,5,50,1}]]]
![A graph to show the number of combinators which halt within 30 steps in each of 45 random samples of 1000 combinators, with leaf size varying from 5 to 50.][12]
A graph to show the number of combinators which halt within 30 steps in each of 45 random samples of 1000 combinators, with leaf size varying from 5 to 50.
This graph shows that, despite random variation, the number of halted combinators decreases as the leaf size increases: curve fitting suggests that this follows a negative quadratic function.
FitData[data_,func_]:=Module[{fitd},fitd={Fit[data[[1,2,3,4,1]],func,x]};{fitd,Show[ListPlot[data[[1,2,3,4,1]],PlotStyle->Red],Plot[fitd,{x,5,50}]]}]
A curve-fitting function: plots the curve of best fit for *data* with some combination of functions *func*.
FitData[%,{1,x,x^2}]
[//]: # (No rules defined for Output)
![Curve-fitting on the data with a quadratic function][13]
{1012.07 - 1.18915 x - 0.0209805 x^2}
Curve-fitting on the data with a quadratic function yields a reasonably accurate curve of best fit.
## 3. Machine Learning Analysis of SK Combinators
The graphs above suggest that the majority of halting SK combinators with leaf size <=50 will halt before ~30 steps. Thus we can state that, for a randomly chosen combinator, it is likely that if it does not halt before 40 steps, it will never halt. Unfortunately a lack of time prohibited a formal analysis of this, in the vein of Lathrop's work - this is an area for future research.
We attempt to use modern machine learning methods to predict the likelihood of a given SK combinator expression halting before 40 steps:
### 3.1 Dataset Generation
We implement a function *GenerateTable* to produce tables of random SK expressions:
SKHaltLength[expr_,n_]:=Module[{x},
x=Length[SKEvaluateUntilHalt[expr,n+1]];
If[x>n,False,x]
]
Returns the number of steps *expr* takes to halt if the given expression *expr* halts within the limit given (*limit*), otherwise returns *False*
GenerateTable[depth_,iterations_,number_]:=Module[{exprs,lengths},
exprs = Monitor[Table[RandomSKExpr[depth],{n,number}],n];
lengths = Monitor[Table[exprs[[n]]-> SKHaltLength[exprs[[n]],iterations],{n,number}],n];
lengths = DeleteDuplicates[lengths];
Return[lengths]
]
Returns a list of *number* expressions with leaf size *depth* whose elements are associations with key *expression* and value *number of steps taken to halt* if the expression halts within *iterations* steps, otherwise *False*.
*GenerateTable* simply returns tables random SK expressions - as seen earlier, these tend to be heavily skewed datasets as around 90% of random expressions generated will halt. Thus we must process this dataset to create a balanced training dataset: this is done with the function *CreateTrainingData*:
CreateTrainingData[var_]:=Module[{NoHalt,Halt,HaltTrain,Train},
NoHalt = Select[var,#[[2]]==False&];
Halt = Select[var,#[[2]]==True&];
HaltTrain = RandomSample[Halt,Length[NoHalt]];
Train = Join[HaltTrain,NoHalt];
Return[Train]
];
Counts the number of non-halting combinators in *var* (assumption is this is less than number of halting combinators), selects a random sample of halting combinators of this length and concatenates the lists.
ConvertSKTableToString[sktable_]:=Table[ToString[sktable[[n,1]]]-> sktable[[n,2]],{n,1,Length[sktable]}];
Converts SK expressions in a table generated with *GenerateTable* to strings
We also implement a function to create rasterised training data (where instead of an individual SK combinator associated with either True or False, an image of the first 5 steps of evaluation of the combinator is associated with either True or False):
CreateRasterizedTrainingData[var_]:=Module[{NoHalt,Halt,HaltTrain,HaltTrainRaster,NoHaltTrainRaster,RasterTrain},
NoHalt = Select[var,#[[2]]==False&];
Halt = Select[var,#[[2]]==True&];
HaltTrain = RandomSample[Halt,Length[NoHalt]];
HaltTrainRaster=Monitor[Table[SKRasterize[HaltTrain[[x,1]],5]-> HaltTrain[[x,2]],{x,1,Length[HaltTrain]}],x];
NoHaltTrainRaster=Monitor[Table[SKRasterize[NoHalt[[x,1]],5]-> NoHalt[[x,2]],{x,1,Length[NoHalt]}],x];
RasterTrain = Join[HaltTrainRaster,NoHaltTrainRaster];
Return[RasterTrain]
];
Counts the number of non-halting combinators in *var* (assumption is this is less than number of halting combinators), selects a random sample of halting combinators of this length, evaluates and generates images of both halting and non-halting combinators and processes them into training data (image->True/False).
### 3.2 Markov Classification
#### 3.2.1 Training
As a first attempt, we generate 2000 random SK expressions with leaf size 5, 2000 expressions with leaf size 10 ... 2000 expressions with leaf size 50, evaluated up to 40 steps:
lengths = Flatten[Table[GenerateTable[n,40,2000],{n,5,50,5}]]
We convert all non-False halt lengths to 'True':
lengths = lengths/.(a_->b_)/;!(b===False):> (a->True);
We process the data and train a classifier using the Markov method:
TrainingData = CreateTrainingData[lengths];
TrainingData2 = ConvertSKTableToString[TrainingData];
HaltClassifier1 = Classify[TrainingData2,Method->"Markov"]
[//]: # (No rules defined for Output)
#### 3.2.2 Testing
We must now generate test data, using the same parameters for generating random combinators:
testlengths = Flatten[Table[GenerateTable[n,40,2000],{n,5,50,5}]]
testlengths = testlengths/.(a_->b_)/;!(b===False):> (a->True);
TestData = CreateTrainingData[testlengths];
TestData2 = ConvertSKTableToString[TestData];
The classifier can now be assessed for accuracy using this data:
TestClassifier1 = ClassifierMeasurements[HaltClassifier1,TestData2]
[//]: # (No rules defined for Output)
#### 3.2.3 Evaluation
A machine learning solution to this problem is only useful if the accuracy is greater than 0.5 (i.e. more accurate than a random coin flip). We test the accuracy of the classifier:
TestClassifier1["Accuracy"]
0.755158
This, while not outstanding, is passable for a first attempt. We find the training accuracy:
ClassifierInformation[HaltClassifier1]
![Classifier Information][14]
The training accuracy (71.3%) is slightly lower than the testing accuracy (75.5%) - this is surprising, and is probably due to a 'lucky' testing dataset chosen.
We calculate some statistics from a confusion matrix plot:
TestClassifier1["ConfusionMatrixPlot"]
![Confusion Matrix Plot][15]
Accuracy: 0.76
Misclassification rate: 0.24
Precision (halt): 0.722 (when 'halt' is predicted, how often is it correct?)
True Positive Rate: 0.83 (when the combinator halts, how often is it classified as halting?)
False Positive Rate: 0.32 (when the combinator doesn't halt, how often is it classified as halting?)
Precision (non-halt): 0.799 (when 'non halt' is predicted, how often is it correct?)
True Negative Rate: 0.68 (when the combinator doesn't halt, how often is it classified as not halting?)
False Negative Rate: 0.17 (when the combinator halts, how often is it classified as not halting?)
A confusion matrix plot shows that the true positive rate is larger than the true negative rate - this would suggest that it is easier for the model to tell when an expression halts than when an expression does not halt. This could be due to the model detecting features suggesting very short run time in the initial string - for instance, a combinator k[k][<expression>] would evaluate immediately to k and halt - however, these 'obvious' features are very rare.
### 3.3 Random Forest Classification on Rasterised Expression Images
Analysing strings alone, without any information about how they are actually structured or how they might evaluate, could well be a flawed method - one might argue that, in order to predict halting, one would need more information about how the program runs. Hence, another possible method is to generate a dataset of visualisations of the first 5 steps of a combinator's evaluation as follows:
SKRasterize[RandomSKExpr[50],5]
![A rasterised SK combinator with length 50, evaluated to 5 steps][16]
and feed these into a machine learning model. Although it might seem that this method is pointless - we are already evaluating the combinators to 5 steps, and we are training a model on a database of combinators evaluated to 40 steps to predict if a combinator will halt in <=40 steps, the point of the exercise is less to create a useful resource than to investigate the feasibility of applying machine learning to this type of problem. If more computational power was available, a dataset of combinators evaluated to 100 steps (when even more combinators will have halted) could be created: in such a case a machine learning model to predict whether or not a combinator will halt in <=100 steps would be a practical approach as the time taken to evaluate a combinator to 100 steps is exponentially longer than that taken to evaluate a combinator to 5 steps.
3.3.1 Training
We generate a dataset of 2000 random SK expressions with leaf size 5, 2000 expressions with leaf size 10 ... 2000 expressions with leaf size 50, evaluated up to 40 steps:
rasterizedlengths = Flatten[Table[GenerateTable[n,40,2000],{n,5,50,5}]];
In order to train a model on rasterised images, we must evaluate all SK expressions in the dataset to 5 steps and generate rasterised images of these:
RasterizedTrainingData = CreateRasterizedTrainingData[rasterizedlengths];
We then train a classifier on this data:
RasterizeClassifier=Classify[RasterizedTrainingData,Method->"RandomForest"]
#### 3.3.2 Testing
We must now generate test data, using the same parameters for generating random training data:
testrasterizedlengths = Flatten[Table[GenerateTable[n,40,2000],{n,5,50,5}]];
testrasterizedlengths = testrasterizedlengths/.(a_->b_)/;!(b===False):> (a->True);
TestRasterizedData = CreateRasterizedTrainingData[testrasterizedlengths];
The classifier can now be assessed for accuracy using this data:
TestRasterizeClassifier=ClassifierMeasurements[RasterizeClassifier,TestRasterizedData]
[//]: # (No rules defined for Output)
#### 3.3.3 Evaluation
A machine learning solution to this problem is only useful if the accuracy is greater than 0.5 (i.e. more accurate than a random coin flip). We test the accuracy of the classifier:
TestRasterizeClassifier["Accuracy"]
0.876891
This is significantly better than the Markov approach (75.5%). We find the training accuracy:
ClassifierInformation[RasterizeClassifier]
![enter image description here][17]
Again, the training accuracy (85.5%) is slightly lower than the testing accuracy (87.7%).
We calculate some statistics from a confusion matrix plot:
TestRasterizeClassifier["ConfusionMatrixPlot"]
![Confusion Matrix Plot][18]
Accuracy: 0.88
Misclassification rate: 0.12
Precision (halt): 0.911 (when 'halt' is predicted, how often is it correct?)
True Positive Rate: 0.83 (when the combinator halts, how often is it classified as halting?)
False Positive Rate: 0.08 (when the combinator doesn't halt, how often is it classified as halting?)
Precision (non-halt): 0.848 (when 'non halt' is predicted, how often is it correct?)
True Negative Rate: 0.92 (when the combinator doesn't halt, how often is it classified as not halting?)
False Negative Rate: 0.17 (when the combinator halts, how often is it classified as not halting?)
A confusion matrix plot shows that the false negative rate is larger than the false positive rate - this would suggest that it is easier for the model to tell when an expression halts than when an expression does not halt. The precision for halting is much higher than the precision for non-halting, indicating that if the model suggests a program will halt, this is much more likely to be correct than if it suggested that the program would not halt. An (oversimplified) way to look at this intuitively is to examine some graphs of lengths of random combinators:
![Random combinator length graphs][19]
Looking at combinators that halt (combinators for which the graph flattens out), some combinators 'definitely halt' - their length decreases until the graph flattens out:
!['definitely halts'][20]
'definitely halts' (1)
Some combinators have length that increases exponentially :
![exponentially increasing combinator length graph][21]
'possibly non-halting' (2)
And some combinators appear to have increasing length but suddenly decrease:
![increasing then decreasing combinator length graph][22]
'possibly non-halting' (3)
We do not know which features of the rasterised graphic the machine learning model extracts to make its prediction, but if, say, it was classifying based purely on length of the graphic, it would identify combinators like (1) as ' definitely halting', but would not necessarily be able to distinguish between combinators like (2) and combinators like (3), which both appear to be non - halting initially.
On a similar note, some functional programming languages (e.g. Agda - [7]) have the ability to classify a function as 'definitely halting' or 'possibly non-halting', just like our classifier, whose dataset is trained on functions that either 'definitely halt' (halt in <= 40 steps) or are 'possibly non-halting' (do not halt in <= 40 steps - might halt later).
### 3.4 Table of Comparison
![A table comparing statistics for Markov and Random Forest models][23]
### 4. Conclusions and Further Work
#### 4.1 Conclusions
The results of this exploration were somewhat surprising, in that a machine learning approach to determining whether or not a program will terminate appears to some extent viable - out of all the methods attempted, the random forest classifier applied to a rasterised image of the first five steps of the evaluation of a combinator achieved the highest accuracy of 0.88 on a test dataset of 1454 random SK combinator expressions. Note, though, that what is actually being determined here is whether or not a combinator will halt before some n steps (here, n=40) - we are classifying between combinators that 'definitely halt' and combinators which are 'possibly non-halting'.
### 4.2 Microsite
As an extension to this project, a Wolfram microsite was created and is accessible at [https://www.wolframcloud.com/objects/euan.l.y.ong/SKCombinators](https://www.wolframcloud.com/objects/euan.l.y.ong/SKCombinators) - within this microsite, a user can view a rasterised image of a combinator, a graph of the length of the combinator as it is evaluated, a statistical analysis of halting time relative to other combinators with the same leaf size and a machine learning prediction of whether or not the combinator will halt within 40 steps.
![Microsite Screenshot][24]
A screenshot of the microsite evaluating a random SK combinator expression
### 4.3 Implications, Limitations and Further Work
Although the halting problem is undecidable, the field of termination analysis - attempting to determine whether or not a given program will eventually terminate - has a variety of applications, for instance in program verification. Machine learning approaches to this problem would not only help explore this field in new ways but could also be implemented in, for instance, software debuggers.
The principal limitations of this method are that we are only predicting whether or not a combinator will halt in a finite number *k* of steps - while this could be a sensible idea if k is large, at present this system is very impractical due to small datasets and a small value of *k* used to train the classifier (*k *= 40). Another issue with the machine learning technique used is that the visualisations have different dimensions (longer combinators will generate longer images), and when the images are preprocessed and resized before being fed into the random forest model, downsampling/upsampling can lead to loss of data decreasing the accuracy of the model.
From a machine learning perspective, attempts at analysis of the rasterised images with a neural network could well prove fruitful, as would an implementation of a vector representation of syntax trees to allow the structure of SK combinators (nesting combinators) to be accurately extracted by a machine learning model.
Future theoretical research could include a deeper exploration of Lathrop's probabilistic method of determining *k*, an investigation of the 'halting' features the machine learning model is looking for within the rasterised images, a more general analysis of SK combinators (proofs of halting / non-halting for certain expressions, for instance) to uncover deeper patterns, or even an extension of the analysis carried out in the microsite to lambda calculus expressions (which can be transformed to an 'equivalent' SK combinator expression).
## Acknowledgements
We thank the mentors at the 2018 Wolfram High School Summer Camp - Andrea Griffin, Chip Hurst, Rick Hennigan, Michael Kaminsky, Robert Morris, Katie Orenstein, Christian Pasquel, Dariia Porechna and Douglas Smith - for their help and support writing this paper.
## References
A. Church and J. B. Rosser: Some properties of conversion. Transactions of the American Mathematical Society, 39 (3): 472\[Dash]482, 2018
R. H. Lathrop: On the learnability of the uncomputable. ICML 1996: 302-309.
C. S. Calude and M. Dumitrescu: A probabilistic anytime algorithm for the halting problem. Computability 7(2-3): 259-271, 2018.
L. G. Valiant: A theory of the learnable. Communications of the Association for Computing Machinery 27, (11): 1134-1142, 1984.
S. Wolfram: A New Kind of Science. 1121-1122, 2002.
E. Parfitt: Ways that combinators evaluate from the Wolfram Community\[LongDash]A Wolfram Web Resource. (2017)
http://community.wolfram.com/groups/-/m/t/965400
S-C Mu: Agda.Termination.Termination http://www.iis.sinica.edu.tw/~scm/Agda/Agda-Termination-Termination.html
Attached is a Wolfram Notebook (.nb) version of this computational essay.
[1]: http://community.wolfram.com//c/portal/getImageAttachment?filename=1.png&userId=1371970
[2]: http://community.wolfram.com//c/portal/getImageAttachment?filename=2.png&userId=1371970
[3]: http://community.wolfram.com//c/portal/getImageAttachment?filename=3.png&userId=1371970
[4]: http://community.wolfram.com//c/portal/getImageAttachment?filename=4.png&userId=1371970
[5]: http://community.wolfram.com//c/portal/getImageAttachment?filename=5.png&userId=1371970
[6]: http://community.wolfram.com//c/portal/getImageAttachment?filename=6.png&userId=1371970
[7]: http://community.wolfram.com//c/portal/getImageAttachment?filename=7.png&userId=1371970
[8]: http://community.wolfram.com//c/portal/getImageAttachment?filename=8.png&userId=1371970
[9]: http://community.wolfram.com//c/portal/getImageAttachment?filename=9.png&userId=1371970
[10]: http://community.wolfram.com//c/portal/getImageAttachment?filename=10.png&userId=1371970
[11]: http://community.wolfram.com//c/portal/getImageAttachment?filename=11.png&userId=1371970
[12]: http://community.wolfram.com//c/portal/getImageAttachment?filename=12.png&userId=1371970
[13]: http://community.wolfram.com//c/portal/getImageAttachment?filename=13.png&userId=1371970
[14]: http://community.wolfram.com//c/portal/getImageAttachment?filename=Classify1.png&userId=1371970
[15]: http://community.wolfram.com//c/portal/getImageAttachment?filename=15.png&userId=1371970
[16]: http://community.wolfram.com//c/portal/getImageAttachment?filename=16.png&userId=1371970
[17]: http://community.wolfram.com//c/portal/getImageAttachment?filename=17.png&userId=1371970
[18]: http://community.wolfram.com//c/portal/getImageAttachment?filename=18.png&userId=1371970
[19]: http://community.wolfram.com//c/portal/getImageAttachment?filename=19.png&userId=1371970
[20]: http://community.wolfram.com//c/portal/getImageAttachment?filename=20.png&userId=1371970
[21]: http://community.wolfram.com//c/portal/getImageAttachment?filename=21.png&userId=1371970
[22]: http://community.wolfram.com//c/portal/getImageAttachment?filename=22.png&userId=1371970
[23]: http://community.wolfram.com//c/portal/getImageAttachment?filename=23.png&userId=1371970
[24]: http://community.wolfram.com//c/portal/getImageAttachment?filename=24.png&userId=1371970Euan Ong2018-07-14T23:56:25Z[WSC18] Streaming Live Phone Sensor Data to the Wolfram Language
http://community.wolfram.com/groups/-/m/t/1386358
# Streaming Live Phone Sensor Data to the Wolfram Language
(This forms Part 1 of a 2-part community post: "Using Machine Learning Models for Accelerometer-based Gesture Recognition" - part 2 is available at http://community.wolfram.com/groups/-/m/t/1386392)
![Live demo of sensor streaming][1]
Not only are smartphones wonderful ways to stay connected to the digital world, they also contain an astonishing array of sensors making them ideal for scientific and computational experimentation.
The Wolfram Language (WL) has extraordinary data processing and scientific computing abilities - the only sensors, however, from which they can read data are specialised and either somewhat expensive or require a significant amount of setup. On a high level, the WL has baked-in support for a variety of devices - specifically the Raspberry Pi, Vernier Go!Link compatible sensors, Arduino microcontrollers, webcams and devices using the RS-232 or RS-422 serial protocol (http://reference.wolfram.com/language/guide/UsingConnectedDevices.html); unfortunately, there is no easy way to access sensor data from Android or iOS mobile devices.
In this post, I will attempt to combine the two, demonstrating
1. A UDP socket-based method for transmission of (general) sensor data from an Android phone to the Wolfram Language (based on this excellent community post: http://community.wolfram.com/groups/-/m/t/344278 which does the same for iPhones)
2. A web-based, platform-agnostic method for transmission of IMU / inertial motion unit data (i.e. accelerometer and gyroscope data) from a phone to the Wolfram Language.
# Socket Transmission
On the Google Play Store, there exist a number of Android apps which can transmit live sensor data to a computer over UDP sockets - for instance, "Sensorstream IMU+GPS" ([https://play.google.com/store/apps/details?id=de.lorenz_fenster.sensorstreamgps][1]). Unfortunately, the WL does not support receipt and transmission of data over UDP sockets - while there exists a *Socket* library, as of 2018 this is only capable of dealing with TCP. Thus, to use UDP sockets in the WL, we must implement our own library using JLink to access Java socket packages from the WL. (Credit is due to http://community.wolfram.com/groups/-/m/t/344278 - the code here was slightly outdated so had to be modified.)
## Instructions
To send accelerometer (or other sensor) data from your phone to Wolfram over UDP sockets:
1. Install the "Sensorstream IMU+GPS" app
2. Ensure the sensors you want to stream to Wolfram are ticked on the 'Toggle Sensors' page. (If you want to stream other sensors besides 'Accelerometer', 'Gyroscope' and 'Magnetic Field', ensure the 'Include User-Checked Sensor Data in Stream' box is ticked. Beware, though - the more sensors are ticked, the more latency the sensor stream will have.)
3. On the "Preferences" tab:
a. Change the target IP address in the app to the IP address of your computer (ensure your computer and phone are connected to the same local network)
b. Set the target port to 5555
c. Set the sensor update frequency to 'Fastest'
d. Select the 'UDP stream' radio box
e. Tick 'Run in background'
4. Switch stream ON **before** executing code. (nb. ensure your phone does not fall asleep during streaming - perhaps use the 'Caffeinate' app (https://play.google.com/store/apps/details?id=xyz.omnicron.caffeinate&hl=en_US) to ensure this.)
5. Execute the following WL code (in part from http://community.wolfram.com/groups/-/m/t/344278):
Initialise JLink
QuitJava[];
Needs["JLink`"];
InstallJava[];
Initialise a socket connection - ensure *5555* is the target port set
udpSocket=JavaNew["java.net.DatagramSocket",5555];
Function that reads *size* bytes of a function.
readSocket[sock_,size_]:=JavaBlock@Block[{datagramPacket=JavaNew["java.net.DatagramPacket",Table[0,size],size]},sock@receive[datagramPacket];
datagramPacket@getData[]]
Function that reads from the socket, processes data and 'sows' it to be collected later
listen[]:=record=DeleteCases[readSocket[udpSocket,1200],0]//FromCharacterCode//Sow;
Initialises the results list and repeatedly appends accelerometer data to it every 0.01 seconds - if the list is over 700 elements long, the 150 oldest elements (at start of list) are removed.
results={};RunScheduledTask[AppendTo[results,Quiet[Reap[listen[]]]];If[Length[results]>700,Drop[results,150]],0.01];
Initialises the stream list to be refreshed every 0.01 seconds with the most recent 500 elements of results. Each element of results is a string of transmitted socket data (e.g. "225585.00455, 3, -1.591, 8.624, 5.106, 4, -0.193, -0.690, -0.072") - this is split into a list of strings {"225585.00455", "3", "-1.591"...} and each string is converted to a numerical expression.
stream:=Refresh[ToExpression[StringSplit[#[[1]],","]]& /@ Select[results[[-500;;]],Head[#]==List&],UpdateInterval-> 0.01]
*Stream* now contains the 500 most recent accelerometer readings, stored in an array. The values of Stream will be updated whenever the variable is used within a *Dynamic*. (Note that, with the default sensors enabled - the first three boxes ticked on the *Toggle Sensors* tab - the x, y and z coordinates of the accelerometer can be accessed at elements 3, 4 and 5 in each list in the array. (e.g. to access the most recent accelerometer reading, run stream[[-1,3;;5]])
The accelerometer data can then be visualised using a *ListLinePlot*:
While[Length[results]<500,Pause[2]];Dynamic[Refresh[ListLinePlot[{stream[[All,3]],stream[[All,4]],stream[[All,5]]},PlotRange->All],UpdateInterval->0.1]]
![A list line plot of accelerometer data][2]
The 'pulses' (i.e. shaking the phone) were carried out every second; from this it is evident that the frequency of data transmission is 50 Hz (i.e. data is sent every 0.02 seconds).
To get the most recent accelerometer data, run
Dynamic[stream[[-1,3;;5]]]
To end socket transmission, turn off the stream on the app, run
RemoveScheduledTask[ScheduledTasks[]];
udpSocket@close[];
QuitJava[];
and ensure the process 'JLink' is quit in Task Manager / Activity Monitor etc - if it is not closed properly, you will be unable to create another socket from that port.
# Channel Transmission
An alternative way to send data from a phone to the Wolfram Cloud is by using the Channel framework. Introduced in version 11 of the Wolfram Language. the Channel framework allows asynchronous communication between Wolfram sessions as well as external systems, with communication being brokered in the Wolfram Cloud. A key point to note about the Channel framework is that it is based on a publish-subscribe model, allowing messages to be sent and received through a 'channel' rather than pairing specific senders and receivers.
##Instructions
To transmit accelerometer data, run the following code: (for other sensors, see the bottom of the page)
ChannelDeploySensorPage[func_]:=Module[{listener,listenerurl,SensorHTML,c,url,u},
CloudConnect[];
listener=ChannelListen["Sensors",func[#Message]&,Permissions->"Public"];
listenerurl = listener["URL"];
SensorHTML="<!DOCTYPE html><html lang=en><meta charset=UTF-8><title>Sensors</title><script src=https://cdn.jsdelivr.net/npm/gyronorm@2.0.6/dist/gyronorm.complete.min.js></script><script>function makeXHR(n,t,o){var e=Date.now(),r=(Math.random(),new XMLHttpRequest);r.withCredentials=!0;var i=\""<>listenerurl<>"?operation=send&time=\"+e.toString()+\"&x=\"+n.toString()+\"&y=\"+t.toString()+\"&z=\"+o.toString();r.open(\"GET\",i,!0),r.send()}function init(){var n={frequency:100,gravityNormalized:!0,orientationBase:GyroNorm.WORLD,decimalCount:2,logger:null,screenAdjusted:!1},t=new GyroNorm;t.init(n).then(function(){t.start(function(n){makeXHR(n.dm.x,n.dm.y,n.dm.z)})})}window.onload=init</script>";
c = CloudExport[SensorHTML,"HTML",Permissions->"Public"];
u=URLShorten[c[[1]]];
Return[{u,BarcodeImage[u,"QR"],listener}]
]
Then run
c = ChannelDeploySensorPage[Func]
![Output - a QR code][3]
(where the argument *func* is some function to be called whenever the channel receives a new point of data from the phone - the argument given to *func* is an association such as the one below:)
<|x=3, y=4, z=1|>
Now, simply scan the QR code generated with your phone, and sensor data will be streamed from your phone to the computer.
The data transmitted can be viewed as a time series as follows:
c[[3]]["TimeSeries"]
[//]: # (No rules defined for Output)
The accelerometer data can also be plotted with the following Dynamic: (red --> x, green --> y, blue --> z):
Dynamic[ListLinePlot[ToExpression/@Reverse[Take[Reverse[#["Values"]],UpTo[100]]]&/@c[[3]]["TimeSeries"][[2;;4]],PlotRange->{All,{-50,50}},PlotStyle->{Red, Green, Blue}]]
When you're done, delete the channel by running
RemoveChannelListener[c[[3]]]
##Explanation
Setting up a channel is as easy as connecting to the Wolfram Cloud
CloudConnect[];
and typing
current="";
Func[x_]:=current=x;
listener=ChannelListen["NameOfChannel",Func[#Message]&, Permissions->"Public"]
![A channel listener][4]
Here, *Func* is a function that will be called each time the channel receives a message (the message is supplied as an argument to the function) - it simply sets the variable 'current' to the data last sent to the channel (in the form of key-value pairs - e.g. <|x=3, y=4, z=1|>. To make the channel accessible to other users, ensure the channel has Permissions set to Public.
To delete the channel (useful when debugging), call
RemoveChannelListener[listener];
One particularly useful feature of the Channel is that it has built-in support for receiving and parsing HTTP requests - simply send a GET request to the channel URL (given by *listener["URL"]*) and the WL will automatically parse the parameters and make the data available to the user:
For instance, if we send an HTTP GET request to *https://channelbroker.wolframcloud.com/users/<your Wolfram Cloud email address>/NameOfChannel* and append the parameters "*operation=send*" (indicates data is being sent to the channel) and "*test=5*":
BaseURL=listener["URL"]
Params = "?operation=send&test=5";
URLRead[HTTPRequest[BaseURL<>Params,<|Method->"Get"|>]]
[//]: # (No rules defined for Output)
The variable 'current' has now been updated and contains the key-value pair 'test->5' which we just sent to the channel.
current
<|"test" -> "5"|>
[//]: # (No rules defined for Output)
current[["test"]]
5
[//]: # (No rules defined for Output)
An alternative way of viewing the data from the channel is to call
listener["TimeSeries"]
[//]: # (No rules defined for Output)
This allows the data sent to the channel to be stored as a time series, which can be useful in applications such as collecting time-based sensor data.
### Transmission of Sensor Data over Channels
As demonstrated earlier, a nice feature of Channels is that data can be sent to the Wolfram Language over HTTP - instead of fiddling with JLink and sockets (which tend to be laggy and break easily), one can simply create a web page that streams sensor data to a channel.
For Android devices (running Google Chrome), there exist a range of built-in sensor APIs giving a web page access to raw accelerometer, gyroscope, light sensor and magnetometer data, and processed linear acceleration (i.e. total acceleration experienced by a device disregarding that produced by gravity), absolute orientation and relative orientation sensors. Documentation for these sensors exists online at https://developers.google.com/web/updates/2017/09/sensors-for-the-web.
Unfortunately, for iOS devices there does not exist an easy way to access sensors from the web - although one can use DeviceMotion events (https://developers.google.com/web/fundamentals/native-hardware/device-orientation/), the data these give can vary significantly from browser to browser (e.g. different browsers might use different coordinate systems), so training a machine learning model on gesture data produced by this method would require either retraining a model for each browser or significant processing of data based on browser.
However, there is another solution - namely, the gyronorm.js API (https://github.com/dorukeker/gyronorm.js), which claims to return 'consistent [gyroscope and accelerometer] values across different devices'. Using this, we construct a simple web page to transmit accelerometer data to a Wolfram Language channel called 'Sensors': (While the following code focuses on extracting accelerometer data, it is a trivial task to change the sensor being polled to read, for instance, gyroscope data in the Wolfram Language instead.)
<!DOCTYPE html>
<html lang=en>
<meta charset=UTF-8>
<title>Sensors</title>
<script src=https://cdn.jsdelivr.net/npm/gyronorm@2.0.6/dist/gyronorm.complete.min.js></script>
<script>
function makeXHR(x,y,z){
var t=Date.now();
r=new XMLHttpRequest;
r.withCredentials=true;
var i="https://channelbroker.wolframcloud.com/users/euan.l.y.ong@gmail.com/Sensors?operation=send&time="+t.toString()+"&x="+x.toString()+"&y="+y.toString()+"&z="+z.toString();
r.open("GET",i,!0);
r.send()
}
function init(){
//Explanations are from the GyroNorm GitHub page. (https://github.com/dorukeker/gyronorm.js/)
var n={
frequency:100, //send values every 100 milliseconds
gravityNormalized:!0, // Whether or not to normalise gravity-related values
orientationBase:GyroNorm.WORLD, // ( Can be Gyronorm.GAME or GyroNorm.WORLD. gn.GAME returns orientation values with respect to the head direction of the device. gn.WORLD returns the orientation values with respect to the actual north direction of the world. )
decimalCount:2, // How many digits after the decimal point to return for each value
logger:null,
screenAdjusted:!1
};
t=new GyroNorm;
t.init(n).then(function(){
t.start(function(data){
makeXHR(data.dm.x,data.dm.y,data.dm.z)
//Other possible values to substitute for data.dm.x, data.dm.y, data.dm.z are:
// data.do.alpha ( deviceorientation event alpha value )
// data.do.beta ( deviceorientation event beta value )
// data.do.gamma ( deviceorientation event gamma value )
// data.do.absolute ( deviceorientation event absolute value )
// data.dm.x ( devicemotion event acceleration x value )
// data.dm.y ( devicemotion event acceleration y value )
// data.dm.z ( devicemotion event acceleration z value )
// data.dm.gx ( devicemotion event accelerationIncludingGravity x value )
// data.dm.gy ( devicemotion event accelerationIncludingGravity y value )
// data.dm.gz ( devicemotion event accelerationIncludingGravity z value )
// data.dm.alpha ( devicemotion event rotationRate alpha value )
// data.dm.beta ( devicemotion event rotationRate beta value )
// data.dm.gamma ( devicemotion event rotationRate gamma value )
})
})
}
window.onload=init;
</script>
</html>
This webpage, when opened on an Android or iOS phone, will stream data to the 'Sensors' channel, sending a new HTTP request every 100 milliseconds. (Decreasing the 'frequency' leads to more frequent results, but can cause atrocious levels of lag.)
### Producing the ChannelDeploySensorPage function
Although this webpage allows accelerometer data to be transmitted from a phone to a computer, for it to be used it must be deployed on a server. To change, for instance, the channel name, one would need to edit the file on the server itself, which can quickly become a tiresome process. Thus, we developed a function which autogenerates the required HTML code and stores it in the Wolfram Cloud as a CloudObject where it can easily be accessed. The function also outputs a QR code, to allow mobile users to quickly navigate to the web page. (The argument *func* is simply the function to be called whenever the channel receives a new point of data from the phone.)
## Alternative Sensors
ChannelDeploySensorPage functions for accessing sensors other than the accelerometer can be found below: (For more information about sensors and readings, check out https://developers.google.com/web/fundamentals/native-hardware/device-orientation/)
### Device Orientation (alpha, beta, gamma, absolute):
ChannelDeploySensorPageDeviceOrientation[func_]:=Module[{listener,listenerurl,SensorHTML,c,url,u},
CloudConnect[];
listener=ChannelListen["Sensors",func[#Message]&,Permissions->"Public"];
listenerurl = listener["URL"];
SensorHTML="<!DOCTYPE html><html lang=en><meta charset=UTF-8><title>Sensors</title><script src=https://cdn.jsdelivr.net/npm/gyronorm@2.0.6/dist/gyronorm.complete.min.js></script><script>function makeXHR(n,t,o,abs){var e=Date.now(),r=(Math.random(),new XMLHttpRequest);r.withCredentials=!0;var i=\""<>listenerurl<>"?operation=send&time=\"+e.toString()+\"&alpha=\"+n.toString()+\"&beta=\"+t.toString()+\"&gamma=\"+o.toString()+\"&absolute=\"+abs.toString();r.open(\"GET\",i,!0),r.send()}function init(){var n={frequency:100,gravityNormalized:!0,orientationBase:GyroNorm.WORLD,decimalCount:2,logger:null,screenAdjusted:!1},t=new GyroNorm;t.init(n).then(function(){t.start(function(n){makeXHR(n.do.alpha,n.do.beta,n.do.gamma,n.do.absolute)})})}window.onload=init</script>";
c = CloudExport[SensorHTML,"HTML",Permissions->"Public"];
u=URLShorten[c[[1]]];
Return[{u,BarcodeImage[u,"QR"],listener}]
]
### Device Motion - Acceleration Including Gravity (x, y, z):
ChannelDeploySensorPageAccelerationGravity[func_]:=Module[{listener,listenerurl,SensorHTML,c,url,u},
CloudConnect[];
listener=ChannelListen["Sensors",func[#Message]&,Permissions->"Public"];
listenerurl = listener["URL"];
SensorHTML="<!DOCTYPE html><html lang=en><meta charset=UTF-8><title>Sensors</title><script src=https://cdn.jsdelivr.net/npm/gyronorm@2.0.6/dist/gyronorm.complete.min.js></script><script>function makeXHR(n,t,o){var e=Date.now(),r=(Math.random(),new XMLHttpRequest);r.withCredentials=!0;var i=\""<>listenerurl<>"?operation=send&time=\"+e.toString()+\"&x=\"+n.toString()+\"&y=\"+t.toString()+\"&z=\"+o.toString();r.open(\"GET\",i,!0),r.send()}function init(){var n={frequency:100,gravityNormalized:!0,orientationBase:GyroNorm.WORLD,decimalCount:2,logger:null,screenAdjusted:!1},t=new GyroNorm;t.init(n).then(function(){t.start(function(n){makeXHR(n.dm.gx,n.dm.gy,n.dm.gz)})})}window.onload=init</script>";
c = CloudExport[SensorHTML,"HTML",Permissions->"Public"];
u=URLShorten[c[[1]]];
Return[{u,BarcodeImage[u,"QR"],listener}]
]
### Device Motion - Rotation Rate (alpha, beta, gamma):
ChannelDeploySensorPageRotationRate[func_]:=Module[{listener,listenerurl,SensorHTML,c,url,u},
CloudConnect[];
listener=ChannelListen["Sensors",func[#Message]&,Permissions->"Public"];
listenerurl = listener["URL"];
SensorHTML="<!DOCTYPE html><html lang=en><meta charset=UTF-8><title>Sensors</title><script src=https://cdn.jsdelivr.net/npm/gyronorm@2.0.6/dist/gyronorm.complete.min.js></script><script>function makeXHR(n,t,o){var e=Date.now(),r=(Math.random(),new XMLHttpRequest);r.withCredentials=!0;var i=\""<>listenerurl<>"?operation=send&time=\"+e.toString()+\"&x=\"+n.toString()+\"&y=\"+t.toString()+\"&z=\"+o.toString();r.open(\"GET\",i,!0),r.send()}function init(){var n={frequency:100,gravityNormalized:!0,orientationBase:GyroNorm.WORLD,decimalCount:2,logger:null,screenAdjusted:!1},t=new GyroNorm;t.init(n).then(function(){t.start(function(n){makeXHR(n.dm.alpha,n.dm.beta,n.dm.gamma)})})}window.onload=init</script>";
c = CloudExport[SensorHTML,"HTML",Permissions->"Public"];
u=URLShorten[c[[1]]];
Return[{u,BarcodeImage[u,"QR"],listener}]
]
# Applications
Real world applications of this sensor data abound - aside from the gesture recognition system described in post 2 (http://community.wolfram.com/groups/-/m/t/1386392), you could use this sort of data to make a pocket seismometer, a fall detector, electronic dice, investigate centrifugal motion, investigate friction... More examples available here: http://www.gcdataconcepts.com/examples.html. If you make a sensor-based project in Wolfram, or think of a new / innovative / interesting way to use this data, or if the code above is buggy / incomplete, please do share it in the comments below!
A Wolfram Notebook version of this post is attached.
-- Euan Ong
# References
"Using Connected Devices": http://reference.wolfram.com/language/guide/UsingConnectedDevices.html
"Using your smart phone as the ultimate sensor array for Mathematica": http://community.wolfram.com/groups/-/m/t/344278
"Capturing Data from an Android Phone using Wolfram Data Drop": http://community.wolfram.com/groups/-/m/t/461190
"Sensorstream IMU+GPS": https://play.google.com/store/apps/details?id=de.lorenz_fenster.sensorstreamgps
"Sensors For The Web!": https://developers.google.com/web/updates/2017/09/sensors-for-the-web
"gyronorm.js": https://github.com/dorukeker/gyronorm.js
[1]: http://community.wolfram.com//c/portal/getImageAttachment?filename=ezgif-3-207848dda6.gif&userId=1371970
[2]: http://community.wolfram.com//c/portal/getImageAttachment?filename=60691.png&userId=1371970
[3]: http://community.wolfram.com//c/portal/getImageAttachment?filename=42762.png&userId=1371970
[4]: http://community.wolfram.com//c/portal/getImageAttachment?filename=37343.png&userId=1371970Euan Ong2018-07-17T20:01:00Z[WSC18] Phone Gesture Recognition with Machine Learning
http://community.wolfram.com/groups/-/m/t/1386392
# Accelerometer-based Gesture Recognition with Machine Learning
![A GIF of gesture recognition in progress][1]
## Introduction
This is Part 2 of a 2-part community post - Part 1 (Streaming Live Phone Sensor Data to the Wolfram Language) is available here: http://community.wolfram.com/groups/-/m/t/1386358
As technology advances, we are constantly seeking new, more intuitive methods of interfacing with our devices and the digital world; one such method is gesture recognition. Although touchless human interface devices (kinetic user interfaces) exist and are in development, the cost and configuration required for these sometimes makes them impractical, particularly for mobile applications. A simpler method would be to use devices the user already has on their person - such as a phone or a smartwatch - to detect basic gestures, taking advantage of the wide array of sensors included in such devices. In an attempt to assess the feasibility of such a method, methods of asynchronous communication between a mobile device and the Wolfram Language are investigated, and a gesture recognition system based around an accelerometer sensor is implemented, using a machine learning model to classify a few simple gestures from mobile accelerometer data.
## Investigating Methods of Asynchronous Communication
To implement an accelerometer-based gesture recognition system, we must devise a suitable means for a mobile device to transmit accelerometer data to a computer running the Wolfram Language (WL). On a high level, the WL has baked-in support for a variety of devices - specifically the Raspberry Pi, Vernier Go!Link compatible sensors, Arduino microcontrollers, webcams and devices using the RS-232 or RS-422 serial protocol (http://reference.wolfram.com/language/guide/UsingConnectedDevices.html); unfortunately, there is no easy way to access sensor data from Android or iOS mobile devices.
On a low level, the WL natively supports TCP and ZMQ socket functionality, as well as receipt and transmission of HTTP requests and Pub-Sub channel communication. We investigate the feasibility of both methods for transmission of accelerometer data in Part 1 of this community post (http://community.wolfram.com/groups/-/m/t/1386358).
## Gesture Classification using Neural Networks
Now that we are able to stream accelerometer data to the WL, we may proceed to implement gesture recognition / classification. Due to limited time at camp, we used the UDP socket method to do this - in the future, we hope to move the system over to the (more user-friendly) channel interface.
We first configure the sensor stream, allowing live accelerometer data to be sent to the Wolfram Language:
### Configure the Sensor Stream
1. Install the "Sensorstream IMU+GPS" app ([https://play.google.com/store/apps/details?id=de.lorenz_fenster.sensorstreamgps][2])
2. Ensure the sensors you want to stream to Wolfram are ticked on the 'Toggle Sensors' page. (If you want to stream other sensors besides 'Accelerometer', 'Gyroscope' and 'Magnetic Field', ensure the 'Include User-Checked Sensor Data in Stream' box is ticked. Beware, though - the more sensors are ticked, the more latency the sensor stream will have.)
3. On the "Preferences" tab:
a. Change the target IP address in the app to the IP address of your computer (ensure your computer and phone are connected to the same local network)
b. Set the target port to 5555
c. Set the sensor update frequency to 'Fastest'
d. Select the 'UDP stream' radio box
e. Tick 'Run in background'
4. Switch stream ON **before** executing code. (nb. ensure your phone does not fall asleep during streaming - perhaps use the 'Caffeinate' app ([https://play.google.com/store/apps/details?id=xyz.omnicron.caffeinate&hl=en_US][3]) to ensure this.)
5. Execute the following WL code:
(in part from http://community.wolfram.com/groups/-/m/t/344278)
QuitJava[];
Needs["JLink`"];
InstallJava[];
udpSocket=JavaNew["java.net.DatagramSocket",5555];
readSocket[sock_,size_]:=JavaBlock@Block[{datagramPacket=JavaNew["java.net.DatagramPacket",Table[0,size],size]},sock@receive[datagramPacket];
datagramPacket@getData[];
listen[]:=record=DeleteCases[readSocket[udpSocket,1200],0]//FromCharacterCode//Sow;
results={};
RunScheduledTask[AppendTo[results,Quiet[Reap[listen[]]]];If[Length[results]>700,Drop[results,150]],0.01];
stream:=Refresh[ToExpression[StringSplit[#[[1]],","]]& /@ Select[results[[-500;;]],Head[#]==List&],UpdateInterval-> 0.01]
### Detecting Gestures
On a technical level, the problem of gesture classification is as follows: given a continuous stream of accelerometer data (or similar),
1. distinguish periods during which the user is performing a given gesture from other activities / noise and
2. identify / classify a particular gesture based on accelerometer data during that period. This essentially boils down to classification of a time series dataset, in which we can observe a series of emissions (accelerometer data) but not the states generating the emissions (gestures).
A relatively straightforward solution to (1) is to approximate the gradients of the moving averages of the x, y and z values of the data and take the Euclidean norm of these - whenever these increase above a threshold, a gesture has been made.
movingAvg[start_,end_,index_]:=Total[stream[[start;;end,index]]]/(end-start+1);
^ Takes the average of the x, y or z values of data (specified by *index* - x-->3, y-->4, z-->5) from the index *start* to the index *end*.
normAvg[start_,middle_,end_]:=((movingAvg[middle,end,3]-movingAvg[start,middle,3])/(middle-start))^2+((movingAvg[middle,end,4]-movingAvg[start,middle,4])/(middle-start))^2+((movingAvg[middle,end,5]-movingAvg[start,middle,5])/(middle-start))^2;
^ (Assuming difference from start to middle is equal to difference from middle to end:) Approximates the gradient at index *middle* using the average from *start* to *middle* and from *middle* to *end* for x, y and z values, and then takes the sum of the squares of these values. Note that we do not need to take the square root of the final answer (to find the Euclidean norm), as doing so and comparing it to some threshold *x* would be equivalent to not doing so and comparing it to the threshold *x^2* (square root is a computationally expensive operation).
Thus
Dynamic[normAvg[-155,-150,-146]]
will yield the square of the Euclidean norm of the gradients of the x, y and z values of the data (approximated by calculating the averages from the 155th most recent to 150th most recent and 150th most recent to 146th most recent values). As accelerometer data is sent to the Wolfram Language, this value will update.
### Data Collection
To train the network, we must collect gesture data. To do this, we have a variety of options - we can either represent the gesture as a tensor of 3 dimensional vectors (x,y,z accelerometer data points) and perform time series classification on these sequences of vectors using hidden Markov models or recurrent neural networks, or we can represent the gesture as a rasterised image of a graph much like the one below:
![Rasterised image of a gesture][4]
and perform image classification on the image of the graph.
Since the latter has had some degree of success (e.g. in http://community.wolfram.com/groups/-/m/t/1142260), we attempt a similar method:
PrepareDataIMU[dat_]:=Rasterize@ListLinePlot[{dat[[All,1]],dat[[All,2]],dat[[All,3]]},PlotRange->All,Axes->None,AxesLabel->None,PlotStyle->{Red, Green, Blue}];
^ Plots the data points in *dat* with no axes or axis labels, and with x coordinates in red, y coordinates in green, z coordinates in blue (this makes processing easier as Wolfram operates in RGB colours).
threshold = 0.8;
trainlist={};
appendToSample[n_,step_,x_]:=AppendTo [trainlist,PrepareDataIMU[Part[x,n;;step]]];
Dynamic[If[normAvg[-155,-150,-146]>threshold,appendToSample[-210,-70,stream],False],UpdateInterval->0.1]
^ Every 0.1 seconds, checks whether or not the normed average of the gradient of accelerometer data at the 150th most recent data point (using the *normAvg* function) is greater than the threshold - if it is, it will create a rasterised image of a graph of accelerometer data from the 210th most recent data point to the 70th most recent data point and append it to *trainlist* - a list of graphs of gestures. Patience is recommended here - there can be up to ~5 seconds' lag before a gesture appears. Ensure gestures are made reasonably vigorously.
As a first test, we attempted to generate 30 samples of each of the digits 1 to 5 drawn in the air with a phone - the images of these graphs were stored in *trainlist*. Then, we classified them as 1, 2, 3, 4 or 5, converting *trainlist* into an association with key <image of graph> and value <number>).
We split the data into training data (25 samples from each category) and test data (all remaining data):
TrainingTest[data_,number_]:=Module[{maxindex,sets,trainingdata,testdata},
maxindex=Max[Values[data]];
sets = Table[Select[data,#[[2]]==x&],{x,1,maxindex}];
sets=Map[RandomSample,sets];
trainingdata =Flatten[Map[#[[1;;number]]&,sets]];
testdata =Flatten[Map[#[[number+1;;-1]]&,sets]];
Return[{trainingdata,testdata}]
]
^ Randomly selects *number* training elements and *Length[data]-number* test elements for each value in the list *data*
gestTrainingTest=TrainingTest[gesture1to5w30,25];
gestTraining=gestTrainingTest[[1]];
gestTest=gestTrainingTest[[2]];
*gestTraining* and *gestTest* now contain key-value pairs like those below:
![Key-value pairs of accelerometer graphs and labels][5]
##### Machine Learning
To train a model on these images, we first attempt a basic *Classify*:
TrainWithClassify = Classify[gestTraining]
ClassifierInformation[TrainWithClassify]
![A poor result in ClassifierInformation][6]
Evidently, this gives a very poor training accuracy of 24.4% - given that there are 5 classes, this is only marginally better than random chance.
As the input data consists of images, we try transfer learning on an image identification neural network (specifically the VGG-16 network):
net = NetModel["VGG-16 Trained on ImageNet Competition Data"]
We remove the last few layers from the network (which classify images into the classes the network was trained on), leaving the earlier layers which perform more general image feature extraction:
featureFunction = Take[net,{1,"fc6"}]
[//]: # (No rules defined for Output)
We train a classifier using this neural net as a feature extractor:
NetGestClassifier = Classify[gestTraining,FeatureExtractor->featureFunction]
We now test the classifier using the data in *gestTest:*
NetGestTest = ClassifierMeasurements[NetGestClassifier,gestTest]
We check the training accuracy:
ClassifierInformation[NetGestClassifier]
![A better classifier information result][7]
NetGestTest["Accuracy"]
1.
NetGestTest["ConfusionMatrixPlot"]
![A good confusion matrix plot][8]
[//]: # (No rules defined for Output)
This method appears to be promising, as a training accuracy of 93.1% and a test accuracy of 100% was achieved.
## Implementation of Machine Learning Model
To use the model with live data, we use the same method of identifying gestures as before in 'Detecting Gestures' (detecting 'spikes' in the data using moving averages), but when a gesture is identified, instead of being appended to a list it is sent through the classifier:
results = {""};
ClassGestIMU[n_,step_,x_]:=Module[{aa,xa,ya},
aa = Part[x,n;;step];
xa=PrepareDataIMU[aa];
ya=gestClassifier[xa];
AppendTo[results,{Length@aa,xa,ya}]
];
Dynamic[If[normAvg[-155,-150,-146]>threshold,ClassGestIMU[-210,-70,stream],False],UpdateInterval->0.1]
Real time results (albeit with significant lag) can be seen by running
Dynamic@column[[-1]]
## Conclusions and Further Work
On the whole, this project was successful, with gesture detection and classification using rasterised graph images proving a viable method. However, the system as-is is impractical and unreliable, with a significant lag, training bias (trained to recognise the digits 1 to 5 the way they are drawn by one person only) and small sample size: these are problems that can be solved, given more time.
Further extensions to this project include:
- Serious code optimisation to reduce / eliminate lag
- An improved training interface to allow users to create their own gestures
- Integration of the gesture classification system with the Channel interface (as described earlier) and deployment of this to the cloud.
- Investigation of the feasibility of using an RNN or an LSTM for gesture classification - using a time series of raw gesture data rather than relying on rasterised images (which, although accurate, can be quite laggy). Alternatively, hidden Markov models could be used in an attempt to recover the states (gestures) that generate the observed data (accelerometer readings).
- Adding an API to trigger actions based on gestures and deployment of gesture recognition technology as a native app on smartphones / smartwatches.
- Improvement of gesture detection. At the moment, the code takes a predefined 1-2 second 'window' of data after a spike is detected - an improvement would be to detect when the gesture has ended and 'crop' the data values to include only the gesture made.
- Exploration of other applications of gesture recognition (e.g. walking style, safety, sign language recognition). Beyond limited UI navigation, a similar concept to what is currently implemented could be used with, say, a phone kept in the pocket, to analyse walking styles / gaits and, for instance, to predict or detect an elderly user falling and notify emergency services. Alternatively, with suitable methods for detecting finger motions, like flex sensors, such a system could be trained to recognise / transcribe sign language.
- Just for fun - training the model on Harry Potter-esque spell gestures, to use your phone as a wand...
A notebook version of this post is attached, along with a full version of the computational essay.
## Acknowledgements
We thank the mentors at the 2018 Wolfram High School Summer Camp - Andrea Griffin, Chip Hurst, Rick Hennigan, Michael Kaminsky, Robert Morris, Katie Orenstein, Christian Pasquel, Dariia Porechna and Douglas Smith - for their help and support during this project.
[1]: http://community.wolfram.com//c/portal/getImageAttachment?filename=SensorDemo2.gif&userId=1371970
[2]: https://play.google.com/store/apps/details?id=de.lorenz_fenster.sensorstreamgps
[3]: https://play.google.com/store/apps/details?id=xyz.omnicron.caffeinate&hl=en_US
[4]: http://community.wolfram.com//c/portal/getImageAttachment?filename=109724.png&userId=1371970
[5]: http://community.wolfram.com//c/portal/getImageAttachment?filename=12475.png&userId=1371970
[6]: http://community.wolfram.com//c/portal/getImageAttachment?filename=19276.png&userId=1371970
[7]: http://community.wolfram.com//c/portal/getImageAttachment?filename=77697.png&userId=1371970
[8]: http://community.wolfram.com//c/portal/getImageAttachment?filename=96898.png&userId=1371970Euan Ong2018-07-17T21:16:07Z[WSC18] Analyzing and visualizing chord sequences in music
http://community.wolfram.com/groups/-/m/t/1383630
During this year's Wolfram Summer Camp, being mentored by Christian Pasquel, I developed a tool that identifies chord sequences in music (from MIDI files) and generates a corresponding graph. The graph represents all [unique] chords as vertices, and connects every pair of chronologically subsequent chords with a directed edge. Here is an example of a graph I generated:
![Graph genehrated from Bach's prelude no.1 of the Well Tempered Klavier (Book I)][1]
Below is a detailed account on the development and current state of the project, plus some background on the corresponding musical theory notions.
#Introduction
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**GOAL** | The aim of this project is to develop a utility that identifies chords (e.g. C Major, A minor, G7, etc.) from MIDI files, in chronological order, and then generates a graph for visualizing that chord sequence. In the graph, each vertex would represent a unique chord, and each pair chronologically adjacent chords would be connected by a directed edge (i.e. an arrow). So, for example, if at some point in the music that is being analyzed there is a transition from a major G chord to a major C chord, there would be an arrow that goes from the G Major chord to the C Major chord. Therefore, the graph would describe a [Markov chain][2] for the chords. The purpose of the graph is to visualize frequent chord sequences and progressions within a certain piece of music.
&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;&nbsp;**MOTIVATION** | While brainstorming for project ideas, I don't know why, I had a desire to do something with graphs. Then I asked myself, "What are graphs good at modelling?". I mentally browsed through my areas of interest, searching for any that matched that requirement. One of my main interests is music; I am somewhat of a musician myself. And, in fact, [musical] harmony *is* a good subject to be modelled by graphs. Harmony, one of the fundamental pillars of music (and perhaps the most important), not only involves the chords themselves, but, more significantly, the *transitions* between those, which is what gives character to music. And directed graphs, and, specifically, Markov models, are a perfect match for transitions between states.
----------
#Some background
*Skip this if you aren't interested in the musical theory part or if you already have a background in music theory!*
##What is a chord?
A chord is basically a group of notes played together (contemporarily). Chords are the quanta of musical "feeling"; the typical—but somewhat naïve—example is the sensation of major chords sounding "happy" and minor chords sounding "sad" or melancholic (more on types of chords later).
Types of chords are defined by the [intervals][3] (distance in pitch) between the notes. The *root* of a chord is the "most important" or fundamental note of the cord, in the sense that it is the "base" from which the aforementioned intervals are measured. In other words, the archetype of chord defines the "feel" and the general harmonic properties of the chord, while the root defines the pitch of the chord. So a "C Major" chord is a chord with archetype "major triad" (more on that later) built on the note C; i.e., its root is C.
The *sequence* of chords in a piece constitutes its **harmony**, and it can convey much more complex musical messages or feelings than a single chord, just as in language: a single word does have meaning, but a sentence can have a much more complex meaning than any single word.
##Patterns in chord sequences
The main difference that between language and music is that language, in general, has a much stricter structure (i.e. the order of words, a.k.a. syntax) than music: the latter is an art, and there are no predetermined rules to follow. But humans \[have a tendency to\] like patterns, and music wouldn't be so universally beloved if it didn't contain any patterns. This also explains the unpopularity of [atonal music][4] (example [here][5]). But even atonal music has patterns: it may do its best to avoid harmonic patterns, but it still contains some level of rythmic, textural or other kinds of patterns.
This is why using graphs to visualize chord sequences is interesting: it is a semidirect way of identifying the harmonic patterns that distinguish different genres, styles, forms, pieces or even fragments of music. In my project, I have mainly focused on the "western" conception of tonal music, an particularly in its "classical" version (what I mean by "classical" is, in lack of a better definition, a classification that encompasses all music where the composer is, culturally, the most important artist). That doesn't mean this tool isn't apt for other types of music; it just means it will analyze it from this specific standpoint.
In tonal music, the harmonic patterns are all related to a certain notion of "center of gravity": the [*tonic*][6], which is, in some way the music's harmonic "home". Classical (as in pre-XX-century) tonal music usually ends (and often starts) with the tonic chord. In fact, we can further extend the analogy with gravity by saying that music consists in a game of tension, in which the closer you are to the center of gravity (the tonic), the greater the "pull". In an oversimplified manner, the musical equivalent of the [Schwarzschild radius][7] is the [dominant chord][8]: it tends towards the tonic. Well, not really, because you *can* turn back from it—and in fact a lot of interesting harmonical sequences consist in doing just that.
##Some types of chords
In "classical" music (see definition above), there are mainly these kinds of chords (based on the amount of unique notes they contain): triad chords (i.e. three-note chords), seventh chords (i.e. four-note chords; we'll see why they're called *seventh* in a bit), and ninth chords (five-note chords). There is another main distinction: major and minor chords (i.e. the cliché "happy" vs "sad" distinction).
###Triad chords
Probably the most simple and frequent chord is the triad chord (either major or minor). Here is a picture of a major and a minor triad C chord (left to right):
![Major and minor triad C chords (ltr)][9]
###Seventh chords
[Seventh chords][10] are called so because they contain a seventh [interval][11]. Their main significance is in dominant chords, where they usually appear in the major-triad-minor-seventh (a.k.a ["dominant"][12]) form. Another important seventh chord form is the fully diminished seventh chord (these will be relevant for the code later), which also tends to resolve ("resolve" is music jargon for "transition to a chord with less tension") to tonic.
![Seventh chords][13]
###Ninth chords
Although not extremely frequent, they do appear in classical music. The most "popular" is the dominant ninth chord (an extension of the dominant 7th). An alternative for this chord is the minor ninth dominant chord (built from the same dominant 7th chord, but with a minor ninth instead).
<br>
----------
#Algorithms and Code
In this section I'm going to walk through my code in order of execution. Four main parts can be distinguished in my project: importing and preprocessing, splitting the note sequence into "chunks" to be analyzed as chords, identifying the chord in each of those chunks, and visualizing the whole sequence as a graph.
##First phase: importing and preprocessing the MIDI file
The first operation that needs to be done is importing the MIDI file and preprocessing it. This includes selecting which elements to import from the file, converting them to a given simplified form, and performing any sorting, deletion of superfluous elements, or other modification that needs to be done.
For this purpose I defined the function `importMIDI`:
importMIDI[filename_String] := MapAt[Flatten[#, 1] &, MapAt[flattenAndSortSoundNotes,
Import[(dir <> filename <> ".mid"), {{"SoundNotes", "Metadata"}}],
1], 2]
Here `dir` stands for the directory where I saved all my MIDIs (to avoid having to type in the whole directory every time). Notice that we're importing the music as SoundNotes *and* the file's metadata—we will need it for determining the boundaries of measures (see below). The function `flattenAndSortNotes` does what it sound like: it converts the list of `SoundNote`s that `Import` returned into a flattened list of notes (i.e. a single track), sorted by their starting time. It also gets rid of anything that isn't necessary for chord identification (i.e. rhythmic sounds or effects). Consult the attached notebook for the explicit definition.
Here is the format the sequence of notes is returned in (i.e. `importMIDI[...][[1]]`):
{{"C4", {0., 1.4625}}, {"E4", {0.18125, 1.4625}}, {"G4", {0.36875, 0.525}}, <<562>>, {"G2", {105., 107.963}}, {"G4", {105., 107.963}}}
Each sub-list represents a note. Its first element is the actual pitch; the second is a list that represents the timespan (i.e. start and end time in seconds).
<br>
##Second phase: splitting the note sequence into chunks
The challenge in this part of the project is knowing how to determine which notes form a single chord; i.e., where to put the boundary between one chord and the next.
The solution I came up with is not optimal, but, until now, nothing better has occurred to me (suggestions are welcome!). It involves determining where each measure start/end lies in time from the metadata and splitting each of those into a certain amount of sub-parts; then the notes are grouped by the specific sub-part of the specific measure they pertain to. The rationale behind this is that chords in classical music tend to be well-contained within measures or rational fractions of these.
This procedure is contained in the function `chordSequenceAnalyzeUsingMeasures`. I'm going to go over it quickly:
chordSequenceUsingMeasures[midiData_List /; Length@midiData == 2,
measureSplit_: 2, analyzer_String: "Heuristic"] :=
Block[{noteSequence, metadata, chunkKeyframes, chunkedSequence,
result},
(*Separate notes from metadata*)
noteSequence = midiData[[1]];
metadata = midiData[[2]];
Until here it's pretty self evident.
(*Get measure keyframes*)
chunkKeyframes =
divideByN[
measureKeyframesFromMetadata[
metadata, (Last@noteSequence)[[2, 2]]], measureSplit];
Here the function `measureKeyframesFromMetadata` is called. It fetches all of the `TimeSignature` and `SetTempo` tags in the metadata and identifies the position of each measure from them. `divideByN` subdivides each measure by `measureSplit` (an optional argument with default value `2`).
(*Chunk sequence*)
chunkedSequence = {};
Module[{i = 1},
Do[
With[{k0 = chunkKeyframes[[j]], k1 = chunkKeyframes[[j + 1]]},
Module[{chunk = {}},
While[
i <= Length@noteSequence && (
k0 <= noteSequence[[i, 2, 1]] < k1 ||
k0 < noteSequence[[i, 2, 2]] <= k1 ),
AppendTo[chunk, noteSequence[[i]]] i++;];
AppendTo[chunkedSequence, chunk]
]
],
{j, Length@chunkKeyframes - 1}]];
chunkedSequence =
DeleteCases[chunkedSequence, l_List /; Length@l == 0];
Once the measures' timespan has been determined, a list of "chunks" (lists of notes grouped by measure part) is generated.
(*Call analyzer*)
Switch[analyzer,
"Deterministic", result = chordChunkAnalyze /@ chunkedSequence,
"Heuristic",
result = heuristicChordAnalyze /@ justPitch /@ chunkedSequence
];
result = resolveDiminished7th[Split[result][[All, 1]]]
]
Finally, each chunk is sent to the chord analyzer function `heuristicChordAnalyze`, which I'll talk about in the next section, along with the currently mysterious `resolveDiminished7th`.
Since this algorithm for "chunking" a note sequence doesn't work for everything, I also developed an alternative, more naïve approach:
chordSequenceNaïve[midiData_List /; Length@midiData == 2,
analyzer_String: "Heuristic", n1_Integer: 6, n2_Integer: 1] :=
Module[{noteSequence, chunkedSequence, result},
(*Separate notes from metadata*)
noteSequence = midiData[[1]];
(*Chunk sequence*)
chunkedSequence = Partition[noteSequence, n1, n2];
(*Call analyzer*)
result = heuristicChordAnalyze /@ justPitch@chunkedSequence;
result = resolveDiminished7th[Split[result][[All, 1]]]
]
<br>
##Phase 3: identifying the chord from a group of notes
This has been the main conceptual challenge in the whole project. After some unsucsessful ideas, with some suggestions from Rob Morris (one of the mentors), whom I thank, I ended up developing the following algorithm. It iterates through each note and assigns it a score that represents the likeliness of that note being the root of the chord based on the presence of certain indicators (i.e. notes the presence of which define a chord, to some degree), each of which with a different weight: having a fifth, having a third, a minor seventh... Then the note with the highest chord is assumed to be the root of the chord.
In code:
heuristicChordAnalyze[notes_List] :=
Block[{chordNotes, scores, root},
(*Calls to helper functions*)
chordNotes = octaveReduce /@ convertToSemitones /@ notes // DeleteDuplicates;
(*Scoring*)
scores = Table[Total@
Pick[
(*Score points*)
{24, 16, 16, 8, 2, 3, 1, 1,
10, 15, 15, 18},
(*Conditions*)
SubsetQ[chordNotes, #] & /@octaveReduce /@
{{nt + 7}, {nt + 4}, {nt + 3}, {nt + 10}, {nt + 11}, {nt + 2}, {nt + 5}, {nt + 9},
{nt + 4, nt + 10}, {nt + 3, nt + 6, nt + 10}, {nt + 3, nt + 6, nt + 9}, {nt + 1, nt + 4, nt + 10}}
]
(*Substract outliers*)
- 18*Length@Complement[chordNotes, octaveReduce /@ {nt, 7 + nt, 4 + nt, 3 + nt, 10 + nt, 11 + nt,
2 + nt, 5 + nt, 9 + nt, 6 + nt}],
{nt, chordNotes}];
(*Return*)
root = Part[chordNotes, Position[scores, Max @@ scores][[1, 1]]];
{root, Which[
SubsetQ[chordNotes, octaveReduce /@ {root + 10 , root + 2, root + 5, root + 9}], "13",
SubsetQ[chordNotes, octaveReduce /@ {root + 10, root + 2, root + 5}], "11",
SubsetQ[chordNotes, octaveReduce /@ {root + 4, root + 10, root + 2}], "Dom9",
SubsetQ[chordNotes, octaveReduce /@ {root + 4, root + 10, root + 1}], "Dom9m",
SubsetQ[chordNotes, octaveReduce /@ {root + 11, root + 7, root + 3}], "m7M",
SubsetQ[chordNotes, {octaveReduce[root + 11], octaveReduce[root + 4]}], "7M",
SubsetQ[chordNotes, {octaveReduce[root + 10], octaveReduce[root + 4]}], "Dom7",
SubsetQ[chordNotes, {octaveReduce[root + 10], octaveReduce[root + 7]}], "Dom7",
SubsetQ[chordNotes, {octaveReduce[root + 10], octaveReduce[root + 6]}], "d7",
SubsetQ[ chordNotes, {octaveReduce[root + 9], octaveReduce[root + 6]}], "d7d",
SubsetQ[chordNotes, {octaveReduce[root + 10], octaveReduce[root + 3]}], "m7",
MemberQ[chordNotes, octaveReduce[root + 4]], "M",
MemberQ[chordNotes, octaveReduce[root + 3]], "m",
MemberQ[chordNotes, octaveReduce[root + 7]], "5",
True, "undef"]}
]
###A note on notation
In this project I use the following abbreviations for chord notation (they're not in the standard format). "X" represents the root of the chord.
- *X-**5*** = undefined triad chord (just the root and the fifth)
- *X-**M*** = Major
- *X-**m*** = minor
- *X-**m7*** = minor triad with minor (a.k.a dominant) seventh
- *X-**d7d*** = fully diminished 7thchord
- *X-**d7*** = half diminished 7thchord
- *X-**Dom7*** = Dominant 7th chord
- *X-**7M*** = Major triad with Major 7th
- *X-**m7M*** = minor triad with Major 7th
- *X-**Dom9*** = Dominant 9th chord
- *X-**Dom9m*** = Dominant 7th chord with a minor 9th
- *X-**11*** = 11th chord
- *X-**13*** = 13th chord
###Dealing with diminished 7th chords
Now, on to `resolveDiminished7th`. What is this function on about?
Well, recall the fully diminished seventh chords I mentioned in the Background section. Here's the problem: they're completely symmetrical! What I mean by that is that the intervals between subsequent notes are identical, even if you [invert][14] the chord. In other words, the distance in semitones between notes is constant (it's 3) and is a factor of 12 (distance of 12 semitones = octave). So, given one of these chords, there is no way to determine which note is the root just by analyzing the chord itself. In the context of our algorithm, every note would have the same score!
At this point I thought: "How do humans deal with this?". And I concluded that the only way to resolve this issue is to have some contextual vision (looking at the next chord, particularly), which is how humans do it. So what `resolveDiminished7th` does is it brushes through the chord sequence stored in `result`, looking for fully diminished chords (marked with the string "d7d"), and re-assigns each of those a root by looking at the next chord:
resolveDiminished7th[chordSequence_List] :=
Module[{result},
result = Partition[chordSequence, 2, 1] /. {{nt_, "d7d"}, c2_List} :> Which[
MemberQ[octaveReduce /@ {nt, nt + 3, nt + 6, nt + 9}, octaveReduce[c2[[1]] - 1]], {{c2[[1]] - 1, "d7d"}, c2},
MemberQ[octaveReduce /@ {nt, nt + 3, nt + 6, nt + 9}, octaveReduce[c2[[1]] + 4]], {{c2[[1]] + 4, "d7d"}, c2},
MemberQ[octaveReduce /@ {nt, nt + 3, nt + 6, nt + 9}, octaveReduce[c2[[1]] + 6]], {{c2[[1]] + 6, "d7d"}, c2},
True, {{nt, "d7d"}, c2}];
result = Append[result[[All, 1]], Last[result][[2]]]
]
##Phase 4: Visualization
Basically, my visualization function (`visualizeChordSequence`) is fundamentally a highly customized call of the `Graph` function; so I'll just paste the code below and then explain what some parameters do:
visualizeChords[chordSequence_List, layoutSpec_String: "Unspecified", version_String: "Full", mVSize_: "Auto", simplicitySpec_Integer: 0, normalizationSpec_String: "Softmax"] :=
Module[{purgedChordSequence, chordList, transitionRules, weights, graphicalWeights, nOfCases, edgeStyle, vertexLabels, vertexSize, vertexStyle, vertexShapeFunction, clip},
(*Preprocess*)
Switch[version,
"Full",
purgedChordSequence =
StringJoin[toNoteName[#1], "-", #2] & @@@ chordSequence,
"Basic",
purgedChordSequence =
Split[toNoteName /@ chordSequence[[All, 1]]][[All, 1]]];
(*Amount of each chord*)
chordList = DeleteDuplicates[purgedChordSequence];
nOfCases = Table[{c, Count[purgedChordSequence, c]}, {c, chordList}];
(*Transition rules between chords*)
Switch[version,
"Full",
transitionRules =
Gather[Rule @@@ Partition[purgedChordSequence, 2, 1]],
"Basic",
transitionRules =(*DeleteCases[*)
Gather[Rule @@@ Partition[purgedChordSequence, 2, 1]](*, t_/;
Length@t\[LessEqual]2]*) ];
(*Get processed weight for each transition*)
weights = Length /@ transitionRules;
If[normalizationSpec == "Softmax", graphicalWeights = SoftmaxLayer[][weights]];;
graphicalWeights =
If[Min@graphicalWeights != Max@graphicalWeights,
Rescale[graphicalWeights,
MinMax@graphicalWeights, {0.003, 0.04}],
graphicalWeights /. _?NumericQ :> 0.03 ];
(*Final transition list*)
transitionRules = transitionRules[[All, 1]];
(*Graph display specs*)
clip = RankedMax[weights, 4];
edgeStyle =
Table[(transitionRules[[i]]) ->
Directive[Thickness[graphicalWeights[[i]]],
Arrowheads[2.5 graphicalWeights[[i]] + 0.015],
Opacity[Which[
weights[[i]] <= Clip[simplicitySpec - 2, {0, clip - 2}], 0,
weights[[i]] <= Clip[simplicitySpec, {0, clip}], 0.2,
True, 0.6]],
RandomColor[Hue[_, 0.75, 0.7]],
Sequence @@ If[weights[[i]] <= Clip[simplicitySpec - 1, {0, clip - 1}], {
Dotted}, {}] ], {i, Length@transitionRules}];
vertexLabels =
Thread[nOfCases[[All,
1]] -> (Placed[#,
Center] & /@ (Style[#[[1]], Bold,
Rescale[#[[2]], MinMax[nOfCases[[All, 2]]],
Switch[mVSize, "Auto", {6, 20}, _List,
10*mVSize[[1]]/0.3*{1, mVSize[[2]]/mVSize[[1]]}]]] & /@
nOfCases))];
vertexSize =
Thread[nOfCases[[All, 1]] ->
Rescale[nOfCases[[All, 2]], MinMax[nOfCases[[All, 2]]],
Switch[mVSize,
"Auto", (Floor[Length@chordList/10] + 1)*{0.1, 0.3}, _List,
mVSize]]];
vertexStyle =
Thread[nOfCases[[All, 1]] ->
Directive[Hue[0.53, 0.27, 1, 0.6], EdgeForm[Blue]]];
vertexShapeFunction =
Switch[version, "Full", Ellipsoid[#1, {3.5, 1} #3] &, "Basic",
Ellipsoid[#1, {2, 1} #3] &];
Graph[transitionRules,
GraphLayout ->
Switch[layoutSpec, "Unspecified", Automatic, _, layoutSpec],
EdgeStyle -> edgeStyle,
EdgeWeight -> weights,
VertexLabels -> vertexLabels,
VertexSize -> vertexSize,
VertexStyle -> vertexStyle,
VertexShapeFunction -> vertexShapeFunction,
PerformanceGoal -> "Quality"]
]
There are five main things to focus on in the above definition: the graph layout (passed as the argument `layoutSpec`), the edge thickness (defined in `edgeStyle`), the vertex size (defined in `vertexSize`), the version (passed as argument `version`) and the simplicity specification (`simplicitySpec`).
The graph layout is a `Graph` option that can be specified in the argument `layoutSpec`. If `"Unspecified"` is passed, an automatic layout will be used. I find that the best layouts tend to be, in order of preference, "BalloonEmbedding" and "RadialEmbedding"; nevertheless, neither are a perfect fit for every piece. In the future I would like to to implement custom (i.e. pre-defined) positioning, so that I can design it in a way that best fits this project.
The edge thickness is a function of the amount of times a certain transition between two chords has occurred in the chord sequence. There is an option (namely the `normalizationSpec` argument) to enable or disable using a Softmax function for assigning thicknesses to edges. This is due to the fact that for simple/short chord sequences, Softmax is actually counterproductive because it suppresses secondary but still top-ranked transitions; i.e., it assigns a very high thickness to the most frequent transition and a low thickness to all other transitions (even those that come in second or third in frequency ranking). But for large or complex sequences it is actually useful, because it "gets rid of" a lot of the \[relatively\] insignificant instances, thus making the output actually understandable (and not just a [jumbled mess of thick lines][15]).
The vertex size is proportional to the number of occurrences of each particular chord (that is, without taking into account the transitions). It can also be specified manually by passing `vSize` as a list `{a,b}` such that `a` is the minimum size an `b` is the maximum.
The `version` can be either `"Full"` or `"Basic"`; the default is `"Full"`. The `"Basic"` version consists of a simplified chord set in which only the root note of the chord is taken into account, and not the archetype. For example, all C chords (M, Dom7, m...) would be represented by a single `"C"` vertex.
Finally, the simplicity specification (`simplicitySpec`) is a number that can be thought of, in some way, as a "noise" threshold: as it gets larger, fewer edges "stand out"—that is, more of the lower-significance ones are rendered with reduced opacity or are shown as dotted lines. This is useful for large or complex sequences.
<br>
----------
#Some examples
Here I will show some specific examples generated with this tool. I tried to use different styles of music for comparison.
- **Bach**'s [prelude no.1][16] from the Well Tempered Clavier:
![Visualization of Bach's prelude no.1 ][17]
- **Debussy**'s [*Passepied*][18] from the *Suite Bergamasque*:
![Visualization of Debussy's *Passepied*][19]
- A "template" blues progression:
![Blues template][20]
- **Beethoven**'s second movement from the *Pathétique* sonata (no.8):
![Beethoven][21]
- Any "reggaeton" song (e.g. Despacito):
![Reggaeton][22]
#Microsite
Check out the form page (a.k.a. microsite) of this project [here][23]:
https://www.wolframcloud.com/objects/lammenspaolo/Chord%20sequence%20visualization
[![enter image description here][24]][23]
Briefly, here is what each option does (see the section **Algorithms and code** for a more detailed explanation):
- **Chunkifier functon**: choose between splitting notes by measures
- **Measure split factor**: choose into how many pieces you want to divide measures (each piece will be analyzed as a separate chord)
- **Graph layout**: choose the layout option for the `Graph` call
- **Normalization function**: choose whether to apply a Softmax function to the weights of edges (to make results clearer in case of complex sequences).
- **Version**: choose "Full" for complete chord info (e.g. "C-M", "D-Dom7", "C-7M"...) or "Basic" for just the root of the chord (e.g. "C", "D"...)
- **Vertex size**: specify vertex size as a list `{a,b}` where `a` is the minimum and `b` is the maximum size
- **Simplicity parameter**: visual simplification of the graph (a value of 0 means no simplification is applied)
<br>
#Conclusions
I have developed a functional tool to visualize chord sequences as graphs. It is far from perfect, though. In the future, I would like improving the positioning of vertices, being able to eliminate insignificant transitions from the graph altogether, and making other visual adjustments. Furthermore, I plan to refine and optimize the chord analyzer, as right now it is just an experimental version that isn't too accurate. A better "chunkifier" function could be developed too.
Finally, I'd like to thank my mentor Christian Pasquel and all of the other WSC staff for this amazing opportunity. I'd also like to thank my music theory teacher, Raimon Romaní, for making me, over the years, sufficiently less terrible at musical analysis to be able to undertake this project.
[1]: http://community.wolfram.com//c/portal/getImageAttachment?filename=Prelude.png&userId=1372342
[2]: https://en.wikipedia.org/wiki/Markov_chain "Wikipedia: Markov chain"
[3]: https://en.wikipedia.org/wiki/Interval_(music) "Wikipedia: Interval"
[4]: https://en.wikipedia.org/wiki/Atonality "Wikipedia: Atonality"
[5]: https://youtu.be/L85XTLr5eBE "Schönberg's 4th string quartet on YouTube"
[6]: https://en.wikipedia.org/wiki/Tonic_%28music%29 "Wikipedia: Tonic"
[7]: http://astronomy.swin.edu.au/cosmos/S/Schwarzschild+Radius "Basic info on Schwartzschild radius"
[8]: https://en.wikipedia.org/wiki/Dominant_(music) "Dominant chord"
[9]: http://community.wolfram.com//c/portal/getImageAttachment?filename=2548Macro_analysis_chords_on_C.jpg&userId=1372342
[10]: https://en.wikipedia.org/wiki/Seventh_chord "Wikipedia: Seventh chord"
[11]: https://en.wikipedia.org/wiki/Interval_(music) "Wikipedia: Interval"
[12]: https://en.wikipedia.org/wiki/Dominant_seventh_chord "Wikipedia: Dominant seventh"
[13]: http://community.wolfram.com//c/portal/getImageAttachment?filename=images.png&userId=1372342
[14]: https://en.wikipedia.org/wiki/Inversion_(music)#Chords "Wikipedia: Inversion#Chords"
[15]: http://community.wolfram.com//c/portal/getImageAttachment?filename=Passepied.png&userId=1372342 "Jumbled mess!"
[16]: https://www.youtube.com/watch?v=aengbLEFnM8
[17]: http://community.wolfram.com//c/portal/getImageAttachment?filename=Prelude.png&userId=1372342
[18]: https://www.youtube.com/watch?v=hDWbVP-5DSA "Passepied"
[19]: http://community.wolfram.com//c/portal/getImageAttachment?filename=deb_pass2.png&userId=1372342
[20]: http://community.wolfram.com//c/portal/getImageAttachment?filename=Blues.png&userId=1372342
[21]: http://community.wolfram.com//c/portal/getImageAttachment?filename=pathetique.png&userId=1372342
[22]: http://community.wolfram.com//c/portal/getImageAttachment?filename=Reggaeton.png&userId=1372342
[23]: https://www.wolframcloud.com/objects/lammenspaolo/Chord%20sequence%20visualization "Microsite"
[24]: http://community.wolfram.com//c/portal/getImageAttachment?filename=ScreenShot2018-07-19at1.53.02PM.png&userId=11733Paolo Lammens2018-07-14T05:10:03ZCloud Deploy broken with neural net (new?)
http://community.wolfram.com/groups/-/m/t/1382601
Over the last few weeks I routinely deployed neural nets to the cloud for computation. Today, even the simplest example does not work anymore:
(*initialise random neural net that talkes 200x200 image as input*)
scrnet = NetInitialize@
NetChain[{ConvolutionLayer[1, 1], PartLayer[1]},
"Input" -> {1, 200, 200}]
(*Deploy to cloud*)
cnet = CloudExport[scrnet, "MX", Permissions -> "Public"];
(*test on example image *WORKS**)
img = Import["ExampleData/ocelot.jpg"];
With[{net = CloudImport@cnet}, Image@net@{ImageData@img}]
(*Deploy as cloud form page *)
CloudDeploy[
FormPage[{"image" -> "Image"},
With[{net = CloudImport@cnet}, Image@net@{ImageData@#image}] &],
Permissions -> "Public"]
The last piece of code generates a cloud object. If I try to upload the ocelot image I get Image[Failed] as the sole output. If I do not use a net the form works fine:
CloudDeploy[FormPage[{"image" -> "Image"}, Image@ImageData@#image &],
Permissions -> "Public"]
I contacted wolfram support already and they told me to post here. Am I doing something wrong? I have ~1000 cloud credits left on a free account so that should not be the issue. Frankly, I am having a lot of issues with the documentation for cloud computing in general and am starting to wonder if it is worth the hassle.
Best, MaxMaximilian Jakobs2018-07-13T14:19:54Z