
Facial gesture detection with Classify

Posted 8 years ago

A lot of interest has gone lately into conveniently and quickly detecting users' facial gestures from low-quality image feeds. I am going to take a stab at doing this in Mathematica here. Check out the YouTube video for a walk-through:

Facial Gesture Detection using Mathematica 10


I will take a series of images from a connected webcam and then train a machine learning process to identify specific facial gestures from those images. The proper way to control how fast one can capture images from the camera is with the device property "FrameRate"; my understanding is that this controls the frequency at which the underlying ScheduledTask runs. In testing, running this little demo with just one person in the camera's view caps out at around 15 FPS.

camera = DeviceOpen["Camera"];
camera["FrameRate"] = 15;
camera["RasterSize"] = {480, 340};
camera["Timeout"] = 900;

Take 30 frames of me smiling from my webcam and store them - then show the first 10 images.

smiles = CurrentImage[30];
smiles[[1 ;; 10]]


Do the same thing for a neutral facial state

 neutral = CurrentImage[30];
 neutral[[1 ;; 10]]


This function takes a list of elements and associates them with a feature

createAssociation[dataSet_, feature_] :=
 Table[
    dataSet[[i]] -> feature,
     {i, 1, Length[dataSet]}]
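For example (using RandomImage here just as a stand-in for actual webcam frames), the result is simply a list of rules mapping each element to the given label:

createAssociation[{RandomImage[], RandomImage[]}, "Smile"]
(* {image -> "Smile", image -> "Smile"} *)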

Here we have a function that trims the faces out of a set of images for analysis

takeFaces[dataSet_] :=
 Module[{faces},
  (* trim out the detected faces in each image; images with no detected face give {} *)
  faces = Table[
    ImageTrim[dataSet[[i]], #] & /@ FindFaces[dataSet[[i]]],
    {i, 1, Length[dataSet]}];
  (* drop images where no face was found, then keep the first face from each *)
  First /@ Select[faces, UnsameQ[#, {}] &]
  ]

Now create two separate training sets: lists of rules associating the trimmed faces from the gathered images with their labels. Show the first ten elements of both generated data sets.

smileData = createAssociation[takeFaces[smiles], "Smile"];
neutralData = createAssociation[takeFaces[neutral], "Neutral"];
smileData[[1 ;; 10]]
neutralData[[1 ;; 10]]


Create a classifier from the gathered images, using the neural network method and setting the performance goal to quality:

c1 = Classify[Join[smileData, neutralData], Method -> "NeuralNetwork",
   PerformanceGoal -> "Quality"]
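Before pointing the classifier at the live feed, a quick sanity check is possible with ClassifierMeasurements, or by asking for class probabilities on a single trimmed face. This is only a rough smoke test, since it reuses a slice of the training data rather than a proper held-out set:

(* rough accuracy on a slice of the data the classifier was trained on *)
ClassifierMeasurements[c1, Join[smileData[[1 ;; 5]], neutralData[[1 ;; 5]]], "Accuracy"]

(* class probabilities for the first face found in a fresh frame (assumes a face is visible) *)
c1[First[takeFaces[{CurrentImage[]}]], "Probabilities"]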

This is the main function that labels faces in the image. We use Fold to apply ImageCompose to the original image, nesting the calls like ImageCompose[ImageCompose[img, label1], label2]. This gives full support for as many faces as needed in the future, not just one. You can edit the styling options here if you want the text to be styled differently; I didn't generalize those options out, but that part of the function is easily modified.

imageLabel[image_, labelFunc_] :=
 Module[{img = image, faces},
  faces = FindFaces[img];
  If[Flatten[faces] === {},
   img,
   Fold[
    ImageCompose[#1, Sequence @@ #2] &,
    HighlightImage[img, fullRectangleInLines /@ faces,
     "HighlightColor" -> Blue],
    {Graphics[
        Text[Style[labelFunc[ImageTrim[img, #]], Black, Bold, Italic,
          16, Background -> White]]],
       First[#] + {(First[Last[#]] - First[First[#]])/2, -25}} & /@ faces
    ]
   ]
  ]

This is a helper function that takes the coordinates produced by FindFaces and makes a single closed Line primitive tracing the four sides of the box around the face.

fullRectangleInLines[positions_] :=
 Line[{
   {First[First[positions]], Last[First[positions]]},
   {First[First[positions]], Last[Last[positions]]},
   {First[Last[positions]], Last[Last[positions]]},
   {First[Last[positions]], Last[First[positions]]},
   {First[First[positions]], Last[First[positions]]}
   }]
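To see what it produces, feed it a bounding box in the {{xmin, ymin}, {xmax, ymax}} form that FindFaces returns (the coordinates below are made up):

fullRectangleInLines[{{50, 60}, {200, 230}}]
(* Line[{{50, 60}, {50, 230}, {200, 230}, {200, 60}, {50, 60}}] *)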

This is a simple labeling function that returns the user's name and the gesture the classifier predicts for the face in the current frame.

labelingFunc[i_Image] :=
 Column[{
   "User: William Duhe",
   (* classify the trimmed face image passed in by imageLabel *)
   "Gesture: " <> c1[i]
   }]

Here I take a single frame from the feed, which shows the facial gesture of the user below it. I do this for both smiling and neutral as a test.

imageLabel[CurrentImage[], labelingFunc]


Thanks for taking the time to read through this post and I hope that you find it useful! I might make another post where I wrap all this up in a nice toolkit that lets you train different states using buttons and then shows a live stream of the webcam labeling the user's gestures with a dynamically updating feed. Let me know if anyone would find this useful.
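For anyone who wants to try the live version right away, a minimal sketch (assuming the camera is still open and c1 has been trained as above) is to wrap the single-frame call in Dynamic:

Dynamic[
 imageLabel[CurrentImage[], labelingFunc],
 UpdateInterval -> 1/15 (* roughly match the camera frame rate *)
 ]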

POSTED BY: William Duhe
4 Replies

You earned the "Featured Contributor" badge, congratulations!

This is a great post and it has been selected for the curated Staff Picks group. Your profile is now distinguished by a "Featured Contributor" badge and displayed on the "Featured Contributor" board.

POSTED BY: Moderation Team

That's cool! Yes, please do keep us updated. I'd be very interested to see how this approach fares with a larger pool of gestures and a larger pool of people. This might actually be a useful tool for people with visual impairments who want to use video chat without missing important cues (using audio annotations instead of text, but that part is trivial).

Also: Please let us know when you're confident enough to use a series of grimaces as your password! ;)

POSTED BY: Bianca Eifert
Posted 8 years ago

I will submit another post in the next week where I show this process working with 4 different facial states and 2 users dynamically! It will be able to tell the difference between users and label them appropriately. I will also integrate voice analysis that analyzes what a user is saying and puts chat bubbles above their heads, colored based on the sentiment of what they said. Thanks for supporting the post, and if there are any suggestions on how to improve the process in the future, let me know.

POSTED BY: William Duhe
Posted 8 years ago

OK, so with a few changes I can get this to work 99% of the time with 4 facial states dynamically:

- Smile
- Frown
- Excited
- Scowl

Check out the live demo at:

https://www.youtube.com/watch?v=a39LL5wV_A4&feature=youtu.be

All I did was use validation sets of the same length as the training sets (50 images each) and set the FeatureExtractor to "FaceFeatures", which is new in Mathematica 11. I am really excited about how well this works.

c1 = Classify[
  Join[smileData[[1 ;; 50]], neutralData[[1 ;; 50]], 
   scowlData[[1 ;; 50]], excitedData[[1 ;; 50]]], 
  Method -> "NeuralNetwork", PerformanceGoal -> "Quality", 
  FeatureExtractor -> "FaceFeatures", 
  ValidationSet -> 
   Join[smileData[[51 ;; 100]], neutralData[[51 ;; 100]], 
    scowlData[[51 ;; 100]], excitedData[[51 ;; 100]]]]
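A quick way to back up that figure is ClassifierMeasurements on the held-out slices. Since those slices were also used as the ValidationSet above, treat this as a rough sanity check rather than a true test-set score:

cm = ClassifierMeasurements[c1,
   Join[smileData[[51 ;; 100]], neutralData[[51 ;; 100]],
    scowlData[[51 ;; 100]], excitedData[[51 ;; 100]]]];
cm["Accuracy"]
cm["ConfusionMatrixPlot"]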
POSTED BY: William Duhe