[WSC19] Tracking Algorithm For Squash


Squash is a racket sport played in a four-walled court with a soft ball. The goal of this project is to recognize the positions of the players and the ball in every frame of a recorded squash game. This data is useful for gathering statistics about a game and for studying the strategy and tactics of individual players. The task is challenging because the ball may be blurry and only a few pixels in size in any given frame, and it is sometimes obscured by the players. The players also constantly cross over each other and often occupy the same region of the frame, which makes them harder to track individually. Existing solutions to these problems involve either slow deep learning algorithms or multiple carefully positioned cameras. Here I propose an almost entirely non-ML approach to tracking the ball and a simple color-based algorithm for tracking the players.


Tracking the ball

The algorithm starts with a raw video frame. Each frame is binarized, normalized, and then binarized again with a very high threshold, which removes all of the stationary elements from the frame. After that, the algorithm detects the players and puts black boxes around them. A final round of filtering and denoising produces the processed frames:

binimages = Binarize[#, initialbin] & /@ images;
scaledimgs = ImageMultiply[#, 1/frames] & /@ binimages;
imgmask = ImageAdd[scaledimgs];
maskedimgs = ImageDifference[#, imgmask] & /@ binimages;
    boxedimgs = 
         Graphics[{Opacity[1], Black, 
            Entity["Concept", "Person::93r37"], 
            AcceptanceThreshold -> humandetect]}]]], {n, frames}];
imgbins = 
    TotalVariationFilter[Binarize[boxedimgs[[n]], finalbin]], 
    smallcomponents], {n, frames}];

When all of the processed images are layered on top of each other, a human can clearly see the trajectory of the ball, but there is also a lot of noise that comes from the players' body parts and rackets sticking out of their bounding boxes. However, there are still noticeable patterns that distinguish the ball from other objects. If there is an object in the top part of the image, it is probably the ball. If the ball is on the left side, it is probably the leftmost element, and vice versa for the right side. Using these simple heuristics, the ball can be detected in a high percentage of the frames and with little error.
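The heuristics above could be sketched roughly as follows. This is only an illustration under stated assumptions, not the code used in the project: `pickBall`, `lastX` (the x coordinate of the last known ball position), and `topFraction` (how much of the image height counts as "the top part") are hypothetical names and parameters.

```wolfram
(* Hypothetical helper: choose the most likely ball among candidate centroids.
   centroids: list of {x, y} positions, y measured upward from the bottom edge.
   lastX: x coordinate of the last known ball position (an assumed input).
   topFraction: assumed share of the image height treated as "the top part". *)
pickBall[centroids_List, {width_, height_}, lastX_, topFraction_:0.35] :=
 Module[{top},
  (* candidates that lie in the top part of the image *)
  top = Select[centroids, #[[2]] > (1 - topFraction) height &];
  Which[
   centroids === {}, Missing["NotFound"],
   (* an object in the top part is probably the ball: take the highest one *)
   top =!= {}, First[MaximalBy[top, Last]],
   (* ball last seen on the left half: take the leftmost candidate *)
   lastX < width/2, First[MinimalBy[centroids, First]],
   (* otherwise take the rightmost candidate *)
   True, First[MaximalBy[centroids, First]]]]

pickBall[{{100, 50}, {300, 400}, {20, 120}}, {640, 480}, 100]
```

The candidate centroids might come from something like `Values[ComponentMeasurements[imgbins[[n]], "Centroid"]]`, assuming `imgbins` holds the processed binary frames; the right value of `topFraction` depends on the camera angle.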

Tracking the players

Initially, I attempted to track the players using ImageFeatureTrack, but because the players cross over and turn almost constantly, all of the points that the function was tracking were lost within a few frames. Instead, I decided to track the players by the colors of their t-shirts. I used ImageCases to get a crop of each player and then ran DominantColors on each crop. After that, I deleted the colors common to both players and ended up with these reference color schemes:

pplnum = Table[
     Entity["Concept", "Person::93r37"], 
     AcceptanceThreshold -> humandetect]], {n, frames}];
player1   = 
      ImageCases[images[[n]], Entity["Concept", "Person::93r37"], 
        AcceptanceThreshold -> humandetect][[1]], 
      MinColorDistance -> 0.3], {n, {1, 2, 3, 5}}]]];
player2   = 
      ImageCases[images[[n]], Entity["Concept", "Person::93r37"], 
        AcceptanceThreshold -> humandetect][[2]], 
      MinColorDistance -> 0.3], {n, {1, 2, 3, 5}}]]];
common = DeleteDuplicates[
        ColorDistance[player1[[c2]], player2[[c1]]] < 
         0.3, {player2[[c1]], player1[[c2]]}], {c1, 
        Length[player2]}], {c2, Length[player1]}]]] /. 
   Null -> Sequence[];
sortedplayer1 = Select[player1, ! MemberQ[common, #] &];
sortedplayer2  = Select[player2, ! MemberQ[common, #] &];


Then, for each frame, I used ColorDistance to determine which ImageCases crop matched which reference color scheme. Combining the ball tracking with the player tracking gives the complete tracker.
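The color-matching step could look something like the sketch below. The helper names (`paletteDistance`, `matchPalette`, `labelCrop`) and the scoring rule (mean distance from each dominant color to its nearest reference color) are assumptions made for illustration; the post does not say exactly how the distances were combined.

```wolfram
(* Hypothetical helpers; the names and the scoring rule are assumptions. *)

(* Average distance from each color in colors to its nearest color in reference. *)
paletteDistance[colors_List, reference_List] :=
 Mean[Table[Min[ColorDistance[c, #] & /@ reference], {c, colors}]]

(* Index of the reference palette that best matches colors. *)
matchPalette[colors_List, references_List] :=
 First[Ordering[paletteDistance[colors, #] & /@ references, 1]]

(* Label one player crop (an image) as player 1 or player 2. *)
labelCrop[crop_Image, ref1_List, ref2_List] :=
 matchPalette[DominantColors[crop, MinColorDistance -> 0.3], {ref1, ref2}]

matchPalette[{Red, Orange}, {{Red}, {Blue}}]
```

With the reference schemes from above, a crop could then be labeled with `labelCrop[crop, sortedplayer1, sortedplayer2]`. Labeling each crop independently can assign both crops to the same player; comparing the two possible pairings and keeping the one with the smaller total distance would avoid that.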

The players can be reliably tracked in every frame and the ball is accurately tracked in almost 70% of all frames.


Though the image-processing algorithm with default parameters works fairly well for all courts, the regions of the image where the ball is likely to be need to be redefined for footage from different tournaments because camera angles vary (if the camera is closer to the ground, the players will cover more of the front wall).

Future Directions

In the future, I would like to use this data to produce a log of shots for a game, with timestamps, and then use that data to analyze the strategies of different players.

2 Replies

This is very interesting. I suspect you used a pretty high-quality video camera (compared to, say, an iPhone 6s). I have fooled around with using iPhone videos to track various thrown projectiles, such as an indoor shot put (indoor so as not to damage the walls) thrown in a racquet court. One annoying feature of the iPhone video is that successive frame images are sometimes exactly identical, which indicates that something anomalous occasionally occurs within the innards of the iPhone. I am interested in seeing how close the trajectories are to exact parabolas and understanding the inevitable deviations therefrom. Once again, thank you for an excellent post.

POSTED BY: David Vasholz
Posted 4 years ago

First, congratulations Alex!! Second, I would suggest developing a three-body system similar to the famous three-body problem. Of course, we have forces like gravity and collisions, but the motion is bounded. So it is clear we may assume Lipschitz continuity w.r.t. time of the mapping that describes the trajectory of the ball. This dynamical system could be coupled with a neural network in a feedback loop in order to calculate certain parameters of the system. I do not know if my idea is correct, but it may inspire other correct ideas. Angel

POSTED BY: Angel Rojas