[WSC19] Parse what is happening in a videogame from images

Posted 12 days ago

During this year's Wolfram Summer Camp, I developed a program that analyzes footage from video games and works out what is going on in it. While working on the project I got to understand image processing in great detail. Since image processing is comparatively hard in other languages, it had become one of my greatest fears, so this was my chance to battle it with Mathematica. The direction of the project changed when I had a one-on-one with Mr. Stephen Wolfram: he asked me to make the computer understand each frame in such detail that even a blind person could follow it. Among the 3 layers of output, the graphical visualization of the project's processes looked like this:

enter image description here

I began by collecting some pictures from Pac-Man, one of which looked like this:

enter image description here

The first thing to be done was to make the computer read any useful details that may change the game's state. Among them, "Ready!" was one of the messages that halted the game until the user moved the sprite again. To run TextRecognize on the image above, I first had to remove any extra detail that distorted the result of the text recognition.

The culprit was clearly the grid behind the text. I used ColorDetect on the green color channel of the image and removed what it detected, which gave me a result like this:

enter image description here
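
As a rough sketch of this kind of cleanup (not the project's actual code), the function below blanks out one color, inverts the image and then runs text recognition. It uses ColorReplace rather than the ColorDetect step described above, and the names `readMessage`, `gridColor` and `tol`, as well as the default tolerance, are assumptions made for illustration.

```wolfram
(* Sketch only: readMessage, gridColor and tol are hypothetical names, not from the post *)
readMessage[frame_Image, gridColor_, tol_: 0.3] := 
 Module[{cleaned},
  (* paint every pixel within color distance tol of gridColor black *)
  cleaned = ColorReplace[frame, gridColor -> Black, tol];
  (* invert so the remaining text is dark on a light background, then run OCR *)
  TextRecognize[ColorNegate[cleaned]]]
```

Calling `readMessage[frame, Blue]` on a captured frame would then return whatever text survives the cleanup; the tolerance likely needs tuning per game.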

After inverting the colors for a better view, I applied TextRecognize and got stable results across the few images I tried. With the game's halting point found, I next wrote the LegacyTracker function:

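LegacyTracker[image_, clr_, disp_, imp_] := 
 Module[{pixelize, pts, dop},
  (* white where the pixel color is within distance imp of clr *)
  pixelize = ColorNegate[Binarize[ColorDistance[image, clr], imp]];
  (* positions of the matching pixels and their bounds {{xmin, xmax}, {ymin, ymax}} *)
  pts = PixelValuePositions[pixelize, 1];
  dop = CoordinateBounds[pts];
  (* return either a drawing of the bounding box or just the bounds *)
  If[TrueQ[disp], 
   Graphics[{EdgeForm[{Thick, Green}], FaceForm[None], 
     BoundingRegion[pts, "MinRectangle"]}], 
   dop]]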

This function tracks an object by a color key and returns either a rendered bounding box around the tracked object or just the coordinates of that bounding box. I used it to track the objects in the game, such as Pac-Man itself and the enemies, and stored the results in this order.
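
To make the color-key idea concrete, here is a small sketch that runs the same steps on a synthetic frame. The frame, its size and the position of the yellow block are made up for illustration; only the ColorDistance, Binarize and ColorNegate chain comes from the function above.

```wolfram
(* Hypothetical test frame: 60 rows x 80 columns, black, with one yellow block *)
data = ConstantArray[0., {60, 80, 3}];
data[[20 ;; 30, 40 ;; 55, 1 ;; 2]] = 1.;  (* red + green channels = yellow *)
frame = Image[data, ColorSpace -> "RGB"];

(* white where the pixel is within color distance 0.1 of Yellow *)
mask = ColorNegate[Binarize[ColorDistance[frame, Yellow], 0.1]];

(* bounds of the matching pixels as {{xmin, xmax}, {ymin, ymax}} *)
bounds = CoordinateBounds[PixelValuePositions[mask, 1]];

(* center of the bounding box as {x, y} *)
center = N[Mean /@ bounds]
```

Note that PixelValuePositions counts y from the bottom of the image, so the bounds are in standard image coordinates rather than row and column indices.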

I first used this function to track the yellow life bars in the game HUD. I cropped out that region, took its middle row of pixels and counted each time a group of yellow pixels appeared to determine the number of lives the player has. The result is saved to the main string, which is narrated each time you die.

(* LegacyTrackerMasker is a project helper that is not shown in this post *)
f = LegacyTracker[
   LegacyTrackerMasker[footage, RGBColor[0/255, 128/255, 248/255], 
    True, 0.1], Yellow, False, 0.1];
(* f is {{xmin, xmax}, {ymin, ymax}}; crop the HUD region with the life bars *)
tst = ImageTrim[footage, {{f[[1, 1]], f[[2, 1]]}, {f[[1, 2]], f[[2, 2]]}}];
{c, r} = ImageDimensions[tst];
(* pixel values of the middle row of the cropped region *)
row = ImageData[
   ImageTake[RemoveAlphaChannel[tst], {Round[r/2], Round[r/2]}, {1, c}]];
(* number of runs of identical pixels: background, life, background, ... *)
count = Length[Flatten[Length /@ Split[#] & /@ row]];
(* append the result so that it is kept in the main string *)
MainStr = MainStr <> ", " <> 
   Which[count == 7, "3 Lives Remaining", 
    count == 5, "2 Lives Remaining", 
    count == 3, "1 Life Remaining", 
    True, "Game Over"];
If[SecStr == "Ready!", SpeechSynthesize[MainStr]]
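
The thresholds 7, 5 and 3 follow from how runs alternate along the row. A tiny sketch with a made-up row of pixel labels (0 for background, 1 for yellow; these values are assumptions, not real pixel data) shows the counting:

```wolfram
(* Hypothetical middle row: 0 = background, 1 = yellow life bar *)
row = {0, 0, 1, 1, 0, 1, 1, 0, 1, 1, 0};

(* Split groups consecutive identical values into runs *)
runs = Split[row];

(* every run that starts with 1 is one life *)
lives = Count[runs, {1, ___}]
```

Three lives separated and surrounded by background give 3 yellow runs plus 4 background runs, so 7 runs in total; two lives give 5 and one life gives 3.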

The main computation is the angle of each ghost with respect to Pac-Man, which tells us which direction the ghost is in and whether it is at a safe distance. The code below calls LegacyTracker to get the coordinates.

abt = False;
(* LegacyTracker returns {{xmin, xmax}, {ymin, ymax}}; Mean /@ gives the box center {x, y} *)
coordEnemyRed = N[Mean /@ LegacyTracker[a, Red, abt, 0.4]];
coordEnemyCyan = N[Mean /@ LegacyTracker[a, Cyan, abt, 0.4]];
coordEnemyYellow = 
  N[Mean /@ LegacyTracker[a, RGBColor[255/255, 201/255, 51/255], abt, 
coordEnemyPink = 
  N[Mean /@ LegacyTracker[a, RGBColor[255/255, 153/255, 204/255], abt, 
coordPacMan = N[Mean /@ LegacyTracker[a, Yellow, abt, 0.2]];

I took the mean since it gave me the center of the tracked object, which I later used for calculating the angle of each ghost with respect to Pac-Man. That angle was then mapped to the direction the ghost was in.

distance = 
 Min[N[EuclideanDistance[coordPacMan, coordEnemyCyan]], 
  N[EuclideanDistance[coordPacMan, coordEnemyYellow]], 
  N[EuclideanDistance[coordPacMan, coordEnemyPink]], 
  N[EuclideanDistance[coordPacMan, coordEnemyRed]]]

I calculated the distance between the points and later the vector angle, which together decide which enemy is most probably a threat to you at the moment. The program then uses a speech synthesizer to produce an output accordingly.
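
The angle step is not shown in the post, so the following is only a sketch of one way to do it. The coordinates are invented sample values, `angleTo`, `directionOf` and `threat` are hypothetical names, and the four-way direction mapping is an assumption. Coordinates are taken to be standard image coordinates, with y increasing upward.

```wolfram
(* Hypothetical sample centers; in the project these come from LegacyTracker *)
pacman = {100., 100.};
ghosts = <|"Red" -> {140., 100.}, "Cyan" -> {100., 30.}|>;

(* angle in degrees, in the range 0 to 360, of the vector from one point to another *)
angleTo[from_, to_] := Mod[(ArcTan @@ (to - from))/Degree, 360];

(* map an angle to the nearest of four directions, assuming y points up *)
directionOf[angle_] := 
  {"right", "above", "left", "below"}[[Mod[Round[angle/90], 4] + 1]];

(* the closest ghost is treated as the most likely threat *)
threat = First[MinimalBy[Keys[ghosts], EuclideanDistance[pacman, ghosts[#]] &]];

{threat, directionOf[angleTo[pacman, ghosts[threat]]]}
```

A string built from this pair could then be passed to SpeechSynthesize in the same way as the lives message.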

After the ghosts, the next thing to detect was the maze, for which I first applied a series of filters to the cropped image of the grid (the maze):

    ColorDetect[FinCrp, RGBColor[0/255, 128/255, 248/255]]], 
   LightGray] // Normal]

This ultimately gave me output like this:

enter image description here

After this, I took the positions of the white pixels and wrote a routine that immediately plays a pop sound whenever Pac-Man hits a wall.
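
The wall check itself is not included in the post. One possible sketch, assuming a binary wall mask and a distance tolerance, looks like this; the tiny mask, the name `hitsWallQ`, the tolerance and the use of Beep as a stand-in for the pop sound are all assumptions.

```wolfram
(* Hypothetical 4 x 3 wall mask: 1 = wall, 0 = free corridor *)
mask = Image[{{1, 1, 1, 1}, {1, 0, 0, 1}, {1, 1, 1, 1}}, "Bit"];

(* positions {x, y} of all wall pixels in standard image coordinates *)
wallPts = PixelValuePositions[mask, 1];
nearestWall = Nearest[wallPts];

(* True when pos is within tol pixels of the closest wall pixel *)
hitsWallQ[pos_, tol_: 1] := 
  EuclideanDistance[pos, First[nearestWall[pos]]] <= tol;

(* Beep stands in for the pop sound *)
If[hitsWallQ[{2, 2}], Beep[]]
```

Building the Nearest function once per maze keeps each per-frame check cheap, which would matter if the program is later made to run in real time.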

At this time the program does not run in real time, so I'll be working on making it process frames in the background by connecting my Pac-Man game to Mathematica with UnityLink. A lot of development remains in this project, but the prime goal of parsing the detail from a frame has been achieved.

As for my future work, I would like to move towards accessibility computing and take this project to a much more advanced level, where it does all of this in real time while using music as a key component to communicate with the player.

