Hi all, just sharing my project, which was created at HackMIT 2018.
The premise of the project was to take online course videos (e.g. those on MIT OpenCourseWare) and feed them into a Mathematica notebook, which would then:
- Find the key frames in the video where there is the most writing on the board
- Break each key frame down into lines of text/equations, in reading order
- Send these cropped images to the Mathpix API (https://mathpix.com/) to convert them into LaTeX equations
Of course, the resulting combined LaTeX would not be 100% accurate, but we believed this would save a lot of time in note-taking, since correcting LaTeX that is mostly right is still much faster than LaTeX-ing from scratch. More details are available at https://devpost.com/software/r00t (note that you need your own Mathpix API key for the notebook to work).
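For anyone who wants a feel for the first two steps without opening the notebook, here is a minimal sketch in Python (not the Mathematica code from the project). It assumes each frame has already been binarised into rows of 0/1 values, scores a frame by its total ink, and splits the chosen frame into lines using a plain row-sum profile as a stand-in for the horizontal blur. All names (`ink_amount`, `pick_key_frame`, `split_lines`) and the `radius`/`threshold` parameters are made up for illustration.

```python
from typing import List, Tuple

Frame = List[List[int]]  # binarised frame: 1 = ink/chalk pixel, 0 = board


def ink_amount(frame: Frame) -> int:
    """Total number of ink pixels: a crude 'how much writing' score."""
    return sum(sum(row) for row in frame)


def pick_key_frame(frames: List[Frame]) -> int:
    """Index of the frame with the most ink (first one wins on ties)."""
    return max(range(len(frames)), key=lambda i: ink_amount(frames[i]))


def smooth(profile: List[float], radius: int) -> List[float]:
    """Box smoothing of a 1-D profile; radius=0 leaves it unchanged."""
    n = len(profile)
    out = []
    for i in range(n):
        lo, hi = max(0, i - radius), min(n, i + radius + 1)
        out.append(sum(profile[lo:hi]) / (hi - lo))
    return out


def split_lines(frame: Frame, radius: int = 0,
                threshold: float = 0.0) -> List[Tuple[int, int]]:
    """Return (start_row, end_row_exclusive) for each band of rows
    whose smoothed ink count exceeds the threshold, top to bottom."""
    profile = smooth([float(sum(row)) for row in frame], radius)
    bands, start = [], None
    for y, value in enumerate(profile):
        if value > threshold and start is None:
            start = y                      # a band of writing begins
        elif value <= threshold and start is not None:
            bands.append((start, y))       # the band ended on the previous row
            start = None
    if start is not None:                  # band runs to the bottom edge
        bands.append((start, len(profile)))
    return bands


# Toy data: an empty board, then a board with two lines of writing.
blank = [[0] * 6 for _ in range(7)]
board = [
    [0, 0, 0, 0, 0, 0],
    [1, 1, 0, 1, 1, 0],
    [1, 0, 0, 1, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 0, 0, 0, 0, 0],
    [0, 1, 1, 1, 0, 1],
    [0, 0, 0, 0, 0, 0],
]
key = pick_key_frame([blank, board])
bands = split_lines(board)
print(key, bands)
```

Each band gives the rows to crop before sending the image off for recognition. A larger `radius` merges rows that are close together, which is roughly the role the blur plays; the right value depends on the video and would need tuning.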
Some possible extensions that might be interesting to work on:
- Converting the Mathematica notebook to some kind of cloud service to be more easily accessible to the general public
- Allowing Mathematica to look up videos on YouTube automatically rather than needing to download the video prior to using this software
- A way to handle professors' writing that slants up or down rather than running in a horizontal line: the current method of Gaussian blurring in the horizontal direction to find individual equations breaks down in that case
- Some form of machine learning to allow more accurate recognition of the equations written on the board
- Anything else you can think of that would be interesting to work on, really... :)
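On the slanted-writing idea, one possible starting point (purely an assumption, not something the project implements) is to estimate the slant first and undo it before looking for lines. The Python sketch below treats ink pixels as (x, y) points, tries a handful of candidate slopes, shears the points by each one, and keeps the slope whose row histogram is most concentrated. The function names and the sum-of-squares score are illustrative choices.

```python
from collections import Counter
from typing import Iterable, List, Tuple

Point = Tuple[int, int]  # (x, y) of an ink pixel, y increasing downward


def sheared_profile(points: List[Point], slope: float) -> Counter:
    """Row histogram after shifting each pixel by -slope * x."""
    return Counter(round(y - slope * x) for x, y in points)


def sharpness(profile: Counter) -> int:
    """Sum of squared row counts: larger when ink piles into few rows."""
    return sum(count * count for count in profile.values())


def estimate_slope(points: List[Point], candidates: Iterable[float]) -> float:
    """Candidate slope whose sheared row histogram is most concentrated."""
    return max(candidates,
               key=lambda s: sharpness(sheared_profile(points, s)))


# Toy data: two lines of writing, both drifting downward with slope 0.5.
points = [(x, x // 2) for x in range(0, 20, 2)]
points += [(x, 10 + x // 2) for x in range(0, 20, 2)]

slope = estimate_slope(points, [-0.5, -0.25, 0.0, 0.25, 0.5])
rows_before = sorted(sheared_profile(points, 0.0))
rows_after = sorted(sheared_profile(points, slope))
print(slope, len(rows_before), rows_after)
```

In the toy data the unsheared ink covers 20 consecutive rows, so a horizontal profile sees one big blob, while after shearing by the estimated slope it collapses into two separate rows. Real board writing would need a finer slope grid and probably a per-line estimate, since different lines can slant differently.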
Thanks for reading!