3 years ago

As the web grows in popularity, more and more applications that were previously constrained to the desktop are moving online. One such application is computer vision.

In the past, computer vision algorithms were restricted to embedded systems, computers that were programmed to perform only one task. As the need for general purpose computers increased, computer vision algorithms were implemented for these machines as well. One library that brought computer vision to the desktop computer was OpenCV.


OpenCV (Open Source Computer Vision Library) is a great library, with an enormous number of computer vision algorithms implemented. I honestly couldn't even begin to list how many things this library can do (other people already have, and supplied the list as documentation).

There is however a downside to OpenCV...
Setting it up is not so great. In fact, it can be a bit of a nightmare.

I eventually got OpenCV working on a Mac with a fresh Mavericks install, but only after I had spent some time resolving issues involving my Homebrew install of Python and wasted a great deal of time tracking down a long list of option flags for the CMake build.

And that's what's wrong with it.

I'm a fairly experienced developer, and I was able to work through the errors I encountered and make sense of the build documentation, though not without difficulty. It is easy to see how a newcomer could quickly be put off by the list of foreign terminal commands and the scary compiler messages.

Computer Vision is super interesting. There are a large number of applications for it, from very practical uses to interactive artwork for those more creative souls.

It's a shame that for a long time, the barrier to entry for exploring this fascinating realm of computing was so high. Heck, there have been a number of times I abandoned a project in frustration.

These days, however, the barrier to entry has been smashed down by the advancement of the web.

Enter JavaScript and HTML5

With the Web booming as it is, browser makers are pouring resources into making their JavaScript engines super fast.

And now they're even fast enough to run these fancy computer vision algorithms. Remember that program you wanted to make to alert you on your computer when your toast has finished toasting? I know I do. And now you can make that on the web, and with that, share the convenience with others!

There are even great libraries such as JsFeat and TrackingJS that are popping up to fill the large shoes of OpenCV.

The advantage of JavaScript is that it's super accessible. Most developers of almost any skill level can pick up and learn JavaScript. With libraries such as TrackingJS, chuck in a couple of <script> tags and open your web browser, and you have a running Computer Vision application. The ease with which you can quickly prototype ideas on the web encourages new programmers to learn and explore these technologies. It's great.

Computer Vision at Gramercy

Here at Gramercy Studios, we have developers who love to push what we can do on the web.

Our idea is fairly simple. We wrote a game similar to Breakout, in which the player has a paddle and tries to hit the ball to break all the bricks above them. Our own twist is that we use computer vision techniques to make the player the paddle. Using the computer's webcam, we track objects in the scene so that the ball can collide with the person playing.
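
To give a feel for the collision part, here is a minimal sketch in TypeScript. It is not our actual game code. It assumes the tracker hands back plain bounding boxes with x, y, width and height fields, and the names Rect, Ball, hits and step are made up for this example. It also assumes canvas-style coordinates, where y grows downward.

```typescript
// Hypothetical shapes for this sketch: Rect stands in for a tracked
// bounding box, Ball for the game ball. Neither comes from a real library.
interface Rect { x: number; y: number; width: number; height: number; }
interface Ball { x: number; y: number; radius: number; vx: number; vy: number; }

// True when the ball (a circle) overlaps the axis-aligned rectangle.
function hits(ball: Ball, rect: Rect): boolean {
  // Find the point on the rectangle closest to the ball's centre.
  const cx = Math.max(rect.x, Math.min(ball.x, rect.x + rect.width));
  const cy = Math.max(rect.y, Math.min(ball.y, rect.y + rect.height));
  const dx = ball.x - cx;
  const dy = ball.y - cy;
  // Overlap if that point lies within one radius of the centre.
  return dx * dx + dy * dy <= ball.radius * ball.radius;
}

// Advance the ball by one frame. If the move would overlap any tracked
// rectangle, keep the ball where it is and send it upward instead.
function step(ball: Ball, tracked: Rect[]): Ball {
  const next: Ball = { ...ball, x: ball.x + ball.vx, y: ball.y + ball.vy };
  if (tracked.some((rect) => hits(next, rect))) {
    // Simplification: treat every tracked object as a flat paddle and
    // bounce straight up (negative y is up on a canvas).
    return { ...ball, vy: -Math.abs(ball.vy) };
  }
  return next;
}
```

In a real game, step would run once per animation frame, with the latest boxes from the webcam tracker passed in as tracked. A fuller version would also bounce the ball sideways depending on where it struck the box.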

You can keep track of this project as we develop it here.

With computer vision so accessible with JavaScript now, hopefully we'll soon see lots of other people pushing what this technology can do.

For those interested, the image at the top of this post is Lena, which has become the standard test image for computer vision applications.