Adam Harvey and Viola Jones Face Detection

05 April 2012

Last week, I kicked off Makematics with an interview with Kyle McDonald about FaceTracker. FaceTracker is an example of a complex technique that builds on top of a series of computer vision, image processing, and machine learning techniques in order to achieve its result. Kyle did a terrific job explaining FaceTracker by breaking it down into its basic components. He explained how FaceTracker uses the Viola-Jones face detection algorithm to find the general location of the face. He explained how FaceTracker detects individual parts of the face based on machine learning data. He explained how FaceTracker searches for these parts as they move around from frame to frame.

One of the chief goals of Makematics is to tease complex topics apart into their individual components in order to make them easier to learn. Mathematics and programming topics form a complex network of interdependent concepts. That’s part of what makes them so hard to learn: you start with wanting to know about face tracking and pretty soon you find yourself with a list of prerequisites that scrolls and scrolls.

Slowly but surely, article by article, Makematics is dedicated to filling in as much of this dependency graph as possible. The site will cover advanced composite topics like FaceTracker as well as the component techniques like Viola-Jones face detection.

In fact, last week I conducted an interview with artist and programmer Adam Harvey about Viola-Jones. Harvey is best known for his project CV Dazzle. CV Dazzle is camouflage from computer vision. Harvey creates hair, clothing, and makeup designs that cause the Viola-Jones face detection algorithm to fail to detect the faces of people who are wearing them. In order to build this project, Harvey had to learn an incredible amount about how the algorithm works. He even built tools that slow it down and visualize its operation so he could see exactly how his makeup was defeating it. This work makes him the perfect person to explain how Viola-Jones works, and I think his interview represents a great introduction to the topic.

Further, Harvey was generous enough to share not just his time, but his code as well. As of today, he’s released Haarcascade Visualizer, a Processing-based tool that he uses to produce visualizations of the face detection process. Haarcascade Visualizer is available on GitHub. It’s the first piece of a larger project of Harvey’s to implement much of OpenCV’s face detection functionality natively in Java to make it easier to understand and experiment with. We talk more about that effort in the interview. Harvey expects the code for that project to become public in May.

In addition to this technical knowledge, Harvey also has a very interesting artistic and cultural take on computer vision. In the interview he compared CV Dazzle to Levi’s jeans, explaining that “the original Levi’s became successful because they’re a really durable jean for miners to wear, but today who’s mining? If privacy is one of your needs, then we can certainly design something into clothing that would help with that.” In other words, preventing yourself from being automatically tagged in Facebook photos and other online services may become just as pragmatic as the need to protect your knees from harsh mining conditions was for our forebears.

This kind of thinking represents exactly the kind of creative possibility that can be unlocked by a deep investigation of these mathematical and computer science topics. I hope you enjoy the interview.

Math for Makers

23 March 2012

Artists build on top of science.

Renaissance painters used Euclidean geometry to invent perspective. Nineteenth-century photographers adapted industrial chemistry into the photographic process. Contemporary filmmakers shoot on cutting-edge cameras made possible by the latest in sensor miniaturization. Each generation of artists turns the knowledge of their time into new creative tools.

No generation of artists has ever been more dependent on scientific and technical advances than today’s. Today’s artists work on computers. Advances in computer science and related mathematical fields underlie everything that digital artists make. Recently these advances have led to the advent of whole new creative fields like interactive art, generative graphics, data visualization, and digital fabrication.

In order to produce excellent and novel work in these new fields, artists have had to learn computational and mathematical techniques. They started with basic material like trigonometry for 2D games and graphics, the rudiments of computer vision for interactive installations, and primitive signal processing for embedded electronics.

Increasingly these new creative fields are becoming the basis of art and design across our culture. And these techniques are becoming the foundation of a new kind of art and design education. A cohort of today’s best art and design schools teach introductory programming and a survey of these basic topics as their core curriculum (NYU ITP, CMU’s Studio for Creative Inquiry, Parsons, CIID, UCLA Design Media Arts, etc.).

However, as these new creative fields advance, driven by their own internal momentum and inspired by the advent of new developments in computer science and technology, they are beginning to require more advanced math and programming techniques. Today’s SIGGRAPH papers and academic journal articles are tomorrow’s breakthrough creative coding projects.

Topics like linear algebra, topology, graph theory, and machine learning are becoming vital prerequisites both to doing daily work in these fields and, more importantly, to inventing, popularizing, and teaching the new creative tools that are rapidly arising. Without them, artists are forced to wait for others to digest this new knowledge before they can work with it. Their creative options shrink to those parts of this research selected by Adobe for inclusion in prepackaged tools. Instead of the themes and concerns of creative work driving the selection of tools from a growing technical cornucopia, artists find themselves turned into passive users of tools that are already curated, contextualized, and circumscribed by others.

So, I want to do something about this. I want to figure out a way to teach myself and others these more advanced mathematical and computational concepts with a specific eye towards applying them in creative technology. For example, I want to be able to read SIGGRAPH papers, understand what’s going on in them, explain that to others, and create software libraries that implement their techniques. I want my peers who do data visualization to be able to implement regressions, curve fitting, statistical analysis and machine learning so that their projects are more than just pretty graphs with good typography. I want the hardware hackers who are building the next generation of DIY 3D printers to be able to turn topological algorithms and concepts into open source tool path generation software that creates more efficient gcode and enables the fabrication of previously impossible physical forms.

I don’t know the best way to go about this, but this site is intended to act as a home for my experiments. Here, I’ll collect research papers with potential as creative tools. I’ll catalog the mathematical and computer science terminology and techniques used in these papers. I’ll work to explain this terminology in plain English and to translate these techniques into comprehensible, publicly available code. I’ll interview other digital artists and researchers who understand parts of this material in order to capture and spread their knowledge. I’ll try any approach that has the potential to help us gain some grasp on these new areas and I’ll collaborate with anyone who’s interested in helping.

To inaugurate this site, I conducted an interview with Kyle McDonald about FaceTracker. Kyle is an artist and researcher who embodies much of the spirit of this site. He’s a key member of the openFrameworks community and he’s done extensive work democratizing 3D scanning. Last October he released the first public results of a library for fast, accurate face tracking that he built for openFrameworks. That library was based on the work of Jason Saragih, a computer vision researcher at CSIRO. Since the release of Kyle’s library, there’s been an explosion of projects in the creative coding community using his code to explore the possibilities of face-based interaction. I asked Kyle to explain how he first found Saragih’s work and to walk through how the algorithm works. I think the results represent a terrific introduction to many key areas of computer vision and to understanding FaceTracker in particular. In addition to the video, I’ve provided a transcript of the interview that I’ve annotated with links and additional media.

Much of the rest of this site is currently under construction, but I’ll be announcing additional sections as well as further interviews and other material soon. You can follow along and get in touch with me about the project through the Makematics Twitter account. If you’d like to get involved, this site is on GitHub: atduskgreg/makematics. Contributions of all forms are welcome, from writing to code to graphic design or other media.

In conclusion, a note about the intended voice of this site. All too frequently the technical and mathematical issues discussed here are written about solely by experts for other experts. And even when they do appear in venues intended for more general consumption, the material is usually presented as if its own internal logic and rigor made it naturally comprehensible. I don’t believe this is true.

I think my greatest advantage in this effort is that I am a beginner myself. I am not an expert in computer vision, computer science, or mathematics. I’m a programmer and an artist who’s committed to struggling with this material until I understand it and can make it comprehensible and useful for myself and others. I won’t hide the frustrations and confusions that are inevitable in the process. I hope to show that a normal programmer with no special academic training can grapple with these areas of research and find a way into understanding them and making them part of their creative work. And as I go I aim to create material that will help others do the same.

If I can do it, there’s no reason you can’t.