April 5, 2019

The Feynman Algorithm: Getting Started with a Proof of Technology

Terry May

Richard Feynman is a personal hero of mine. He was a theoretical physicist known for his work in quantum physics. Feynman had a unique and uncanny ability to solve tough problems. He was so good at it, in fact, that a colleague of Feynman’s jokingly suggested the “Feynman Algorithm.”

Write down the problem.
Think real hard.
Write down the solution.

The point of the joke was that Feynman was quicker than most at solving complex problems. From a very young age, Feynman was obsessed with figuring things out and, over the years, developed his own methods for deductive reasoning. His need to find out how things work and his interesting lectures are what make him my hero.

What I do isn’t as complex as quantum physics, but, as a developer, I have to come up with creative solutions every day that involve hardware and software working together. Figuring out how to articulate “think(ing) real hard” to clients and describing my own process for breaking down abstract problems has been a challenge — but, luckily, like any good developer, I love a good challenge.

At Detroit Labs, we offer a product to help you get clarity when faced with a business challenge. Our Proof of Technology engagements take your wild idea and match it with today’s technology to create the beginnings of something great. Using a tried and true eight-week, exploratory and iterative process, we’ll work together to find out if your idea is possible.

This exploratory and iterative process can sound scary because it’s where we start to define something that is not yet defined — but I would like to help make it more accessible.

In this post, I will walk you through how I get started, what I do when I am stuck, and how to tie your ideas to the real world while figuring out if something is possible.

Our hypothetical client problem:

A museum client is working on making the museum more accessible and interactive. The building is large, and some areas are temperature controlled. We were tasked with figuring out what choices for information should be offered based on where a museum guest is looking. To do this, we had to come up with a way to know where the user was positioned and in what direction their gaze might be focused.

Ok! Step one is complete. We wrote down our problem. Now, if we are going to get somewhere, we need to dig in.

Understand the challenge

Start by asking a lot of questions: “What considerations do I need to keep in mind? Limitations? Any history? Who are the end users?” Interview your point of contact at the museum (the product owner), museum guests and donors (your users), and, if applicable, anyone at the museum who serves in a technical role (other developers).

Immerse yourself in the problem. Gather enough data to help you get a better picture of the problem or challenge. Hopefully, this gets you to the point where you find creative inspiration.

The goal should be to fully understand the problem so you can utilize your developer viewpoint and skills to postulate a theory. The more you talk to others, the more you can use your past experience and research to crystallize new ideas.

Define a repeatable algorithm

Once you have an idea, it’s time to start failing. Yes, I said failing; this is how you find your limitations and boundaries. Start trying your ideas fast — forget the rules, just make the simplest thing that doesn’t totally suck, and worry about making it cleaner later. I usually start by jumping in and writing code or experimenting. The idea is to prove your theory even if you have to hard-code values or fake input signals — whatever it takes to find a working algorithm or achieve something that gives you a result that your theory has predicted.

Here’s where detailed notes and documentation come in handy because you never know when you’re going to hit a wall. This series of rapid prototyping helps you figure out where your assumptions are wrong (and right). If you hit a wall and have to start over, your notes not only help you do that but also help the product owners understand why the approach didn’t work out. Note-taking is also good practice in case you stumble on some new IP and the client wants to pursue a patent: the documentation already exists to move forward.

Back to our hypothetical client problem…

We created a simple 3D virtual version of the museum. We picked a point where a viewer’s eyes could be and defined an angle to simulate the direction that our virtual viewer was looking. With this data, we drew a line to represent the line of sight. Our algorithm breaks that line up into discrete points and tests each point to see how close it is to an item in the scene we care about, such as a sculpture, painting, mural, or sign.
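To make the idea concrete, here is a minimal sketch of that kind of check in Python. It is not the code from the project: the function names, the step size, the hit radius, and the target coordinates are all made up for illustration, and each target is treated as a single point rather than a full 3D shape.

```python
import math

def sample_line_of_sight(eye, direction, max_distance=10.0, step=0.25):
    """Break the line of sight into discrete points, moving away from the eye."""
    length = math.sqrt(sum(c * c for c in direction))
    if length == 0:
        raise ValueError("direction must be non-zero")
    unit = tuple(c / length for c in direction)
    count = int(max_distance / step)
    return [
        tuple(e + u * step * i for e, u in zip(eye, unit))
        for i in range(1, count + 1)
    ]

def first_target_hit(eye, direction, targets, hit_radius=0.5,
                     max_distance=10.0, step=0.25):
    """Return the name of the first target close to a sampled point, else None."""
    for point in sample_line_of_sight(eye, direction, max_distance, step):
        for name, position in targets.items():
            if math.dist(point, position) <= hit_radius:
                return name
    return None

# Hypothetical scene: coordinates are in meters from the scene origin.
targets = {"painting": (0.0, 1.5, 5.0), "sculpture": (3.0, 1.0, 2.0)}
looking_at = first_target_hit(
    eye=(0.0, 1.6, 0.0), direction=(0.0, 0.0, 1.0), targets=targets
)
print(looking_at)
```

Walking the points in order from the eye outward means the nearest item along the line wins, which is a rough stand-in for one object blocking another. The step size should be smaller than the hit radius, or a point could step right past a target.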

In our museum, we chose a reference point that would mark the origin coordinates in our scene. From there, we measured the distance from that origin point to each of the various targets. Now we have our first set of variables!

Variables

Sight targets – paintings, sculptures, signage, etc.
Origin points – museum guests of all heights

Additionally, to be able to present relevant information to the museum guests, we will also need to understand where they are looking. We will need to define where exactly the guest’s head is and how it is posed.

Tie possibilities to the real world

So even though we might now be armed with a process that can provide an answer to our problem, we’re not done yet because all we have is an abstraction at this point. We need to figure out how to apply our methods in a useful way.

As in our hypothetical client problem above, with one of our actual clients we were able to model the whole problem in 3D and work on the math in a simulation. It involved a few variables, like where a person’s head was and what they might be looking at. In the simulation, it’s easy to just plop these values in and see if your function works, but in the real world, how do I get real values that I can use in a practical way?

For me, that is where the fun begins

So our variables are now:

Sight targets – paintings, sculptures, signage, etc.
Origin points – museum guests of all heights
Field of view (FOV) – a person’s head positioning

At first, this seemed like it was going to lead us down a very complex path because we needed to understand a few moving targets (height and head positioning) before we could present relevant information, but the research was on my side. I found a number of options that could help: eye trackers, stereoscopic cameras, 3D sensing cameras, and other technologies. I dug in to learn about each option, its benefits, and the reasons not to use it. I decided to start with two cameras that face museum guests.

These cameras have a wide FOV and can cover a large area of the scene. We mounted them and measured the distance from the scene origin point to the camera. The cameras by themselves don’t give us the variable we need, but they are a tool we can use to find the next variable.
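Measuring where each camera sits relative to the scene origin matters because anything a camera reports is relative to the camera itself. A rough sketch of that conversion is below. The function name and axis conventions are assumptions for illustration (the camera looks along its own +z axis, y is up, and only rotation about the vertical axis is handled), not the setup we actually shipped.

```python
import math

def camera_to_scene(point_in_camera, camera_position, camera_yaw_deg=0.0):
    """Convert a camera-relative point into scene coordinates.

    Assumed conventions: +y is up, the camera looks along its own +z axis,
    and camera_yaw_deg rotates the camera about the vertical axis
    (0 faces scene +z, 90 faces scene +x). Camera tilt is ignored.
    """
    x, y, z = point_in_camera
    yaw = math.radians(camera_yaw_deg)
    # Rotate about the vertical (y) axis...
    scene_x = x * math.cos(yaw) + z * math.sin(yaw)
    scene_z = -x * math.sin(yaw) + z * math.cos(yaw)
    # ...then shift by the measured offset from the scene origin.
    cx, cy, cz = camera_position
    return (scene_x + cx, y + cy, scene_z + cz)

# Hypothetical example: a camera mounted 6 m from the origin, 2.5 m up,
# turned around to face the guests. A head detected 2 m in front of the lens
# ends up roughly 4 m from the origin in scene coordinates.
head_in_scene = camera_to_scene((0.0, 0.0, 2.0), (0.0, 2.5, 6.0), 180.0)
print(head_in_scene)
```

A real installation would also need to account for camera tilt and lens distortion, which this sketch leaves out.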

Using the video from the cameras, we used open-source software to do two things:

Find faces in the video frame. More accurately, we were finding the “landmark” features of a human face.
Estimate the head pose. Using these landmarks, another tool gave us the estimated head pose and location within the scene. This data gives us the rest of the variables we need for our algorithm.

At this point, we know where a person’s head is pointing and we know where all the static observable targets are located (the actual art on display). With this data, we are able to calculate the general intent of the viewer.
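One simple way to turn a head pose into a guess about intent is to compare the direction the head is pointing with the direction to each target and pick the closest match. The sketch below assumes the pose arrives as yaw and pitch angles in degrees, which is a simplification; the names, the angle convention, and the 15-degree cutoff are illustrative assumptions rather than values from the project.

```python
import math

def gaze_direction(yaw_deg, pitch_deg):
    """Unit vector for a head pose. Assumed convention: yaw 0 faces +z,
    positive yaw turns toward +x, positive pitch tilts up toward +y."""
    yaw, pitch = math.radians(yaw_deg), math.radians(pitch_deg)
    return (
        math.sin(yaw) * math.cos(pitch),
        math.sin(pitch),
        math.cos(yaw) * math.cos(pitch),
    )

def angle_to_target(head, direction, target):
    """Angle in degrees between the gaze direction and the head-to-target line."""
    to_target = tuple(t - h for t, h in zip(target, head))
    norm = math.sqrt(sum(c * c for c in to_target))
    if norm == 0:
        return 0.0
    cos_angle = sum(d * c for d, c in zip(direction, to_target)) / norm
    # Clamp to guard against floating-point drift outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))

def likely_target(head, yaw_deg, pitch_deg, targets, max_angle_deg=15.0):
    """Return the target best aligned with the gaze, or None if none is close."""
    direction = gaze_direction(yaw_deg, pitch_deg)
    best_name, best_angle = None, max_angle_deg
    for name, position in targets.items():
        angle = angle_to_target(head, direction, position)
        if angle <= best_angle:
            best_name, best_angle = name, angle
    return best_name

# Hypothetical scene, in meters from the scene origin.
targets = {"painting": (0.0, 1.5, 5.0), "sculpture": (3.0, 1.0, 2.0)}
head = (0.0, 1.6, 0.0)
print(likely_target(head, 0.0, 0.0, targets))
print(likely_target(head, 56.0, 0.0, targets))
```

The cutoff angle is what makes this "general intent" rather than a precise gaze: head pose only says roughly where someone is facing, so a generous tolerance is more honest than a tight one.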

There was a lot of tweaking and it didn’t always work. There were issues with people wearing hats or glasses, and we only tested the software on Caucasian people, so accessibility for everyone else is unknown — BUT BUT BUT — we had shown in a very short time that it was possible to do what was needed.

Lessons learned

In a project like this, where many of the “key components” may vary from start to finish, we found that it was best to build both hardware and software in a modular way. We didn’t want camera upgrades or changes in an algorithm to eat up valuable sprint time. In the end, we had an algorithm that we proved would work in a 3D simulation, and we then took on the task of using it in the “real” world in real time.
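As one illustration of what "modular" can mean here, the sketch below hides the source of head poses behind a small interface, so a faked source and a real camera could be swapped without touching the rest of the pipeline. Every name in it (HeadPose, HeadPoseSource, FakePoseSource, run_pipeline) is hypothetical and not taken from our codebase.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Protocol, Tuple

@dataclass(frozen=True)
class HeadPose:
    position: Tuple[float, float, float]  # scene coordinates
    yaw_deg: float
    pitch_deg: float

class HeadPoseSource(Protocol):
    """Anything that can hand back the next head pose, or None when done."""
    def read(self) -> Optional[HeadPose]: ...

class FakePoseSource:
    """Hard-coded poses for simulation; a camera-backed class could replace it."""
    def __init__(self, poses):
        self._poses = list(poses)

    def read(self) -> Optional[HeadPose]:
        return self._poses.pop(0) if self._poses else None

def run_pipeline(
    source: HeadPoseSource,
    pick_target: Callable[[HeadPose], Optional[str]],
) -> List[Optional[str]]:
    """Feed every pose from the source through a swappable targeting function."""
    results = []
    while (pose := source.read()) is not None:
        results.append(pick_target(pose))
    return results

# Stand-in targeting rule: anyone facing roughly forward is "looking at" the painting.
source = FakePoseSource([
    HeadPose((0.0, 1.6, 0.0), 0.0, 0.0),
    HeadPose((0.0, 1.6, 0.0), 90.0, 0.0),
])
results = run_pipeline(source, lambda p: "painting" if abs(p.yaw_deg) < 15 else None)
print(results)
```

With this shape, moving from the simulation to live cameras means writing one new class with a read method, and trying a different algorithm means passing a different function.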