Monday, February 18, 2008

A Survey of Hand Posture and Gesture Recognition Techniques and Technology (LaViola)

Summary (section 3):

This section of the paper summarizes a number of algorithmic techniques that have been applied to recognition of hand postures and gestures.

They cover feature extraction, statistics, and models, which include template matching* (simple, accurate over a small set, needs only a small amount of calibration, recommended for postures rather than gestures), feature extraction* (a statistical technique to reduce the data's dimensionality, handles gestures as well as postures, slow if there are many features), active shape models, principal component analysis* (recognizes around 30 postures, requires lots of training by multiple users, requires normalization), linear fingertip models, and causal analysis.
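To make the template matching idea concrete, here is a minimal sketch of how it might look on glove data. It is my own illustration, not code from the paper: the posture names, the five joint-angle values per template, and the `max_distance` threshold are all made-up assumptions.

```python
import math

# Hypothetical templates: one stored joint-angle vector (degrees) per posture.
# The five values stand for thumb-to-pinky flexion; the numbers are invented.
TEMPLATES = {
    "fist":  [80.0, 90.0, 90.0, 90.0, 90.0],
    "point": [80.0,  5.0, 90.0, 90.0, 90.0],
    "open":  [ 5.0,  5.0,  5.0,  5.0,  5.0],
}

def match_posture(sample, templates=TEMPLATES, max_distance=40.0):
    """Return the name of the closest template, or None if nothing is close enough."""
    best_name, best_dist = None, float("inf")
    for name, template in templates.items():
        dist = math.dist(sample, template)  # Euclidean distance to this template
        if dist < best_dist:
            best_name, best_dist = name, dist
    # Reject samples that are far from every stored template.
    return best_name if best_dist <= max_distance else None
```

Calling `match_posture([75.0, 10.0, 85.0, 88.0, 92.0])` picks the "point" template, while a half-curled hand such as `[45.0] * 5` is rejected. The per-user calibration the paper mentions would amount to re-recording the template vectors, and since each template is a single static vector it is easy to see why the technique suits postures better than gestures.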

They discuss three learning algorithms: neural networks* and hidden Markov models (HMMs)* (both can recognize large posture/gesture sets, with good accuracy given extensive training), and instance-based learning* (relatively simple to implement, moderately high accuracy for a large set of postures, provides continuous training, memory- and time-intensive as the data set grows, not well researched for this application).
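The trade-offs listed for instance-based learning are easier to see in code. The sketch below uses k-nearest neighbours, which is one common instance-based method; I am assuming that choice for illustration, and the class name, the value of `k`, and the feature vectors are hypothetical rather than taken from the paper.

```python
import math
from collections import Counter

class InstanceBasedRecognizer:
    """k-nearest-neighbour classifier that keeps every labelled example."""

    def __init__(self, k=3):
        self.k = k
        self.instances = []  # list of (feature_vector, label) pairs

    def add_instance(self, features, label):
        # "Training" is just storing the example, so it can continue during use.
        self.instances.append((list(features), label))

    def classify(self, features):
        if not self.instances:
            return None
        # Every query scans all stored instances, so cost grows with the data set.
        nearest = sorted(
            self.instances, key=lambda inst: math.dist(features, inst[0])
        )[: self.k]
        votes = Counter(label for _, label in nearest)
        return votes.most_common(1)[0][0]
```

Because `add_instance` only appends to a list, new examples can be added at any time (the "continuous training" point), but both memory use and the time per `classify` call grow with the number of stored instances.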

They also discuss three miscellaneous techniques: the linguistic approach* (uses a formal grammar to represent the posture and gesture set; a simple approach, but with low accuracy so far), appearance-based motion analysis, and spatio-temporal vector analysis.
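As a rough illustration of what a grammar-based recognizer could look like, here is a toy sketch. It is not the grammar from the cited work: I am assuming each finger is reduced to a binary open/closed state, and the posture names and gesture rules are invented. The rules are regular expressions, which is the simplest kind of formal grammar.

```python
import re

# Hypothetical posture alphabet: one letter per finger, thumb to pinky,
# "O" = open, "C" = closed. The names are made up for illustration.
POSTURES = {
    "CCCCC": "fist",
    "COCCC": "point",
    "OOOOO": "flat",
}

# Hypothetical gesture rules, written as regular expressions over posture names.
GESTURE_RULES = {
    "grab":    re.compile(r"(flat )+(fist )+"),
    "release": re.compile(r"(fist )+(flat )+"),
    "select":  re.compile(r"(point )+(fist )+"),
}

def recognize_gesture(finger_states):
    """Map a sequence of finger-state strings to a gesture name, or None."""
    names = [POSTURES.get(state) for state in finger_states]
    if None in names:
        return None  # a posture outside the alphabet cannot be parsed
    sentence = "".join(name + " " for name in names)
    for gesture, rule in GESTURE_RULES.items():
        if rule.fullmatch(sentence):
            return gesture
    return None
```

For example, `recognize_gesture(["OOOOO", "OOOOO", "CCCCC"])` parses as "grab". Any hand shape that does not map exactly onto a symbol in the alphabet fails to parse at all, which hints at why such a coarse representation could struggle with real glove data.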

*Can be applied to glove data, not only to vision data.

Summary (section 4):

This section discusses applications that use hand postures and gestures: sign language, gesture-to-speech, presentations, virtual environments, 3D modeling, multimodal interaction, human/robot interaction, and television control.

Discussion:

This seems like it could be a very useful paper for an introduction to haptics -- the reader would get an idea of what kinds of tools have been used for recognition and their strengths and weaknesses. The linguistic approach sounds like what we were discussing in class regarding an evolution of LADDER, and while it's slightly worrying that this paper says the accuracy found in other implementations of it so far has been low, it could also be a good contribution to the field if we are in fact able to make it work well. It might be worth looking at the 1994 paper they cite ("A Linguistic Approach to the Recognition of Hand Gestures" -- Hand, Sexton & Mullan).

3 comments:

Brandon said...

are you volunteering that paper :)

i don't think we should be too discouraged by the lack of accuracy achieved by the linguistic approach mentioned in the paper. for one, their grammar simply consisted of whether or not a finger is open or closed. there were no intermediate positions, no abduction measurements, and no palm measurements taken into account. also, they used a power glove, which is the predecessor of the p5 glove. i'm sure the glove didn't help the recognition process.

Grandmaster Mash said...

I think Luke wants you to get your memory checked. You should listen to him, since he's reliable and has a valid name.

I'm also not worried about the lack of accuracy in the linguistic approach. Approaches that are more free-form (linguistic) are almost always going to perform worse than ones that are heavily constrained (templates).

Paul Taele said...

What a nice guy that Luke is, trying to help you with your memory.

The paper really was a nice overview. It could use a slight update now, and perhaps a re-ordering, but it more or less captured many of the ideas in the class. A shame that there hasn't really been major advancement since this paper came out.