Wednesday, January 30, 2008

Online, Interactive Learning of Gestures for Human/Robot Interfaces (Lee & Xu)

This paper presents a system that recognizes gestures and learns new ones online from one or two examples, using HMMs. The system follows a three-step procedure: (1) the user makes a series of gestures; (2) the system segments the data into separate gestures, then either reacts to a gesture it recognizes or asks the user for clarification; and (3) the system adds the new example to its list of seen examples and retrains the HMM on all data seen so far using the Baum-Welch algorithm. They represent each gesture as a one-dimensional sequence of symbols: the data is resampled at even intervals, divided into time windows, and vector-quantized, using a codebook generated offline with the LBG algorithm. Their segmentation process requires the hand to be still for a short time between gestures, though they suggest an acceleration threshold could be useful when the hand does not stop. A simple function provides a confidence measure for each gesture's classification. They tested the system on 14 letters of the sign language alphabet, chosen because they are unambiguous without hand-orientation data, and found 1%-2.4% error after 2 examples and close to zero error after 4 or 6 examples in their two tests. Their future goals include increasing vocabulary size by using 5 dimensions of symbols (one per finger).
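The preprocessing pipeline is easy to sketch. Below is a minimal Python version of the idea; the function names, window size, sample period, and nearest-centroid lookup are my assumptions about the details, not the paper's exact implementation, and I skip the offline LBG codebook training entirely:

```python
import numpy as np

def resample_even(samples, times, period):
    """Linearly interpolate irregularly timed sensor samples onto an even grid."""
    grid = np.arange(times[0], times[-1], period)
    return np.column_stack([np.interp(grid, times, samples[:, d])
                            for d in range(samples.shape[1])])

def to_symbols(samples, times, codebook, period=0.01, window=5):
    """Reduce a raw gesture to a 1-D symbol sequence: resample at even
    intervals, cut into fixed time windows, and vector-quantize each
    window against a codebook (assumed trained offline, e.g. with LBG)."""
    even = resample_even(samples, times, period)
    n = (len(even) // window) * window            # drop the ragged tail
    windows = even[:n].reshape(-1, window * even.shape[1])
    # distance from every window to every codebook vector, nearest wins
    dists = np.linalg.norm(windows[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)                   # symbol = nearest code index
```

The resulting symbol sequence is the kind of discrete observation stream the HMMs would then be trained on with Baum-Welch.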

Discussion:

I am curious how natural pausing between gestures will feel, and in how many applications. As we've discussed in class, applications like sign language might use a very fluid series of gestures. But for some kinds of commands, pauses are probably very natural, unless you want to issue a fast sequence of commands without waiting for confirmation of comprehension between them. I can imagine "corner finding" based on direction and speed could be another useful tool for segmenting gestures into more manageable pieces.
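A stillness-based segmenter along the paper's lines is easy to prototype. Here is a minimal Python sketch; the speed and pause thresholds are illustrative guesses rather than values from the paper:

```python
import numpy as np

def segment_by_pause(positions, period, speed_eps=0.02, min_pause=0.25):
    """Split a position stream (T x D array) into gestures wherever the hand
    speed stays below speed_eps for at least min_pause seconds.
    Returns (start, end) index pairs for each gesture."""
    speed = np.linalg.norm(np.diff(positions, axis=0), axis=1) / period
    moving = speed >= speed_eps
    min_len = int(min_pause / period)
    segments, start, still_run = [], None, 0
    for i, m in enumerate(moving):
        if m:
            if start is None:
                start = i                    # movement begins: open a segment
            still_run = 0
        else:
            still_run += 1
            if start is not None and still_run >= min_len:
                segments.append((start, i - still_run + 1))  # close at pause
                start = None
    if start is not None:
        segments.append((start, len(moving)))
    return segments
```

A "corner finding" variant could additionally split segments at points where direction changes sharply while speed dips, which might help with more fluid input.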

I think resampling at even intervals, as this paper does, will be a very good technique to keep in mind, along with jitter reduction.
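For jitter reduction, even a centered moving average goes a long way. A minimal sketch, where the window size k is an arbitrary illustrative choice:

```python
import numpy as np

def smooth(positions, k=5):
    """Simple jitter reduction: centered moving average over k samples,
    applied independently to each coordinate of a T x D position stream."""
    kernel = np.ones(k) / k
    return np.column_stack([np.convolve(positions[:, d], kernel, mode="same")
                            for d in range(positions.shape[1])])
```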

2 comments:

Brandon said...

i ... naturally ... pause ... between ... every ... thing ... that ... i ... do. ... how ... about ... you?

yeah i'm not sure how user-friendly this system is...

Paul Taele said...

I thought the "corner finding" analog for hand gestures seemed like a good idea, but the Harling & Edwards paper on hand tension gave some convincing arguments against it. Segmentation is easier in sketch recognition because candidate segment points can only occur during the act of drawing. Their argument is that it's more difficult in hand gesture recognition because there is no way to tell when the user is making the motions and when the user isn't.

Hand gesture recognition is hard... :(