Wednesday, January 23, 2008

An Introduction to Hidden Markov Models (Rabiner, Juang)

(note: this paper is not haptics-specific)

Summary:

This paper is a tutorial intended to help readers understand the theory of Markov models and how they have been applied to speech recognition. The authors describe how some data might be modeled by a simple function like a sine wave, with noise causing one of its parameters to vary over time. Within a "short time" window, part of the data can often be modeled with a simpler function than the whole set requires. The whole signal could be modeled by stringing together many such pieces, but it can be more efficient to use a common short-time model for each well-behaved part of the signal along with a characterization of how one part evolves into the next. This leads to hidden Markov models (HMMs). Three problems must be addressed: (1) how to identify a well-behaved period, (2) how the sequentially evolving nature of these periods can be characterized, and (3) what typical or common short-time models should be chosen for each period. The paper focuses on what an HMM is, when it is appropriate, and how to use one.

They define an HMM as a doubly stochastic process in which one of the processes is hidden and can only be observed through another set of stochastic processes that produce the output sequence. They illustrate this with a coin-toss example, where only the outcome of each toss in a series is known, never which coin was tossed. Examples with fair coins (50-50 odds) and biased coins (weighted toward heads or tails) show that, knowing nothing besides the output sequence, the important considerations are: it is hard to determine the number of states in the model, how to choose the model parameters, and how large the training data set needs to be.
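
(To make the coin example concrete, here is a minimal Python sketch of a two-state coin-toss HMM. The state names, transition probabilities, and emission probabilities are all illustrative values I chose, not numbers from the paper; the point is just that an observer sees only the heads/tails sequence, never which coin was tossed.)

```python
import random

# Hypothetical two-coin HMM: the hidden state is which biased coin is
# being tossed; the observation is only the heads/tails outcome.
STATES = ["coin1", "coin2"]
TRANS = {"coin1": {"coin1": 0.7, "coin2": 0.3},   # state transition probs
         "coin2": {"coin1": 0.4, "coin2": 0.6}}
EMIT = {"coin1": {"H": 0.9, "T": 0.1},            # coin1 is biased to heads
        "coin2": {"H": 0.2, "T": 0.8}}            # coin2 is biased to tails

def sample(probs):
    """Draw a key from a dict of probabilities."""
    r, total = random.random(), 0.0
    for key, p in probs.items():
        total += p
        if r < total:
            return key
    return key

def simulate(n_tosses, start="coin1"):
    """Generate (hidden states, observed tosses)."""
    state, states, obs = start, [], []
    for _ in range(n_tosses):
        states.append(state)
        obs.append(sample(EMIT[state]))
        state = sample(TRANS[state])
    return states, obs

hidden, observed = simulate(10)
print("hidden: ", hidden)    # unavailable to the observer
print("observed:", observed) # the only evidence for inferring the model
```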

They define the elements, mechanism, and notation of HMMs so as to pose three canonical problems and their solutions. The first problem is, given an observation sequence and a model, how to compute the probability of the observation sequence (i.e., how to evaluate or score it). The second is how to choose a state sequence that is optimal in some meaningful sense, given an observation sequence (how to uncover the hidden part of the model). The third is how to adjust the model parameters to maximize the probability of the observation sequence (how to explain the training sequence in terms of the parameters). They explain how to solve each problem, with formulae: the forward-backward procedure, the Viterbi algorithm, and Baum-Welch reestimation, respectively. They also note some issues to keep in mind when applying an HMM.
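
(As a rough illustration of Problems 1 and 2, here is a minimal sketch of the forward procedure and the Viterbi algorithm, reusing the illustrative STATES, TRANS, and EMIT tables from the coin sketch above; the uniform INIT distribution is likewise my assumption, not a value from the paper.)

```python
def forward(obs, states, trans, emit, init):
    """Problem 1 (evaluation): P(observation sequence | model) via the
    forward procedure. alpha[s] = P(o_1..o_t, state at time t = s)."""
    alpha = {s: init[s] * emit[s][obs[0]] for s in states}
    for o in obs[1:]:
        alpha = {s: emit[s][o] * sum(alpha[r] * trans[r][s] for r in states)
                 for s in states}
    return sum(alpha.values())

def viterbi(obs, states, trans, emit, init):
    """Problem 2 (decoding): the single most likely hidden state sequence."""
    # delta[s] = (prob of best path ending in state s, that path)
    delta = {s: (init[s] * emit[s][obs[0]], [s]) for s in states}
    for o in obs[1:]:
        new_delta = {}
        for s in states:
            r = max(states, key=lambda r: delta[r][0] * trans[r][s])
            new_delta[s] = (delta[r][0] * trans[r][s] * emit[s][o],
                            delta[r][1] + [s])
        delta = new_delta
    return max(delta.values(), key=lambda v: v[0])

INIT = {"coin1": 0.5, "coin2": 0.5}  # assumed uniform start distribution
obs = ["H", "H", "T", "H", "T"]
print("P(obs | model) =", forward(obs, STATES, TRANS, EMIT, INIT))
print("best path      =", viterbi(obs, STATES, TRANS, EMIT, INIT))
```

(Problem 3, training, is handled by Baum-Welch reestimation, which iteratively adjusts TRANS and EMIT to raise the probability that forward() computes; I've left it out of the sketch for brevity.)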

They give an example of using HMMs for isolated word recognition with a vocabulary of V words, a training set for each word, and an independent testing set. To recognize a word, they build an HMM for each vocabulary word, compute the probability of the observed utterance under each word model, and choose the word whose model gives the highest probability.
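
(A sketch of that recognition scheme, building on the forward() function above. The word_models dictionary and its layout are hypothetical, and training each model via Baum-Welch is assumed to have already happened.)

```python
def recognize(obs, word_models):
    """Score the observation under each word's HMM; return the best word.
    word_models maps word -> (states, trans, emit, init) of a trained HMM."""
    scores = {word: forward(obs, *params) for word, params in word_models.items()}
    return max(scores, key=scores.get)

# Illustrative usage with toy word models (all parameters hypothetical):
# word_models = {"yes": (STATES, TRANS, EMIT, INIT),
#                "no":  (STATES2, TRANS2, EMIT2, INIT2)}
# print(recognize(observed_features, word_models))
```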

Discussion:

I like that the HMM is designed around a degree of randomness and the idea that you cannot directly observe the cause of the output you do see -- this makes it seem like a good tool for situations that match those criteria. I can see parallels in human perception and recognition: we try to make sense of events as they happen, but sometimes their causes are hidden from us, and we are less likely to acquire an incorrect belief if we stay open to the idea that we cannot know everything.

I'm not yet convinced that the math would be quick or easy to implement, but this seems like a fairly beginner-friendly reference for attempting it.

2 comments:

D said...

Your analysis of a HMM's power seems to be right on. They are powerful tools because they average the heck out of observations and state transitions, giving them a good deal of flexibility in the types of things they can recognize.

You shouldn't be convinced that it's easy to implement an HMM. It's not. Aside from the math and complex algorithms, there are things like numeric underflow to consider. There are MATLAB and Java (Brandon mentioned C++) implementations of HMMs out there. Unless you want the academic exercise, don't bother to try and roll your own.

Brandon said...

I agree with Josh - toolkits are the way to go.