of 1,000 words, reasonably
fast response times, and less than a 10% error rate.
The main government contractors
included Carnegie Mellon University (CMU), Stanford Research Institute
(SRI), MIT's Lincoln Laboratory, System Development Corporation
(SDC), and Bolt, Beranek, and Newman (BBN). A few other institutions
also received subcontracts. In the early development
phase (1971-1973), CMU produced the HEARSAY-I and DRAGON systems and
later HEARSAY-II and HARPY
(www-2.cs.cmu.edu/~msiegler/ASR/futureofcmu-final.html).
Note that the lead CMU researchers
for the DRAGON program, James and Janet Baker, pioneered the use of the Hidden
Markov Model (HMM) to recognize continuous speech. An HMM is a
statistical model that uses probability distributions to infer
the most likely word spoken based on the preceding continuous (or discrete)
speech (www.cs.brown.edu/research/ai/dynamics/tutorial/Documents/HiddenMarkovModels.html
and www.ai.mit.edu/~murphyk/Software/HMM/hmm.html).
An HMM maintains a probability distribution
over a set of possible observations for each state, making it possible
to discern words or phonemes (basic sounds of speech) in continuous
or discrete speech.
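To make the HMM idea concrete, here is a minimal sketch of Viterbi decoding, one standard way to recover the most likely sequence of hidden states from a series of observations. It is not the DRAGON implementation. The two phoneme-like states ("AH" and "S"), the three quantized acoustic symbols ("low", "mid", "high"), and every probability below are invented purely for illustration.

```python
def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return (probability, path) for the most likely hidden-state sequence."""
    # best[s] holds the probability and path of the best sequence ending in s.
    best = {
        s: (start_p[s] * emit_p[s][observations[0]], [s])
        for s in states
    }
    for obs in observations[1:]:
        new_best = {}
        for s in states:
            # Pick the predecessor state that maximizes the path probability.
            prob, path = max(
                ((best[prev][0] * trans_p[prev][s], best[prev][1])
                 for prev in states),
                key=lambda item: item[0],
            )
            new_best[s] = (prob * emit_p[s][obs], path + [s])
        best = new_best
    return max(best.values(), key=lambda item: item[0])


# Hypothetical toy model: hidden states stand in for phonemes,
# observations stand in for quantized acoustic measurements.
states = ("AH", "S")
start_p = {"AH": 0.6, "S": 0.4}
trans_p = {
    "AH": {"AH": 0.7, "S": 0.3},
    "S": {"AH": 0.4, "S": 0.6},
}
emit_p = {
    "AH": {"low": 0.5, "mid": 0.4, "high": 0.1},
    "S": {"low": 0.1, "mid": 0.3, "high": 0.6},
}

prob, path = viterbi(["low", "mid", "high"], states, start_p, trans_p, emit_p)
print(path, prob)
```

With these made-up numbers, the decoder settles on the state sequence AH, AH, S with a probability of roughly 0.015. A real recognizer would use many more states, learn its probabilities from recorded speech, and work with log probabilities to avoid numerical underflow.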
The first program developed by
BBN was called SPEECHLIS; its second attempt carried the acronym HWIM
(Hear What I Mean). See www.bbn.com/speech/docs/papers/bbn-slp-w-call-020401.pdf
for a paper on BBN's work.
Five years later, in 1976, DARPA evaluated CMU's HARPY and HEARSAY-I,
along with BBN's HWIM. The speech recognition programs built
cooperatively by SRI and SDC were never tested. The software that came
closest to achieving the SUR project benchmarks (and may even have
surpassed them) was CMU's HARPY. Interestingly, one of
HARPY's developers,
Fil Alleva, now works on speech at Microsoft (http://research.microsoft.com/srg/fil).
Unfortunately, because DARPA did
not define the precise details for testing the systems in advance,
some researchers disputed the test results. Others believed
the project had actually failed to meet its initial objectives, which
led to a great deal of controversy. This conflict caused DARPA to
terminate funding for the program and even cancel a planned five-year
follow-up study.
In 1984, DARPA was once again
funding speech recognition research, this time on a larger scale,
as part of the Strategic Computing Program (see PC AI 16.5 for information
on some of the latest DARPA contracts). Many of the original participants
in the SUR project took part in the new program, including CMU, BBN,
SRI, and MIT. Private firms such as IBM and Dragon
Systems (now ScanSoft - www.scansoft.com) also contributed.
Dragon Systems' roots trace back to the CMU DRAGON program.
This time, to minimize testing
controversies, DARPA and the National Institute of Standards and Technology
(NIST) established a standard-setting process that included government
contractors, industry, and academic groups from around the world.
These improved measurements, and the annual system evaluations (or
bake-offs), helped promote rapid advances in the speech recognition
field, including the incorporation of syntactic and semantic information.
CMU Takes Center Stage
CMU was clearly a hotbed for
DARPA-sponsored speech research in the 1980s and therefore attracted
the best and brightest students. Kai-Fu Lee
(www.microsoft.com/presspass/exec/kaifu/default.asp), a graduate student at
CMU,