By the close of 2002, Bill Dolan, manager of the Natural
Language Processing (NLP) Group at Microsoft Research
(MSR), expects the software titan to deploy a Web-based system capable
of accurately and automatically translating the entire Microsoft Product
Support Services (PSS) Knowledge Base from English into Spanish, providing
real-time responses to Spanish queries. The PSS Knowledge Base is
an enormous collection of information used to identify and solve problems
with Microsoft software.
Accurately and automatically converting this sizable
knowledge base from English to Spanish, without human
editing, represents an incredible leap forward in the world of computational
linguistics. This breakthrough is the fruit of a research effort spanning
more than two decades, which produced an NLP system known internally
at Microsoft as "NLPWin."
The system has already been successfully beta tested.
The NLP Group is now working on English to Japanese,
German, French, and Chinese versions of the NLPWin executable (program).
These systems will likely save the company millions of dollars, and
that is just the beginning.
The NLP Group is exploring how its Machine
Translation (MT) technology can help product divisions
within Microsoft. "Demand for the technology is far outpacing
the capacity of our 30-member research group to satisfy requests,"
says Dolan. One Microsoft product team especially interested in the
technology is the Productivity Tools Group, which has the daunting
task of product "localization."
Localization, the lengthy process of preparing a product
for a foreign market, typically involves translating
the text embedded in the software itself (e.g. menus, message dialog
boxes) as well as the associated user manuals and other help text
into the native language. As you can imagine, this is a considerable
undertaking for a product like Office XP.
The NLP Group's MT technology has the potential
to make this time-consuming, complex, and expensive
procedure fast, simple, and inexpensive.
They also see their innovation eventually being
used by other large corporations in need of translating
sizable bodies of documents quickly and cheaply. Some in the NLP Group
envision a time when a "mega-translator," based on their
technology, will allow Internet users to converse in unrestricted
domains instantaneously. Unleashing this type of communication power
for public use could open an entirely new world of global interactions.
A Little History
To better appreciate and understand the implications
of Microsoft's Machine Translation breakthrough, it
is helpful to briefly examine the evolution of the field. The quest
for accurate, automatic, on-the-fly MT has been the Holy Grail of
leading computational linguistics and AI researchers for over fifty
years. The effort began when Warren Weaver, then director of the Rockefeller
Foundation's Natural Sciences Division, wrote a 1949 memorandum to 200 top scientists, suggesting
that computers could be programmed to translate language mathematically,
without actually "understanding" the meaning of words. This
seminal 12-page memo literally launched the field of MT.
Within a couple of years, MT efforts were underway
at UCLA, the National Bureau of Standards, the University
of Washington, the Rand Corporation, and MIT. In 1953, a Georgetown
University team worked with IBM to actually create the first working
MT program, which translated Russian into English; the language choices
were no doubt inspired by the Cold War atmosphere of that era. On
January 7, 1954, the Georgetown team unveiled the MT program publicly
at IBM's Technical Computing Bureau in New York. Despite the fact
that it was limited to just 250 words, 6 grammar rules, and 49 handpicked
sentences, the idea of MT caught fire in the press.