"The goal of NLPWin is to enable the machine to produce an internal representation that corresponds to what we understand in our minds when we hear natural language. This is the key - the understanding of natural language leads to intelligence. I do not think humans become intelligent just through natural language. I think as we are born we take in all kinds of sensor input. We have emotions that are just native to react with; we learn by being immersed in this environment; language comes along and we put the symbols on these experiences. Machines are not like that; if they are going to become intelligent, it is going to have to be some other way. Therefore, our way is through experience and their way is through symbol manipulation. We put symbols on our experience and machines are going to have to learn to put experiences on the symbols," says Karen Jensen, former manager of the NLP Group.
This bottom-up vision for building intelligent machines flies in the face of large-scale top-down AI efforts such as the 18-year-old CYC project pioneered by AI legend Doug Lenat. A second major area of difference between NLPWin and CYC is self-training. The NLP Group strongly believes that CYC's handcrafting is counterproductive: every time CYC encounters new vocabulary, it requires more hand coding to surgically implant the new knowledge, which slows development and can introduce conflicting information. NLPWin, by contrast, automatically assimilates the meaning of words from the text.

Assimilating the Meaning of Words from the Text
This process involves a series of successive stages, beginning with a very rudimentary analysis of how words connect together to form grammatically correct sentences. It then explores the deeper structures in the language in an attempt to attach meanings to the words and sentences in the context of the world. As shown in Figure 1, the system's first component breaks up, or parses, the words, arranging them in a tree-like structure.
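To make the idea of a tree-like structure concrete, here is a minimal sketch in Python. It is not NLPWin's actual representation: the Node class, the labels (SENTENCE, NOUN_PHRASE, and so on), and the hand-built example sentence are all assumptions chosen purely for illustration.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Node:
    """One node of a hypothetical parse tree (not NLPWin's real format)."""
    label: str                     # grammatical category, e.g. "NOUN"
    word: str = ""                 # filled in only for leaf nodes
    children: List["Node"] = field(default_factory=list)


def leaves(node: Node) -> List[str]:
    """Collect the words of the sentence, left to right."""
    if not node.children:
        return [node.word]
    words: List[str] = []
    for child in node.children:
        words.extend(leaves(child))
    return words


def render(node: Node, depth: int = 0) -> List[str]:
    """Return the tree as indented lines, one node per line."""
    text = node.label + (f" {node.word}" if node.word else "")
    lines = ["  " * depth + text]
    for child in node.children:
        lines.extend(render(child, depth + 1))
    return lines


# A hand-built tree for "the dog jumped"; a real parser would derive this.
tree = Node("SENTENCE", children=[
    Node("NOUN_PHRASE", children=[Node("DET", "the"), Node("NOUN", "dog")]),
    Node("VERB_PHRASE", children=[Node("VERB", "jumped")]),
])

if __name__ == "__main__":
    print("\n".join(render(tree)))
```

Later stages of a pipeline like the one described could then walk such a tree rather than the flat string of words.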
The next component, Morphology, identifies the various forms of a word. For example, the root word jump has several variants, or morphs, such as jumping, jumped, and jumps. By storing just the root word jump and retaining the capacity to recognize its other morphs, the system saves approximately one half of the space it would otherwise require to store all variations of English words. The savings are even greater for other languages, such as Spanish, Arabic, and Japanese, where they can run as much as three to four times.
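The root-plus-morphs idea can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the Morphology component's actual method: the suffix list, the tiny ROOTS set, and the find_root function are hypothetical, and real English morphology needs many more rules and an exception list.

```python
from typing import Optional

# Hypothetical suffix list, tried in order. Not NLPWin's actual rules.
SUFFIXES = ["ing", "ed", "s"]

# Only root words are stored; inflected forms are recognized on demand.
ROOTS = {"jump", "walk", "parse"}


def find_root(word: str, roots=ROOTS) -> Optional[str]:
    """Return the stored root for `word`, or None if it is not recognized."""
    word = word.lower()
    if word in roots:
        return word
    for suffix in SUFFIXES:
        if word.endswith(suffix):
            candidate = word[: len(word) - len(suffix)]
            if candidate in roots:
                return candidate
            # Allow for a final "e" dropped before the suffix (parsing -> parse).
            if candidate + "e" in roots:
                return candidate + "e"
    return None


if __name__ == "__main__":
    for form in ["jump", "jumping", "jumped", "jumps", "parsing", "jumper"]:
        print(form, "->", find_root(form))
```

With this layout the dictionary holds one entry for jump instead of four, which is the kind of saving the paragraph above describes; the cost is a small amount of work at lookup time.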

Microsoft Natural Language Dictionary (MIND) Component
Joseph Pentheroudakis designed the Morphology component and the Microsoft Natural Language Dictionary (MIND), which was originally built using two different machine-readable dictionaries, the Longman Dictionary of Contemporary English and the American Heritage Dictionary. Although NLPWin uses dictionaries to train itself, "the parser has not been specifically tuned to process dictionary definitions. All enhancements to the parser are geared to handle the immense variety of general text, of which dictionary definitions are simply a modest subset," says Pentheroudakis.

Figure 2: The conceptual view of how words interlock in MindNet.



16.4 2002

PC AI Magazine - PO Box 30130 Phoenix, AZ 85046 - Voice: 602.971.1869 Fax: 602.971.2321
e-mail: info@pcai.com - Comments? webmaster@pcai.com