AIML

The Artificial Intelligence Mark-up Language (AIML) is a language derived from XML that is used to manage the knowledge of chatbots and other conversational bots (virtual robots). The language uses about twenty basic tags. AIML was developed by Richard Wallace between 1995 and 2002.

Overview of AIML

Adapted from an article by Dr. Richard S. Wallace.

AIML, or Artificial Intelligence Mark-up Language, lets people enter knowledge into chat robots based on the A.L.I.C.E. free software technology.

AIML was developed by the Alicebot free software community and myself from 1995 to 2000. It was originally adapted from a non-XML grammar that was also called AIML, and it formed the basis of the first Alicebot robot, A.L.I.C.E., the Artificial Linguistic Internet Computer Entity.

AIML describes a class of data objects called AIML objects and partially describes the behavior of the programs that process them. AIML objects are made up of units called topics and categories, which contain either parsed or unparsed data.

Parsed data is made up of characters, some of which form plain character data and others of which form AIML elements. AIML elements encapsulate the stimulus-response knowledge contained in the document. Character data inside these elements is sometimes parsed by an AIML interpreter and sometimes left unparsed, to be processed later by the system.

Categories

The basic unit of knowledge in AIML is called a category. Each category consists of an input question, an answer, and an optional context. The question, or stimulus, is called the pattern. The answer is called the template. The two types of optional context are called "that" and "topic". The AIML pattern language is simple, consisting only of words, spaces, and the wildcard characters _ and *. Words may be made up of letters and digits, but no other characters. The pattern language is case-insensitive. Words are separated by a single space, and the wildcard characters function like words.
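The pattern language described above can be sketched in a few lines of Python. This is only an illustration, not the Alicebot implementation: the names `normalize` and `match` are hypothetical, each wildcard is assumed to stand for one or more whole words, and any character outside A-Z and 0-9 is treated as a separator.

```python
import re

def normalize(text):
    """Uppercase the input and split it into words of letters and digits."""
    return re.sub(r"[^A-Z0-9]+", " ", text.upper()).split()

def match(pattern, words):
    """Return the wildcard captures if the pattern matches, else None.

    Assumption: each wildcard (_ or *) stands for one or more whole words.
    """
    tokens = pattern.split()

    def walk(ti, wi):
        if ti == len(tokens):
            # The pattern is used up: succeed only if the input is too.
            return [] if wi == len(words) else None
        if wi == len(words):
            return None
        token = tokens[ti]
        if token in ("_", "*"):
            # Try every possible length for the wildcard, shortest first.
            for end in range(wi + 1, len(words) + 1):
                rest = walk(ti + 1, end)
                if rest is not None:
                    return [" ".join(words[wi:end])] + rest
            return None
        if token == words[wi]:
            return walk(ti + 1, wi + 1)
        return None

    return walk(0, 0)
```

For instance, `match("DO YOU KNOW WHO * IS", normalize("Do you know who Socrates is?"))` captures the single word bound to the wildcard, while a pattern with no wildcard yields an empty capture list on success.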

The first versions of AIML allowed only one wildcard character per pattern. The AIML 1.01 standard allows multiple wildcards in each pattern, but the language is designed to be as simple as possible, simpler even than regular expressions. The template is the answer written in AIML. In its simplest form, the template consists only of plain, unmarked text.

More generally, AIML tags turn the answer into a small computer program that can save data, launch other programs, give conditional answers, and recursively call the pattern matcher to insert the answers from other categories. Most AIML tags in fact belong to this template-side sublanguage.

AIML currently supports two ways of interfacing with other languages and systems. The <system> tag runs any program accessible as an operating system command and inserts the results into the answer. Similarly, the <javascript> tag allows scripting inside the templates.

The optional context part of the category consists of two variants, called <that> and <topic>. The <that> tag appears inside the category, and its pattern must match the robot's last utterance. Remembering the last utterance is important if the robot asks a question. The <topic> tag appears outside the category and groups a set of categories together. The topic can be set inside any template.

AIML is not exactly a simple database of questions and answers. It is a language for modeling a question-answering system by pattern matching, and it is much simpler than something like SQL. However, AIML includes recursion in its processing of an answer, which can be hard to grasp at first. A category template can contain the recursive <srai> tag, so that an answer depends not only on the one matched category, but also on all the others recursively reached through <srai>.

Recursion

AIML implements recursion with the <srai> operator. No agreement exists about the meaning of the acronym. The "A.I." stands for artificial intelligence, but "S.R." may mean "stimulus-response," "syntactic rewrite," "symbolic reduction," "simple recursion," or "synonym resolution." The disagreement over the acronym reflects the variety of applications of <srai> in AIML. Each of these is described in more detail in a subsection below:

  1. Symbolic reduction: Reduce complex grammatical forms to simpler ones.
  2. Divide and conquer: Split an input into two or more subparts, and combine the answers to each.
  3. Synonyms: Map the different ways of saying the same thing to the same answer.
  4. Spelling or grammar corrections.
  5. Detecting keywords anywhere in the input.
  6. Conditionals: Certain forms of branching can be implemented with <srai>.
  7. Any combination of (1) - (6).

The danger of <srai> is that it allows the botmaster to create infinite loops. Although it poses some risk to novice programmers, we judged that <srai> was much simpler to use than the iterative, block-structured control tags that might have replaced it.
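The recursive behavior described above, including the infinite-loop risk, can be sketched as follows. This is a simplified stand-in rather than a real AIML interpreter: templates are written as Python functions, `ask` plays the role of <srai>, categories are tried in list order instead of by AIML's matching priority, `MAX_DEPTH` is an assumed safety limit that AIML itself does not define, and the reply texts for WHO IS SOCRATES and YES are invented for the example.

```python
import re

MAX_DEPTH = 20  # assumed guard against runaway recursion; not part of AIML

def normalize(text):
    """Uppercase the input and keep only words of letters and digits."""
    return " ".join(re.sub(r"[^A-Z0-9]+", " ", text.upper()).split())

def to_regex(pattern):
    """Turn an AIML-style pattern into a regex; a wildcard is one or more words."""
    parts = [r"(.+)" if tok in ("_", "*") else re.escape(tok)
             for tok in pattern.split()]
    return re.compile("^" + " ".join(parts) + "$")

# (pattern, template). Each template receives the wildcard capture and `ask`,
# which feeds new text back into the matcher the way <srai> does.
CATEGORIES = [
    ("DO YOU KNOW WHO * IS", lambda star, ask: ask("WHO IS " + star)),  # symbolic reduction
    ("WHO IS SOCRATES", lambda star, ask: "Socrates was a Greek philosopher."),
    ("HI", lambda star, ask: ask("HELLO")),                             # synonym
    ("HELLO", lambda star, ask: "Hi there!"),
    ("YES", lambda star, ask: "I see."),
    ("YES *", lambda star, ask: ask("YES") + " " + ask(star)),          # divide and conquer
    ("PING", lambda star, ask: ask("PONG")),                            # deliberate loop
    ("PONG", lambda star, ask: ask("PING")),
]
COMPILED = [(to_regex(pattern), template) for pattern, template in CATEGORIES]

def respond(text, depth=0):
    """Answer the input, following <srai>-style redirections recursively."""
    if depth > MAX_DEPTH:
        raise RecursionError("srai loop detected")
    words = normalize(text)
    for regex, template in COMPILED:
        found = regex.match(words)
        if found:
            star = found.group(1) if found.groups() else ""
            return template(star, lambda new_text: respond(new_text, depth + 1))
    return "I have no answer for that."
```

With these categories, `respond("Yes, hi")` combines the answers to "YES" and "HI", while `respond("ping")` bounces between the two looping categories until the depth guard raises `RecursionError`.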

Symbolic reduction

Symbolic reduction refers to the process of simplifying complex grammatical forms into simpler ones. Usually, the atomic patterns in the categories that store the robot's knowledge are stated in the simplest possible terms. For example, we tend to prefer patterns like "WHO IS SOCRATES" to ones like "DO YOU KNOW WHO SOCRATES IS" when storing biographical information about Socrates.

Most of the more complex forms are reduced to simpler forms by using AIML categories designed for symbolic reduction:

  <category>
  <pattern>DO YOU KNOW WHO * IS</pattern>
  <template><srai>WHO IS <star/></srai></template>
  </category>

Whatever input matches this pattern, the part bound to the wildcard * can be inserted into the answer with the <star/> tag. This category reduces any input of the form "Do you know who X is?" to "Who is X?"

Divide and conquer

Many sentences can be reduced to two or more subsentences, and the answer formed by combining the answers to each. A sentence beginning with the word "Yes", for example, if it has more than one word, can be treated as the subsentence "Yes." plus whatever follows it.

  <category>
  <pattern>YES *</pattern>
  <template><srai>YES</srai> <sr/></template>
  </category>

The <sr/> tag is simply an abbreviation for <srai><star/></srai>.

Synonyms

The AIML 1.01 standard does not allow more than one pattern per category. Synonyms are perhaps the most common use of <srai>. Many ways of saying the same thing are reduced to one category, which contains the answer:

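  <category>
  <pattern>HELLO</pattern>
  <template>Hi there!</template>
  </category>

  <category>
  <pattern>HI</pattern>
  <template><srai>HELLO</srai></template>
  </category>

  <category>
  <pattern>HI THERE</pattern>
  <template><srai>HELLO</srai></template>
  </category>

  <category>
  <pattern>HOWDY</pattern>
  <template><srai>HELLO</srai></template>
  </category>

  <category>
  <pattern>HOLA</pattern>
  <template><srai>HELLO</srai></template>
  </category>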

Spelling and grammar correction

The single most common client spelling error is the use of "your" when "you're" or "you are" is intended. But not every occurrence of "your" should be turned into "you're". A little grammatical context is usually necessary to catch this error:

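  <category>
  <pattern>YOUR A *</pattern>
  <template>I think you mean "you're" or "you are" not "your." <srai>YOU ARE A <star/></srai></template>
  </category>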

Here the robot both corrects the client's input and acts as a language tutor.

Keywords

Frequently we would like to write an AIML template that is activated by the appearance of a keyword anywhere in the input sentence. The general format of the four AIML categories is illustrated by this example borrowed from ELIZA:

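  <category>
  <pattern>MOTHER</pattern>
  <template>Tell me more about your family.</template>
  </category>

  <category>
  <pattern>_ MOTHER</pattern>
  <template><srai>MOTHER</srai></template>
  </category>

  <category>
  <pattern>MOTHER _</pattern>
  <template><srai>MOTHER</srai></template>
  </category>

  <category>
  <pattern>_ MOTHER *</pattern>
  <template><srai>MOTHER</srai></template>
  </category>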

The first category both detects the keyword when it appears by itself and provides the generic answer. The second category detects the keyword as the suffix of a sentence. The third detects it as the prefix of an input sentence, and the last category detects the keyword as an infix. Each of the last three categories uses <srai> to link to the first, so that all four cases produce the same answer, but it needs to be written and stored only once.
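The four keyword cases can be checked with a short sketch. It assumes that each wildcard covers one or more whole words and that patterns are tried in the order listed; the helper names are hypothetical and the matching is done with Python regular expressions rather than a real AIML pattern matcher.

```python
import re

# The four patterns: keyword alone, as suffix, as prefix, and as infix.
KEYWORD_PATTERNS = ["MOTHER", "_ MOTHER", "MOTHER _", "_ MOTHER *"]

def to_regex(pattern):
    """Turn an AIML-style pattern into a regex; a wildcard is one or more words."""
    parts = [r"(?:.+)" if tok in ("_", "*") else re.escape(tok)
             for tok in pattern.split()]
    return re.compile("^" + " ".join(parts) + "$")

def matching_pattern(sentence):
    """Return the first keyword pattern that matches the sentence, or None."""
    words = " ".join(re.sub(r"[^A-Z0-9]+", " ", sentence.upper()).split())
    for pattern in KEYWORD_PATTERNS:
        if to_regex(pattern).match(words):
            return pattern
    return None
```

Because the separators between words are part of each regular expression, the keyword only matches as a whole word: an input such as "I like grandmothers" matches none of the four patterns.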

Conditionals

It is possible to write conditional branches in AIML using only the <srai> tag. Consider three categories:

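  <category>
  <pattern>WHO IS HE</pattern>
  <template><srai>WHOISHE <get name="he"/></srai></template>
  </category>

  <category>
  <pattern>WHOISHE *</pattern>
  <template>He is <get name="he"/>.</template>
  </category>

  <category>
  <pattern>WHOISHE UNKNOWN</pattern>
  <template>I don't know who he is.</template>
  </category>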

Provided that the predicate "he" is initialized to "UNKNOWN", the categories execute a conditional branch depending on whether "he" has been set. As a convenience to the botmaster, AIML also provides the equivalent function through the <condition> tag.
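The branch-by-rewriting idea can be sketched as below. The dictionary `predicates` stands in for the robot's stored predicates, and `srai` and `who_is_he` are hypothetical helper names; the sketch assumes that an exact pattern takes precedence over the wildcard pattern.

```python
# Two target categories: an exact one for the unset case, a wildcard one otherwise.
CATEGORIES = {
    "WHOISHE UNKNOWN": lambda predicates: "I don't know who he is.",
    "WHOISHE *": lambda predicates: "He is " + predicates["he"] + ".",
}

def srai(text, predicates):
    """Re-submit text to the categories; an exact pattern wins over the wildcard."""
    if text in CATEGORIES:
        return CATEGORIES[text](predicates)
    first_word = text.split(" ", 1)[0]
    return CATEGORIES[first_word + " *"](predicates)

def who_is_he(predicates):
    """Template of WHO IS HE: rewrite the input using the value of 'he'."""
    value = predicates.get("he", "UNKNOWN")  # assumed default when never set
    return srai("WHOISHE " + value.upper(), predicates)
```

No explicit if/else chooses the answer text here: the value of the predicate becomes part of the rewritten input, and ordinary pattern matching selects the branch.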

Targeting

Robot personalities are created and shaped through a cyclical process of supervised learning called targeting. Targeting is a cycle involving the client, the robot, and the botmaster, in which client inputs that find no complete match among the categories are logged by the robot and delivered as targets to the botmaster, who then creates suitable responses, starting with the most common queries. The targeting cycle produces a progressively more refined robot personality.
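The bookkeeping behind this cycle can be sketched as a frequency count over unmatched inputs. This is an assumption about one way to do it, not a description of the Alicebot tools: the function name `targets` is hypothetical, the log is assumed to hold already normalized inputs, and matching is reduced to exact lookup for brevity.

```python
from collections import Counter

def targets(log, known_patterns, top=3):
    """Return the most frequent logged inputs that matched no category."""
    unmatched = Counter(line for line in log if line not in known_patterns)
    return unmatched.most_common(top)
```

The botmaster would then write new categories for the inputs at the head of the returned list, since those are the most common queries without an answer.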

The art of writing AIML is most apparent in creating default categories, which provide noncommittal answers to a wide range of inputs. Here is a very nice example found on the alicebot-general mailing list:

User: Where can I find the Web page with the Australopithecus?

Alice: Where do you want to find it?

User: Bad answer Alice, it is time to learn a new line

Alice: How would you answer "Where can I find the Web page with the Australopithecus?"

User: I would say: “http://www.ancientskulls.net/australopithecus.html”

Alice: Thank you, human. Try asking me again.

User: Where can I find the Web page with the Australopithecus?

Alice: http://www.ancientskulls.net/australopithecus.html

This is a very nice example. It is exactly the same as targeting when the botmaster is the teacher. Or, I should say, it is an AIML interface to targeting. Of course this scenario raises many problems, such as how to weed out intruders and other suppliers of garbage, which might mean enabling learning only for a limited time, even though it is the volunteer work of many chatters that could help the robot grow wise quickly.

That is the crux of the learning problem. It all comes down to editing. Either the botmaster edits good-quality answers to begin with, or the robot is opened to the public and the teacher must then edit out all the poor answers supplied to the robot by the volunteers. My personal view is that, compared with a hypothetical learning machine able to learn language like a child, the AIML targeting approach is more efficient.

Context

The keyword "that" in AIML refers to the robot's previous utterance. Specifically, if the robot answers with several sentences, the value of "that" is the last sentence in the sequence. The choice of the keyword "that" is motivated by its use in ordinary language:

R: Today is yesterday.

C: That makes no sense.

R: The answer is 3.1412926 approximately.

C: That is cool.

In AIML, the syntax <that>...</that> encloses a pattern that matches the robot's previous utterance. A common application of <that> is found in yes-no questions:

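  <category>
  <pattern>YES</pattern>
  <that>DO YOU LIKE MOVIES</that>
  <template>What is your favorite movie?</template>
  </category>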

This category is activated when the client says YES. The robot must find out what the client is saying yes to. If the robot asked "Do you like movies?", this category matches, and the answer "What is your favorite movie?" continues the conversation.
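A minimal sketch of this lookup follows. It assumes that "that" is the last sentence of the robot's previous reply, that a category without a <that> behaves as if its <that> were the wildcard *, and that sentences end in ".", "!" or "?". The function names are hypothetical, and the reply "I see." is an invented default.

```python
import re

CATEGORIES = {
    # (input pattern, <that> pattern) -> template
    ("YES", "DO YOU LIKE MOVIES"): "What is your favorite movie?",
    ("YES", "*"): "I see.",
}

def normalize(text):
    """Uppercase the text and keep only words of letters and digits."""
    return " ".join(re.sub(r"[^A-Z0-9]+", " ", text.upper()).split())

def last_sentence(reply):
    """Return the final sentence of a reply that may hold several sentences."""
    sentences = [s for s in re.split(r"[.!?]+", reply) if s.strip()]
    return sentences[-1] if sentences else ""

def respond(user_input, previous_reply):
    """Pick the answer using both the input and the robot's previous utterance."""
    words = normalize(user_input)
    that = normalize(last_sentence(previous_reply))
    if (words, that) in CATEGORIES:
        return CATEGORIES[(words, that)]
    return CATEGORIES.get((words, "*"), "I have no answer for that.")
```

The same input "Yes" therefore produces different answers depending on what the robot said just before.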

An interesting application of <that> is a set of categories that lets the robot respond to knock-knock jokes. The categories:

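  <category>
  <pattern>KNOCK KNOCK</pattern>
  <template>Who is there?</template>
  </category>

  <category>
  <pattern>*</pattern>
  <that>WHO IS THERE</that>
  <template><person/> who?</template>
  </category>

  <category>
  <pattern>*</pattern>
  <that>* WHO</that>
  <template>Ha ha very funny, <get name="name"/>.</template>
  </category>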

produce the following dialog:

C: Knock knock.

R: Who is there?

C: Banana.

R: Banana who?

C: Knock knock.

R: Who is there?

C: Banana.

R: Banana who?

C: Knock knock.

R: Who is there?

C: Orange.

R: Orange who?

C: Orange you glad I didn't say banana.

R: Ha ha very funny, Nancy.

Internally, the AIML interpreter stores the input pattern, the that pattern, and the topic pattern along a single path, like: INPUT <that> THAT <topic> TOPIC. When the values of <that> or <topic> are not specified, the program implicitly sets the corresponding THAT or TOPIC pattern to the wildcard *.

The first part of the path to match is the input. If more than one category has the same input pattern, the program can distinguish between them according to the value of <that>. If two or more categories have the same <pattern> and <that>, the final step is to choose the answer based on the <topic>.
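The matching order described above can be sketched as follows. This is a simplification under stated assumptions: the functions `category_path` and `choose` are hypothetical, wildcards are only supported as a whole THAT or TOPIC segment, the input pattern is compared by exact equality, and the categories and replies in the usage note are invented.

```python
def category_path(pattern, that="*", topic="*"):
    """Build the single path under which a category is stored."""
    return f"{pattern} <that> {that} <topic> {topic}"

def choose(categories, user_input, that, topic):
    """Select a template by input first, then by <that>, then by <topic>."""
    best = None
    for category in categories:
        cat_that = category.get("that", "*")     # unspecified <that> means *
        cat_topic = category.get("topic", "*")   # unspecified <topic> means *
        if category["pattern"] != user_input:
            continue
        if cat_that not in ("*", that) or cat_topic not in ("*", topic):
            continue
        # A specific <that> outranks a specific <topic>; both outrank wildcards.
        rank = (cat_that != "*", cat_topic != "*")
        if best is None or rank > best[0]:
            best = (rank, category)
    return best[1]["template"] if best else None
```

For example, given three categories that all have the pattern YES, one with a <that>, one with a topic, and one with neither, the category with the matching <that> is preferred, then the one with the matching topic, and the plain category is the fallback.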

This structure suggests a design rule: never use <that> unless you have written two categories with the same <pattern>, and never use <topic> unless you have written two categories with the same <pattern> and <that>. Still, one of the most useful applications of <topic> is to create subject-dependent "pickup lines," like:

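  <topic name="CARS">
  <category>
  <pattern>*</pattern>
  <template>
  <random>
  <li>What's your favorite car?</li>
  <li>What kind of car do you drive?</li>
  <li>Do you get a lot of parking tickets?</li>
  <li>My favorite car is one with a driver.</li>
  </random>
  </template>
  </category>
  </topic>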

Considering the vast size of the set of things people could say that are grammatically correct or semantically meaningful, the number of things people actually do say is surprisingly small. Steven Pinker, in his book How the Mind Works, wrote: "Say you have ten choices for the first word to begin a sentence, ten choices for the second word (yielding 100 two-word beginnings), ten choices for the third word (yielding a thousand three-word beginnings), and so on. (Ten is in fact the approximate geometric mean of the number of word choices available at each point in assembling a grammatical and sensible sentence). A little arithmetic shows that the number of sentences of 20 words or less (not an unusual length) is about 10^20."
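The arithmetic in the quotation can be checked directly. The sketch below simply sums the number of possible sentences of each length from 1 to 20 words under the quoted assumption of ten choices per word; the variable names are illustrative.

```python
CHOICES_PER_WORD = 10   # the quoted geometric mean of choices at each position
MAX_LENGTH = 20         # sentences of 20 words or less

# Number of possible sentences of exactly n words is CHOICES_PER_WORD ** n.
total = sum(CHOICES_PER_WORD ** n for n in range(1, MAX_LENGTH + 1))
```

The sum is dominated by its last term, 10^20, which is why the quotation rounds the total to about 10^20.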

Fortunately for chat robot programmers, Pinker's calculations are way off. Our experiments with A.L.I.C.E. indicate that the number of choices for the "first word" is more than ten, but it is only about two thousand. Specifically, about 2000 words cover 95% of all the first words input to A.L.I.C.E. The number of choices for the second word is only about two. To be sure, there are some first words ("I" and "you" for example) that have many possible second words, but the overall average is just under two words. The average branching factor decreases with each successive word.

We have plotted some beautiful images of the contents of the A.L.I.C.E. brain represented by this graph (http://alice.sunlitsurf.com/documentation/gallery/). More than just elegant pictures of the A.L.I.C.E. brain, these spiral images outline a territory of language that has been effectively "conquered" by A.L.I.C.E. and AIML.

No other theory of natural language processing can better explain or reproduce the results within our territory. You do not need a complex theory of learning, neural nets, or cognitive models to explain how to chat within the limits of A.L.I.C.E.'s 25,000 categories. Our stimulus-response model is as good a theory as any other for these cases, and certainly the simplest. If there is any room left for "higher" theories of natural language, it lies outside the map of the A.L.I.C.E. brain.

Academics are fond of inventing linguistic riddles and paradoxes that supposedly show how difficult the natural language problem is. "John saw the mountains flying over Zurich" or "Fruit flies like a banana" are meant to reveal the ambiguity of language and the limits of an A.L.I.C.E.-style pattern approach (though not with these particular examples, of course, since A.L.I.C.E. already knows them). In the years to come we will only advance the frontier further. The basic outline of the spiral graph may look more or less the same, because we have found all the "big trees" from "A *" to "YOUR *". These trees may become bigger, but unless language itself changes we will not find any more big trees (except of course in foreign languages). The work of those seeking to explain natural language in terms of something more complex than stimulus-response will take place beyond our frontier, more and more in the hinterlands occupied by only the rarest forms of language. Our territory of language already contains the highest population of sentences that people use. Expanding the borders even more, we will continue to absorb the stragglers outside, until the very last human critic cannot think of one sentence to "fool" A.L.I.C.E.

See also

External references

  • The Alicebot site
  • Character Builder - Flash Misadventures IA - Netsbrain - Character Builder
