See also: Analyzer
Syntactic analysis, or parsing, consists of exposing the structure of a text, generally a computer program or a text written in a natural language. A parser is a computer program that carries out this task. The operation presupposes a formalization of the text, which is generally viewed as an element of a formal language defined by a set of syntax rules that together form a formal grammar. The structure revealed by the analysis then shows precisely how the rules of syntax are combined in the text. This structure is often a hierarchy of phrases (syntagms), representable by a syntax tree whose nodes can be decorated (annotated with additional information).
Syntactic analysis usually follows a lexical analysis, which splits the text into a stream (sometimes a directed acyclic graph, or DAG) of lexemes, and in turn serves as a prerequisite for semantic analysis. Knowing the syntactic structure of a statement makes it possible to make explicit the dependency relations (for example between subject and object) among its lexemes, and then to build a representation of the meaning of the statement.
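The lexical step described above can be illustrated with a minimal sketch. The token classes (`NUMBER`, `IDENT`, `OP`) and the function name `tokenize` are assumptions chosen for a small arithmetic language, not part of any particular tool; the sketch produces a flat stream of lexemes, not a DAG.

```python
import re

# Hypothetical token classes for a small arithmetic language (an assumption).
TOKEN_SPEC = [
    ("NUMBER", r"\d+"),
    ("IDENT", r"[A-Za-z_]\w*"),
    ("OP", r"[+\-*/()]"),
    ("SKIP", r"\s+"),
]
TOKEN_RE = re.compile(
    "|".join(f"(?P<{name}>{pattern})" for name, pattern in TOKEN_SPEC)
)

def tokenize(text):
    """Split text into a list of (kind, lexeme) pairs, dropping whitespace."""
    tokens = []
    pos = 0
    while pos < len(text):
        match = TOKEN_RE.match(text, pos)
        if match is None:
            raise SyntaxError(
                f"unexpected character {text[pos]!r} at position {pos}"
            )
        if match.lastgroup != "SKIP":
            tokens.append((match.lastgroup, match.group()))
        pos = match.end()
    return tokens
```

A parser would then consume this list of lexemes instead of raw characters.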
The methods used to carry out a syntactic analysis depend largely on the formalism chosen for the syntax of the language, but also on the language itself. Nevertheless, rewriting grammars are often used to model a natural or formal language, and the most popular of these are the context-free grammars.
Thus, programming languages are usually described by such grammars, and have been since the formalization of Algol in BNF. Likewise, although context-free grammars are considered poorly suited to describing natural languages, the parsing algorithms invented for context-free languages can sometimes be adapted to the more complex formalisms used in natural language processing, such as tree-adjoining grammars (TAG).
The equivalence between the languages definable by certain classes of grammars and those recognized by certain classes of automata makes it possible to build parsers from automata. Thus, the languages definable by a context-free grammar are exactly those recognizable by a pushdown automaton.
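As a small illustration of this correspondence, the sketch below simulates a pushdown automaton for the language of well-nested brackets, which is generated by the context-free grammar S → ε | ( S ) S | [ S ] S. The grammar, the alphabet, and the name `accepts` are assumptions chosen for the example.

```python
# Each closing bracket is paired with the opening bracket it must match.
PAIRS = {")": "(", "]": "["}

def accepts(text):
    """Return True if text is a well-nested bracket string."""
    stack = []
    for symbol in text:
        if symbol in PAIRS.values():
            stack.append(symbol)      # push on an opening bracket
        elif symbol in PAIRS:
            if not stack or stack.pop() != PAIRS[symbol]:
                return False          # missing or mismatched opener
        else:
            return False              # symbol outside the alphabet
    return not stack                  # accept only on an empty stack
```

The stack is what lets the automaton remember an unbounded number of pending openers, which a finite automaton cannot do.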
A parser must retrace the sequence of syntax-rule applications that led from the axiom (the start symbol of the grammar) to the analyzed text.
A parser, viewed as a rewriting system, is deterministic if only one rewriting rule is applicable in each configuration of the analyzer. By extension, there can then be only one sequence of rules that analyzes the text in its entirety, and so the text cannot be syntactically ambiguous. Techniques such as lookahead or backtracking may nevertheless be used to determine which rule to apply at a given point of the analysis.
Deterministic parsing methods are mainly used for programming languages. For example, LR, LL, and LALR parsing (the last being used by Yacc) are all deterministic. It is not possible, however, to build a deterministic parser for every context-free grammar. In that case, if a single analysis is wanted as output, additional mechanisms must be added to the parser, such as disambiguation rules or probabilistic models for choosing the "best" analysis.
A parsing method that is both top-down and deterministic is called predictive.
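A minimal sketch of a predictive parser follows, written as a recursive-descent parser that chooses each rule from a single token of lookahead and never backtracks. The grammar (expr → term (("+" | "-") term)*, term → factor (("*" | "/") factor)*, factor → NUMBER | "(" expr ")"), the class name `Parser`, and the tuple-based tree format are assumptions made for the example. The inline lexer is deliberately naive and silently skips characters it does not recognize.

```python
import re

class Parser:
    def __init__(self, text):
        # Naive lexer (an assumption): integers and one-character operators.
        self.tokens = re.findall(r"\d+|[-+*/()]", text)
        self.pos = 0

    def peek(self):
        """Return the lookahead token without consuming it."""
        return self.tokens[self.pos] if self.pos < len(self.tokens) else None

    def advance(self):
        token = self.peek()
        self.pos += 1
        return token

    def expect(self, token):
        if self.peek() != token:
            raise SyntaxError(f"expected {token!r}, found {self.peek()!r}")
        self.pos += 1

    def expr(self):
        node = self.term()
        while self.peek() in ("+", "-"):
            op = self.advance()
            node = (op, node, self.term())
        return node

    def term(self):
        node = self.factor()
        while self.peek() in ("*", "/"):
            op = self.advance()
            node = (op, node, self.factor())
        return node

    def factor(self):
        token = self.peek()
        if token == "(":
            self.advance()
            node = self.expr()
            self.expect(")")
            return node
        if token is not None and token.isdigit():
            self.advance()
            return int(token)
        raise SyntaxError(f"unexpected token {token!r}")

def parse(text):
    """Parse a whole expression and return its syntax tree."""
    parser = Parser(text)
    tree = parser.expr()
    if parser.peek() is not None:
        raise SyntaxError(f"unexpected trailing token {parser.peek()!r}")
    return tree
```

Under these assumptions, `parse("1 + 2 * 3")` returns the tree `("+", 1, ("*", 2, 3))`: operator precedence comes from the layering of the grammar rules rather than from any explicit priority table.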
The size and complexity of natural languages, not to mention their unavoidable ambiguity, make a fully deterministic analysis of them impossible. Nondeterministic analysis resembles solving a constraint system, and is expressed fairly easily in Prolog.
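The idea can be sketched in Python, with generators standing in for Prolog's backtracking search. The toy grammar, the lexicon, and the function name `parses` are assumptions for illustration only; the sketch enumerates every analysis of a sentence rather than committing to one, and it does no memoization, so it is exponential in the worst case.

```python
# Hypothetical toy grammar with binary rules, plus a lexicon (both assumed).
RULES = {
    "S": [("NP", "VP")],
    "VP": [("V", "NP"), ("VP", "PP")],
    "NP": [("Det", "N"), ("NP", "PP")],
    "PP": [("P", "NP")],
}
LEXICON = {
    "I": ["NP"], "saw": ["V"], "the": ["Det"],
    "man": ["N"], "telescope": ["N"], "with": ["P"],
}

def parses(symbol, words, start, end):
    """Yield every tree deriving words[start:end] from symbol."""
    if end - start == 1 and symbol in LEXICON.get(words[start], []):
        yield (symbol, words[start])
    for left, right in RULES.get(symbol, []):
        # Try every split point; a choice that leads nowhere simply
        # yields nothing, and the search moves on to the next one.
        for split in range(start + 1, end):
            for left_tree in parses(left, words, start, split):
                for right_tree in parses(right, words, split, end):
                    yield (symbol, left_tree, right_tree)

sentence = "I saw the man with the telescope".split()
trees = list(parses("S", sentence, 0, len(sentence)))
```

With this grammar the sentence receives two trees, one attaching "with the telescope" to the verb phrase and one attaching it to "the man", which is the kind of ambiguity a deterministic parser cannot represent.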
Tabular methods, which memoize intermediate computations, are more efficient than simple backtracking. CYK parsing is one example of tabular analysis, although more sophisticated methods are generally preferred.
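A minimal sketch of a CYK recognizer is given below. It assumes a grammar in Chomsky normal form; the example grammar (S → A B | A T, T → S B, A → "a", B → "b", which generates the strings aⁿbⁿ for n ≥ 1) and the name `cyk_recognize` are choices made for the illustration.

```python
# Hypothetical grammar in Chomsky normal form for the language a^n b^n.
BINARY_RULES = [("S", "A", "B"), ("S", "A", "T"), ("T", "S", "B")]
TERMINAL_RULES = [("A", "a"), ("B", "b")]

def cyk_recognize(words, start="S"):
    """Return True if the start symbol derives the whole word sequence."""
    n = len(words)
    if n == 0:
        return False  # this grammar has no rule deriving the empty string
    # table[i][length] holds the nonterminals deriving words[i:i + length].
    table = [[set() for _ in range(n + 1)] for _ in range(n)]
    for i, word in enumerate(words):
        for head, terminal in TERMINAL_RULES:
            if terminal == word:
                table[i][1].add(head)
    # Fill longer spans from shorter ones, reusing stored results.
    for length in range(2, n + 1):
        for i in range(n - length + 1):
            for split in range(1, length):
                for head, left, right in BINARY_RULES:
                    if (left in table[i][split]
                            and right in table[i + split][length - split]):
                        table[i][length].add(head)
    return start in table[0][n]
```

Each cell of the table is computed once and then reused, so the running time is cubic in the length of the input instead of exponential as with naive backtracking.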
When parsing programming languages, it must be possible to continue the analysis even when the source code contains errors, to spare the developer tedious cycles of compilation and correction. Likewise, when parsing natural languages, it must be possible to analyze statements even when they are not covered by the grammar, which is inevitably incomplete. Error recovery must be effective enough to detect problems and to "make do" with them, either by correcting the source or by being able to produce analyses that deviate (slightly) from the grammar. Four approaches that go in this direction can be cited.