Handwriting recognition is a computer process whose purpose is to translate handwritten text into digitally encoded text.

Two kinds of recognition must be distinguished, each with its own problems and solutions:

  • online recognition;
  • offline recognition.

Handwriting recognition draws on pattern recognition, but also on natural language processing. This means that the system, like the human brain, recognizes words and sentences that exist in a known language rather than a succession of characters. This greatly improves robustness.

Offline recognition

Offline recognition works on a snapshot of digital ink (an image). This is the case in particular for optical character recognition (OCR). In this context it is impossible to know how the various patterns were traced. It is only possible to extract shapes from the image, relying on pattern recognition techniques.

It is naturally the preferred type of recognition for asynchronous processing such as reading bank checks or sorting mail.
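
As a rough illustration of extracting shapes from an image, the sketch below groups connected ink pixels of a tiny binary bitmap and computes a bounding box for each group. The function names, the 4-neighbour connectivity and the sample bitmap are assumptions made for this example; real offline systems rely on far more elaborate preprocessing.

```python
from collections import deque

def connected_components(bitmap):
    """Return a list of components, each a sorted list of (row, col) ink pixels.

    `bitmap` is a list of equal-length strings where '#' marks ink.
    Pixels are linked through their 4 direct neighbours (an assumption).
    """
    rows, cols = len(bitmap), len(bitmap[0])
    seen = set()
    components = []
    for r in range(rows):
        for c in range(cols):
            if bitmap[r][c] != "#" or (r, c) in seen:
                continue
            # Breadth-first flood fill from this unvisited ink pixel.
            queue = deque([(r, c)])
            seen.add((r, c))
            pixels = []
            while queue:
                y, x = queue.popleft()
                pixels.append((y, x))
                for ny, nx in ((y - 1, x), (y + 1, x), (y, x - 1), (y, x + 1)):
                    inside = 0 <= ny < rows and 0 <= nx < cols
                    if inside and bitmap[ny][nx] == "#" and (ny, nx) not in seen:
                        seen.add((ny, nx))
                        queue.append((ny, nx))
            components.append(sorted(pixels))
    return components

def bounding_box(pixels):
    """Return (top, left, bottom, right) of a component."""
    ys = [y for y, _ in pixels]
    xs = [x for _, x in pixels]
    return min(ys), min(xs), max(ys), max(xs)

# Toy image: a vertical bar on the left, a bracket-like shape on the right.
IMAGE = [
    "#..##",
    "#...#",
    "#..##",
]
shapes = connected_components(IMAGE)
boxes = [bounding_box(p) for p in shapes]
```

Note that nothing in the result says in which order or direction the pixels were drawn, which is precisely the information an image cannot provide.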

Online recognition

In online recognition, the ink sample consists of a set of coordinates ordered in time. It is thus possible to follow the stroke, to know when the pen is put down and lifted, and possibly its tilt and speed. Capturing such a sample obviously requires specific hardware, such as digital pens or styluses on electronic organizers or Tablet PCs.
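
One possible way to represent such a sample is sketched below: a list of time-stamped points with a pen-down flag, from which strokes and writing speed can be derived. The `InkPoint` type, its field names and the sample values are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from math import hypot

@dataclass
class InkPoint:
    x: float
    y: float
    t: float        # timestamp in seconds
    pen_down: bool  # True while the pen touches the surface

def split_strokes(points):
    """Group consecutive pen-down points into strokes."""
    strokes, current = [], []
    for p in points:
        if p.pen_down:
            current.append(p)
        elif current:
            # A pen-up point closes the stroke in progress.
            strokes.append(current)
            current = []
    if current:
        strokes.append(current)
    return strokes

def average_speed(stroke):
    """Path length divided by elapsed time for one stroke."""
    length = sum(hypot(b.x - a.x, b.y - a.y) for a, b in zip(stroke, stroke[1:]))
    duration = stroke[-1].t - stroke[0].t
    return length / duration if duration > 0 else 0.0

# Toy sample: one diagonal stroke, a pen lift, then one vertical stroke.
sample = [
    InkPoint(0, 0, 0.0, True),
    InkPoint(3, 4, 0.5, True),
    InkPoint(3, 4, 0.6, False),
    InkPoint(10, 0, 1.0, True),
    InkPoint(10, 5, 2.0, True),
]
strokes = split_strokes(sample)
speeds = [average_speed(s) for s in strokes]
```

Tilt or pressure, when the hardware reports them, could be added as further fields of the same record.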

Online recognition is generally much more effective than offline recognition because the samples are much more informative. On the other hand, it requires much more expensive hardware and imposes strong constraints on the writer, since the ink must be captured at the time of writing (synchronous capture) and not afterwards (asynchronous capture).

The techniques used can have a broader field of application, allowing the recognition of any simple abstract shape (cf. pattern recognition, weak artificial intelligence). Current systems (2005) mainly proceed by comparing the sample to be recognized with those contained in a database. This database can be built from scratch or result from a training phase.

The comparison techniques generally rely on simple statistical methods in order to gain processing speed. The consequence is that the number of recognizable shapes must be limited, otherwise the results are likely to be wrong often. Indeed, the whole difficulty of recognition lies in evaluating the similarity between the shape under study and each shape in the database (an exact match is almost impossible). It is then enough to choose the most similar shape. Ideal recognition would evaluate similarity the way the brain does, which is what neural networks try to approach. But faster (less complex) methods evaluate a similarity tainted with error. When the database contains few, well-separated shapes, the most similar shape remains the same, and so the final result is correct. As the database grows, the model shapes necessarily move "closer" to one another, and the error in the similarity can more easily tip the balance toward a wrong shape.
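
The effect of database size can be sketched with a nearest-template comparison. In the example below, shapes are reduced to two-dimensional feature vectors and similarity is plain Euclidean distance; the feature values, the labels and the `most_similar` helper are all assumptions made for illustration, not the method of any particular system.

```python
from math import dist

def most_similar(sample, database):
    """Return (label, distance) of the template closest to `sample`."""
    scored = ((label, dist(sample, template)) for label, template in database.items())
    return min(scored, key=lambda pair: pair[1])

# Two well-separated templates (made-up feature vectors).
small_db = {"o": (0.0, 0.0), "l": (10.0, 0.0)}

# An "o" whose measured features carry some error.
noisy_o = (3.0, 1.0)
label_small, _ = most_similar(noisy_o, small_db)

# Adding a template close to "o" lets the same error flip the decision.
large_db = dict(small_db, c=(4.0, 0.0))
label_large, _ = most_similar(noisy_o, large_db)
```

With the small database the noisy sample is still labelled "o"; once the nearby template "c" is added, the same measurement error is enough to select the wrong shape.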

Pattern recognition

Pattern recognition plays a very important role in handwriting recognition, at two levels:

Grapheme extraction

Pattern recognition applies to a single pattern. The various patterns composing the words (letters, digits, symbols…) must therefore first be separated before they can be recognized.

In the following example, the various possible segmentation points are annotated.

Obviously, not all of these segmentations are correct, and only some should be kept. There is thus an ambiguity that must be resolved to optimize recognition.

Recognition of patterns

Starting from the graphemes extracted previously, pattern recognition makes it possible to obtain the various patterns that compose them. Pattern recognition also assists grapheme extraction by ruling out some impossible segmentations. Thus, the more effective pattern recognition is, the more effective segmentation is. Likewise, effective segmentation necessarily leads to better recognition. One must segment in order to recognize, and recognize in order to segment.
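
One common way to handle this mutual dependence is to score every candidate segment with the recognizer and keep the segmentation with the best overall score. The sketch below does so with a small dynamic programme over three primitive strokes. The `RECOGNIZER` table, its labels and its confidence values are entirely made up, and summing log-confidences is only one possible scoring choice.

```python
from math import log

# Hypothetical recognizer output: for each candidate segment (start, end) over
# three primitive strokes, the best label and its confidence in (0, 1].
# Segments missing from the table are considered impossible by the recognizer.
RECOGNIZER = {
    (0, 1): ("l", 0.60),
    (1, 2): ("r", 0.50),
    (2, 3): ("y", 0.80),
    (0, 2): ("b", 0.70),
    (1, 3): ("g", 0.10),
}

def best_segmentation(n, recognizer):
    """Dynamic programme over cut points 0..n.

    best[i] holds (log-score, labels) for the best explanation of strokes 0..i.
    Rejected segments are never used, which is how recognition prunes
    impossible segmentations.
    """
    best = {0: (0.0, [])}
    for end in range(1, n + 1):
        for start in range(end):
            if start not in best or (start, end) not in recognizer:
                continue
            label, confidence = recognizer[(start, end)]
            score = best[start][0] + log(confidence)
            if end not in best or score > best[end][0]:
                best[end] = (score, best[start][1] + [label])
    return best.get(n)  # None if no complete segmentation exists

score, labels = best_segmentation(3, RECOGNIZER)
word = "".join(labels)
```

Here the grouping of the first two strokes into "b" wins over reading them separately as "l" and "r", so segmentation and recognition are decided together rather than one after the other.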

Help from the language model

Many ambiguities remain after segmentation and recognition. Language processing comes in at this level by ruling out the least probable solutions from a linguistic point of view.

In the preceding example, the segmentation and pattern recognition stages led to the choices "lrj" or "by". The language model (sometimes a simple dictionary) will probably choose the solution "by", depending on the language. The language model can be much more complex and recognize, for example, sequences of forms (N-grams). Thus "It is" will be preferred to "It have" in case of ambiguity.
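
Both levels of language model mentioned above can be sketched in a few lines: a dictionary filter that discards "lrj", and a bigram table that prefers "It is" to "It have". The shape scores, the word list and the bigram counts below are invented for the example and do not come from any real corpus.

```python
# Hypothetical shape scores produced by segmentation and pattern recognition.
candidates = {"lrj": 0.55, "by": 0.45}

# A toy dictionary standing in for the simplest kind of language model.
DICTIONARY = {"by", "it", "is", "have"}

def pick_word(candidates, dictionary):
    """Keep only dictionary words, then take the best shape score."""
    in_vocabulary = {w: s for w, s in candidates.items() if w in dictionary}
    # Fall back to raw shape scores if no candidate is a known word.
    pool = in_vocabulary or candidates
    return max(pool, key=pool.get)

# Made-up bigram counts standing in for an N-gram model (N = 2).
BIGRAM_COUNTS = {("it", "is"): 90, ("it", "have"): 1}

def pick_next(previous, options, counts):
    """Choose the option that most often follows `previous`."""
    return max(options, key=lambda w: counts.get((previous.lower(), w.lower()), 0))

chosen_word = pick_word(candidates, DICTIONARY)
chosen_next = pick_next("It", ["have", "is"], BIGRAM_COUNTS)
```

Although "lrj" has the higher shape score, the dictionary selects "by"; similarly the bigram counts select "is" after "It".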

Collaboration between processing stages

The recognition process is not linear: since each processing stage contributes a little more information about the probable solutions, it can be worthwhile to repeat a stage using the information supplied by the preceding one in order to refine the result. The various stages thus collaborate to increase the reliability of recognition.

A priori knowledge of the language

Whatever the type of handwriting recognition, refining the language model is the key to optimization. Indeed, to guarantee good performance, the processing is better seen as choosing one or more solutions among a set of choices proposed a priori than as trying to guess, from the shape, what the writer meant to write. Recognizing a text without any prior information is to date very difficult, whereas recognizing the same text when the language and the register (note-taking, "correct" text, SMS) are known is much more effective.

In this way the technology is advanced enough to recognize the address on an envelope very quickly and with excellent reliability: the system does not try to recognize arbitrary information, but to extract a postal code (5 digits) among all those it knows. A further sort by district is then possible: the system will try to extract the street among those it knows for that postal code.
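
The two-step address lookup can be sketched as follows: per-digit recognition probabilities are combined to score only the known postal codes, and the street is then matched against the streets listed for the winning code. The probabilities, the set of known codes and the street names are toy values invented for this example, and using `difflib` for the street match is merely one convenient choice.

```python
from difflib import get_close_matches
from math import prod

# Hypothetical recognizer output: one {digit: probability} table per position.
digit_probs = [
    {"7": 0.9, "1": 0.1},
    {"5": 0.8, "3": 0.2},
    {"0": 0.9},
    {"0": 0.9},
    {"4": 0.5, "1": 0.3, "7": 0.2},
]

# Toy lexicon: the only codes the system "knows", with their streets.
STREETS_BY_CODE = {
    "75001": ["rue des lilas", "rue du moulin"],
    "75007": ["avenue du parc"],
    "13001": ["rue du port"],
}

def score_code(code, probs):
    """Product of the per-position probabilities of the digits of `code`."""
    return prod(position.get(digit, 1e-6) for digit, position in zip(code, probs))

def best_known_code(probs, known_codes):
    """Pick the known code with the highest combined probability."""
    return max(known_codes, key=lambda code: score_code(code, probs))

# Reading each digit independently gives a code absent from the lexicon.
raw_reading = "".join(max(p, key=p.get) for p in digit_probs)

code = best_known_code(digit_probs, STREETS_BY_CODE)

# The street is then searched only among those known for that code.
raw_street = "rue des liias"  # hypothetical misread of the street line
street = get_close_matches(raw_street, STREETS_BY_CODE[code], n=1)[0]
```

The digit-by-digit reading yields a code the system does not know, whereas restricting the choice to known codes, and then to the streets of that code, recovers a consistent address.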

By analogy, a human being can understand the whole of a sentence even when part of it is obscured. For example, the reader will no doubt manage to understand the following obscured sentence: "I went to Ci*** to see a film", thanks to the context set by the rest of the sentence. This context provides an a priori on the obscured word to be recognized.

References

  • Jean-Pierre Crettez and Guy Lorette, Handwriting recognition, 1998 (article).

External links

  • Recognition of handwritten strokes, with video demonstrations, on the Interstices site
