dScript: An agent-based handwriting recognition demo

By Louis Vuurpijl and Lambert Schomaker

For references to this work, see the References section.


Based on 15 years of experience in handwriting recognition, the dScript system was developed in the NICI/CE lab in Nijmegen. The system comprises:

The dScript system is depicted below:

The application

The goals of the dScript system are (1) to promote handwriting recognition in general and (2) to demonstrate our handwriting expertise in a real application. By both measures, the system has been a great success. Two versions of the system have been on display at different museums in the Netherlands. Museum visitors have shown great interest in using the system, even when dScript makes recognition mistakes.

For the current application of dScript, the writer's goal is to write one of about 2800 Dutch city names, such as Nijmegen, Amsterdam or Arnhem. If dScript recognizes the city name, the city's coat of arms is displayed as a reward to the writer. Experience has shown that users are attracted to the system and, furthermore, that they are willing to put effort into their handwriting so that the system recognizes their input correctly. Even though large groups of children have used dScript, less than 5% of the collected data consisted of uncooperative junk.

The user-agent

This figure depicts the user interface of dScript, comprising:

The handwriting experts

The hypotheses from a total of nine different letter experts are combined and compared with the possible words in the dictionary. This yields the top-3 word list. Each expert contributing to the top word is displayed below the corresponding letter. The icon associated with the expert displays the expert's confidence: the broader the smile, the more certain the expert is. Note that by combining different kinds of expertise, we are able to recognize a mixture of connected cursive, mixed, handprint and block capital characters.
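
As a rough illustration of how letter hypotheses might be matched against a dictionary, the Python sketch below scores each candidate word by the best expert confidence at every letter position and keeps the three highest-scoring words. The fixed letter positions, the averaging rule, the confidence values and the function name top_words are assumptions made for this example; the actual dScript combination scheme is not specified on this page.

```python
def top_words(hypotheses, dictionary, n=3):
    """Rank dictionary words given letter hypotheses.

    hypotheses: list of (expert, position, letter, confidence) tuples.
    Assumes the ink has already been split into letter positions
    (a simplification made for this sketch).
    """
    # For every (position, letter) keep the most confident expert.
    best = {}
    for expert, pos, letter, conf in hypotheses:
        key = (pos, letter.lower())
        if key not in best or conf > best[key][0]:
            best[key] = (conf, expert)

    n_positions = 1 + max(pos for _, pos, _, _ in hypotheses)
    scored = []
    for word in dictionary:
        if len(word) != n_positions:
            continue  # only words with a matching number of letters
        total = sum(best.get((i, ch.lower()), (0.0, None))[0]
                    for i, ch in enumerate(word))
        scored.append((total / n_positions, word))

    # Highest average confidence first; ties broken alphabetically.
    scored.sort(key=lambda item: (-item[0], item[1]))
    return scored[:n]


# Hypothetical hypotheses for a three-letter city name.
hyps = [
    ("David", 0, "E", 0.9),
    ("Bert", 0, "O", 0.2),
    ("Max", 1, "d", 0.6),
    ("Sofie", 1, "p", 0.3),
    ("Max", 2, "e", 0.8),
]
ranking = top_words(hyps, ["Ede", "Epe", "Oss", "Arnhem"])
```

Taking the maximum confidence per letter is only one possible combination rule; sums or weighted votes would fit the same interface.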

Pim
This is a character classifier. The feature vector consists of a monochrome image of 16x16 pixels. It uses simple nearest-neighbour search, based on a weighted Euclidean distance measure. The squared feature value differences are weighted by the inverse of that feature's variance. The character prototypes are organized as a flat list. This classifier has been trained on characters from within a handprint style with isolated characters. It is invoked only on (series of) ink blobs, which are combined in a sliding window containing a variable number of ink blobs.
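
The distance measure described here can be sketched as follows. This is a minimal Python illustration, assuming that the variances are estimated over the prototype list itself and using a toy 4-pixel image instead of 16x16; the names feature_variances, weighted_distance and classify are hypothetical.

```python
def feature_variances(prototypes):
    """Per-feature variance over all prototype vectors.

    prototypes: flat list of (label, feature_vector) pairs.
    """
    n = len(prototypes)
    dims = len(prototypes[0][1])
    means = [sum(vec[d] for _, vec in prototypes) / n for d in range(dims)]
    return [sum((vec[d] - means[d]) ** 2 for _, vec in prototypes) / n
            for d in range(dims)]


def weighted_distance(a, b, variances, eps=1e-9):
    """Euclidean distance with squared differences weighted by 1/variance.

    eps (an assumption of this sketch) guards against zero variance.
    """
    return sum((x - y) ** 2 / (v + eps)
               for x, y, v in zip(a, b, variances)) ** 0.5


def classify(sample, prototypes, variances):
    """Nearest-neighbour search as a linear scan over the flat list."""
    label, _ = min(prototypes,
                   key=lambda p: weighted_distance(sample, p[1], variances))
    return label


# Toy prototypes: 4 "pixels" instead of 256.
protos = [("a", [0, 0, 1, 1]), ("b", [1, 1, 0, 0]), ("a", [0, 1, 1, 1])]
var = feature_variances(protos)
result = classify([0, 0, 1, 0], protos, var)
```

Dividing by the variance means that features which fluctuate strongly across the prototypes count for less than stable ones.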

Sterre
This letter expert is actually an agent that tries to propose alternative solutions for known character confusions or, alternatively, to rule out character hypotheses for which conflicting evidence can be found. Mostly, this is done by reshuffling the confidence measures of the character hypotheses involved. For example, an <l> that is crossed later might be a <t>. The information used at this stage is based on the geometrical relations between the rectangles containing ink blobs, the estimated lineation and, in particular, the presence of horizontal bars, dots, periods and commas.
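
The l/t example could look roughly like the Python sketch below. The bounding-box test, the flatness threshold and the fraction of confidence that is moved are all assumptions chosen for illustration, as are the names is_crossbar and reshuffle; the rules actually used by this agent are not given here.

```python
def is_crossbar(letter_box, blob_box):
    """Boxes are (x0, y0, x1, y1).

    True if the blob is a flat bar that overlaps the letter horizontally
    and lies within its vertical extent (hypothetical criteria).
    """
    lx0, ly0, lx1, ly1 = letter_box
    bx0, by0, bx1, by1 = blob_box
    width, height = bx1 - bx0, by1 - by0
    flat = width >= 2 * height
    overlaps_x = bx0 <= lx1 and bx1 >= lx0
    inside_y = ly0 <= by0 and by1 <= ly1
    return flat and overlaps_x and inside_y


def reshuffle(hypotheses, has_crossbar, shift=0.5):
    """Move part of the confidence of 'l' to 't' when a crossbar is found.

    hypotheses: dict mapping letter to confidence. Returns a new dict.
    """
    out = dict(hypotheses)
    if has_crossbar and "l" in out:
        moved = out["l"] * shift
        out["l"] -= moved
        out["t"] = out.get("t", 0.0) + moved
    return out


stem = (10, 0, 12, 20)   # tall, narrow blob classified as 'l'
bar = (8, 12, 15, 13)    # wide, flat blob written later
updated = reshuffle({"l": 0.8, "t": 0.1}, is_crossbar(stem, bar))
```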

Roos
This is a character classifier. The feature vector consists of a number of dedicated geometric features for characters, such as vectors radiating from the center of gravity of a shape, each with length, and angle. It uses simple nearest-neighbour search, based on a weighted Euclidean distance measure. The squared feature value differences are weighted by the inverse of that feature's variance. The character prototypes are organized as a flat list. The matching of a prototype occurs with a window size of a varying number of strokes at all possible stroke-start positions in the ink. This classifier has been trained on characters from within a mixed-style context.
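
One possible reading of "vectors radiating from the center of gravity" is sketched below in Python: the centroid of the ink points is computed, and a few evenly spaced trajectory points are described by the length and angle of the vector from the centroid. The choice of points, their number and the name radial_features are assumptions; the dedicated features actually used by Roos are not listed on this page.

```python
import math


def radial_features(points, n_vectors=4):
    """Return (length, angle) pairs for vectors from the centre of gravity.

    points: list of (x, y) ink coordinates.
    n_vectors: how many trajectory points to describe (assumed value).
    """
    cx = sum(x for x, _ in points) / len(points)
    cy = sum(y for _, y in points) / len(points)
    # Pick n_vectors points evenly spaced along the trajectory.
    step = (len(points) - 1) / max(n_vectors - 1, 1)
    features = []
    for k in range(n_vectors):
        x, y = points[round(k * step)]
        features.append((math.hypot(x - cx, y - cy),
                         math.atan2(y - cy, x - cx)))
    return features


square = [(0, 0), (2, 0), (2, 2), (0, 2)]
feats = radial_features(square)
```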

Lou
This is a character classifier. The feature vector consists of normalized (x,y) coordinates, augmented with the running angle f, which is represented as a (cos(f),sin(f)) pair. It uses simple nearest-neighbour search, based on the usual Euclidean distance measure. The character prototypes are organized as a flat list. However, this list of prototypes has been developed using N-ary hierarchical clustering on a large training set. The matching of a prototype occurs with a window size of a varying number of strokes at all possible stroke-start positions in the ink. Special attention is given to possible upper-case characters at the beginning of a word, on the left. This classifier has been trained on UPPER CASE characters from within a mixed-style word context.
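
The feature vector built from coordinates and running angle can be illustrated with the short Python sketch below. Representing the angle f as (cos(f), sin(f)) avoids the jump that a raw angle makes when it wraps around at plus or minus pi. The normalization used here (centring on the mean and scaling by the larger bounding-box side) and the name trajectory_features are assumptions, and resampling to a fixed number of points is left out.

```python
import math


def trajectory_features(points):
    """Build [x, y, cos(f), sin(f), ...] for each point except the last.

    points: list of (x, y) pen coordinates in writing order.
    """
    xs = [x for x, _ in points]
    ys = [y for _, y in points]
    cx, cy = sum(xs) / len(xs), sum(ys) / len(ys)
    # Scale by the larger bounding-box side; fall back to 1.0 for a dot.
    scale = max(max(xs) - min(xs), max(ys) - min(ys)) or 1.0

    features = []
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        dx, dy = x1 - x0, y1 - y0
        length = math.hypot(dx, dy) or 1.0  # repeated point gives (0, 0)
        features.extend([(x0 - cx) / scale, (y0 - cy) / scale,
                         dx / length, dy / length])
    return features


# A pen movement to the right followed by a movement upwards.
feats = trajectory_features([(0, 0), (2, 0), (2, 2)])
```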

Max
This is a character classifier. The feature vector consists of normalized (x,y) coordinates, augmented with the running angle f, which is represented as a (cos(f),sin(f)) pair. It uses simple nearest-neighbour search, based on the usual Euclidean distance measure. The character prototypes are organized as a tree, a hierarchical Kohonen self-organized map, 2x2 nodes per map, in four layers. This speeds up the matching process, at the cost of occasionally derailing due to an early wrong decision. The matching of a prototype occurs with a window size of a varying number of strokes at all possible stroke-start positions in the ink. This classifier has been trained on characters from within a mixed-style but mostly cursive word context.
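
The speed-versus-accuracy trade-off of the tree can be made concrete with a small Python sketch. With 2x2 = 4 nodes per map and four layers, a greedy descent needs 4 x 4 = 16 distance computations, whereas a flat list of the 4^4 = 256 leaf prototypes would need 256. The toy tree below has only two layers, and its vectors, labels and the names Node, greedy_descent and exhaustive_search are made up for the example.

```python
class Node:
    """A map node: a prototype vector, optional child map, optional label."""

    def __init__(self, vector, children=(), label=None):
        self.vector = vector
        self.children = list(children)
        self.label = label


def distance(a, b):
    return sum((x - y) ** 2 for x, y in zip(a, b)) ** 0.5


def greedy_descent(nodes, sample):
    """Follow the nearest node in each layer; return (label, comparisons)."""
    comparisons = 0
    best = None
    while nodes:
        comparisons += len(nodes)
        best = min(nodes, key=lambda n: distance(sample, n.vector))
        nodes = best.children
    return best.label, comparisons


def exhaustive_search(nodes, sample):
    """Nearest leaf over the whole tree, for comparison."""
    leaves, stack = [], list(nodes)
    while stack:
        node = stack.pop()
        if node.children:
            stack.extend(node.children)
        else:
            leaves.append(node)
    return min(leaves, key=lambda n: distance(sample, n.vector)).label


tree = [
    Node((0.0, 0.0), [Node((-1.0, 0.0), label="a"),
                      Node((0.0, 1.0), label="b")]),
    Node((4.0, 0.0), [Node((1.9, 0.0), label="c"),
                      Node((6.0, 0.0), label="d")]),
]
sample = (1.8, 0.0)
fast = greedy_descent(tree, sample)
slow = exhaustive_search(tree, sample)
```

Here the greedy descent commits to the first top-level node and ends at "b", although the globally nearest leaf is "c" under the second node. This is the kind of early wrong decision mentioned above.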

Sofie
This is a character classifier. The feature vector consists of normalized (x,y) coordinates, augmented with the running angle f, which is represented as a (cos(f),sin(f)) pair. It uses simple nearest-neighbour search, based on a weighted Euclidean distance measure. The squared feature value differences are weighted by the inverse of that feature's variance. The character prototypes are organized as a flat list. The matching of a prototype occurs with a window size of a varying number of strokes at all possible stroke-start positions in the ink. This classifier has been trained on characters from within a mixed-style but mostly cursive word context.

Bert
This is a character classifier, based on stroke sequences. A character is defined to consist of a sequence of strokes. A stroke is a trajectory bounded by two points of high curvature. These points are determined by computing the velocity signal, finding minima, and segmenting the ink trace. Each stroke is characterized by 14 features (9 angles, vertical position, pen up/down, length, loop area). An alphabet of prototypical strokes has been computed by using a 2D Kohonen self-organized map on 1 million such strokes. The map contains 20x20 prototypical strokes. A character is in fact a path of nodes in this 2D map. A Markov model is formed, using the probabilities of stroke-name transitions within a character. As an example, the three-stroked letter <a> may be represented by the stroke sequence <a1/3>,<a2/3>,<a3/3>. This classifier has been trained on a large database of mixed styles, but performs best on connected-cursive handwriting.
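
Two of the steps above, segmentation at velocity minima and estimation of stroke-transition probabilities, are sketched in Python below. The sketch assumes uniformly sampled pen coordinates, uses a central difference as the velocity estimate and omits smoothing; the function names and the stroke names s1 to s5 are hypothetical, and the 14 stroke features and the Kohonen map are not reproduced.

```python
import math
from collections import Counter, defaultdict


def point_speeds(points):
    """Central-difference speed at each interior point."""
    return [math.hypot(points[i + 1][0] - points[i - 1][0],
                       points[i + 1][1] - points[i - 1][1]) / 2.0
            for i in range(1, len(points) - 1)]


def segment_strokes(points):
    """Cut the ink trace at local minima of the speed signal."""
    v = point_speeds(points)  # v[k] belongs to points[k + 1]
    cuts = [k + 1 for k in range(1, len(v) - 1)
            if v[k] < v[k - 1] and v[k] <= v[k + 1]]
    bounds = [0] + cuts + [len(points) - 1]
    # Neighbouring strokes share their boundary point.
    return [points[a:b + 1] for a, b in zip(bounds, bounds[1:])]


def transition_probs(sequences):
    """Estimate P(next stroke | current stroke) from stroke-name sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for current, nxt in zip(seq, seq[1:]):
            counts[current][nxt] += 1
    return {current: {nxt: c / sum(following.values())
                      for nxt, c in following.items()}
            for current, following in counts.items()}


# The pen slows down around x = 7, then speeds up again.
trace = [(0, 0), (3, 0), (6, 0), (7, 0), (8, 0), (11, 0), (14, 0)]
strokes = segment_strokes(trace)

# Three hypothetical three-stroke renderings of one letter.
probs = transition_probs([["s1", "s2", "s3"],
                          ["s1", "s2", "s4"],
                          ["s1", "s5", "s3"]])
```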

Anke
This is a character-based classifier that uses a simple grammar for consecutive classified pen-down shapes (for instance, an L shape followed by two - shapes might be a capital E). A number of primitives are classified, such as straight lines, circles, V shapes and U shapes. The sequence of primitives is then matched against the grammar. This classifier is invoked on series of ink blobs in a variable window. The classifier has been designed on the basis of a number of clearly identifiable 'nice' examples.
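
A grammar of this kind can be sketched in Python as a lookup table over short sequences of primitives, applied in a sliding window of variable size. Only the rule for the capital E comes from the description above; the other rules, the primitive symbols and the name parse are hypothetical and serve only to make the example run.

```python
RULES = {
    ("L", "-", "-"): "E",  # from the text: L shape followed by two bars
    ("|", "-", "-"): "F",  # hypothetical rule
    ("|", "-"): "T",       # hypothetical rule
    ("V",): "V",           # hypothetical rule
}


def parse(primitives, max_window=3):
    """Return (start, end, character) for every window that matches a rule.

    primitives: list of primitive symbols in writing order.
    """
    matches = []
    for start in range(len(primitives)):
        for size in range(1, max_window + 1):
            window = tuple(primitives[start:start + size])
            if len(window) == size and window in RULES:
                matches.append((start, start + size, RULES[window]))
    return matches


found = parse(["L", "-", "-", "V"])
```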

David
This is a character classifier. The feature vector consists of a monochrome image of 16x16 pixels. It uses a neural-network classifier (multi-layer perceptron or MLP) with four layers (256x48x32x26 is the net architecture). This classifier has been trained on UPPER CASE characters from within a handprint style with isolated characters, also called block print. It is invoked only on ink blobs which are clearly separated by white space.
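
The 256x48x32x26 architecture corresponds to 256 input pixels (16x16), two hidden layers and 26 outputs. The Python sketch below runs a forward pass through a network of that shape. The weights are random and untrained, and the logistic activation, the bias terms and the mapping of the 26 outputs to the letters A to Z are assumptions; none of these details are given above.

```python
import math
import random

LAYERS = [256, 48, 32, 26]  # layer sizes from the text


def init_weights(layers, seed=0):
    """Random, untrained weights: one (matrix, bias vector) pair per layer."""
    rng = random.Random(seed)
    return [([[rng.uniform(-0.1, 0.1) for _ in range(n_in)]
              for _ in range(n_out)],
             [0.0] * n_out)
            for n_in, n_out in zip(layers, layers[1:])]


def forward(pixels, weights):
    """Propagate a 256-element pixel vector through the network."""
    activation = pixels
    for matrix, biases in weights:
        activation = [
            1.0 / (1.0 + math.exp(-(sum(w * x for w, x in zip(row, activation)) + b)))
            for row, b in zip(matrix, biases)
        ]
    return activation


def predict_letter(pixels, weights):
    """Map the strongest of the 26 outputs to a letter (assumed A to Z order)."""
    outputs = forward(pixels, weights)
    return chr(ord("A") + outputs.index(max(outputs)))


def parameter_count(layers):
    """Number of weights plus biases for fully connected layers."""
    return sum(n_in * n_out + n_out for n_in, n_out in zip(layers, layers[1:]))


weights = init_weights(LAYERS)
outputs = forward([0.0] * 256, weights)
letter = predict_letter([0.0] * 256, weights)
```

Under these assumptions the network has 14762 trainable parameters, which is small by current standards.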

The monitor-agent

Consider Figure 3 below:

References

Schomaker, L.R.B., & Teulings, H.-L. (1990). A Handwriting Recognition System based on the Properties and Architectures of the Human Motor System. Proceedings of the International Workshop on Frontiers in Handwriting Recognition (IWFHR) (pp. 195-211). Montreal: CENPARMI Concordia. ISBN: 1-895193-00-1.

Schomaker, L.R.B. (1993). Using Stroke- or Character-based Self-organizing Maps in the Recognition of On-line, Connected Cursive Script. Pattern Recognition, 26(3), 443-450.

Vuurpijl, L. & Schomaker, L. (1997). Finding structure in diversity: A hierarchical clustering method for the categorization of allographs in handwriting. Proceedings of the Fourth International Conference on Document Analysis and Recognition, Piscataway (NJ): IEEE Computer Society, pp. 387-393. ISBN 981-02-3084-2.

Vuurpijl, L. & Schomaker, L. (1998). A framework for using multiple classifiers in a multiple-agent architecture. Proceedings of the Third European Workshop on Handwriting Analysis and Recognition, 14-15 July, 1998, London: The Institution of Electrical Engineers, Digest Number 1998/440 (ISSN 0963-3308), pp. 8/1-8/6.

Vuurpijl, L. & Schomaker, L. (1998). Multiple-agent architectures for the classification of handwritten text. Proceedings of IWFHR'98, 12-14 August, Taejon, Korea, pp. 335-346.