The UNIPEN Project

On-line handwriting recognition addresses the problem of recognizing handwriting from data collected with a sensitive pad that provides discretized pen trajectory information. Unlike other pattern recognition fields, such as speech recognition and optical character recognition, on-line handwriting recognition has seen no significant progress in the past few years toward making large corpora of training and test data publicly available, and no open competitions have been organized.

The first impulse for UNIPEN was given at the 11th IAPR-IEEE International Conference on Pattern Recognition, in September 1992, by a group of experts, the Technical Committee 11 of the IAPR, chaired by Professor Rejean Plamondon. Information on the International Association for Pattern Recognition (IAPR) and the Technical Committee 11 is available. Two IAPR delegates (Isabelle Guyon and Lambert Schomaker) were designated to explore the possibility of creating large databases for on-line handwriting recognition research and development.

A small working group was formed to get the project started. In May 1993, a nucleus of experts in on-line handwriting recognition (Tetsu Fujisaki (IBM), Ronjon Nag (Lexicus), Sandy Benett (GO/EO), Dick Lyons (Apple), Yves Chauvin (NetID), Dave Reynolds and Dan Flickinger (HP), Isabelle Guyon (AT&T) and Lambert Schomaker (NICI)) laid the foundations of UNIPEN. It was proposed that a common data format be designed to facilitate data exchange. It was decided that contacts would be made with the Linguistic Data Consortium (LDC) and the National Institute of Standards and Technology (NIST) to get the data distributed and to arbitrate benchmarks.

In summer 1993, the UNIPEN format was designed, incorporating features of the internal formats of several institutions, including IBM, Apple (Tap), Microsoft, Slate (Jot), HP, AT&T, NICI, GO and CIC. The format was then tested independently by the members of the working group. A second round of tests was organized in autumn 1993 to check the changes and additions to the format. In particular, the benchmark protocol was tested. The resulting format was used internally at AT&T and NICI (the home institutions of Isabelle Guyon and Lambert Schomaker) to collect data and benchmark recognizers.

In parallel, a set of Unipen Software Tools to parse the format and to browse the data was developed at AT&T and NICI.

In January 1994, the negotiations with LDC and NIST resulted in the organization of the first UNIPEN benchmark.

In March 1994, UNIPEN advertised its existence on several electronic mailing lists, resulting in nearly 200 subscriptions to the UNIPEN newsletter.

In June 1994, the instructions for participating in the first UNIPEN benchmark, limited to the Latin alphabet, were released. October 1st, 1994 was the deadline for submitting data. The benchmark will take place in 1997.

Latest counts

The totals (counted in October 1995) are: over 40 institutions, donating over 5 million characters, from more than 2200 writers!

The activities of UNIPEN will expand in the future according to the needs and desires of the participants.

Other UNIPEN-related pages:

o UNIPEN FAQs

o LOOK, HANDWRITING!

o UNIPEN data examples and software (FTP)

o The International Association for Pattern Recognition (IAPR)

o The US National Institute of Standards and Technology (NIST)

o UNIPEN SCRAWLS


Interesting material:

o Handwriting Recognition and Document Analysis Conferences

o Pen & Mobile Computing

o NICI Handwriting Recognition Group home page


schomaker@computer.org