UNIPEN Benchmark Tests


Note: each test also corresponds to a database subset.

Benchmark  Description

1a         isolated digits
1b         isolated upper case
1c         isolated lower case
1d         isolated symbols (punctuation etc.)
2          isolated characters, mixed case
3          isolated characters in the context of words or texts
4          isolated printed words, not mixed with digits and symbols
5          isolated printed words, full character set
6          isolated cursive or mixed-style words (without digits and symbols)
7          isolated words, any style, full character set
8          free text (at least two words), full character set

Note that only Benchmark #8 is a realistic, application-oriented test, because the recognizer must also solve the word segmentation problem itself. No manual word segmentation is allowed in Benchmark #8.
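
For scripts that select a database subset by benchmark, the list above could be held in a small lookup table. The following is only a minimal sketch in Python: the names BENCHMARKS and requires_word_segmentation are hypothetical and are not part of any UNIPEN tool or file format.

```python
# Hypothetical lookup table; the identifiers and descriptions follow the
# benchmark list above. Nothing here is defined by UNIPEN itself.
BENCHMARKS = {
    "1a": "isolated digits",
    "1b": "isolated upper case",
    "1c": "isolated lower case",
    "1d": "isolated symbols (punctuation etc.)",
    "2": "isolated characters, mixed case",
    "3": "isolated characters in the context of words or texts",
    "4": "isolated printed words, not mixed with digits and symbols",
    "5": "isolated printed words, full character set",
    "6": "isolated cursive or mixed-style words (without digits and symbols)",
    "7": "isolated words, any style, full character set",
    "8": "free text (at least two words), full character set",
}


def requires_word_segmentation(benchmark_id: str) -> bool:
    """Return True if the recognizer must segment words itself.

    Per the note above, this holds only for Benchmark #8.
    """
    if benchmark_id not in BENCHMARKS:
        raise KeyError(f"unknown benchmark: {benchmark_id!r}")
    return benchmark_id == "8"


if __name__ == "__main__":
    # Print each benchmark with a marker for the segmentation requirement.
    for bid, description in BENCHMARKS.items():
        marker = "*" if requires_word_segmentation(bid) else " "
        print(f"{marker} {bid:<3} {description}")
```

Keys are kept as strings because the identifiers 1a to 1d are not numeric.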


Lambert Schomaker, January 1997