Kanji and English

This is a WWW representation of KANJIDIC, the work of Jim Breen. Lambert Schomaker wrote a small C program and Unix scripts, using kterm and the netpbm package, to transform KANJIDIC into a Web-readable form. The font was "-jis-fixed-medium-r-normal--24-230-75-75-c-240-jisx0208.1983-0" as provided with Linux/X11.

This was done on the basis of discussions in the UNIPEN project concerning the coding of Kanji in databases of on-line handwriting. The UNIPEN project is intended to create a common basis for the comparison of handwriting recognition algorithms as used in pen computers. This WWW version of KANJIDIC allows the different coding schemes compiled by Breen to be compared. Note: in each panel the Kanji is at the top left, followed to its right by a sequence of kana (katakana/hiragana) characters describing it, and several ASCII codings are at the bottom of the panel. For details, see kanjidic.doc.

There is also a crude English to Kanji index.

Jim Breen's EDICT and KANJIDIC files:

These files provide a lot of information on kanji: codes, readings, and meanings (KANJIDIC), and Japanese-English correspondences (EDICT). The format is described in the accompanying read-me files.

       AUSTRALIA: ftp.cc.monash.edu.au [130.194.1.106], 
       files /pub/nihongo/edict.* and kanjidic.* 
A list of ftp nodes in your vicinity is shown when you ftp there.

As required by the Copyright statement at the end, the file kanjidic.doc is included.

Synopsis

The KANJIDIC file contains comprehensive information about Japanese kanji. It is a text file currently 6,355 lines long, with one line for each kanji in the two levels of the characters specified in the JIS X 0208-1990 set. (For information about this set, see Appendix A.)

The file contains a mixture of ASCII characters and kana/kanji encoded using the EUC (Extended Unix Code) coding.
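
As an illustration of how the EUC coding relates to the JIS codes used elsewhere on this page, here is a minimal Python sketch. It relies on the standard property that a JIS X 0208 character in EUC is stored as two bytes, each being the corresponding JIS byte with the high bit set. The function name euc_to_jis is made up for this example and is not part of KANJIDIC or of the conversion program mentioned above.

```python
def euc_to_jis(euc_pair: bytes) -> str:
    """Return the 4-digit hex JIS code for one 2-byte EUC-encoded character.

    Hypothetical helper for illustration only.
    """
    if len(euc_pair) != 2:
        raise ValueError("expected exactly two bytes")
    hi, lo = euc_pair
    # JIS X 0208 characters occupy 0xA1..0xFE in both EUC bytes.
    if not (0xA1 <= hi <= 0xFE and 0xA1 <= lo <= 0xFE):
        raise ValueError("not a 2-byte EUC JIS X 0208 character")
    # Clearing the high bit of each byte gives the 7-bit JIS code.
    return f"{hi & 0x7F:02x}{lo & 0x7F:02x}"


# 0xB0 0xA1 is the EUC form of the first kanji in JIS X 0208.
print(euc_to_jis(b"\xb0\xa1"))  # prints 3021
```

ASCII bytes (below 0x80) never pass the range check, so the same test can be used to tell the ASCII and kana/kanji parts of a line apart.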

Attention is drawn to the KANJIDIC LICENCE STATEMENT AND COPYRIGHT NOTICE included below in this document.

A similar file, KANJD212, is available for the 5,801 supplementary kanji in the JIS X 0212-1990 set. (Not yet available in this WWW format. LS)

Also look at Jeffrey Friedl's CGI database approach to KANJIDIC.


Organization

Since the amount of image data is much too large to include in a single .html file, a hierarchical organization was chosen. Each 2-byte JIS character code is written as a 4-digit hexadecimal number, and its first two hex digits are used as the top index.

jis-x30/*

jis-x31/*

jis-x32/*

jis-x33/*

jis-x34/*

jis-x35/*

jis-x36/*

jis-x37/*

jis-x38/*

jis-x39/*

jis-x3a/*

jis-x3b/*

jis-x3c/*

jis-x3d/*

jis-x3e/*

jis-x3f/*

jis-x40/*

jis-x41/*

jis-x42/*

jis-x43/*

jis-x44/*

jis-x45/*

jis-x46/*

jis-x47/*

jis-x48/*

jis-x49/*

jis-x4a/*

jis-x4b/*

jis-x4c/*

jis-x4d/*

jis-x4e/*

jis-x4f/*

jis-x50/*

jis-x51/*

jis-x52/*

jis-x53/*

jis-x54/*

jis-x55/*

jis-x56/*

jis-x57/*

jis-x58/*

jis-x59/*

jis-x5a/*

jis-x5b/*

jis-x5c/*

jis-x5d/*

jis-x5e/*

jis-x5f/*

jis-x60/*

jis-x61/*

jis-x62/*

jis-x63/*

jis-x64/*

jis-x65/*

jis-x66/*

jis-x67/*

jis-x68/*

jis-x69/*

jis-x6a/*

jis-x6b/*

jis-x6c/*

jis-x6d/*

jis-x6e/*

jis-x6f/*

jis-x70/*

jis-x71/*

jis-x72/*

jis-x73/*

jis-x74/*
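
The mapping from a JIS code to one of the directories listed above can be sketched in Python as follows. This is only an illustration of the naming scheme as it appears in the list (jis-x30 through jis-x74); the function name top_index and the range constants are assumptions made for the example, not part of the original C program or scripts.

```python
import string

# First and last top-index values that appear in the directory list.
FIRST_ROW, LAST_ROW = 0x30, 0x74


def top_index(jis_hex: str) -> str:
    """Return the directory name for a 4-digit hexadecimal JIS code.

    Hypothetical helper for illustration only.
    """
    code = jis_hex.lower()
    if len(code) != 4 or any(c not in string.hexdigits for c in code):
        raise ValueError("expected a 4-digit hexadecimal JIS code")
    row = int(code[:2], 16)
    if not FIRST_ROW <= row <= LAST_ROW:
        raise ValueError("code is outside the listed directories")
    # The first two hex digits select the directory.
    return f"jis-x{code[:2]}"


# All directory names in the list, generated from the range.
ALL_DIRS = [f"jis-x{row:02x}" for row in range(FIRST_ROW, LAST_ROW + 1)]

print(top_index("3021"))  # prints jis-x30
print(len(ALL_DIRS))      # prints 69
```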


COPYING AND DISTRIBUTION 

Permission  is  granted to make and distribute verbatim copies of these files 
provided  this KANJIDIC.DOC file,  the copyright notice and permission notice 
is distributed with all copies.  Any distribution  of  the  files  must  take 
place  without  a financial return,  except a charge to cover the cost of the 
distribution medium. 

Permission is granted to make and  distribute  extracts  or  subsets  of  the 
KANJIDIC files under the same conditions applying to verbatim copies. 

Permission  is granted to translate the English elements of the KANJIDIC file 
into other languages, and to make and distribute copies of those translations 
under the same conditions applying to verbatim copies. 

KANJIDIC USAGE

These files may be freely used by individuals and small groups for  reference 
and  research  purposes,  and  may  be  accessed by software belonging to, or 
operated by, such individuals and small groups. 

The files, extracts from the files, and translations of the files must not be 
sold  as  part  of  any  commercial  software  package,   nor  must  they  be 
incorporated in any published dictionary or other  printed  document  without 
the specific permission of the copyright holders. 

COPYRIGHT

Copyright  over  the  documents  covered  by  this statement is held by James 
William BREEN, subject to the exceptions outlined below. 

The following people have granted permission for material for which they hold 
copyright to be included in  the  files,  and  distributed  under  the  above 
conditions, while retaining their copyright over that material: 

Jack HALPERN: The SKIP codes and Frequency codes in the KANJIDIC file. 

With regard to the SKIP and Frequency codes, Mr Halpern stated as follows:

        "The commercial utilization of the frequency  numbers  is  prohibited 
        without written permission from Jack Halpern.  Use by individuals and 
        small  groups  for  reference and research purposes is permitted,  on 
        condition that acknowledgement of the  source  and  this  notice  are 
        included."                                                

        "SKIP  is  protected  by  copyright,  copyleft  and patent laws.  The 
        commercial utilization of SKIP in  any  form  is  strictly  forbidden 
        without  the  written  permission  of  Jack  Halpern,  the  copyright 
        holder." 

Christian WITTERN and Koichi YASUOKA: The Pinyin information in the KANJIDIC 
        file. 

Urs APP:  the Four Corner codes and the Morohashi information in the KANJIDIC 
        file. 

Mark SPAHN and Wolfgang HADAMITSKY: the kanji descriptors from their 
        dictionary.
        


KANJIDIC © Jim Breen
WWW version, Lambert Schomaker

since Sept '96