Summary of Research Project Results Under the JSPS FY2000
"Research for the Future Program"



1.Research Institution University of Tokyo
 
2.Research Area Physical and Engineering Sciences
 
3.Research Field Advanced Multimedia Information and Communication Systems
 
4.Term of Project FY1996〜FY2000
 
5.Project Number 96P00605
 
6.Title of Project Study for the Multilingual Text Processing on the Multimedia Communication System

7.Project Leader
Name Institution,Department Title of Position
Tamura, Takeshi University of Tokyo, Graduate School of Humanities and Sociology Professor

8.Core Members

Names Institution,Department Title of Position
Yamaguchi, Akiho University of Chuuou, Faculty of Letters Professor
Aoyagi, Masanori University of Tokyo, Graduate School of Humanities and Sociology Professor
Katayama, Hideo University of Tokyo, Graduate School of Humanities and Sociology Professor

9.Cooperating Researchers

Names Institution,Department Title of Position
Nagashima, Hiroaki University of Tokyo, Graduate School of Humanities and Sociology Professor
Ooki, Yasushi University of Tokyo, Graduate School of Humanities and Sociology Associate Professor
Koshizuka, Noburu University of Tokyo, Information Technology Center Associate Professor

10.Summary of Research Results

The first objective of the research project was to collect and identify all Japanese Kanji characters. This objective was attained with the publication of the GT2000 TrueType font set, which consists of 74,086 Kanji characters. Together with the GT2000 font set, a Kanji database for searching characters was published on the WWW. In addition to the Kanji characters, the project also produced 10,486 Kanji elements.
The final objective of the research project was to digitize as much multilingual text data as possible. This objective was made attainable by developing a method for handling low-precision OCR output in conjunction with high-resolution graphic data. After experimenting with the voluminous data of the Larousse Dictionnaire universel du XIXe siecle, the project built the "Kouki-Jiten" database using the GT2000 font set.
The members of the research project intend to continue their work on maintaining and extending the published fonts and databases.

11.Key Words

(1)multilingual, (2)Kanji characters, (3)font
(4)text data, (5)graphic data, (6)full-text search
(7)optical character recognition, (8)data bank, (9)communication system

12.References

[Reference Articles]
Author Title of Article
Tamura, Takeshi From Printing Type to Digital Font
Journal Volume Year Pages Concerned
Sakamura, Ken (ed.) Digital Museum   1997 36

Author Title of Article
Tamura, Takeshi For the Publication of 64,000 Kanji Font Set
Journal Volume Year Pages Concerned
Heibonsha (ed.) Computer Culture and Future of Kanji   1998 188-200

Author Title of Article
Nagashima, Hiroaki Incomplete Word Processors and Degraded Digital Documents
Journal Volume Year Pages Concerned
Heibonsha (ed.) Computer Culture and Future of Kanji   1998 145-156

Author Title of Article
Yamaguchi, Akiho Characters and Culture: Study of Multilingual Text Processing on the Multimedia Communication System
Journal Volume Year Pages Concerned
Japanese Scientific Monthly 657 1998 17-21

Author Title of Article
Ooki, Yasushi International Union Catalogue of Chinese Literature
Journal Volume Year Pages Concerned
Japanese Scientific Monthly 657 1998 54-57

[Reference Books]
Author Title of Book
Yamaguchi, Akiho (ed.) Shinsen-Kanji-Souran
Publisher Year Pages
Shougakukan 2001 715+779

Author Title of Book
JSPS Larousse Dictionnaire universel du XIXe siecle
Publisher Year Pages
SystemSoft 2001 DVD-ROM

