| 1.Research Institution | University of Tokyo | |
| 2.Research Area | Physical and Engineering Sciences | |
| 3.Research Field | Advanced Multimedia Information and Communication Systems | |
| 4.Term of Project | FY1996〜FY2000 | |
| 5.Project Number | 96P00605 | |
| 6.Title of Project | Study for the Multilingual Text Processing on the Multimedia Communication System |
| Name | Institution,Department | Title of Position |
| Tamura, Takeshi | University of Tokyo, Graduate School of Humanities and Sociology | Professor |
8.Core Members
| Names | Institution,Department | Title of Position |
| Yamaguchi, Akiho | Chuo University, Faculty of Letters | Professor |
| Aoyagi, Masanori | University of Tokyo, Graduate School of Humanities and Sociology | Professor |
| Katayama, Hideo | University of Tokyo, Graduate School of Humanities and Sociology | Professor |
9.Cooperating Researchers
| Names | Institution,Department | Title of Position |
| Nagashima, Hiroaki | University of Tokyo, Graduate School of Humanities and Sociology | Professor |
| Ooki, Yasushi | University of Tokyo, Graduate School of Humanities and Sociology | Associate Professor |
| Koshizuka, Noboru | University of Tokyo, Information Technology Center | Associate Professor |
10.Summary of Research Results
|
The first objective of the research project was to collect and identify all Japanese Kanji characters. This objective was attained with the publication of the GT2000 TrueType font set, which consists of 74,086 Kanji characters. Together with the GT2000 font set, a Kanji database for character search was published on the WWW. In addition to the Kanji characters, the project produced 10,486 Kanji elements. The final objective of the research project was to digitize as much multilingual text data as possible. This objective was made attainable by developing a method for handling low-precision OCR output in conjunction with high-resolution graphic data. After experimenting with the voluminous data of the Larousse Dictionnaire universel du XIXe siecle, the project built the "Kouki-Jiten" database using the GT2000 font set. The members of the research project intend to continue maintaining and extending the published fonts and databases. |
11.Key Words
(1) multilingual, (2) Kanji characters, (3) font
(4) text data, (5) graphic data, (6) full-text search
(7) optical character recognition, (8) data bank, (9) communication system
12.References
| Author | Title of Article | |||
| Tamura, Takeshi | From Printing Type to Digital Font | |||
| Journal | Volume | Year | Pages Concerned | |
| Sakamura, Ken (ed.) Digital Museum | | 1997 | 36 | |
| Author | Title of Article | |||
| Tamura, Takeshi | For the Publication of 64,000 Kanji Font Set | |||
| Journal | Volume | Year | Pages Concerned | |
| Heibonsha (ed.) Computer Culture and Future of Kanji | | 1998 | 188-200 | |
| Author | Title of Article | |||
| Nagashima, Hiroaki | Incomplete Word Processors and Degraded Digital Documents | |||
| Journal | Volume | Year | Pages Concerned | |
| Heibonsha (ed.) Computer Culture and Future of Kanji | | 1998 | 145-156 | |
| Author | Title of Article | |||
| Yamaguchi, Akiho | Characters and Culture: Study of Multilingual Text Processing on the Multimedia Communication System | |||
| Journal | Volume | Year | Pages Concerned | |
| Japanese Scientific Monthly | 657 | 1998 | 17-21 | |
| Author | Title of Article | |||
| Ooki, Yasushi | International Union Catalogue of Chinese Literature | |||
| Journal | Volume | Year | Pages Concerned | |
| Japanese Scientific Monthly | 657 | 1998 | 54-57 | |
| Author | Title of Book | ||
| Yamaguchi, Akiho (ed.) | Shinsen-Kanji-Souran | ||
| Publisher | Year | Pages | |
| Shougakukan | 2001 | 715+779 | |
| Author | Title of Book | ||
| JSPS | Larousse Dictionnaire universel du XIXe siecle | ||
| Publisher | Year | Pages | |
| SystemSoft | 2001 | DVD-ROM | |