Dataset of Pre-Modern Japanese Text
Image and text data of pre-modern Japanese texts held by the National Institute of Japanese Literature are released as open data. In addition, some texts come with descriptions, transcriptions, and tagging data.
Dataset of Edo Cooking Recipes
Cookbooks from the Edo period that are included in the Dataset of Pre-Modern Japanese Text were curated into recipe datasets through transcription, translation into modern Japanese, and structuring into a recipe format.
Kuzushiji Dataset
As a by-product of transcribing the Dataset of Pre-Modern Japanese Text (PMJT), the shapes and coordinates of old Japanese characters (Kuzushiji) were compiled into a separate training dataset, intended to make both machines and humans smarter.
KMNIST Dataset
Adapted from the Kuzushiji Dataset, the KMNIST dataset is a drop-in replacement for the MNIST dataset. We provide three datasets for different purposes: Kuzushiji-MNIST, Kuzushiji-49, and Kuzushiji-Kanji.
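As a rough illustration of what "drop-in replacement" can mean in practice, the sketch below parses the IDX binary layout used by the original MNIST files. It assumes that the KMNIST files you download follow the same layout (a 4-byte header, big-endian dimension sizes, then unsigned-byte data). The function name `parse_idx` and the synthetic sample are hypothetical and not part of any official tooling.

```python
import struct


def parse_idx(buf: bytes):
    """Parse an IDX buffer of unsigned bytes into (dims, flat_values).

    Assumption: the buffer uses the MNIST IDX layout with type code 0x08.
    """
    # Header: two zero bytes, a one-byte type code, a one-byte dimension count.
    zero, type_code, ndim = struct.unpack_from(">HBB", buf, 0)
    if zero != 0 or type_code != 0x08:
        raise ValueError("not an unsigned-byte IDX buffer")

    # Each dimension size is a big-endian 32-bit unsigned integer.
    dims = struct.unpack_from(f">{ndim}I", buf, 4)
    offset = 4 + 4 * ndim

    # The payload holds one byte per element, in row-major order.
    count = 1
    for d in dims:
        count *= d
    data = buf[offset:offset + count]
    if len(data) != count:
        raise ValueError("truncated IDX buffer")
    return dims, data


# Synthetic stand-in for an image file: two 2x2 "images" with pixel values 0..7.
sample = struct.pack(">HBB3I", 0, 0x08, 3, 2, 2, 2) + bytes(range(8))
dims, pixels = parse_idx(sample)
```

For a real file, the bytes would be read from disk first (for example with `gzip.open(path, "rb").read()` if the download is gzip-compressed). Code that already reads MNIST in this format should then need only a different file path.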
Collection of Facial Expressions
The project aims to build research infrastructure for art history by collecting facial expressions from Japanese Emaki (illustrated scrolls), and potentially from works of art across the globe, for the comparative study of style.
Dataset of Modern Magazines
Modern magazines are digitized and released as image datasets. The n2i project is constructing a dataset of modern documents in order to develop OCR for those documents.
Geoshape Repository
The Geoshape repository is a data repository for sharing the geographic shapes of geographic entities. It includes the "Historical Municipal Boundaries Dataset Beta Version", which covers the historical change of municipal boundaries since 1920, and the "Village Boundaries Dataset" of 2015.
Edo Maps Beta
Edo Maps Beta is a project to construct geographic information infrastructure for the urban space of Edo City by extracting and reconstructing information from Edo-period documents such as old maps of Edo.