MC VIII SAMOIEDICA 2: JURAK-SAMOIEDICA 1: Line-aligned Ground Truth

dc.contributor.author: Castrén, M. A.
dc.contributor.role: Günter Hackl DataCurator
dc.date.accessioned: 2025-04-29T13:59:52Z
dc.date.issued: 2021-12-05
dc.description: MC VIII SAMOIEDICA 2: JURAK-SAMOIEDICA 1: Line-aligned Ground Truth. This dataset contains 172 microfilm scans of Tundra Nenets materials, in which the text content is manually aligned line by line with the scanned images. The material was created in collaboration between the Finno-Ugrian Society and the University of Innsbruck. It is intended specifically for handwritten text recognition (HTR) experiments, training, and benchmarking. For electronic materials and printed volumes intended for linguistic, ethnographic, and folkloric research, please refer to other publications in this Zenodo collection or the [Manuscripta Castreaniana website](https://www.sgr.fi/manuscripta/). The materials were aligned at the University of Innsbruck, with contributions by Günter Mühlberger and Günter Hackl. Other contributors are Karina Lukin and Niko Partanen. The [Transkribus](https://readcoop.eu/transkribus/?sc=Transkribus) platform was used extensively in processing this dataset, and the files are a direct Transkribus export of images and PAGE XML.
dc.identifier: https://doi.org/10.5281/zenodo.5759599
dc.identifier.uri: https://datakatalogi.helsinki.fi/handle/123456789/4530
dc.rights.license: cc-by-4.0
dc.subject: Ground Truth
dc.subject: HTR
dc.subject: Tundra Nenets
dc.subject: M. A. Castrén
dc.subject: Manuscript
dc.subject: Linguistics
dc.subject: Ethnography
dc.title: MC VIII SAMOIEDICA 2: JURAK-SAMOIEDICA 1: Line-aligned Ground Truth
dc.type: dataset
