datalab-dulaurier

calfa-co/datalab-dulaurier

Ground truth for the Dulaurier project (HTR of Armenian manuscripts).

Stars: 3 Forks: 0 License: Apache-2.0 Data

Summary

A ground-truth dataset for Handwritten Text Recognition (HTR) of Armenian manuscripts from the Dulaurier collection at the Bibliothèque nationale de France (BnF). It contains 42 annotated images, each with XML markup describing text regions, baselines, and transcriptions, including expanded abbreviations. The dataset was created as part of a digital humanities project to train and evaluate HTR models on medieval Armenian historiography.
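As a rough illustration of how annotations of this kind might be read, the sketch below parses a small, made-up XML sample and collects each line's baseline and transcription. The PAGE-style element names (`TextRegion`, `TextLine`, `Baseline`, `TextEquiv`, `Unicode`), the sample content, and the helper names `parse_points` and `read_lines` are all assumptions for illustration; the dataset's actual schema may differ and should be checked against the files themselves.

```python
import xml.etree.ElementTree as ET

# Hypothetical sample in a PAGE-style layout. The element names, the
# namespace, and the placeholder text are assumptions, not dataset content.
SAMPLE = """<PcGts xmlns="http://schema.primaresearch.org/PAGE/gts/pagecontent/2019-07-15">
  <Page imageFilename="sample.jpg" imageWidth="1000" imageHeight="1500">
    <TextRegion id="r1">
      <Coords points="100,100 900,100 900,400 100,400"/>
      <TextLine id="r1l1">
        <Baseline points="110,150 890,152"/>
        <TextEquiv><Unicode>placeholder line one</Unicode></TextEquiv>
      </TextLine>
      <TextLine id="r1l2">
        <Baseline points="110,220 890,223"/>
        <TextEquiv><Unicode>placeholder line two</Unicode></TextEquiv>
      </TextLine>
    </TextRegion>
  </Page>
</PcGts>"""


def parse_points(points):
    """Turn 'x1,y1 x2,y2 ...' into a list of (x, y) integer tuples."""
    return [tuple(int(v) for v in pair.split(",")) for pair in points.split()]


def read_lines(xml_text):
    """Return one dict per text line: region id, line id, baseline, text."""
    root = ET.fromstring(xml_text)
    # Derive the namespace from the root tag instead of hard-coding it.
    ns = root.tag[1:root.tag.index("}")] if root.tag.startswith("{") else ""

    def q(name):
        return f"{{{ns}}}{name}" if ns else name

    lines = []
    for region in root.iter(q("TextRegion")):
        for line in region.findall(q("TextLine")):
            baseline = line.find(q("Baseline"))
            text = line.find(f"{q('TextEquiv')}/{q('Unicode')}")
            lines.append({
                "region": region.get("id"),
                "line": line.get("id"),
                "baseline": parse_points(baseline.get("points")) if baseline is not None else [],
                "text": (text.text or "") if text is not None else "",
            })
    return lines


if __name__ == "__main__":
    for entry in read_lines(SAMPLE):
        print(entry["line"], entry["baseline"], entry["text"])
```

Pairs of baseline coordinates and transcribed text like these are the usual input for training a line-level HTR model, which is why the sketch groups them per line rather than per region.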
