ArmTDP-NER
myavrum/ArmTDP-NER
A ~150K-token named entity corpus of Modern Eastern Armenian
Summary
ArmTDP-NER is a manually annotated, gold-standard named entity recognition (NER) corpus for Modern Eastern Armenian. It contains ~150K tokens drawn from 1,949 newspaper texts, follows the OntoNotes 5.0 scheme with its 18 entity types, and comes with detailed annotation statistics. The corpus supports NER and EDT tasks and has been used to train a model integrated into the Stanza NLP toolkit.
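To make the 18-type OntoNotes 5.0 label inventory concrete, here is a minimal Python sketch that lists the entity types and derives a token-level tag set from them. The BIO encoding and the helper names `bio_labels` and `spans_from_bio` are assumptions for illustration only; the summary does not state which tag encoding or file format the corpus uses.

```python
# The 18 entity types defined by the OntoNotes 5.0 scheme.
ONTONOTES_TYPES = [
    "PERSON", "NORP", "FAC", "ORG", "GPE", "LOC", "PRODUCT", "EVENT",
    "WORK_OF_ART", "LAW", "LANGUAGE", "DATE", "TIME", "PERCENT",
    "MONEY", "QUANTITY", "ORDINAL", "CARDINAL",
]


def bio_labels(types):
    """Build a BIO tag set: 'O' plus B-/I- tags per entity type.

    Assumption: BIO is used here only as a common convention, not as
    the corpus's documented encoding.
    """
    labels = ["O"]
    for entity_type in types:
        labels.append(f"B-{entity_type}")
        labels.append(f"I-{entity_type}")
    return labels


def spans_from_bio(tags):
    """Collapse a BIO tag sequence into (start, end, type) spans.

    `end` is exclusive. A stray I- tag whose type does not match the
    open span is treated leniently as the start of a new span.
    """
    spans = []
    start, current = None, None
    for i, tag in enumerate(tags):
        continues = tag.startswith("I-") and tag[2:] == current
        if not continues:
            # Close any open span, then open a new one unless the tag is 'O'.
            if current is not None:
                spans.append((start, i, current))
            start, current = None, None
            if tag != "O":
                start, current = i, tag[2:]
    if current is not None:
        spans.append((start, len(tags), current))
    return spans


if __name__ == "__main__":
    print(len(ONTONOTES_TYPES), "entity types")
    print(len(bio_labels(ONTONOTES_TYPES)), "BIO labels")
    print(spans_from_bio(["B-PERSON", "I-PERSON", "O", "B-GPE"]))
```

Under BIO, each type contributes two tags, so a tagger trained on this scheme would predict over 2 × 18 + 1 labels. Other encodings such as BIOES would give a larger tag set for the same 18 types.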