Together, these projects provide annotated datasets, corpora, and tools for computational and linguistic research on Armenian, covering its historical stages (Classical and Middle Armenian) and its modern Eastern and Western varieties.
The PROIEL Treebank is a linguistic dataset containing dependency treebank annotations for texts in ancient Indo-Euro...
A Universal Dependencies treebank for Eastern Armenian, providing manually annotated morphological and syntactic data...
A curated speech corpus of Armenian question-answer dialogues designed for intonation and prosody studies. It contain...
ReRooted-ArmenianCorpus is a work-in-progress speech corpus project that processes and cleans transcribed audio testi...
A Universal Dependencies (UD) treebank for Western Armenian, containing manually annotated morphological and syntacti...
ArmTDP-NER is a manually annotated gold-standard named entity recognition (NER) corpus for Modern Eastern Armenian, c...
A curated list of Armenian language datasets, corpora, models, and digital resources for NLP and computational lingui...
This repository is part of the AI2001 project, specifically for Armenian language linguistic datasets. The README sta...
A Universal Dependencies (UD) treebank for Classical Armenian, containing annotated texts from the Gospels and Movses...
A Universal Dependencies treebank for Eastern Armenian, manually annotated from the ArmTDP v2.0 corpus. It includes e...
A dataset repository for a stylometric study on Classical Armenian texts, specifically for authorship attribution of ...
This repository is part of the TITUS-2-0 project, which hosts digital editions of historical texts in various languag...
A Universal Dependencies treebank for Middle Armenian, containing manually annotated grammatical examples for linguis...
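Several of the treebanks listed above follow Universal Dependencies, which distributes annotations in the tab-separated, ten-column CoNLL-U format. The following is a minimal sketch of reading such data with the Python standard library only. The function name `read_conllu` and the two-token sample are illustrative assumptions: the sample is an invented toy sentence, not taken from any of the listed treebanks, and the sketch does not treat multiword-token ranges or empty nodes specially.

```python
from typing import Dict, Iterator, List

# The ten column names defined by the CoNLL-U format.
CONLLU_FIELDS = [
    "id", "form", "lemma", "upos", "xpos",
    "feats", "head", "deprel", "deps", "misc",
]


def read_conllu(text: str) -> Iterator[List[Dict[str, str]]]:
    """Yield each sentence as a list of token dicts (hypothetical helper)."""
    sentence: List[Dict[str, str]] = []
    for raw in text.splitlines():
        line = raw.rstrip("\r\n")
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments such as "# text = ..." are skipped here.
            continue
        cols = line.split("\t")
        if len(cols) != len(CONLLU_FIELDS):
            # Skip lines that do not have exactly ten columns.
            continue
        sentence.append(dict(zip(CONLLU_FIELDS, cols)))
    if sentence:
        # Emit the last sentence if the input lacks a trailing blank line.
        yield sentence


# Invented toy sample (an assumption for illustration, not real treebank data).
SAMPLE = "\n".join([
    "# text = Dogs bark",
    "\t".join(["1", "Dogs", "dog", "NOUN", "_", "Number=Plur", "2", "nsubj", "_", "_"]),
    "\t".join(["2", "bark", "bark", "VERB", "_", "_", "0", "root", "_", "_"]),
    "",
])

if __name__ == "__main__":
    for tokens in read_conllu(SAMPLE):
        for tok in tokens:
            print(tok["id"], tok["form"], tok["upos"], tok["head"], tok["deprel"])
```

For real work on these treebanks, a maintained CoNLL-U reader is likely a better choice than this sketch, since it would also handle multiword tokens, empty nodes, and sentence metadata.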