This cluster collects linguistic datasets and tools for documenting, annotating, and computationally processing Armenian language varieties, including modern, historical, and dialectal forms.
The PROIEL Treebank is a linguistic dataset providing dependency annotations for texts in ancient Indo-Euro...
A Universal Dependencies treebank for Eastern Armenian, providing manually annotated morphological and syntactic data...
A Universal Dependencies (UD) treebank for Western Armenian, containing manually annotated morphological and syntacti...
A curated speech corpus of Armenian question-answer dialogues designed for intonation and prosody studies. It contain...
ReRooted-ArmenianCorpus is a repository focused on cleaning and preparing a speech corpus from the ReRooted Archive, ...
ArmTDP-NER is a manually annotated gold-standard named entity recognition (NER) corpus for Modern Eastern Armenian, c...
A Universal Dependencies treebank for Classical Armenian, containing annotated texts from the Gospels and Movses Khor...
A Universal Dependencies treebank for Eastern Armenian, manually annotated from the ArmTDP v2.0 corpus. It includes e...
A fieldwork data archive for the Iranian Armenian dialect, containing audio recordings, transcriptions, and linguisti...
A public repository containing an annotated version of the Kouyoumdjian 1970 Armenian-English dictionary. It provides...
A Universal Dependencies treebank for Middle Armenian, manually annotated with morphological and syntactic data, deri...
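Several of the entries above are Universal Dependencies treebanks, which are distributed in the CoNLL-U format: one token per line with ten tab-separated columns, `#` comment lines for sentence metadata, and a blank line between sentences. The following is a minimal sketch of reading that format using only the Python standard library. The function name `read_conllu` and the `SAMPLE` sentence are hypothetical, written for illustration only; the sample is not drawn from any of the treebanks listed here.

```python
from typing import Dict, Iterator, List

# The ten column names defined by the CoNLL-U format, in order.
FIELDS = ["id", "form", "lemma", "upos", "xpos",
          "feats", "head", "deprel", "deps", "misc"]


def read_conllu(text: str) -> Iterator[List[Dict[str, str]]]:
    """Yield each sentence as a list of token dicts keyed by FIELDS."""
    sentence: List[Dict[str, str]] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments (e.g. "# text = ...") are skipped here.
            continue
        columns = line.split("\t")
        if len(columns) != len(FIELDS):
            raise ValueError(
                f"expected {len(FIELDS)} tab-separated columns, got {len(columns)}"
            )
        sentence.append(dict(zip(FIELDS, columns)))
    if sentence:
        # Handle input that does not end with a blank line.
        yield sentence


# Toy sample written for this sketch; NOT taken from any listed treebank.
SAMPLE = (
    "# text = Dogs bark.\n"
    "1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_\n"
    "2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No\n"
    "3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_\n"
    "\n"
)

sentences = list(read_conllu(SAMPLE))
for token in sentences[0]:
    print(token["id"], token["form"], token["upos"], token["head"], token["deprel"])
```

This sketch keeps every column as a raw string and does not treat multiword-token ranges or empty nodes specially; a dedicated CoNLL-U library would likely be a better fit for real processing.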