This cluster consists of projects that create and curate annotated datasets, corpora, and computational tools for the study and processing of Armenian language varieties, including Western, Eastern, Middle, and Classical Armenian.
The PROIEL Treebank is a linguistic dataset containing dependency treebank annotations for texts in ancient Indo-Euro...
A Universal Dependencies treebank for Eastern Armenian, providing manually annotated morphological and syntactic data...
A curated speech corpus of Armenian question-answer dialogues designed for intonation and prosody studies. It contain...
ReRooted-ArmenianCorpus is a work-in-progress speech corpus project that processes and cleans transcribed audio testi...
A Universal Dependencies (UD) treebank for Western Armenian, containing manually annotated morphological and syntacti...
ArmTDP-NER is a manually annotated gold-standard named entity recognition (NER) corpus for Modern Eastern Armenian, c...
This repository is part of the AI2001 project and holds linguistic datasets for the Armenian language. The README sta...
A Universal Dependencies (UD) treebank for Classical Armenian, containing annotated texts from the Gospels and Movses...
A Universal Dependencies treebank for Eastern Armenian, manually annotated from the ArmTDP v2.0 corpus. It includes e...
A fieldwork data archive for the Iranian Armenian dialect, containing audio recordings, transcriptions, and linguisti...
A public repository containing an annotated version of the Kouyoumdjian 1970 Armenian-English dictionary. It provides...
A Universal Dependencies treebank for Middle Armenian, containing manually annotated grammatical examples for linguis...
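
Several entries above are Universal Dependencies treebanks, which are distributed in the tab-separated CoNLL-U format (ten columns per token, blank lines between sentences, `#` for comment lines). As a rough illustration of how such data might be loaded, here is a minimal sketch of a reader. The names `Token`, `parse_feats`, `read_conllu`, and `SAMPLE` are hypothetical, and the sample sentence is a toy invented for this example rather than text from any of the treebanks listed.

```python
from dataclasses import dataclass
from typing import Dict, Iterator, List, Optional

# Toy CoNLL-U input, invented for illustration; NOT taken from any treebank.
SAMPLE = """\
# sent_id = toy-1
# text = Dogs bark.
1\tDogs\tdog\tNOUN\t_\tNumber=Plur\t2\tnsubj\t_\t_
2\tbark\tbark\tVERB\t_\t_\t0\troot\t_\tSpaceAfter=No
3\t.\t.\tPUNCT\t_\t_\t2\tpunct\t_\t_

"""


@dataclass
class Token:
    """A subset of the ten CoNLL-U columns (hypothetical container)."""
    id: int
    form: str
    lemma: str
    upos: str
    feats: Dict[str, str]
    head: Optional[int]
    deprel: str


def parse_feats(raw: str) -> Dict[str, str]:
    """Turn 'Case=Nom|Number=Sing' into a dict; '_' means no features."""
    if raw == "_":
        return {}
    return dict(pair.split("=", 1) for pair in raw.split("|"))


def read_conllu(text: str) -> Iterator[List[Token]]:
    """Yield one list of Token objects per sentence."""
    sentence: List[Token] = []
    for line in text.splitlines():
        if not line.strip():
            # A blank line closes the current sentence.
            if sentence:
                yield sentence
                sentence = []
            continue
        if line.startswith("#"):
            # Sentence-level comments (sent_id, text, ...) are skipped here.
            continue
        cols = line.split("\t")
        if len(cols) != 10:
            raise ValueError(f"expected 10 columns, got {len(cols)}: {line!r}")
        # Skip multiword-token ranges (e.g. '1-2') and empty nodes (e.g. '1.1').
        if "-" in cols[0] or "." in cols[0]:
            continue
        head = int(cols[6]) if cols[6] != "_" else None
        sentence.append(
            Token(int(cols[0]), cols[1], cols[2], cols[3],
                  parse_feats(cols[5]), head, cols[7])
        )
    if sentence:
        # Handle input that does not end with a blank line.
        yield sentence


if __name__ == "__main__":
    for sent in read_conllu(SAMPLE):
        for tok in sent:
            print(tok.id, tok.form, tok.upos, tok.head, tok.deprel)
```

This sketch drops the XPOS, enhanced-dependency, and MISC columns for brevity. For real work with the treebanks, an established CoNLL-U library is likely a better choice than a hand-rolled reader.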