This cluster consists of projects that create and curate linguistic datasets, corpora, and computational tools for the Armenian language, spanning its historical, dialectal, and modern varieties.
The PROIEL Treebank is a linguistic dataset containing dependency treebank annotations for texts in ancient Indo-Euro...
A Universal Dependencies treebank for Eastern Armenian, providing manually annotated morphological and syntactic data...
A curated speech corpus of Armenian question-answer dialogues designed for intonation and prosody studies. It contain...
ReRooted-ArmenianCorpus is a work-in-progress speech corpus project that processes and cleans transcribed audio testi...
A Universal Dependencies (UD) treebank for Western Armenian, containing manually annotated morphological and syntacti...
ArmTDP-NER is a manually annotated gold-standard named entity recognition (NER) corpus for Modern Eastern Armenian, c...
This repository is part of the AI2001 project and is dedicated to Armenian-language linguistic datasets. The README sta...
A Universal Dependencies (UD) treebank for Classical Armenian, containing annotated texts from the Gospels and Movses...
A Universal Dependencies treebank for Eastern Armenian, manually annotated from the ArmTDP v2.0 corpus. It includes e...
A fieldwork data archive for the Iranian Armenian dialect, containing audio recordings, transcriptions, and linguisti...
A public repository containing an annotated version of the Kouyoumdjian 1970 Armenian-English dictionary. It provides...
A Universal Dependencies treebank for Middle Armenian, containing manually annotated grammatical examples for linguis...