ReRooted-ArmenianCorpus
jhdeov/ReRooted-ArmenianCorpus
ReRooted: Speech corpus of Syrian Armenian refugee testimonials
Summary
ReRooted-ArmenianCorpus is a repository for cleaning and preparing a speech corpus from the ReRooted Archive, which contains nearly 80 hours of transcribed testimonials from Syrian Armenian refugees. The work involves converting SRT transcripts to TextGrids, manually aligning and correcting utterance boundaries, and linking each transcript to its audio file, which is stored externally. The project aims to create a high-quality linguistic resource for Armenian NLP and speech processing research.
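To illustrate the SRT-to-TextGrid step, here is a minimal sketch in Python. It is not the repository's own conversion script, which is not shown here. The function names (`parse_srt`, `fill_gaps`, `to_textgrid`) and the tier name `utterances` are hypothetical. The sketch assumes a well-formed SRT file and writes Praat's long text format, in which an interval tier has to be contiguous, so silences between subtitles become empty intervals.

```python
import re

# SRT timestamp line, e.g. "00:00:01,500 --> 00:00:04,000"
TIME_RE = re.compile(
    r"(\d+):(\d{2}):(\d{2})[,.](\d{3})\s*-->\s*(\d+):(\d{2}):(\d{2})[,.](\d{3})"
)


def parse_srt(text):
    """Return a list of (start_sec, end_sec, label) tuples from SRT content."""
    cues = []
    # Cues are separated by blank lines.
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        for i, line in enumerate(lines):
            m = TIME_RE.search(line)
            if m:
                h1, m1, s1, ms1, h2, m2, s2, ms2 = map(int, m.groups())
                start = h1 * 3600 + m1 * 60 + s1 + ms1 / 1000
                end = h2 * 3600 + m2 * 60 + s2 + ms2 / 1000
                # Everything after the timestamp line is the subtitle text.
                cues.append((start, end, " ".join(lines[i + 1:]).strip()))
                break
    return cues


def fill_gaps(cues, total_duration=None):
    """Make intervals contiguous by inserting empty intervals in the gaps."""
    intervals = []
    cursor = 0.0
    for start, end, label in sorted(cues):
        start = max(start, cursor)  # clip overlaps with the previous cue
        if end <= start:
            continue  # skip cues swallowed entirely by an overlap
        if start > cursor:
            intervals.append((cursor, start, ""))
        intervals.append((start, end, label))
        cursor = end
    # Optionally pad to the audio length (an assumed, caller-supplied value).
    if total_duration is not None and total_duration > cursor:
        intervals.append((cursor, total_duration, ""))
    return intervals


def to_textgrid(intervals, tier_name="utterances"):
    """Serialize contiguous intervals as a one-tier Praat TextGrid (long format)."""
    xmax = intervals[-1][1] if intervals else 0.0
    out = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {xmax}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{tier_name}"',
        "        xmin = 0",
        f"        xmax = {xmax}",
        f"        intervals: size = {len(intervals)}",
    ]
    for i, (start, end, label) in enumerate(intervals, 1):
        label = label.replace('"', '""')  # Praat escapes quotes by doubling them
        out += [
            f"        intervals [{i}]:",
            f"            xmin = {start}",
            f"            xmax = {end}",
            f'            text = "{label}"',
        ]
    return "\n".join(out) + "\n"
```

A typical call chain would be `to_textgrid(fill_gaps(parse_srt(srt_text), total_duration))`, with the result written to a UTF-8 `.TextGrid` file. Boundaries produced this way only mirror the subtitle timings, so they would still need the manual alignment and correction described above.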