ReRooted-ArmenianCorpus

jhdeov/ReRooted-ArmenianCorpus

ReRooted: Speech corpus of Syrian Armenian refugee testimonials

Stars: 3 Forks: 1 License: GPL-3.0 Data

Summary

ReRooted-ArmenianCorpus is a repository for cleaning and preparing a speech corpus from the ReRooted Archive, which contains nearly 80 hours of transcribed testimonials from Syrian Armenian refugees. The work involves converting the SRT transcripts to TextGrids, manually aligning and correcting utterance boundaries, and linking each transcript to its audio file, which is stored externally. The goal is a high-quality linguistic resource for Armenian NLP and speech processing research.
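The SRT-to-TextGrid step can be illustrated with a minimal sketch. This is not the repository's own conversion script: the function names (`parse_srt`, `fill_gaps`, `to_textgrid`) and the tier name `utterances` are hypothetical, and the sketch assumes well-formed SRT input and Praat's long text TextGrid format. Because an interval tier must cover the timeline without gaps, silent stretches between subtitles are filled with empty intervals.

```python
import re

# Hypothetical helpers for illustration; not taken from the repository.
TIME = re.compile(r"(\d+):(\d{2}):(\d{2})[,.](\d{3})")


def to_seconds(stamp):
    """Convert an SRT timestamp such as 00:01:02,500 to seconds."""
    h, m, s, ms = map(int, TIME.fullmatch(stamp.strip()).groups())
    return h * 3600 + m * 60 + s + ms / 1000


def parse_srt(text):
    """Return a list of (start, end, text) cues from SRT content."""
    cues = []
    for block in re.split(r"\n\s*\n", text.strip()):
        lines = block.strip().splitlines()
        idx = next((i for i, ln in enumerate(lines) if "-->" in ln), None)
        if idx is None:
            continue  # skip blocks without a timing line
        start, end = lines[idx].split("-->")
        body = " ".join(ln.strip() for ln in lines[idx + 1:])
        # end.split()[0] drops any positioning info after the end time
        cues.append((to_seconds(start), to_seconds(end.split()[0]), body))
    return cues


def fill_gaps(cues, total=None):
    """Make cues contiguous from 0 by inserting empty intervals."""
    intervals, cursor = [], 0.0
    for start, end, text in sorted(cues):
        start = max(start, cursor)  # clip overlap with the previous cue
        if end <= start:
            continue
        if start > cursor:
            intervals.append((cursor, start, ""))
        intervals.append((start, end, text))
        cursor = end
    if total is not None and total > cursor:
        intervals.append((cursor, total, ""))  # pad to the audio duration
    return intervals


def to_textgrid(intervals, tier="utterances"):
    """Serialize contiguous intervals as a one-tier long-format TextGrid."""
    xmax = intervals[-1][1] if intervals else 0.0
    out = [
        'File type = "ooTextFile"',
        'Object class = "TextGrid"',
        "",
        "xmin = 0",
        f"xmax = {xmax}",
        "tiers? <exists>",
        "size = 1",
        "item []:",
        "    item [1]:",
        '        class = "IntervalTier"',
        f'        name = "{tier}"',
        "        xmin = 0",
        f"        xmax = {xmax}",
        f"        intervals: size = {len(intervals)}",
    ]
    for i, (a, b, text) in enumerate(intervals, 1):
        escaped = text.replace('"', '""')  # Praat doubles quotes in strings
        out += [
            f"        intervals [{i}]:",
            f"            xmin = {a}",
            f"            xmax = {b}",
            f'            text = "{escaped}"',
        ]
    return "\n".join(out) + "\n"
```

A typical call would be `to_textgrid(fill_gaps(parse_srt(srt_text), total=audio_duration))`, where `audio_duration` is the length of the matching recording in seconds. Subtitle timings are usually only approximate, so a conversion like this is a starting point for the manual boundary correction described above, not a replacement for it.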
