saro2808/loanword-detection-in-armenian
Loanword detection in Armenian
Summary
This research project explores machine learning approaches to detecting loanwords in Armenian and predicting their language of origin. The repository contains Jupyter notebooks that implement feature extraction (syllables, n-grams, byte-pair encoding (BPE)) and classification models (Logistic Regression, Random Forest, CatBoost), applied to a manually curated dataset of 862 loanwords from 36 languages and 865 native Armenian words.
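To illustrate the n-gram feature extraction mentioned above, here is a minimal sketch. It assumes character-level n-grams with word-boundary markers, which the summary does not specify; the function name `char_ngrams`, its parameters, and the example word are hypothetical and are not taken from the repository's notebooks.

```python
from collections import Counter


def char_ngrams(word: str, n_min: int = 2, n_max: int = 3) -> Counter:
    """Count character n-grams of a word (hypothetical helper, not from the repo).

    The word is wrapped in "<" and ">" so that prefixes and suffixes
    produce n-grams distinct from word-internal ones.
    """
    padded = f"<{word}>"
    counts = Counter()
    for n in range(n_min, n_max + 1):
        # Slide a window of width n across the padded word.
        for i in range(len(padded) - n + 1):
            counts[padded[i:i + n]] += 1
    return counts


# Example with an arbitrary Armenian word ("տուն", "house"), used here
# only as an input string; no claim is made about its origin label.
features = char_ngrams("տուն")
for ngram, count in sorted(features.items()):
    print(ngram, count)
```

Counts like these could then be turned into a fixed-length vector and passed to any of the classifiers listed in the summary; how the notebooks actually do this is not stated here.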