workshop-TUMO2025
CVidalG/workshop-TUMO2025
Learning lab about Damaged Document Analysis and Historical Document Enhancement
Summary
A workshop project from TUMO Labs Armenia, created in partnership with the National Library of Armenia. It gives a practical introduction to AI techniques for analyzing historical documents, with a focus on Armenian newspapers. The repository contains a Streamlit web application that implements a three-stage pipeline: document quality assessment, enhancement (using YOLO for object detection), and post-processing (Tesseract OCR, plus OpenAI's API for text correction). It is an educational resource that includes code, instructions, and sample data.
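The stage order described above can be sketched as a small function that chains pluggable steps. This is only an illustration and is not taken from the repository: `PageResult`, `run_pipeline`, `demo`, and the stand-in stages are hypothetical names, and the actual application may organize its stages differently. In a real setup, the `ocr` callable could wrap something like `pytesseract.image_to_string`, and `correct` could wrap a request to OpenAI's API.

```python
from dataclasses import dataclass
from typing import Any, Callable


@dataclass
class PageResult:
    """Hypothetical container for the outputs of one pipeline run."""
    quality: float        # score returned by the assessment stage
    raw_text: str         # OCR output before correction
    corrected_text: str   # text after the correction stage


def run_pipeline(
    image: Any,
    assess: Callable[[Any], float],
    enhance: Callable[[Any], Any],
    ocr: Callable[[Any], str],
    correct: Callable[[str], str],
) -> PageResult:
    """Run assessment, enhancement, OCR, and text correction in order."""
    quality = assess(image)       # stage 1: quality assessment
    enhanced = enhance(image)     # stage 2: enhancement
    raw_text = ocr(enhanced)      # stage 3a: OCR on the enhanced page
    return PageResult(quality, raw_text, correct(raw_text))  # stage 3b


def demo() -> PageResult:
    """Exercise the pipeline with stand-in stages (no model or API key)."""
    page = "  Sample   scanned page  "  # a string stands in for an image
    return run_pipeline(
        page,
        assess=lambda img: 0.5,                        # fixed dummy score
        enhance=lambda img: img.strip(),               # dummy "cleanup"
        ocr=lambda img: img,                           # dummy "recognition"
        correct=lambda text: " ".join(text.split()),   # collapse whitespace
    )
```

Passing each stage as a callable keeps the sketch runnable without any model weights or credentials, and makes it easy to swap a stand-in for a real detector, OCR engine, or correction call one stage at a time.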