parameter-golf-armenia

Scivity/parameter-golf-armenia

Python | Stars: 0 | Forks: 0 | ML/AI

Summary

A submission to the "Parameter Golf Armenia" competition that tunes the hyperparameters of a GPT language model training script. The author adjusted the optimizer settings (Muon + AdamW), learning rates, sequence length, and dataset sharding, and reports a lower (better) validation bits-per-byte (bpb) on the FineWeb dataset than the baseline configuration.
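For readers unfamiliar with the metric, the sketch below shows one common way to derive bits-per-byte from a mean per-token cross-entropy loss. It is an illustration only and is not taken from the repository: the function name `bits_per_byte` and its parameters are hypothetical, and it assumes the loss is measured in nats and that the number of UTF-8 bytes covered by the evaluated tokens is known.

```python
import math


def bits_per_byte(mean_loss_nats: float, num_tokens: int, num_bytes: int) -> float:
    """Convert a mean per-token cross-entropy (in nats) to bits per byte.

    Hypothetical helper for illustration; the competition's own script
    may compute the metric differently.
    """
    if num_tokens <= 0 or num_bytes <= 0:
        raise ValueError("num_tokens and num_bytes must be positive")
    # nats -> bits: divide by ln(2)
    bits_per_token = mean_loss_nats / math.log(2)
    # bits per token -> bits per byte: scale by tokens per byte
    return bits_per_token * (num_tokens / num_bytes)


if __name__ == "__main__":
    # Made-up numbers, chosen only to show the call.
    print(bits_per_byte(mean_loss_nats=2.0, num_tokens=1_000, num_bytes=4_000))
```

Because the result is normalized by bytes rather than tokens, it stays comparable across runs that change tokenization or sequence length, and lower values are better.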
