archive

browse by topic or date.

research

infinite - a rubric driven prioritized replay to maximise continual learning · August 20, 2025 · 6 min
a reinforcement learning replay mechanism that uses rubric-based prioritization to optimize continual learning through evaluation and adaptive curriculum design
avatarl: training language models from scratch with pure reinforcement learning · August 9, 2025 · 17 min
replacing cross-entropy pretraining with a principled rl objective using expert-consensus rewards over active tokens

how i bring the best out of claude code - part 2 · June 20, 2025 · 6 min
custom commands, multi-agent systems, and the protocols that made claude code actually useful
how i bring the best out of claude code - part 1 · June 15, 2025 · 3 min
a comprehensive guide to effective claude code usage, context management, and building local multi-agent systems

an alchemical outlook to finding human gold · January 25, 2025 · 4 min
a framework for evaluating people through proof of interest, work, excellence, and exceptionalism
welcome · August 24, 2024 · 2 min
my journey from intel to llm research and the models that defined my path