archive
browse by topic or date.
research
- infinite - a rubric driven prioritized replay to maximise continual learning
a reinforcement learning replay mechanism that uses rubric-based prioritization to optimize continual learning through evaluation and adaptive curriculum design
- avatarl: training language models from scratch with pure reinforcement learning
replacing cross-entropy pretraining with a principled rl objective using expert-consensus rewards over active tokens
technical
- how i bring the best out of claude code - part 2
custom commands, multi-agent systems, and the protocols that made claude code actually useful
- how i bring the best out of claude code - part 1
a comprehensive guide to effective claude code usage, context management, and building local multi-agent systems
personal
- an alchemical outlook to finding human gold
a framework for evaluating people through proof of interest, work, excellence, and exceptionalism
- welcome
my journey from intel to llm research and the models that defined my path
2025
- Aug 20: infinite - a rubric driven prioritized replay to maximise continual learning (research)
- Aug 9: avatarl: training language models from scratch with pure reinforcement learning (research)
- Jun 20: how i bring the best out of claude code - part 2 (technical)
- Jun 15: how i bring the best out of claude code - part 1 (technical)
- Jan 25: an alchemical outlook to finding human gold (personal)
2024
- Aug 24: welcome (personal)