Posts
Projects
- REINFORCE, landing on planets, and playing Flappy-Bird (Code) - 06 June 2026
- Practicing ML on Deep-ML, and fixing one of its problems (Visit) - 03 June 2026
- Backprop through time, from scratch - 16 May 2026
- A first LSTM in PyTorch, classifying MNIST as a sequence (Code) - 16 May 2026
- Five PyTorch notebooks I typed out by hand (Code) - 16 May 2026
- Transfer learning lost to my small CNN on FER2013 (Code) - 14 May 2026
- First PyTorch project, an MNIST classifier (Code) - 12 May 2026
- HackerRank Orchestrate 2026: a support triage agent that doesn't hallucinate (Code) - 03 May 2026
- A YouTube debate-video aggregator with AI recommendations (Visit, Code) - 03 May 2026
- A full-stack web app for eight dollars a year (Visit) - 06 March 2026
Notes
- PPO, and keeping the policy on a short leash - 09 June 2026
- GRPO, and letting the group be the baseline - 09 June 2026
- Soft Actor-Critic, and paying the agent to stay unsure - 08 June 2026
- Dueling DQN, and splitting a state's worth from an action's - 04 June 2026
- Simply explained: dueling DQN - 04 June 2026
- Double DQN, and the trouble with taking the max - 04 June 2026
- Simply explained: double DQN - 04 June 2026
- Quantization notes, from FP32 down to packed 4-bit weights - 31 May 2026
- Eigenvectors: vectors for which a matrix doesn't cause a change in direction - 30 May 2026
- Basic linear algebra notes - 29 May 2026