Studying Generalization In a Toy Data Setting
TL;DR: When post-training data contains correlated features, models learn all of them, but weight them by both intrinsic salience and semantic relevance to the target behaviour. We establish a consistent ranking of feature salience across model families, and show that features which are more predictive of, or more semantically related to, the intended behaviour are learnt more strongly.

During pre-training, an LLM learns a distribution over its training data. Because the data is so broad, this approximates the true distribution of internet text. Post-training then narrows this distribution to a set of desired behaviours, roughly summarised as the "assistant persona"[1]. The model typically has far more parameters than the post-training data requires, and so is able to overfit. In Chunky Post-Training[2] we show that this happens in practice. ...
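The overparameterisation point can be made concrete with a toy sketch (not from the post; the data and model here are illustrative assumptions): a model with more free parameters than training examples can drive its training error to essentially zero, i.e. it has the capacity to overfit the post-training set.

```python
import numpy as np

# Hypothetical toy setup: 5 "post-training" examples of a noisy target.
rng = np.random.default_rng(0)
x = np.linspace(-1.0, 1.0, 5)
y = np.sin(3 * x) + 0.1 * rng.standard_normal(5)

# An overparameterised linear model: a degree-8 polynomial has 9
# coefficients -- more parameters than the 5 data points.
A = np.vander(x, 9)
coeffs, *_ = np.linalg.lstsq(A, y, rcond=None)

# With more parameters than examples, the least-squares fit
# interpolates the data: the training error is numerically zero.
train_err = np.max(np.abs(A @ coeffs - y))
print(f"max training error: {train_err:.2e}")
```

The fit says nothing about behaviour off the training points; which of several correlated features such a model latches onto is exactly the question the post goes on to study.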