Just over a year ago, I wrote about how integer tokenization in the GPT2 and GPT3 tokenizers was insane. This was because it failed to create a coherent number representation in token space: a large number of integers were assigned a single unique token, and even multi-token integers were...
[Read More]
Alignment In The Age Of Synthetic Data
Synthetic data is a new frontier in AI training. Phi3, Llama3, and other recent models have demonstrated that large amounts of well-tailored synthetic data can significantly improve the performance of small models, bringing them closer to the frontier by cheaply and implicitly distilling from larger, more powerful models....
[Read More]
Does scaffolding help humans?
Epistemic status: Shower thoughts
[Read More]
Addendum to the Surprising Parameter Efficiency of Vision Models
In a post from last year, On the Surprising Parameter Efficiency of Vision Models, I discussed a question that had been puzzling me at the time: image models appear to reach or exceed human parity with significantly fewer parameters than the brain seemingly uses. This...
[Read More]
Fertility, Inheritance, and the Concentration of Wealth
Epistemic status: Shower thoughts
[Read More]