Discussion
LLM Neuroanatomy II: Modern LLM Hacking and hints of a Universal Language?
lostmsu: How's the reproducibility of the results? Like avg score of 10 runs vs original.
yodon: If you look at convolutional neural nets used in image processing, it's super common for the first layer or so to learn a family of wavelet-like basis functions. Later layers then do recognition in that wavelet space, without that space ever being explicitly specified or communicated to the training algorithm. The work here is obviously more complex than that, but it suggests something similar is going on, with early layers transforming inputs into some sort of generalized basis functions that define a universal language representation.
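(For readers unfamiliar with the filters yodon describes: the oriented, Gaussian-windowed oscillations that first conv layers tend to learn look a lot like Gabor wavelets. A minimal sketch, using hand-built Gabor kernels rather than learned weights, showing how an oriented filter bank responds selectively to oriented structure in an image:)

```python
import numpy as np

def gabor(size, theta, lam=4.0, sigma=2.0):
    """An oriented Gabor kernel: a Gaussian window times a cosine carrier.

    First-layer conv filters in trained vision nets often converge to
    kernels of roughly this shape.
    """
    r = size // 2
    y, x = np.mgrid[-r:r + 1, -r:r + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)  # coordinate along the carrier
    envelope = np.exp(-(x**2 + y**2) / (2 * sigma**2))
    return envelope * np.cos(2 * np.pi * xr / lam)

def conv_valid(img, kernel):
    """'Valid'-mode 2-D cross-correlation via sliding windows."""
    win = np.lib.stride_tricks.sliding_window_view(img, kernel.shape)
    return np.einsum('ijkl,kl->ij', win, kernel)

# A toy image: vertical stripes (intensity varies along columns, period 4 px).
cols = np.arange(32)
img = np.cos(2 * np.pi * cols / 4.0) * np.ones((32, 1))

# A two-orientation "filter bank" standing in for a learned first layer.
bank = [gabor(9, t) for t in (0.0, np.pi / 2)]
energies = [np.sum(conv_valid(img, k) ** 2) for k in bank]
# The 0-rad filter (tuned to vertical structure) fires far more strongly
# than the 90-degree one; later layers would consume these response maps.
```

(Purely illustrative: real learned filters are found by training, not constructed; the point is just that the "basis" the early layers settle into is orientation- and frequency-selective, like a wavelet decomposition.)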