Discussion
wk_end: > 25K parameters is about 70 million times smaller than GPT-4. It will produce broken sentences. That's the point - the architecture works at this scale.

Since it seems to just produce broken and nonsensical sentences (at least based on the one example given), I'm not sure it does work at this scale.

Anyway, as written this passage doesn't really make a whole lot of sense (the point is that it produces broken sentences?), and given that it was almost certainly written by an AI, it demonstrates that the architecture doesn't work especially well at any scale (I kid, I kid).
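[ed.: the "70 million times" figure roughly checks out as a back-of-envelope estimate, assuming the widely repeated but unofficial ~1.8 trillion parameter count for GPT-4:]

```python
# Sanity-check of the "~70 million times smaller" claim.
# GPT-4's size is not public; ~1.8e12 parameters is an assumed,
# commonly cited estimate, not a confirmed number.
gpt4_params = 1.8e12   # assumed estimate
tiny_params = 25_000   # the model under discussion
ratio = gpt4_params / tiny_params
print(f"{ratio:,.0f}x smaller")  # ~72,000,000x
```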
harel: Eliza called, and asked if we saw her grand kids...
tclancy: What makes you say that? This is about you, not me.

(Came here to say an update to Eliza could really mess with the last person still talking to her.)
forinti: How does it compare to a Markov chain generator, I wonder.
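[ed.: the comparison point can be made concrete. A word-level Markov chain generator — the classic baseline for this kind of tiny text model — fits in a few lines; a minimal sketch (not the model under discussion):]

```python
import random
from collections import defaultdict

def train(text, order=2):
    # Map each context of `order` consecutive words to the words
    # observed to follow it in the training text.
    model = defaultdict(list)
    words = text.split()
    for i in range(len(words) - order):
        key = tuple(words[i:i + order])
        model[key].append(words[i + order])
    return model

def generate(model, length=10, seed=0):
    # Start from a random context, then repeatedly sample a successor.
    # Assumes order=2 contexts (matches train()'s default).
    rng = random.Random(seed)
    out = list(rng.choice(list(model)))
    for _ in range(length):
        successors = model.get(tuple(out[-2:]))
        if not successors:
            break  # dead end: context never seen mid-text
        out.append(rng.choice(successors))
    return " ".join(out)
```

Unlike a transformer, the Markov table only memorizes literal n-gram continuations, so at 25K-parameter scale the interesting question is whether attention buys anything over this lookup.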
Lerc: OK, now we need 1541 flash attention.

I'm not sure what the Venn diagram of knowledge needed to understand that sentence looks like; it's probably more crowded in the intersection than one might think.
ghstinda: but can you make mac keyboards feel like a c64c?
classichasclass: If you're running this in VICE, run it under the SuperCPU with warp mode on.
brcmthrowaway: How does this compare to ELIZA?
daemonologist: You can chat with the model on the project page: https://indiepixel.de/meful/index.html

It (v3) mostly only says hello and bye, but I guess for 25k parameters you can't complain.