Discussion
mccoyb: It's fascinating to think about the space of problems which are amenable to RL scaling of these probability distributions.

Before, we didn't have a fast way to try problems (we had to rely on human cognition) - even if the techniques and workflows were known by someone. Now, we've baked these patterns into probability distributions - anyone can access them with the correct "summoning spell". Experts will naturally use these systems more productively, because they know how to coerce models into the correct conditional distributions which light up the right techniques.

One question this raises for me is how these models are going to keep up with the expanding boundary of science. If RL is required to get expert behavior into the models, what happens when experts start pushing the boundary faster? In 2030, how is Anthropic going to keep Claude "up-to-date" without either (a) continual learning with a fixed model (expanding context windows? seems hard) or (b) continual training (expensive)?

Crazy times.
Aerroon: A bit related: open weights models are basically time capsules. These models have a knowledge cut off point and essentially forever live in that time.
bitexploder: This is the most fundamental argument that they are not, directly, an intelligence. They are never storing new information on a meaningful timescale. However, if you viewed them on some really large macro time scale, where LLMs are now injecting information into the universe and then re-ingesting it, then maybe in some very philosophical way they are a /very/ slow oscillating intelligence right now. And as we narrow that gap (maybe with a totally new non-LLM paradigm), perhaps that is ultimately what gen AI becomes. Or some new insight that lets the models update themselves in some fundamental way without the insanely expensive training costs they have now.
dtj1123: Would you consider someone with anterograde amnesia not to be intelligent?
morleytj: A very good point. For anyone not familiar with anterograde amnesia, the classical case is patient H.M. (https://en.wikipedia.org/wiki/Henry_Molaison), whose condition was researched by Brenda Milner.
wang_li: Or you could have just said "they can't form new memories."
pdntspa: Sure, if you want to speak with the precision of a sledgehammer instead of a scalpel
saturnite: All that needed to be conveyed was that there are humans who cannot create new memories. That is enough to pose the philosophical question about these models having intelligence. Anything more is just adding an anecdote that isn't necessary.
morleytj: Why would adding more information and context be unnecessary? And why is that bad?
beepbooptheory: Sure, why can't both things be true? "Intelligence" is just what you call something and someone else knows what you mean. Why did AI discourse throw everyone back 100 years philosophically? It's like post-structuralism or Wittgenstein never happened.

It's so much less important or interesting to nail down some definition here (I would cite HN discourse over the past three years or so) than it is to recognize what it means to assign "intelligent" to something. What assumptions does it make? What power does it valorize or curb?

Each side of this debate does themselves a disservice by essentially just trying to be Aristotle way too late. "Intelligence" did not precede someone saying it of some phenomena; there is nothing to uncover or finalize here. The point is you have one side that really wants, for explicit and implicit reasons, to call this thing intelligent, even if it looks like a duck but doesn't quack like one, and vice versa on the other side.

Either way, we seem fundamentally incapable of being radical enough to reject AI on its own terms, or to be proper champions of it. It is just tribal hypedom clinging to totem signifiers.

Good luck though!
bitexploder: I think you can look at it dispassionately from a systems perspective. There is not /really/ a quantifiable threshold for capital-I Intelligence. But there is a pretty well agreed set of properties for biological intelligence. As humans, we have conveniently made those properties match things only we have. But you can still mechanistically separate out the various parts of our brain, what they do, and how they interact, and we actually have a pretty good understanding of that.

You can also then compare that mapping of the human brain to other biological brains and start to figure out the delta, and which of those things in the delta create something most people would consider intelligence. You can then do that same mapping to an LLM or any other AI construct that purports intelligence. It certainly will never be a biological intelligence in its current statistical-model form. But could it be an Intelligence? Maybe.

I don't think, if you are grounded, AI did anything to your philosophical mapping of the mind. In fact, it is pretty easy to do this mapping if you take some time and are honest. If you buy into the narratives constructed around the output of an LLM, then you are not, by definition, being very grounded.

The other thing is, human intelligence is the only real intelligence we know about. Intelligence is defined by thought and limited by our thought and language. It provides the upper bounds of what we can ever express in its current form. So, yes, we do have a tendency to stamp a narrative of human intelligence onto any other intelligence, but that is just surface level. We decompose it to the limits of our language and the categorization capabilities therein.
marcus_holmes: > The other thing is, human intelligence is the only real intelligence we know about.

There's a long and proud history of discounting animal intelligence, probably because if we actually thought animals were intelligent we'd want to stop eating them.

Octopodes are sentient. Cetaceans have well-developed language. Elephants grieve their dead. Anyone who has owned a dog knows that it has some intelligence and is capable of communicating with us. There's a ton of other intelligences that we know about.

> As humans, we have conveniently made those properties match things only we have.

I think this is the key point. Machine intelligence is not going to look like human intelligence, any more than animal intelligence does. We can't talk to the dolphins, not because they're not smart and don't have language, but because we can't work out their language. Though I'm not sure what we'd even say to them, because they live in a world we'll never understand, and vice versa. When Claude finally reaches consciousness, it's not going to look like a human consciousness, and actually talking to that consciousness is going to be difficult because we won't share a reality.

An LLM is a tool. I can just about stretch to it being an Artificial Intelligence, but I prefer to continue being specific and call it an LLM rather than an AI. It is not conscious or self-aware. It fakes self-awareness because as a tool the thing it does is have conversations with humans, and humans often ask it questions about itself. But I don't think anyone actually believes it is self-aware. Not least because the only time it thinks is when prompted.
bitexploder: This is an important point. We know what our DMN is and how we use language as a basis for thought to create concepts and complex ideas. However, language also bounds our thought. What about the dolphin? It is a fundamental philosophical problem whether advanced intelligence can exist without language. We have a pretty good notion that you need some sort of substrate (language) to create intelligence. And we know that mapping the internal state of a brain from inside of itself is incredibly hard, and the way our human brain evolved to do it is really fascinating but also full of hacks and mismatched mappings, based on what we know is actually going on.

Cognitive computer science explores this whole area of mapping language and the underlying semantic meaning. Ultimately, these intelligences will be bound by physics (unless some new physics, or understanding therein, happens). And classical intelligences are still bound by classical physics. So I am not sure we can't relate to these other intelligences. We may be limited to some translation layer that does not fully map, but can we still relate to some other consciousness? For that matter, consciousness is just another word that vaguely maps to a vast and extremely complex thing in the human brain, and each person has a different understanding of what that is. I don't really have any conclusions; you brought up interesting points. We should sit within this realm of inquiry with a lot of humility, IMO.
mihevc: Et tu, Knuthus?
wvlia5: This seems to be a bot comment. HN will lose its value if these bots are not purged.
mccoyb: Tune your bot detector, I'm a real person and I think about my comments before posting them.
wvlia5: Who was Rome's best Caesar?
bitexploder: That is a good area to explore. Their map of the past is fixed. They are frozen at some point in their psychological time. What has stopped working? Their hippocampus and medial temporal lobe. These are like the write-head that moves data from the hippocampus to the neocortex. Their "I" can no longer update itself. Their DMN is frozen in time, so if intelligence is purely the "I" telling a continuous coherent story about itself, that story can no longer grow. The difference is that, although they are fixed in time (a characteristic shared by any specific LLM model), they can still completely activate their task-positive network for problem solving, and if their previously stored information is adequate to solve the problem, they can. You could argue that is pretty similar to an LLM and what it does. So it is certainly a significant component of intelligence.

There is also the nature of the human brain: it is not just those systems of memory encoding, storage, and use of that in narratives. People with this type of amnesia can still learn physical skills, and that happens in a totally different area of the brain with no need for the hippocampus->neocortex consolidation loop. So, the intelligence is significantly diminished, but not entirely. Other parts of the brain are still able to update themselves in ways an LLM currently cannot. The human with amnesia also has a complex biological sensory input mapping that is still active, integrating and restructuring the brain. So, I think when you get into the nuances of the human in this state vs. an LLM, we can still say the human crosses some threshold for intelligence where the LLM does not, in this framework.

So, they have an "intelligence" localized to the present in terms of their TPN and memory formation. LLMs have this kind of "intelligence". But the human still has the capacity to rewire at least some of their brain in real time, even with amnesia.
supern0va: > But the human still has the capacity to rewire at least some of their brain in real time even with amnesia.

Sure, but just because LLMs don't have what we'd describe as human intelligence doesn't mean they don't have intelligence.

I think we're witnessing the creation and growth of a weird new type of intelligence right now.
ainiriand: Aren't LLMs supposed to just find the most probable word that follows next, like many people here have touted? How can this be explained under that premise? Is this way of problem solving 'thinking'?
dilap: That description is really only fair for base models†. Something like Opus 4.6 has all kinds of other training on top of that, which teaches it behaviors beyond "predict the most probable token," like problem-solving and being a good chatbot.

(†And even then it is kind of overly dismissive and underspecified. The "most probable word" is defined over some training data set. So imagine if you train on e.g. mathematicians solving problems... To do a good job at predicting [w/o overfitting], your model will in fact have to get good at thinking like a mathematician. In general, "to be able to predict what is likely to happen next" is probably one pretty good definition of intelligence.)
gpm: I'd disagree; the other training on top doesn't alter the fundamental nature of the model, which is that it predicts the probabilities of the next token (followed by a sampling step, which can roughly be described as picking the most probable one). It just changes the probability distribution that it is approximating.

To the extent that thinking is making a series of deductions from prior facts, it seems to me that thinking can be reduced to "pick the next most probable token from the correct probability distribution"...
dilap: The fundamental nature of the model is that it consumes tokens as input and produces token probabilities as output, but there's nothing inherently "predictive" about it -- that's just perspective hangover from the historical development of how LLMs were trained. It is, fundamentally, I think, a general-purpose thinking machine, operating over the inputs and outputs of tokens.

(With this perspective, I can feel my own brain subtly offering up a panoply of possible responses in a similar way. I can even turn up the temperature on my own brain, making it more likely to decide to say the less-obvious words in response, by having a drink or two.)

(Similarly, mimicry is in humans too a very good learning technique to get started -- kids learning to speak are little parrots, artists just starting out will often copy existing works, etc., before going on to develop their own style.)
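The "probabilities out, then sample, with temperature" picture discussed in the last few comments can be sketched in a few lines. This is a toy illustration, not how any real inference stack works: the four-word vocabulary and the logits are made up, and real models compute logits with a neural network rather than hard-coding them.

```python
import math
import random

def softmax(logits, temperature=1.0):
    """Turn raw logits into a probability distribution.
    Dividing by temperature before normalizing sharpens (T < 1)
    or flattens (T > 1) the distribution."""
    scaled = [l / temperature for l in logits]
    m = max(scaled)  # subtract max for numerical stability
    exps = [math.exp(s - m) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]

# Hypothetical next-token logits over a tiny made-up vocabulary.
vocab = ["the", "cat", "sat", "flamingo"]
logits = [2.0, 1.0, 0.5, -1.0]

# Greedy decoding: always take the single most probable token.
greedy = vocab[max(range(len(logits)), key=lambda i: logits[i])]

# Temperature changes how much probability mass the top token keeps.
p_low = softmax(logits, temperature=0.5)   # sharp: "the" dominates
p_high = softmax(logits, temperature=2.0)  # flat: rare words become plausible

# Sampled decoding: draw a token at random, weighted by the distribution.
sampled = random.choices(vocab, weights=p_high, k=1)[0]
```

At low temperature sampling approaches greedy decoding; at high temperature it behaves more like the "less-obvious words" effect described above. The post-training mentioned earlier (RLHF and similar) changes which distribution the logits encode, not this decoding loop.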
earthscienceman: Non sequitur: "perspective hangover" might be my favorite phrase I've ever read. So much of what we deal with is trying to correct the record on how we used to think about things. But the inertia that old ideas or modes have is monumental to overcome. If you just came up with that, kudos.