Discussion
Claude Struggles To Cope With ChatGPT Exodus
LaurensBER: I really enjoyed using Claude but the ever-changing limits and weird policies (limited to Claude Code, you can't run OpenClaw, etc.) made switching a very easy choice. OpenAI simply provides more value for the money at the moment.
prngl: It's funny how the false choice of American politics (Red vs Blue) also makes it into its consumerist corporatist life. That Anthropic's threadbare "limits" on government usage are seen as a heroic stand is a testament to just how far the goalposts on "ethical" deployment of AI have moved to the (fascist) right. As ever, politics precedes technology. We have Reagan's internet, we will have Trump's AI. God help us.
next_xibalba: Have you considered using Gemini? Google seems to be on a hot streak with their models, and, since they're playing from behind, I'd expect favorable pricing and terms. But, I don't know anyone who is using or talking about Gemini. All the chatter seems to be Anthropic vs. OpenAI.
goldenarm: Still 8x less downtime than GitHub: https://mrshu.github.io/github-statuses/
thereitgoes456: What a baffling comment. Aren’t you aware of why this exodus is happening? (It’s not related to “value for the money”!) What are your feelings on that part?
lilytweed: It is entirely okay to weigh the Department of War thing against other criteria when choosing a service.
t0mas88: > OpenAI, meanwhile, has been attempting to quell the backlash against its deal with the U.S. government, putting out a blog post claiming that “our tools will not be used to conduct domestic surveillance of U.S. persons,”

As a non-US person, that sounds far more concerning than no statement at all. Because if their tools weren't used for surveillance against Europeans they would have said so as a marketing message...
m-schuetz: That's weird, I switched away from ChatGPT because I mostly got superior results from Gemini and Claude.
anonyfox: give 5.4 a shot - it's strange but surprisingly good for once. speaking as a daily opus user.
igor47: I've tried. It's just not very good compared to either mentioned alternative.
anonyfox: because gemini, despite what the stats say, still produces garbage once the problem gets harder. it nails it in lab conditions, but on messy reality, creativity, or even code quality it's a far cry from opus or the latest gpt5.4, and always has been. it's pretty good inside GSuite because of the integrations, but standalone it's near worthless compared to even grok-code-fast, which doesn't think much at all (but damn it is fast). at this point google keeps throwing AI noodle pots against every wall in reach to see what sticks, which is more a kind of desperation that still works to boost wall street highscores, but not exactly a streak or a breakthrough. no one serious talks Gemini because it's still not worth considering for real work outside shiny presentations and artificial benchmarks.
gedy: From the prior thread, are there even "limits"? I thought the Anthropic statements were pointed out to be mostly toothless PR, e.g. "we don't agree", etc.
indigodaddy: Used codex cli (5.4) for the first time (had never used codex or gpt for coding before - was using Opus 4.5 for everything), and it seems quite good. One thing I like is it's very focused on tests. Like it will just start setting up unit tests for specs without you asking (whereas Opus would never do that unless you asked) -- I like that and think it's generally good. One thing I don't like about GPT though is it pauses too much throughout tasks: even where the immediate plan and the broader plan are all extremely well defined already in agents.md, it still pauses between tasks saying "the next logical task is X", and I say yeah go ahead, instead of it just proceeding to the next task, which I'd rather it do. I suppose that is a preference that should be put in some document? (agents.md?)
adrianN: With n-eyes agreements it’s quite meaningless anyway. Whatever passport you have, somebody spies on you and sells the information to your government.
landl0rd: Tried using 5.4 xhigh/codex yesterday with very narrow direction to write bazel rules for something. This is a pretty boilerplate-y task with specific requirements. All it had to do was produce a normal rule set s.t. one could write declarative statements to use them just like any other language integration. It gave back a dumpster fire, just shoehorning specific imperative build scripts into starlark. Asked opus 4.6 and got a normal, sane ruleset. 5.4 seems terrible at anything that's even somewhat out-of-distribution.
ElFitz: I got it to build a stereoscopic Metal raytracing renderer of a tesseract for the Vision Pro in less than half a day. Right now it’s cutting its teeth on adding passthrough support. Mileage may vary.
partiallypro: If anyone thinks Anthropic or OpenAI are the "good guys," they've already lost the plot. If you look at additional reporting on the topic, not just the Anthropic PR spin, the disagreements were much more nuanced than Anthropic portrayed them. They aren't exactly a reliable narrator on the topic either. In fact it seems like Amodei fumbled the deal and crashed out a bit. He's already walked back his internal memo, and is reportedly still seeking a deal with the Pentagon. I don't trust either CEO. I use their products, but if you're even leaning 51-49 on who is "less evil," I think you're giving too much slack.
baq: There’s a surge of demand for sure, but I’m not at all convinced that it’s at OpenAI’s expense. My bet is the non-SWE folks caught wind that these things got seriously good at a lot of boring office work, i.e. we’re seeing diffusion of AI into the wider economy.
cmiles8: All this demonstrates how non-sticky all this tech really is. When your product is basically just an API call it’s trivial to just swap you out for someone else. As such it’s unclear what the prize at the end of the present race to the bottom is. We swapped OpenAI out for Claude and it required updating about 15 lines of code. All these guys are just commodity to us.
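The swap really can be that small when both vendors expose an OpenAI-compatible chat-completions endpoint. A minimal sketch of what those ~15 lines look like; the base URLs, model names, and env-var names below are illustrative placeholders, not the vendors' official values:

```python
# Sketch: the only provider-specific parts are a base URL, a model name,
# and a credential. Everything else in the call path stays identical.
# All URLs/model names here are illustrative, not official.
import os

PROVIDERS = {
    "openai": {
        "base_url": "https://api.openai.example/v1",
        "model": "gpt-example",
        "key_env": "OPENAI_API_KEY",
    },
    "anthropic": {
        "base_url": "https://api.anthropic.example/v1",
        "model": "claude-example",
        "key_env": "ANTHROPIC_API_KEY",
    },
}

def build_request(provider: str, prompt: str) -> dict:
    """Return the URL, headers, and JSON body for a chat-completion call."""
    cfg = PROVIDERS[provider]
    return {
        "url": f"{cfg['base_url']}/chat/completions",
        "headers": {"Authorization": f"Bearer {os.environ.get(cfg['key_env'], '')}"},
        "json": {
            "model": cfg["model"],
            "messages": [{"role": "user", "content": prompt}],
        },
    }

# Swapping vendors is one string change at the call site:
req = build_request("anthropic", "Summarize this ticket.")
```

Under this (common) compatibility assumption, the "moat" at the API layer is just a config entry.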
causal: It helps a lot that Claude is just better. Codex isn't BAD, and in some narrow technical ways might even be more capable, but I find Claude to be hands-down the best collaborator of all the AI models and it has never been close.
xscott: Can you expand on that? I've been wanting to try Claude for a while, but their payment processing wouldn't take any of my credit cards (they work everywhere else, so it's not the cards). I've heard I can work around this by installing their mobile app or something, but it was extra hurdles, so I didn't try very hard. And I've been absolutely amazed with Codex. I started using it with ChatGPT 5.3-Codex, and it was so much better than online ChatGPT 5.2, even sticking to single-page apps, which both can do. I don't have any way to measure the "smarts" of the new 5.4, but it seems similar. Anyways, I'll try to get Claude running if it's better in some significant way. I'm happy enough with the Codex GUI on MacOS, but that's just one of several things that could be different between them.
ValentineC: > I've been wanting to try Claude for a while, but their payment processing wouldn't take any of my credit cards (they work everywhere else, so it's not the cards). I've heard I can work around this by installing their mobile app or something, but it was extra hurdles, so I didn't try very hard.

Not Claude Code specifically, but you can try the Claude Opus and Sonnet 4.6 models for free using Google Antigravity.
scuff3d: I tend to use LLMs more for research than actual coding, so I ended up going with GPT over Claude because its chat interface just seems to work better for me. It balances out Claude being slightly better at software tasks.
thereitgoes456: Agreed, but the comment should mention it. Nobody is talking about value for money right now. I didn't mean to advocate for Anthropic, apologies.
LaurensBER: Whatever Anthropic might or might not do with the Department of War interests me in proportion to how much I can influence it. Rounded, speaking as a European citizen, that appears to be exactly 0 to me.
fluidcruft: I've generally thought that, but lately I've been finding that the main difference is Claude wants a lot more attention than codex (I only use the cli for either). codex isn't great at guessing what you want, but once you get used to its conversation style it's pretty good at just finishing things quietly.
kakacik: I wouldn't give them any free pass and just give up; it's highly amoral and inhuman behavior. A modern form of racism, but based on passport. You have this one? You are subhuman, treated as such, and you have very limited rights on our soil; we can do nasty things to you without any court, defense, or hope for fairness. You have that one? Please welcome back. Sociopathic behavior. Then don't wonder why most of the world is again starting to hate the US with passion. I don't mean countries where you already killed hundreds of thousands of civilians, I mean the whole world. There isn't a single country out there currently even OK with the US; that's more than 95% of mankind. Why the fuck do you guys allow this? It's not even the current gov, rather a long-term US tradition going back at least to 9/11.
causal: Codex is not bad, I think it is still useful. But I find that it takes things far too literally, and is generally less collaborative. It is a bit like working with a robot that makes no effort to understand why a user is asking for something. Claude, IMO, is much better at empathizing with me as a user: it asks better questions, tries harder to understand WHY I'm trying to do something, and is more likely to tell me if there's a better way. Both have plenty of flaws. Codex might be better if you want to set it loose on a well-defined problem and let it churn overnight. But if you want a back-and-forth collaboration, I find Claude far better.
kromokromo: I’ve been juggling between ChatGPT, Claude and Gemini for the last couple of years, but ChatGPT has always been my main driver. Recently did the full transition to Claude; the model is great, but what I really love is how they seem to have landed on a clear path for their GUI/ecosystem. The cowork feature fits my workflows really well, and connecting enterprise apps, skills and plugins works really well. Haven’t been this excited about AI since GPT 4o launched.
spwa4: It's amazing how bad FANG executives are at even knowing what a normal moral thought would be for average people... Plus, you know, you'd think they'd ask their cleaner or baker or something. Or hire someone.
clcaev: Executives certainly are capable of following the moral/ethical path. Around 2003, a Yale Psychology PhD candidate asked me to write a web-based survey instrument with various questions, some on complex but straightforward business questions (the controls) and others with moral or ethical aspects. Senior executives participated and they scored similarly to rank & file, often completing the entire survey much faster. What they didn't know: we were tracking how long they spent on each question. Questions with moral and/or ethical concerns took the executives relatively longer than the rank & file.
pinkmuffinere: Ya, agreed. This makes me think that (long term) the AI race won’t be won on the merits of individual models, but on pricing — I think Google has some strong advantages here because they know how to provide cheap compute, and they already have a ton of engineers doing similar things, so it’s a marginal cost for them instead of having to hire and maintain whole devoted teams.
112233: Interesting to hear! I've had the completely opposite experience, with Claude having 5 minutes of peerless lucidity, followed by panicking, existential crisis, attempts to sabotage its own tests and code, psyops targeted at making the user doubt their computer, OS, memory... Plus it prompts every 15 seconds, with the alternative being YOLO. Meanwhile codex is... boring. It keeps chugging on, asking for "please proceed" once in a while. No drama. Which is in complete contrast with ChatGPT the chatbot, which is completely unusable, arrogant, unhelpful, and confrontational. How they made both from the same loaf I dunno.
AlotOfReading: I wish I could get Claude to stop every 15 seconds. There's a persistent bug in the state machine that causes it to miss esc/stop/ctrl-f and continue spending tokens when there's a long running background task or subagent. There's a lot of wasted tokens when it runs for 10, 15, 20 minutes and I can't stop it from running down the wrong rabbit hole.
HotHotLava: It's an ironic situation; logically the moat should be the models, costing hundreds of millions in investment to train and operate, so it would make sense if we saw different providers focusing in different directions. But right now we have 3-5 top contenders that are so evenly matched that the de facto sticking point is mostly the harness, i.e. the collection of proven plugins/commands/tools/agent features that are tuned to the user's personal workflow.
wonnage: Cost is never a good moat.
dd82: the companies migrating off vmware due to broadcom shittiness would disagree with you
shimman: It's also meaningless because we know governments get around these "agreements" by buying data from third party companies that bought the data from OpenAI. The only way to stop this is to legislate it out of existence.
oytis: > someone else

We have basically 4 companies in the world one can seriously consider, and they all seem to heavily subsidise usage, so under normal market conditions not all of them are going to survive.
groestl: > are so evenly matched

It's because the real value of the models is in what we (humanity) fed them, and all of them have eaten the same thing for free.
nradov: That's why the frontier LLM companies are now spending a lot more to license exclusive proprietary training data from private sources in order to gain a quality edge in certain business domains.
Lerc: It's a fairly ridiculous conclusion to draw that these people are leaving ChatGPT because of their stance. I doubt OpenAI's actions play much role in the influx at all. A couple of weeks ago, to huge numbers of people, ChatGPT was AI. The biggest public perception shift to come from the DoD/DoW spat will be how many people now know that Claude exists at all; being seen as unreasonably punished by the government for taking a principled stance only benefits them. People have been made aware of a product, made aware that it's good enough that the government wants to use it. They have then been shown an archetypal underdog-against-the-government narrative. That makes almost a perfect storm for gaining customers. When they actually use the thing and discover that it actually is good, they will stay, and they will tell their friends. At this rate they should be sending Hegseth a thank-you card.
stavarotti: I've largely found codex and claude code to be about the same; however, codex tends to "think" harder and for longer, which, depending on the task, yields better results without too much steering. On an unrelated note, UI is such a personal preference that it's impossible, beyond core pillars that have been studied for decades, to say one is better than the other. That being said, I like OpenAI's design system much better than Anthropic's. OpenAI products (cli and chat ui) "feel" nice and consumer focused, whereas Anthropic's products feel utilitarian and "designed for business".
lavezzi: > "We have these two red lines... Not allowing Anthropic's AI to perform mass surveillance of Americans, and prohibiting its AI from powering fully-autonomous weapons..."Anthropic literally said the same, but seem to be getting positive PR.https://www.cbsnews.com/news/ai-executive-dario-amodei-on-th...
gruez: It's not "literally the same". https://www.lesswrong.com/posts/FSGfzDLFdFtRDADF4/openai-s-s...
ajross: That sounds like spin to me. If there were a clear "quality edge" in "certain business domains" stemming from "exclusive proprietary data", someone would have been exploiting it already using meat computers. But no, businesses are dumb. They always have been. Existing businesses get disrupted by new ideas and new technology all the time. This very site is a temple to disruption! Proprietary advantage is, 99.999% of the time, just structural advantage. You can't compete with Procter & Gamble because they already built their brands and factories and supply chains and you'd have to do all that from scratch while selling cheaper products as upstart value options. And there's not enough money in consumer junk to make that worth it. But if you did have funding and wanted to beat them on first principles? Would you really start by training an LLM on what they're already doing? No, you'd throw money at a bunch of hackers from YC. Duh.
rblatz: AI consumes entire data centers of compute. You aren’t tucking a few racks into a corner of a data center, you are building entirely new ones. There will be whole devoted teams.
pinkmuffinere: But Google already builds data centers. Will there really be devoted AI-datacenter teams? Or will they just expand the normal datacenter teams, and ask them to use GPUs/TPUs instead of CPUs?
pgt: No one left ChatGPT over that deal: they decided to try Anthropic's Claude because the Department of War gave them free marketing.
gdilla: Just having strict control over context management in a session is a nice differentiator. Shared tooling between desktop and cli is nice too. They've differentiated enough.
michele_f: The DoW or the CEO of Anthropic and his telenovela?
ghywertelling: Let me explain a possible moat with an example. I have curated my youtube recommendations over the years. It knows my likes and dislikes very well. It knows a lot about me. The same moat exists in interactions with Claude. Claude remembers so many of my preferences. It knows that I work in Python and Pandas and starts writing code for that combination. It knows what type of person I am and what kind of toys I want my nephews and nieces to play with. These "facts" about the person are the moat now. Stackoverflow was a repository of "facts" about what worked and what didn't. Those facts, or user chat sessions, are now Anthropic's moat.
remus: > As such it’s unclear what the prize at the end of the present race to the bottom is.

It's a market worth many billions so the prize is a slice of that market. Perhaps it is just a commodity, but you can build a big company if you can take a big slice of that commodity e.g. by building a good product (claude code) on top of your commodity model.
dghlsakjg: “Hey Claude, write out a markdown file of all of my preferences so any AI agent can pick up where you left off”
politelemon: In fact, here, I'll do it myself.
furyofantares: I was paying both $200+/mo and I went down to only paying Anthropic $200/mo. My experience has, for a few months, been that OpenAI's models are consistently quite noticeably better for me, and so my Codex CLI usage had been probably 5x as much as my Claude Code usage. So it's a major bummer to have cancelled, but I don't have it in me to keep giving them money. I'd love to get off Anthropic too; despite the admirable stance they took, the whole deal made me extra uncomfortable that they were ever a defense contractor (war contractor?) to begin with.
christina97: We are in this fascinating stage where tokens are nominally entirely fungible at a roughly equivalent intelligence level, yet at the same time there is huge market segmentation and differentiation in the non-tangible aspects of those tokens.
causal: > psyops targeted at making user doubt their computer

IDEK what that means, specific examples?
fragmede: There's this bug in Claude Desktop where a response will disappear on you. When you're busy doing many things at once, you'll go back to the chat, and you'll be all "wait, didn't I already do this?" It's maddening and makes you question your own sanity.
linkregister: Frontier labs are paying the same constellation of firms offering proprietary data and access to experts in their fields to train LLMs. They are neck-and-neck only because they are participating in the arms race. The only other way to keep up is mass-distillation, which could prove to be fragile (so far it seems to be sustainable).
ajross: Meh. I think there's basically no benefit shown so far to careful curation. That's where we've been in machine learning for three decades, after all. Also recognize that the Great Leap Forward of LLMs was when they got big enough to abandon that strategy and just slurp in the Library of All The Junk. I think one needs to at least recognize the possibility that... there just isn't any more data for training. We've done it all. The models we have today have already distilled all of the output of human cleverness throughout history. If there's more data to be had, we need to make it the hard way.
mjburgess: You would also need to control for the degree to which people had a stake in the outcome (i.e., virtue signalling). Since executives have to make decisions where choosing the moral option may impose an economic (or operational) cost, this requires thinking through the actual choice. Morality for the "rank and file" is just a signalling issue: there's nothing to think through, so the answer they are "supposed to choose" is the one they choose, at no cost to them.
clcaev: I hope the addendum helps clarify.
cmiles8: But those holding said proprietary data have figured out they’re holding the cards now and have gotten a lot smarter recently. Companies are being very careful about what gets used for inference vs what they allow to be used for training. I don’t see the core models getting dramatically better from where they are now. We’ve clearly hit a plateau.
SilverElfin: Didn’t Anthropic hire the infrastructure head from stripe and give him a CTO title? I would’ve thought that would help bring stability but if anything, things have become worse.
kivle: I switched from ChatGPT Plus to Gemini Pro instead of Claude, since I'm a hobbyist and appreciate having more than just text chat and coding assist with my subscription (image gen, video gen, etc. are all nice to have). At first I found the Gemini Code Assist to be absolutely terrible, bordering on unusable. It would mess up parameter order for function calls in simple 200-line Python. But then I found out about the "model router", which is a layer on top that dynamically routes requests between the flash and pro models. Disabling it and always using the pro model did wonders for my results. There are however some pretty aggressive rate limits that reset every 24 hours. For me it's okay though. As a hobbyist I only use it about 2-3 hours per day at most anyway.
10xDev: With Gemini Pro on Antigravity you get a quota reset every 5 hours and access to Claude Opus 4.6. That's what I use at home and don't need anything else.
hparadiz: Really? I mean, I regularly see as I'm coding how much better it could be simply by running obvious prompts for me. When I use planning mode and then code, the success rate is much higher. When I ask it to work on specific isolated chunks of code with clear success/failure modes, the success rate is again much higher. Now imagine a world where it recognizes that from my simple throwaway, non-specific prompt. If it was able to fire off 20 different prompts in quick succession it could easily cut my time spent in front of the screen by a third. The patterns are obvious but they don't do that right now because it's a lot of compute. We'll look back at this time of progress bars showing context space the way we look at the Turbo button. Because the truth is, getting to the baseline I'm talking about is a finite amount of compute at a certain point.
grallm: Did you leave OpenAI because of the current backlash? If so, is Google even better?
indigodaddy: Didn't they tighten that quota WAY down though since everyone caught on to the AG/Opus game?
dghlsakjg: OpenRouter shows that commodity api providers have figured out how to do this unsubsidized. The training runs aren’t priced in, but the cost of inference is clearly pretty cheap.
Max-Ganz-II: To stop this, I today put most of my Amazon Redshift research web-site behind a basic-auth username/password wall. It all remains free, but you need to email me for a username and password. If I put in time and effort to make content and OpenAI et al copy it and sell it through their LLM such that no one comes to me any more, then plainly it makes no sense for me to create that content; and then it would not exist for OpenAI to take, or for anyone else. We all lose. It seems parasitic.
bestouff: An AI is more likely than me to take the time to send you an email for requesting access - I'm too lazy.
ghywertelling: You are missing the correlations that Claude can derive across all these user sessions across all users. In Google Analytics, when I visit a page and navigate around till I find what I was looking for (or don't find it), that session data is important for website owners to optimize. Even in Google search results, when I click on the 6th link and not the first, it sends a signal for how to rearrange the results next time, or even personalize them. That same paradigm will be applicable here. This is network effects, personalization, and ranking coming together beautifully. Once Anthropic builds that moat, it will be irreplaceable. If you doubt it, ask all users to jump from Whatsapp to Telegram or Signal and see how difficult it is. When Anthropic gives you the best answer without asking too much, the experience is 100x better.
dghlsakjg: The underlying technology is a thin layer of queryable knowledge/"memories" between you and the llm, which in turn gets added to the context of your message to the llm. Likely RAG. It can be as simple as an agents.md that you give it permission to modify as needed. I really don't think that they are correlating your "memories" with other people's conversations. There is no way for the LLM to know what is or isn't appropriate to share between sessions, at the moment. That functionality may exist in the future, but if you just export your preferences, it still works. The moat - at this point in time - is really not as deep and wide as you are making it out to be. What you are imagining doesn't exist yet. Indexing prior conversations is trivially easy at this point; you can do it locally using an api client right this moment. Besides all that, you would be shocked at how quickly a new service can reconstruct your preferences. I started a new YouTube account, and it was basically the same feed within a few days. In any case, my feeling is that we should have learned by now not to keep our data in someone else's walled garden.
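The thin-layer idea above can be sketched in a few lines. This is a toy illustration, not how any vendor actually implements memory: the stored facts are invented, and the keyword-overlap retrieval stands in for the embedding search a real RAG setup would use:

```python
# Toy "memory layer": stored preference facts are retrieved by crude
# keyword overlap and prepended to the prompt before it reaches the model.
# Real systems would use embeddings; everything here is illustrative.

MEMORIES = [
    "User prefers Python with pandas for data work.",
    "User wants terse answers without pleasantries.",
    "User is shopping for toys for nephews and nieces.",
]

def retrieve(query: str, memories: list[str], k: int = 2) -> list[str]:
    """Rank memories by the number of lowercase words shared with the query."""
    q_words = set(query.lower().split())
    scored = sorted(
        memories,
        key=lambda m: len(q_words & set(m.lower().split())),
        reverse=True,
    )
    return scored[:k]

def augment_prompt(query: str) -> str:
    """Prepend the most relevant remembered facts to the user's message."""
    relevant = retrieve(query, MEMORIES)
    context = "\n".join(f"- {m}" for m in relevant)
    return f"Known user preferences:\n{context}\n\nUser: {query}"

print(augment_prompt("Write pandas code to merge two tables"))
```

Since the "memories" are just text, exporting them to a markdown file and handing them to a different provider is exactly as easy as this sketch suggests.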
ryandvm: Ironic indeed. The Great Replacers of white collar jobs are finding themselves easily replaceable. Delicious.
spiffytech: Unfortunately this is why Anthropic is so aggressive about preventing Claude subscriptions from being used with other tools.
latchkey: According to this article, they can't even service the amount of paying customers that they have.
causal: You're totally allowed to use Claude for OpenClaw and you're totally able to use Claude Code with non-Anthropic models. You must be referring to the fact that you have to use an API key and cannot use the auth intended for Claude-only products, which AFAIK is the same at every AI company (with Google destroying whole Google accounts for offenders most recently).
LaurensBER: OpenAI and GitHub Copilot do explicitly allow this, as do many of the newer/Chinese providers such as Synthetic, Z.ai, etc. Anthropic is the outlier here; obviously they can limit their subscriptions as they want, but it's a major disadvantage compared to their competitors.
causal: > OpenAI ... explicitly allow this

Explicit means it's stated in OpenAI docs somewhere, but I can't find it. Link?
LaurensBER: https://x.com/thsottiaux/status/2009742187484065881

There's probably a better source somewhere but this is the one I had at hand.
Balgair: > All these guys are just commodity to us.

Just want to note something there: Okay, premise that AI really is 'intelligent' up to the point of business decisions. So this all then implies that 'intelligence' is a commodity too? Like, I'm trying to drive at the idea that yours, mine, all of our 'intelligence' is now no longer a trait that I hold, but a thing to be used, at least as far as the economy is concerned. We did this with memory and muscles previously. We invented writing, and so those with really good memories became just like everyone else. Then we did it with muscles and the industrial revolution, and so really strong or enduring people became just like everyone else. Yes, many exceptions here, but they mostly prove the rule, I think. Now it seems that with really smart people we've made AI, and so they're going to be like everyone else?
coldtea: > So, this all then implies that 'intelligence' is then a commodity too? Like, I'm trying to drive at that yours, mine, all of our 'intelligence' is now no longer a trait that I hold, but a thing to be used, at least as far as the economy is concerned.

This is obviously already the case with the intelligence level required to produce blog posts and article slop, generate coding-agent-quality code, do mid-level translations, and things like that...
selfhoster11: Please add Internet Archive's bot to your auto-allows, at least. Their bot is presumably well behaved, and for public benefit.
credit_guy: The moat is compute. In my case, I always use Opus 4.6 in my work, but quite often I get a 504 error, and that's quite annoying. I get errors like that with Gemini too. I can't estimate if I'd get a similar number of errors with ChatGPT, since I use it very infrequently. But imagine that at some point one of the big 3 (OpenAI, Anthropic, Google) gets very high availability, while the others have very poor availability. Then people would switch to them, even if their models were a bit worse. Now, OpenAI has been building like crazy, and contracting for future builds like crazy too. Google has very deep pockets, so they'll probably have enough compute to stay in the game. But I fear that Anthropic will not be able to match OpenAI and Google in terms of datacenter build, so it's only a matter of time (and not a lot of time) until they'll be in a pretty tight spot.
replwoacause: The limits are what did it for me. They kept boasting about Opus performance and improvements, practically begging me to try it out, and when I did, it totally obliterated my usage. I'm sure it's good, but I stick to Sonnet because I've been burned badly. Never had that problem with ChatGPT, but it turns out they're just unprincipled and evil, which is a shame.
bonoboTP: Ok, maybe pretraining is now complete and solved. Next up: post-training, reinforcement learning, engineering RL environments for realistic problem solving, recording data online during use, then offline simulation of how it could have gone better and faster, distilling that into the next model etc. etc. There's still decades worth of progress to be made this way.
k32k: " There's still decades worth of progress to be made this way."That's not true. Moreover the progress can slow to a crawl where it's barely noticeable. And in that world the humans continues to stay ahead - that's the magic of humans. To be aware of surroundings and adapt sufficiently whilst taking advantage of tools and leveraging them.
k32k: I think this is where we are at, too. But if you say stuff like this on here you get downvoted. Why?
k32k: Many people I know initially used ChatGPT for a while. Then after a while they went to Gemini. Again stuck with it for a while. And now are dabbling with Claude. Yep, there really is no switching cost it seems. People generally want something from a model and then leave. I think people are sub-consciously forming relationships with tech firms such that they do not care about them, and it's all about what the user themselves gets. Generally there is no attachment. There are some examples of psychotic stuff but that's thankfully the exception, not the norm. That's why Apple cares deeply about its brand - it doesn't want to fall into that group of firms.
maest: I think a better approach would be to have a login form and just say "the password is 1234" or whatever. Virtually no scraper has logic to handle that sort of situation, but it's trivial for humans. Way easier than an LLM.
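A minimal sketch of the gate maest describes, in plain Python (the names, the fixed password, and the framework-free form handling are all assumptions, not anything from the thread): the page states the password in plain text, and the server only checks for an exact match, so a human passes in seconds while a crawler that never reads hints or submits forms stays out.

```python
# Hypothetical "the password is 1234" gate: trivial for humans, a wall for
# naive scrapers that fetch pages but never fill in forms.

GATE_PASSWORD = "1234"  # assumed fixed secret, stated openly on the page

# The login page tells the visitor the answer outright.
LOGIN_PAGE = f"""
<form method="post" action="/login">
  <p>The password is {GATE_PASSWORD}. Type it below to continue.</p>
  <input name="password" type="text">
  <button type="submit">Enter</button>
</form>
"""

def gate_allows(form_data: dict) -> bool:
    """Return True only when the submitted password matches exactly.

    A scraper that merely downloads HTML never POSTs the form, so it is
    filtered out; any human who reads the hint gets through.
    """
    return form_data.get("password", "").strip() == GATE_PASSWORD
```

This is only a sketch of the idea; as the reply below notes, tooling that does parse page context (e.g. anything LLM-backed) can defeat it.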
EQmWgw87pw: Well as of right now, mathematically and scientifically, the way an LLM works has nothing to do with how the human brain works.
coldtea: The way this thing "looks like a duck, swims like a duck, and quacks like a duck" has nothing to do with the way a real duck "looks like a duck, swims like a duck, and quacks like a duck". Who cares, as long as the end results are close (or close enough for the uses they are put to)? Besides, "has nothing to do with how the human brain works" is an overstatement. "The term “predictive brain” depicts one of the most relevant concepts in cognitive neuroscience which emphasizes the importance of “looking into the future”, namely prediction, preparation, anticipation, prospection or expectations in various cognitive domains. Analogously, it has been suggested that predictive processing represents one of the fundamental principles of neural computations and that errors of prediction may be crucial for driving neural and cognitive processes as well as behavior." https://pmc.ncbi.nlm.nih.gov/articles/PMC2904053/ https://maxplanckneuroscience.org/our-brain-is-a-prediction-...
bmitc: The human doesn't just predict. It predicts based upon simulations that it runs. These LLMs do not work like this.
_aavaa_: So? Does a submarine swim?
cedws: Codex has been feeling a bit faster recently, not sure if placebo.
alphabettsy: They claim it’s faster and it seems to be so for me.
fluidcruft: The difference is that Anthropic actually dotted the i's and crossed the t's, whereas OpenAI fell for the weasel words and is now desperately trying to renegotiate.
beachy: OpenAI didn't fall for anything, they knew exactly what they were signing and went ahead anyway, then started gaslighting people about what they had signed. For a lot of people (me included - to non-US citizens all AI companies are as dangerous as each other) the lack of integrity and the gaslighting is what has soured them on OpenAI.
anon373839: But the end results aren’t actually close. That is why frontier LLMs don’t know you need to drive your car to the car wash (until they are inevitably fine-tuned on this specific failure mode). I don’t think there is much true generalization happening with these models - more a game of whack-a-mole all the way down.
Grimblewald: I left the OpenAI platform long before this, because I expected things like this. A few people called me alarmist but are now also jumping ship because of this. OpenAI has zero moral or ethical substance, and people _do_ care about that. I'm extreme enough that joining OpenAI after a certain date counts against you and your CV, while leaving after a certain date speaks volumes in your favour. People are the sum of their actions, not their words, and siding with / continuing to use OpenAI speaks volumes about who you are.
simulator5g: Not true, even Windows Defender is capable of extracting "the password is 1234" from context like emails or webpages.
Grimblewald: My experience exactly. The more "real" the problems become, the more other models become unsuitable compared to Claude, with the sole exceptions being DeepSeek/Kimi, which, while strictly speaking not better w.r.t. metrics and basic tasks, are more interesting and handle odd, totally out-of-domain stuff better than the US models. An example: code I wrote for a hypercomplex, sedenion-based artificial neural network broke Claude so badly it started saying it is ChatGPT and can't evaluate/run code. I had a similar experience with all US models, which are characterized by being extremely brittle at the fringes, though Claude least among them. Meanwhile, Chinese models are less capable for cookie-cutter stuff but keep swinging when things get really weird and unusual. It's like US models optimize for the lowest minima achievable, and God help you if the distribution changes. Chinese models, on the other hand, seem to optimize for the flattest minima, giving poorer quality across the board but far more robust behaviour.
ghywertelling: > Besides all that, you will be shocked at how quickly a new service can reconstruct your preferences. I started a new YouTube account, and it was basically the same feed within a few days. Because your location data, Wi-Fi name, etc. home in on the fact that this is the same person as before. You are actually supporting my point rather than denying it.
Timon3: "Rank and file" employees choosing to prioritize morality very, very frequently pay real costs for doing so - with a much larger personal impact than executives feel.
mjburgess: Only in very rare circumstances where the obvious answer and their procedural work don't align. When making an operational decision that affects the direction of the business, morality is almost always a concern -- even at the level of "do our customers benefit from this vs. do we?" etc.
Balgair: A pneumatic piston doesn't operate at all like a bicep, nor does an accounting book operate at all like a hippocampus. But both have taken enough of the load off those tissues that you'd be crazy to use the biological specimen for 99% of the commercial applications.
EQmWgw87pw: A bicep and a piston both push and pull things, but an AI cannot do what a smart brain can, so I don't think being smart will stop being an advantage. I mean, someone has to prompt the AI, after all. The mental ability to understand and direct these tools will be more important, if anything.