Discussion
"I Can't Do That, Dave" — No Agent Yet Ever
dvfjsdhgfv: Actually, once I was very surprised when testing one of the recent non-thinking Qwen models: it said "I'm sorry, this project is too complex, I can't do that." I was very impressed by this answer. So far, it is the only model that has ever reacted to the task this way. The rest agreed to proceed and failed.
QuadmasterXLII: What would be great is if they tried for a while, and if they didn't succeed, explicitly reported that they had failed and then presented their best effort.
lukan: There was a time when ChatGPT refused to do anything bigger than a few lines of code for me, because it was "too complex". But step by step it did it anyway. (No idea about the current state, since I switched to Claude.)
speedgoose: For coding, perhaps. For general-purpose usage, current models know how and when to refuse: politics, sexual taboos, drugs, … Perhaps we should also train them to refuse to develop in [insert your most hated stack here].
rdiddly: Boy, that was fragmented. What should I have done for years leading up to today to prepare for reading this? Gaming? Doomscrolling social media? Chugging Mountain Dew? Reading poetry?
esafak: Idiocracy is reality: people can't even form paragraph-length thoughts any more. I just noped out.
Frost1x: It's interesting, because I'm seeing some emerging conversations where users tend to prefer general agents that share their preferential bias over more constrained or specially built agents. There are certain arbitrary goal criteria users either have forced on them or want to force on the agent, and the general-purpose agents tend to do well at this because they just trudge along and do whatever they're told.

Meanwhile, more specialized agents that try to add or enforce constraints around a problem space where certain aspects are well established don't sit well with a lot of users: "No, you and general knowledge don't know best, I know best… do this."

I can see the use case for both, but I'm seeing a whole lot more willingness to seek confirmation bias: essentially, to automate away parts of jobs and tasks people already do, but in the personalized or opinionated way they've established, unwilling to explore alternative options.

So the general-purpose agent structures that just kick off whatever they can tend to fare best in terms of positive feedback from agent users, while to some degree ignoring many of the potential benefits of agents that have general knowledge and are bounded by generally established limits. It's basically "please do parts of my job for me, but only the way I want them done."

People aren't ready to be wrong or to change; they just want to automate parts of their processes away. So I'm not sure "no" is going to sit well with a lot of people.
bronlund: «Think ultra deep and analyze this article. Make a detailed list of the top five alternatives as to what he is talking about.»
mkl95: It's uniquely LinkedIn. OP's prose is difficult to process even if you've abused your brain for years with LinkedIn content, though. In a more merciful timeline, only people like James Ellroy or Cormac McCarthy would ever attempt to write like that.
hmokiguess: Try LinkedIn
wiseowise: Fr bruh rizzing hard to this og jester gooning.
nilirl: Huh? The point of the article is that we should use git to store an LLM's output as it works? How do any of the quotes and citations used coherently form that argument? What is this writing style? Why does it feel like it doesn't want me to understand what the heck it's saying?
woeirua: Yeah, are these poems? I feel like it's just more AI slop.
blharr: It is
throwway262515: Qwen, is that you? My experience with it is that it tends to produce such three-word sentences when asked to write an article.
wolf4earth: The point of the argument is that meaning emerges in conversation. A session between a human and an AI is a conversation. Current AI storage paradigms offer lateral memory across the time axis. A git branch is longitudinal memory across the time axis. Persist type-checked decision trees within it. Your git history just became a tamper-proof, reproducible O(1) decision tree. Execution becomes a tree walk. It works. And it's not production-ready yet.
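A minimal sketch of what that last comment seems to describe — each decision persisted as its own commit, then replayed by walking the log in order. The repository layout, file name, and record shape below are all hypothetical illustrations, not anything from the article:

```python
import json, os, subprocess, tempfile

def git(repo, *args):
    """Run a git command inside the given repo and return its stdout."""
    return subprocess.run(["git", "-C", repo, *args],
                          check=True, capture_output=True, text=True).stdout

# Throwaway repository standing in for the "longitudinal memory" branch.
repo = tempfile.mkdtemp()
git(repo, "init", "-q")
git(repo, "config", "user.email", "agent@example.com")
git(repo, "config", "user.name", "agent")

# Hypothetical decision records; each one becomes a commit, so the
# history is an append-only, tamper-evident log of the session.
decisions = [
    {"step": 1, "choice": "use-sqlite", "reason": "single-node workload"},
    {"step": 2, "choice": "add-index", "reason": "slow lookup on user_id"},
]
path = os.path.join(repo, "decision.json")
for d in decisions:
    with open(path, "w") as f:
        json.dump(d, f)
    git(repo, "add", "decision.json")
    git(repo, "commit", "-q", "-m", f"decision {d['step']}: {d['choice']}")

# "Execution becomes a tree walk": replay the history oldest-first.
log = git(repo, "log", "--reverse", "--format=%s").strip().splitlines()
print(log)
```

Whether this earns the comment's "O(1)" claim is doubtful (a log walk is linear in the number of decisions), but it does show the append-only, reproducible-replay property the comment is gesturing at.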