Discussion
aadishv: Someone already made a great agent skill for this, which I'm using daily, and it's been very cool! https://github.com/pasky/chrome-cdp-skill For example, I use codex to manage a local music library, and it was able to use the skill to open a YT Music tab in my browser, search for each album, and get the URL to pass to yt-dlp. Do note that it only works for Chrome browsers rn, so you have to edit the script to point to a different Chromium browser's binary (e.g. I use Helium) but it's simple enough
Etheryte: On one hand, cool demo, on the other, this is horrifying in more ways than I can begin to describe. You're literally one prompt injection away from someone having unlimited access to all of your everything.
speedgoose: Interesting. MCP APIs can be useful for humans too. Chrome's dev tools already had an API [1], but perhaps the new MCP one is more user friendly, as one main requirement of MCP APIs is to be understood and used correctly by current gen AI agents. [1]: https://chromedevtools.github.io/devtools-protocol/
NiekvdMaas: Also works nicely together with agent-browser (https://github.com/vercel-labs/agent-browser) using --auto-connect
mh-: Not the person you're replying to, but I just use a separate Chrome profile that isn't logged into anything except what I'm working on. Then I keep the persistence, but without commingling in a way that dramatically increases the risk of not paying attention to what I'm/it's doing.
Sonofg0tham: This is a massive workflow win. The state persistence is the real killer feature here. Having to re-authenticate or re-populate a complex cart state just to let an agent poke at a bug was always the biggest friction point.
AlexDunit: The session reuse angle is what makes this actually useful — previously you had to re-authenticate every time an agent spun up a new browser instance. Debugging anything behind a login wall was painful. Curious how the permission model holds up against prompt injection via malicious page content. The agent has full DevTools access, so a crafted page could theoretically exfiltrate network responses or cookies through the MCP channel.
zxspectrumk48: I found this one working amazingly well (same idea - connect to existing session): https://github.com/remorses/playwriter
aadishv: Of course I still watch it and have my finger on the escape key at all times :)
bergheim: For now you are. All these things fall with time, of course. You will stop caring once you start feeling safe, we all do. Also. AAarrgh, my new thing to be annoyed at is AI drivel written slop. "No browser automation framework, no separate browser instance, no re-login." Oh really, nice. No separate computer either? No separate power station, no house, no star wars? No something else we didn't ask for? Just one toggle and you go? Whoaaaaaa. Edit: lol even the skill itself is vibe coded: "Lightweight Chrome DevTools Protocol CLI. Connects directly via WebSocket — no Puppeteer, works with 100+ tabs, instant connection." I feel like there's nothing fucking left on the internet anymore that is not some mean of whatever the LLM is trained to talk like now.
harr01: That's cool and all, but it makes me glad I don't use Chrome besides for debugging. I can't imagine doing this on an active browser.
sheepscreek: Just one of those dual-use techs. As long as it’s gated and not turned on by default, it’s all good. They could also add a warning/sanity check similar to “allow pasting” in the console.
David-Brug-Ai: This is the exact problem that pushed me to build a security proxy for MCP tool calls. The permission model in most MCP setups is basically binary, either the agent can use the tool or it can't. There's nothing watching what it does with that access once it's granted. The approach I landed on was a deterministic enforcement pipeline that sits between the agent and the MCP server, so every tool call gets checked for things like SSRF (DNS resolve + private IP blocking), credential leakage in outbound params, and path traversal, before the call hits the real server. No LLM in that path, just pattern matching and policy rules, so it adds single-digit ms overhead. The DevTools case is interesting because the attack surface is the page content itself. A crafted page could inject tool calls via prompt injection. Having the proxy there means even if the agent gets tricked, the exfiltration attempt gets caught at the egress layer.
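For readers who want a concrete picture of the kind of checks described above, here is a minimal sketch, not the commenter's actual proxy. It assumes tool-call parameters arrive as a JSON-like dict; the names `check_tool_call`, `is_private_host`, `iter_strings`, and `SECRET_PATTERNS` are hypothetical, and the three patterns are illustrative rather than a complete policy.

```python
import ipaddress
import re
import socket
from urllib.parse import urlparse

# Hypothetical, deliberately small pattern set; a real policy would be broader.
SECRET_PATTERNS = [
    re.compile(r"AKIA[0-9A-Z]{16}"),                     # AWS access key ID shape
    re.compile(r"-----BEGIN [A-Z ]*PRIVATE KEY-----"),   # PEM private key header
    re.compile(r"(?i)bearer\s+[a-z0-9._\-]{20,}"),       # bearer-token-like string
]


def is_private_host(host, resolve=socket.getaddrinfo):
    """Return True if host resolves to any non-public address, or fails to resolve."""
    try:
        infos = resolve(host, None)
    except OSError:
        return True  # fail closed when DNS resolution fails
    for info in infos:
        # Drop any IPv6 scope suffix such as "%eth0" before parsing.
        addr = ipaddress.ip_address(info[4][0].split("%")[0])
        if not addr.is_global:
            return True
    return False


def iter_strings(value):
    """Yield every string nested anywhere inside a JSON-like value."""
    if isinstance(value, str):
        yield value
    elif isinstance(value, dict):
        for item in value.values():
            yield from iter_strings(item)
    elif isinstance(value, (list, tuple)):
        for item in value:
            yield from iter_strings(item)


def check_tool_call(params, resolve=socket.getaddrinfo):
    """Return a list of policy violations found in one tool call's parameters."""
    violations = []
    for text in iter_strings(params):
        if any(p.search(text) for p in SECRET_PATTERNS):
            violations.append("credential-like string in params")
        if ".." in text.replace("\\", "/").split("/"):
            violations.append("path traversal segment")
        parsed = urlparse(text)
        if parsed.scheme in ("http", "https") and parsed.hostname:
            if is_private_host(parsed.hostname, resolve):
                violations.append(f"URL targets non-public host: {parsed.hostname}")
    return violations
```

A proxy built this way would forward the call only when `check_tool_call` returns an empty list. Note that checking DNS once and connecting later leaves a rebinding window, so a production version would need to pin the resolved address for the actual request.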
simianwords: AI
rzmmm: Yes. Can someone tell me why even HN has bots? For selling upvotes for advertising purposes?
tonyhschu: Very cool. I do something like this but with Playwright. It used to be a real token hog though, and got expensive fast. So much so that I built a wrapper to dump results to disk first then let the agent query instead. https://uisnap.dev/ Will check this out to see if they’ve solved the token burn problem.
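The dump-to-disk-then-query idea mentioned above can be sketched roughly as follows. This is an illustration of the general pattern only and makes no claim about how the linked tool works; `dump_entries` and `query_entries` are hypothetical names, and the entries are assumed to be JSON-serializable dicts such as captured network requests.

```python
import json
from pathlib import Path


def dump_entries(entries, path):
    """Write entries to disk as JSON Lines and return only a short summary.

    The agent sees the summary (a path and a count) instead of the full payload.
    """
    path = Path(path)
    with path.open("w", encoding="utf-8") as f:
        for entry in entries:
            f.write(json.dumps(entry) + "\n")
    return {"path": str(path), "count": len(entries)}


def query_entries(path, needle, limit=5):
    """Return at most `limit` stored entries whose serialized form contains `needle`."""
    matches = []
    with Path(path).open(encoding="utf-8") as f:
        for line in f:
            if needle in line:
                matches.append(json.loads(line))
                if len(matches) >= limit:
                    break
    return matches
```

The token saving comes from the `limit` cap and the substring filter: the agent pays only for the handful of entries it asks about rather than the whole capture.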
oldeucryptoboi: I tell Claude to use playwright so I don't even need to do the setup myself.
senand: I suggest using https://github.com/simonw/rodney instead
glerk: Note that this is a mega token guzzler in case you’re paying for your own tokens!
sofixa: Even without the bash escape risk (which can be mitigated with the various ways of only allowing yt-dlp to be executed), YT Music is a paid service gated behind a Google account, with associated payment method. Even just stealing the auth cookie is pretty serious in terms of damage it could do.
mh-: Agreed. I wouldn't cut loose an agent that's at risk of prompt injection w/ unscoped access to my primary Google account. But if I understood the original commenter's use case, they're just searching YT Music to get the URL to a given song. This appears[0] to work fine without being logged in. So you could parameterize or wrap the call to yt-dlp and only have your cookie jar usable there. [0]: https://music.youtube.com/search?q=sandstorm [1]: https://music.youtube.com/watch?v=XjvkxXblpz8
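One way to read "parameterize or wrap the call to yt-dlp" is a small wrapper that is the only thing the agent may invoke, sketched below under some assumptions. The function names and the `cookies.txt` path are hypothetical, and the sketch only accepts `music.youtube.com/watch?v=...` URLs like the one in [1], so album or playlist URLs would need their own rule. `--cookies` is yt-dlp's option for reading a cookie file.

```python
import re
import subprocess
from urllib.parse import parse_qs, urlparse

ALLOWED_HOST = "music.youtube.com"
VIDEO_ID = re.compile(r"[A-Za-z0-9_-]+")  # conservative character set for the id


def build_ytdlp_command(url, cookies_file):
    """Validate the URL and return an argv list for yt-dlp, or raise ValueError."""
    parsed = urlparse(url)
    if parsed.scheme != "https" or parsed.hostname != ALLOWED_HOST:
        raise ValueError(f"host not allowed: {url!r}")
    if parsed.path != "/watch":
        raise ValueError(f"only /watch URLs are accepted: {url!r}")
    ids = parse_qs(parsed.query).get("v", [])
    if len(ids) != 1 or not VIDEO_ID.fullmatch(ids[0]):
        raise ValueError(f"missing or malformed video id: {url!r}")
    # Rebuild the URL from validated parts instead of passing agent input through.
    clean_url = f"https://{ALLOWED_HOST}/watch?v={ids[0]}"
    return ["yt-dlp", "--cookies", cookies_file, clean_url]


def download(url, cookies_file="cookies.txt"):
    """Run yt-dlp without a shell, so the URL is never interpreted as shell syntax."""
    subprocess.run(build_ytdlp_command(url, cookies_file), check=True)
```

Because the command is an argv list and the URL is rebuilt from validated pieces, a prompt-injected string cannot add flags or shell syntax; the cookie file is referenced only inside the wrapper, not exposed to the browsing session.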
rossvc: I've been using the DevTools MCP for months now, but it's extremely token heavy. Is there an alternative that provides the same amount of detail when it comes to reading back network requests?
paulirish: The project just recently landed a CLI that's been in the works: https://github.com/ChromeDevTools/chrome-devtools-mcp/blob/m... * AFAIK the CLI hasn't yet been announced, but it's in the latest v0.20.0 release.
nerdsniper: It's probably not fully optimized and could be compacted more with just some effort, and further with clever techniques, but browser state/session data will always use up a ton of tokens because it's a ton of data. There's not really a way around that. AIs have a surprising "intuition" about problems that often helps them guess at solutions based on insufficient information (and they guess correctly more often than I expect they should). But when their intuition isn't enough and you need to feed them the real logs/data... it's always gonna use a bunch of tokens. This is one place where human intuition helps a ton today. If you can find the most relevant snippets and give the AI just the right context, it does a much better job.
tacitusarc: What can you do? I mentioned the use of AI on another thread, asking essentially the same question. The comment was flagged, presumably as off topic. Fair enough, I guess. But about 80% (maybe more) of posted blogs etc that I see on HN now have very obvious signs of AI. Comments do too. I hate it. If I want to see what Claude thinks I can ask it. HN is becoming close to unusable, and this isn’t like the previous times where people say it’s like reddit or something. It is inundated with bot spam, it just happens the bot spam is sufficiently engaging and well-written that it is really hard to address.
bergheim: I hear you and I agree. I don't know. Gated communities?