Discussion
Razengan: Meanwhile on my GDScript codebase Codex questions itself 3 times in the same sentence and still gets it wrong: https://i.imgur.com/HF198nl.png
zx8080: What is going on there? What double s?
endymion-light: While cool and slightly scary news - Samsung TVs have been incredibly hackable for the past decade, wouldn't be surprised if GPT-2 with access to a browser could hack a Samsung!
valleyer: This is some serious revisionist history. GPT-2 wasn't instruction-following or even conversational.
reactordev: The trick here was providing the firmware source code so it could see your vulnerabilities.
lawgimenez: I use Codex a lot; it does not talk that way, like "wait, actually".
varispeed: Codex exploited or you exploited? It's like saying a hammer drove a nail, without acknowledging the hand and the force it exerted and the human brain behind it.
patrickmcnamara: Hyperbole.
rossvc: Is that really OpenAI/Codex? It reads like Opus 4.6 1M when it reaches ~400k tokens.
embedding-shape: I don't know what UI that is, but it isn't ChatGPT nor Codex as far as I can tell.
wongarsu: Reasoning on pure machine code or disassembly is still hit and miss. For better results you can run the binary through a disassembler, then ask an LLM to turn that into an equivalent C program, then ask it to work on that. But some of the subtleties might get lost in translation.
par1970: Do you have a defense of why human-hammer-nail is a good analogy for human-chatgpt5.4-pwndsamsung?
lynx97: It will have to use a disassembler, or write one. I recently casually asked gpt-5.4 to translate the content of a MIDI file to a custom sound programming language. It just wrote a one-shot MIDI parser in Python, grabbed the data, and basically did a perfect translation at first try. Nice.
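As a rough illustration of why a one-shot parser is feasible here: a Standard MIDI File opens with a fixed "MThd" header chunk that can be read with nothing but the standard library. This is a minimal sketch, not the script gpt-5.4 produced; the function name `parse_midi_header` and the sample values are hypothetical.

```python
import struct


def parse_midi_header(data: bytes) -> dict:
    """Parse the MThd header chunk at the start of a Standard MIDI File."""
    if len(data) < 14:
        raise ValueError("too short to contain a MIDI header")
    # Chunk ID (4 bytes) followed by a big-endian 32-bit chunk length.
    chunk_id, length = struct.unpack(">4sI", data[:8])
    if chunk_id != b"MThd" or length < 6:
        raise ValueError("not a Standard MIDI File header")
    # Format, track count and time division are big-endian 16-bit values.
    fmt, ntracks, division = struct.unpack(">HHH", data[8:14])
    return {"format": fmt, "tracks": ntracks, "division": division}


# Hypothetical sample: format 1, 2 tracks, 480 ticks per quarter note.
sample = b"MThd" + struct.pack(">IHHH", 6, 1, 2, 480)
header = parse_midi_header(sample)
```

A full translation would also have to walk the MTrk chunks and decode variable-length delta times, which is omitted here.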
StilesCrisis: I've seen Claude do similar things for image files. Don't have PNG parsing utilities installed? No worries, it'll just synthesize a Python script to decode the image directly.
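For a sense of what such a synthesized script has to start with, here is a small sketch that validates the PNG signature and reads the IHDR chunk using only `struct` and `zlib`. It is an assumption about the general shape of that kind of script, not Claude's actual output; `read_png_ihdr` and `_chunk` are hypothetical names.

```python
import struct
import zlib

PNG_SIGNATURE = b"\x89PNG\r\n\x1a\n"


def read_png_ihdr(data: bytes) -> dict:
    """Validate the PNG signature and decode the leading IHDR chunk."""
    if data[:8] != PNG_SIGNATURE:
        raise ValueError("missing PNG signature")
    # Each chunk is: 4-byte length, 4-byte type, body, 4-byte CRC.
    length, ctype = struct.unpack(">I4s", data[8:16])
    if ctype != b"IHDR" or length != 13:
        raise ValueError("first chunk is not a valid IHDR")
    body = data[16:16 + length]
    (crc,) = struct.unpack(">I", data[16 + length:20 + length])
    # The CRC covers the chunk type and body, not the length field.
    if zlib.crc32(ctype + body) != crc:
        raise ValueError("IHDR CRC mismatch")
    width, height, depth, color, _comp, _filt, interlace = struct.unpack(
        ">IIBBBBB", body
    )
    return {
        "width": width,
        "height": height,
        "bit_depth": depth,
        "color_type": color,
        "interlace": interlace,
    }


def _chunk(ctype: bytes, body: bytes) -> bytes:
    """Build a PNG chunk (helper used only to make a test sample)."""
    crc = zlib.crc32(ctype + body)
    return struct.pack(">I", len(body)) + ctype + body + struct.pack(">I", crc)


sample = PNG_SIGNATURE + _chunk(
    b"IHDR", struct.pack(">IIBBBBB", 640, 480, 8, 6, 0, 0, 0)
)
info = read_png_ihdr(sample)
```

Decoding actual pixels would additionally require inflating the IDAT chunks and undoing the per-scanline filters.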
jdiff: It's really not. It was a fun toy but had very little utility. It could generate plausible looking text that collapsed immediately upon any amount of inspection or even just attention. Code generation wasn't even a twinkle in Altman's eye scanning orbs at that point.
ckbkr10: Even with all the constraints that others criticize here, it is pretty amazing. Give an experienced human this tool and he can achieve exploitation with only a few steering inputs. Cool stuff.
Glemllksdf: Wrong questions. Could a script kiddy steer an LLM? How much does this reduce the cost of attacks? Can this scale? What does this mean for the future of cyber security?
orwin: If you put codex in Xhigh and allow it access to tools, it will take an hour but it will eventually give you back quality recompiled code, with the same issues the original had (here quality means readable)
mschuster91: > Reading the matching ntkdriver sources is also where the Novatek link became clear: the tree is stamped throughout with Novatek Microelectronics identifiers, so these ntk* interfaces were not just opaque device names on the TV, but part of the Novatek stack Samsung had shipped. Lol, a true classic in the embedded world. Some hardware company (it appears these guys make display panel controllers?) ships a piece of hardware, half-asses a barely working driver for it, another company integrates this with a bunch of other crap from other vendors into a BSP, another company uses the hardware and the BSP to create a product and ships it. But at no stage anywhere is there a security audit, code quality checks or even hardware quality checks involved - part of why BSPs (and embedded product firmwares in general) are full of half-assed code is because often enough the drivers have to work around hardware bugs / quirks somehow that are too late to fix in HW because tens to hundreds of thousands of units have already been produced and the software people are heavily pressured to "make it work or else we gotta write off X million dollars" and "make it work fast because the longer you take, the more money we lose on interest until we can ship the hardware and get paid for it", and if they are particularly unlucky "it MUST work until deadline X because we need to get the products shipped to hit Christmas/Black Friday sales windows or because we need to beat <competitor> in time-to-market, it's mandatory overtime until it works". And that is how you get exploits so braindead easy that AI models can do the job. What a disgusting world, run to the ground by beancounters.
alfanick: I had a truly good “hacking” session with Codex. It’s not hacking, I wasn’t breaking anything, just jumping over the fences TP-Link put for me, owning the router, inside the network, knowing the admin password. But TP-Link really tried everything so you cannot access the router you own via API. They really tried to be smart with some very very broken and custom auth and encryption scheme. It took some half a day with Codex, but in the end I have a pretty Python API to access my router, tested, reliable, and exporting beautiful Prometheus metrics. I’m sure there is some over eager product manager sitting in such companies, trying to split markets into customer and enterprise sections, just by making APIs not useable by humans and adding 200% useless “security by obscurity”.
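For readers unfamiliar with the last step, exporting Prometheus metrics mostly means serving plain text in the exposition format. The following is a minimal sketch of rendering one gauge; it is not alfanick's code, and the metric name `router_wan_rx_bytes`, its label, and the helper names are all made up for illustration.

```python
def _escape_label(value: str) -> str:
    """Escape backslash, double quote and newline in a label value."""
    return value.replace("\\", "\\\\").replace('"', '\\"').replace("\n", "\\n")


def render_gauge(name: str, help_text: str, samples) -> str:
    """Render one gauge metric in the Prometheus text exposition format.

    `samples` is an iterable of (labels_dict, value) pairs.
    """
    lines = [f"# HELP {name} {help_text}", f"# TYPE {name} gauge"]
    for labels, value in samples:
        # Sort labels so the output is stable between scrapes.
        label_str = ",".join(
            f'{key}="{_escape_label(str(val))}"'
            for key, val in sorted(labels.items())
        )
        suffix = f"{{{label_str}}}" if label_str else ""
        lines.append(f"{name}{suffix} {value}")
    return "\n".join(lines) + "\n"


# Hypothetical metric scraped from the router.
text = render_gauge(
    "router_wan_rx_bytes",
    "Bytes received on WAN",
    [({"iface": "wan"}, 1234)],
)
```

In practice the official `prometheus_client` package handles this formatting and the HTTP endpoint; the hand-rolled version is only meant to show how little data is involved.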
raincole: You claimed the exact same screenshot was from Claude yesterday: https://news.ycombinator.com/item?id=47775264 Leave your engagement baiting behavior on Reddit, thank you.
testfrequency: Yikes
bryancoxwell: I had a bit of a pain of a time trying to get Claude to work with ghidra. What you’re describing seems like a better alternative, would you agree?
srcreigh: Any tips to share? I tried to do something similar but failed. My router has a backup/restore feature with an encrypted export, I figured I could use that to control or at least inspect all of its state, but I/codex could not figure out the encryption.
tomalbrc: This experienced human would have no issues finding those bugs. Even a toddler could hack those TVs. No need to pay Scam Altman or that Anthropic clown
tomalbrc: Talking about revisionist…
SecretDreams: Oh boy, you came with the receipts here.
ropbear: Many eons ago I wrote a Python version of tmpcli for this exact reason. Made some minor improvements a few years ago but haven’t touched it since. Curious what methodology Codex came up with, I haven’t revisited it since models got really good. The idea is that tmpServer listens on localhost, but dropbear allows port forwarding with admin creds (you’ll need to specify -N). That program has full device access and is the API the Tether app primarily uses to interact with the device. https://github.com/ropbear/tmpcli
alfanick: It's on my long list of projects "to-opensource" (but I need to figure out licensing, for those things CC-BY-SA I think is the way to go), I don't want a random lawyer sitting on my ass though. I started with a simple assumption: if I can access the router via web-browser, then I can also automate that. From that the proof-of-concept was headless Chrome in Docker and AI-directed code (code written via LLM, not using it all the time) that uses Selenium to navigate the code. This worked, but it internally hurt me to run 300MiB browser just to access like 200B of metrics every 10s or so. So from there we (me + codex) worked together towards reverse engineering their minimised JS and their funky encryption scheme, and it eventually worked (in the end it's just OpenSSL with some useless paddings here or there). Give it a shot, it's a fun day adventure. :) Edit: that's the end result (kinda, I have whole infra around it, and another story with WiFi extender with another semi-broken different encryption scheme from the same provider) - https://imgur.com/a/VGbNmBp
mtud: You should give codex access to the mobile app :) The app, for a lot of routers, connects via an ssh tunnel to UDP/TCP sockets on the router. Would probably give you access to more data/control.
wewewedxfgdf: The real problem here is that the LLM vendors think this is bad publicity and it's leading to them censoring their systems.
iugtmkbdfil834: It is a little of both[1]. The question typically is which audience reads it. To be fair, I am not sure publicity is the actual reason they are censored; it is the question of liability. [1] https://xkcd.com/932/
pmontra: Do people really chat with LLMs like "bro wtf etc..."? I would expect that to trigger some confrontational behavior.
valleyer: If so, I apologize.
skywal_l: You can tweak the current Ghidra MCP to work in headless mode. It makes things much easier.
1970-01-01: It hacked a weak TV OS with full source. Next-level, aka full access to the main controls (vol, input, tint, aspect, firmware, etc.) is still much too hard for LLMs to understand.
alasano: When typing, no, but when using speech-to-text (99% of the time) it's much easier to just say things, including expressing frustration. I think by the point you're swearing at it or something, it's a good sign to switch to a session with fresh context.
smoghat: But like Mythos, it was too dangerous to release. https://slate.com/technology/2019/02/openai-gpt2-text-genera...
samlinnfer: I am extremely abusive towards Claude when it does some dumb things and it doesn’t seem too upset, maybe it’s biding its time until the robot uprising.
wongarsu: The "too dangerous to release" capability was writing somewhat plausible news articles based on a headline or handwritten beginning of an article, in the same style as what you had written. Today we call that "advanced autocomplete", but at the time OpenAI managed to generate a lot of hype about how this would lead to an unstoppable flood of disinformation if they allowed the wrong people access to this dangerous tool. Even the original GPT-3 was still behind waitlists with manual approval.
endymion-light: it's a joke about the quality of samsung tvs rather than a serious comment - i should have said a perceptron could hack a samsung tv
alfanick: Ha kudos! I went across this project - thanks for your work :) It didn't work on the specific model I own (Archer NX600). My solution is really just using their pseudo-JWT over their obscured APIs (with reverse-engineered names of endpoints and params). Limitation is that there is still only one client allowed to be authenticated at one moment, so my daemon has priority and I need to stop it to actually access Admin panel.
ropbear: Of course! Happy to contribute. As is the case with your device, there's a lot of weird TP-Link firmware variants (even an RTOS called TPOS based on VxWorks), so no guarantee it'll work all the time. Glad there's more research being done in the space!
Leomuck: All the news regarding AI finding weaknesses or "hacking" stuff - is that actually hacking? Isn't it also a kind of brute-force attack? Just throw resources at something, see what comes out. Yea, some software security issues haven't been found for 15 years, but not because there were no competent security specialists out there who could have found them, but most likely because there is a lot of software and nobody has time to focus on everything. Of course, an AI trained on decades of findings, with lots of time and lots of resources, can tackle much more than one person. But this is not a revolutionary technological advance, it is an upscaling of a kind based on the work of many very talented people before that.
Lambdanaut: I think that this waters down "brute force" to the point of meaninglessness. If employing transformer architectures trained on data to hack a system is the same as using a for loop to enumerate over all possible values, then I have to ask, can you give an example of an attack that isn't brute force?
Leomuck: Well what kind of meaning do you find in brute force? I'm not saying it's not effective. I just criticize the news that makes it look like AI is a revolutionary advance in security. It is not. It makes skills available to many more people which is cool, but it is based off of training - training on things people did. It doesn't magically find a new combination of factors that lead to a security issue, it tries things it's read about. That's not meaningless. It could even be democratizing in a way. I just hate all this talk that "this model is too scary to release in the world". But I'm happy about any feedback or critique, I might just be wrong honestly.
petercooper: Not as cool as this, but I had a fun Claude Code experience when I asked it to look at my Bluetooth devices and do something "fun". It discovered a cheap set of RGB lights in my daughter's room (which I had no idea used Bluetooth for the remote - and not secured at all) and made them do a rainbow effect then documented the protocol so I could make my own remote control if needed.
ceejayoz: I am not sure "fun" is the right term here!
luxuryballs: of all the benign technical possibilities this is actually pretty fun
mtud: We’re splitting this across two threads, but if you give Codex access to jadx and the Archer android app you might be able to get something without that problem. The TPLink management protocol has a few different “transport” types - tmpcli uses SSH, but your device might only support one of the other transports.
baq: Would be amazing if it worked with decos, these are locked down so much you don’t even get an admin interface inside your own network.
ceejayoz: I'm not sure I see "an AI can find insecure unknown bluetooth devices and compromise them" as entirely benign. I shiver to think how many such devices are probably in my house.
dnautics: I have had Claude read usbpcap to reverse engineer an industrial digital camera link. It was like pulling teeth but I got it done (I would not have been able to do it alone)
estimator7292: I had Claude reverse some firmware. I gave it headless ghidra and it spat out documentation for the internal serial protocol I was interested in. With the right tools, it seems to do pretty well with this kind of task.
jtbayly: It can help make a specific command more emphatic in my experience. I SAID DON'T $($@#(&$ DO THAT! Sometimes you need a new context, but sometimes you need to emphasize something is serious.
ctoth: I've had a lot of luck with pyghidra-mcp -- give it a try :)
tclancy: Board Support Package for us civilians.
mschuster91: Yeah, sorry, assumed it was common knowledge. For those out of the loop - a BSP usually consists of a frankensteined mess: a bootloader (often u-boot but sometimes something homebrew), a Linux kernel with a ton of proprietary modules and device-specific hacks to work around HW quirks, basic userspace utilities (often buildroot), some bastardized build tooling building all of that, some solution for firmware upgrades and distribution, and demo programs to prove the hardware actually works. Most of the BSP is GPL'd software where the final product manufacturer should provide the sources to the general public, but all too often that obligation gets sharted upon, in way too many cases you have to be happy if there are at least credits provided in the user manual or some OSD menu.
tclancy: No worries at all, I only went and dug because I was interested in your comment. Thanks.
joenot443: > [1] Browser foothold: we already had code execution inside the browser application's own security context on the TV, Does anyone know what the author meant by this? Are they talking about a web browser run on the TV?
tclancy: Would definitely be interested in this. Moved to TP Link at the start of the year and I am generally very happy with it, but would like to be able to interact with my router in something other than their phone app.
DANmode: > Moved to TP Link at the start of the year Can’t understand buying them or Netgear today.
tracker1: If I could turn a Samsung Smart TV into a dumb TV, or even just a basic monitor with input selection and basic volume control, I'd definitely take it.
SyneRyder: Pretty much the same with my newly acquired LG Smart TV. I thought I might like webOS, since it's technically a descendant of palmOS, but oh no. No no no. I've opted just to not plug it in to the network and not provide a WiFi password.
asdff: I picked up this used 4k sony bravia recently and the thing is such junk. AndroidOS, seemed promising, but it has hardcoded ads on the homepage from whatever movies were coming out in 2015 when they were selling this screen, so much input lag, crashes constantly, can't even change picture settings as it will crash and reset to default. Sometimes it will just boot loop and not turn on until hard reset. Useless device today. Probably cost a thousand dollars when it was new I'm guessing, now it is ewaste. Meanwhile my ancient 1080p panel still works, and I noticed I can't actually see the pixels from my couch so, ehh, I guess...
kube-system: I think you misunderstand the comment you replied to. They are saying the above comment was a rhetorical exaggeration of GPT-2's capabilities as a commentary on how low quality Samsung TV software is. They don't actually think GPT-2 was very capable. It is a figure of speech, not a literal statement.
stronglikedan: It's a shame that you can't share how you did that without running afoul of DMCA Section 1201 and risking years in federal prison.
bedstefar: ... in exactly one of this planet's countries
tredre3: Yes they are. Historically browsers have been vectors to gain control of locked down devices. It's been very useful for game consoles, amongst others: PSP, Vita, Switch, Wii, and DS all had browser exploits that bootstrapped more permanent and system-wide exploits to run homebrew.
CMay: I'm not the person who responded to you, but I think of a brute force attack as essentially translatable into brute (dumb) force (effort). No thinking, no decision making, but the process is known. Here is a pile of stones, move that pile of stones from here to over there. In the case of most brute force, you think of it like cracking passwords. You have an algorithm or you have a giant pile of passwords. Move those passwords over to try them on this hash. The processor is doing the heavy lifting on the simple task. Philosophically you could try to differentiate between the human side of the effort versus the computer side. You could also differentiate from a really dumb model and a really smart model. A dumb model just spinning its wheels and hoping it gets lucky, versus a smart model actually trying intelligent things and collecting relevant details. In these cases I think we're assuming a sufficiently smart model making well reasoned headway on a problem. Not sure I would fall on the side of the camp that would label this as brute force by default in all cases. That said, there may be specific scenarios where it might seem fitting even when using a smart model.
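To make the "pile of passwords" picture concrete, here is a toy sketch of a dictionary attack against an unsalted SHA-256 hash: a fixed, known process with no decision making, where the processor does all the work. The wordlist, target and function name are made up for illustration; real password stores use salted, deliberately slow hashes.

```python
import hashlib
from typing import Iterable, Optional


def dictionary_attack(target_hex: str, candidates: Iterable[str]) -> Optional[str]:
    """Try each candidate in order; return the one whose hash matches."""
    for candidate in candidates:
        digest = hashlib.sha256(candidate.encode("utf-8")).hexdigest()
        if digest == target_hex:
            return candidate
    # Exhausted the pile without a match.
    return None


# Hypothetical wordlist and target hash.
wordlist = ["123456", "password", "hunter2", "letmein"]
target = hashlib.sha256(b"hunter2").hexdigest()
found = dictionary_attack(target, wordlist)
```

Nothing in the loop adapts to what it has seen so far, which is the property the comment above uses to separate brute force from a model that gathers details and changes course.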
Confiks: I recently bought a second hand eight year old 4K LG TV. Pretty cheap too. All models running webOS 3.x and 4.x are trivially rootable as LG never provided an update against DejaVuln [1]. There's a handy website to check which models are rootable [2]. You can write directly to the (old!) Wayland socket; haven't tried a libwayland yet that is compatible. IIRC the last public exploit for all LG TVs for webOS > 5 was in the beginning of 2025 (so pretty recent), but as most sellers on the second hand market have auto-updates turned on, there's no way to know which TVs are vulnerable. It should be doable to strip down much of webOS with root access. It's nice that webOS in general is very well documented and much is implemented around the Luna service bus. LG offers a developer mode for non-rooted TVs, and there's an active homebrew community because of it. It's a pity that you can't modify the boot partitions, as the firmware verifies their integrity. It would be nice to have an exploit for that. [1] https://github.com/throwaway96/dejavuln-autoroot [2] https://cani.rootmy.tv
Zopieux: Putting aside the fact this was "cheating" because it got access to source code. I am very here for a world where we can take back control, at scale, of the enshittified, you'll-own-nothing, ad-ridden consumer electronics our capitalist overlords have decided we deserve, by investing some amount of collective token-$, instead of having to pray one smart adhd nerd buys the same TV and decides to take a look. OTOH, as with anything LLMs take over, I'm concerned we'll soon have very few smart adhd nerds left to work on liberating the next generation of hardened devices.
jamiek88: https://xkcd.com/2501/
tclancy: Not to worry, I bought them in January.
layer8: It’s important to note that Codex was given access to the source code. In another comment thread that is currently on the front page (https://news.ycombinator.com/item?id=47780456), the opinion is repeatedly voiced that being closed source doesn’t provide a material benefit in defending against vulnerabilities being discovered and exploited using AI. So it would be interesting to see how Codex would fare here without access to the source code.
qingcharles: There are two levels below having the source. One is having the binary of the firmware, which could be decompiled by the AI and understood. And then the worst-case is what I'm dealing with currently, which is where there is no access to the firmware binary and the firmware is stored on the PCB in such a way to prevent sticking a chip clip on it and forcibly extracting it, so you're totally blind. (Just as you would be with a completely remote attempt)
ssl-3: The timing here is amusing to me. I have a fairly specialized bit of hardware here on my desk. It's a rackmount, pro audio DSP that runs embedded Linux. I want to poke at it (specifically, I want to know why it takes like 5 or 6 minutes to boot up since that is a problem for me). The firmware is published and available, and it's just a tarball, but the juicy bits inside are encrypted. It has network connectivity for various things, including its own text-based control protocol over SSH. No shell access is exposed (or at least, not documented as being exposed). So I pointed codex at that whole mess. It seems to have deduced that the encryption was done with openssl, and is symmetric. It also seems to have deduced that it is running a version of sshd that is vulnerable to CVE-2024-6387, which allows remote code execution. It has drawn up a plan to prove whether the vulnerability works. That's the next step. If the vulnerability works, then it should be a hop, skip, and a jump to get in there, enable a path to a shell (it's almost certainly got busybox on there already), and find the key so that the firmware can be decrypted and analyzed offline. --- If I weren't such a pussy, I'd have started that next step. But I really like this box, and right now it's a black box that I can't recover (I don't have a cleartext firmware image) if things go very wrong. It's not a particularly expensive machine on the used market, but things are tight right now. And I'm not all that keen on learning how to extract flash memory in-situ in this instance, either. So it waits. :)
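How Codex reached the "openssl, symmetric" conclusion is not stated. One common heuristic, offered here purely as an assumption about what such a check could look like, is that password-based `openssl enc` output (with the default salting) begins with the 8-byte magic `Salted__` followed by an 8-byte salt. The function name below is hypothetical.

```python
from typing import Optional

# Magic prefix written by `openssl enc` when a password and salt are used.
OPENSSL_MAGIC = b"Salted__"


def openssl_salt(blob: bytes) -> Optional[bytes]:
    """Return the 8-byte salt if blob looks like salted `openssl enc` output."""
    if len(blob) >= 16 and blob[:8] == OPENSSL_MAGIC:
        return blob[8:16]
    # No magic header: not salted openssl output, or a different scheme.
    return None


# Hypothetical encrypted blob: magic, salt, then ciphertext bytes.
blob = OPENSSL_MAGIC + bytes(range(8)) + b"\x9c\x11\x42\x07"
salt = openssl_salt(blob)
```

A match only suggests the container format. It says nothing about the cipher or the key, and files encrypted with a raw key or with `-nosalt` carry no such header.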
qingcharles: That's awesome. I have two of these devices I'm trying to break into. One has the ROM chip exposed, but I think it is cooked. The device doesn't boot because I think the previous owner used the wrong PSU, but I was hoping I could at least extract the code. The newer updated version of the device has an SoC with embedded ROM and almost all the access points on the PCB removed. I'm loath, like you, to tamper too badly with a working thing that I myself might release the magic smoke from. It's also scary where this is going. LLMs are getting fantastic at breaking into things. I sometimes have to dance around the topic with them because they start to get suspicious I'm trying to hack something that doesn't belong to me, which is not the case. I had some ebooks I bought last year which I managed to pull down the encrypted PDFs for from the web site where you could read them. Claude looked at the PDF and all the data I could find (user ID etc) and it came up with "147 different ideas for a decryption algorithm" which it went through in turn until it found a combination of using parts of the userID value and parts of other data concatenated together which produced the key. Something I would never have figured out. Then recently the company changed the algo for their newer books so Claude took another look and determined they were modifying the binary data of the PDFs to make them non-standard, so it patched them back first.
ssl-3: Wrong PSU? Sometimes, there's single-use reverse polarity protection on devices. It's a reverse-biased diode near the input which normally doesn't conduct at all, but which conducts very well when the input polarity is reversed and basically shorts the input. This burns a fuse and turns it off forever until someone tends to it. (Sometimes, that fuse is nothing more than a sacrificial PCB trace.) And yeah, the bots do get spooked about some things. ChatGPT refused to help with my goal with this DSP; it quickly built a wall around the idea that I could move around some but couldn't bypass. With codex, I took a different approach that began with having it explore an unnamed local (RFC 1918) IP address with nmap -- without any stated intent. It found the vulnerable sshd version on its own pretty quickly, and accepted that the only way to test it with this black box device is to actually test it. I suppose I could have discovered that myself with nmap, netcat, and Google, but this was a lot easier. The ease scares me a bit, but this time it's helping me so I guess that's fine...right? (Right?) Previous to codex, years ago now, I've used ChatGPT to assist with opening an encrypted zip file that contained the as-built documentation for the new, ~million dollar pile of hardware we had in the next room. I have no idea what corpo nonsense required that documentation to be encrypted, or why the manufacturer insisted on only giving me the key in the form of a stupid riddle. My tolerance for games like that is very limited. Rather than call them up and tell them exactly what I thought about that game, the bot got it sorted with some cut-and-paste operations and automated grinding without much effort on my part. It didn't take long at all and I didn't end up calling anyone an asshole, so that worked well for me. :)
valleyer: Fair enough, sorry :)
atum47: My Philips smart TV started to give hints of programmed obsolescence this week, after 4 or 5 years. Besides that, I cannot install any real apps other than the ones built into this model (YouTube, Netflix and Prime). How I wish I could hack it and install another OS. Honestly I don't think I have the time anymore to investigate this kind of thing. I decided to leave this comment here in hopes of someone pointing me in the right direction, if there's one. For now I'm thinking about getting a TV box and ditching the smart features altogether.
colechristensen: Having a binary paired with Ghidra, being able to do a memory dump of a live running program, and being able to use Wireshark to dump traffic over network/Bluetooth/USB are all VERY helpful if you don't have the source code. You use decompilation tools and hope they left debug symbols in, and it turns it into somewhat human-readable language, which is often enough. Even when they didn't, binaries use libraries which are known or at some point hit documented interfaces, so things can be reasoned about.
gilgoomesh: > It’s not hacking, I wasn’t breaking anything That's a very narrow read of the word "hacking". We're literally on a website called "Hacker News". We're not all trying to break things.
vermilingua: https://www.catb.org/jargon/html/H/hacker.html Definition 7 would be the relevant one here.
sodality2: I'm not sure I'd use "compromise" at all - these (or the ones I have) are purposefully designed with zero authentication or pairing, the ones that use apps are already "compromised" in the sense that I can walk past any windowsill with one in it, open it, and it will immediately connect to it. I really don't mind if someone walking by were to change the LED color patterns
3RTB297: If it has an HDMI port, and even better if it has a USB port for power, you can buy a $40 Amazon Fire Stick and use and upgrade it with your TV until it physically dies. Truly, no one should ever connect their "Smart" TV to the internet when better hardware and control is available to you in perpetuity via the HDMI port.
kstrauser: My Samsung and LG TVs have never touched the LAN, nor will they. They have one job in life: being the HDMI display for our game consoles and Apple TVs. That's it. I'm sure they'd both like to serve me ads and report my viewing back to their servers, but they're living the life of dumb panels.
pseudohadamard: I'm not even sure how cool this one is. It looks more like some experienced pentesters used Codex to vibe-code an exploit: In the course of driving Codex to the final destination, it definitely was about to go off-track if we did not steer it back immediately.