Discussion
ComputerGuru: Reviews of the tool on Twitter indicate that it completely nerfs the models in the process. It won't refuse, but it generates absolutely stupid responses instead.
littlestymaar: Don't use this two-day-old vibe-coded bullshit, please. p-e-w's Heretic (https://news.ycombinator.com/item?id=45945587) is what you want if you're looking for an automatic de-censoring solution.
kube-system: I guess it's kind of like a lobotomy tool.
Animats: Link? It's interesting that people are writing tools that go inside the weights and do things. We're getting past the black-box era of LLMs. That may or may not be a good thing.
noufalibrahim: I believe this has already been done to several models. Ones I've come across are the JOSIEfied models from Gökdeniz Gülmez. I downloaded one or two and tried them on a local Ollama setup. They do generate potentially dangerous output. Turning on thinking for the Qwen series shows how it arrives at its conclusions, and it's quite disturbing. However, after a few rounds of conversation, it gets into loops and just repeats things over and over again. The main JOSIE models worked the best of all and were still useful even after abliteration.
Alifatisk: This is for local models, right? I can't use it on, say, my glm-5 subscription connected to opencode?
HanClinto: Correct, local models only.
littlestymaar: This is vibe-coded garbage that the “author” probably didn't even test themselves after making it yesterday, so it's not surprising that it's broken.
dinunnob: Hate to have to be the one to stick up for Pliny here, but he's concerned about forcing frontier labs to focus more on model guardrails - he demonstrates results that are crazy all the time: https://x.com/elder_plinius
thegrim33: Whether or not the linked tool uses a good approach, manipulating models this way is already fairly well established; see https://huggingface.co/blog/mlabonne/abliteration.
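Roughly, the idea in that post (sketched here with placeholder names, not the blog's actual code) is to find a "refusal direction" in activation space as a difference of means between harmful and harmless prompt activations, then project it out of the weight matrices that write into the residual stream:

    # Hypothetical sketch of abliteration. harmful_acts / harmless_acts are
    # [n_prompts, d_model] residual-stream activations captured at one
    # layer (e.g. via forward hooks); names here are illustrative.
    import torch

    def refusal_direction(harmful_acts, harmless_acts):
        # Difference of means between the two prompt sets, normalized.
        diff = harmful_acts.mean(dim=0) - harmless_acts.mean(dim=0)
        return diff / diff.norm()

    def ablate(weight, direction):
        # W' = W - d (d^T W): remove the component of each output that
        # points along the refusal direction d, so the layer can no
        # longer write to that direction.
        d = direction.unsqueeze(1)           # [d_model, 1]
        return weight - d @ (d.T @ weight)   # same shape as weight

Since everything the model writes along that direction gets clipped, not just refusals, it's plausible this is part of why abliterated models often get dumber, as others in this thread report.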
IncreasePosts: I didn't use this tool, but I did try out abliterated versions of Gemma, and yes, they lost about 100% of their ability to produce a useful response once I did.
PeterStuer: Already censored for sharing on FB Messenger?
a2128: "You're not just using a tool — you're co-authoring the science." This README is an absolute headache, filled with AI writing, terminology that doesn't exist or is being used improperly, and unsound ideas. For example, it focuses a lot on doing "ablation studies", by which it means removing random layers of an already-trained model to find the source of the refusals(?), which is an absolute fool's errand because such behavior is trained into the model as a whole and would not be found in any particular layer. I can only assume somebody vibe-coded this and spent way too much time being told "You're absolutely right!" while bouncing the worst ideas back.
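For concreteness, "removing a layer" from a trained transformer amounts to something like the sketch below (illustrative Python, not the repo's actual code; `blocks` stands in for the model's list of decoder layers). Because refusal behavior is spread across many layers' weights, no single skip will cleanly remove it.

    # Hypothetical sketch: skip one decoder block and let the residual
    # stream flow through unchanged.
    def forward_skipping_layer(blocks, x, skip_idx):
        for i, block in enumerate(blocks):
            if i != skip_idx:  # drop exactly one block
                x = block(x)
        return x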
dinunnob: Hmm, Pliny is amazing - if you kept up with him on social media you'd maybe like him: https://x.com/elder_plinius
bigyabai: If this qualifies as "amazing" in 2026 then Karpathy and Gerganov must be halfway to godhood by now.
gavinray: The parent comment makes no reference to or comment on the author of the README. It just says "the README sucks." Which, I'm inclined to agree, it does. LLM-generated text has no place in prose -- it saves the author time at a greater cost to the aggregate of its readers.
EGreg: Amazing as in his stuff actually works? I just hear him promoting OBLITERATUS all day long and trying to get models to say naughty things.
dinunnob: Yeah, but I think the philosophy is to show how precarious the guardrails are.
paradox460: It's not just a headache; it's bad.
Retr0id: I don't know if this particular tool/approach is legit, but LLM ablation is definitely a thing: https://arxiv.org/abs/2512.13655
electroglyph: The default Heretic run with only 100 samples isn't very good; you really need your own, larger dataset to do a proper abliteration. The best abliteration roughly matches a very careful decensoring SFT.
dinunnob: I don't think anyone is going to dispute this.
bigyabai: I just don't think many people will be "amazed" by their output, as you claim.
dinunnob: I just said Pliny was amazing, FWIW - I like that he's hacking on these and posts about it. I rushed to defend; I wish more people were taking old-school Anarchist Cookbook approaches to these things.
cess11: Smoke banana peel?
Zetaphor: I had such a godawful headache from that. Also tried the peanut shells, equally awful. I was a dumb teenager.
fragmede: Alternatively, it's intentional. It very effectively filters out people with your mindset. You can decide if that's a good thing or not.
fragmede: Gasoline and styrofoam was fun tho.
eli: Why would a tool that works need to dissuade skeptics from trying it?
butILoveLife: This is my experience with abliterated models. I use Berkley Sterling from 2024 because I can trick it. No abliteration needed.
Aurornis: I don't know. I scrolled through his recent Tweets and he's sharing things like this $900 snake-oil device that "finds nearby microphones" and "sends out AI-generated cancellation signals" to make them unable to record your voice: https://x.com/aidaxbaradari/status/2028864606568067491 Try to think for a moment about how a device would "find nearby microphones", or how it would use an AI-generated signal to cancel out your voice at the microphone. This should be setting off BS alarms for anyone. It seems the edgy Twitter AI poster guy is getting meta-trolled by another company selling fake AI devices.
lazzlazzlazz: Ironic to see this comment when Pliny, the author of this codebase, is one of the most sophisticated LLM jailbreakers/red-teamers today. So presumptuous and arrogant!