Discussion
Huginn Report: February 2026
PunchyHamster: They put them directly in front of search results, why would they not miss them?
lich_king: I don't understand the metric they're using. Which is maybe to be expected of an article that looks LLM-written. But they started with ~250 URLs; that's a weirdly small sample. I'm sure there are tens of thousands of malicious websites cropping up monthly. And I bet that Safe Browsing flags more than 16% of that? So how did they narrow it down to that small number? Why these sites specifically?... what's the false positive / negative rate of both approaches? What's even going on?
supermatt: > When we ran the full dataset through the deep scan, it caught every single confirmed phishing site with zero false negatives. The tradeoff is that it flagged all 9 of the legitimate sites in our dataset as suspicious
Huh? Does this mean it just flagged everything as suspicious?
badgersnake: lol, return false;
nico: On a tangent - Gmail has a feature to report phishing emails, but it seems like it’s only available on the website. Their mobile app doesn’t seem to have the option (same with “mark as unread”). Is it hidden or just not available?
notepad0x90: Glass is half empty, I see. How about GSB stopped 16% of phishing sites? That's still huge.
debo_: I guess the glass is 16% full.
john_strinlai: > what's the false positive / negative rate of both approaches
the false positive rate is 100%. they just say everything is phishing:
"When we ran the full dataset through the deep scan, it caught every single confirmed phishing site with zero false negatives. The tradeoff is that it flagged all 9 of the legitimate sites in our dataset as suspicious, which is worth it when you're actively investigating a link you don't trust."
lorenzoguerra: it's 100% for what they call "deep scan", and it's 66.7% for the "automatic scan". Practically unusable anyway
sirpilade: But hits 100% of browsing tracking
timnetworks: The most dangerous links recently have been from sharepoint.com, dropbox.com, etc. and nobody is going to block those.
epicprogrammer: Having spent some time in the anti-abuse and Trust & Safety space, I always take these vendor reports with a massive grain of salt. It’s a classic case of comparing apples to vendor-marketing oranges. A headline screaming about an 84% miss rate sounds like a systemic collapse until you look at the radically different constraint envelopes a global default like GSB and a specialized enterprise vendor operate under.
The biggest factor here is the false-positive cliff. Google Safe Browsing is the default safety net for billions of clients across Chrome, Safari, and Firefox. If GSB’s false-positive rate ticks up by even a fraction of a percent, they end up accidentally nuking legitimate small businesses, SaaS platforms, or municipal portals off the internet. Because of that massive blast radius, GSB fundamentally has to be deeply conservative. A boutique security vendor, on the other hand, can afford to be highly aggressive because an over-block in a corporate environment just results in a routine IT support ticket.
You also have to factor in the ephemeral nature of modern phishing infrastructure and basic selection bias. Threat actors heavily rely on automated DGAs and compromised hosts where the time-to-live for a payload is measured in hours, if not minutes. If a specialized vendor detects a zero-day phishing link at 10:00 AM, and GSB hasn't confidently propagated a global block to billions of edge clients by 10:15 AM, the vendor scores it as a "miss." Add in the fact that vendors naturally test against the specific subset of threats their proprietary engines are tuned to find, and that 84% number starts to make a lot more sense as a top-of-funnel marketing metric rather than a scientific baseline.
None of this is to say GSB is perfect right now. It has absolutely struggled to keep up with the recent explosion of automated, highly targeted spear-phishing and MFA-bypass proxy kits.
But we should read this report for what it really is: a smart marketing push by a security vendor trying to sell a product, not a sign that the internet's baseline immune system is totally broken.
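To put the "false-positive cliff" in numbers, here is a minimal sketch using Bayes' rule. It asks what share of flagged URLs are really phishing once the traffic mix changes. The detection and false-positive rates are the automatic-scan figures quoted elsewhere in this thread. The two low prevalence values are hypothetical and chosen only for illustration, and `precision` is a helper name introduced here, not anything from the report.

```python
def precision(tpr: float, fpr: float, prevalence: float) -> float:
    """Share of flagged URLs that are really phishing (Bayes' rule)."""
    true_alerts = tpr * prevalence
    false_alerts = fpr * (1.0 - prevalence)
    return true_alerts / (true_alerts + false_alerts)


# Rates quoted in the thread for the automatic scan.
AUTO_TPR = 238 / 254  # phishing sites correctly flagged
AUTO_FPR = 6 / 9      # legitimate sites incorrectly flagged

# First value is the report's own mix (254 of 263 URLs were phishing).
# The other two are hypothetical prevalences, for illustration only.
for prevalence in (254 / 263, 0.01, 0.0001):
    p = precision(AUTO_TPR, AUTO_FPR, prevalence)
    print(f"phishing share {prevalence:.4%}: {p:.2%} of flagged URLs are phishing")
```

The same false-positive rate that looks tolerable on a test set made almost entirely of phishing pages yields mostly wrong alerts once legitimate pages dominate, which is the situation a global default faces.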
Medowar: > We also ran the full dataset of 263 URLs (254 phishing, 9 confirmed legitimate) through Muninn's automatic scan. This is the scan that runs on every page you visit without any action on your part. On its own, the automatic scan correctly identified 238 of the 254 phishing sites and only incorrectly flagged 6 legitimate pages. [...] The tradeoff is that it flagged all 9 of the legitimate sites in our dataset as suspicious, ...
Am I missing something or is that a 66%/100% false positive rate on legitimate sites? If GSB had that ratio, it would be absolutely unusable. So comparing these two is absolutely wrong...
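The rates being debated can be checked directly from the counts quoted above. The following is a small sketch, not anything from the report itself: `ScanResult` and its field names are made up for illustration, and only the four counts per scan come from the quoted text.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ScanResult:
    phishing_total: int    # confirmed phishing URLs in the test set
    phishing_flagged: int  # of those, how many the scan flagged
    legit_total: int       # confirmed legitimate URLs in the test set
    legit_flagged: int     # of those, how many the scan flagged anyway

    @property
    def false_negative_rate(self) -> float:
        missed = self.phishing_total - self.phishing_flagged
        return missed / self.phishing_total

    @property
    def false_positive_rate(self) -> float:
        return self.legit_flagged / self.legit_total


# Counts as quoted from the report in this thread.
automatic = ScanResult(phishing_total=254, phishing_flagged=238,
                       legit_total=9, legit_flagged=6)
deep = ScanResult(phishing_total=254, phishing_flagged=254,
                  legit_total=9, legit_flagged=9)

for name, result in (("automatic", automatic), ("deep", deep)):
    print(f"{name}: FNR {result.false_negative_rate:.1%}, "
          f"FPR {result.false_positive_rate:.1%}")
```

With only 9 legitimate URLs, each single site moves the false positive rate by about 11 percentage points, so these figures are rough at best.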
xvector: There's probably like one engineer maintaining this as a side project at the company
andor: Yeah, it would be interesting to know how much work is spent on it. I sometimes submit sites when I am targeted by a campaign, but I'm not sure if they end up in their deny-list.
ajross: > I always take these vendor reports with a massive grain of salt.
Yeah. "Here's a blog post with some casually collected numbers about our product [...] It turns out that it's great!" is sorta boring. But couple that with a headline framed as "Google [...] Bad" and straight to the top of the HN front page it goes!
loloquwowndueo: Would you use anything that was only 16% effective for its claimed purpose? “Tylenol stops headaches in 16% of people” - it’s huge, right? That’s millions of people we’re talking about. Would you use it?
mock-possum: Idk why not? What’re the side effects?
obblekk: Maybe I’m an outlier but I’d rather this than accidentally block legit sites. Otherwise this becomes just another tool for Google to wall in the subset of the internet they like.
nubinetwork: > I always take these vendor reports with a massive grain of salt. It’s a classic case of comparing apples to vendor-marketing oranges. A headline screaming about an 84% miss rate sounds like a systemic collapse until...
I've seen this before in the IP blocklist space... if you're layering up firewall rules, you're bound to see the higher priority layers more often. That doesn't mean the other layers suck, security isn't always an A or B situation... On the other hand, I don't know how I feel about how GSB is implemented... you're telling Google every website you go to, but chances are the site already has Google Analytics or SSO...
ApolloFortyNine: The 9/9 is actually crazy, and then they posted about it as if they found something? What they did was find a major issue in their own process and then told the world about it, that just doesn't seem right.
hedora: So, the false negative rate was 84%, but what was the false positive rate? They have a table "AUTOMATIC SCAN RESULTS (263 URLS)" that sort of presents this information. Of the 9 sites that were negatives, they say they incorrectly flagged 6 as phishing. With a false positive rate of 66%, it's not surprising they were able to drive down their false negative rate. Also, the test set of 254 phishing sites with 9 legitimate ones is a strange choice. (Or maybe they need to work on how they present data in tables; tl;dr the supporting text.)
decimalenough: The false positive rate was 66% for "automatic scan" and 100% (!) for "deep scan".In other words, you can get these numbers if your deep scan filter is isSuspicious() { return true; }.
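The point about `return true` is easy to demonstrate. In this sketch the URLs are placeholders generated to match the report's counts (254 phishing, 9 legitimate), not the real test set, and `is_suspicious` is a stand-in for the joke above rather than a claim about how the vendor's scanner actually works.

```python
def is_suspicious(url: str) -> bool:
    """Degenerate classifier: flags every URL without inspecting it."""
    return True


# Placeholder URLs sized to match the counts quoted from the report.
phishing = [f"https://phish-{i}.example" for i in range(254)]
legit = [f"https://legit-{i}.example" for i in range(9)]

missed = sum(not is_suspicious(url) for url in phishing)
wrongly_flagged = sum(is_suspicious(url) for url in legit)

print(f"false negatives: {missed} of {len(phishing)}")
print(f"false positives: {wrongly_flagged} of {len(legit)}")
```

Zero false negatives on its own says nothing about a detector. It only becomes meaningful alongside a false positive rate measured on a reasonably sized set of legitimate sites.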
trehalose: It would seem their service identifies only phishing sites as legitimate ones. It would seem 100% of sites they deem legitimate are phishing sites. Incredible.
blell: Educate yourself on how it works before you say something like this.
sirpilade: Pun aside, I cannot fully trust a centralized URL checker on a remote server that I don’t own, even if they guarantee that my privacy is safe
thrwaway55: The deep scan detected all phishing sites correctly with the unfortunate tagging of legit sites as phishing too. I imagine their code looks something like isPhishing = true.
saalweachter: Crazy, and also like, 9? The sample size in that part of your test suite is 9?
HeatrayEnjoyer: Countless medications have <16% efficacy rate.
caaqil: Yeah, maybe let's change the title to remove that 84% rate. It's meaningless because it's just 254 websites, given the scale of what Google Safe Browsing deals with. How is this serious? This is marketing slop. If the title isn't enough of an indicator, the ending should be:
> If you're interested in trying Muninn, it's available as a Chrome extension. We're in an early phase and would genuinely appreciate feedback from anyone willing to give it a shot. And if you run across phishing in the wild, consider submitting it to Yggdrasil so the data can help protect others.