Discussion
Kimi K2.6: Advancing Open-Source Coding
irthomasthomas: Beats opus 4.6! They missed claiming the frontier by a few days.
BoorishBears: Opus is clearly a sidegrade meant to help Anthropic manage cost, so I'd say they may have claimed the frontier if this actually beats 4.6
NitpickLawyer: While I'm skeptical of any "beats Opus" claims (many have been made, none turned out to be true), I still think it's insane that we can now run close-to-SotA models locally on ~$100k worth of hardware, for a small team, and be 100% sure that the data stays local. Should be a no-brainer for teams that work in areas where privacy matters.
nickandbro: Wow, if the benchmarks check out with the vibes, this could almost be a DeepSeek moment, with Chinese AI now neck and neck with SOTA US-lab models
motoboi: With the previous generation? Yes. With 10T mythos-level models? Not even close.
jollymonATX: According to the benchmarks, you are wrong; it is on track and slightly above some SOTA models. That's just the benchmarks speaking, though; they can be (and are) gamed by all the big model labs, domestic ones included.
esafak: K2.5 was already pretty decent so I would try this. Starting at $15/month: https://www.kimi.com/membership/pricing
wg0: How are the usage limits compared to Anthropic?
greenavocado: I pray the benchmark figures are true so I can stop paying Anthropic, after they screwed me over this quarter by dumbing down their models, making usage quotas ridiculously small, and demanding KYC paperwork.
pt9567: wow - $0.95 input / $4 output. If it's anywhere near Opus 4.6, that's incredible.
verdverm: https://huggingface.co/moonshotai/Kimi-K2.6 — is this the same model? Unsloth quants: https://huggingface.co/unsloth/Kimi-K2.6-GGUF (work in progress, no GGUF files yet; a header message says as much)
corlinp: This should erase any doubt that AI labs are making $$$ on API inference. Kimi 2.5 (which this is based on) is served at $0.44 input / $2 output by a ton of different providers on OpenRouter; 2.6 will certainly be similar. That's about 11X less than Opus for similar smarts.
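To put the ~11X multiplier above in context, a quick sketch. The Kimi figures ($0.44 / $2 per Mtok) come from the comment above; the Opus figures are an assumption based on commonly cited list prices, not from this thread:

```python
# Rough per-million-token cost comparison.
# Kimi K2.5 OpenRouter pricing is from the comment above.
# Opus pricing of $5 input / $25 output is assumed for illustration.

kimi = {"input": 0.44, "output": 2.00}
opus = {"input": 5.00, "output": 25.00}  # assumed

for kind in ("input", "output"):
    ratio = opus[kind] / kimi[kind]
    print(f"{kind}: {ratio:.1f}x cheaper")
# input: 11.4x cheaper
# output: 12.5x cheaper
```

Both ratios land in the 11–12x range, matching the "about 11X" claim under those assumed prices.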
Lalabadie: Famously, OpenAI and Anthropic are devoted to increasing efficiency before scaling up resource usage.
osti: I think this one only needs about 600GB of VRAM, so it could fit on two Mac Studios with 512GB each. That setup would have cost (though it's no longer available) something under $20k.
Balinares: Quite curious how well real usage will back the benchmarks, because even if it's only Opus ballpark, open weights Opus ballpark is seismic.
ai_fry_ur_brain: It's not anywhere close, and if it were, nobody in the USA would be spending 7 figures on infrastructure for it. You LLM people are all serious cases of Dunning-Kruger here.
cassianoleal: If only their API wasn't tied to a Google or phone login...
fintechie: Gonna give this one a go... the previous 2.5 model is what Cursor uses for Composer 2 Fast. After a few weeks of real-world tasks, I've seen that it can be very dumb or very good (better than Opus 4.7), depending on the problem you throw at it. Sometimes a single prompt/response pass can unblock you on issues where Opus ate $100+ in API credits and circled for hours. Other times the response is useless, but it's your responsibility as an engineer to discern this. Verdict (at least for me): use both.
dmix: I'm pretty sure Kimi is what Cursor uses for their "composer 2" model. Works pretty well as a fallback when Claude runs out, but it's definitely a downgrade.
simonw: Accessed via OpenRouter, this one decided to wrap the SVG pelican in HTML with controls for the animation speed: https://gisthost.github.io/?ecaad98efe0f747e27bc0e0ebc669e94...Transcript and HTML here: https://gist.github.com/simonw/ecaad98efe0f747e27bc0e0ebc669...
game_the0ry: There is some humor in the fact that China (of all countries) is pioneering possibly the world's most important tech via open source, while we (the US) are doing the exact opposite.
culi: All great technological advancements have come through opening up technology. Just look at your iPhone: GPS, the internet, AI voice assistants, touchscreens, microprocessors, lithium-ion batteries, etc. all came from gov't research (I'm counting Bell Labs' gov't-mandated monopoly + research funding as gov't) that was opened up for free instead of being locked behind a patent. Private companies will never open up a technological breakthrough to their competitors. It just doesn't make sense. If you want an entire field to advance, you have to open it up.
FlyingSnake: At this point drawing these Pelicans must be in the training data sets.
elfbargpt: I've always been surprised Kimi doesn't get more attention than it does. It's always stood out to me in terms of creativity and quality... it has been my favorite model for a while
culi: It's also one of the few models that seems capable of drawing an SVG clock: https://clocks.brianmoore.com/
sigmoid10: Is it? In your link it definitely failed to draw the clock.
otabdeveloper4: > Its not anywhere close

Close to what, and how are you measuring?

> nobody in the USA would be spending 7 figures on infrastructure for it

Au contraire: if AI had a moat, it would pay for itself. They're funneling capital into infrastructure because they know it can't.
gpm: Huh, so the metadata says 1.1 trillion parameters, each 32 or 16 bits.But the files are only roughly 640GB in size (~10GB * 64 files, slightly less in fact). Shouldn't they be closer to 2.2TB?
brandensilva: We are at the point where uncontrolled capitalism collides with humanity.I do wonder where we go from here.
Aeolun: It’s good, but it’s not quite Claude level. And their API has constant capacity issues.Price/quality is absolutely bonkers though. I loaded $40 a few weeks/months ago and I haven’t even gone through half of it.
johndough: The bulk of Kimi-K2.6's parameters are stored with 4 bits per weight, not 16 or 32. There are a few parameters that are stored with higher precision, but they make up only a fraction of the total parameters.
gpm: Huh, cool. I guess that makes a lot of sense with all the success the quantization people have been having.So am I misunderstanding "Tensor type F32 · I32 · BF16" or is it just tagged wrong?
hn8726: Genuine question: what's the goal of posting this on almost every single new-model thread here on HN? I may be old and grumpy, but to me it got old a while ago, and it's closer to a low-effort Reddit comment
greenavocado: Anthropic has the worst usage limits in the industry
andriy_koval: gemini is worse imo
deaux: You're correct, Gemini chat limits are a joke at their cheapest paid tier compared to both Claude and GPT. Especially crazy when you consider Gemini 3 Pro costs less than half as much as Opus 4.6 on the API. It's hard to run into pure chat limits on Claude even if you only use Opus on the cheapest tier, whereas with Gemini it's easy. Not sure about coding usage; Google being weird about these things, I could see that quota being separate.
amazingamazing: The psyop continues. Until it's released, Mythos is vaporware. Notice how you can try Kimi 2.6. Where is the same for Mythos?
fragmede: It's been released to "select partners".
dryarzeg: I'm not really sure how this works, but I stayed on the page for a while, and then it reloaded and all the clocks changed. I guess there's either a collection of different clocks generated by the models, or maybe they're somehow generated in real time, but the fact is that what you see is not necessarily what I see.
sigmoid10: Seems like it regenerates them to reflect the current time. Funny to see how some models (like Kimi and Deepseek) sometimes get it right and other times fail miserably on the level of ancient models like GPT 3.5.
osti: Maybe open source == communism
tadfisher: Nah, open source means those who do the work own the result. It's supercapitalism.
pheggs: I don't think that's right; the models and the GPUs are the means of production. In capitalism, the people with the capital get the profit, not the people who do the work. However, workers are said to benefit too through their salary, just less so.
antirez: These aren't in antithesis. My limited personal experience is that I wrote code under OSS licenses primarily because of my past communist beliefs and my current left-wing, redistribution-of-wealth point of view. This is not to offer the simplistic equation "communist China is not interested in money", but it would also be wrong to believe that a cultural tradition of people reasoning as a collective rather than as individuals (inside the local borders, but they are huge borders) has zero implications. There is also the obvious fact that at this moment China is more interested in winning technologically in AI than economically, since I believe they realized before the others that LLMs, in their current form, will eventually be commoditized in the long run. One could assume that a breakthrough could give some lab a decisive advantage, but so far we have witnessed a different reality: it looks like AI is not architecture-bound (as LeCun and others want us to believe, though so far they have misinterpreted LLMs at every step) but GPU-bound, and the data-boundedness is both common to all and surpassable via RL in many domains. So, if this is true, it is not trivial for any single lab to do much better. And indeed, as far as we have observed so far, folks with enough engineers, GPUs, and money can ship frontier models, and in China even labs with far fewer GPUs can still do it at a SOTA level.
cedws: Even the smaller quantized models which can run on consumer hardware pack in an almost unfathomable amount of knowledge. I don't think I expected to be able to run a 'local Google' in my lifetime before the LLM boom.
sterlind: I'm extremely curious how these models learn to pack a lossily-compressed representation of the entire Internet (more or less) into a few hundred billion parameters. like, what's the ontology?
arcanemachiner: It's a Kimi K2.5 finetune, there was some drama about this a few weeks ago.
jstummbillig: At this point it seems more like the result of a psyop to presume that a new Anthropic model should be considered vaporware until released.
atemerev: Why use a Chinese model's API from China when there are many independent providers available via OpenRouter?
pheggs: to support the companies that open source their models
throwaw12: Beats Opus and is open source? I really hope this holds true in real-world use cases as well, and not only on benchmarks. Congrats to the Kimi team!
tadfisher: The reason regular capitalism worked is that all production used to depend on workers, who bottlenecked the free flow of capital by demanding salaries in exchange for their labor. Now that we've removed that obstacle, capitalism demands workers seize the means of production in order to maintain the status quo. Hence, supercapitalism.