Discussion
jgbuddy: Worth noting that this model, unlike almost all qwen models, is not open-weight, nor is the parameter count exposed. Also odd that it is compared against opus 4.5 even though 4.6 was released like 2 months ago.
pferdone: They said in the last paragraph [0]: "[...] In the coming days, we will also open-source smaller-scale variants, reaffirming our commitment to accessibility and community-driven innovation. [...]"

[0] https://qwen.ai/blog?id=qwen3.6#summary--future-work
srmatto: The benchmarks provided are for Opus-4.5, not for the latest Opus-4.6 and Qwen is still lagging in a lot of them.
Aurornis: There is no reason to benchmark against Opus 4.5 when Opus 4.6 has been out so long, other than to be misleading.
Aurornis: This is their hosted-only model, not an open weight model like they’ve become known for. They got a lot of good publicity for their open weight model releases, which was the goal. The hard part is pivoting from an open weight provider to being considered a competitor to Claude and ChatGPT. Initial reactions are mostly anger from everyone who didn’t realize that the play all along was to give away the smaller models as advertising, not because they were just feeling generous.

Comparing to Opus 4.5 instead of the current 4.6 is clearly an attempt to deceive, which isn’t winning them any points either.

I think there is a moderately large market for models like this that aren’t quite SOTA level but can be served up much cheaper. I don’t know how successful they’ll be in the race to the bottom in this market niche, though. Most users of cheap API tokens are not loyal to any brand and will change providers overnight each time someone releases a slightly better model.
cubefox: > I think there is a moderately large market for models like this that aren’t quite SOTA level but can be served up much cheaper.

There isn't, pretty much everyone wants the best of the best.
deaux: > we will also open-source smaller-scale variants

In other words, like GP said, this Qwen3.6-Plus model is not open-weight, unlike the other Qwen models.
pferdone: > unlike almost all qwen models

"Almost all" means there have been ones before that were not open. So, no contradiction there.
eis: Quite strong results in the benchmarks, but why Gemini 3 Pro instead of 3.1? Why only for a few of the benchmarks? Why is OpenAI not there in the coding benchmarks? Why Opus 4.5 and not 4.6? It jumps out at me as a bit strange.

As always, we'll have to try and see how it performs in the real world, but the open weight models of Qwen were pretty decent for some tasks, so I'm still excited to see what this brings.
woeirua: Just more evidence that the B tier models are six months behind. Ultimately that’s good. Opus 4.6 level intelligence will be cheap later this year!
daft_pink: Not really interested in using models hosted on Alibaba Cloud. I like Qwen local for its privacy, but I trust the privacy of Google/OpenAI/Anthropic more than Alibaba.
esafak: Does anyone have experience with Alibaba's coding plan; presumably the place to use this?
sidrag22: Maybe there isn't, but as understanding grows people will realize that having an orchestration agent delegate simple work to lesser agents is significant not only for cost savings, but also for preserving context window space.
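The delegation idea above can be sketched as a trivial router. Everything here is a placeholder assumption on my part: the model names, the complexity heuristic, and the thresholds — real orchestrators use far richer signals than task length.

```python
# Sketch: an orchestrator routes simple subtasks to a cheap model and keeps
# the strong model for hard ones. Besides cost, this keeps the orchestrator's
# own context window free of low-value tool output.

CHEAP_MODEL = "qwen3.5-27b"    # hypothetical small open-weight worker
STRONG_MODEL = "qwen3.6-plus"  # hypothetical hosted flagship

def pick_model(task: str, context_tokens: int) -> str:
    """Route a subtask: short, self-contained work goes to the cheap model.

    The heuristic (task length + accumulated context) is a stand-in for
    whatever complexity signal a real system would use.
    """
    simple = len(task) < 200 and context_tokens < 4000
    return CHEAP_MODEL if simple else STRONG_MODEL
```

For example, `pick_model("rename this variable", 500)` would delegate to the cheap worker, while a long refactoring brief with a large accumulated context would go to the flagship.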
kennywinker: > unlike the other Qwen models

Please send the download link for qwen 3.5-plus.

Also, who cares? If you have the hardware to run a ~400b model I don’t think you count as a home user anymore.
noman-land: OpenCode allows for free inference tho.
giancarlostoro: I hope their open source variants are just as good, having a 1 million token window for a fully offline model would be VERY interesting.
the_pwner224: I had the exact opposite reaction. I stopped using OpenAI/Google a while ago due to privacy and moved to local Qwen, now I'm considering using Alibaba cloud. You know Google and OpenAI are going to share everything with the US government and Western ad networks. But with Alibaba, who cares if the CCP & Chinese ad networks have a comprehensive profile on me? From a pragmatic perspective it's much better for (outcomes related to) privacy.
zobzu: So if China has the data, good; if the US has the data, bad. Got it, lol.

The US actually has laws around this, and they aren't sharing very much with the US gov today. China shares 100% as required by law. And neither cares much about "how long do I cook eggs for", but they do care about code generation a lot.
linolevan: I’m surprised that people are surprised. Qwen has been hosting private plus and max variants for a while now.
zozbot234: I wouldn't say "almost all" seeing as -MAX and -Omni models were always closed.
sosodev: I don't know how well it performs, but you can extend Qwen3.5 to 1 million token context using YaRN. Also, Nemotron 3 Super was recently released and scales up to 1 million token context natively.
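At the config level, the YaRN extension sosodev mentions amounts to adding a `rope_scaling` entry to a HuggingFace-style model config. A minimal sketch, assuming a config dict with `max_position_embeddings` and a 4x scaling factor (the helper name is mine, and the right factor depends on the checkpoint's trained context length):

```python
def apply_yarn(cfg: dict, factor: float = 4.0) -> dict:
    """Return a copy of an HF-style model config with YaRN rope scaling
    enabled, extending the usable context window by `factor`.

    Assumes cfg carries the checkpoint's trained context length under
    "max_position_embeddings".
    """
    out = dict(cfg)
    out["rope_scaling"] = {
        "rope_type": "yarn",
        "factor": factor,
        "original_max_position_embeddings": cfg["max_position_embeddings"],
    }
    return out
```

With a hypothetical 262144-token base window, a factor of 4.0 is what would take you to roughly 1M tokens; quality at the far end of the extended window is the part you'd have to measure yourself.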
CamperBob2: As with all arguments equivalent to "I have nothing to hide, so I have nothing to fear," it may be true now, but it may not be true later. The only certainty is that this will not be your call.
the_pwner224: Agreed
wongarsu: For coding I want the best. But both I and $work do lots of things besides coding where smaller models like qwen3.5-27b work great, at much lower cost.
furyofantares: I'll diverge from some of these comments: I don't find it misleading to compare to Opus 4.5.

I can remember how good Opus 4.5 was. If I'm considering using this, the most informative comparison is against the model it's closest to that I have familiarity with.

I'm obviously not switching to this if I want the best model. I'm switching if I'm hopeful that the smaller versions are close to it, or if I want more options for providers, or for any other reason unrelated to getting the highest quality responses possible.
wongarsu: From an espionage perspective your own government is the safest. But from a civil rights perspective your own government is your most immediate threat. China isn't going to arrest me for my opinions on Netanyahu; my own government could.

And the US government has repeatedly shown that it is very interested in collecting all the data available, not unlike China. In China this is simply done in the open, while the US has a veneer of protection for citizens. But where the data collection is forbidden by law, they either ignore the law or ask another Five Eyes member to do the spying and share the results. Both are well documented.
thraxil: No. Right now I'm upset that Google has removed (or at least is in the process of removing) the Gemini 2.0 flash model. We use it for some pretty basic functionality because it's cheap and fast and honestly good enough for what we use it for in that part of our app. We're being forced to "upgrade" to models that are at least 2.5 times as expensive, are slower and, while I'm sure they're better for complex tasks, don't do measurably better than 2.0 flash for what we need. Yay. We've stuck with the GCP/Gemini ecosystem up until now, but this is kind of forcing us to consider other LLM providers.
wg0: It hallucinates a lot more than Sonnet or even MiniMax M2.5. Especially in tool calls, it would end up duplicating the content in code files, then realising it later and getting stuck in a loop.
simonw: Pretty solid Pelican: https://gist.github.com/simonw/ca081b679734bc0e5997a43d29fad...I used the https://modelstudio.alibabacloud.com/ API to generate that one, which required signing up for an account and attaching PayPal billing - but it looks like OpenRouter are offering it for free right now so I could have used that: https://openrouter.ai/qwen/qwen3.6-plus:free
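For reference, the OpenRouter route simonw mentions is their OpenAI-compatible chat completions endpoint; a sketch using only the Python standard library, with the model id taken from the linked page and a placeholder API key:

```python
# Sketch: call qwen3.6-plus via OpenRouter's OpenAI-compatible
# chat completions API. The API key is a placeholder.
import json
from urllib import request

def build_request(prompt: str, api_key: str) -> request.Request:
    """Build (but don't send) a chat completions request for the model."""
    body = json.dumps({
        "model": "qwen/qwen3.6-plus:free",
        "messages": [{"role": "user", "content": prompt}],
    }).encode()
    return request.Request(
        "https://openrouter.ai/api/v1/chat/completions",
        data=body,
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
    )

# To actually run it (network call, needs a real key):
# resp = request.urlopen(build_request("Generate an SVG of a pelican riding a bicycle", "sk-or-..."))
```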
Alifatisk: I understand people's reactions to the Qwen team comparing against Opus 4.5 instead of 4.6, and against Gemini Pro 3.0 instead of 3.1. But calling it misleading is a bit of a stretch in my eyes; people here are acting like we immediately forgot how previous generations performed just because a new version was released.

This field is moving at an incredible pace; the providers release a new model every quarter or so. The amount of criticism is a bit overblown in my opinion. The benchmarks still look very good to me. I've used GLM-5 (latest is GLM-5.1) and Kimi K2.5; they are decent and get the job done, so seeing how this Qwen model performs compared to them is kinda impressive.

Also, why are so many pointing out the fact that this model is not open-weight as if this is their first time doing so? Qwen-3.5-plus and Qwen-3-Max are also closed source. This is not something new.

I think Qwen trying to catch up to the SOTA models is still healthy for us, the consumers. Sure, it's sad news that this version is closed-weight, but I won't downplay their progress.
nickvec: I think it’s more the principle of deception that upsets people. Imagine if Apple released a new iPhone and publicly compared its specs to some previous gen Android. It’s not in good faith.
Alifatisk: Why are we so quick to call it deception? Their figure is quite clear. They aren't fiddling with the graph or hiding the labels, they are clearly stating which models it compares against. But I agree on the sentiment that the standard practice should be to bench against the latest SOTA models.
patates: Even if openly stated, why would they be comparing to a previous generation if not for deception? Laziness? Lack of time? It's not like the latest generation of the SOTA models was released yesterday.
shubhamgarg86: The comparison is helpful, but I'd want to see how it handles edge cases.
jstummbillig: > Initial reactions are mostly anger from everyone who didn’t realize that the play all along was to give away the smaller models as advertising, not because they were feeling generous.

The naivety around this has been staggering, quite frankly. All of a sudden, people think that Meta etc. are releasing free models because they believe in open access and distribution of knowledge. No, they just suck comparatively. There is nothing to sell. Using it to recruit and generate attention is the best play for them.
miki123211: I thought Qwen was releasing open-weight because China can't compete with America (because of people's privacy concerns), so the only thing they could do is salt the ground economically with open models, and make sure everybody loses.Apparently that wasn't actually the play here.