Alibaba Cloud releases Qwen3, an open weight LLM set that outperforms ChatGPT-o1 with only 32B parameters

gay_king_prince_charles [she/her, he/him]@hexbear.net · 1 day ago

Alibaba Cloud releases Qwen3, an open weight LLM set that outperforms ChatGPT-o1 with only 32B parameters

JoeByeThen [he/him, they/them]@hexbear.net · 1 day ago

Already on ollama.

gay_king_prince_charles [she/her, he/him]@hexbear.net · 1 day ago

I’ve found Qwen preferable to DeepSeek for coding so I can’t wait to try this out

JoeByeThen [he/him, they/them]@hexbear.net · 1 day ago

I’ve not used Qwen yet, but I have noticed DeepSeek, specifically R1, is kind of a lazy coder. Lots of ‘step 5: draw the rest of the owl’ type responses.

Unrelated, but does anyone else’s internet speed come to a screeching halt when trying to download models from Ollama? I swear I’m being throttled by Xfinity.

gay_king_prince_charles [she/her, he/him]@hexbear.net · 1 day ago

That might just be LLMs in general. ChatGPT does the same. Copilot is a little better tuned, but I really only ever have it do boilerplate.

JoeByeThen [he/him, they/them]@hexbear.net · 1 day ago

I’ve had really good luck with ChatGPT 4o, and, to be fair, I have teased some decent responses out of DeepSeek 3 (iirc). Different ways of expanding on the basic principle of asking it to ‘step back and visualize different options before moving forward and fully implementing them with all necessary code, following best practices, etc.’ tend to get pretty good results.
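
For anyone who wants to reuse that kind of instruction, here is a rough sketch in Python of wrapping a coding request with it. The helper name `wrap_with_planning` and the exact preamble wording are made up for illustration; nothing here is required by DeepSeek, ChatGPT, or any particular API, and it only builds the prompt text.

```python
# Hypothetical helper: prepend a "plan first, then implement fully" instruction
# to a coding request. The wording is illustrative, not a benchmarked prompt.
PLANNING_PREAMBLE = (
    "Step back and visualize different options before moving forward. "
    "Then fully implement the chosen option with all necessary code, "
    "following best practices. Do not skip or summarize any steps."
)


def wrap_with_planning(task: str) -> str:
    """Return the task text prefixed with the planning preamble."""
    task = task.strip()
    if not task:
        raise ValueError("task must not be empty")
    return f"{PLANNING_PREAMBLE}\n\nTask: {task}"


if __name__ == "__main__":
    print(wrap_with_planning("Write a function that parses ISO 8601 dates."))
```

The resulting string can be pasted into whatever chat window or client you already use; tweak the preamble to taste.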

Qwen3: Think Deeper, Act Faster