Entirely agree with that. Except to add that so is Dario Amodei.
I think it’s got potential, but the cost and the accuracy are two pieces that need to be addressed. DeepSeek is headed in the right direction, if only because they didn’t have the insane dollars that Microsoft and Google throw at OpenAI and Anthropic, respectively.
Even with massive efficiency gains, though, the hardware market is going to do well if we’re all running local models!
Alibaba’s QwQ 32B is already incredible, and runnable on 16GB GPUs! Honestly, it’s a bigger deal than DeepSeek R1, and so were many open models before it; they just didn’t get the finance media attention DeepSeek got. And Alibaba is releasing a new series this month.
Microsoft just released a 2B BitNet model today! And that’s from their paltry, underfunded research division, not the one training “usable” models: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T
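For anyone wondering how a 32B model squeezes onto a 16GB card, or why a 1.58-bit model is interesting, here's a rough back-of-envelope sketch. It only counts the weights (parameters × bits per weight ÷ 8) and ignores the KV cache, activations, and any layers kept at higher precision. The parameter counts are just the nominal "32B" and "2B" from the model names, and the bit widths are illustrative assumptions, not the settings anyone actually ships.

```python
def weight_memory_gb(n_params: float, bits_per_weight: float) -> float:
    """Approximate memory for the model weights alone, in GB (10^9 bytes)."""
    return n_params * bits_per_weight / 8 / 1e9


# Nominal parameter counts read off the model names (assumption),
# paired with illustrative bit widths (also assumptions).
scenarios = [
    ("32B at 16-bit", 32e9, 16),
    ("32B at 4-bit", 32e9, 4),
    ("32B at 3.5-bit", 32e9, 3.5),
    ("2B at 1.58-bit", 2e9, 1.58),
]

for name, n_params, bits in scenarios:
    print(f"{name}: ~{weight_memory_gb(n_params, bits):.2f} GB of weights")
```

By this estimate, a 32B model at 4 bits per weight already needs about 16 GB for weights alone, so fitting it on a 16GB GPU presumably means quantizing a bit below 4 bits or offloading some layers. A nominal 2B model at 1.58 bits per weight comes out well under half a gigabyte.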
Local, efficient ML is coming. That’s why Altman and everyone are lying through their teeth: scaling up infinitely is not the way forward. It never was.