• CompactFlax@discuss.tchncs.de · 20 hours ago

    Entirely agree with that. I’d only add that so is Dario Amodei.

    I think it’s got potential, but the cost and the accuracy are two pieces that need to be addressed. DeepSeek is headed in the right direction, if only because they didn’t have the insane dollars that Microsoft and Google throw at OpenAI and Anthropic respectively.

    Even with massive efficiency gains, though, the hardware market is going to do well if we’re all running local models!

    • brucethemoose@lemmy.world · 18 hours ago

      Alibaba’s QwQ 32B is already incredible, and runnable on 16GB GPUs! Honestly it’s a bigger deal than DeepSeek R1, and many open models before it were too; they just didn’t get the finance-media attention DeepSeek got. And Alibaba is releasing a new series this month.
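
      If you want to try it, here’s a minimal sketch using llama-cpp-python with a GGUF quant of Qwen/QwQ-32B. The file name below is just a placeholder; the real point is that an IQ3-class quant (roughly 13-14 GB) is what fits a 32B model into 16 GB of VRAM with room left for context:

      ```python
      # Minimal sketch: run a quantized QwQ-32B GGUF locally with llama-cpp-python.
      # The model file name below is hypothetical; use any ~3-bit GGUF quant
      # of Qwen/QwQ-32B. An IQ3-class file is roughly 13-14 GB, which leaves
      # room for the KV cache on a 16 GB GPU.
      from llama_cpp import Llama

      llm = Llama(
          model_path="QwQ-32B-IQ3_XXS.gguf",  # hypothetical local file
          n_gpu_layers=-1,  # offload every layer to the GPU
          n_ctx=8192,       # context window; shrink this if VRAM runs out
      )

      out = llm.create_chat_completion(
          messages=[{"role": "user", "content": "Why is the sky blue?"}],
          max_tokens=512,
      )
      print(out["choices"][0]["message"]["content"])
      ```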

      Microsoft just released a 2B BitNet model today! And that’s from their paltry, underfunded research division, not the one training “usable” models: https://huggingface.co/microsoft/bitnet-b1.58-2B-4T
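
      For anyone wondering what “bitnet” means: in BitNet b1.58, every weight is constrained to {-1, 0, +1} (log2(3) ≈ 1.58 bits per weight), so matrix multiplies collapse into additions and subtractions. Here’s a minimal sketch of the absmean quantization scheme from the paper; this is the idea, not Microsoft’s actual training code:

      ```python
      # Minimal sketch of BitNet b1.58-style "absmean" weight quantization:
      # squash a weight matrix to {-1, 0, +1} plus one shared scale factor.
      # Illustrative only; not the code behind bitnet-b1.58-2B-4T.
      import numpy as np

      def quantize_ternary(w: np.ndarray, eps: float = 1e-5):
          """Quantize weights to {-1, 0, +1} with a per-tensor absmean scale."""
          scale = np.abs(w).mean()  # gamma: mean absolute weight
          w_q = np.clip(np.round(w / (scale + eps)), -1, 1)
          return w_q.astype(np.int8), scale

      w = np.random.randn(4, 4).astype(np.float32)
      w_q, scale = quantize_ternary(w)
      print(w_q)                             # ternary matrix
      print(np.abs(w - w_q * scale).mean())  # mean reconstruction error
      ```

      Ternary weights kill the expensive multiply in every matmul, which is why even a small bitnet model matters so much for CPU and low-end-GPU inference.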

      Local, efficient ML is coming. That’s why Altman and everyone are lying through their teeth: scaling up infinitely is not the way forward. It never was.