• CompactFlax@discuss.tchncs.de
    21 hours ago

    There's OpenAI, Google and Meta (American), Mistral (French), Alibaba and DeepSeek (Chinese). Many more smaller companies either make their own models or further fine-tune specialized models from the big ones.

    Which ones are not actively spending an amount of money that scales directly with the number of users?

    I’m talking about the general-purpose LLM AI bubble, wherein people are expected to return tremendous productivity improvements by using an LLM, thus justifying the obscene investment. Not ML as a whole. There’s a lot there, such as the work your colleagues are doing.

    But it’s being treated as the equivalent of electricity, and it is not.

    • SmokeyDope@lemmy.world
      19 hours ago

      Which ones are not actively spending an amount of money that scales directly with the number of users?

      Most of these companies offer direct web/API access to their own cloud datacenters, and all cloud services have operating costs that scale with usage. The more users connect and use compute, the more hardware, processing power, and bandwidth it takes to serve them all. The smaller fine-tuners like Nous Research, which take a pre-trained, open-licensed model, tweak it with their own dataset, then sell cloud access at a profit with minimal operating cost, will probably do best with the scaling. They are also far cheaper than access to the big models, probably for similar reasons. Mistral and DeepSeek optimize their models for compute efficiency so they can afford to charge less for access.

      OpenAI, Anthropic (Claude), and Google are very expensive compared to the competition and probably still operate at a loss once you count the compute cost of training the models plus the cost of running the datacenters behind their web/API hosting. It's important to note that immediate profit is only one factor here. Many big, well-financed companies will happily eat the L on operating cost and electricity as long as they feel they can solidify their presence in the growing market early and become a potential monopoly in the coming decades. Control, (social) power, lasting influence, data collection: these are some of the other valuable currencies that corporations and governments recognize and will exchange money for.
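      The cost argument above can be put into rough arithmetic. This is only a sketch: the function name, the split into fixed and per-request cost, and every number in it are made-up placeholders, not anyone's real figures. It just shows why the serving bill grows in step with the user count while the fixed part (training, baseline infrastructure) does not.

```python
# Hypothetical cost model: all names and numbers are illustrative placeholders.
def monthly_cost(users, requests_per_user, cost_per_request, fixed_cost):
    """Total monthly cost = fixed cost + usage-driven serving cost."""
    variable_cost = users * requests_per_user * cost_per_request
    return fixed_cost + variable_cost


# Assumed units: arbitrary "cost units", kept as integers to avoid rounding noise.
FIXED = 5000          # training amortization, baseline infrastructure (assumed)
PER_REQUEST = 2       # compute cost of serving one request (assumed)
REQUESTS = 100        # requests per user per month (assumed)

small = monthly_cost(1_000, REQUESTS, PER_REQUEST, FIXED)
large = monthly_cost(2_000, REQUESTS, PER_REQUEST, FIXED)

# Doubling the users doubles the variable part; the fixed part stays put.
print(small, large, large - small)
```

      A provider that fine-tunes someone else's open model has a small fixed part and a cheap per-request part, which is the position described above for the smaller players.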

      But it’s being treated as the equivalent of electricity, and it is not.

      I assume you mean in a tech progression kind of way. A better comparison might be that it's being treated more like the invention of transistors and computers. Before, we could only do information processing with the cold hard certainty of logical bit calculations. We got by for quite a while just cooking up fancy logic programs to process inputs and outputs. Data communication, vector graphics and digital audio, cryptography, the internet: just about everything today is thanks to the humble transistor and logic gate, and the clever brains that assemble them into functioning tools.

      Machine learning models are loosely modeled on the neuron structures of brains and the layered way biological activation patterns encode information. We have found both a way to train trillions of transistors to simulate the basic pattern-organizing systems living beings use, and a point in time at which it's technically possible to have the compute needed to do so. The artificial neuron model dates to the 1940s (McCulloch and Pitts, 1943), and Rosenblatt's perceptron followed in the late 1950s. It took well over half a century for computers and ML to catch up to the point of putting theory into practice. We couldn’t create artificial computer brain structures and integrate them into consumer hardware 10 years ago; the only player then was Google with their billion-dollar datacenter and AlphaGo/DeepMind.
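      For anyone who hasn't seen one, here is roughly what a single perceptron looks like in code. It's a minimal sketch of the classic perceptron learning rule, not how modern LLMs are trained; the function names and the AND-gate toy data are my own choices for illustration. One "neuron" sums its weighted inputs, fires if the sum is above zero, and nudges its weights whenever it gets an answer wrong.

```python
def predict(weights, bias, inputs):
    """Fire (1) if the weighted sum of inputs plus bias is above zero, else 0."""
    activation = bias + sum(w * x for w, x in zip(weights, inputs))
    return 1 if activation > 0 else 0


def train_perceptron(samples, epochs=20, learning_rate=1):
    """Classic perceptron rule: shift weights toward the target on each mistake."""
    weights = [0, 0]
    bias = 0
    for _ in range(epochs):
        for inputs, target in samples:
            error = target - predict(weights, bias, inputs)  # -1, 0, or +1
            weights = [w + learning_rate * error * x
                       for w, x in zip(weights, inputs)]
            bias += learning_rate * error
    return weights, bias


# Toy dataset (illustrative): the logical AND of two bits.
AND_GATE = [((0, 0), 0), ((0, 1), 0), ((1, 0), 0), ((1, 1), 1)]

weights, bias = train_perceptron(AND_GATE)
outputs = [predict(weights, bias, inputs) for inputs, _ in AND_GATE]
print(outputs)  # [0, 0, 0, 1]
```

      The point of the toy is that nobody wrote the AND logic by hand; the weights were learned from examples. Today's models stack enormous numbers of units like this and train them with far more sophisticated methods, which is where the compute bill comes from.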

      It's an exciting new toy that people think can either improve their daily life or make them money, so people get carried away, overpromise with hype, and cram it into everything, especially the stuff it makes no sense being in. That's human nature for you. Only the future will tell whether this new way of processing information will live up to the expectations of techbros and academics.