• ryathal@sh.itjust.works
    3 months ago

    Both are happening. Samples of casual writing are more valuable than research papers for generating an article, though.

    • FaceDeer@fedia.io
      3 months ago

      Yeah. Scientific papers may teach an AI about science, but Reddit posts teach AI how to interact with people and “talk” to them. Both are valuable.

      • geekwithsoul@lemm.ee
        3 months ago

        Hopefully not too pedantic, but no one is “teaching” AI anything. They’re just feeding it data in the hopes that it can learn probabilities for certain types of output. It “understands” neither the Reddit post nor the scientific paper.

        • hoshikarakitaridia@lemmy.world
          3 months ago

          This might be a wild take but people always make AI out to be way more primitive than it is.

          Yes, in its most basic form an LLM can be described as an auto-complete for conversations. But let’s be real: the many optimizations and adjustments made before and after the fact are pretty complex, and the way the AI works is already pretty close to a brain. Hell, that’s where we started out: emulating a brain. And you can look into this: the base for AI is usually neural networks, which learn to give specific parts of an input a specific amount of weight when generating the output. And when the output is not what we want, the AI slowly adjusts those weights to get closer.

          Our brain works the same way in its most basic form. We use electric signals and we think in associative patterns. When an electric signal enters one node, this node is connected via stronger or weaker bridges to different nodes, forming our associations. Those bridges are exactly what we emulate when we use nodes with weighted connectors in artificial neural networks.
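
          A rough sketch of the "adjust the weights until the output gets closer" idea, assuming a single toy node with two inputs. The function names, learning rate, and numbers are all made up for illustration, and real networks are vastly larger and trained differently in detail:

```python
# Toy illustration only: one artificial "node" with two weighted inputs.
# Every name and number here is invented for the example.

def predict(weights, inputs):
    """Weighted sum of the inputs: each input gets its own weight."""
    return sum(w * x for w, x in zip(weights, inputs))

def train_step(weights, inputs, target, learning_rate=0.1):
    """Nudge each weight slightly in the direction that shrinks the error."""
    error = predict(weights, inputs) - target
    return [w - learning_rate * error * x for w, x in zip(weights, inputs)]

weights = [0.0, 0.0]            # start with no preference for either input
inputs, target = [1.0, 2.0], 3.0

# Repeat the small adjustment many times.
for _ in range(50):
    weights = train_step(weights, inputs, target)

final_output = predict(weights, inputs)
print(final_output)
```

          With these numbers the error halves on every step, so after 50 rounds the output sits very close to the target of 3.0.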

          Quality-wise, our AI output is pretty good right now, but integrity- and security-wise it’s pretty bad (hallucinations, not following prompts, etc.). Saying it is performing at the level of a three-year-old simultaneously undersells and oversells how AI performs. We should be aware that just because it’s AI doesn’t mean it’s good, but it doesn’t mean it’s bad either. It just means there’s a feature (which is hopefully optional) and then we can decide if it’s helpful or not.

          I do music production and I need cover art. As a student, I can’t afford commissioning good artworks every now and then, so AI is the way to go and it’s been nailing it.

          As a software developer, I’ve come to appreciate that after about 2y of bad code completion AIs, there’s finally one that is a net positive for me.

          AI is just like anything else: it’s a tool that brings change. How that change manifests depends on us as a collective. Let’s punish bad AI, dangerous AI or similar (Copilot, Tesla self-driving, etc.) and let’s promote good AI (Gmail text completion, ChatGPT, code completion, image generators), and let’s also realize that the best things we can get out of AI will not hit the ceiling of human products for a while. But if it costs too much, or you need quick pointers, at least you know where to start.

          • geekwithsoul@lemm.ee
            3 months ago

            This shows so many gross misconceptions and with such utter conviction, I’m not even sure where to start. And as you seem to have decided you like to get free stuff that is the result of AI trained off the work of others without them receiving any compensation, nothing I say will likely change your opinion because you have an emotional stake in not acknowledging the problems of AI.

        • Zexks@lemmy.world
          3 months ago

          Describe how you ‘learned’ to speak. How do you know what word comes after the next? Until you can describe this process in a way that doesn’t make it ‘human’ or ‘biological’ only, it’s no different. The only thing they can’t do is adjust their weights dynamically. But that’s a limitation we gave them, not one intrinsic to the system.

          • geekwithsoul@lemm.ee
            3 months ago

            I inherited brain structures that are natural language processors, as well as the ability to understand and repeat any language sounds. Over time, my brain focused in on only the language sounds I heard the most and, through trial and repetition, learned how to understand and make those sounds.

            AI - as it currently exists - is essentially a babbling infant with none of the structures necessary to do anything more than repeat sounds back without understanding any of them. Anyone who tells you different is selling you something.

  • ImplyingImplications@lemmy.ca
    3 months ago

    Because AI needs a lot of training data to reliably generate something appropriate. It’s easier to get millions of Reddit posts than millions of research papers.

    Even then, LLMs simply generate text but have no idea what the text means. They just know those words have a high probability of matching the expected response. They don’t check that what was generated is factual.

      • ulkesh@beehaw.org
        3 months ago

        Because we have brains that are capable of critical thinking. It makes no sense to compare the human brain to the infancy and current inanity of LLMs.

  • tiddy@sh.itjust.works
    3 months ago

    Papers are most importantly a documentation of exactly what procedure was performed and how; adding a vagueness filter over that is only going to decrease its value infinitely.

    The real question is why we are using generative AI at all (it gets money out of idiot rich people).

  • Destide@feddit.uk
    3 months ago

    Redditors are always right, peer-reviewed papers always wrong. Pretty obvious really. :D

    • xmunk@sh.itjust.works
      3 months ago

      What he was fighting for was an awful lot more important than a tool to write your emails while causing a ginormous tech bubble.

    • spongebue@lemmy.world
      3 months ago

      Machine learning has some pretty cool potential in certain areas, especially in the medical field. Unfortunately the predominant use of it now is slop produced by copyright laundering shoved down our throats by every techbro hoping they’ll be the next big thing.

    • UlyssesT [he/him]@hexbear.net
      3 months ago

      It’s marketing hype, even in the name. It isn’t “AI” as decades of the actual AI field would define it, but credulous nerds really want their cyberpunkerino fantasies to come true so they buy into the hype label.

      • queermunist she/her@lemmy.ml
        3 months ago

        Yeah, these are pattern reproduction engines. They can predict the most likely next thing in a sequence, whether that’s words or pixels or numbers or whatever. There’s nothing intelligent about it and this bubble is destined to pop.
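
        To make "predict the most likely next thing in a sequence" concrete, here is a minimal sketch that just counts which word follows which. It is a bigram counter, not how an actual LLM is built, and the tiny corpus and the `most_likely_next` helper are invented for the example:

```python
from collections import Counter, defaultdict

# Toy illustration only: a bigram counter, far simpler than a real LLM.
# The tiny corpus below is invented for the example.
corpus = "the cat sat on the mat and the cat ran".split()

# For every word, count the words that were seen directly after it.
followers = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    followers[current_word][next_word] += 1

def most_likely_next(word):
    """Return the word seen most often after `word`, or None if unseen."""
    counts = followers.get(word)
    if not counts:
        return None
    return counts.most_common(1)[0][0]

print(most_likely_next("the"))
```

        Here "the" is followed by "cat" twice and "mat" once, so the counter picks "cat". It reproduces the pattern in its data without any notion of what a cat is.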

        • UlyssesT [he/him]@hexbear.net
          3 months ago

          That “Frightful Hobgoblin” computer toucher would insist otherwise, claiming that a sufficient number of Game Boys bolted together equals or even exceeds human sapience, but I think that user is currently too busy being a bigoted sex pest.

      • FaceDeer@fedia.io
        3 months ago

        The term AI was coined in 1956 at a computer science conference and was used to refer to a broad range of topics that certainly would include machine learning and neural networks as used in large language models.

        I don’t get the “it’s not really AI” point that keeps being brought up in discussions like this. Are you thinking of AGI, perhaps? That’s the sci-fi “artificial person” variety, which LLMs aren’t able to manage. But that’s just a subset of AI.

  • Rampsquatch@sh.itjust.works
    3 months ago

    You could feed all the research papers in the world to an LLM and it will still have zero understanding of what you trained it on. It will still make shit up, it can’t save the world.

  • r00ty@kbin.life
    3 months ago

    Anyone running a webserver and looking at their logs will know AI is being trained on EVERYTHING. There are so many crawlers for AI that are literally ripping the internet wholesale. Reddit just got in on charging the AI companies for access to freely contributed content. For everyone else, they’re just outright stealing it.

  • cobysev@lemmy.world
    3 months ago

    We are. I just read an article yesterday about how Microsoft paid research publishers so they could use the papers to train AI, with or without the consent of the papers’ authors. The publishers also reduced the peer review window so they could publish papers faster and get more money from Microsoft. So… expect AI to be trained on a lot of sloppy, poorly-reviewed research papers because of corporate greed.

      • Alice@beehaw.org
        3 months ago

        Because that’s what it’s designed for? I’m curious what else it could be good for. A machine capable of independent, intelligent research sounds like a different invention entirely.

        • Melatonin@lemmy.dbzer0.com (OP)
          3 months ago

          It’s sort of like the communication aspect of it isn’t the sole purpose of it. It’s as if we invented computers but the only thing we cared about was the monitor and the keyboard.

          We want it to DO things. Stick to the truth, not just placate.

      • thepreciousboar@lemm.ee
        3 months ago

        Because “AI” as we colloquially know it today means language models: they train on and can produce language, and that’s what they are designed for. Yes, they can produce images and also videos, but they don’t have any form of real knowledge or understanding; they only predict the next word or the next pixel based on their prompt and their vast examples of words and images. You can only talk to them because that’s what they are for.

        Feeding it research papers will make it spit out research-sounding words, which will probably contain some correct information, but at best an LLM trained on that would be useful for searching through existing research. It would not be able to produce new research.