Hi all, I’d like to hear some suggestions on self-hosting LLMs on a remote server and accessing them via a client app or a convenient website. I’d love to hear about your setups, or about products that left a good impression on you.

I’ve hosted Ollama before, but I don’t think it’s intended for remote use. On the other hand, I’m not really an expert, and maybe there are other options, like add-ons.
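For context, the closest I’ve found with Ollama is setting OLLAMA_HOST=0.0.0.0 on the server and calling its REST API from another machine. A minimal sketch of that (the server address is a placeholder, and note that Ollama has no built-in authentication, which is part of why I’m unsure about remote use):

    import requests

    # Placeholder address; Ollama listens on 127.0.0.1:11434 by default,
    # so the server side needs OLLAMA_HOST=0.0.0.0 (ideally behind a
    # reverse proxy that adds auth, since Ollama itself has none).
    OLLAMA_URL = "http://my-server.example.com:11434"

    resp = requests.post(
        f"{OLLAMA_URL}/api/generate",
        json={
            "model": "llama3.1",      # any model already pulled on the server
            "prompt": "Why is the sky blue?",
            "stream": False,          # one JSON object instead of a stream
        },
        timeout=120,
    )
    resp.raise_for_status()
    print(resp.json()["response"])

Once that works, anything that speaks the Ollama API (including web front-ends like Open WebUI) can point at the same URL.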

Thanks in advance!

  • hendrik@palaver.p3x.de · 20 days ago

    What’s the difference regarding this task? You can rent one 24/7 as a crude webserver, or run a Linux desktop inside it. Pretty much everything you could do with other kinds of servers; I don’t think the exact technology matters. It could be a VPS, virtualized with KVM, or a container. And for AI workloads, containers have several advantages: you can spin them up within seconds, scale them, and so on. I mean, you’re right, this isn’t a bare-metal server that you’re renting. But I think it aligns well with OP’s requirements?!
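    To illustrate the spin-up point: starting such a container programmatically takes a few lines with the Docker SDK for Python. A rough sketch, assuming the official ollama/ollama image and the NVIDIA container toolkit on the host (nothing here is specific to any one provider):

        import docker

        client = docker.from_env()

        # One-off, unhardened run of the ollama/ollama image: publish the
        # API port, persist models in a named volume, pass the GPU through.
        container = client.containers.run(
            "ollama/ollama",
            detach=True,
            ports={"11434/tcp": 11434},
            volumes={"ollama": {"bind": "/root/.ollama", "mode": "rw"}},
            device_requests=[
                docker.types.DeviceRequest(count=-1, capabilities=[["gpu"]])
            ],
        )
        print(container.short_id)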

      • ddh@lemmy.sdf.org · 17 days ago

        Running an LLM can certainly be an on-demand service. Apart from training, which I don’t think we are discussing, GPU compute is only used while responding to prompts.
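        With Ollama, for example, you can watch that happen: the keep_alive field controls how long a model stays loaded after a request, and GET /api/ps shows what’s currently sitting in VRAM. A small sketch (server address is a placeholder):

            import time

            import requests

            OLLAMA_URL = "http://my-server.example.com:11434"  # placeholder

            # Answer one prompt, then evict the model 30 s after the request.
            requests.post(f"{OLLAMA_URL}/api/generate", json={
                "model": "llama3.1",
                "prompt": "One-line summary of photosynthesis, please.",
                "stream": False,
                "keep_alive": "30s",
            }, timeout=120)

            print(requests.get(f"{OLLAMA_URL}/api/ps").json())  # model loaded
            time.sleep(40)
            print(requests.get(f"{OLLAMA_URL}/api/ps").json())  # model gone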