You would need to run the LLM on the system that has the GPU (your main PC). The front-end (typically a WebUI) could run in a Docker container and make API calls to your LLM system. Unfortunately, that requires the model to stay loaded in VRAM on your main PC, severely limiting what else you can do with that GPU. A minimal sketch of the front-end side of that split is below.
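The front-end container only needs network access to the LLM server's API. Here's a minimal Python sketch, assuming the GPU box runs an OpenAI-compatible endpoint (llama.cpp's llama-server and Ollama both expose one); the address and model name are placeholders you'd swap for your own setup:

```python
# Front-end process (e.g., inside a Docker container) calling an
# OpenAI-compatible LLM server running on the GPU box.
# The host/port and model name below are placeholders.
import requests

LLM_HOST = "http://192.168.1.50:8080"  # your main PC with the GPU

resp = requests.post(
    f"{LLM_HOST}/v1/chat/completions",
    json={
        "model": "local-model",  # backend-specific model name
        "messages": [
            {"role": "user", "content": "Hello from the front-end"},
        ],
    },
    timeout=120,
)
resp.raise_for_status()

# Print the assistant's reply from the standard OpenAI-style response shape.
print(resp.json()["choices"][0]["message"]["content"])
```

The catch is exactly what's described above: for that request to return quickly, the model has to already be resident in VRAM on the GPU box, so the GPU is effectively dedicated to the LLM whenever the front-end is live.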
I don’t want to get in the way of your argument re. Usenet, but spinning hard drives will last longer if they stay on. Starting and stopping the spindle motor will impart the greatest wear. As long as you have the thermals managed, a spinning disk is a happy disk.