feddit.de has been giving “Server error” for some time, but I read that the server still works when using a Lemmy app. I tried the Photon front-end today, choosing feddit.de as the instance.
- Example post from All: https://phtn.app/post/feddit.de/12459995
- Example (German) post from Local, 3 hours ago: https://phtn.app/post/feddit.de/12475331
My question (I’m just curious, I have no account on feddit.de) is: can an alternative front-end on their server co-exist with the other server software? I guess it would be a matter of installing Photon and then pointing the nginx configuration at it. Or am I missing something crucial?
Isn’t it a software issue if it breaks over time unless you maintain it? What has happened more specifically? Memory leaks?
Memory leaks?
Possible, but a full disk is much more likely. Not a bug, just something that happens…
Good point, didn’t think of that. That’s not an issue with the software. Although, one could argue that it should not break down and become unresponsive.
A lot of software writes to log files or temporary files or lock files or database transaction logs as part of its normal function and when those writes fail due to a full disk the software doesn’t work anymore.
That’s bad software then, right? The inability to write to disk shouldn’t cause the software to lose all functionality. Unless writing is its only function, or it somehow depends on it for proper functioning. 🤷‍♂️
No. Every good software program should write at least logs to disk. Every good database writes to disk. Add a new post and the server commits it to the db, and the db grows in size.
Name any decent-sized program where new content is added and I guarantee it writes to disk and will fail eventually if not maintained.
Nice down vote. Let’s discuss instead.
I’m saying that the server shouldn’t go down just because new content can’t be added. You should get maybe a 500-series REST response or something. Not… nothing. Ideally it should write to disk. Ideally it should allow new content to be added. But uptime and content access is still more important than being able to write to disk. It should warn the admin of the serious errors, and explain to the user in some diplomatic/apologetic manner. But never go down completely. That’s not resilient at all.
That’s my opinion. 👍
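A minimal sketch of that idea, assuming a toy in-memory store: when a write fails because the disk is full, the write is refused with an error status while reads keep working. `ContentStore` and the `persist` hook are hypothetical names for illustration, and 507 (Insufficient Storage) is just one possible status; 503 would fit too.

```python
import errno


class ContentStore:
    """Toy store. `persist` is a hypothetical hook that writes a post to disk."""

    def __init__(self, persist):
        self.posts = {}
        self.persist = persist
        self.read_only = False  # set once a write fails with "disk full"

    def get(self, post_id):
        # Reads are served from memory, so they keep working on a full disk.
        if post_id in self.posts:
            return 200, self.posts[post_id]
        return 404, "not found"

    def add(self, post_id, body):
        try:
            self.persist(post_id, body)
        except OSError as exc:
            if exc.errno == errno.ENOSPC:
                # Degrade instead of crashing: refuse the write, keep serving reads.
                self.read_only = True
                return 507, "Sorry, posting is temporarily disabled."
            raise  # other I/O errors are not handled in this sketch
        self.posts[post_id] = body
        return 201, "created"
```

A real server would also have to alert the admin when `read_only` flips, and a database that cannot write its own transaction log may still refuse to run at all.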
I’m almost positive it did warn the admin in some way, but the admin was afk for weeks and didn’t see the warnings.
That’s good. 👍 Just one piece of the puzzle though. 😬
For the record I did not downvote.
But I concede your point. It would be great if every piece of software was written with resilience and uptime in mind.
As a former sysadmin that sounds like a dream. But I don’t think I have ever seen that with any mainstream program that I’ve had responsibility for. Does that mean all those programs were bad? I don’t think so. We wouldn’t need sysadmins if all programs were written the way you describe.
Programs can be written to auto-rotate their logs and to compact and reindex their DBs. Using browser updates as an example, they can even safely auto-update and revert on failure.
How many programs actually do these things? My experience is next to 0. But I wouldn’t call them all bad or poorly written programs.
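For the log part at least, this is cheap to do. A small sketch using Python's standard `logging.handlers.RotatingFileHandler`, which caps the size of each log file and keeps a fixed number of old ones; the helper name `make_rotating_logger` and the default sizes are made up for the example.

```python
import logging
from logging.handlers import RotatingFileHandler


def make_rotating_logger(path, max_bytes=1_000_000, backups=3):
    """Return a logger whose files on disk never exceed (backups + 1) files."""
    logger = logging.getLogger(f"rotating:{path}")
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep output out of the root logger
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        handler = RotatingFileHandler(path, maxBytes=max_bytes, backupCount=backups)
        handler.setFormatter(
            logging.Formatter("%(asctime)s %(levelname)s %(message)s")
        )
        logger.addHandler(handler)
    return logger
```

With these settings the logs are bounded at roughly `max_bytes * (backups + 1)` bytes, so the logs alone cannot fill the disk. It does nothing about uploads or database growth.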
Fun fact: Old school admins used to write a large-ish (~5% hdd space) file of random data to the drive right after installing the server. If the hard drive ran full without anyone noticing, you could just delete the file to get some breathing room to deal with the issue. It’s a very crude alarm system, but one you WILL notice when it goes off even if you ignore all emails.
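A rough sketch of creating such a ballast file, assuming the roughly 5% figure from above. The function name `create_ballast` and its parameters are invented for the example. Random bytes are used presumably so the file cannot be stored sparsely or compressed away by the filesystem.

```python
import os
import shutil


def create_ballast(path, fraction=0.05, chunk_size=1024 * 1024, size_bytes=None):
    """Write a file of random bytes and return its size.

    By default the size is `fraction` of the total size of the filesystem
    that holds `path`; pass `size_bytes` to override that.
    """
    if size_bytes is None:
        directory = os.path.dirname(os.path.abspath(path))
        size_bytes = int(shutil.disk_usage(directory).total * fraction)
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(chunk_size, remaining)
            f.write(os.urandom(n))  # write in chunks to keep memory use small
            remaining -= n
        f.flush()
        os.fsync(f.fileno())  # make sure the space is really allocated on disk
    return size_bytes
```

When the disk runs full, deleting that one file frees enough room to log in, clean up and restart services.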
I’m slightly confused. They will notice it, but somehow it still happened without anyone noticing? I feel like I lost something in that text. 😅 What makes that trick such a good alarm system?
The server’s hard drive filled up completely from image uploads, which in turn corrupted the database.
By the time the admin noticed it and had time to troubleshoot, the automated backup process had replicated the corrupted database, overwriting all backups that had a still-functioning database.
There was some personal event in the admin’s life as well as a long planned vacation at that time.
The lesson to be learned here is that a private server administered by a single person in their spare time isn’t something you can rely on.
The Feddit community is currently trying to found an organisation that can share the administrative load and is able to receive donations.
This is so baffling to me. How does this happen? There need to be checks in place so that this can’t happen ffs lol. No space left on the device to complete the write? Abort. Or like, starting to run out of space => stop accepting new content until fixed, to protect the integrity of the data. Something.
Anyway, I hope they manage to find a solution to sharing the load of work! 🙏😌
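The "stop accepting new content when space runs low" idea could look roughly like this. It is only a sketch: `has_headroom`, `handle_upload`, the 10% threshold and the 503 response are assumptions for illustration, not how Lemmy or any image host actually behaves.

```python
import shutil


def has_headroom(path, min_free_fraction=0.10, usage=shutil.disk_usage):
    """Return True if the filesystem holding `path` has enough free space."""
    stats = usage(path)
    return stats.free / stats.total >= min_free_fraction


def handle_upload(path, save, usage=shutil.disk_usage):
    """Run `save` (a hypothetical callback that stores the upload) only if safe."""
    if not has_headroom(path, usage=usage):
        # Refuse early, before the database gets a chance to hit a full disk.
        return 503, "Uploads are paused while the admin frees up disk space."
    save()
    return 201, "created"
```

The `usage` parameter exists only so the check can be tested with fake numbers; in normal use the default `shutil.disk_usage` is what you want.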
You can set up safeguards against this. You can also make sure some of your backups are never overwritten. But you have to do it in advance.
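One way to sketch the backup side: write every backup to a new timestamped file instead of overwriting the previous one, and skip the run entirely if the source fails a health check. `backup_if_healthy` and the `is_healthy` callback are hypothetical; what a meaningful integrity check looks like depends on the database.

```python
import shutil
from datetime import datetime, timezone
from pathlib import Path


def backup_if_healthy(source, backup_dir, is_healthy, now=None):
    """Copy `source` to a new timestamped file if it passes `is_healthy`.

    Returns the path of the new backup, or None if the check failed.
    Existing backups are never modified or replaced.
    """
    source = Path(source)
    if not is_healthy(source):
        return None  # a corrupted source must not touch the good backups
    now = now or datetime.now(timezone.utc)
    backup_dir = Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    target = backup_dir / f"{source.name}.{now:%Y%m%dT%H%M%S}"
    if target.exists():
        raise FileExistsError(target)  # refuse to overwrite, even by accident
    shutil.copy2(source, target)
    return target
```

Timestamped copies pile up, so this needs a separate pruning step that deletes old files on a schedule but always keeps a few older snapshots.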