Question about Lemmy front-end and feddit.de

lemmyreader@lemmy.ml · edit-2 1 year ago

Victor@lemmy.world · 1 year ago

Isn’t there a software issue if it breaks with time unless you maintain it? What has happened more specifically? Memory leaks?

_edge@discuss.tchncs.de · 1 year ago

Memory leaks?

Possible, but much more likely is disk full. Not a bug, just something that happens…

Victor@lemmy.world · 1 year ago

Good point, didn’t think of that. That’s not an issue with the software. Although, one could argue that it should not break down and become unresponsive.

taladar@sh.itjust.works · 1 year ago

A lot of software writes to log files, temporary files, lock files or database transaction logs as part of its normal function, and when those writes fail due to a full disk, the software stops working.

Victor@lemmy.world · 1 year ago

That’s bad software then, right? The inability to write to disk shouldn’t cause the software to lose all functionality. Unless writing is its only function, or it somehow depends on writing for proper functioning. 🤷‍♂️

Jarvis2323@programming.dev · 1 year ago

No. Every good program should write at least its logs to disk. Every good database writes to disk. Add a new post and the application commits it to the database, and the database grows in size.

Name any decent-sized program where new content is added and I guarantee it writes to disk and will eventually fail if not maintained.

Victor@lemmy.world · 1 year ago

Nice downvote. Let’s discuss instead.

I’m saying that the server shouldn’t go down just because new content can’t be added. You should get maybe a 500-series HTTP response or something. Not… nothing. Ideally it should write to disk. Ideally it should allow new content to be added. But uptime and content access are still more important than being able to write to disk. It should warn the admin of the serious errors, and explain the problem to the user in some diplomatic/apologetic manner. But never go down completely. That’s not resilient at all.

That’s my opinion. 👍

KISSmyOSFeddit@lemmy.world · 1 year ago

I’m almost positive it did warn the admin in some way, but the admin was afk for weeks and didn’t see the warnings.

Victor@lemmy.world · 1 year ago

That’s good. 👍 Just one piece of the puzzle though. 😬

Jarvis2323@programming.dev · 1 year ago

For the record I did not downvote.

But I capitulate on your point. It would be great if every piece of software was written with resilience and uptime in mind.

As a former sysadmin, that sounds like a dream to me. But I don’t think I have ever seen that with any mainstream program that I’ve had responsibility for. Does that mean all those programs were bad? I don’t think so. We wouldn’t need sysadmins if all programs were written the way you describe.

Programs can be written to auto-rotate their logs and to compact and reindex their databases. Browsers are an example: they can even auto-update safely and roll back on failure.
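For illustration, log rotation is one of the cheaper items on that list. Below is a minimal Python sketch using the standard library's `RotatingFileHandler`; the function name, logger name and size limits are made up for the example and not taken from any program discussed in this thread.

```python
import logging
import logging.handlers


def make_rotating_logger(path, max_bytes=1_000_000, backups=3, name="rotating-demo"):
    """Return a logger whose files on disk stay bounded.

    At most `backups` old files (path.1, path.2, ...) are kept next to the
    active file, so total usage is roughly max_bytes * (backups + 1).
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False  # keep the demo output out of the root logger
    handler = logging.handlers.RotatingFileHandler(
        path, maxBytes=max_bytes, backupCount=backups
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
    logger.addHandler(handler)
    return logger
```

This only bounds the program's own logs. It does nothing about uploads or database growth, which still need monitoring.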

How many programs actually do these things? In my experience, next to none. But I wouldn’t call them all bad or poorly written programs.

KISSmyOSFeddit@lemmy.world · 1 year ago

Fun fact: Old school admins used to write a large-ish (~5% hdd space) file of random data to the drive right after installing the server. If the hard drive ran full without anyone noticing, you could just delete the file to get some breathing room to deal with the issue. It’s a very crude alarm system, but one you WILL notice when it goes off even if you ignore all emails.
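A minimal Python sketch of that ballast-file trick, assuming the ~5% figure from the comment. The function names are hypothetical. Random bytes are used so that the blocks are really allocated; a sparse file or a compressing filesystem would otherwise reserve little or nothing.

```python
import os
import shutil

CHUNK = 1024 * 1024  # write in 1 MiB pieces to keep memory use flat


def ballast_size(directory, fraction=0.05):
    """Bytes needed for a ballast covering `fraction` of the filesystem."""
    return int(shutil.disk_usage(directory).total * fraction)


def write_ballast(path, size_bytes):
    """Fill `path` with `size_bytes` of random data and force it to disk."""
    remaining = size_bytes
    with open(path, "wb") as f:
        while remaining > 0:
            n = min(CHUNK, remaining)
            f.write(os.urandom(n))
            remaining -= n
        f.flush()
        os.fsync(f.fileno())


def release_ballast(path):
    """Delete the ballast and return the free bytes available afterwards."""
    directory = os.path.dirname(os.path.abspath(path))
    os.remove(path)
    return shutil.disk_usage(directory).free
```

Right after installation you would call something like `write_ballast(p, ballast_size(d))`, where `p` is a path on the volume that tends to fill up and `d` is its directory. When the disk runs full, `release_ballast(p)` buys time to clean up properly.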

Victor@lemmy.world · edit-2 1 year ago

ran full without anyone noticing

alarm system, but one you WILL notice when it goes off even if you ignore all emails

I’m slightly confused. They will notice it, but somehow it still happened without anyone noticing? I feel like I lost something in that text. 😅 What makes that trick such a good alarm system?

KISSmyOSFeddit@lemmy.world · edit-2 1 year ago

The server’s hard drive filled up completely from image uploads, which in turn corrupted the database.
By the time the admin noticed it and had time to troubleshoot, the automated backup process had replicated the corrupted database, overwriting all backups that had a still-functioning database.
There was some personal event in the admin’s life as well as a long planned vacation at that time.

The lesson to be learned here is that a private server administered by a single person in their spare time isn’t something you can rely on.
The Feddit community is currently trying to found an organisation that can share the administrative load and is able to receive donations.

Victor@lemmy.world · 1 year ago

hard drive filled up completely from image uploads, which in turn corrupted the database

This is so baffling to me. How does this happen? There need to be checks in place so that this can’t happen ffs lol. No space left on the device to complete the write? Abort. Or like, starting to run out of space => stop accepting new content until fixed, to protect the integrity of the data. Something.
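The kind of check described here could look roughly like the Python sketch below. It is not how Lemmy or its image backend actually handles uploads; the function names, the 10% threshold and the choice of HTTP 507 (Insufficient Storage) are assumptions made for the example. The `usage` parameter exists only so the check can be exercised without a genuinely full disk.

```python
import errno
import shutil

MIN_FREE_FRACTION = 0.10  # assumed threshold; tune per server


def has_headroom(directory, min_free_fraction=MIN_FREE_FRACTION, usage=shutil.disk_usage):
    """True while at least `min_free_fraction` of the filesystem is free."""
    u = usage(directory)
    return u.free / u.total >= min_free_fraction


def handle_upload(directory, save, payload, usage=shutil.disk_usage):
    """Try to store `payload`; return (HTTP status, message) instead of crashing."""
    if not has_headroom(directory, usage=usage):
        # Refuse new content early, while reads and the database still work.
        return 507, "Storage is almost full; uploads are paused."
    try:
        save(payload)
    except OSError as exc:
        if exc.errno == errno.ENOSPC:
            # The disk filled up between the check and the write: abort cleanly.
            return 507, "No space left on device; upload aborted."
        raise
    return 201, "Created"
```

Refusing uploads before the disk is completely full leaves room for the database to keep writing, which is the part that has to stay consistent.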

Anyway, I hope they manage to find a solution to sharing the load of work! 🙏😌

KISSmyOSFeddit@lemmy.world · 1 year ago

You can set up safeguards against this. You can also make sure some of your backups are never overwritten. But you have to do it in advance.