SpeakBits

Building a highly-available web service without a database

screenshotbot.io
submitted 9 mos ago by boredgamer to programming

Summary

In this blog post, we’re going to break down a new architecture for web development. We use it successfully for Screenshotbot, and we hope you’ll use it too. I’m going to demonstrate how you use the architecture in all three phases.

What if the web service and the database instance were exactly one and the same? You don’t need multiple front-end servers talking to a single DB; if you need more capacity, just get a bigger server with more RAM and more CPU. You don’t need any separate services to run background jobs either, because background jobs are just threads running in this one large process.

All state lives in the memory of that process, backed on disk by periodic snapshots and a log of transactions. If your process crashes and restarts, it first reloads the latest snapshot and then replays the transaction logs to fully recover the state. Since all requests are served by the same process, which usually doesn’t get killed, you can store closures in memory and use them to serve pages.
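The recovery step described above can be sketched in a few lines. This is only an illustration under stated assumptions, not Screenshotbot's actual datastore: the `Store` class, the file names, the JSON encoding, and the `set`/`delete` transaction format are all hypothetical.

```python
import json
import os
import tempfile


class Store:
    """Hypothetical in-memory store backed by a snapshot and an append-only log."""

    def __init__(self, directory):
        self.snapshot_path = os.path.join(directory, "snapshot.json")
        self.log_path = os.path.join(directory, "transactions.log")
        self.state = {}
        self._recover()

    def _recover(self):
        # 1. Reload the most recent snapshot, if there is one.
        if os.path.exists(self.snapshot_path):
            with open(self.snapshot_path) as f:
                self.state = json.load(f)
        # 2. Replay every transaction logged since that snapshot.
        if os.path.exists(self.log_path):
            with open(self.log_path) as f:
                for line in f:
                    self._apply(json.loads(line))

    def _apply(self, tx):
        # Both operations are idempotent, so replaying one twice is harmless.
        if tx["op"] == "set":
            self.state[tx["key"]] = tx["value"]
        elif tx["op"] == "delete":
            self.state.pop(tx["key"], None)

    def commit(self, tx):
        # Write the transaction to disk before mutating memory.
        with open(self.log_path, "a") as f:
            f.write(json.dumps(tx) + "\n")
            f.flush()
            os.fsync(f.fileno())
        self._apply(tx)

    def snapshot(self):
        # Write the full state to a temp file, swap it in atomically,
        # then empty the log because the snapshot now covers it.
        tmp = self.snapshot_path + ".tmp"
        with open(tmp, "w") as f:
            json.dump(self.state, f)
            f.flush()
            os.fsync(f.fileno())
        os.replace(tmp, self.snapshot_path)
        open(self.log_path, "w").close()
```

Constructing a second `Store` on the same directory simulates a restart: it ends up with the same `state` as the instance that wrote the snapshot and log.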

The closure has references to the objects, so we don’t need to pass around object-ids across every single request. It’s also a lot easier to write test code, since you no longer have to mock out databases.
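To make the closure idea concrete, here is a minimal sketch, assuming a hypothetical `ClosureRegistry` that maps random URL tokens to callables. The `Screenshot` class, the `/c/` URL prefix, and the function names are invented for illustration and are not taken from the original post.

```python
import secrets


class ClosureRegistry:
    """Hypothetical table of in-memory closures, keyed by URL token."""

    def __init__(self):
        self._handlers = {}

    def register(self, handler):
        # Hand out an unguessable URL that points at the closure.
        token = secrets.token_urlsafe(16)
        self._handlers[token] = handler
        return "/c/" + token

    def dispatch(self, path):
        # Look the closure up and call it; unknown tokens return None.
        handler = self._handlers.get(path[len("/c/"):])
        return handler() if handler is not None else None


class Screenshot:
    def __init__(self, name):
        self.name = name
        self.accepted = False


def render_accept_link(registry, screenshot):
    # The closure captures the object itself, so no object id has to
    # travel through the URL or be looked up again on the next request.
    def accept():
        screenshot.accepted = True
        return f"accepted {screenshot.name}"

    return registry.register(accept)
```

Because the registry lives only in memory, its links stop resolving after a restart, which is why this approach leans on the process staying up.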

The downside is availability. Restarting the service can bring down the server for multiple minutes, and even re-deploys are tricky. This is where the Raft consensus protocol comes into play.

The read threads parallelize beautifully. The main bottleneck I expect to see is scaling the commit thread. We use Common Lisp, which is also heavily multi-threaded.
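One way to picture the split between many readers and a single commit thread is the sketch below. It is an assumption-laden illustration in Python rather than the Common Lisp the post uses: `CommitThread` and its methods are hypothetical names, and CPython's global interpreter lock means this shows the structure only, not real read parallelism.

```python
import queue
import threading


class CommitThread:
    """Hypothetical store: any thread may read, one thread applies writes."""

    def __init__(self):
        self.state = {}
        self._pending = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()

    def _run(self):
        # Transactions are applied strictly one at a time, in queue order.
        while True:
            item = self._pending.get()
            if item is None:
                return
            fn, result = item
            try:
                fn(self.state)
            except Exception as exc:
                result["error"] = exc
            result["done"].set()

    def commit(self, fn):
        # Hand the transaction to the commit thread and wait for it.
        result = {"done": threading.Event(), "error": None}
        self._pending.put((fn, result))
        result["done"].wait()
        if result["error"] is not None:
            raise result["error"]

    def read(self, key):
        # Reads go straight to memory without queueing.
        return self.state.get(key)

    def close(self):
        self._pending.put(None)
        self._worker.join()
```

Since every write funnels through one thread, read-modify-write transactions such as incrementing a counter need no extra locking, but total write throughput is capped by that thread.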

Common Lisp is excellent at reloading code in a running process. We use a cluster of 3 servers per installation, which allows for one server to go down. To store image files, or other blobs that shouldn’t be part of the datastore, we use EFS (a highly available NFS).
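The "3 servers, one may go down" figure follows from Raft needing a majority of the cluster to agree before a write is committed. A small worked calculation, with hypothetical helper names:

```python
def majority(cluster_size):
    # Smallest number of servers that is more than half of the cluster.
    return cluster_size // 2 + 1


def tolerated_failures(cluster_size):
    # Servers that can be lost while a majority is still reachable.
    return cluster_size - majority(cluster_size)


for n in (1, 3, 5):
    print(n, "servers: majority", majority(n), "tolerates", tolerated_failures(n))
```

For 3 servers the majority is 2, so one server can fail or be restarted for a deploy. A 4th server would not help, because the majority rises to 3 and still only one failure is tolerated.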

Screenshotbot runs on a 4-core, 16 GB machine, serving requests from a well-known customer. The architecture is excellent for new startups, and I’m hoping more companies will adopt it.

11

4 Comments

2
joesch
9 mos ago
That "usually doesn't get killed" is such a massive footgun
2
boredgamer (OP)
9 mos ago
It's the scary part and makes me wonder if there really is anything against that.
1
throwschen
9 mos ago
I've always wondered if something like this was really doable. The whole "snapshot of RAM" is the missing piece I've always missed.
2
boredgamer (OP)
9 mos ago
I commend anyone who tries, but I wouldn't be surprised by a follow-up of "our backups didn't work".