Serverless with Servers, a theory

Posted Dec 6, 2019

I have a simple theory to share with you: I think you can do serverless with servers. This is an expansion of the following tweet which was prompted by discussion of Amazon EKS + Fargate, Google’s Cloud Run, and similar products.

Nodes don't need to be abstracted away from you to be self-managing. @coreos Container Linux has been doing this for years.
I haven't logged in, updated or managed availability of any @openshift 4 nodes...it's all handled for me by the platform, as it should be. pic.twitter.com/Jr8tIi9GVf
— Rob Szumski (@robszumski) December 4, 2019

Let’s expand this by answering two questions:

why do we hate servers?
why do we do ridiculous things to avoid them?

Why Servers Suck

Servers suck because we love poking, prodding and abusing them *ahem* configuring them.
Servers suck because no one is a Linux expert.
Servers suck because they always need to be updated.
Servers suck because they have SSH keys and other secrets that require tricky handling.

All of this sucks because the foundation that drives it all, the Linux distro, is not built for modern computing. We don’t need a scorched-earth, kill-the-servers policy; we just need an operating system that handles all of this for you. In today’s world, that’s an immutable OS that runs containers.

At CoreOS, we verified this at scale by managing a fleet of servers the size of a cloud – millions of machines a month – each one immutable, self-updating and configured out of the box. You don’t have to know what systemd is, or your kernel version, or how to fix the latest CVE to be successful (Meltdown patched automatically). It’s all handled for you, just like a serverless environment. Just bring your containers.

Our Ridiculous Serverless Hacks

Now on to question #2: why are we doing ridiculous hacks to avoid servers? Folks were using cron jobs to keep Heroku dynos alive years ago, and we’re still pursuing the same line of hacks: keeping our Lambda instances warm and patching software to keep TCP streams alive for our database connections. Why not take advantage of a redesigned Linux (while still not managing it) and immediately get the networking and storage that we’re used to? We can even keep a long-running process, it’s cheaper, and it runs anywhere.
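To make the TCP hack concrete, here is a minimal sketch of the kind of code that ends up in an application when the platform may silently drop idle connections. It is an illustration, not taken from any particular project: the helper name `enable_keepalive` and its default timings are assumptions, and the fine-grained options are only applied where the platform exposes them.

```python
import socket


def enable_keepalive(sock, idle_s=60, interval_s=10, probes=5):
    """Turn on TCP keepalive probes for an already-created socket.

    Hypothetical helper: the name and default timings are illustrative.
    """
    # Ask the kernel to send keepalive probes on this connection.
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_KEEPALIVE, 1)

    # The tuning knobs are platform-specific (all three exist on Linux),
    # so set each one only if this Python build exposes it.
    for name, value in (
        ("TCP_KEEPIDLE", idle_s),       # idle seconds before the first probe
        ("TCP_KEEPINTVL", interval_s),  # seconds between probes
        ("TCP_KEEPCNT", probes),        # failed probes before giving up
    ):
        opt = getattr(socket, name, None)
        if opt is not None:
            sock.setsockopt(socket.IPPROTO_TCP, opt, value)
    return sock
```

On a normal long-running server this is usually a one-time setting on a connection pool. In a function that can be frozen or recycled between invocations, it becomes one more thing to re-establish and reason about.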

Now of course, the elasticity of a serverless environment is powered by a platform. One of the reasons our servers suck is that they aren’t aware of each other, which makes it pretty hard to work together. The superpower duo is our immutable OS paired with an orchestrator like Kubernetes. The same properties we applied to make our OS self-managing can also be applied to our platform.

Orchestration is a topic for another post, but it’s a required component in this new model of Linux. In fact, you should start to think of the orchestrator and the OS as the same entity. CoreOS started down this path in 2016 and the work continues as part of Red Hat’s OpenShift 4. This fulfills the key user experience of serverless: you have one place to start workloads and you never have to worry about them again. Until you need to debug something. Then it’s really nice to have a server.

A Server When You Need It

What you get with this new model of Linux is the best properties of a server: tools like tcpdump, persistent storage (use it wisely!), and normal networking but without any of the management overhead – the serverless part. Yes, you are signing up for some high level operations, but there’s no manual toil in this system. The best part? You get a normal Linux box when you want it, and get to leave the parts that make servers suck in the past.

Remember, Lambda is one of Amazon’s most expensive services. The hacks we’ve built up around it are even more costly in terms of complexity, auditability and re-tooling how our engineers think about interactions between services. I think most serverless users can meet their needs by making servers not suck.