The cost of tailing logs in Kubernetes

Originally published at

Logging is one of those plumbing things that often gets attention only when it’s broken. That’s not necessarily a criticism. Nobody makes money off their own logs. Rather, we use logs to gain insight into what our programs are doing, or have done, so we can keep the things we do make money from running. At small scale, or in development, you can get the necessary insights from printing messages to stdout. Scale up to a distributed system and you quickly develop a need to aggregate those messages in some central place where they can be useful. This need is even more urgent if you’re running containers on an orchestration platform like Kubernetes, where processes and local storage are ephemeral.

Since the early days of containers and the publication of the Twelve-Factor manifesto, a common pattern has emerged for handling logs generated by container fleets: processes write messages to stdout or stderr, the container runtime (containerd, or Docker) redirects the standard streams to files on disk outside the containers, and a log forwarder tails those files and forwards the lines to a database. The log forwarder Fluentd is a CNCF project, like containerd itself, and has become more or less the de facto standard tool for reading, transforming, transporting and indexing log lines. If you create a GKE Kubernetes cluster with Cloud Logging (formerly Stackdriver) enabled, this is pretty much the exact pattern that you get, albeit using Google’s own flavor of Fluentd.
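
To make the "tail the files and forward them" step concrete, here is a minimal polling sketch in Python. It is not how Fluentd is implemented; it only illustrates the idea of remembering a byte offset and picking up complete lines appended since the last read. The function names, the file name `app.log`, and the in-memory list standing in for the database are all hypothetical, and the sketch ignores log rotation entirely.

```python
import os
import tempfile


def read_new_lines(path, offset):
    """Return (lines, new_offset) for complete lines appended after `offset`.

    Hypothetical helper for illustration; a real forwarder also has to
    handle rotation, truncation and persistent offset storage.
    """
    with open(path, "rb") as f:
        f.seek(offset)
        chunk = f.read()
    # Consume only up to the last newline, so a partially written line
    # is left in place and picked up whole on a later poll.
    end = chunk.rfind(b"\n") + 1
    text = chunk[:end].decode("utf-8", errors="replace")
    lines = text.split("\n")[:-1]
    return lines, offset + end


def demo():
    """Simulate a container writing a log file while a forwarder polls it."""
    sink = []  # assumption: stands in for the central log database
    with tempfile.TemporaryDirectory() as d:
        path = os.path.join(d, "app.log")
        with open(path, "wb") as f:
            f.write(b"started\nhalf")  # second line is not finished yet
        lines, offset = read_new_lines(path, 0)
        sink.extend(lines)
        with open(path, "ab") as f:
            f.write(b"-written line\n")  # the writer completes the line
        lines, offset = read_new_lines(path, offset)
        sink.extend(lines)
    return sink
```

Calling `demo()` shows the forwarder holding back the unfinished line on the first poll and delivering it once the writer completes it. Even in this toy version, the forwarder re-opens and re-reads the file on every poll, which is the kind of per-line work that adds up across a whole cluster.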
