Enterprises are migrating to microservices and container-based infrastructures. With the success of Docker, containers are now right in the public eye. Logging is a hot topic in the Docker community because containerization changes the nature of logging. However, few established best practices have emerged.
As with any service, logging is a core component of Docker. Analyzing logs provides insight into the performance, stability, and reliability of containers and the Docker service itself. However, because of the flexible and dynamic nature of Docker, there’s no single approach to gathering and storing log events. Instead, we have a variety of solutions at our disposal, each with its own benefits and drawbacks.
As the Docker ecosystem continues to evolve, we have to ask ourselves the following questions:
- How can we log and monitor Docker effectively? This includes logging the Docker runtime infrastructure, the container itself, and what goes on inside it, as well as making sure log data is collected from ephemeral containers.
- How can we use feedback from containers to manage and improve the quality of our services?
- Can we build off of decades of experience logging monolithic applications, or do we have to start from scratch?
- If we have to start from scratch, how can we build a solution that helps us make better decisions?
In this post, we’ll look at some of the tools, techniques, and methods available for crafting a comprehensive Docker logging solution.
Key Considerations When Logging in Docker
Although there are some similarities, container-based logging is still very different from traditional application-based logging. As you consider the different approaches to logging in Docker, there are a few things you’ll want to keep in mind.
Containers Are Transient
Containers come and go. They start, they stop, they’re destroyed, and they’re rebuilt on a regular basis. Storing persistent application data inside a container is an anti-pattern with Docker, since anything written to the container’s own filesystem is lost once the container is removed. While containers can store persistent data through the use of volumes, the recommended solution is to export data (logs or otherwise) to a service that can store it long-term, whether it’s a folder on the local hard drive or an Amazon S3 bucket. This way, you can stop and start your containers without compromising your data.
Containers Are Multi-Tiered
Logging in Docker isn’t as simple as configuring a framework and running your container. Even the simplest Docker installation has at least three distinct levels of logging: the Docker container, the Docker service, and the host operating system (OS). As the infrastructure becomes more complex and more containers are deployed, you’ll need a way of associating log events with specific processes rather than just their host containers. For example, using the Loggly Docker container discussed later in this post, you can define custom tags for each log event as it passes through the container and later correlate those events in Loggly.
Containers Are Complex
Docker is robust enough for many enterprises, but there are lingering security issues that have yet to be resolved. Compared to virtual machines, containers expose a much larger attack surface since they share the same kernel as the host. Some enterprises have worked around this by running Docker containers in a virtual environment. For instance, tools such as Boot2Docker or Docker Toolbox address this shortcoming while simultaneously making it possible to run Docker on non-Linux machines.
Meanwhile, projects such as RancherVM have taken the opposite approach by running virtual machines inside of Docker containers. Known as VM containers, these containers run just like normal containers except they host a complete Kernel-based Virtual Machine (KVM) environment. This merges the flexibility of Docker containers with the security of virtual machines.
Unfortunately, both approaches come with drastic increases in logging complexity. Not only do you have to log the application, the Docker daemon, and the host OS, but you also have to log the virtual machine and hypervisor. Logging, tagging, and associating all of these services is not just a feat of architecture engineering; it’s also far beyond the scope of this article. You can learn more about logging each approach through their respective websites. It is important, though, to have the right logging strategy in place for the approach you’re taking. Failing to collect and aggregate the logs of even one tier might prevent you from efficiently troubleshooting issues.
Methods of Logging in Docker
Similar to virtualization, containers add an extra layer between an application and the host OS. Logging Docker effectively means not only logging the application and the host OS, but also the Docker service.
Logging via the Application
This approach is likely the one most developers are familiar with. The application running inside the container handles its own logging using a logging framework. For instance, a Java application might use Log4j2 to format and send logs to a remote destination. This offers an easy and intuitive migration path for enterprises that already use a logging framework in their applications. The logs are sent from the application to a remote centralized server, bypassing Docker and the OS. This gives developers the most control but also puts additional load on the application process.
When Should I Log via the Application?
Application-based logging closely resembles logging monolithic applications. It lets developers continue using existing application logging frameworks without having to add logging functionality to the host. However, since the logging framework is limited to the container itself, any logs stored in the container’s filesystem will be lost once the container is removed. This is because a container’s underlying filesystem only persists for the life of the container. To prevent data loss, you will have to either configure a persistent data volume or forward logs to a remote destination. Application logging also becomes difficult when deploying multiple identical containers, since you would need a way of uniquely identifying each container.
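One way to tell identical containers apart is to stamp every log event with a container identifier inside the application itself. The following is a minimal Python sketch of that idea, not a prescribed setup. It assumes Docker's default behavior of setting the container hostname to the short container ID; the names ContainerTagFilter and build_logger, the "orders" logger, and the "web-1" identifier are hypothetical and chosen only for illustration.

```python
import io
import logging
import socket


class ContainerTagFilter(logging.Filter):
    """Annotate every record with a container identifier."""

    def __init__(self, container_id):
        super().__init__()
        self.container_id = container_id

    def filter(self, record):
        record.container_id = self.container_id
        return True  # never drop a record, only annotate it


def build_logger(name, stream, container_id=None):
    # Assumption: inside a container, the hostname defaults to the short
    # container ID, so it can serve as a per-container tag.
    container_id = container_id or socket.gethostname()

    handler = logging.StreamHandler(stream)
    handler.setFormatter(
        logging.Formatter("%(asctime)s %(container_id)s %(levelname)s %(message)s")
    )
    handler.addFilter(ContainerTagFilter(container_id))

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers = [handler]  # replace any handlers from earlier calls
    return logger


# Demonstration with an in-memory stream and an explicit, made-up identifier.
buffer = io.StringIO()
log = build_logger("orders", buffer, container_id="web-1")
log.info("order accepted")
```

In a real deployment, the stream would typically be replaced by a handler that ships events to a remote destination, and the identifier would be left to default to the hostname.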
Logging via Data Volumes
When dealing with Docker logs, there is one important caveat to keep in mind. Because containers are stateless by nature, any files created within the container will be lost once the container is removed. To keep their log events, containers must either forward them to a centralized logging service such as Loggly or store them in a data volume.
With a data volume, you can store long-term data in your containers by mapping a directory in the container to a directory on the host machine. You can also share a single volume across multiple containers to centralize logging across multiple services. However, data volumes make it difficult to move these containers to different hosts without potentially losing log data.
When Should I Log via Data Volumes?
Data volumes are effective for centralizing and storing logs over an extended period of time. Because they link to a directory on the host machine, data volumes significantly reduce the chances of data loss due to a failed container. And since the data is available to the host machine, you can make copies, perform backups, or even access the logs from other containers.
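From the application's point of view, logging to a data volume just means writing files into the directory where the volume is mounted. The sketch below shows one way this might look in Python, with size-based rotation so the host directory does not grow without bound. The environment variable APP_LOG_DIR, the function build_file_logger, and the "billing" logger name are assumptions made for illustration; the sketch falls back to a temporary directory so it runs outside a container.

```python
import logging
import os
import tempfile
from logging.handlers import RotatingFileHandler


def build_file_logger(name, log_dir):
    """Write log events to <log_dir>/<name>.log, rotating at roughly 1 MB."""
    os.makedirs(log_dir, exist_ok=True)
    handler = RotatingFileHandler(
        os.path.join(log_dir, name + ".log"),
        maxBytes=1_000_000,  # rotate once the file reaches about 1 MB
        backupCount=3,       # keep at most three rotated files
    )
    handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))

    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    logger.handlers = [handler]
    return logger


# APP_LOG_DIR is a hypothetical variable naming the volume's mount point
# inside the container; a temporary directory stands in when it is unset.
log_dir = os.environ.get("APP_LOG_DIR") or tempfile.mkdtemp()
log = build_file_logger("billing", log_dir)
log.info("invoice created")
log_path = os.path.join(log_dir, "billing.log")
```

With a container started along the lines of docker run -v /var/log/myapp:/var/log/app -e APP_LOG_DIR=/var/log/app (both paths are examples), the file would appear under /var/log/myapp on the host and survive the container's removal.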
Logging via the Docker Logging Driver
Another option is to forward log events from each container to the Docker service, which then sends the events to a syslog instance running on the host. With Loggly in place, you accomplish this by changing the Docker logging driver to log to syslog and then using the Configure-Syslog script to forward the events to Loggly. Another solution, which we’ll discuss later in the post, is to have the application forward its logs to a container dedicated solely to logging. That container, rather than the host OS, becomes responsible for forwarding each event to the right destination.
When Should I Log via the Docker Logging Driver?
Unlike data volumes, the Docker logging driver reads log events directly from the container’s stdout and stderr output. This lets you quickly and effectively centralize your container logs by using just the Docker service. The benefit is that your containers will no longer need to write to and read from log files, resulting in a performance gain. Additionally, since log events are stored in the host machine’s syslog, they can be easily routed to Loggly.
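Because the logging driver only sees what a container writes to stdout and stderr, the application has to log to those streams rather than to files. Here is a rough Python sketch of one possible arrangement: routine events go to stdout, while warnings and errors go to stderr. The names MaxLevelFilter and build_stream_logger are hypothetical, and the in-memory buffers are used only so the split is visible without a running container.

```python
import io
import logging
import sys


class MaxLevelFilter(logging.Filter):
    """Let through only records at or below max_level."""

    def __init__(self, max_level):
        super().__init__()
        self.max_level = max_level

    def filter(self, record):
        return record.levelno <= self.max_level


def build_stream_logger(name, out=None, err=None):
    # INFO and below go to stdout; WARNING and above go to stderr.
    out_handler = logging.StreamHandler(out if out is not None else sys.stdout)
    out_handler.addFilter(MaxLevelFilter(logging.INFO))

    err_handler = logging.StreamHandler(err if err is not None else sys.stderr)
    err_handler.setLevel(logging.WARNING)

    formatter = logging.Formatter("%(levelname)s %(name)s %(message)s")
    for handler in (out_handler, err_handler):
        handler.setFormatter(formatter)

    logger = logging.getLogger(name)
    logger.setLevel(logging.DEBUG)
    logger.propagate = False
    logger.handlers = [out_handler, err_handler]
    return logger


# Demonstration: buffers stand in for the container's stdout and stderr.
out_buffer, err_buffer = io.StringIO(), io.StringIO()
log = build_stream_logger("orders", out=out_buffer, err=err_buffer)
log.info("order accepted")
log.error("payment gateway unreachable")
```

Inside a container, the two buffer arguments would simply be omitted. If the container is started with the syslog logging driver (for example, docker run --log-driver=syslog), the Docker service picks up both streams without the application knowing anything about syslog.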
Logging via a Dedicated Logging Container
While the two previous methods have several advantages, they share a common disadvantage: They rely on a service running on the host machine. Dedicated logging containers, on the other hand, let you manage logging from within the Docker environment. Dedicated logging containers can retrieve log events from other containers, aggregate them, then store or forward the events to a third-party service. This approach is more aligned with the microservices architecture since it eliminates your containers’ dependencies on the host machine without hindering your logging capabilities.
Dedicated logging containers can manage logs for specific containers, or they can act as a “log vacuum” for multiple containers. For example, the Logspout container automatically captures the stdout and stderr output of all containers running on the same host and forwards it to a remote syslog service. You can define the destination URL when running the container:
docker run --name="logspout" \
  --volume=/var/run/docker.sock:/tmp/docker.sock \
  gliderlabs/logspout \
  rfc5424://logs-01.loggly.com:514?structuredData=<Loggly token>\ tag=\"<env>\"\ tag=\"<role>\"
When Should I Use a Dedicated Logging Container?
In addition to centralizing and aggregating logs, dedicated logging containers eliminate any dependencies on the host machine. Not only does this make it easier to move containers between hosts, but it lets you scale your logging infrastructure as needed by adding additional containers. Dedicated logging containers can retrieve logs through multiple streams (data volumes, stdout, etc.), making them at least as flexible as host-based logging solutions.
Logging via the Sidecar Approach
The sidecar approach is discussed more thoroughly in a previous post. To summarize, each container is linked with its own logging container. The first (or application) container saves its logs to a volume that can be accessed by the logging container. The second (or logging) container then uses file monitoring to tag and forward each event to Loggly. An example of this approach is the Loggly Docker container. Although similar to dedicated logging containers, sidecar containers can offer greater transparency into the origin of log events.
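The file-monitoring step performed by the logging container can be sketched roughly as follows. This is an illustrative Python sketch under stated assumptions, not how the Loggly Docker container is actually implemented: the function read_new_events, the file name app.log, and the tag values are all hypothetical, and a temporary directory stands in for the volume shared between the two containers.

```python
import os
import tempfile


def read_new_events(path, offset, tags):
    """Return (events, new_offset) for complete lines appended after offset."""
    events = []
    with open(path, "rb") as handle:
        handle.seek(offset)
        while True:
            raw = handle.readline()
            if not raw.endswith(b"\n"):
                break  # nothing new, or a half-written line: retry on next poll
            offset = handle.tell()
            events.append(
                {"tags": list(tags), "message": raw.decode("utf-8").rstrip("\n")}
            )
    return events, offset


# A temporary directory stands in for the shared volume.
shared_dir = tempfile.mkdtemp()
log_path = os.path.join(shared_dir, "app.log")
tags = ["production", "web"]

# The application container appends an event...
with open(log_path, "a", encoding="utf-8") as app_log:
    app_log.write("order accepted\n")

# ...and the sidecar picks it up, remembering how far it has read.
first_batch, offset = read_new_events(log_path, 0, tags)

with open(log_path, "a", encoding="utf-8") as app_log:
    app_log.write("order shipped\n")

second_batch, offset = read_new_events(log_path, offset, tags)
```

A real sidecar would call a function like this in a polling loop (or rely on filesystem notifications) and forward each tagged event to the logging service instead of collecting it in a list. Tracking the byte offset is what keeps events from being sent twice.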
When Should I Use the Sidecar Approach?
As with dedicated logging, the key benefit of the sidecar approach is that it lets you manage logging the same way you manage your applications. Sidecar containers scale more easily than other logging methods, making them ideal for larger deployments. This approach also lets you incorporate additional tracking information specific to the logging container into each log event. By providing custom tags, you can more easily track where log events originate and which containers are actively generating logs.
The downside to this approach is that it can be complex and more difficult to set up. Both containers must work in tandem or you may end up with incomplete or missing log data. In this case, it might be easier to use a tool such as Docker Compose to manage both containers as a single unit.
What’s the Solution?
As we mentioned earlier, there is no all-in-one solution to logging with Docker. We’ve shown you several approaches to logging each tier of your Docker architecture. Logging from the container benefits smaller, simpler deployments due to its relatively easy configuration. Larger deployments benefit more from the sidecar approach due to its scalability. The correct approach ultimately depends on the specific layout and needs of your service.
We still have a long way to go before Docker containers become a truly mature architecture. While there’s no clearly defined “best” solution for logging Docker containers, that doesn’t diminish the importance of keeping thorough logs. We’ll be sure to keep you updated on emerging trends in Docker logging.