Centralizing Python Logs
In the modern cloud-based world, logs on a single machine are rarely useful. Administrators would need to log in to dozens, hundreds, or even thousands of systems to check them. Therefore, logs need to be sent to a central server for aggregation and easy access. Python offers several ways to do this, and the right choice depends on the type of event being logged.
The two main types of log messages that benefit from different treatment are event streams and exceptional events. Event streams are what we usually think of when we talk about logs: individual events that happen during the normal execution of the program. These can indicate problems, but often they just report normal behavior, and even when they do report a problem, the programmer had a good idea of how to continue. Exceptional events, on the other hand (mainly actual exceptions in the program), indicate a serious problem. Something exceptional happened that interrupted the normal flow of the program, and the programmer usually did not know how the program could continue normally in this situation.
The key difference for handling these two types of events is that events in an event stream can occur quite frequently, while exceptional events (hopefully) happen more sporadically, but can have a large amount of associated information. As exceptional events happen less often, and indicate a serious problem in the program, it is possible to spend more time processing them. Events in an event stream happen during the normal execution of the program and emitting them should impact the program as little as possible.
For regular programs, reporting exceptional events is pretty straightforward. For example, a new-style daemon might have a main() function as the main entry point. Using Sentry, the program could look like this (see the Sentry documentation):
from raven import Client
client = Client()
This will simply capture any exception and send it to the Sentry tracker, and then terminate the program. It’s best practice to terminate a program on an unhandled exception and leave it up to systemd to restart it, as you do not know what state the program is in after an exception.
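The capture-then-terminate pattern described here can be sketched as follows. This is a minimal sketch rather than Sentry's documented setup: `run_reported`, `report`, and `print_reporter` are hypothetical names introduced for illustration. The assumption is that a real program would pass its error tracker's capture call as `report` (with raven, presumably `client.captureException`).

```python
import sys


def run_reported(main, report):
    """Run main(); report any unhandled exception, then re-raise it.

    `report` is a hypothetical callback that receives the
    (type, value, traceback) tuple from sys.exc_info(). An error
    tracker's capture function would be plugged in here.
    """
    try:
        return main()
    except Exception:
        report(sys.exc_info())
        # Re-raise so the process still terminates with a traceback and
        # a supervisor such as systemd can restart it.
        raise


def print_reporter(exc_info):
    # Stand-in reporter for illustration: writes to stderr instead of
    # sending the event over the network.
    exc_type, exc_value, _tb = exc_info
    print(f"reporting {exc_type.__name__}: {exc_value}", file=sys.stderr)
```

A daemon's entry point would then call something like `run_reported(main, print_reporter)`. Keeping the reporter as a parameter makes it easy to swap in a real tracker without touching the entry point.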
For programs using frameworks such as Django, the exception handler needs to hook into the framework itself. Luckily, most error trackers support the common frameworks directly.
Sometimes, an exceptional event is not directly associated with an exception. In these cases, it is also possible to send an event directly. In the example above, something like the following would be possible:
message='API returned an unexpected status code',
But again, this is for exceptional messages only. The call makes a network request and blocks the application until that request succeeds.
Event Stream Logging
In contrast, event streams are often sent to a local service first, which is then responsible for forwarding the events to a central logging service such as Loggly. This avoids both blocking on network traffic and losing data when the program terminates.
The main local service to use for this is syslog. Even with the advent of journald, the excellent support for network logging in syslog daemons is still the main way to go. There are multiple options for getting messages to syslog, though. The simplest is to just send lines to standard output and let systemd handle these. Another option is to talk directly to syslog or journald. This is a bit more complex, but allows more control over the message.
Using syslog with Python is extensively documented elsewhere.
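As a rough illustration of talking to syslog directly, the standard library's `logging.handlers.SysLogHandler` can be attached to a logger. The helper name `make_syslog_logger`, the format string, and the default address are assumptions made for this sketch. The right address depends on how the local syslog daemon is configured.

```python
import logging
import logging.handlers


def make_syslog_logger(name, address=("localhost", 514)):
    """Return a logger that forwards records to a syslog daemon.

    `address` is an assumption: a (host, port) tuple sends UDP datagrams,
    while a path such as "/dev/log" uses the local Unix socket on many
    Linux systems.
    """
    logger = logging.getLogger(name)
    logger.setLevel(logging.INFO)
    handler = logging.handlers.SysLogHandler(address=address)
    # syslog adds its own timestamp, so the format only carries the
    # program name, level, and message.
    handler.setFormatter(
        logging.Formatter("%(name)s: %(levelname)s %(message)s")
    )
    logger.addHandler(handler)
    return logger
```

Calling `make_syslog_logger("myapp").info("service started")` hands the message to the daemon, which can then forward it to a central server according to its own configuration.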
It is possible to avoid the local intermediate server entirely and send messages to the central logging system directly. This has the advantage of a simpler system configuration, but makes the application more susceptible to network problems.
For example, it’s possible to send log events directly to Loggly using the loggly-python-handler module for the standard logging library. Recognizing the problem with blocking the application, this advanced library actually runs a worker thread in the background to handle the sending of log messages. This avoids blocking, but can lose log messages when the program terminates.
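The trade-off of a background worker can be illustrated with the standard library alone. The sketch below does not show how loggly-python-handler is implemented. It uses `logging.handlers.QueueHandler` and `QueueListener` as a stand-in for the same idea, and `background_logging` and `target_handler` are hypothetical names introduced here.

```python
import contextlib
import logging
import logging.handlers
import queue


@contextlib.contextmanager
def background_logging(logger, target_handler):
    """Route the logger's records through a queue to a worker thread.

    The application thread only puts records on an in-memory queue. The
    QueueListener thread passes them to `target_handler`, which is where
    a slow network handler would go.
    """
    log_queue = queue.Queue(-1)  # unbounded, so logging calls do not block
    queue_handler = logging.handlers.QueueHandler(log_queue)
    logger.addHandler(queue_handler)
    listener = logging.handlers.QueueListener(log_queue, target_handler)
    listener.start()
    try:
        yield listener
    finally:
        # stop() waits for the worker to process everything queued so
        # far. Skipping it is how records get lost at termination.
        listener.stop()
        logger.removeHandler(queue_handler)
```

Wrapping the program's main function in `with background_logging(logger, handler):` ensures the queue is drained on a normal exit. Records can still be lost if the process is killed before the `finally` block runs.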