Exceptional Logging of Exceptions in Python

 

Python Exceptions Blog Header

Aaron Maxwell is author of the Advanced Python Newsletter.

Exceptions happen. And as developers, we simply have to deal with them. Even when writing software to help us find burritos.

Wait, I’m getting ahead of myself… we’ll come back to that. As I was saying: How we deal with exceptions depends on the language. And for software operating at scale, logging is one of the most powerful, valuable tools we have for dealing with error conditions. Let’s look at some ways these work together.

The “Big Tarp” Pattern

We’re going to start at one extreme:

try:
    main_loop()
except Exception:
    logger.exception("Fatal error in main loop")

This is a broad catch-all. It is suitable for some code path where you know the block of code (i.e, main_loop()) can raise a number of exceptions you may not anticipate. And rather than allow the program to terminate, you decide it’s preferable to log the error information, and continue from there.

The magic here is with exception method. (logger is your application’s logger object—something that was returned from logging.getLogger(), for example.) This wonderful method captures the full stack trace in the context of the except block, and writes it in full.

Note that you don’t have to pass the exception object here. You do pass a message string. This will log the full stack trace, but prepend a line with your message. So the multiline message that shows up in your log might look like this:

Fatal error in main loop
Traceback (most recent call last):
  File "bigtarp.py", line 14, in 
    main_loop()
  File "bigtarp.py", line 9, in main_loop
    print(foo(x))
  File "bigtarp.py", line 4, in foo
    return 10 // n
    ZeroDivisionError: integer division or modulo by zero

The details of the stack trace don’t matter—this is a toy example that illustrates a grown-up solution to a real world problem. Just notice that the first line is the message you passed to logger.exception(), and the subsequent lines are the full stack trace, including the exception type (ZeroDivisionError in this case). It will catch and log any kind of error in this way.

By default, logger.exception uses the log level of ERROR. Alternatively, you can use the regular logging methods— logger.debug(), logger.info(), logger.warn(), etc.—and pass the exc_info parameter, setting it to True:

while True:
    try:
        main_loop()
    except Exception:
        logger.error("Fatal error in main loop", exc_info=True)

Setting exc_info to True will cause the logging to include the full stack trace…. exactly like logger.exception() does. The only difference is that you can easily change the log level to something other than error: Just replace logger.error with logger.warn, for example.

Fun fact: The Big Tarp pattern has an almost diabolical counterpart, which you’ll read about below.

The “Pinpoint” Pattern

Now let’s look at the other extreme. Imagine you are working with the OpenBurrito SDK, a library solving the crucial problem of finding a late-night burrito joint near your current location. Suppose it has a function called find_burrito_joints() that normally returns a list of suitable restaurants. But under certain rare circumstances, it may raise an exception called BurritoCriteriaConflict.

from openburrito import find_burrito_joints, BurritoCriteriaConflict
# "criteria" is an object defining the kind of burritos you want.
try:
    places = find_burrito_joints(criteria)
except BurritoCriteriaConflict as err:
    logger.warn("Cannot resolve conflicting burrito criteria: {}".format(err.message))
    places = list()

The pattern here is to optimistically execute some code—the call to find_burrito_joints(), in this case—within a try block. In the event a specific exception type is raised, you log a message, deal with the situation, and move on.

The key difference is the except clause. With the Big Tarp, you’re basically catching and logging any possible exception. With Pinpoint, you are catching a very specific exception type, which has semantic relevance at that particular place in the code.

Notice also, that I use logger.warn() rather than logger.exception(). (In this article, wherever you see warn(), you can substitute info(), or error(), etc.) In other words, I log a message at a particular severity instead of logging the whole stack trace.

Why am I throwing away the stack trace information? Because it is not as useful in this context, where I’m catching a specific exception type, which has a clear meaning in the logic of the code. For example, in this snippet:

characters = {"hero": "Homer", "villain": "Mr. Burns"}
# Insert some code here that may or may not add a key called
# "sidekick" to the characters dictionary.
try:
    sidekick = characters["sidekick"]
except KeyError:
    sidekick = "Milhouse"

Here, the KeyError is not just any error. When it is raised, that means a specific situation has occurred—namely, there is no “sidekick” role defined in my cast of characters, so I must fall back to a default. Filling up the log with a stack trace is not going to be useful in this kind of situation. And that is where you will use Pinpoint.

The “Transformer” Pattern

Here, you are catching an exception, logging it, then raising a different exception. First, here’s how it works in Python 3:

try:
    something()
except SomeError as err:
    logger.warn("...")
    raise DifferentError() from err

In Python 2, you must drop the “from err”:

try:
    something()
except SomeError as err:
    logger.warn("...")
    raise DifferentError()

(That turns out to have big implications. More on that in a moment.) You will want to use this pattern when an exception may be raised that does not map well to the logic of your application. This often occurs around library boundaries.

For example, imagine you are using the openburrito SDK for your killer app that lets people find late-night burrito joints. The find_burrito_joints() function may raise BurritoCriteriaConflict if we’re being too picky. This is the API exposed by the SDK, but it does not conveniently map to the higher-level logic of your application. A better fit at this point of the code is an exception you defined, called NoMatchingRestaurants.

In this situation, you will apply the pattern like this (for Python 3):

from myexceptions import NoMatchingRestaurants
try:
    places = find_burrito_joints(criteria)
except BurritoCriteriaConflict as err:
    logger.warn("Cannot resolve conflicting burrito criteria: {}".format(err.message))
    raise NoMatchingRestaurants(criteria) from err

This causes a single line of output in your log, and triggers a new exception. If never caught, that exception’s error output looks like this:

Traceback (most recent call last):
  File "transformerB3.py", line 8, in 
    places = find_burrito_joints(criteria)
  File "/Users/amax/python-exception-logging-patterns/openburrito.py", line 7, in find_burrito_joints
    raise BurritoCriteriaConflict
openburrito.BurritoCriteriaConflict
The above exception was the direct cause of the following exception:
Traceback (most recent call last):
  File "transformerB3.py", line 11, in 
    raise NoMatchingRestaurants(criteria) from err
myexceptions.NoMatchingRestaurants: {'region': 'Chiapas'}

Now this is interesting. The output includes the stack trace for NoMatchingRestaurants. And it reports the instigating BurritoCriteriaConflict as well… clearly specifying which was the original.

In Python 3, exceptions can now be chained. The raise ... from ... syntax provides this. When you say raise NoMatchingRestaurants(criteria) from err, that raises an exception of typeNoMatchingRestaurants. This raised exception has an attribute named __cause__, whose value is the instigating exception. Python 3 makes use of this internally when reporting the error information.

How do you do this in Python 2? Well, you can’t. This is one of those goodies you just have to upgrade to get. In Python 2, the “raise … from” syntax is not supported, so your exception output will include only the stack trace for NoMatchingRestaurants. The Transformer pattern is still perfectly useful, of course.

The “Message and Raise” Pattern

In this pattern, you log that an exception occurs at a particular point, but then allow it to propagate and be handled at a higher level:

try:
    something()
except SomeError:
    logger.warn("...")
    raise

You are not actually handling the exception. You are just temporarily interrupting the flow to log an event. You will do this when you specifically have a higher-level handler for the error, and want to fall back on that, yet also want to log that the error occurred, or the meaning of it, at a certain place in the code.

This may be most useful in troubleshooting—when you are getting an exception, but trying to better understand the calling context. You can interject this logging statement to provide useful information, and even safely deploy to production if you need to observe the results under realistic conditions.

The “Cryptic Message” Antipattern

Now we’ll turn our attention to some anti-patterns… things you should not do in your code.

try:
    something()
except Exception:
    logger.error("...")

Suppose you or someone on your team writes this code, and then six months later, you see a funny message in your log. Something like:

ERROR: something bad happened

Now I hope and pray that you will not see the words “something bad happened” in your actual log. However, the actual log text you see may be just as baffling. What do you do next?

Well, the first thing is to figure out where in the code base this vague message is being logged. If you are lucky, you will be able to quickly grep through the code and find exactly one possibility. If you are not lucky, you may find the log message in several completely different code paths. Which will leave you with several questions:

  • Which one of them is triggering the error?
  • Or is it several of them? Or ALL of them?
  • Which entry in the log corresponds to which place?

Sometimes, however, it’s not even possible to grep or search through the code to find where it is happening, because the log text is being generated. Consider:

    what = "something"
    quality = "bad"
    event = "happened"
    logger.error("%s %s %s", what, quality, event)

How would you even search for that? You may not even think of it, unless you searched for the full phrase first and got no hits. And if you did get a hit, it could easily be a false positive.

The ideal solution is to pass the exc_info argument:

try:
    something()
except Exception:
    logger.error("something bad happened", exc_info=True)

When you do this, a full stack trace is included in the application logs. This tells you exactly what line in what file is causing the problem, who invoked it, et cetera… all the information you need to start debugging.

The Most Diabolical Python Antipattern

If I ever see you do this one, I will come to your house to confiscate your computers, then hack into your github account and delete all your repos:

try:
    something()
except Exception:
    pass

In my newsletter, I refer to this as “The Most Diabolical Python Antipattern.” Notice how this not only fails to give you any useful exception information. It also manages to completely hide the fact that anything is wrong in the first place. You may never even know you mistyped a variable name—yes, this actually masks NameError—until you get paged at 2 a.m. because production is broken, in a way that takes you until dawn to figure out, because all possible troubleshooting information is being suppressed.

Just don’t do it. If you feel like you simply must catch and ignore all errors, at least throw a big tarp under it (i.e. use logger.exception() instead of pass).

More Exception Logging Patterns

There are many more patterns for logging exception information in Python, with different trade-offs, pros and cons. What patterns have you found useful, or not? Let everyone know in the comments.


4 comments

  • Nenad Propadovic

    6 months ago

    Hello Aaron,
    well, I’ve seen (and repaired) something worse than what you tag “The Most Diabolical Python Antipattern”: six levels of “The Most Diabolical Python Antipattern” nested one within each other :). It was production code, so it kept a team of five supporters on their toes most of the time. That was in 2008. I repaired it, using unit-test to secure the progress. Worked out quite well :).
    I like your post, btw., an I learned something from it, so I’ll subscribe.
    Cheers,
    Nenad

    • Karen Sowa

      Karen Sowa

      5 months ago

      Hi Nedad,

      Thanks for sharing your battle story! It is so satisfying when you can find a solution that reduces pain for multiple members of your team. Great job!

      Thanks for the subscribe, glad you’re liking the blog. We really appreciate it! Let us know what other topics you’d like to see in the future :)

      – Karen & the Loggly Team

  • Kenneth Redler

    9 months ago

    Is there a way to use a single logger to:
    1. log the message and stack trace to one handler
    while
    2. only logging the message to another handler?

    I envision a filter on the handler might be the way to go here, but was curious if you have any suggestions. Thanks!

    • Karen Sowa

      Karen Sowa

      5 months ago

      Hi Kenneth,

      Thanks for reading!

      You can do this by creating a special formatter that just leaves out the exception information. The example code creates a logger with two handlers, one of which gets a special formatter that does not emit exception information.

      Check out this gist for more information: https://gist.github.com/jorgenschaefer/50c0f8f42c8bfb046da1e26de51f0ada

      Happy logging!

Share Your Thoughts

Top