Transaction Tracing is a Valuable Troubleshooting Method
Imagine that your customer support team told you that an important customer was unable to complete an urgent transaction. They give you the user ID and ask you to figure out what happened. If you have been sending all of your application logs to Loggly, this type of request is pretty easy to fulfill. In this post, I’ll show you how.
Logging for Transaction Tracing
Step One: Include a unique user identifier
In order to trace a sequence of events using your logs, you need to include a unique user identifier in all of your log events. This could be a Globally Unique Identifier (GUID), a unique number, an API key, or even a session ID. In any of these cases, the unique identifier will identify a particular user at a particular point in time (or a particular request) so that you can create a search that shows all of the logs from that user.
In my example, I’m searching on the user ID, which happens to be an eight-digit number. Since my customer support team told me roughly when the problem occurred, I also have a timeline to narrow my search results.
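Here is a minimal sketch of one way to get a user identifier into every log event, using Python's standard logging module. It is an illustration rather than the setup behind this post: the `current_user_id` context variable, the JSON field names, and the `build_logger` helper are all assumptions made for the example.

```python
import contextvars
import io
import json
import logging

# Hypothetical context variable holding the identifier for the current request.
current_user_id = contextvars.ContextVar("current_user_id", default="unknown")


class UserIdFilter(logging.Filter):
    """Copy the current user identifier onto every log record."""

    def filter(self, record):
        record.user_id = current_user_id.get()
        return True


class JsonFormatter(logging.Formatter):
    """Render each record as one JSON object per line."""

    def format(self, record):
        return json.dumps({
            "timestamp": self.formatTime(record),
            "level": record.levelname,
            "component": record.name,
            "user_id": record.user_id,
            "message": record.getMessage(),
        })


def build_logger(component, stream):
    """Return a logger whose events always carry the user identifier."""
    handler = logging.StreamHandler(stream)
    handler.addFilter(UserIdFilter())
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(component)
    logger.handlers.clear()
    logger.addHandler(handler)
    logger.setLevel(logging.INFO)
    logger.propagate = False
    return logger


# Usage: set the identifier once per request, then log as usual.
stream = io.StringIO()
log = build_logger("purchase-servlet", stream)
current_user_id.set("12345678")
log.info("item added to cart")
event = json.loads(stream.getvalue())
```

Because the filter sits on the handler, individual log calls do not need to remember the identifier. In a real deployment the stream would be replaced by whatever transport ships your logs.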
Step Two: Log your entire application stack
If you log from both the front-end and the back-end of your application, your logs will give you visibility into the progress of every transaction across your whole stack. You’ll be able to see everything that happened as the request was processed and spot where the problem occurred. This is especially powerful when you have a service-oriented architecture where there can be many systems or APIs and you want a single view across all of them.
Step Three: Search for the unique identifier to trace logs across your stack
Once the logs are in Loggly, you can search for the unique identifier and retrieve all the logs for that request or transaction. For this to work, your application has to pass the identifier along with every application or service call. If a transaction has multiple identifiers, for example one assigned by a second internal service call, you can OR them together in our search bar to see the full set.
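The two ideas in this step can be sketched in a few lines of Python. The header name `X-Transaction-Id` and both helper functions are hypothetical, and the query builder simply quotes each identifier and joins them with OR, so check the result against the search syntax you actually use.

```python
# Hypothetical header name; use whatever your services agree on.
TRACE_HEADER = "X-Transaction-Id"


def outgoing_headers(transaction_id, headers=None):
    """Return a copy of `headers` with the transaction identifier attached."""
    merged = dict(headers or {})
    merged[TRACE_HEADER] = transaction_id
    return merged


def build_search(identifiers):
    """Join identifiers into a single search string using OR."""
    return " OR ".join(f'"{identifier}"' for identifier in identifiers)


# Usage: attach the identifier to a downstream call, then search on both IDs.
headers = outgoing_headers("12345678", {"Accept": "application/json"})
query = build_search(["12345678", "inv-42"])
```

The receiving service would read the same header and include its value in its own log events, so a single search returns events from both sides of the call.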
Using Loggly for Tracing Events
I want to quickly scan my events across systems to trace what happened. However, this can include a lot of noise because each system may have its own log format. To bring order to the chaos, I find it easiest to use the Loggly grid view. This spreadsheet-like interface (shown in the screenshot at the top of this post) provides a nice columnar layout that immediately shows you a progression over time and highlights which parts of the infrastructure were involved.
I group the columns together by application so I can see the transaction flow from one service to the next. I sort in ascending order so I can see the sequence over time as I read down the table. This makes a nice waterfall view of the event where each step of processing is down and to the right. If your transaction involves return calls, you can see those passed back to the left columns at the end of the transaction.
Here’s how you can set up a grid like the one shown above:
Access the Grid view panel by clicking on the grid icon next to the “Events” tab in the search window.
In my example, I used these fields:
- Timestamp: so I could see the sequence of events by time
- Fields for each of the components in my application stack, so I could see the transaction moving from component to component
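To make the waterfall layout concrete, here is a rough Python sketch that arranges already-parsed events the same way: sorted by ascending timestamp, with one column per component. It does not use any Loggly API. The event field names and the sample data are invented for illustration, and it assumes timestamps are ISO 8601 strings in a single time zone so that they sort correctly as text.

```python
def waterfall(events, components):
    """Return rows of [timestamp, cell per component], oldest event first."""
    rows = []
    for event in sorted(events, key=lambda e: e["timestamp"]):
        cells = [
            event["message"] if event["component"] == name else ""
            for name in components
        ]
        rows.append([event["timestamp"]] + cells)
    return rows


# Made-up sample events, deliberately out of order.
events = [
    {"timestamp": "2014-01-01T10:00:02Z", "component": "inventory", "message": "load failed"},
    {"timestamp": "2014-01-01T10:00:00Z", "component": "apache", "message": "POST /cart"},
    {"timestamp": "2014-01-01T10:00:01Z", "component": "tomcat", "message": "purchase servlet"},
]
rows = waterfall(events, ["apache", "tomcat", "inventory"])
for row in rows:
    print(" | ".join(cell.ljust(20) for cell in row))
```

Listing the components in the order a request visits them is what produces the down-and-to-the-right shape.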
Follow Your Transaction Process
- In my application, a request first hits Apache. The customer puts something in her shopping cart.
- Then, the request gets passed to a Tomcat purchase servlet for processing. We see each Java class that the transaction went through because I configured Logback to log the Java class name. So, I can see where in the code the request was being sent from.
- After my servlet processes its request, it creates a REST request to my inventory service. The first thing the inventory service does is load inventory from my inventory database, and that’s where the error happened.
- I can see that the request handler sent a 500 code back to Tomcat. Tomcat then displayed a formatted error message to the customer.
By tracing our unhappy customer’s transaction through the whole application stack, I determined that the root cause was that the item couldn’t be loaded from the inventory database. Since I also knew the item’s product ID, I could verify that it was missing from the database. Adding that back prevented the same error from affecting future customers’ purchases. I could also set up error reporting to inform me if a similar issue happens in the future. (See my recent blog post on error reporting.)
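If you export a trace and want to automate that last step, the root-cause hunt amounts to finding the earliest event that reports a failure. The sketch below assumes each event is a dictionary with a `timestamp` plus an optional `level` or HTTP `status` field. Those field names and the sample trace are hypothetical.

```python
def first_failure(events):
    """Return the earliest event that reports an error, or None."""
    for event in sorted(events, key=lambda e: e["timestamp"]):
        if event.get("level") == "ERROR" or event.get("status", 0) >= 500:
            return event
    return None


# Made-up trace: the 500 from Tomcat is a symptom of the earlier inventory error.
trace = [
    {"timestamp": "2014-01-01T10:00:03Z", "component": "tomcat", "status": 500},
    {"timestamp": "2014-01-01T10:00:00Z", "component": "apache", "status": 200},
    {"timestamp": "2014-01-01T10:00:02Z", "component": "inventory", "level": "ERROR"},
]
culprit = first_failure(trace)
```

Sorting before scanning matters here: the error the customer saw came from Tomcat, but the first failure in time points at the inventory service.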
Transaction Tracing And Tracing Errors Help You Close the Loop with Customers
Loggly gives you a complete record of transactions or user sessions, so you can easily see what happened and find the root cause of problems. This can help your operations team ensure deployments succeed, and help your customer support team follow up with affected customers. In my example, since I fixed the problem, we can let the customer know they can try again, or I can just process the transaction for them manually. Either way, Loggly has helped me deliver a better customer experience that will lead to more repeat business.