4 ways to make log management successful
Logs are the voice of your apps. They help you understand what’s going on, how it happened, and who is impacted. Amazon Web Services, the big daddy of cloud computing today, clearly considers centralized log management not only mission critical but also vital for operational excellence. Contrary to the common perception that logs are useful only to developers and sysadmins, they deliver valuable insight to other functions in your organization, such as customer support, customer success, and product management.
Build it, but they will NOT come
Here at Loggly, we often hear people say, “We are already sending logs to a central repository and monitoring everything, but it is not helping us.” Well, seeing results from your centralized logs is easier said than done. Don’t be fooled by anyone who claims otherwise! Logs have the answers, but it takes a bit of time and perseverance to see the full benefits.
Here are four ways in which you can make log management successful.
Centralize ALL of your logs
Often we see businesses “cherry pick” which logs to centralize, for reasons that include cost, keeping production and dev systems separate, security, tech debt, and lack of resources. However, only when you centralize all your logs will you see the complete picture of your tech stack. As the API-driven economy explodes, systems are getting more complex, and complete traceability is key to understanding any issue in the system. So don’t leave any logs behind! For all you know, the log that was wiped out when an EC2 instance was decommissioned could be the missing cog in the wheel. Plus, it is easier to troubleshoot when all your logs are in one place and you don’t need separate permissions for several systems just to find them.
Grant role-based access to logs
Consolidating all your logs is just the first step. What’s more important is sharing that data in a secure and reliable manner so that various functions in your organization can benefit from it. Using a cloud-based log management solution is very useful in this regard. Not only can you make the logs easily accessible to less technically savvy folks in your organization, but you can also make log management an easy and secure process. You can choose to restrict certain kinds of data such as customer data and payment processing to only those who are authorized to see it.
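One way to picture this kind of restriction is a small filter that masks sensitive fields unless the viewer's role is authorized. The sketch below is illustrative only: the role names, the field names, and the `view_for_role` helper are all assumptions, not part of any particular log management product.

```python
# Hypothetical field and role names, chosen only for illustration.
RESTRICTED_FIELDS = {"customer_email", "card_last4"}
ROLES_WITH_FULL_ACCESS = {"billing_admin"}


def view_for_role(event: dict, role: str) -> dict:
    """Return a copy of a log event with restricted fields masked for most roles."""
    if role in ROLES_WITH_FULL_ACCESS:
        return dict(event)
    return {
        key: ("[REDACTED]" if key in RESTRICTED_FIELDS else value)
        for key, value in event.items()
    }


event = {"event": "payment_failed", "customer_email": "a@example.com", "card_last4": "4242"}
print(view_for_role(event, "support"))
print(view_for_role(event, "billing_admin"))
```

In practice a hosted solution would usually enforce this on the server side; the point here is only that the restriction is applied per field and per role, not per log file.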
By sharing relevant log data with other functions, you can reap the benefits across the organization. While customer support can use these logs to triage customer issues, product managers can use them to monitor business metrics such as number of logins and signups, number of downgrades, etc. Marketing can monitor the use of coupons or attribute traffic to the proper channels. Create function-specific dashboards so that each team can quickly monitor at a glance what’s going on.
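As a rough illustration of how business metrics can come out of logs, the sketch below counts a few event types in JSON-formatted log lines. It assumes each line is a JSON object with an `event` field; that field name, the event names, and the `count_events` helper are hypothetical.

```python
import json


def count_events(lines, tracked):
    """Count how often each tracked event name appears in JSON log lines."""
    counts = dict.fromkeys(tracked, 0)
    for line in lines:
        try:
            record = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip lines that are not valid JSON
        if not isinstance(record, dict):
            continue  # skip JSON values that are not objects
        name = record.get("event")
        if name in counts:
            counts[name] += 1
    return counts


sample = [
    '{"event": "login", "user": "u1"}',
    '{"event": "signup", "user": "u2"}',
    '{"event": "login", "user": "u2"}',
    'plain text line that is not JSON',
    '{"event": "downgrade", "user": "u1"}',
]
print(count_events(sample, ("login", "signup", "downgrade")))
```

Counts like these are the kind of numbers a function-specific dashboard could chart over time.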
Review your logs regularly
Centralized log management is more than a crisis-management tool. Many engineering and ops teams think that consolidating logs and setting up alerts is all that successful log management and troubleshooting requires: wait for the customer issue or alert, and then dig in! That could not be further from the truth.
Very often, logs reveal underlying issues before any symptoms appear in the form of a customer issue in production or a system alert. But spotting these issues early requires a shift in mindset. Instead of relying on reactive log management, you must move to proactive processes, engage everyone on your team, and make those processes routine. Consider implementing the following proactive steps:
- Schedule daily or weekly reports and summarize findings
- Conduct team reviews even when there is no alert or customer issue
- Consider cross-team audits on your logging practices
- Introduce logging best practices during new hire training
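A scheduled report could be as simple as comparing error counts between two periods and listing what is trending up. The sketch below assumes you already have per-service error counts for last week and this week; the service names, the `rising_error_sources` helper, and the 1.5x threshold are illustrative assumptions.

```python
def rising_error_sources(previous, current, min_increase=1.5):
    """Return services whose error count grew by at least min_increase times.

    Services with errors now but none in the previous period are always flagged.
    """
    flagged = []
    for service, now in sorted(current.items()):
        before = previous.get(service, 0)
        if before == 0:
            if now > 0:
                flagged.append(service)
        elif now / before >= min_increase:
            flagged.append(service)
    return flagged


last_week = {"api": 10, "auth": 4}
this_week = {"api": 11, "auth": 9, "billing": 3}
print(rising_error_sources(last_week, this_week))
```

A list like this gives the team something concrete to discuss in a review even when no alert has fired.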
Measure your impact: Net Log Benefit (NLB)
Build an internal scorecard that allows you to track how your team is doing. Here I share a simple formula that you can use to measure the Net Log Benefit (NLB). Feel free to customize or extend this based on your organization’s needs.
- For each issue you resolve using logs before a customer reports it, +2.
- For each issue you resolve using logs before an alert, +1.
- For each issue you resolve using logs after an alert, -1.
- For each issue where you cannot find logs, -2.
Compute your weekly score and share it across the team. This lets you track and measure how you are faring as a proactive log management team. Every week, evaluate the score and see how you can improve as a team.
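The scorecard above can be computed with a few lines of code. In this minimal sketch the point values come from the list above, while the outcome labels and the `net_log_benefit` helper are hypothetical names you would adapt to your own tracking.

```python
from collections import Counter

# Point values follow the scorecard; the label names are illustrative.
NLB_POINTS = {
    "before_customer_report": 2,
    "before_alert": 1,
    "after_alert": -1,
    "no_logs_found": -2,
}


def net_log_benefit(outcomes):
    """Return the Net Log Benefit for an iterable of issue outcome labels."""
    counts = Counter(outcomes)
    unknown = set(counts) - set(NLB_POINTS)
    if unknown:
        raise ValueError(f"unknown outcome labels: {sorted(unknown)}")
    return sum(NLB_POINTS[label] * n for label, n in counts.items())


week = [
    "before_customer_report",
    "before_alert",
    "before_alert",
    "after_alert",
    "no_logs_found",
]
print(net_log_benefit(week))  # 2 + 1 + 1 - 1 - 2 = 1
```

Keeping the point values in one dictionary makes it easy to customize the weights for your organization.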
Log management is not an experiment, it is an immersion
Logging is necessary and beneficial. But it is not a one-time experiment. To make log management work for you, engage your entire team regularly and proactively.
Share your experience in making log management successful in the Comments below or send us a tweet @loggly.
Pranay Kamat is a Product Manager at Loggly. His previous experience includes designing user interfaces, APIs, and data migration tools for Oracle and Accela. He has an MBA from The University of Texas at Austin and a Master’s degree in Computer Science from Cornell University.