Loggly Q&A: A Look at Log Management and IT Performance Management
Dennis Callaghan is a Senior Analyst on the Infrastructure Software team at 451 Research. He leads the firm’s coverage of application and Internet performance management, service-level monitoring and management, and IT asset and service management. Follow Dennis on Twitter at: @DennisCallaghan
Loggly: How has the cloud influenced IT performance management strategies?
Callaghan: There are new types of applications being deployed in the cloud and new monitoring tools that run in the cloud. These newer toolsets, such as New Relic, AppDynamics and Loggly, are able to monitor these applications much better than traditional tools, and the same is true for network monitoring. If you’re deploying workloads into the cloud, you need to be able to see into those environments, especially their network performance, so newer vendors like Boundary are playing here as well. Also, as you run more workloads and apps, you need many different tools, for example to monitor applications built in Ruby or PHP versus Java or .NET. One thing that hasn’t really changed, according to our research, is that most performance issues still trace back to the application code.
Loggly: That sounds far more complex. How do IT organizations rise above it?
Callaghan: IT organizations are struggling to deal with it, and to get a sense of how many dependencies the application has, where it’s running, what services it’s calling, which databases are being used. A pain point that we hear again and again is around how tough it is to understand or simulate the ultimate end user experience. There are so many factors that contribute to it or degrade it, from an issue with the user’s device, the browser, or Internet connection to the network bandwidth or to something going on with the company’s backend systems. It’s hard to correlate all these data points. Today, there is a bit of voodoo that goes into trying to divine all this.
Loggly: Is log management becoming more important to IT performance management overall?
Callaghan: In general, analytics are becoming more important, and logs are another data source that can be mined for insights. Every application, database and server is generating a log that is a gold mine of data. Companies are getting a better sense of how to take advantage of that gold mine. And it exists in every industry.
Log management has evolved as well. Eight or 10 years ago, log file analysis was only used in the context of security, to find a weakness or intrusion. Now, companies are finding a better way to sort through and index a much larger set of log sources. This helps them to look at many other issues beyond security and find errors that can indicate a problem with response time, transaction speed, or resource usage.
Loggly: What else is going on in the world of performance monitoring?
Callaghan: People are always trying to get better analysis of the data they monitor. Beyond seeing the state of SLAs and root cause analysis, they want to know: How will using a new tool really benefit my business? How much will it improve customer conversions, win new business, or result in fewer abandoned shopping carts? If I spend $300,000 on a new APM system, will I get ROI from better performance levels, and how fast? What is the value of those better performance levels, those faster response times, to my business? Can I consolidate my IT infrastructure or see where I am overprovisioning? This is the next level of monitoring. Vendors are trying to deliver those business-oriented metrics.
Loggly: So what will it take to get there, to connect the dots to the business?
Callaghan: The tools are starting to deliver that. The end users in development or DevOps will have to learn how to use them effectively. The vendors will have to design them in such a way that the users can get more out of them. Most importantly, IT will have to think more strategically about the value it can provide and work more collaboratively with business departments to define SLAs and success metrics.