Ultimate Guide to Logging

Your open-source resource for understanding, analyzing, and troubleshooting system logs

Analyzing PHP Logs

After collecting and storing your logs, you need a way to parse, analyze, visualize, and make sense of this data. The solutions vary depending on your needs, but we're going to highlight the most popular tools for the job.

How to Parse Logs

Part of your logging strategy is to prepare your logs for analysis, which means extracting useful information from your log entries. If you’re planning to use a log management tool, you don’t have to worry about this step since it’s handled by the system.

Linux Commands

The `awk` command lets you scan your files for patterns using a processing language. Here’s an example:

// mylog.log
192.168.22.10 - GET /profile/139 - User #139 suspended

// Command
awk -F- '{print $3}' mylog.log

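The above command extracts the message body of each log entry (`User #139 suspended`). Because `-F-` splits on every hyphen, this only works when the other fields contain no hyphens themselves. On the command line, you can chain multiple commands such as `awk`, `grep`, `sort`, and `uniq`. Since `uniq` only collapses adjacent duplicate lines, sort the output first to count each distinct message:

awk -F- '{print $3}' mylog.log | sort | uniq -c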

You can check this article to learn more about the `awk` command. The Apache logging guide provides a detailed overview of using the command line for parsing logs.
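When a one-liner grows unwieldy, the same extraction can be scripted. The following is a minimal Python sketch, assuming the three-field layout of the sample entry above with ` - ` as the separator. The helper names `extract_message` and `count_messages` are hypothetical and not part of any tool mentioned in this guide.

```python
from collections import Counter


def extract_message(line, sep=" - ", index=2):
    """Return the field at `index` after splitting on `sep`, or None if it is missing."""
    fields = line.rstrip("\n").split(sep)
    return fields[index] if len(fields) > index else None


def count_messages(lines):
    """Count how often each message body occurs, similar to `sort | uniq -c`."""
    messages = (extract_message(line) for line in lines)
    return Counter(m for m in messages if m is not None)


# Illustrative entries in the same layout as mylog.log
sample = [
    "192.168.22.10 - GET /profile/139 - User #139 suspended",
    "192.168.22.11 - GET /profile/140 - User #140 suspended",
    "192.168.22.10 - GET /profile/139 - User #139 suspended",
]
counts = count_messages(sample)
```

Splitting on ` - ` (with the surrounding spaces) instead of a bare hyphen is a deliberate choice here: it keeps hyphens inside a URL or message from being treated as field boundaries.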

If you’re trying to parse your logs using PHP, you may want to consider using the log-parser package. It lets you define your log format, and then loop through every record.

// mylog.log
192.168.22.10 - GET /profile/139

…

To avoid specifying the list of options on every command, you can copy the `sample.conf` configuration file from the installation directory to `/etc/webalizer.conf` and adjust it according to your needs. Make sure to visit the official website for the list of available options.

Analog

// The official website (http://www.analog.cx) has been down for a long time.

Analog is open source software that can generate custom reports, perform DNS lookups, and read multiple log formats (e.g., COMMON, COMBINED, REFERRER, BROWSER, EXTENDED, MICROSOFT-NA, MICROSOFT-INT, WEBSITE-NA, WEBSITE-INT, MS-EXTENDED, WEBSTAR-EXTENDED, MS-COMMON, NETSCAPE, WEBSTAR, MACHTTP, AUTO, custom_string).

After installing Analog, you can configure it using the `analog.cfg` file in the same directory. The `examples` directory contains other sample configurations that you can check. You can also configure it from a web form using `anlgform`, which is contained in the same directory.

Even though Analog is out of date and has not been updated in a long time, it has extensive documentation that covers all features and options, along with some examples.

[Image: web-server-statistics]

AWStats

AWStats is another open source solution for analyzing and grouping your logs. It offers the same kinds of features as Webalizer, such as support for large files, DNS lookup, and multiple file formats. A comparison table of the three well-known log analyzers is also available.

The way AWStats works is that you run an update script from the command line to generate the output.

sudo perl awstats_buildstaticpages.pl -config=vaprobash -dir=/vagrant/public/awstats/wwwroot -update

[Image: awstat]

AWStats offers a set of plugins that you can load, such as DNS lookup on IPv6 addresses, Whois lookup, and GeoIP. You can check the full list in the documentation.

Using Log Management Tools

Command line tools and scripts are not suitable for every business. We often have one server that contains all application logs, and we use some of the tools mentioned above to help us parse and analyze them. This is often a fine solution for small applications, but as you scale up, you'll need a log management tool for the job.

Why Use a Log Management Tool?

There are a lot of reasons why you should move to a log management tool. The most important ones are:

  • Cost: Managing and developing your logging system will cost more than using a log management tool.
  • Security: This is a major concern for most companies that have sensitive information inside their logs.
  • Long-Term Retention: Most log management tools retain your logs long enough for you to use them for annual reports, security auditing, etc.

Using Log Management Tools

The first step in moving to a log management solution is to connect your system to it. Most solutions provide integrations for different languages and frameworks. A log management tool should be able to understand your log formats and make sense of them. This improves the user experience by making it a lot easier to filter, search, and visualize data. It is worth mentioning that every application should have a logging strategy, which means carefully selecting the log format, what data to log, and how often to log it.
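One way to make a log format easy for a tool to understand is to emit one structured record per line. The following is a minimal Python sketch, assuming JSON is the chosen format; the field names (`timestamp`, `level`, `channel`, `message`) and the helper `build_logger` are illustrative assumptions, not a format required by any particular log management tool.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as a single JSON object."""

    def format(self, record):
        entry = {
            "timestamp": self.formatTime(record, "%Y-%m-%dT%H:%M:%S"),
            "level": record.levelname,
            "channel": record.name,
            "message": record.getMessage(),
        }
        return json.dumps(entry)


def build_logger(name, stream=None):
    """Return a logger that writes JSON lines to `stream` (stderr if None)."""
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger = logging.getLogger(name)
    logger.handlers = [handler]  # replace any handlers attached earlier
    logger.setLevel(logging.INFO)
    logger.propagate = False     # avoid duplicate output via the root logger
    return logger


# Usage: capture one entry in memory to inspect it
buffer = io.StringIO()
logger = build_logger("app", buffer)
logger.warning("User #%d suspended", 139)
line = buffer.getvalue()
```

Because every entry carries the same named fields, a downstream tool can filter on `level` or `channel` without a custom parsing rule.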

[Image: log-management]

The screenshot above from the Loggly search page shows that after the logs are parsed, the field explorer sidebar lists the available filter variables depending on the scope (System, SysLog, HTTP, etc.). You can use it to filter logs from different sources before performing any further search.

Now that we’ve isolated the log source that we want to work with, we can search for PHP fatal errors that occurred in a specified time range. Luckily, in my case, I had no fatal errors in the last seven days in my production application.

[Image: log-management-2]
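For readers without a log management tool at hand, the same kind of search (fatal errors within a time range) can be approximated on a raw log file. This is a minimal Python sketch, assuming entries follow the usual PHP error log layout `[DD-Mon-YYYY HH:MM:SS TZ] PHP <level>:  <message>`. The function name `fatal_errors_between` is hypothetical and the sample entries are made up for illustration.

```python
import re
from datetime import datetime, timedelta

# Assumed layout: "[12-Mar-2024 10:15:30 UTC] PHP Fatal error:  message"
ENTRY = re.compile(
    r"^\[(\d{2}-[A-Za-z]{3}-\d{4} \d{2}:\d{2}:\d{2})[^\]]*\] PHP (.+?):\s+(.*)$"
)


def fatal_errors_between(lines, start, end):
    """Return (timestamp, message) pairs for fatal errors with start <= timestamp < end.

    The timezone label in the entry is ignored, so `start` and `end` must be
    naive datetimes in the same timezone the log was written in.
    """
    matches = []
    for line in lines:
        found = ENTRY.match(line)
        if not found:
            continue  # skip lines that do not match the assumed layout
        stamp, level, message = found.groups()
        when = datetime.strptime(stamp, "%d-%b-%Y %H:%M:%S")
        if level == "Fatal error" and start <= when < end:
            matches.append((when, message))
    return matches


# Made-up entries for illustration only
sample = [
    "[10-Mar-2024 09:00:00 UTC] PHP Warning:  Undefined variable $user in /app/profile.php on line 8",
    "[12-Mar-2024 10:15:30 UTC] PHP Fatal error:  Uncaught Exception: boom in /app/index.php:12",
    "[01-Jan-2024 00:00:01 UTC] PHP Fatal error:  Allowed memory size exhausted in /app/report.php on line 40",
]
end = datetime(2024, 3, 13)
recent = fatal_errors_between(sample, end - timedelta(days=7), end)
```

Note that `%b` month parsing depends on the process locale, so this sketch assumes English month abbreviations.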

To avoid manually checking for fatal errors or other important events in your system, you can use the alerting system. It lets you notify developers or system admins about potential system anomalies that need to be taken care of. You specify a search pattern for error detection, and the alert fires every time the pattern matches. Notifications can be delivered through an IRC channel, email, etc. This option goes well with real-time logs and is even more useful in the case of production bugs.

[Image: alerts]

[Image: alert-2]
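The pattern-based alerting idea described above can be sketched in a few lines. This is a minimal Python illustration, not how any particular log management tool implements alerts; the class name `PatternAlert` and the `notify` callback are hypothetical, and a real setup would replace the callback with an email or chat integration.

```python
import re


class PatternAlert:
    """Call `notify` for every log line that matches `pattern`."""

    def __init__(self, pattern, notify):
        self.pattern = re.compile(pattern)
        self.notify = notify

    def feed(self, line):
        """Check one line; return True if the alert fired."""
        if self.pattern.search(line):
            self.notify(line)
            return True
        return False


# Usage: collect notifications in a list instead of sending them anywhere
sent = []
alert = PatternAlert(r"PHP Fatal error", sent.append)

# Made-up entries for illustration only
incoming = [
    "[12-Mar-2024 10:15:29 UTC] PHP Notice:  Undefined index: page in /app/index.php on line 3",
    "[12-Mar-2024 10:15:30 UTC] PHP Fatal error:  Uncaught Exception: boom in /app/index.php:12",
]
fired = [alert.feed(line) for line in incoming]
```

Keeping the notification channel behind a callback means the matching logic stays the same whether the alert goes to email, IRC, or a test list.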

Instead of looking at your logs in a text-based format, you can use the charting components provided by the log management tool. The screenshot below shows the logs for a specific time range grouped by severity (Error, Warning, Debug, Notice, Info).

[Image: pie-chart]
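The numbers behind such a chart are a simple count per severity. The following is a minimal Python sketch, assuming the severity of each entry has already been extracted; the function name `severity_breakdown` and the sample values are made up for illustration.

```python
from collections import Counter


def severity_breakdown(levels):
    """Return {severity: (count, percentage share)} for a sequence of severity labels."""
    counts = Counter(levels)
    total = sum(counts.values())
    return {level: (n, round(100 * n / total, 1)) for level, n in counts.items()}


# Made-up severities for illustration only
levels = ["Error", "Warning", "Info", "Info", "Debug", "Info", "Notice", "Error"]
breakdown = severity_breakdown(levels)
```

Each percentage corresponds to one slice of the pie, so the result can be passed to whichever charting library you prefer.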