Ultimate Guide to Logging

Your open-source resource for understanding, analyzing, and troubleshooting system logs

Analyzing Node Logs

Analyzing millions of log lines from a production server can be difficult. Command line tools work well for watching logs stream by on a console during development. Production logs, however, come in much higher volumes, and you need automated tools to query and summarize them. If the data is in the Apache log format, such as Morgan logs, it’s slightly easier to analyze with Unix command line tools. If the data is in JSON format, you need an analysis tool that understands JSON, such as Winston or a log management solution. We give examples of several types of analysis tools below.

Command Line Tools

Command line tools are nice for viewing a live tail of logs during development, or for quick, one-off analysis using grep. If the log data is in the Apache combined format (as described in “Request Logging with Morgan”), you can parse the logs using Unix command line tools like grep. We have several examples in the Apache guide on using Unix command line tools. If the logs are in JSON format, they are harder to analyze with grep because the fields you care about are nested inside structured records. One popular open-source tool for working with JSON on the command line is jq, which is described in Parsing Java Logs in JSON.
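When a quick grep is not enough for JSON logs, a few lines of Node can play the same role. The following is a minimal sketch, assuming one JSON object per line; the field names status and url, the sample data, and the helper name filterJsonLines are hypothetical and only serve as an illustration.

```javascript
// Sketch: filter newline-delimited JSON log entries, similar to grep for JSON.
// The field names (status, url) are assumptions for illustration only.
function filterJsonLines(text, predicate) {
  return text
    .split('\n')
    .filter((line) => line.trim() !== '')
    .map((line) => {
      try {
        return JSON.parse(line);
      } catch (err) {
        return null; // skip lines that are not valid JSON
      }
    })
    .filter((entry) => entry !== null && predicate(entry));
}

// Hypothetical sample input: three JSON entries and one malformed line.
const sample = [
  '{"status":200,"url":"/"}',
  '{"status":500,"url":"/api/orders"}',
  'not json',
  '{"status":404,"url":"/missing"}',
].join('\n');

// Keep only client and server errors.
const errors = filterJsonLines(sample, (entry) => entry.status >= 400);
console.log(errors.map((entry) => `${entry.status} ${entry.url}`));
```

Malformed lines are skipped rather than treated as fatal, since production log files often contain the occasional truncated or non-JSON line.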

Winston

Besides creating logs, Winston provides a simple method to query log entries. The following code snippet is the foundation for querying logs. Make sure Winston is initialized with the proper transport locations before executing this code. You can read the section “Request Logging with Morgan” to learn how to configure it with Winston.

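// Assumes `logger` is a Winston logger that has already been configured
// with a transport that supports querying.
const options = {
  from: new Date() - (24 * 60 * 60 * 1000), // 24 hours ago, in milliseconds
  until: new Date(),                        // now
  limit: 10,                                // return at most 10 entries
  start: 0,                                 // offset of the first entry
  order: 'desc',                            // descending order
  fields: ['message']                       // only return the message field
};
// Find items logged in the last 24 hours.
logger.query(options, function (err, results) {
  if (err) {
    /* TODO: handle me */
    throw err;
  }
  console.log(results);
});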

Winston queries log information using the query() method, called as logger.query() in the snippet above. The first parameter is a JavaScript object holding the search options. In this example, we’re searching through the log entries from the last 24 hours (by setting the from and until parameters). The limit parameter sets the number of log entries to return. The search will start at the beginning (start parameter) and return results in descending order (order parameter). The fields parameter tells Winston what specific log information to return. The right values for limit and fields will vary from case to case. The second parameter is a callback that receives an error, if any, and the results. If there were no errors, the returned log entries can then be inspected and analyzed.
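As an illustration of that last step, the sketch below counts how often each message appears in a set of query results. It assumes the results object maps a transport name (for example file) to an array of entries, which is how results commonly look with a file transport; check the shape returned by your own Winston version and transports. The function name countMessages and the sample data are hypothetical.

```javascript
// Sketch: rank log messages by frequency.
// Assumption: `results` maps a transport name to an array of log entries,
// e.g. { file: [{ message: '...' }, ...] }. Verify this against your setup.
function countMessages(results) {
  const counts = new Map();
  for (const entries of Object.values(results)) {
    if (!Array.isArray(entries)) continue; // ignore anything that is not a list
    for (const entry of entries) {
      const message =
        entry && entry.message !== undefined ? String(entry.message) : '(no message)';
      counts.set(message, (counts.get(message) || 0) + 1);
    }
  }
  // Most frequent message first.
  return [...counts.entries()].sort((a, b) => b[1] - a[1]);
}

// Hypothetical sample shaped like the assumed query result.
const sampleResults = {
  file: [
    { message: 'Database timeout' },
    { message: 'User not found' },
    { message: 'Database timeout' },
  ],
};

const ranked = countMessages(sampleResults);
console.log(ranked);
```

In a real application you would call countMessages(results) inside the query callback instead of using sample data.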

Log Management Solutions

Log management tools can often natively parse and analyze logs in both JSON and Apache combined format. They can give you quick summaries allowing you to visualize large sets of data, which simplifies the process of analyzing logs and keeping up with what’s happening on your site. Here are some dashboards that are set up to visualize important trends.

A timeline chart can tell you if there’s an unexpected increase in traffic or error rate by status code. Here we see a big spike in 404 errors shown in green. This could indicate a problem with a recent deployment or the loss of a critical resource.

An area chart can tell you if response times are slow due to servers getting overloaded or new code deployments. In the trend view, select timeline chart and a numeric field such as json.responsetime. Next choose one or more operators like min and max. This can be used to identify differences in user experience or failures to meet your SLA.
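To make the min and max operators concrete, here is a minimal sketch of the same aggregation done by hand over parsed JSON entries, grouped per minute. It assumes each entry has a timestamp field that Date can parse and a numeric responsetime field in milliseconds; the function name and sample data are hypothetical.

```javascript
// Sketch: min/max response time per minute, the kind of aggregation a
// trend chart performs. Field names (timestamp, responsetime) are assumed.
function responseTimeByMinute(entries) {
  const buckets = new Map();
  for (const { timestamp, responsetime } of entries) {
    // Truncate the UTC ISO timestamp to minute precision: YYYY-MM-DDTHH:MM
    const minute = new Date(timestamp).toISOString().slice(0, 16);
    const bucket = buckets.get(minute) || { min: Infinity, max: -Infinity, count: 0 };
    bucket.min = Math.min(bucket.min, responsetime);
    bucket.max = Math.max(bucket.max, responsetime);
    bucket.count += 1;
    buckets.set(minute, bucket);
  }
  return buckets;
}

// Hypothetical sample entries.
const requests = [
  { timestamp: '2024-01-01T10:00:05Z', responsetime: 120 },
  { timestamp: '2024-01-01T10:00:40Z', responsetime: 950 },
  { timestamp: '2024-01-01T10:01:10Z', responsetime: 80 },
];

const byMinute = responseTimeByMinute(requests);
for (const [minute, stats] of byMinute) {
  console.log(minute, stats);
}
```

A large gap between min and max within the same bucket suggests that some users are getting a much slower experience than others.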

A pie chart can tell you if traffic issues are overwhelmingly caused by one client or a small number of clients. On popular sites you’d likely see traffic fairly evenly distributed among many clients. Too much traffic from a single IP could indicate someone using your service in an unusual way.
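The numbers behind such a pie chart are just each client’s share of total requests. A minimal sketch follows, assuming each parsed entry carries the client address in a field named remoteAddr; that field name, the function name, and the sample addresses are hypothetical.

```javascript
// Sketch: share of traffic per client, the data behind a pie chart.
// The default field name `remoteAddr` is an assumption; pass your own key.
function trafficShareByClient(entries, key = 'remoteAddr') {
  const counts = new Map();
  for (const entry of entries) {
    const client = entry[key] || 'unknown';
    counts.set(client, (counts.get(client) || 0) + 1);
  }
  const total = entries.length;
  return [...counts.entries()]
    .map(([client, count]) => ({ client, count, share: count / total }))
    .sort((a, b) => b.count - a.count); // busiest client first
}

// Hypothetical sample: one client sends three of the four requests.
const traffic = [
  { remoteAddr: '203.0.113.7' },
  { remoteAddr: '198.51.100.2' },
  { remoteAddr: '203.0.113.7' },
  { remoteAddr: '203.0.113.7' },
];

const shares = trafficShareByClient(traffic);
console.log(shares);
```

A single client with a share far above the rest is the pattern worth investigating.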

A bar chart can tell you the top ranked items, such as which user agents are responsible for the most errors. Perhaps if we’re seeing many errors from curl, someone has a broken bash script?

A table can provide a list of which error messages occur most often. The most frequent messages are sorted to the top, and you can scroll down to see the less common ones.