JSON: Best Practices


If you're like most people, you probably don't enjoy spending hours scouring log files. Even though Loggly makes log files fun, we want to help you get more out of your logs without having to read through them line by line. This is where structured data comes in. Many of our users have made the transition to JSON and aren't going back!

Here's an example. Say this is a portion of your log file:

Hoover, 29, 251 Kearny Street, San Francisco, CA, 2012-09-29
Teton, 21, 123 Great Avenue, Teton, ID, 2012-09-29

Suppose I want to figure out how many 29-year-olds are from San Francisco. This command may look familiar to anyone who is used to dealing with unstructured logs:

$ grep 29 file.log | cut -d , -f 4 | sort | uniq -c | sort -nr

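The problem with this approach is that grep matches “29” anywhere on the line, so the results include every entry whose date or street address contains “29”, not just the 29-year-olds. In the sample above, both lines match because both are dated 2012-09-29. To fix that, you end up with an even more complicated command.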

With JSON data, that complex command that no one but you understands becomes:

> uniq json.age json.city:"San Francisco"
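
To make the comparison concrete, here is a minimal Python sketch of the same two log lines as JSON events, followed by an exact match on age and city. The field names are assumptions for illustration: only "age" and "city" appear in the query above, and the rest ("name", "street", "state", "date") are made up. It runs locally and is not how Loggly evaluates a search.

```python
import csv
import io
import json

# The two sample log lines from above.
RAW_LOG = """\
Hoover, 29, 251 Kearny Street, San Francisco, CA, 2012-09-29
Teton, 21, 123 Great Avenue, Teton, ID, 2012-09-29
"""

# Assumed field names; only "age" and "city" appear in the query above.
FIELDS = ["name", "age", "street", "city", "state", "date"]


def to_json_lines(raw):
    """Convert comma-separated log lines into one JSON object per line."""
    reader = csv.reader(io.StringIO(raw), skipinitialspace=True)
    lines = []
    for row in reader:
        if not row:
            continue  # skip blank lines
        record = dict(zip(FIELDS, row))
        record["age"] = int(record["age"])  # store age as a number, not text
        lines.append(json.dumps(record))
    return lines


def count_matching(json_lines, age, city):
    """Count events whose age and city fields match exactly."""
    events = (json.loads(line) for line in json_lines)
    return sum(1 for e in events if e["age"] == age and e["city"] == city)


json_lines = to_json_lines(RAW_LOG)
print(json_lines[0])
print(count_matching(json_lines, 29, "San Francisco"))
```

Because the age lives in its own field, the “29” in the date can no longer cause a false match.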


Getting Started

You'll need to convert your plain text logs into JSON. This is usually straightforward. If you run Apache, set up a custom logging format in your Apache configuration file (httpd.conf). Here are a few examples:

Common Log Format:

# %b logs "-" when no bytes are sent, so the size value is quoted to keep the JSON valid
LogFormat "{ \"remoteHost\":\"%h\", \"remoteLogname\":\"%l\", \"user\":\"%u\", \"time\":\"%t\", \"request\":\"%r\", \"status\":\"%>s\", \"size\":\"%b\" }" jsonlog
CustomLog logs/access_log jsonlog


NCSA extended/combined log format:

LogFormat "{ \"remoteHost\":\"%h\", \"remoteLogname\":\"%l\", \"user\":\"%u\", \"time\":\"%t\", \"request\":\"%r\", \"status\":\"%>s\", \"size\":\"%b\", \"referer\":\"%{Referer}i\", \"userAgent\":\"%{User-agent}i\" }" jsonlog
CustomLog logs/access_log jsonlog


Custom format with separate request fields:

LogFormat "{ \"time\":\"%t\", \"remoteIP\":\"%a\", \"host\":\"%V\", \"request\":\"%U\", \"query\":\"%q\", \"method\":\"%m\", \"status\":\"%>s\", \"userAgent\":\"%{User-agent}i\", \"referer\":\"%{Referer}i\" }" jsonlog
CustomLog logs/access_log jsonlog
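
Before sending anything, it can help to confirm that each line Apache writes really is valid JSON. One thing to watch for is a value left unquoted: Apache's %b writes "-" when no bytes are sent, which is not valid JSON outside quotes. The following is a minimal Python sketch of such a check. The function name and the sample lines are hypothetical, made up for illustration.

```python
import json


def find_invalid_lines(lines):
    """Return (line_number, error_message) for each line that is not a JSON object."""
    problems = []
    for number, line in enumerate(lines, start=1):
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        try:
            event = json.loads(line)
        except json.JSONDecodeError as exc:
            problems.append((number, exc.msg))
            continue
        if not isinstance(event, dict):
            problems.append((number, "not a JSON object"))
    return problems


# Hypothetical sample lines, made up for illustration.
sample = [
    '{ "remoteHost":"192.0.2.1", "status":"200", "size":"512" }',
    '{ "remoteHost":"192.0.2.1", "status":"304", "size":- }',  # unquoted "-" breaks JSON
]
for number, message in find_invalid_lines(sample):
    print(f"line {number}: {message}")
```

To check a real file, pass an open file object instead of the sample list, for example find_invalid_lines(open("logs/access_log")).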


Setting Up Loggly

Once you have your JSON data, you'll need to create a new JSON-enabled input. We accept JSON data over any of these protocols: TCP with Strip, UDP with Strip, Secure Syslog (TLS), or HTTP(S). The “with Strip” option means that we strip off the syslog header before we index your data, so that we can easily parse the JSON.
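
To illustrate what stripping the syslog header means, here is a rough Python sketch that drops everything before the first "{" and parses the remainder as JSON. It only illustrates the idea and is not Loggly's actual implementation. The function name and the sample syslog line are hypothetical.

```python
import json


def strip_syslog_header(line):
    """Drop everything before the first "{" and parse the rest as JSON.

    An illustration of the idea only, not Loggly's actual implementation.
    Returns None if no parsable JSON object is found.
    """
    start = line.find("{")
    if start == -1:
        return None
    try:
        return json.loads(line[start:])
    except json.JSONDecodeError:
        return None


# Hypothetical syslog line, made up for illustration.
line = '<134>Sep 29 10:15:00 web01 apache: { "status":"200", "request":"GET / HTTP/1.1" }'
event = strip_syslog_header(line)
print(event["status"])
```

Without the strip step, the leading priority, timestamp, and host text would make the whole line fail to parse as JSON.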