This tutorial will show you how to troubleshoot problems using trend analysis, find the root cause, monitor it on your dashboard, and set an alert. It will guide you through the demo shown in Loggly in 5 Minutes, but on your own account using sample data. The sample data is a smaller set for faster download, so the charts might look slightly different.
Try It With Sample Data
Step 1: Upload Loggly Sample Data
Upload Loggly’s Sample Data, which is a small file containing the events used in this walkthrough. It takes just a single command to upload the data.
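If you would rather script the upload than run it from a shell, the sketch below shows one way it might look in Python. It assumes Loggly's HTTP bulk endpoint accepts newline-separated events at a URL of the form shown; the token placeholder, the helper name `build_bulk_request`, and the two example events are hypothetical and are not the actual sample file.

```python
import urllib.request

def build_bulk_request(token, tag, lines):
    """Build (but do not send) a POST carrying newline-separated events.

    Assumption: the bulk endpoint URL pattern below; check your account's
    Source Setup page for the exact URL and your customer token.
    """
    body = "\n".join(lines).encode("utf-8")
    url = f"https://logs-01.loggly.com/bulk/{token}/tag/{tag}/"
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "text/plain"},
        method="POST",
    )

# Hypothetical events standing in for the real sample file.
req = build_bulk_request(
    "YOUR-CUSTOMER-TOKEN",
    "sample",
    ['{"querytime_ms": 120}', '{"querytime_ms": 950}'],
)
# urllib.request.urlopen(req)  # uncomment to actually send the events
```

The request is only constructed here, so nothing is sent until you call `urlopen` with a real token.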
Step 2: Search for Your Sample Data
Verify you successfully sent the data to Loggly by searching for all the events you just uploaded using the sample tag.
Step 3: Zoom In On the Events
At first, the data will appear as one big spike on the left. Zoom in by clicking on the blue column and then dragging with your mouse until the events are evenly distributed across the time series chart. This will make it easier to see trends.
Step 4: Save Your Search
You can save this search and time series chart view so you can go back to it later. Call it “Sample Events”.
Step 5: Create a Source Group
Instead of including tag:sample in every search, create a source group so the search applies this tag automatically. Go to the Source Setup tab, then click Source Groups. Name the source group “Sample” and enter “sample” as the tag.
Step 6: Plot Maximum Response Time
Let’s imagine we have a problem where results are coming back slowly, and we want to use trend analysis to find out why. Search for response time on query calls by selecting the Sample source group, then entering this in the search box.
To plot the maximum response time, click the trend analysis subtab in the middle of the screen, then click to expand the dropdown, and select Timeline. Select the maximum statistic in an area chart. The chart automatically zooms in on the part with data. You can see a few spikes where the responses came back slow.
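To make the "maximum statistic on a timeline" idea concrete, here is a minimal sketch of what such a chart computes: events are grouped into fixed time buckets and the largest value in each bucket is kept. The function name, the 60-second bucket size, and the example data are assumptions for illustration, not Loggly's implementation.

```python
def max_per_bucket(events, bucket_seconds=60):
    """Return {bucket_start: max_value} for (timestamp_seconds, value) pairs."""
    buckets = {}
    for ts, value in events:
        start = ts - ts % bucket_seconds  # align the timestamp to its bucket
        if start not in buckets or value > buckets[start]:
            buckets[start] = value
    return dict(sorted(buckets.items()))

# Hypothetical (timestamp, querytime_ms) pairs.
sample = [(0, 120), (30, 950), (65, 200), (119, 180), (130, 4000)]
timeline = max_per_bucket(sample)
print(timeline)  # {0: 950, 60: 200, 120: 4000}
```

A single slow response dominates its bucket, which is why the maximum statistic surfaces spikes that an average would smooth out.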
Step 7: Plot Average Response Time
Step 8: Range Search for Slow Responses
To find just the slow events, do a range search for responses over the SLA of 500ms. A range search must have an upper limit, so set it higher than the maximum response time to include all the slow events.
json.querytime_ms:[500 TO 10000]
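The range search above behaves roughly like the filter sketched below. It assumes the square brackets make both ends of the range inclusive, as in Lucene-style query syntax, and it uses hypothetical already-parsed events rather than the real sample data.

```python
def in_range(events, field, low, high):
    """Keep events whose numeric field lies in [low, high], ends inclusive."""
    return [e for e in events if field in e and low <= e[field] <= high]

# Hypothetical parsed events.
events = [
    {"querytime_ms": 120},
    {"querytime_ms": 500},
    {"querytime_ms": 7300},
    {"querytime_ms": 12000},
    {"status": "ok"},  # no querytime_ms field, so it never matches
]
slow = in_range(events, "querytime_ms", 500, 10000)
print(slow)  # [{'querytime_ms': 500}, {'querytime_ms': 7300}]
```

The 12000 ms event is dropped in this sketch, which illustrates why the upper limit should be set above the largest response time you expect.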
Step 9: Filter on Top Failures
To see why they are slow, expand the filter for failures, then click show more to see the top failure code. Clicking on the top failure code will add the filter on that value.
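Conceptually, the filter panel counts how often each value of a field occurs, and clicking a value narrows the results to events carrying it. The sketch below illustrates this with a hypothetical `failure_code` field and made-up codes; the field name in the actual sample data may differ.

```python
from collections import Counter

def top_values(events, field, n=3):
    """Return the n most common values of a field as (value, count) pairs."""
    counts = Counter(e[field] for e in events if field in e)
    return counts.most_common(n)

# Hypothetical slow events; "failure_code" is an assumed field name.
slow_events = [
    {"failure_code": "E42"},
    {"failure_code": "E42"},
    {"failure_code": "E7"},
    {"failure_code": "E42"},
    {"querytime_ms": 900},
]
top = top_values(slow_events, "failure_code")
top_code = top[0][0]
# Equivalent of clicking the top value to add it as a filter.
filtered = [e for e in slow_events if e.get("failure_code") == top_code]
print(top)  # [('E42', 3), ('E7', 1)]
```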
Step 10: See Expanded Event View and Automated Parsing
To learn more about events with this failure code, switch to the event view. Then click on an individual event to expand it out. You will see each field has been automatically parsed out. This is what enables the trend analysis and filters to work on individual fields or facets.
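As a rough illustration of how a JSON event can become individually searchable fields, the sketch below flattens a nested object into dotted names such as json.querytime_ms. The `flatten` helper and the `request.path` field are hypothetical; this is not a description of Loggly's parser.

```python
import json

def flatten(obj, prefix="json"):
    """Flatten nested dicts into {"prefix.key.subkey": value} pairs."""
    fields = {}
    for key, value in obj.items():
        name = f"{prefix}.{key}"
        if isinstance(value, dict):
            fields.update(flatten(value, name))  # recurse into nested objects
        else:
            fields[name] = value
    return fields

# Hypothetical raw event.
raw = '{"querytime_ms": 7300, "request": {"path": "/query"}}'
fields = flatten(json.loads(raw))
print(fields)  # {'json.querytime_ms': 7300, 'json.request.path': '/query'}
```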
Step 11: Create an Alert
Create an alert so that if responses come back slowly in the future, you will receive an email. First, create a saved search called “Responses Over SLA”, then click the link at the bottom of the popup window to create an alert. Call the alert “Responses Over SLA” and set it to send you an email if the search matches more than 25 times in 5 minutes. Note that this alert won’t actually fire, because you are not sending live data and the saved search uses a custom time range rather than a relative one.
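The alert condition can be thought of as a count over a trailing time window. The following sketch, with assumed names and Unix-second timestamps, checks whether more than 25 matching events fall in the last 5 minutes; how Loggly schedules and evaluates alerts internally is not covered here.

```python
def should_alert(timestamps, now, window_seconds=300, threshold=25):
    """True if more than `threshold` events fall in the trailing window."""
    recent = [t for t in timestamps if now - window_seconds < t <= now]
    return len(recent) > threshold

# Hypothetical timestamps of slow responses, one per second.
hits = [1000 + i for i in range(26)]
print(should_alert(hits, now=1100))       # True: 26 events in the window
print(should_alert(hits[:25], now=1100))  # False: exactly 25 is not "more than 25"
print(should_alert(hits, now=2000))       # False: all events are older than 5 minutes
```

The last call mirrors the note above: events outside the trailing window, such as previously uploaded sample data, never trigger the alert.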
Step 12: Create a Dashboard Widget
Go back to the search page, and then the time series chart view. Create a bar chart showing a count of events. These are responses over the SLA. Click the button on the upper right to create a dashboard widget. Call it “Responses over SLA”, then save it.
Step 13: Create a New Dashboard
Step 14: Add Your Widget to the Dashboard
Step 15: Send Your Own Data
Go to the Source Setup tab. Send your own log data to Loggly, then set up your own dashboards, alerts, and more!