Loggly Helps Segment Efficiently Operate a SaaS Service with 100+ Touchpoints
“With billions of API calls every month, we generate a huge amount of log data, and it’s difficult and expensive to manage ourselves. It really made the most sense to bring in an expert like Loggly to take it off our plates.”
Calvin French-Owen, Founder, Segment
- Significantly cut resolution time for customer issues
- Enabled tech support team to close most cases without involving developers or DevOps
- Kept valuable engineering resources focused on core business
Segment offers cloud-based companies a single platform that collects, stores, and sends data to hundreds of business tools with the flip of a switch. This transparent integration takes a lot of headaches away from its customers, serving as a routing hub for all of their analytics and marketing tools. But when running your business involves integrating with more than 100 web-based services and billions of API calls every month, there are many places where things can go awry. Segment depends on its log data to troubleshoot operational issues and to keep its service in top shape.
“We have hundreds of thousands of requests to partner services at any given time, which makes identifying and resolving issues quickly a little complicated,” says Calvin French-Owen, one of the founders at Segment. “Any given problem can be the result of an error in our service, bad responses or slow responses from partners, or configuration issues on the part of our customers.” Therefore, Segment logs responses from its application and all of its external touchpoints.
Segment’s log management strategy evolved several times as its customer base grew. The company’s original approach of aggregating its logs with syslog made it difficult to correlate related issues across machines. After experimenting with and outgrowing a couple of other cloud-based log management services, Segment built an internal logging service on the ELK Stack (Elasticsearch, Logstash, and Kibana). However, the service lacked depth in reporting and could not surface insights from the data quickly without consuming internal engineering time.
Additionally, because engineering resources were focused on advancing Segment’s core product, Segment had limited resources to build log indices; the end result was that most logs were only stored as text. Without indexed log data, it took developers and support personnel much longer to isolate the specific logs they needed to troubleshoot customers’ integration problems. Finally, Segment engineers TJ Holowaychuk and Garrett Johnson decided to really invest in Segment’s logging systems and migrate to Loggly.
“Logging is not the meat of our business – analytics is,” French-Owen remarks. “We want to be able to move really quickly on analytics, and we know that Loggly can do a much better job at log management than we can.”
French-Owen cited several key deciding factors in choosing Loggly:
- Comprehensive indexing: Loggly was able to give Segment significantly more visibility into its log data, making searches much more efficient and increasing the signal-to-noise ratio of the logs.
- Good user interface: Anyone on Segment’s team can access log data in a visual format. “There’s no odd query language you need to teach people.”
- API: Holowaychuk also built a CLI that hooks up to the Loggly API so that his developers can access log data without using the GUI.
For all of its machines, Segment logs errors related to requests to and responses from all of its integrations. It also logs errors tied to events, such as a disconnection from one of Segment’s queues, Redis clients, or Mongo. Before deploying Loggly, Segment had set up a separate log server to which all of its machines sent logs. As a result, sending data to Loggly was as easy as adding a single plugin and turning it on, and because the service scales on its own, Segment was able to tailor its investment to its changing and growing business needs.
When Segment first started using Loggly, it sent the same types of text-based logs that it indexed with its internal service. The company now makes extensive use of JSON and has been refining its logs to gain the maximum advantage from Loggly’s indexing.
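To illustrate why JSON logs index better than free text, here is a minimal sketch using only Python's standard library. It is not Segment's code and does not use any Loggly API; the logger name and the fields `customer_id`, `integration`, `status`, and `latency_ms` are hypothetical and chosen for illustration only.

```python
import io
import json
import logging


class JsonFormatter(logging.Formatter):
    """Render each log record as one JSON object per line."""

    def format(self, record):
        payload = {
            "level": record.levelname,
            "message": record.getMessage(),
        }
        # Hypothetical structured fields passed through `extra=`.
        payload.update(getattr(record, "fields", {}))
        return json.dumps(payload, sort_keys=True)


def build_logger(stream):
    """Create a logger that writes JSON lines to the given stream."""
    logger = logging.getLogger("integration-sketch")
    logger.setLevel(logging.INFO)
    logger.handlers.clear()
    logger.propagate = False
    handler = logging.StreamHandler(stream)
    handler.setFormatter(JsonFormatter())
    logger.addHandler(handler)
    return logger


def log_partner_error(logger, customer_id, integration, status, latency_ms):
    """Log a failed partner request with searchable key/value fields."""
    logger.error(
        "partner request failed",
        extra={
            "fields": {
                "customer_id": customer_id,
                "integration": integration,
                "status": status,
                "latency_ms": latency_ms,
            }
        },
    )
```

Because every value sits under a named key, a log service can index each field separately, so a search on one customer or one integration does not depend on matching free text.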
Faster Customer Support
Segment’s technical support team is the primary group of Loggly users. “Since there are so many companies involved when we route data (our customers, our integration partners, and ourselves), Loggly is the first place we go to see what’s working and what’s not,” French-Owen comments. “We need both a broad view of where something went wrong along with the ability to narrow down onto a specific request.”
Before Loggly, troubleshooting a customer’s problem usually involved manual integration testing and a long series of back-and-forth discussions between the customer, Segment support, and the Segment developer team. Now, when a customer contacts the Segment support team, the Segment team member first uses a debugger in the Segment application to verify that data requests are being sent out. Next, the team member goes into Loggly and uses a unique customer identifier to determine whether requests are reaching the relevant partner service and whether any error messages are being generated in the process.
Segment support can quickly identify:
- Customer-specific issues such as being over plan in the partner service, using an expired API key, etc.
- Partner services that are not responding or are responding slowly.
- Potential problems with Segment’s code or in how Segment has normalized data for the partner’s required data formats.
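The triage described above can be sketched roughly as follows. This is an illustration only, not Segment's or Loggly's logic: the event fields, the status-code groupings, the category names, and the `SLOW_MS` threshold are all assumptions.

```python
from collections import Counter

SLOW_MS = 2000  # hypothetical cutoff for a "slow" partner response


def classify(event):
    """Map one logged partner response to a likely cause (assumed rules)."""
    status = event.get("status")
    if status is None or status >= 500:
        return "partner_unavailable"   # no response or partner-side failure
    if event.get("latency_ms", 0) > SLOW_MS:
        return "partner_slow"
    if status in (401, 402, 403, 429):
        return "customer_account"      # e.g. expired API key, over plan
    if status in (400, 422):
        return "payload_format"        # possibly a data normalization issue
    return "ok"


def triage(events, customer_id):
    """Count likely causes across one customer's logged events."""
    mine = (e for e in events if e.get("customer_id") == customer_id)
    return Counter(classify(e) for e in mine)
```

Filtering on the customer identifier first mirrors the support workflow: narrow to one customer, then see which category dominates.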
“It’s been great that we can solve customers’ problems much faster than we could before,” French-Owen notes. “And in most cases, the DevOps team no longer has to get involved at all. Loggly is a way to keep our sanity around having all of these external touchpoints.”
Opportunities for Proactive Log Management
Moving forward, French-Owen sees opportunities to use log data for proactive monitoring. Today, if his team sees an uptick in errors relating to a specific partner, they may reach out to that partner, buffer messages, or both. The team is also looking to Loggly for:
- Monitoring for errors in code releases: Monitoring log data to ensure that new code is doing what it’s intended to do.
- Proactive customer service: If the volume drops significantly for a specific type of request or a specific customer, is something happening that needs attention?
“When you have touchpoints with 100+ different services, the probability that one will be slow or experiencing problems is actually pretty high,” French-Owen concludes. “Investing in log management is something that pays off down the road.”
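As a rough illustration of the volume-drop check mentioned in the list above, the sketch below compares the latest request count against a simple average of recent counts. The function name, the averaging baseline, and the 50% `drop_ratio` are assumptions, not a description of how Segment or Loggly implement alerting.

```python
def volume_dropped(history, current, drop_ratio=0.5):
    """Return True when `current` falls below `drop_ratio` of the recent mean.

    `history` holds request counts from earlier intervals for one customer
    or request type; an empty history yields False (nothing to compare to).
    """
    if not history:
        return False
    baseline = sum(history) / len(history)
    return baseline > 0 and current < baseline * drop_ratio
```

For example, with recent counts of 100, 120, and 110 requests per interval, a current count of 40 would be flagged, while a count of 90 would not.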