Loggly Loves Large Data: But How Much?


The launch of Loggly Gen2 changed everything: how our systems process, store and extract insight from logs; the best-of-breed open-source architecture we built to accept and index large log volumes at line speed; and the all-new UI that gives customers insight ahead of search.

Our Gen2 release also changed the conversations we can have and the customers we can handle. Now Loggly really is “all growns up.” (Swingers, the 1996 movie, NSFW)

We routinely have customers ask us about sending large volumes of data: over 100GB/day and even over 5TB/day. Many of them are large cloud-centric companies that were previously stuck with expensive on-premise solutions or with slower, unstable services. They are looking for ways to improve operational troubleshooting and to reduce their footprint and costs.  Let’s address how Loggly handles large volumes.

How much data can I send?  
Loggly Gen2 has newly redesigned collectors that are able to ingest up to 4TB/day each. That’s enough to handle very large customers singlehandedly, but we also scale them horizontally to serve over 3,500 active customers. We process more data every day than Twitter, and have been growing 5x year over year.

How will you grow with peak demand?  
We helped the Obama for America campaign process over $600M in donations.  We scaled up automatically to handle peak demand and scaled down at the end of the campaign, and the campaign team never had to touch a single server.  Even better, our Production plan includes free overage protection, so you don’t need to pay for your peak volume; you simply pay for your rolling average volume.  We grow when your business grows.
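To see why paying for a rolling average matters, here is a minimal Python sketch that compares a peak day with the average over a window. The 30-day window, the simple arithmetic mean, and the name `rolling_average_gb` are assumptions made for illustration; they are not Loggly's published billing formula.

```python
from typing import Sequence


def rolling_average_gb(daily_gb: Sequence[float], window_days: int = 30) -> float:
    """Average daily volume over the most recent `window_days` days.

    Hypothetical helper: the window length and the plain mean are
    illustrative assumptions, not an actual billing rule.
    """
    recent = list(daily_gb)[-window_days:]
    if not recent:
        raise ValueError("need at least one day of volume data")
    return sum(recent) / len(recent)


# 29 ordinary days at 100 GB plus one 400 GB spike.
volumes = [100.0] * 29 + [400.0]
peak = max(volumes)
average = rolling_average_gb(volumes)
print(f"peak day: {peak:.0f} GB, rolling average: {average:.0f} GB")
```

In this made-up month a single 400 GB spike moves the average from 100 GB to 110 GB, which is the kind of gap between peak and billed volume the paragraph above describes.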

How fast is it?
Events typically appear on your dashboard within 5-7 seconds and searches on millions of events come back in seconds too.  Compare this to some of our competitors where you can be left waiting for minutes searching over a day’s worth of data.  Our entire processing pipeline is scalable and built on top of LinkedIn’s Kafka, Twitter’s Storm and Elasticsearch.  We automatically shard your data at sizes as low as 30GB, for fast performance and search results.  We strive every day to deliver the best performance of any cloud-based logging solution.

How reliable are cloud services?
Logging offsite is often more reliable than logging locally, which is why it is frequently a requirement for audit compliance.  An offsite solution not only survives individual instances or servers crashing, it also gives you access to your data when your local datacenter or network goes down, a situation where you may need your log data the most.  To increase robustness and reliability, Loggly is distributed across Amazon regions and zones as well as our own datacenter.  You can use proven, standard, time-tested syslog forwarders like rsyslog, syslog-ng and nxlog, which provide reliable delivery over TCP and will queue locally and retry if you are cut off from the network.  Our backend includes distributed persistent queues that absorb bursts of traffic without dropping data.  Consider the on-premise alternative, where neglected log server disks routinely fill up and crash.  This can page your DevOps team in the middle of the night, or even crash dependent services and cause major outages.  Loggly lets you sleep easy, and it’s there when you need it.
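The "queue locally and retry" behavior of a syslog forwarder can be sketched in a few lines of Python. This is an illustration of the general idea only, not how rsyslog, syslog-ng or nxlog are implemented; `BufferedForwarder`, `FlakyLink` and the in-memory queue are hypothetical names and simplifications (real forwarders typically offer disk-backed queues).

```python
from collections import deque
from typing import Callable, Deque, List


class BufferedForwarder:
    """Forward log lines in order, holding them locally while the link is down."""

    def __init__(self, send: Callable[[str], None], max_queued: int = 10_000):
        self._send = send  # hypothetical transport, e.g. a write to a TCP socket
        # Bounded queue: once full, the oldest queued line is dropped.
        self._queue: Deque[str] = deque(maxlen=max_queued)

    def forward(self, line: str) -> None:
        self._queue.append(line)
        self.flush()

    def flush(self) -> int:
        """Send queued lines oldest first; stop at the first failure."""
        sent = 0
        while self._queue:
            try:
                self._send(self._queue[0])
            except OSError:
                break  # still offline: keep the line for the next attempt
            self._queue.popleft()
            sent += 1
        return sent

    @property
    def pending(self) -> int:
        return len(self._queue)


class FlakyLink:
    """Stand-in for a network connection that can be switched on and off."""

    def __init__(self) -> None:
        self.up = False
        self.delivered: List[str] = []

    def __call__(self, line: str) -> None:
        if not self.up:
            raise OSError("network unreachable")
        self.delivered.append(line)


link = FlakyLink()
forwarder = BufferedForwarder(link)
forwarder.forward("event 1")  # link is down: queued locally
forwarder.forward("event 2")  # still down: queued behind event 1
link.up = True
forwarder.forward("event 3")  # link is back: everything drains in order
print(link.delivered, forwarder.pending)
```

A line is removed from the queue only after the send succeeds, so an outage delays events rather than losing them, up to the size of the local queue.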

Is bandwidth expensive?
Sending 100GB/day of logs requires about 10Mbit/s of bandwidth if your volume is flat, but you may need more if your volume is concentrated in peak hours. Once you add in the cost of Loggly’s service, transmission accounts for less than 10% of the cost of your total solution.  If you are hosted in the Amazon Web Services (AWS) US-East or US-West region, it’s free and without noticeable latency.  Bandwidth concerns are usually so small that they pale in comparison to the true total cost of trying to justify on-premise investments. It’s one less thing to worry about, and you will do much better in the long run investing in insight for your company rather than in hardware to run it.  Paying for bandwidth is much cheaper than the cost of failure.
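The bandwidth figure is simple arithmetic, and the same calculation works for your own volume. The sketch below assumes decimal gigabytes (1GB = 10^9 bytes); `required_mbit_per_s` and the `peak_factor` parameter are hypothetical names, where `peak_factor` is how many times busier your peak hours are than the daily average.

```python
SECONDS_PER_DAY = 24 * 60 * 60


def required_mbit_per_s(gb_per_day: float, peak_factor: float = 1.0) -> float:
    """Bandwidth in Mbit/s needed to ship `gb_per_day` of logs.

    Assumes decimal units (1 GB = 1e9 bytes, 1 Mbit = 1e6 bits).
    `peak_factor` scales the flat average up to cover lopsided traffic.
    """
    bits_per_day = gb_per_day * 1e9 * 8
    average_mbit_per_s = bits_per_day / SECONDS_PER_DAY / 1e6
    return average_mbit_per_s * peak_factor


flat = required_mbit_per_s(100)
peaky = required_mbit_per_s(100, peak_factor=3.0)
print(f"flat: {flat:.2f} Mbit/s, 3x peak: {peaky:.2f} Mbit/s")
```

For 100GB/day the flat average works out to a little over 9Mbit/s, which is where the "about 10Mbit/s" figure comes from; a site whose peak hours run at three times the average would want closer to 28Mbit/s.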

How can I make sense of that much data?
Extracting the useful information among all the noise is difficult, even harder when there are huge volumes and hundreds of applications all lumped together, but that’s also why we exist.  Loggly Gen2 is designed to make it easier with automated parsing that makes search results more accurate, trend graphs to see patterns in millions of events, source groups to view only related sources, and even transactional tracing to watch single requests cross multiple applications and hosts.  We make it easier to see what’s happening!

Am I really going to save time and money?
Some companies use expensive on-premise products because of an old blanket IT policy to keep data in house.  Some of those same companies also used to host their own CRMs.  Today, Salesforce has a $30B market cap.

IT management is frequently pushing a cloud-first policy to save on expensive license, maintenance, and hardware fees.  They want to reduce their on-premise footprint and push more data to the cloud.  Other companies start out with free open-source tools in their early days, but as they grow those tools take more and more time to manage.  Large-scale distributed systems are always hard, and many see running one as one more chore that distracts from serving their customers.  Loggly runs your log system so you can focus on running your business.  We will never be the cheapest log management solution, but we are the most popular cloud-based log management service, and we think that is worth more.

View our plans and pricing options, or contact us to learn more about sending large volumes!
