Lodge: An HTTP to Syslog Proxy with Node.js

By Hoover J. Beaver 09 Sep 2010

Lately, I’ve been a real eager beaver about tweeting anything that even remotely mentions Node.js in passing. In fact, earlier in the year when I first started noticing its uptick in traffic, I tweeted a link to Will Larson’s fine post on monitoring and forwarding log events with Node.js. Since then, I’ve been itching to do a project myself, and just recently found the time and a reason to use it.

Keeping it Simple

With an extended private beta release of Loggly coming up next month, we’ve been in crunch mode and are trying to keep the feature set as simple as possible. One of the things I’ve had on my wish list is HTTP-enabled inputs, but unfortunately the outlook was grim for allocating the resources to get them done anytime soon. The reason I wanted web inputs was purely selfish: I want to be able to log from my AppEngine account to my Loggly account.

While our APIs are built for volume, they are primarily designed for searching data. As we’re already handling thousands of events a second on our current syslog-based inputs, implementing something that could only do hundreds of events a second on the HTTP inputs would be just plain silly.

If we were going to do it right, it’d have to be ridiculously fast and highly asynchronous.

Never Ask Zed If He’s Seen Pulp Fiction

Speaking of fast, round about 2 months ago I saw a post by Zed Shaw mentioning he was working on integrating 0MQ into Mongrel2. I thought this was fucking brilliant, and I figured it’d be worth a shot to see if he wanted to chat with us about how we could use Mongrel2 to catch HTTP POSTs and toss them into our own 0MQ enabled Solr cluster. A few short days later he swung by the Loggly office with his guitar and ended up singing us this coding ballad:

I was getting tired of their stateless HTTP chatter.
Nothing I wrote for them ever seemed to matter.
So I took a little hiatus just to clear my head.
I saw a flashing zero-m-queue just up ahead.
I started coding hard to get some satisfaction
With a little more C++ and a lot more action.
– Zed and his geetar

Zed said that Mongrel2 would be able to easily handle taking HTTP POSTs and then turn around and jam them into a 0MQ consumer as fast as we could pump them. “Essentially this is what Mongrel was made to do, and it’ll do all day long”, he told me. It appeared to us that Zed’s magic was the way to roll with our web input plans. I discussed it with Jon and Brian a bit, and then stuck it on the roadmap for later in the year.

Right about now you’re saying to yourself, where the hell is this post going, weren’t you going to talk about Node.js and Syslog, and why the hell did Zed just appear in the story and sing a song about Mongrel2 with a Toby Keith accent? Well, yeah, I hear you and that’s exactly what I’m going to do, right after I tell you how bad of a programmer I am.

Balancing the Load

I’m one of those people who is good at cherry picking bits of technology and assembling them into something useful, but coding-wise it takes me about 2x as long on a project as it would a professional, and my code doesn’t really deal with error handling very well. Ok, at all. For most of the languages I program in, I’m probably like a 6 or 7 on a good day. With JavaScript I’m like a 7 or 8, if I’ve had lots of caffeine and a good night’s rest.

So, what it boiled down to was that if I wanted HTTP inputs for our users anytime soon, I’d probably have to do it myself. I seriously doubted I’d be able to write a 0MQ enabled module for a newfangled web server in something (C++/Java?) that would be stable enough for us to put in production. And then I got to thinking I could do it with Node, in JavaScript, in a few days, and maybe it’d be fast enough.

Thankfully, even though I’m a crappy programmer, I’m actually a pretty damn good networking guy. One of the things I discovered quickly about Node.js when I started experimenting with it was that it feels like a high performance, fully programmable load balancer. It can, in fact, do something that a very pricy solution like Zeus’ Traffic Manager product fails at: Proxy HTTP POSTs through to a Syslog Server. Talk about a knockout blow.

Open Source IT

And so, after about a week of hacking, I was able to come up with a fast, decently scalable, stable solution all by myself for getting HTTP events into Loggly. And here’s the good news: It appears that Lodge can handle just north of 2K event POSTs a second, which is more than fine and dandy for us for right now. By the time we get to needing it to be faster, we can get something done with Mongrel2.

I also figured this might be useful as a standalone project if someone wanted to forward logs out of AppEngine, Heroku, EngineYard, or even Node.js itself, so I took all the Loggly-specific stuff out and jammed it up on GitHub for everyone. You can fork Lodge here.

To run Lodge, you’ll need to download it from GitHub, install Node.js, and then start it by doing the following:

node lodge.js

Lodge forwards TCP-based syslog data to the local machine on port 514. By default, most syslog servers will not be configured to take TCP streams on port 514. You’ll probably need to fiddle with your syslog server a bit to get it working.
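To make the idea concrete, here is a rough sketch of what an HTTP-to-syslog forwarder can look like in Node. This is not Lodge's actual source: the function names, the `local0`/`info` priority, the `lodge` hostname, the `http` tag and the 8080 HTTP port are all assumptions picked for illustration. The only things taken from the description above are the TCP transport and the local port 514 destination.

```javascript
// Illustrative sketch only; this is not Lodge's actual source code.
const http = require('http');
const net = require('net');

const MONTHS = ['Jan', 'Feb', 'Mar', 'Apr', 'May', 'Jun',
                'Jul', 'Aug', 'Sep', 'Oct', 'Nov', 'Dec'];

// Build one BSD-style syslog line: "<PRI>Mmm dd hh:mm:ss host tag: message\n".
// PRI is facility * 8 + severity; the defaults (16 = local0, 6 = info),
// the hostname and the tag are assumptions chosen for this example.
function formatSyslogLine(message, opts = {}) {
  const { facility = 16, severity = 6, hostname = 'lodge',
          tag = 'http', date = new Date() } = opts;
  const two = (n) => String(n).padStart(2, '0');
  const day = String(date.getDate()).padStart(2, ' ');
  const stamp = `${MONTHS[date.getMonth()]} ${day} ` +
    `${two(date.getHours())}:${two(date.getMinutes())}:${two(date.getSeconds())}`;
  // Collapse line breaks so one POST body becomes exactly one syslog line.
  const body = String(message).replace(/[\r\n]+/g, ' ').trim();
  return `<${facility * 8 + severity}>${stamp} ${hostname} ${tag}: ${body}\n`;
}

// Create (but do not start) an HTTP server that forwards each POST body
// to a TCP syslog listener, by default 127.0.0.1:514 as described above.
function createProxyServer(syslogPort = 514, syslogHost = '127.0.0.1') {
  return http.createServer((req, res) => {
    if (req.method !== 'POST') {
      res.writeHead(405, { Allow: 'POST' });
      res.end();
      return;
    }
    const chunks = [];
    req.on('data', (chunk) => chunks.push(chunk));
    req.on('end', () => {
      const line = formatSyslogLine(Buffer.concat(chunks).toString('utf8'));
      // One short-lived connection per event keeps the sketch simple.
      const socket = net.connect(syslogPort, syslogHost, () => {
        socket.end(line);
        res.writeHead(200, { 'Content-Type': 'text/plain' });
        res.end('ok\n');
      });
      socket.on('error', () => {
        // Syslog unreachable: report it unless we already answered.
        if (!res.headersSent) {
          res.writeHead(502);
          res.end('syslog unavailable\n');
        }
      });
    });
  });
}

// To try it (8080 is an arbitrary, hypothetical HTTP port):
// createProxyServer().listen(8080);
```

Opening a fresh TCP connection per event is the simplest thing that works; anything chasing thousands of events a second would more likely keep a connection open and reuse it.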

If you are running Syslog-NG, your configuration file will need to look a bit like this:

   source s_lodge {
        tcp(ip(0.0.0.0) port(514) max-connections(300));
   };
   destination df_lodge {
        file("/var/log/lodge.log");
   };
   log {
        source(s_lodge);
        destination(df_lodge);
   };

Once you get Lodge running and have your syslog service listening on the correct port/protocol, you should be able to do a little curl action to test it:
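A smoke test along these lines can be sketched in Node as well. Treat the details as assumptions: the post doesn't say which port or path Lodge listens on, so the `8080` port, the `/` path, the plain-text body and both function names below are hypothetical and need adjusting to your setup.

```javascript
// Hypothetical smoke test; host, port and path are assumptions.
const http = require('http');

// Build the request options for posting one plain-text event.
function buildPostOptions(message, target = {}) {
  const { host = '127.0.0.1', port = 8080, path = '/' } = target;
  return {
    host,
    port,
    path,
    method: 'POST',
    headers: {
      'Content-Type': 'text/plain',
      'Content-Length': Buffer.byteLength(message),
    },
  };
}

// POST the message and resolve with the HTTP status code of the reply.
function postTestEvent(message, target) {
  return new Promise((resolve, reject) => {
    const req = http.request(buildPostOptions(message, target), (res) => {
      res.resume(); // discard the body; only the status matters here
      res.on('end', () => resolve(res.statusCode));
    });
    req.on('error', reject);
    req.end(message);
  });
}

// Example (needs a proxy listening on the assumed port):
// postTestEvent('hello from node').then(console.log, console.error);
```

If the Syslog-NG configuration above is in place, the posted event should then turn up in /var/log/lodge.log.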

I’m working on a new logging handler for AppEngine which will be able to use Lodge to log straight into a syslog server. I’ll get a post done for that when it’s ready. In the meantime, check out the logging facility docs for Python.

Happy logging!

Hoover J. Beaver
