Fixing Client IPs in Apache Logs with Amazon Load Balancers

If you are running your Web servers behind a load balancer, you have probably noticed that your logs contain the load balancer’s IP address as the client IP, which is kind of annoying. There is an Apache module called RPAF, which fixes exactly this issue. Once you have it downloaded and installed, you configure it as follows:

RPAFenable On
RPAFsethostname On
RPAFproxy_ips 127.0.0.1 10.0.0.1
RPAFheader X-Forwarded-For

The module works such that it takes the IP address that is being transmitted in the X-FORWARD-FOR header and sticks it into the request as the ClientIP.

The not so nice part is that the RPAFproxy_ips statement lets you define the IP addresses of your load balancer. If you have only a set amount of them, all is fine. But here comes the catch. If you are deployed on Amazon AWS and you are using their load balancer (LB), then this solution is going to frustrate you quite a bit. You will notice that the LB’s IP address changes constantly and you keep adding IP addresses to the configuration statement. After about 10 IP addresses, I got sick of that and I started looking at the source code of RPAF to solve this problem once and for all. Here is what I did:

On line 139 of mod_rpaf-2.0.c, I added a return 1; statement. This will tell the is_in_array() function to always assume that the request is coming from a load balancer, without checking the configured list of IP addresses. The rest of the RPAF code is robust enough to only replace the client ip when an X-FORWARD-FOR header is actually set. After the change, do a make install-2.0 and you are in business.

Happy logging!

2 Comments

RSA Security Conference – Cloud the Logging Killer App?

Logging - Cloud Kiler App

Logging

I am attending the RSA conference this week. The first session I attended was the Cloud Security Alliance (CSA) meeting. Reading some of the accompanying material and listening to some of the presentations and panels, I couldn’t help it but notice that the terms auditing and logging were all over.

Here is my attempt for an explanation of this. It seems that one of the reasons for this is the nature of the cloud. Think about it. You are in an environment where you don’t control much. You are in an environment where you cannot trust most of the infrastructure pieces. For example, if you are using AWS like we are doing at Loggly, you should generally not trust your AMIs (the OS images). Now, what do you do if you don’t trust someone? You observe them, you monitor them. That’s exactly what is and needs to happen in the cloud: You don’t trust the service. To mitigate this issue, you are going to monitor the service.

And to make this not just my explanation, here is what some panelists during the CSA meeting said:

“Loss of visibility in the cloud” – Scott Chasin, CTO McAfee SaaS Unit
“Lose control and still maintain accountability” – Ken Biery, Verizon Business.

Is the cloud the killer app for logging? And if that’s the case, how do you manage your logs in the cloud? There are hardly any cloud logging solutions out there. I think you see where I am going with this.

0 Comments