Can You Trust Your Logs?
Log output is the lifeblood of operations. Whether important events go to statistics, monitoring, your dev Slack channel, or even to PagerDuty and mobile phone alerts, we implicitly trust that the logs show what actually happened. The trust goes so far that we do not just record log messages for humans to read; we even base automatic actions on them. Services like fail2ban automatically configure IP-level blocks based on log messages, while modern cloud systems can spin up extra virtual machines in some situations. But who guarantees that the log messages were generated by the software that we assume generated them?
Syslog Fails to Audit Message Fields
Let’s take a look at the fail2ban service, though the same issues apply to other log-based reactions. fail2ban parses log files and reacts to messages from a specific program matching a regular expression. Here’s an example:
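To make the matching step concrete, here is a minimal Python sketch of how a fail2ban-style filter might pick the offending address out of a log line. The sample line, the host name, the PID, and the regular expression are illustrative assumptions, not fail2ban's shipped sshd filter and not output from a real system.

```python
import re
from typing import Optional

# Hypothetical log line in the classic syslog text layout:
# timestamp, host, program[pid], message. Not copied from a real system.
SAMPLE_LINE = (
    "Jan 12 10:15:01 myhost sshd[4242]: "
    "Failed password for root from 127.0.0.1 port 22222 ssh2"
)

# Illustrative pattern only; real filters are more thorough.
FAILED_LOGIN = re.compile(
    r"(?P<program>\S+?)\[(?P<pid>\d+)\]: "
    r"Failed password for (?P<user>\S+) from (?P<ip>\S+) port \d+"
)


def offending_ip(line: str, program: str = "sshd") -> Optional[str]:
    """Return the source address if `line` looks like a failed login
    reported by `program`, otherwise None."""
    match = FAILED_LOGIN.search(line)
    if match is None or match.group("program") != program:
        return None
    return match.group("ip")
```

Note that the only evidence of who wrote the line is the program name inside the text itself, which is exactly the field the filter has to take on faith.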
Apparently, someone was trying to guess the root password for this machine. Naughty. If this happens more than once or twice, fail2ban will add a firewall rule blocking the specified IP address for 15 minutes to prevent further attacks. The same message is shown in a more readable way in Loggly.
What is not obvious here is that the values for appName and the other fields are set by the program that generated the message and are not validated by syslog at all. Any program can pretend to be any other program. As a normal user, we can even use the logger(1) tool to issue the exact same message as sshd, indistinguishable by programmatic means, even though we are not the ssh daemon, do not control ssh in any way, and have no specific privileges on the system.
And, true enough, the generated message is indistinguishable from the one issued by sshd.
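The same effect can be sketched without logger(1), using Python's standard syslog module, which wraps the same syslog(3) interface. This is a minimal illustration under assumptions: the message wording, the default port, and the helper names are hypothetical, and the exact text a real sshd emits may differ.

```python
import syslog


def spoofed_sshd_message(ip: str, port: int = 22222) -> str:
    """Build message text that mimics a failed-login line (illustrative wording)."""
    return f"Failed password for root from {ip} port {port} ssh2"


def emit_as(ident: str, message: str) -> None:
    """Send `message` to the local syslog daemon under a caller-chosen identifier."""
    # The caller picks the identifier and the facility freely;
    # nothing checks that this process really is the program it names.
    syslog.openlog(ident=ident, logoption=syslog.LOG_PID, facility=syslog.LOG_AUTH)
    syslog.syslog(syslog.LOG_INFO, message)
    syslog.closelog()
```

Calling `emit_as("sshd", spoofed_sshd_message("127.0.0.1"))` as an unprivileged user would hand the local syslog daemon a message labeled as coming from sshd. Only the PID reveals the difference, and a text-based filter typically ignores it.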
A Malicious User Can Trigger Any Log Reaction
Using this loophole, it’s very easy for someone with a user account on the system to trigger automatic reactions to log entries. If fail2ban is running, we simply have to issue the command a few times with an IP address of our choosing, and fail2ban will dutifully add the IP address to the host firewall. A denial-of-service attack usually takes some effort to start, but this one is trivial to launch and can have a tremendous impact.
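Why a handful of forged lines is enough can be seen in a simplified model of the ban decision. This is a sketch, not fail2ban's implementation: the class name, the retry threshold of 3, and the 10-minute counting window are assumptions, and only the 15-minute ban comes from the text above.

```python
from collections import defaultdict, deque

BAN_SECONDS = 15 * 60    # ban length mentioned in the article
MAX_RETRY = 3            # assumed threshold
FIND_SECONDS = 10 * 60   # assumed window in which failures are counted


class BanTracker:
    """Counts matched log lines per address and decides when to ban."""

    def __init__(self, max_retry=MAX_RETRY, find_seconds=FIND_SECONDS,
                 ban_seconds=BAN_SECONDS):
        self.max_retry = max_retry
        self.find_seconds = find_seconds
        self.ban_seconds = ban_seconds
        self._failures = defaultdict(deque)  # ip -> timestamps of recent failures
        self._banned_until = {}              # ip -> time at which the ban expires

    def is_banned(self, ip: str, now: float) -> bool:
        return self._banned_until.get(ip, 0.0) > now

    def record_failure(self, ip: str, now: float) -> bool:
        """Register one matched log line; return True if it triggers a new ban."""
        if self.is_banned(ip, now):
            return False
        recent = self._failures[ip]
        recent.append(now)
        # Forget failures that fell out of the counting window.
        while recent and now - recent[0] > self.find_seconds:
            recent.popleft()
        if len(recent) >= self.max_retry:
            self._banned_until[ip] = now + self.ban_seconds
            recent.clear()
            return True
        return False
```

The tracker only ever sees addresses extracted from log text, so it cannot tell three genuine failures from three forged lines naming the same address.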
Other automated reactions can have wildly different effects. Maybe the monitoring system spins up a few more VMs at a cloud provider. Maybe the on-call admin is notified. The more log messages are processed automatically, the more far-reaching the possibilities become. Even when admins are notified, this is difficult to debug. The logs show a login attack, except the IP address looks wrong. Was the address faked? If it’s not localhost as in these examples, was the remote host compromised? Is there a bug in sshd? Here, the only good news is that modern, virtualized, single-purpose systems rarely have many unrelated users, making this attack vector a lot less likely than it used to be. Still, this kind of privilege escalation can turn a minor security issue into a serious problem.
Journald Authenticates Log Fields
As an improvement over syslog, journald actually provides some guarantees as to the authenticity of the fields it fills. Using journalctl -o json-pretty, we can check the full key/value pairs journald recorded for the messages above. Any field starting with an underscore is added by journald and guaranteed not to have been tampered with. (Some fields removed for readability.)
Contrast this with the message from a user issued with the logger tool.
Automated reactions to log messages can use these fields to ensure that the messages actually come from the programs they are expected to come from, and administrators analyzing the logs later can easily identify any tampering.
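A check on the trusted fields might look like the following Python sketch, which reads entries in the one-object-per-line form produced by journalctl -o json. The expected executable path and UID are assumptions about a typical sshd installation and need adjusting to the local system; the function names are hypothetical.

```python
import json
from typing import Iterable, Iterator

# Assumed values for a typical sshd; adjust to the local system.
EXPECTED_EXE = "/usr/sbin/sshd"
EXPECTED_UID = "0"  # journald's JSON output encodes field values as strings


def is_genuine_sshd(entry: dict) -> bool:
    """Decide based on journald-supplied fields only.

    SYSLOG_IDENTIFIER is set by the sender and is deliberately ignored;
    _EXE and _UID carry the underscore prefix of trusted fields.
    """
    return entry.get("_EXE") == EXPECTED_EXE and entry.get("_UID") == EXPECTED_UID


def genuine_messages(lines: Iterable[str]) -> Iterator[str]:
    """Yield MESSAGE from every JSON journal line that passes the check."""
    for line in lines:
        entry = json.loads(line)
        if is_genuine_sshd(entry):
            yield entry["MESSAGE"]
```

Fed with the output of journalctl -o json on standard input, for example via `genuine_messages(sys.stdin)`, this would drop a line written with the logger tool even when its SYSLOG_IDENTIFIER claims to be sshd.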
The issue raised in this article is real, but on modern, single-purpose virtualized systems its impact is small. The old syslog protocol simply does not have the capabilities to authenticate log messages in a way that would allow administrators to actually trust them.
Journald does, and it records trusted fields that allow admins to restrict automated responses to the correct programs and to identify tampering. If these fields were forwarded to tools like Loggly, this problem would become a non-issue.
Sadly, journald still has only spotty integration with centralized logging software (as I mentioned in my earlier blog post). Hopefully, this will change soon.