blog.polynom.me/content/2021-04-16-About-Logging.md
Alexander "PapaTutuWawa caef031d48
All checks were successful
ci/woodpecker/push/woodpecker Pipeline was successful
Initial commit
2024-01-05 18:10:44 +01:00

5.6 KiB

+++ title = "About Logging" date = "2021-04-16" template = "post.html" aliases = [ "/About-Logging.html" ] +++ TL;DR: This post also talks about the problems I faced while working on my logging. To log to syslog from within my containers that do not support configuring a remote syslog server, I had syslog-ng expose a unix domain socket and mounted it into the container to /dev/log.

Introduction

I have written a lot of blog posts about the lessons I have learned while setting up and maintaining my server. But now that I started to rework my infrastructure a bit, I had to inevitably look at something I may have overlooked in the past: logging!

Previously, I had Docker kind of manage my logs: If I needed something, I would just call docker-compose logs <service> and it would spit out logs. Then, I started to configure my services to log to files in various locations: my prosody server would log to /etc/prosody/logs/info.log, my nginx to /etc/nginx/logs/error.log, etc. This, however, turned out to be problematic as, in my case, prosody stopped logging into the file if I rotated it with logrotate. It was also a bit impractical, as the logs were not all in the same place, but distributed across multiple directories. Moreover, prosody was logging things that I did not want in my logs but I could not turn off, like when a client connected or authenticated itself. For me, this is a problem from two perspectives: On the one hand, it is metadata that does not help me debug an hypothetical issue I have with my prosody installation, on the other hand, it is metadata I straight-up do not want to store.

My solution was using a syslog daemon to process the logs, so that I could remove logs that I do not want or need, and drop them all off at /var/log. However, there was a problem that I faced almost immediately: Not all software I can configure to log to syslog, I can configure to log to a specific syslog server. Why is this a problem? Well, syslog does not work inside a Docker container out of the box, so I would have to have my syslog daemon expose a TCP/UDP (unix domain) socket that logs can be sent to. To see this issue you can try to run logger -t SomeTag Hello World inside one of your containers and try to find it, e.g. in your host's journal.

Today, I found my solution to both syslog logging within the containers and filtering out unneeded logs.

Syslog inside Containers

The first step was getting the logs out of my containers without using files. To this end, I configured my syslog daemon - syslog-ng - to expose a unix domain socket to, for example, /var/run/syslog and mount it into all containers to /dev/log:

source s_src {
       system();
       internal();
       unix-dgram("/var/run/syslog");
};

If you now try and run logger -t SomeTag Hello World inside the container, you should be able to find "Hello World" inside the host's logs or journals.

Ignoring Certain Logs

The next step was ignoring logs that I do not need or care about. For this, I set up two logs within syslog-ng: One that was going into my actual log file and one that was dropped:

destination d_prosody {
    file("/var/log/prosody.log");
};
filter f_prosody {
    program("prosody");
};
filter f_prosody_drop {
    program("prosody")
	and message("(Client connected|Client disconnected|Authenticated as .*|Stream encrypted .*)$");
};

# Drop
log {
    source(s_src);
    filter(f_prosody_drop);
    flags(final);
};
# Log
log {
    source(s_src);
    filter(f_prosody);
    destination(d_prosody);
    flags(final);
};

This example would log all things that prosody logs to the prosody location d_prosody and drop all lines that match the given regular expression, which, in my case, matches all lines that relate to a client connecting, disconnecting or authenticating.

Important is the flags(final); in the drop rule to indicate that a log line that matches the rule should not be processed any further. That log also defines no destination, which tells syslog-ng in combination with the final flag that the log should be dropped.

Additionally, I moved the log rule that matches everything sent to the configured source to the bottom of the configuration to prevent any of the logs to also land in the "everything" log.

Since I also host a Nextcloud server, I was also interested in getting rid of HTTP access logs. But I would also like to know when someone is trying to scan my webserver for vulnerable wordpress installations.

So I again defined rules similar to those above, but added a twist:

filter f_nextcloud_drop {
    program("nextcloud")
	and match("200" value(".nextcloud.response"));
};
log {
    source(s_src);
    parser { apache-accesslog-parser(prefix(".nextcloud.")); };
    filter(f_nextcloud_drop);
    flags(final);
};

As you can see, the rule for my Nextcloud is quite similar, except that I added a parser. With this, I can make syslog-ng understand the HTTP access log and expose its parts as variables to my filter rule. There, I say that my drop rule should match all access log lines that indicate a HTTP response code of 200, since those are locations on my server that I expect to be accessed and thus do not care about.

Conclusion

With this setup, I feel much better about the logs I produce. I also have done other things not mentioned, like configure logrotate to rotate my logs daily so that my logs don't grow too large and get removed after a day.

Please note that I am not an expert in syslog-ng. It just happend to be what I first got to do what I want. And the example rules I showed are also the first thing that I wrote and filtered out what I wanted.