diff --git a/content/blog/2021-04-16-About-Logging.md b/content/blog/2021-04-16-About-Logging.md new file mode 100644 index 0000000..b9411c7 --- /dev/null +++ b/content/blog/2021-04-16-About-Logging.md @@ -0,0 +1,128 @@ + + +*TL;DR*: This post also talks about the problems I faced while working on my logging. To log to +syslog from within my containers that do not support configuring a remote syslog server, I had +*syslog-ng* expose a unix domain socket and mounted it into the container to `/dev/log`. + +## Introduction + +I have written a lot of blog posts about the lessons I have learned while setting up and +maintaining my server. But now that I started to rework my infrastructure a bit, I had to +inevitably look at something I may have overlooked in the past: logging! + +Previously, I had *Docker* *kind of* manage my logs: If I needed something, I would just +call `docker-compose logs ` and it would spit out logs. Then, I started to +configure my services to log to files in various locations: my *prosody* server would +log to `/etc/prosody/logs/info.log`, my *nginx* to `/etc/nginx/logs/error.log`, etc. +This, however, turned out to be problematic +as, in my case, *prosody* stopped logging into the file if I rotated it with *logrotate*. It was +also a bit impractical, as the logs were not all in the same place, but distributed across multiple +directories. +Moreover, *prosody* was logging things that I did not want in my logs but I could not turn off, +like when a client connected or authenticated itself. For me, this is a problem from two perspectives: +On the one hand, it is metadata that does not help me debug an hypothetical issue I have with my +*prosody* installation, on the other hand, it is metadata I straight-up do not want to store. + +My solution was using a syslog daemon to process the logs, so that I could remove logs that I do not +want or need, and drop them all off at `/var/log`. However, there was a problem that I faced almost +immediately: Not all software I can configure to log to syslog, I can configure to log to a specific +syslog server. Why is this a problem? Well, syslog does not work inside a *Docker* container out of the +box, so I would have to have my syslog daemon expose a TCP/UDP (unix domain) socket that logs can be sent to. To +see this issue you can try to run `logger -t SomeTag Hello World` inside one of your containers +and try to find it, e.g. in your host's journal. + +Today, I found my solution to both syslog logging within the containers and filtering out unneeded logs. + +## Syslog inside Containers + +The first step was getting the logs out of my containers without using files. To this end, I configured +my syslog daemon - *syslog-ng* - to expose a unix domain socket to, for example, `/var/run/syslog` and +mount it into all containers to `/dev/log`: + +``` +source s_src { + system(); + internal(); + unix-dgram("/var/run/syslog"); +}; +``` + +If you now try and run `logger -t SomeTag Hello World` inside the container, you should be able +to find "Hello World" inside the host's logs or journals. + +## Ignoring Certain Logs + +The next step was ignoring logs that I do not need or care about. For this, I set up two logs within +*syslog-ng*: One that was going into my actual log file and one that was dropped: + +``` +destination d_prosody { + file("/var/log/prosody.log"); +}; +filter f_prosody { + program("prosody"); +}; +filter f_prosody_drop { + program("prosody") + and message("(Client connected|Client disconnected|Authenticated as .*|Stream encrypted .*)$"); +}; + +# Drop +log { + source(s_src); + filter(f_prosody_drop); + flags(final); +}; +# Log +log { + source(s_src); + filter(f_prosody); + destination(d_prosody); + flags(final); +}; + +``` + +This example would log all things that *prosody* logs to the *prosody* location `d_prosody` and drop all +lines that match the given regular expression, which, in my case, matches all lines that relate to a client +connecting, disconnecting or authenticating. + +Important is the `flags(final);` in the drop rule to indicate that a log line that matches the rule should +not be processed any further. That log also defines no destination, which tells *syslog-ng* in combination with +the `final` flag that the log +should be dropped. + +Additionally, I moved the log rule that matches everything sent to the configured source to the bottom +of the configuration to prevent any of the logs to *also* land in the "everything" log. + +Since I also host a *Nextcloud* server, I was also interested in getting rid of HTTP access logs. But I would +also like to know when someone is trying to scan my webserver for vulnerable *wordpress* installations. + +So I again defined rules similar to those above, but added a twist: + +``` +filter f_nextcloud_drop { + program("nextcloud") + and match("200" value(".nextcloud.response")); +}; +log { + source(s_src); + parser { apache-accesslog-parser(prefix(".nextcloud.")); }; + filter(f_nextcloud_drop); + flags(final); +}; + +``` + +As you can see, the rule for my *Nextcloud* is quite similar, except that I added a parser. With this, I can +make *syslog-ng* understand the HTTP access log and expose its parts as variables to my filter rule. There, +I say that my drop rule should match all access log lines that indicate a HTTP response code of 200, since +those are locations on my server that I expect to be accessed and thus do not care about. + +## Conclusion + +With this setup, I feel much better about the logs I produce. I also have done other things not mentioned, like +configure *logrotate* to rotate my logs daily so that my logs don't grow too large and get removed after a day. + +Please note that I am not an expert in *syslog-ng*. It just happend to be what I first got to do what I want. And +the example rules I showed are also the first thing that I wrote and filtered out what I wanted.