blog.polynom.me/content/2023-07-15-prosody-traefik-2.md
Alexander "PapaTutuWawa caef031d48
2024-01-05 18:10:44 +01:00


+++
title = "Running Prosody on Port 443 Behind traefik 2: Electric ALPN"
date = "2023-07-15"
template = "post.html"
aliases = [ "/prosody-traefik-2.html" ]
# <!-- description: In this blog post, I tell you how I changed my setup for proxying my XMPP server using traefik -->
+++
Hello everyone. Long time, no read.
In 2020, I published a post titled "[Running Prosody on Port 443 Behind traefik](https://blog.polynom.me/Running-Prosody-traefik.html)", where I described how I run my XMPP server
behind the "application proxy" [*traefik*](https://github.com/traefik/traefik).
I did this because I wanted to run my XMPP server *prosody* on port 443, so that the clients connected
to my server can bypass firewalls that only allow web traffic. While that approach worked,
over the last three years I changed my setup dramatically.
<!-- more -->
While migrating my old server from *Debian* to *NixOS*, I decided that I wanted a website
hosted at the same domain I host my XMPP server at. This, however, was not possible with
*traefik* back then because it only allowed the `HostSNI` rule, which differentiates TLS
connections using the sent *Server Name Indication*. This is a problem, because a connection
to `polynom.me` the website and `polynom.me` the XMPP server both result in the same SNI being
sent by a connecting client.
Some time later, I stumbled upon [*sslh*](https://github.com/yrutschle/sslh), which is a
tool similar to *traefik* in that it allows hosting multiple services on the same port, all
differentiated by the SNI **and** the ALPN set by the connecting client. ALPN, or *Application-Layer Protocol Negotiation*, is an extension
to TLS which allows a connecting client to advertise the protocol(s) it would like to use
inside the encrypted session [(source)](https://en.wikipedia.org/wiki/Application-Layer_Protocol_Negotiation). As such, I put
*sslh* in front of my *traefik* and told it to route XMPP traffic (identified with an ALPN
of `xmpp-client`) to my prosody server and everything else to my *traefik* server. While this
worked well, there were two issues:
1. I was not running *sslh* in its ["transparent mode"](https://github.com/yrutschle/sslh/blob/master/doc/config.md#transparent-proxy-support), which uses some fancy iptables rules to allow the services behind it to see a connecting client's real IP address instead of just `127.0.0.1`, because it requires more setup to work. Without it, services which enforce rate limits, like *Nextcloud* and *Akkoma*, run into trouble: If one of these services gets hit by many requests, all it sees are requests from `127.0.0.1`, so it may rate limit (or ban) `127.0.0.1`, meaning that all requests, even legitimate ones, are rate limited. Additionally, I was not sure whether transparent mode could route an incoming IPv6 request to `127.0.0.1`, which is an IPv4 address.
2. One day, as I was updating my server, I noticed that all my web services were responding very slowly. After some looking around, it turned out that *sslh* took about 5 seconds to route IPv6 requests, but not IPv4 requests. As I did not change anything (besides updating the server), to this day I am not sure what happened.
Due to these two issues, I decided to revisit the idea I described in my old post.
## The Prosody Setup
On the prosody side of things, I did not change a lot compared to the old post. I did, however,
migrate from the `legacy_ssl_*` options to the newer `c2s_direct_tls_*` options, which
[replace the former](https://hg.prosody.im/trunk/file/tip/doc/doap.xml#l758).
Thus, my prosody configuration regarding direct TLS connections now looks like this:
```lua
c2s_direct_tls_ports = { 5223 }
c2s_direct_tls_ssl = {
    [5223] = {
        key = "/etc/prosody/certs/polynom.me.key";
        certificate = "/etc/prosody/certs/polynom.me.crt";
    };
}
```
## The *Traefik* Setup
On the *traefik* side of things, only one thing really changed: Instead of just having a rule using
`HostSNI`, I now also require that the connection with the XMPP server advertises an ALPN
of `xmpp-client`, which is specified in the
[appropriate XMPP spec](https://xmpp.org/extensions/xep-0368.html). From my deployment
experience, all clients I tested (*Conversations*, *Blabber*, *Gajim*, *Dino*, *Monal*, [Moxxy](https://moxxy.org))
correctly set the ALPN when connecting via a direct TLS connection.
So my *traefik* configuration now looks something like this (Not really, because I let NixOS
generate the actual config, but it is very similar):
```yaml
tcp:
  routers:
    xmpps:
      entryPoints:
        - "https"
      # Match on both the SNI and the ALPN sent by the client
      rule: "HostSNI(`polynom.me`) && ALPN(`xmpp-client`)"
      service: prosody
      tls:
        # Hand the still-encrypted stream to prosody
        passthrough: true
  # [...]
  services:
    prosody:
      loadBalancer:
        servers:
          - address: "127.0.0.1:5223"

http:
  routers:
    web-secure:
      entryPoints:
        - "https"
      rule: "Host(`polynom.me`)"
      service: webserver
      # Enable TLS for this router with default options
      tls: {}
```
The entrypoint `https` is just set to listen on `:443`. This way, I can route IPv4 and IPv6
requests. Also note the `passthrough: true` in the XMPP router's `tls` settings. If this is
not set to `true`, then *traefik* would terminate the connection's TLS session before passing
the data to the XMPP server.
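
For reference, an entrypoint like this is declared in *traefik*'s static configuration rather than next to the routers. The following is only a minimal sketch, assuming the file-based YAML format and the entrypoint name `https` used above; my real config is generated by NixOS and may look different:

```yaml
# Static configuration (for example traefik.yml), not the dynamic router config
entryPoints:
  https:
    # No host part is given, so traefik is not pinned to a single address
    address: ":443"
```

Leaving out the host part is what should let the same entrypoint accept both IPv4 and IPv6 connections.
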
However, the router configuration above has one really big issue: For the website hosted at
`polynom.me` to be served using TLS, I have to set the HTTP
router's `tls` attribute. The *traefik*
documentation says that "*If both HTTP routers and TCP routers listen to the
same entry points, the TCP routers will apply before the HTTP routers. If no matching route
is found for the TCP routers, then the HTTP routers will take over.*"
[(source)](https://doc.traefik.io/traefik/routing/routers/#general_1).
This, however, does not seem to be the case if an HTTP router (in my example with ```Host(`polynom.me`)```) and a TCP router (in my example with ```HostSNI(`polynom.me`)```) respond to the same
SNI **and** the HTTP router has its `tls` attribute set. In that case, the HTTP router appears
to be checked first and will complain if the sent ALPN is not one of the
[HTTP ALPNs](https://developer.mozilla.org/en-US/docs/Glossary/ALPN), for example when
connecting using XMPP. As a result, I can connect to the HTTP server but not to the
XMPP server.
It appears to be an issue that [I am not alone with](https://github.com/traefik/traefik/issues/9922), but also
one that is not fixed. After digging around in *traefik*'s code and trying a couple of
things, I ended up applying [this patch](https://github.com/PapaTutuWawa/traefik/commit/36f0e3c805ca4e645f3313f667a6b3ff5e2fe4a9) to *traefik* to make my setup work. With that, the issue *appears*
to be gone, and I can access both my website and my XMPP server on the same domain and on the
same port. Do note that this patch is not upstreamed and may break things. For me, it
works, but I have neither tested it extensively nor run *traefik*'s integration and unit tests.
## Conclusion
This approach solves problem 2 fully and problem 1 partially. *Traefik* routes
the connections correctly and without the delay I saw with *sslh*. It also provides my web services
with the connecting clients' IP addresses using HTTP headers. It does not, however, provide
my XMPP server with a connecting client's IP address. This could be solved with some clever
trickery, like telling *traefik* to use the [*PROXY* protocol](https://doc.traefik.io/traefik/routing/services/#proxy-protocol) when connecting to prosody,
and enabling the [`net_proxy`](https://modules.prosody.im/mod_net_proxy.html) module.
I have not yet tried such a setup, though I am very curious and may try it out.
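
Purely as an untested sketch, the *traefik* half of that idea might look roughly like the following. It assumes the `proxyProtocol` option for TCP services described in the linked documentation. The port `15222` is a hypothetical placeholder for whatever port prosody's `net_proxy` module would be configured to accept PROXY protocol connections on. The prosody side is not shown here.

```yaml
tcp:
  services:
    prosody:
      loadBalancer:
        # Assumption: prepend a PROXY protocol header to every backend
        # connection so that prosody can learn the client's real address
        proxyProtocol:
          version: 2
        servers:
          # Hypothetical port; it must match the one net_proxy listens on
          - address: "127.0.0.1:15222"
```

Whether this works together with `passthrough: true` and direct TLS is something I would have to verify first.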