I have been embarking on an exercise to migrate some of the smaller applications I use into Docker containers.
This is the reverse of my prior, more secure approach, where I wanted 3rd party apps that might be insecure but were internet accessible to be isolated in their own KVM instances, so that if one did prove to be insecure and let someone into the system the impact on other services would be contained; in fact many of my KVMs were created specifically to isolate such applications and move them off my main web server. However memory is finite, and with each OS upgrade each KVM needs ever more resources just to run the operating system, so I have hit a limit and need to consolidate again, and Docker containers seem to be the way to go.
So far I have containerised my ‘live’ Hercules system (no savings, as it already ran on the web server, and in fact a little extra overhead from Docker now running there) plus my IRC server (InspIRCd), for which I have decommissioned the dedicated KVM instance (a real saving: that KVM had 256MB of memory allocated and was swapping badly, while the Docker container can be capped at 20MB and runs fine… I prefer to run my own customised image, but for those interested there is an official container image).
Trying to get a container that needs IP forwarding to work: a reverse route is needed
The next KVM instance I want to migrate is my openvpn server; from the work on it so far the entire thing runs in a Docker container capped at 32MB, so being able to decommission that server would also be a benefit. However it obviously needs to be able to pass traffic between the openvpn network and the internal network, which Docker is not keen on without a bit of tweaking.
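As a baseline check (this is not the fix described further down, just a prerequisite for any box forwarding VPN traffic): the docker host must have IP forwarding enabled. Docker normally switches this on itself when it starts, but it only takes a second to confirm:
# should print: net.ipv4.ip_forward = 1
sysctl net.ipv4.ip_forward
# if it prints 0, enable it
sudo sysctl -w net.ipv4.ip_forward=1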
As I want the container image to be portable, and obviously not to contain any server keys, the container requires that its /etc/openvpn directory be provided as a read-only overlay. That not only keeps the image portable but allows configurations to be switched or tested easily (ie: the source directory can be /home/fred/openvpn/files_server1 or /home/fred/openvpn/files_server2 on the docker run command without the container image needing to change). This allows the container (or multiple copies of it) to be started on any docker host with customised configurations managed outside of the image.
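So spinning up a test copy against a different configuration set is just a different bind mount on the run command, along these lines (the image name matches the start script later in this post, the host port is illustrative, and the other flags the container needs are trimmed here for brevity):
# same image, different server configuration selected at run time
docker run -d --rm --name openvpn-test -p 1195:1194 \
   -v /home/fred/openvpn/files_server2:/etc/openvpn:ro \
   openvpn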
That also made finding this issue easy. To track it down I simply installed Docker on the existing VPN server and tested the image there, which means the standalone and containerised openvpn processes run on the same host with exactly the same networking configuration and exactly the same configuration files. Standalone works; containerised needs extra steps on the docker host.
In the following standalone configuration everything works perfectly (for ping).
- The openvpn server can ping and connect to everything in the 10.0.1.0/24 network
- The openvpn server can ping and connect to external network addresses via the 10.0.1.0/24 network's external gateway
- Clients connected to the server can ping and connect to all addresses available in the above two points
+----------------------------------+
| |
| OPENVPN SERVER |
| |
Outside world< -->10.0.1.0/24 < --->| eth0 tun0 |< ----------->vpn clients
network | 192.168.1.173 10.8.0.1 | tun0 10.8.0.n
| gw routes to |
| 10.0.1.0/24 |
| |
+----------------------------------+
However in the following container configuration ping responses cannot traverse the return path unless a static route into the container is added on the docker host.
Note that this is the same server used above; the application has been moved into a container, but because the configuration files are provided by a filesystem directory overlay the container uses exactly the same files as the non-container example above. Absolutely everything is identical between the two configurations apart from the Docker ‘bridge’ connection between the container eth0 and the docker host docker0… and that bridge is working OK, as the container itself can reach the external networks.
- The openvpn container can ping and connect to everything in the 10.0.1.0/24 network
- The openvpn container can ping and connect to external network addresses via the 10.0.1.0/24 network's external gateway
- Clients connected to the server can ping the tunnel interface and the container's 172.17.0.2 interface, but cannot ping the host docker0 address or any of the addresses outside the container that the container itself can ping.
Fixed: on the docker host a route to the openvpn network needs to be added via the container's IP address (in the example below ‘route add -net 10.8.0.0/24 gw 172.17.0.2’ allows all servers within the 10.0.1.0/24 network to be pinged).
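For reference, the bridge address docker assigned to the container can be read straight from docker rather than guessed, and the iproute2 equivalent of the net-tools route command works just as well (the container name here matches the start script later in this post):
# read the address docker assigned to the container on the default bridge network
docker inspect -f '{{.NetworkSettings.IPAddress}}' openvpn1
# add the reverse route on the docker host, using either the net-tools form...
sudo route add -net 10.8.0.0/24 gw 172.17.0.2
# ...or the iproute2 equivalent (use one or the other, not both)
sudo ip route add 10.8.0.0/24 via 172.17.0.2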
+--------------------------------------------------------------------+
| |
| OPENVPN SERVER / DOCKER HOST |
| |
| +----------------------------------+ |
| | | |
| eth0 | OPENVPN CONTAINER | |
| 192.168.1.173 | | |
| gw routes to | | |
Outside world< -->10.0.1.0/24< ----> | 10.0.1.0/24 | eth0 tun0 |< ----------->vpn clients
network | | 172.17.0.2 10.8.0.1 | | tun0 10.8.0.n
| | gw routes to | |
| docker0 < --bridge--> | 172.17.0.1 | |
| 172.17.0.1 | | |
| +----------------------------------+ |
| ^ |
| | |
| STATIC ROUTE 10.8.0.0/24 gw 172.17.0.2---+ |
| (needed for ping reverse traversal) |
+--------------------------------------------------------------------+
With the static route added in the container configuration it behaves similarly to a native KVM instance running the openvpn server, and can ping all hosts in the internal 10.0.1.0/24 network.
PING does not mean it is all working
The ‘ping’ traffic returns to the originator by reversing back down the route it travelled to reach the target host; this is peculiar to utilities like ping and does not apply to normal TCP traffic. Normal traffic requires routing to provide a return path, so it would be easiest to implement an openvpn server on the host that is already the default gateway for all your servers; which is unlikely to happen in the real world.
In the real world I believe each server you need to access would require a route back to the host running the container for the 172.17.0.0/24 network, plus a route to the same host for the 10.8.0.0/24 network (or, with a KVM, simply the 10.8.0.0/24 route), in order for application traffic to find a return path.
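As a sketch of what that would mean, each internal server would need something like the following (the gateway address here is a placeholder for whatever next hop reaches the docker host from that server, not an address from my setup):
# on each internal server that VPN clients must reach (run as root)
# 10.0.1.254 is a placeholder next hop towards the docker host
route add -net 10.8.0.0/24 gw 10.0.1.254     # the VPN client address range
route add -net 172.17.0.0/24 gw 10.0.1.254   # the docker bridge range (container case only)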
Unfortunately I cannot really test that in my lab environment. While I can remove routes to the 10.0.1.0/24 network and confirm that traffic to that network does enter it via the openvpn connection, that is an existing network with routing already defined back to my home network via a different path, so I cannot reconfigure return routing via the openvpn network without severely damaging my lab setup.
One other point of note is that the working examples are for a native KVM or physical host running the container.
Almost the same configuration running in a container on an OpenStack KVM host simply does not work; the only difference in configuration is that the container host is itself on the 10.0.1.0/24 network. tcpdump shows the ping trying to leave the OpenStack KVM towards another KVM on the same internal network, but it never completes the final hop to the target host, and this is with security rules allowing all traffic on both hosts. The internal network is the same 10.0.1.0/24 network that works perfectly for ping when the container runs on a native KVM host on a separate network with a route to 10.0.1.0/24, yet it does not work when the container is run on an OpenStack host in that same network; lots of pain there. I believe this is an issue with OpenStack networking rather than Docker, simply because it works on a native KVM, but as google throws up lots of posts about difficulties with Docker traffic escaping the host machine I cannot be sure.
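For anyone wanting to repeat the check, a capture along these lines on the OpenStack instance is enough to see the behaviour (the interface name and filter are illustrative, not the exact command used):
# watch icmp to/from the internal network leaving the OpenStack instance
sudo tcpdump -ni eth0 icmp and net 10.0.1.0/24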
Starting the container
As the docker host needs a route via the container's assigned IP address, my script to start the container is shown below. It uses docker exec to extract the IP address assigned to the container after it has started, and if the route into the container for the openvpn network does not already exist it is added.
#!/bin/bash
# ======================================================================================
# Start the openvpn docker container
# which is not as simple as you may think.
#
# Issues: all hosts on the network that are to be accessible via the VPN tunnel need
# to know to route traffic to the openvpn network address range via the
# docker host running the container.
# And the docker host running the container needs to route into the container
# to access the openvpn tunnel (that is automated by this script so at least
# ping to all hosts will work as ping traverses back up the route it travelled
#          down so returns to the container host to look for the openvpn network,
# most ip traffic does not behave that way and needs complete routing entries).
# ======================================================================================
# --------------------------------------------------------------------------------------
# Start the Docker container
# * the configuration files for /etc/openvpn are supplied by a read only overlay
# * we also need to overlay the modules directory (read only as well) because
#   `uname` in the container reports the docker host's kernel version, not the
#   version the container image was built with.
# * we need access to the docker host's tun device, and capability SYS_MODULE to load
# the tun driver in the container.
# --------------------------------------------------------------------------------------
docker run -d --memory=32m --cpus="0.3" --rm --name openvpn1 -p 1194:1194 \
--cap-add=SYS_MODULE --cap-add=NET_ADMIN \
--device /dev/net/tun:/dev/net/tun \
--network bridge \
-v /home/fedora/package/openvpn/files/etc_openvpn:/etc/openvpn:ro \
-v /lib/modules:/lib/modules:ro \
openvpn
# --------------------------------------------------------------------------------------
# Docker host needs routing back to the openvpn network via the eth0 interface that
# docker assigned to the container. Use 'docker exec' to query the container for the
# ip address that was assigned.
#
# If a route already exists, we need to do nothing.
#
# ping replies that traverse back up the network path need to be able to route via
# the openvpn network, the docker host does not know about it so add a route.
# Clients can now get replies to pings...
# HOWEVER while pings traverse back up the path they came down, application traffic behaves
# differently and all servers that need to be contacted across tcp will need to be
# able to route traffic via the openvpn network... so the docker host machine should
# ideally be a default gateway for all servers.
# --------------------------------------------------------------------------------------
gw_addr=`docker exec openvpn1 ifconfig eth0 | grep inet | grep -v inet6 | awk '{print $2}'`
if [ "${gw_addr}." == "." ];
then
echo "**** Unable to obtain eth0 address from within container ****"
exit 1
fi
gw_exists=`route -n | grep "^10.8.0.0"`
if [ "${gw_exists}." == "." ];
then
myuserid=`whoami`
if [ "${myuserid}." != "root." ];
then
echo "---- Performing sudo: Enter your password to add the required route ----"
sudo route add -net 10.8.0.0/24 gw ${gw_addr}
else
route add -net 10.8.0.0/24 gw ${gw_addr}
fi
fi
gw_exists=`route -n | grep "^10.8.0.0"`
if [ "${gw_exists}." == "." ];
then
echo "**** Failed to add reqired route to openvpn network within container ****"
echo "Manually (as root or using sudo) enter : route add -net 10.8.0.0/24 gw ${gw_addr}"
fi
# --------------------------------------------------------------------------------------
# Done
# --------------------------------------------------------------------------------------
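A typical run, assuming the script above has been saved as start_openvpn1.sh (a name picked purely for illustration), followed by a couple of sanity checks:
chmod +x start_openvpn1.sh
./start_openvpn1.sh
ip route show | grep "^10.8.0.0"   # the route into the container should now be listed
docker logs openvpn1 | tail        # confirm openvpn started cleanly inside the container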
I have yet another unrelated pain
Even if I containerise the VPN and decommission the existing dedicated VPN server, I will still need another KVM to run the container on anyway.
The already converted IRC and mvs38j/TK4- containers I can happily run on my web server, as IP traffic is only between those containers and their clients.
An openvpn application, however, requires access to the internal network, and I have locked down my web server with iptables rules that prevent it from initiating any connections to the internal network, which I obviously do not want to change. So even if I containerise the VPN it will still need its own server, and I may as well just continue using the existing working KVM even though host memory is becoming an issue. The alternative of running it as a container on the KVM host itself rather than inside a KVM would release resources, but it is an extremely insecure option I prefer not to consider: if there is misuse I would rather kill a KVM than a host, which would affect multiple KVMs.
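For context, the lockdown on the web server amounts to rules of this general shape (a simplified sketch, not my actual ruleset):
# allow replies to already-established sessions back towards the internal network,
# but drop any NEW connections the web server itself tries to open to it
iptables -A OUTPUT -d 10.0.1.0/24 -m conntrack --ctstate ESTABLISHED,RELATED -j ACCEPT
iptables -A OUTPUT -d 10.0.1.0/24 -m conntrack --ctstate NEW -j DROP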