Installing OpenStack Ussuri in a multicompute node configuration in your home lab

This post covers installing OpenStack from the RDO project documentation that can be found at https://www.rdoproject.org, using the ‘packstack’ tool covered at https://www.rdoproject.org/install/packstack. The main changes in this post are defining additional compute nodes and installing heat support, as I use heat patterns a lot.

It also covers implementing connectivity from your OpenStack private networks to your existing network, in the correct order; the document on the RDO site that covers this does not mention the need to install OpenVswitch and, for me, edits the network scripts in the wrong order. I prefer that the bridge be configured and working before trying to install OpenStack.

This post does not cover a high-availability installation. It is for a simple home lab where having a single network/control node is adequate and you have a few servers lying around with enough resources to be additional compute nodes (whether physical, or with capacity to run KVM servers to act as compute nodes).

While the post covers adding additional compute nodes, if you wish to use the ‘allinone’ environment simply omit adding additional ip-addresses to the list of compute nodes when editing the answers file.

The last release of OpenStack from the RDO project for CentOS7 is the “Train” release. To obtain the latest “Ussuri” release you must be running CentOS8. This post covers using CentOS8 as we all want the latest release of course; I used the CentOS8 8.0.1905 installation media.

If you follow all the steps in this post you will end up with a working OpenStack environment with full external network connectivity.

Creating the installation environment, changes needed on the network/controller server

Install two (2) or more new CentOS8 servers. I used KVM, with the first server (15Gb memory, 70Gb disk) to be used as the network/control node plus a compute node, and the second server (10Gb memory, 50Gb disk) as a dedicated compute node. These servers must have static ip-addresses in your network range; I used 192.168.1.172 for the network/control node and 192.168.1.162 for the dedicated compute node. Memory allocations for dedicated compute nodes depend on what you intend to run of course. It is important to note these new servers should be assigned fully qualified host names.

Note: if you want more than one dedicated compute node, create additional CentOS8 servers with static ip-addresses at this time also. However, ensure you do not go mad and over-allocate compute nodes simply because you have spare capacity now; you may want to use that spare capacity for something else later, and it is a lot harder to remove a compute node than to add one. Only define those you need initially and add more later if needed, in preference to trying to unconfigure unused compute nodes later on.

It should also be noted that CentOS8 does not provide the network-scripts package by default as it is deprecated. However, it is a requirement that this is used rather than NetworkManager to configure your static-ip setup, as the scripts will need to be edited to set up bridging (on the assumption you want network access to your OpenStack environment).
A note on disk sizing. 50Gb of virtual disk should be more than enough for both servers if you do not intend to create permanent volumes or snapshots. If you do wish to do either of those, it is important to note that by default SWIFT storage is on a loopback device on your filesystem; the maximum size of this can be set when editing the answers file discussed later in this post, but you need to reserve enough space on your network/control node to cope with all the persistent storage you are likely to use.

I should also point out that I place a compute node on the network/control node specifically to run a single small ‘gateway’ server instance as a bridge between the OpenStack private network and my normal external network (if the network/control server is down there is no point having the gateway anywhere else, and placing it there eliminates network issues in reaching it); after the gateway is running I disable the hypervisor on the network/control node to force all new instances to be created only on dedicated compute nodes. You may wish to not place any compute node functions on your network/control node, which is probably the recommended method.

Once the new servers have been created you must update the /etc/hosts files on all of them (or your dns servers if you use those) to ensure they are able to resolve each other by fully qualified server name. If they are not able to resolve each other the installation will fail half way through, leaving a large mess to clean up, to the point where it is easier to start again from scratch, so ensure they can resolve each other. Also at this time, on all servers, perform the below steps.

dnf install network-scripts -y
systemctl disable firewalld
systemctl stop firewalld
systemctl disable NetworkManager
systemctl stop NetworkManager
systemctl enable network
systemctl start network
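
As a rough illustration of the name resolution requirement mentioned above, the /etc/hosts entries on each server would look something like the below; the host names here are made up, so use your own fully qualified names and the static ip-addresses you assigned.

192.168.1.172   oscontrol.mydept.example.com    oscontrol
192.168.1.162   oscompute1.mydept.example.com   oscompute1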

I prefer using OVS as the backend for the network node rather than the default of OVN; however, I was unable to get this release of OpenStack networking working using OVS, so this post covers installing it to use OVN. That is fine for small home labs, but OVN does not support the VPNaaS or FWaaS services and uses Geneve as the encapsulation method for tenant networks.

OpenVswitch should be installed and configured before trying to install OpenStack to ensure the bridge is working correctly.

With all the notes above in mind, perform the following steps only on the network/controller node.

dnf update -y
dnf config-manager --enable PowerTools
dnf install -y centos-release-openstack-ussuri
dnf update -y
dnf install -y openvswitch
systemctl enable openvswitch

Then you need to edit some files in /etc/sysconfig/network-scripts. The initial filename will vary based on your installation, but for this example we will use mine, which is ifcfg-ens3. Copy (not move, copy) the file to ifcfg-br-ex ( ‘cp -p ifcfg-ens3 ifcfg-br-ex’ ); then edit the ifcfg-br-ex file to make the following changes (an example of the edited file is shown after the list).

  • The TYPE becomes TYPE=”OVSBridge”
  • The DEVICE becomes DEVICE=”br-ex”
  • The NAME becomes NAME=”br-ex”
  • Change BOOTPROTO from none to BOOTPROTO=”static”
  • Add a line DEVICETYPE=ovs
  • Add a line USERCTL=yes
  • Add a line DOMAIN=”xxx.xxx.xxx” where your domain is used, for example if your servername is myserver.mydept.example.com use mydept.example.com here
  • Delete the UUID line, that belongs to ens3 not br-ex
  • If a HWADDR line exists delete that also, as it also refers to ens3 (a fresh install of CentOS8 does not use HWADDR)
  • All remaining parameters (IPADDR, DNS, GATEWAY etc.) remain unchanged
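
For illustration, the edited ifcfg-br-ex should end up looking something like the below. This assumes the 192.168.1.172 address used in this post and the example mydept.example.com domain; your IPADDR, PREFIX/NETMASK, GATEWAY and DNS lines simply carry over unchanged from your original file.

TYPE="OVSBridge"
DEVICE="br-ex"
NAME="br-ex"
DEVICETYPE=ovs
ONBOOT="yes"
BOOTPROTO="static"
USERCTL=yes
IPADDR="192.168.1.172"
PREFIX="24"
GATEWAY="192.168.1.1"
DNS1="192.168.1.1"
DOMAIN="mydept.example.com"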

Now you need to edit the original ifcfg-xxx file, in my case ifcfg-ens3. This is an exercise in deletion, with only a few edits other than deleting lines, so it is easier to show an example. The below is what the ifcfg-ens3 file looks like after editing. Note that the HWADDR can be obtained from the ‘ether’ field of an ‘ifconfig ens3’ display and the UUID value will have been populated in the original file during the install (either UUID or HWADDR can be used but I prefer to code both).

DEVICE="ens3"
BOOTPROTO="none"
TYPE="OVSPort"
OVS_BRIDGE="br-ex"
ONBOOT="yes"
DEVICETYPE=ovs
HWADDR=52:54:00:38:EF:48
UUID="6e63b414-3c7c-47f2-b57c-5e29ff3038cd"

The result of these changes is that the server ip-address is going to be moved from the network interface itself to the bridge device br-ex and managed by openvswitch after the server is rebooted.

One final step, update /etc/hostname to contain the fully qualified name of your server.

Then reboot the network/controller node. When it restarts ‘ifconfig -a’ (or ‘ip a’) must show that the ip-address has been moved to the new br-ex device.
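
If you want to double check the bridge itself, the two commands below should confirm the layout; br-ex should be listed as an openvswitch bridge with ens3 attached as a port, and the static ip-address should now be attached to br-ex rather than ens3.

ovs-vsctl show
ip addr show br-ex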

If all has gone well the server is configured correctly for the install to begin on this server once the compute nodes have been prepared.

Creating the installation environment, changes needed on the compute node servers

After all the work above you will be pleased to see there is very little effort required here. Simply perform the steps below on every compute node so that the repositories needed are configured when the deployment needs to install packages. Also ensure you followed the steps, described earlier for all servers, to switch from NetworkManager to network and to disable firewalld.

dnf update -y
dnf config-manager --enable PowerTools
dnf install -y centos-release-openstack-ussuri
dnf update -y

Also update /etc/hostname to ensure it is set to the correct FQDN for the server, and remember to ensure the /etc/hosts file (or dns servers) have been updated to be able to resolve every new server you have created for this environment.

Backup all your Virtual machine disks

At this stage you have an environment ready to install the RDO packaging of OpenStack onto.

You would not want to have to repeat all the steps again so shutdown the VMs and backup the virtual disk images. This will allow you to restart from this point as needed.

Once the virtual disk images have been backed up restart the VMs and continue.

Preparing the installation configuration settings, on the control node

Packstack by default will build a stand-alone single all-in-one environment with the optional features it thinks you may need. We wish to override this to support our additional compute nodes and add any other optional features you may wish to play with.

To achieve this, rather than simply running packstack with the ‘--allinone’ option, we will use the ‘--gen-answer-file=filename’ packstack option to generate an answers file that we can edit to suit the desired configuration and feature installs.

Note the br-ex mapping to ens3 which was my interface, change ens3 to your interface name. Also note that as mentioned above we are using OVN networking.

dnf install -y openstack-packstack
packstack --allinone --provision-demo=n \
   --os-neutron-ovn-bridge-mappings=extnet:br-ex \
   --os-neutron-ovn-bridge-interfaces=br-ex:ens3 \
   --gen-answer-file=answers.txt \
   --default-password=password

In the example above the answers file is written to answers.txt; we need to edit this file to customise for the environment we wish to build.

You must search for the entry CONFIG_COMPUTE_HOSTS and update it with a comma separated list of the ip-addresses of all the servers you wish to become compute nodes. In my case, as the default was 192.168.1.172 (the network/control node packstack was run on), I just added the second ip-address as well.

Other entries to note are CONFIG_CINDER_VOLUMES_SIZE, which defaults to 20G, and CONFIG_SWIFT_STORAGE_SIZE, which defaults to 2G. This amount of space will need to be available on your virtual disk filesystem. The first uses space under /var/lib/cinder and must be large enough to contain all the block storage devices (disks) for all instances you will launch, so if you will be running instances with large disks you will probably want to increase it. The second is a loopback filesystem for swift object storage and I have had no problems leaving that at 2G. Note however that I do not use snapshots and seldom create additional volumes for instances; if you intend to do so, definitely increase the first value. As I am using a virtual disk size of 70G on the network/control node and 50G on the compute nodes I set both to 20G for my use.

As a general rule, options in the file already set to “y” should be left that way, but the other options you can change from “n” to “y” to suit your needs. For example I use heat patterns so I set CONFIG_HEAT_INSTALL=y (and set CONFIG_HEAT_DB_PW, CONFIG_HEAT_KS_PW, CONFIG_HEAT_DOMAIN_PASSWORD, CONFIG_MAGNUM_DB_PW, CONFIG_MAGNUM_KS_PW); likewise I set CONFIG_MAGNUM_INSTALL=y for container infrastructure support.
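
As a guide only, the relevant entries in my edited answers.txt ended up along the lines of the below; the ip-addresses and sizes are the examples used in this post, so adjust them to your own environment.

CONFIG_COMPUTE_HOSTS=192.168.1.172,192.168.1.162
CONFIG_HEAT_INSTALL=y
CONFIG_MAGNUM_INSTALL=y
CONFIG_CINDER_VOLUMES_SIZE=20G
CONFIG_SWIFT_STORAGE_SIZE=20G
CONFIG_CEILOMETER_INSTALL=n
CONFIG_AODH_INSTALL=n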

Changing entries that are turned on by default to “n” cannot be guaranteed to have been tested, but will generally work.

This will run your KVM hosts ‘hot’ (lm-sensors shows my kvm host cpu temperatures go from 35 to 85, with 80 being the warning level on my cores), so for my use I set CONFIG_CEILOMETER_INSTALL and CONFIG_AODH_INSTALL to “n” as I don’t need performance metrics; that alone dropped the temperature of the cores by 10 degrees.

Interestingly, when I gave up on OVS networking and switched to OVN networking temperatures dropped another 10 degrees, so OVN networking is preferred. But expect a lot of spinning cooling fan noise anyway.

When you have customised the answers file to suit your needs you are ready to perform the install.

Performing the OpenStack install

You have done all the hard work now; to install OpenStack simply run packstack using the answers file. Ensure all your new servers can resolve each other via /etc/hosts or DNS and that all the servers are available; you may want to ssh from the network/control node to each of the compute node servers to confirm connectivity before running the packstack command, and recheck name resolution.

Remember that the commands I have used are for an OVN environment with an OpenVswitch br-ex bridge, so ensure you also performed the steps to create the br-ex bridge covered earlier.

To perform the install simply run packstack using your customised answers file as below. You will be prompted to enter the root password for each of the compute nodes you have configured. You may need to use the --timeout option if you have a slow network; not just a slow internal network but a slow internet connection as well, as many packages will be downloaded. It will probably take well over an hour to install.

packstack --answer-file=answers.txt [--timeout=600000]

When it states the installation has been successful, define an external flat network mapped to your local network, then create a floating ip range that can be used by OpenStack; ensure the floating ip range is not used by any of your existing devices.

Note that to issue any commands you need to source the keystonerc_admin file that will have been created in the root users home directory to load credentials needed to issue configuration commands.

These commands can be entered as soon as the ‘packstack’ command has completed successfully, as it starts all the services required as part of the installation. Change the network addresses to those used by your external network and ensure the allocation pool range does not conflict with any addresses handed out by your dhcp server or home router.

source keystonerc_admin
neutron net-create external_network \
  --provider:network_type flat \
  --provider:physical_network extnet \
  --router:external
neutron subnet-create --name public_subnet \
  --enable_dhcp=False \
  --allocation-pool=start=192.168.1.240,end=192.168.1.250 \
  --gateway=192.168.1.1 external_network 192.168.1.0/24

Once this step has completed, ‘tenant’ private networks can be created and configured to use the router for external network access, and floating ip-addresses can be assigned to servers within the tenant private network to allow servers on the external network to connect to them. What I would normally do is start only one instance with a floating ip-address per private network and simply add a route on my desktop to access the private network via that gateway (ie: if the gateway was assigned floating ip 192.168.1.243 and the private network within the OpenStack tenant environment was 10.0.1.0/24 I would simply “route add -net 10.0.1.0/24 gw 192.168.1.243” and be able to ssh directly to any other instances in the 10.0.1.0/24 network via that route).
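
For reference, the floating ip assignment itself can also be done from the command line with the standard openstack client once you have sourced the appropriate rc file; a rough sketch using the example addresses above would be the below, where ‘gatewayserver’ is a placeholder for whatever you named the instance.

openstack floating ip create external_network
openstack server add floating ip gatewayserver 192.168.1.243
route add -net 10.0.1.0/24 gw 192.168.1.243     # run on your desktop, not in OpenStack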

If you have reviewed the documentation on the RDO website you will have seen that a project is created, a user assigned to the project, and the router and tenant private network created from the command line interface. I personally prefer doing that through the horizon dashboard after installation to ensure everything looks like it is working correctly; to do so, logon as admin and create a project, then create a personal user and assign it to that project in an admin role.

To access the horizon interface simply point your web browser at “http://the address of your controller node/dashboard”.

After creating the project and user logoff the dashboard and logon again as the new user.

  • Go to Network/Networks and create a new private network for your project, for example marks_10_0_1_0, with subnet marks_10_0_1_0_subnet1 using 10.0.1.0/24 (let the gateway default); the subnet allocation pool would be 10.0.1.5,10.0.1.250 (do not use the first few addresses as they used to be reserved, ie: 10.0.1.1 used to be assigned to the gateway, so do not permit it in the allocation pool)
  • Then create a router for your project using Network/Routers, for example marks_project_router, and attach it to the ‘external_network’ we created from the command line earlier; then from the router list select the new router, go to the interfaces tab, and from there attach your project’s private network to the router also. Instances in your project will now have external connectivity and can be assigned a floating ip-address from the public_subnet allocation range defined from the command line earlier
  • This would also be a good time to use Network/Security Groups to create a security group to use for testing, such as a group named allow_all. A new group defaults to allow all egress, but we also want to add ingress tcpv4 ALLTCP, ingress ALLICMP (to allow ping) and egress ALLICMP rules so we can test connectivity to everything and use tools like ping for troubleshooting. Obviously once you are happy an instance is working you would want custom security group rules permitting only the access required, but rules can be added/deleted on the fly and multiple groups can be used, so a server may for example use one group with a http rule and another with a mariadb rule rather than needing a combined rule with both
  • Before logging out, and still as your own userid, go to Project/Compute/Key Pairs and create a ssh key for your userid; you will need it to ssh into instances you launch

At this time you would also want to create a ‘rc’ file for your new user: ‘cp -p keystonerc_admin keystonerc_username’ and edit the new file to contain your new username and password and set the project to your personal project. This is the rc file you will source when working from the command line with your project instead of the admin project.
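
As a guide only (the exact variables packstack writes vary a little between releases), the edited keystonerc_username would end up looking something like the below; leave the auth url and any domain lines exactly as packstack wrote them and change only the username, password and project.

export OS_USERNAME=yourusername
export OS_PASSWORD='yourpassword'
export OS_PROJECT_NAME=yourproject
export OS_AUTH_URL=http://192.168.1.172:5000/v3
export OS_IDENTITY_API_VERSION=3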

This is a good time to look around. From your project signon look at Admin/Compute/Flavors (if you remembered to give your personal userid an admin role); you will see that the default flavors are too large for most home lab use, and you will use this location to define custom flavors useful to your environment as you load images to use for your servers. You will also notice that under the Images selection there are no images available, which is correct; we have not yet added any.

Also check the Hypervisors tab to make sure all the compute nodes you defined in the answers file have been correctly setup. Testing the compute nodes is covered later in this post.
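
If you prefer the command line to the dashboard for this check, the standard openstack client can show the same information; after sourcing keystonerc_admin, something like the below should list the network/control node and every dedicated compute node you defined in the answers file.

openstack hypervisor list
openstack compute service list --service nova-compute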

Obtaining cloud images to use

To launch an instance you need a cloud image and a flavour that supports the image. Fortunately many distributions provide cloud images that can be used in OpenStack, for example the Fedora33 one can be found at https://alt.fedoraproject.org/cloud/. There are also CentOS cloud images available at https://cloud.centos.org/centos/8/.

It is important to note that not all operating systems provide cloud images, and some operating systems simply will not run in an OpenStack environment; an example is OpenIndiana, which runs fine in a KVM but will not run properly under OpenStack.

Once you have a cloud image downloaded onto your network/control node you need to load it into OpenStack; to do so you need to define a few minimum values for the image. Using the F33 image as an example, the cloud image disk size can be obtained from qemu-img as below.

[root@vmhost3 Downloads]# qemu-img info Fedora-Cloud-Base-33-1.2.x86_64.qcow2
image: Fedora-Cloud-Base-33-1.2.x86_64.qcow2
file format: qcow2
virtual size: 4 GiB (4294967296 bytes)
disk size: 270 MiB
cluster_size: 65536
Format specific information:
    compat: 0.10
    refcount bits: 16
[root@vmhost3 Downloads]#

That shows the minimum virtual disk size that can be allocated is 4Gb. It is important to do this step as disk sizes change; for example the CentOS7 cloud image needs a minimum of a 10G virtual disk. It is up to you to decide what the minimum memory requirements should be. To load the F33 image you would use commands such as the below (making it public allows all tenants to use it; if omitted only the project owner can use it, and you would create a keystonerc_username for each of your own project/user environments if you wanted the image private to your own environment).

source keystonerc_admin     # note you can use your own projects rc file here
glance image-create \
 --name "Fedora33" \
 --visibility public \
 --disk-format qcow2 \
 --min-ram 512 \
 --min-disk 4 \
 --container-format bare \
 --protected False \
 --progress \
 --file Fedora-Cloud-Base-33-1.2.x86_64.qcow2

Once the image is loaded you would use the horizon dashboard to create a new flavour that supports the new image, rather than use the default flavours which will be too large. The minimum disk size is important and you must code it to avoid issues; while you could load the image with no min-disk value, if you then tried to launch an instance from it using a flavor with a 2Gb disk it would obviously crash.

Also note that the cloud-init scripts will resize the image upward automatically if a larger disk size is used in a flavor so you can load the image with a min-disk of 4 and use a flavor with a disk size of 10G quite happily and end up with a 10G disk in your instance.

You should load all the OpenStack cloud images you are likely to need at this time; I would generally have a few versions of Fedora and CentOS. Create flavours for the images also.
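
Flavours can be created from the command line as well as from the dashboard; a minimal sketch to match the Fedora33 image loaded above might be the below (the flavour name and the memory/vcpu values are purely illustrative, and the 10G disk allows the 4G minimum image to grow as described above).

source keystonerc_admin
openstack flavor create --ram 1024 --vcpus 1 --disk 10 m1.fedora33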

Testing your installation

If you had a look around the dashboard as suggested earlier you will have found the hypervisors display; from the compute nodes tab of that screen you can individually disable compute nodes.
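
If you prefer the command line, the same enable/disable can be done with the openstack client; ‘computenodename’ below is a placeholder for the host name exactly as it appears in ‘openstack compute service list’.

openstack compute service set --disable computenodename nova-compute
openstack compute service set --enable computenodename nova-compute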

What you should do as a final verification step is disable all but one compute node at a time and launch an instance with a security group rule that allows ICMP so you can ping it, to ensure the instance starts correctly on each individually active compute node and that you have console connectivity to the instance via the dashboard instances screen. Some earlier releases required manual correction of configuration files on compute nodes to obtain console access; that has been fixed in this release, but it still needs to be tested.

From the instances screen you can select each instance and the details will show you which compute node an instance has been configured on.

You should assign a floating ip-address to one of the instances and ensure you can ssh into the instance on that floating ip-address using the key-pair you created. Note that the userid to use for each instance differs: for Fedora cloud images the user will be fedora, likewise use the userid centos for CentOS cloud images. That will verify inward connectivity. Use syntax such as “ssh -i identity_file.pem fedora@floating-ip-address”.

From that ssh session you should also ping the instances you started on the other compute nodes to test private network connectivity; you cannot ssh into them unless you copy your key to the instance you just logged onto. Note that if you set the root password in the user configuration section when you launched your instances, you could instead logon directly from a console session to do the ping tests.

You should also ping one of the servers in your external network to ensure outward connectivity, and an internet based server also to ensure routing to the big wide world works as you will probably want to be installing software from repositories onto your instances.

Once you are happy all compute nodes are working correctly you can re-enable all the compute nodes, delete your test instances, and start using your new environment.

Additional Tips and notes

  • The Fedora33 image load example used in this post had a minimum memory allocation of 512; dnf will be killed by the early out-of-memory killer with this allocation even when well over 100K shows as free, so if you want to use dnf allocate at least 756 in your flavour
  • In the user custom script section always have the first line as “#!/bin/bash” or commands will not be executed but will produce an invalid multipart/mime error from cloud-init; to enable easy troubleshooting I normally have the below in that section
    #!/bin/bash
    echo "password" | passwd root --stdin
    
  • You do not need many floating ip-addresses; for each private network I only assign one to a server and use that server as a gateway/router into the private network from my external network
  • I recommend installing heat; using heat patterns to deploy a stack of servers is the easiest way to do so, and some of my earlier posts have examples of using heat patterns to deploy multiserver test environments
  • Remember to make your personal userid an admin role; this avoids having to repeatedly login as admin to enable/disable hypervisors and manage flavors
  • Also note that if you have a slow network between compute nodes the first instance deployment for an image may fail, as the image must be copied to the remote compute node before the instance launches and this may timeout. Waiting 5-10mins and trying again will work, as the image will have finished transferring and will not need to be copied again; although the cached image will be cleaned up by timers if it remains unused on the compute node for too long

DropBox and its LanSync facility

First let me make it clear this post is on dropbox as used in a Linux environment, specifically for this post Fedora (Fedora 32). Dropbox clients may behave differently in different environments.

What is lansync

The dropbox reference pages promote lansync as a way of saving external bandwidth, in that if a file is updated it can be replicated to other machines replicating that folder across the local subnet/network without each local client needing to retrieve the changed file from the dropbox servers.

Ports lansync and dropbox use

A DuckDuckGo search of “dropbox lansync port numbers” returns the following pages: one describes lansync as wanting port 17500 on both udp and tcp (reference https://help.dropbox.com/installs-integrations/sync-uploads/lan-sync-overview), and another says 17500 on tcp only (reference https://dropbox.tech/infrastructure/inside-lan-sync).

From observation it uses port 17500 on udp, tcp, and tcp6

[mark@hawk bin]$ netstat -an | grep 17500
[mark@hawk bin]$ dropbox lansync y
[mark@hawk bin]$ netstat -an | grep 17500
tcp        0      0 0.0.0.0:17500           0.0.0.0:*               LISTEN     
tcp6       0      0 :::17500                :::*                    LISTEN     
udp        0      0 0.0.0.0:17500           0.0.0.0:*                          

However, according to the extensive logging run on one of my servers, a server which does not run dropbox, only udp appears to be used. Worse, dropbox never stops trying; the two machines that run dropbox are each retrying a server that will never respond, at thirty second intervals.

Sep 17 14:49:24 mdickinson kernel: DROPPED IN=ens3 OUT= MAC=ff:ff:ff:ff:ff:ff:50:b7:c3:20:19:8f:08:00 SRC=192.168.1.9 DST=255.255.255.255 LEN=161 TOS=0x00 PREC=0x00 TTL=64 ID=34187 DF PROTO=UDP SPT=17500 DPT=17500 LEN=141 
Sep 17 14:49:33 mdickinson kernel: DROPPED IN=ens3 OUT= MAC=ff:ff:ff:ff:ff:ff:74:d0:2b:92:f4:f7:08:00 SRC=192.168.1.187 DST=255.255.255.255 LEN=161 TOS=0x00 PREC=0x00 TTL=64 ID=7163 DF PROTO=UDP SPT=17500 DPT=17500 LEN=141 
Sep 17 14:49:54 mdickinson kernel: DROPPED IN=ens3 OUT= MAC=ff:ff:ff:ff:ff:ff:50:b7:c3:20:19:8f:08:00 SRC=192.168.1.9 DST=255.255.255.255 LEN=161 TOS=0x00 PREC=0x00 TTL=64 ID=54938 DF PROTO=UDP SPT=17500 DPT=17500 LEN=141 
Sep 17 14:50:03 mdickinson kernel: DROPPED IN=ens3 OUT= MAC=ff:ff:ff:ff:ff:ff:74:d0:2b:92:f4:f7:08:00 SRC=192.168.1.187 DST=255.255.255.255 LEN=161 TOS=0x00 PREC=0x00 TTL=64 ID=21945 DF PROTO=UDP SPT=17500 DPT=17500 LEN=141 
Sep 17 14:50:24 mdickinson kernel: DROPPED IN=ens3 OUT= MAC=ff:ff:ff:ff:ff:ff:50:b7:c3:20:19:8f:08:00 SRC=192.168.1.9 DST=255.255.255.255 LEN=161 TOS=0x00 PREC=0x00 TTL=64 ID=2628 DF PROTO=UDP SPT=17500 DPT=17500 LEN=141 
Sep 17 14:50:33 mdickinson kernel: DROPPED IN=ens3 OUT= MAC=ff:ff:ff:ff:ff:ff:74:d0:2b:92:f4:f7:08:00 SRC=192.168.1.187 DST=255.255.255.255 LEN=161 TOS=0x00 PREC=0x00 TTL=64 ID=26919 DF PROTO=UDP SPT=17500 DPT=17500 LEN=141 

Note: when dropbox is running it also uses ports 17600 and 17603, although these are bound to localhost so are not a risk as long as you allow those ports to be accessed from localhost (yes, I do have some firewall rules that prevent access to services from localhost). That is documented at https://help.dropbox.com/installs-integrations/desktop/configuring-firewall

[root@hawk ~]# netstat -an | grep 176
tcp        0      0 127.0.0.1:17600         0.0.0.0:*               LISTEN     
tcp        0      0 127.0.0.1:17603         0.0.0.0:*               LISTEN 

Nowhere can I find documentation on when and why dropbox will use port 17601/tcp. However, on multiple occasions on multiple machines I have seen port 17601 tcp in use by /home/mark/.dropbox-dist/dropbox-lnx.x86_64-105.4.651/dropbox (identified by ‘netstat -anp’ and querying the pid; also seen on earlier versions of dropbox), although only listening on localhost (127.0.0.1). And it appears to stop listening immediately when lansync is turned off.

The same cannot be said for the udp port 17500, turning off lansync stops the 30 second polling but leaves the local udp port open until dropbox is restarted.

[mark@hawk bin]$ dropbox lansync n
[mark@hawk bin]$ netstat -an | grep 17500
udp        0      0 0.0.0.0:17500           0.0.0.0:*                          
[mark@hawk bin]$ dropbox stop
Dropbox daemon stopped.
[mark@hawk bin]$ dropbox start
[mark@hawk bin]$ netstat -an | grep 17500
[mark@hawk bin]$ dropbox status
Up to date
[mark@hawk bin]$ 

It is probable that it uses udp to find machines, and tcp to transfer data.

Why not to use lansync, and when it may be ok

In a small replication environment where there may be only two or three machines replicating a folder if the file changes are small it is probably going to generate less network traffic by having all machines use the dropbox servers as the source for the changed file; especially as whichever machine changes the file will push it to the dropbox servers anyway.

Also, in a small replication environment where there may be only a few users replicating the directory contents, having those few users running lansync to broadcast to every machine in the local subnet every thirty seconds is a large overhead (there may be hundreds of machines on the local subnet having to put up with all the polling).

Then of course there may be other small groups all doing the same thing causing a lot of needless broadcast udp traffic.

In environments such as this the preferred solution would be something like an ownCloud or nextCloud server for each small group; all that needs is a single http server, and each group could be assigned its own unique url to their own copy of it. Yes, both of those file sync products also have clients that perform extensive polling, but only to the ownCloud/nextCloud server they are configured to use for file synchronisation, rather than to every server and desktop in the local subnet/network. And if the files should be on dropbox, that one server can push them there, with maybe one ‘manager’ interacting with dropbox to manage conflicts.

In a large environment lansync would make sense if the majority of users on the same subnet were all interested in replicating the same set of files. However, I would think such a situation would be the exception rather than the norm.

It would also make sense in an environment where internet bandwidth is constrained or chargeable, as while generating a lot of additional traffic on the local subnet it should cut down on external internet traffic, although one copy would always be updated on the dropbox servers.

The main issue with lansync

The main issue with lansync is that it is enabled by default. For home users with a small local network of a few PCs, maybe a TV, possibly a couple of cell phones and maybe a laptop or tablet, all connected to your default home router, having all those devices spammed with UDP traffic every thirty seconds simply because one device may have had dropbox installed on it is a pretty bad default.

In a large commercial environment, even where lansync may make sense, having it spam from install time before admins get around to setting up routing/firewalls/filtering also seems like a bad default; it should default to off in both large and small environments and be turned on if needed.

However, as toggling it is as simple as using the commands ‘dropbox lansync n’ and ‘dropbox lansync y’, it is only an issue if users do not know that it is enabled by default. I only discovered it was the default when the firewall logs on one of my servers started dropping traffic on port 17500 and I investigated where it was coming from.

Disclaimer

I do use dropbox for personal use; there are some files I want to guarantee a copy of should all my servers suddenly die. As it is only on a desktop and a laptop (the laptop normally powered off) I have disabled lansync.

A second desktop is just manually kept in sync by copying changes to important files in the dropbox folder to an ownCloud folder syncing across all my desktop machines; and of course bacula does daily backups of both.

For important files do not rely on one solution. And if you are synchronising files across multiple desktops use a solution that does not broadcast udp traffic to servers, TVs, cell phones etc, as that is a needless overhead that you do not want; if you use dropbox at home, disable lansync.


Ansible vs Puppet, why I chose Puppet

First off let me make it clear I looked at options for automating configuration quite a while ago and chose puppet as the most secure option. I still believe that to be the case.

The issue I have with options such as Ansible and Chef is that they need ssh keys lying about all over the place, whereas the CE release of puppet uses only agents. I have not really looked at Chef as I do not have the computing resources available to run it, but I have taken another look at ansible, which is what this post is about.

My reference for quickly installing ansible was https://www.redhat.com/sysadmin/configuring-ansible

It is also important to note that this is my first serious look at ansible, as such many of the issues I have with it may have solutions I have not found yet.

This post is in no way meant to identify one solution as better than the other, as there are probably many ansible (and puppet) plugins and tools to emulate each other's functionality that I have not had the need to find yet.

Ansible notes quick overview

Ansible requires, in addition to the ‘ansible’ package installed on the control server:

  1. the ansible user defined on all servers to be managed
  2. ssh keys generated and the public key of the ansible control server(s) to be installed on all servers to be managed
  3. an entry in the sudoers file to allow the ansible user on all servers to be managed to run commands as root without password prompting
  4. the sftp service to be available on all servers to be managed

The issues I see with this are

  1. no issues, a normal user
  2. anyone with access to logon to the userid on any of the control servers can logon to any managed server without further authentication. Normally not much of an issue, as most users set up ssh keys to allow that, but doing so for all servers in the environment is not normal
  3. and once on any server can issue any command they want with root authority
  4. and normally (and by default) the sftp service is disabled in /etc/ssh/sshd_config, as most users use ‘scp’. This last is probably not a risk, but why require a service that is disabled by default when there are alternatives that are enabled by default

Anyway, I created a new class in puppet to define the new group and user on all servers I have puppet managing. I explicitly defined a group, then added the new user to that group, to ensure it was the same on all servers (well away from existing user number ranges rather than defaulting to the next free gid/uid as that would have resulted in different numerical ids on my servers). This new puppet class also copied the public key for the ansible user to all servers being managed.

At that point I had to manually do the following (disclaimer: where I refer to ‘all servers’ I only chose a subset of servers to add to a group, as it was getting to be a lot of work for a simple test)

  1. from the control server ssh to all the servers to be managed to reply ‘y’ to the new fingerprint message; I now know that could have been avoided by running the ansible ping command against a ‘group’ containing all servers with the --ssh-common-args="-o StrictHostKeyChecking=no" option, which would have ssh automatically accept the fingerprint
  2. manually on every server to be managed add a new sudoers entry to allow the ansible group to issue any damn command it wants with root authority without needing password authentication (I used the new group rather than adding the new user to an existing group, as some tutorials suggest, simply because I want existing groups to still have to enter a password when using ‘su’); example sudoers and sshd_config changes are shown after this list
  3. manually on all servers to be managed uncomment the sftp entry in the sshd_config file and restart sshd
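
For reference, the manual changes in steps 2 and 3 amount to something like the below on each managed server (the ‘ansible’ group name is the new group I created, and the sftp-server path shown is the one shipped with the CentOS/Fedora openssh packages).

# /etc/sudoers.d/ansible   (create with 'visudo -f /etc/sudoers.d/ansible')
%ansible ALL=(ALL) NOPASSWD: ALL

# /etc/ssh/sshd_config - uncomment the sftp subsystem line then 'systemctl restart sshd'
Subsystem sftp /usr/libexec/openssh/sftp-server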

Only then would a simple ansible ping of the servers work.
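
That ping check, run as the ansible user on the control server, is simply the below; ‘hosts_local’ is the inventory file name used later in this post, and the --ssh-common-args option mentioned above avoids the fingerprint prompts on a first run.

ansible all -i /etc/ansible/hosts_local -m ping --ssh-common-args="-o StrictHostKeyChecking=no"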

Ansible allows groupings of servers so commands can be issued against groups; it also allows multiple inventories to be used, as long as you remember to use the ‘-i’ option to select the correct one. Output from commands appears to be returned in JSON format, so if you want to script the handling of responses there are a few scripting tool packages you would need to install.

By default ansible works on a ‘push’ model when changes are pushed out from the control server; however documentation at https://docs.ansible.com/ansible/latest/user_guide/playbooks_intro.html#id16 describes an “ansible-pull” utility that can alter that to an environment where your managed nodes query the control node instead which is apparently best for large environments. I have not tried that and presumably it would require more messing about with ssh keys.

Puppet notes quick overview

To obtain a working puppet CE environment is simply a case of

  1. ensure your DNS can resolve the server name ‘puppet’ to the server you will install ‘puppetserver’ on
  2. firewall ports, agents poll the master on port 8140 so that needs to be open on the master server
  3. installing ‘puppetserver’ on the control node and starting it
  4. installing the ‘puppet’ (agent) package on each server to be managed and starting it

At this point, on the ‘puppet’ master, a ‘puppet cert list’ will show all your servers waiting for you to ‘puppet cert sign hostname’ to allow them to use the master. It should also be noted that there is a puppet configuration option to permit ‘autosigning’, which makes it easier to enroll all your agent servers when first installing a puppet solution; I switched it on when first adding all my servers and then switched it off again.

What the puppet CE solution does not provide is an equivalent of the ansible ‘–become’ option that allows anyone logged on to the ansible user on the control node to issue any command they desire as root without authentication on any of the managed server nodes… I personally think not being able to do so is a good thing.

However if you really wanted that facility you could configure sshd to permitrootlogin on all your puppet managed nodes and setup ssh keys from the puppet master and simply use ‘ssh hostname anycommand’ to issue commands as root on any managed server, so if you want to open the gaping hole ansible opens you can do so anyway… although I would suggest adding a new user and allowing it to su without a password exactly like ansible does; so you don’t need ansible for that dangerous feature.

Puppet's equivalent of the ansible playbook groupings by function are puppet profiles and roles, and a server may have multiple of each. It also supports multiple environments (ie: a test/dev as well as the default ‘production’); there is no reason it could not also contain environments such as webservers, databases etc, but application specific grouping is probably better left to puppet's use of role and profile functions to group application configurations.

The grouping of the servers themselves is done in manifests/site.pp, where groups of servers can be defined by wildcard host name or selected as lists of individual hosts and given the roles/modules associated with them.

Puppet works on a ‘pull’ model; the puppet agents on each managed node poll the master for any updates.

Usage differences

Ansible uses YAML syntax, which is very dependent on correct indenting. The latest versions of puppet have their own syntax (although they still support ruby for backward compatibility) and puppet class files do not care about indenting as long as you have the correct number of braces and brackets. I also find puppet configurations easier to read.

An example of an ansible playbook to install httpd

---
- hosts: webservers
  remote_user: ansible
  become: yes
  tasks:
  - name: Installing apache
    yum:
      name: httpd
      state: latest
  - name: Enabling httpd service
    service:
      name: httpd
      enabled: yes
    notify:
      - restart httpd
  handlers:
  - name: restart httpd
    service:
      name: httpd
      state: restarted

Then run the command "ansible-playbook -i inventoryfile playbookname.yaml"

The puppet equivalent

class httpd {
   package { 'httpd':
     ensure => installed,
   }
   service { 'httpd':
     ensure => running,
     enable => true,
   }
} # end httpd class

Then ensure the new class is added to the site manifest

node 'somewebserver','anotherwebserver' {
   ...some stuff
   include httpd
}

Deployment is automatic, although dependent upon the agent's poll interval; an immediate refresh can be done from the agent side if you are impatient.
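
The forced refresh from the agent side is just a manual catalog run on the managed node, for example:

puppet agent -t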

Both ansible and puppet provide a way to publish configuration files and restart services when a file changes

Ansible is a bit confusing, I am not sure if the below will work or even where it belongs in a playbook.

tasks:
  - name: Copy ansible inventory file to client
    copy: src=/some/path/to/a/file/httpd.conf dest=/etc/httpd/httpd.conf
            owner=root group=root mode=0644
    notify:
         - restart apache
handlers:
    - name: restart apache
      service:
        name: apache
        state: restarted

In Puppet there is no ‘handler’ required to be defined as the ‘notify’ can be part of the copy statement. And I personally bundle configuration files for an application within the application class to make them easy to find.

class httpd {
   package { 'httpd':
      ensure => installed,
   }
   service { 'httpd':
      ensure => running,
      enable => true,
   }
   file { '/etc/httpd/httpd.conf': # file resource name, standard is use the name
      path => '/etc/httpd/httpd.conf', # destination path
      ensure => file,
      owner => 'root',
      group => 'root',
      mode => '0644',
      source => 'puppet:///modules/httpd/httpd.conf', # source of file to be copied
      notify => Service['httpd'],
   }
} # end httpd class

Additional puppet features I use

One very useful feature is managing logs. While I am sure most sites have implemented log maintenance scripts, I found using puppet to manage its own logs easier than creating new scripts; an example is below.

node 'puppet' {
   ... some roles and classes being used
   tidy { "/opt/puppetlabs/server/data/puppetserver/reports":
      age => "1w",
      recurse => true,
   }
}

There is also templating, which allows files to be customised for each server; it is used to read a template file from which substitutions are made to provide the contents for a file to be copied to an agent node. This means that configuration files that would be identical in all but a few key parameters can, when being copied to an agent, be built on the fly with the correct values set, rather than having to have a separate file for each agent to handle those small differences; the values can also be set using ‘facter’ information.

The below example is what I use to install bacula-fd (the bacula-client package) on all my servers, ensuring the FD name is unique by using the hostname as part of the FD name, and binding it to the default ip-address rather than the default of listening on all interfaces… the one template creates a unique configuration for all my servers as soon as the puppet agent starts.

For example a snippet from a template (epp) file may be

<%- | String $target_hostname,
String $target_ipaddr
| -%>
# Bacula File Daemon Configuration file
# FileDaemon name is set to agent hostname-fd
...lots of stuff
# "Global" File daemon configuration specifications
FileDaemon { # this is me
   Name = <%= $target_hostname %>-fd
   FDport = 9102 # where we listen for the director
   FDAddress = <%= $target_ipaddr %>
   WorkingDirectory = /var/spool/bacula
   Pid Directory = /var/run
   Maximum Concurrent Jobs = 20
}
...lots more stuff


And the class file would contain the below to get the facter hostname and use it in creating the file contents.

class bacula_fd (
  $target_hostname = $facts['hostname'],
  $target_ipaddr = $facts['networking']['ip'],
){
   ...lots of stuff
   # Use a template to create the FD configuration file as it uses
   # the hostname to customise the file.
   $template_hash = {
     target_hostname => $target_hostname,
     target_ipaddr => $target_ipaddr,
   }
   file { '/etc/bacula/bacula-fd.conf':           # file resource name
       path => '/etc/bacula/bacula-fd.conf',      # destination path
       ensure => file,
       owner => 'root',
       group => 'bacula',
       mode => '0640',
       content => epp('bacula_fd/bacula-fd.conf.epp', $template_hash),
       notify  => Service['bacula-fd'],
     }
   ...lots more stuff
} # end class

It should be noted that ansible documentation makes reference to templates. From what I can see that term is used in a different way in ansible, as I can't see how they can interact with an ansible copy task. I have found an example of ansible using variables in a similar way, as below, so I assume it is possible, just hard to find documentation on.

   - name: create default page content
     copy:
       content: "Welcome to {{ ansible_fqdn}} on {{ ansible_default_ipv4.address }}"
       dest: /var/www/html/index.html
       owner: webadm
       group: web
       mode: u=rw,g=rw,o=r

One other ability of puppet I make heavy use of is its ability to query facter information: one class file can, with the use of if/else statements, run blocks of code depending on OS version, so an application class file can install the correct packages for CentOS7, the correct but different packages for CentOS8, and completely different packages for each of Fedora30/31/32, resulting in the application installed and running (or skipped if the OS does not support it). I have not seen any ansible yaml files that provide that, so I assume multiple inventory files are needed, one for each OS type.

For servers with firewalld I can use a single class file with all common services and ports, and if/else to provide all customisation for different servers in one place using rich firewalld rules (note: ansible seems to have only the normal rules for services and ports but not rich rules, but it may just be another case of ansible documentation/examples being hard to find). It looks like for something similar in ansible you would have separate yaml files (playbooks) for each application type; in other words it is not possible to contain all firewall rules for the entire infrastructure in one file if using ansible.

The above two paragraphs highlight an issue for me, as I believe one of the key reasons for using a configuration product is that configuration information can be easily accessed in one place, and that one place can be deployed. If multiple files are used you may as well just have those multiple files managed on their multiple servers, as ansible is then effectively just a backup copy doing pushes; if you have to maintain multiple files, exactly the same file placement can be achieved by editing the files on their individual servers and keeping copies on a backup server (or that is pointless, as you of course backup your servers anyway).

Puppet examples of a class using if/else (now how would you put this into a single ansible yaml file? You don't; I would assume you create lots of server groups based on OS, with separate playbooks)

   if $facts['hostname'] == 'phoenix' {
      ...do something unique for this server
   }
   # note: htmldoc is not available on CentOS8 except from snap, so need a check here
   if ( $facts['os']['name'] == "CentOS" and $facts['os']['release']['major'] < 8 ) {
      package { 'htmldoc':
         ensure => installed,
      }
   } else {
      if ( $facts['os']['name'] == "Fedora" ) {
         package { 'htmldoc':
           ensure => installed,
         }
      } # else package not available so do nothing
   }

And of course a puppet class has case statements, which can either do actions or set variables to be used later in the class.

   # These rules below are specific to OS releases where the command syntax is different
   # note: 'facter -p' run on a server provides factor details
   case $facts['os']['name'] {
      'RedHat', 'CentOS': {
                             case $facts['networking']['hostname'] {
                                'region1server1': { $fname = "centos_openstack_controller.cfg" }
                                'region1server2': { $fname = "centos_openstack_compute.cfg" }
                                default:            { $fname = "centos.cfg" }
                             }
                          }
      'Fedora':           { $fname = "fedora.cfg" }
      default:            { $fname = "fedora.cfg" }
   }
   file { 'nrpe_os_specific':
      path => "/etc/nrpe.d/${fname}",
      ensure => file,
      owner => 'root',
      group => 'root',
      mode => '0644',
      source => "puppet:///modules/nrpe/${fname}",
      notify  => Service['nrpe'],
    }

As a puppet class is effectively the equivalent of an ansible yaml playbook, I consider puppet to be better for self documenting, as a single class can contain all the logic needed to deploy an application on multiple OSs, where I believe ansible may require a playbook per OS; although I may later find I am incorrect in that assumption.

Features Ansible provides that Puppet CE does not

The most glaring difference is the way that the ansible command line can be used to issue commands to multiple hosts at once. I have never had a need to simultaneously shutdown multiple databases or webservers on multiple hosts at once although I can see the power of it.

The most useful thing I can think of to do with such power is to have a script that runs a playbook on each server to do lots of netstats, pings, nmap etc, and have a script process the results to build a map of your network and its responsiveness. But then there are probably existing tools for that.

Ansible also has a ‘schedule’ tag that can be used in tasks to add crontab entries. I can see how that would be useful when deploying new software.

Where I can see it being useful is the ability to ad-hoc copy a file to multiple servers with one command, although for config files that need managing puppet does that well.

The documentation says that playbooks can orchestrate steps even if different steps must bounce between machines in a particular order. This will be useful to me, as openstack yaml stack files are limited in the information they can pass to the instances they are creating, so ansible could replace some of my custom scripts… the damn ssh fingerprint prompt the first time a server is connected to by ansible, which totally breaks any automation, can be suppressed with the option ‘--ssh-common-args="-o StrictHostKeyChecking=no"’ used the first time a command is run.

Compliance and consistency and summary

Ansible's ‘push’ method requires user interaction, although I am sure a cron job could be set up to run every hour across all servers to ensure configuration files and software packages are in an expected state. It is entirely possible RedHat have a commercial dashboard and scheduling system to do just that, but you don't get that from installing ansible.

Puppet on the other hand will poll the master at regular intervals and replace any configuration file it finds changed with the one from the puppet master, ensuring that if miscreants are modifying critical configuration files all their changes are undone in a totally hands-off way.

Puppet also, at the agent poll interval, starts services that should be running but were stopped, which is nice to have done automatically rather than having to issue ansible commands periodically to see if state needs changing. It is also ‘not-nice’ when you want something stopped and find that stopping the puppet agent before stopping an app does not have the desired effect, because nagios event handlers restart puppet which restarts the app which… is a reason to document every automation tool in place and what they manage in a pretty flow diagram.

An example of what not to do: in my environment if I want to modify iptables I need to stop docker; if I stop docker a nagios/nrpe event handler will see it is down, reload the iptables and start docker; so I have to stop docker and nrpe; then puppet pokes its nose in and expects nrpe up so starts it, resulting in nrpe loading iptables and starting docker again; so I have to stop puppet, nrpe and docker to make a change. Do I want a stray ansible command issued to start things again as well? No. So ideally there should only be one tool managing application persistence at a time; on the other hand, if nrpe crashed I would want it automatically restarted… where do you draw the line? Well, in a flow diagram, so you know what applications to stop in order to keep them stopped.

So my summary of the two is that Puppet is best for configuration management and ansible is best for issuing commands to multiple places at once. I may leave ansible installed along with puppet for a while to see if ansible can be of use to me. For ensuring services are running, nagios/nrpe monitoring with nagios/nrpe event handlers to try to restart failed tasks is still the best opensource/free application persistence tool, and configuration management tools like ansible/puppet/chef should avoid stepping on its toes.

Other notes

I did use ansible to install one package, using a ‘lynx.yaml’ file that used ‘hosts: desktops’, which had two hosts defined under it in the hosts_local inventory; after I updated the sudoers file on the two machines I was testing against, it worked.

The errors from a failed change (before I updated sudoers) are below… now I ask you, how could you automate scripting to check for errors in output like this? Also, Fedora 32 is the latest release of Fedora and ‘/usr/bin/python --version’ returns ‘Python 3.8.5’, so ansible really needs its checks updated, as not all OS's have renamed it to python3.

[ansible@puppet desktops]$ ansible-playbook -i /etc/ansible/hosts_local lynx.yaml

PLAY [desktops] ***************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************
[DEPRECATION WARNING]: Distribution fedora 32 on host vmhost3 should use /usr/bin/python3, but is using /usr/bin/python for backward 
compatibility with prior Ansible releases. A future Ansible release will default to using the discovered platform python for this host. 
See https://docs.ansible.com/ansible/2.9/reference_appendices/interpreter_discovery.html for more information. This feature will be 
removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
ok: [vmhost3]
[DEPRECATION WARNING]: Distribution fedora 32 on host phoenix should use /usr/bin/python3, but is using /usr/bin/python for backward 
compatibility with prior Ansible releases. A future Ansible release will default to using the discovered platform python for this host. 
See https://docs.ansible.com/ansible/2.9/reference_appendices/interpreter_discovery.html for more information. This feature will be 
removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
ok: [phoenix]

TASK [Ensure lynx is installed and updated] ***********************************************************************************************
fatal: [vmhost3]: FAILED! => {"msg": "Missing sudo password"}
fatal: [phoenix]: FAILED! => {"msg": "Missing sudo password"}

PLAY RECAP ********************************************************************************************************************************
phoenix                    : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
vmhost3                    : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
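
As a side note, a standard way to silence that deprecation warning (untested on these particular hosts, but ansible_python_interpreter is a normal inventory variable) is to tell ansible explicitly which interpreter to use, for example by appending a group variable block to the inventory:

cat << 'EOF' >> /etc/ansible/hosts_local
[desktops:vars]
ansible_python_interpreter=/usr/bin/python3
EOF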

Getting AWSTATS running on CentOS8

First let me make it clear I prefer webalizer to awstats; however webalizer is not available in the repositories after CentOS7. It is also worth noting that both webalizer.org and mrunix.net (where webalizer sources/downloads could be manually obtained) appear to be no longer online and there does not appear to be a github source for it either.

Another issue is that webalizer requires, and awstats can use, the free GeoIP ‘lite’ country databases from MaxMind to resolve ip-addresses to countries; however, while these are still free, since December 2019 you must create an account to download them (https://dev.maxmind.com/geoip/geoip2/geolite2/) and we all have far too many internet accounts as it is.

So with webalizer unavailable, and awstats fortunately able to use another free (although out of date) ip lookup database, awstats is the logical replacement.

Preparation work

For awstats to use the free ip database you need an additional perl module, so install that first.

perl -MCPAN -e shell
install Geo::IPfree

For awstats to produce PDF reports the ‘htmldoc’ package needs to be installed; that unfortunately is also not in the CentOS8 repositories and would need to be installed using snap. Documentation on installing htmldoc using snap is at https://snapcraft.io/install/htmldoc/centos; however I did not install it as snap is a bloody pain to use and chews up a lot of system resources; the only time I installed snap to get a package I immediately uninstalled it all again. Fortunately if you do not need to generate PDF reports that is no issue. Should the ‘htmldoc’ package ever make it to the CentOS8 repositories I will install it from those at that time.

It is also very important to note that awstats expects all httpd log messages to be in a specific format or it will refuse to process the logs. Anywhere you are defining log files you should use

LogFormat "%h %l %u %t \"%r\" %>s %b \"%{Referer}i\" \"%{User-Agent}i\"" combined
CustomLog "logs/access_log" combined

In a default installation you would normally need to set these in /etc/httpd/conf/httpd.conf and /etc/httpd/conf.d/ssl.conf; but set them anywhere you define log files.

It is also important to note that awstats wants to process only one file, and expects it to be in date order, so if you have separate access_log and ssl_access_log files (the default in a CentOS install) you will have issues. What I have done, as I wish to keep separate files to identify the different traffic for other utilities, is leave those alone but add an additional logfile entry that everything shares anywhere logging is defined. For example the line below I have in both the main httpd.conf and ssl.conf files, so as well as the unique files all messages from both are also logged to a combined file in the correct LogFormat, tagged as ‘combined’.

CustomLog "logs/combined_access_log" combined

This provides the single LogFile needed by awstats, pointing at a file that does contain all the required log messages.

If your logfiles were not already using the correct format, none of the existing logfiles can be processed by awstats. If you have just configured a new ‘combined_access_log’ in the required format as discussed above, restart httpd to start acquiring messages in that file that you can use in later steps.

Installing, configuring, populating

To install awstats it is simply a case of ‘dnf -y install awstats’.

After which there is a bit of editing involved: while it does create a /etc/httpd/conf.d/awstats.conf file, all the entries in there refer to /usr/local/awstats but the package files are actually installed in /usr/share/awstats. The best option is to remove everything in that file and replace it with the contents below. The file below is based upon the documentation at https://tecadmin.net/steps-to-configure-awstats-on-centos-and-rhel-system/ but modified to include access to the documentation directory (as being able to browse the documentation is useful) and to use the newer ‘Require ip’ syntax (as ‘allow from’ and ‘require host’ are obsolete and cannot be used without loading compatibility modules).

#
# Content of this file, with correct values, can be automatically added to
# your Apache server by using the AWStats configure.pl tool.
#
# If using Windows and Perl ActiveStat, this is to enable Perl script as CGI.
#ScriptInterpreterSource registry
#
# Directives to add to your Apache conf file to allow use of AWStats as a CGI.
# Note that path "/usr/local/awstats/" must reflect your AWStats install path.
Alias /awstatsclasses "/usr/share/awstats/wwwroot/classes/"
Alias /awstatscss "/usr/share/awstats/wwwroot/css/"
Alias /awstatsicons "/usr/share/awstats/wwwroot/icon/"
ScriptAlias /awstats/ "/usr/share/awstats/wwwroot/cgi-bin/"

Alias /awstatsdoc "/usr/share/doc/awstats/"
<Directory /usr/share/doc/>
   DirectoryIndex index.html
   Require all granted
</Directory>

<Directory "/usr/share/awstats/wwwroot">
   Options +ExecCGI
   AllowOverride None
   Require ip 192.168.1.0/24
</Directory>
<IfModule mod_env.c>
   SetEnv PERL5LIB /usr/share/awstats/lib:/usr/share/awstats/plugins
</IfModule>

You then need to ‘cd /etc/awstats’. You will find that a model.conf, awstats.localhost.localdomain.conf, and awstats.HOSTNAME.conf (where HOSTNAME is the name of your host) have been created. Copy any one of them to awstats.your.website.name.conf, where your.website.name is your website name; for example if your host is webserver.example.org you need a file named awstats.webserver.example.org.conf. Then open the new file in your favourite editor.
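
For example, using the awstats.localhost.localdomain.conf file the package created as the starting point:

cd /etc/awstats
cp awstats.localhost.localdomain.conf awstats.webserver.example.org.conf
vi awstats.webserver.example.org.conf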

  • set the ‘LogFile’ value; if you followed my recommendation of using a combined_access_log above set it to that, otherwise set it to the single file you want to process (masks such as “*access.log” will not work; logresolvmerge as documented in the template file also did not work for me)
  • set the ‘SiteDomain’ value to match your website name
  • set the ‘HostAliases’ values
  • set the ‘DNSLookup’ value to ‘0’ unless you want to do lots of dns lookups
  • review the comments for ‘EnableLockForUpdate’ and decide if you want to set it to ‘0’
  • if using dns lookups set the ‘SkipDNSLookupFor’ list
  • you may want to set ip-addr ranges for clients that can browse the results in the ‘AllowAccessFromWebToFollowingIPAddresses’ list; although you are also limiting by ip-address in the conf.d/awstats.conf file above
  • set the ‘SkipHosts’ value to avoid reporting on internal hosts that would skew the report values (such as nagios/nrpe or other health checking activity)
  • search for ‘LoadPlugin="geoipfree"’ and uncomment it (if you installed the Geo::IPfree module mentioned at the start of this post)
  • set ‘UseFramesWhenCGI=0’; browsers such as Firefox will refuse to display pages presented in frames as frames are inherently insecure
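
As a rough illustration only (the values are examples and must match your own site and network), after working through that list the key lines in my awstats.webserver.example.org.conf looked something like:

LogFile="/var/log/httpd/combined_access_log"
SiteDomain="webserver.example.org"
HostAliases="webserver.example.org www.webserver.example.org localhost 127.0.0.1"
DNSLookup=0
SkipHosts="127.0.0.1 192.168.1.170"
LoadPlugin="geoipfree"
UseFramesWhenCGI=0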

You should also note the entries for URL alias references (/awstatscss, /awstatsicons etc.) throughout the file; initially leave all of those at the defaults, as they match what is set in the /etc/httpd/conf.d/awstats.conf file, and they must match.

Before you are able to do anything you need to create some initial data. If you have had to change your log message formats, just wait a while after restarting httpd; you need some log messages available before you are able to initialise awstats.

When you have a reasonably large collection of logged messages simply run the command below replacing ‘webserver.example.org’ with your hostname, which is the part between awstats. and .conf in the /etc/awstats directory; the example below would use the config file ‘awstats.webserver.example.org.conf’.

perl /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=webserver.example.org -update

Keeping the statistics up to date

Installing the package creates a cron job that performs updates for every host you have configured with a separate conf file in /etc/awstats. That job is /etc/cron.hourly/awstats.

So once you have completed the configuration steps above your data is updated automatically.

You may want to create a report job in /etc/cron.daily or /etc/cron.monthly to produce reports that can be archived in case you need to initialise your collected statistics at some point.
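
I have not tested static report generation myself (see the note at the end of this post), but a sketch of such a job using the static output mode of awstats.pl would look something like the below; the archive directory is just an example name.

#!/bin/bash
# /etc/cron.monthly/awstats-report - untested sketch
CONFIG="webserver.example.org"
OUTDIR="/var/www/awstats-archive"
mkdir -p ${OUTDIR}
perl /usr/share/awstats/wwwroot/cgi-bin/awstats.pl -config=${CONFIG} \
   -output -staticlinks > ${OUTDIR}/awstats.${CONFIG}.$(date +%Y%m).html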

Viewing the statistics with a web browser

I suppose after all that work you actually want to look at some of the statistics. If you use the example /etc/httpd/conf.d/awstats.conf file above you would point your web browser at https://yourhostname/awstats/awstats.pl?config=webserver.example.org where the config value matches the configuration file you used.

If you are used to webalizer output you may find it frustrating to hunt down information in awstats. Some information is also incorrect with the default settings, although I suppose it is possible to correct that with a lot of manual effort in the configuration file and log format statements. But as webalizer is no longer available awstats is a viable replacement, if a lot harder to get working.

The ability to generate static reports may be useful for archiving, but I have not tested that as they would be most useful in PDF form, which as mentioned above awaits the ‘htmldoc’ package reaching the CentOS8 repos one day.


Setting up and using a local (insecure) Docker registry service container

The first thing to note about this post is that it sets up an insecure local registry facility using the standard registry container. This type of registry is ideal for local ‘internal network’ development use. It is also suitable for standalone docker, no need for a docker swarm.

An important thing to note is that the registry server itself is not insecure; it wants TLS traffic by default, however it supports insecure traffic if the client requests it. To use insecure access to the local registry server it is the clients of the registry server that are reconfigured to request insecure communication, and the registry server will permit it.

Configuring an insecure Docker registry allows anyone to ‘push’ images to your registry without authentication, so it must only be used for internal use; never internet facing.

You will notice the official documentation for installing a docker registry container as an insecure installation states that not even basic authentication can be used for an insecure configuration. That may actually be incorrect, as the configuration document at https://docs.docker.com/registry/configuration/#htpasswd states basic authentication can be configured without TLS… although the user/password information will be passed in clear text as part of the http header, so it seems it is simply recommended not to use it.

Installing a local Docker registry

A local registry server can be installed on any server already running Docker or docker-ce. How to setup a local registry server is covered at https://docs.docker.com/registry/deploying/ however it is a little fuzzy on configuring insecure traffic.

The actual document on configuring for insecure use is https://docs.docker.com/registry/insecure/ but it omits the rather important detail that the “insecure-registries” setting must be set on all the clients, not on the docker server running the container. There is a lot of confusion about that, easily seen from all the questions about it in forums where everyone assumes it is set on the server providing the registry container; it is set on all the clients. Also note that this document does state secure traffic will always be tried first in all cases; the “insecure-registries” entry just allows fallback to insecure traffic, so changing your registry to be secure at a later time is trivial.

It is also important you are aware that by default the container running the registry uses volume persistent storage within the container; this means that while it will survive stopping/starting the container, should you delete the container your data will be lost. If you intend to be deleting/reconfiguring the container a lot you probably don’t want to use that default.

I implemented my local registry container to use a directory on the docker host filesystem. The run command I used is below; the importance of the REGISTRY_STORAGE_DELETE_ENABLED environment variable I will discuss later in this post when managing the registry. You would normally leave it disabled (the default).

docker run -d \
  -p 5000:5000 \
  --restart=always \
  --name registry \
  -v /home/docker/data:/var/lib/registry \
  -e REGISTRY_STORAGE_DELETE_ENABLED="true" \
  registry:2

Remember to open firewall port 5000 on your registry docker host, and if any client docker hosts have rules blocking outbound traffic ensure the port is opened on those also.
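
On a CentOS8 host running firewalld that would be something along the lines of:

firewall-cmd --permanent --add-port=5000/tcp
firewall-cmd --reload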

Configuring the Docker client servers for insecure traffic

The “insecure-registries” setting needs to be configured on the servers running Docker that are to be clients of this local registry. On those servers add the entry below to the file /etc/docker/daemon.json (creating the file if it does not exist) and restart the docker service on those clients when you have done so.

{
  "insecure-registries" : [ "hostname-or-ipaddr:5000" ]
}

Of course if you also wish to use the registry on the Docker server running the registry container, set it there also.
If you have a local DNS server you should use the hostname rather than an ip-address.

Pushing and Pulling using your local Docker registry

In the examples here I have the registry container running on a host with a DNS entry of docker-local.myexample.org

To push images to your local registry they must be tagged to refer to your local registry. For example if you have an image named busybox:latest you would

docker tag busybox:latest docker-local.myexample.org:5000/busybox:latest
docker push docker-local.myexample.org:5000/busybox:latest

If you get an error along the lines of ‘https expected but got http’ check your client insecure-registries entries again. On the client the command ‘docker info’ will list the insecure registries configured near the end of the response.
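
For example, a quick check on a client (the exact label may vary slightly between Docker versions):

docker info 2>/dev/null | grep -A3 "Insecure Registries"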

It is also important to note that the tag is just a pointer to what is already there; in the above example you could ‘docker image rm busybox:latest’ (which only removes the old pointer) and change your docker run command to run docker-local.myexample.org:5000/busybox:latest instead of busybox:latest, which would work perfectly well.

If you have a hostname:port/ prefix in the image name that hostname:port is the registry used; if omitted the default registry at docker.io is used, which you obviously do not have permission to push to.

Once you have images pushed to your local registry you can pull them with the same syntax

docker pull docker-local.myexample.org:5000/busybox:latest

Managing your local registry

Querying your local registry

Useful commands such as ‘docker search’ only apply to the docker.io registry. You obviously need a way of managing your local registry and keeping track of what is in it.

The ‘v2’ interface of the Docker registry container provides a way of looking up what is in your local registry. This can be done with any web browser.

Remembering that we have no certificates and are using an insecure registry the following URLs are useful for determining what is in your registry. The examples use the same example hostname and image.

To see what is in the local registry

    http://docker-local.myexample.org:5000/v2/_catalog
    http://docker-local.myexample.org:5000/v2/_catalog?n=200

Note the second example above: the default response contains only the first 100 entries; that can be changed with the n value. If using the default and there are more than 100 entries, a link to the next 100 entries is provided in the response.

The above URL only displays the image names. You will also need to see what tagged versions are stored in the registry. Should you have an image named ‘busybox’ and want to see all its tagged versions the URL would be

http://docker-local.myexample.org:5000/v2/busybox/tags/list
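
The same queries can be made from the command line with curl if you prefer, for example:

curl http://docker-local.myexample.org:5000/v2/_catalog
curl http://docker-local.myexample.org:5000/v2/busybox/tags/list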

Of course there are scripts on github that make all that a lot easier and can be run from the command line where most Unix developers work. One is discussed below.

Deleting images from your local registry

The Docker registry ‘v2’ API provides a DELETE facility. As it is possible to corrupt images if you delete incorrect layers it is better to use some of the utilities users have made available on github for that purpose.

I would suggest, and the examples below use, this utility…

cd ~
mkdir git-3rdparty
cd git-3rdparty
git clone https://github.com/byrnedo/docker-reg-tool.git
dnf -y install jq

The examples below of using the script obviously use different image names than the busybox example above, to provide a few more entries to play with; but using the script is fairly simple as seen below. Note that we use “INSECURE_REGISTRY=true” as we have set up an insecure registry; if using TLS there are parameters to provide certs and credentials which are explained on the github page.

[root@gitlab ~]# cd ~/git-3rdparty/docker-reg-tool
[root@gitlab docker-reg-tool]# INSECURE_REGISTRY=true ./docker_reg_tool http://docker-local:5000 list
ircserver
mvs38j
[root@gitlab docker-reg-tool]# INSECURE_REGISTRY=true ./docker_reg_tool http://docker-local:5000 list ircserver
f32
f30
[root@gitlab docker-reg-tool]# INSECURE_REGISTRY=true ./docker_reg_tool http://docker-local:5000 delete ircserver f30
DIGEST: sha256:355a3c2bd111b42ea7c1320085c6472ae42bc2de7b13cc1adf54a815ee74fa45
Successfully deleted
[root@gitlab docker-reg-tool]# INSECURE_REGISTRY=true ./docker_reg_tool http://docker-local:5000 list ircserver
f32
[root@gitlab docker-reg-tool]#

However, if you started your registry with the default settings, for the delete example above you will most likely get a response like the one below

[root@gitlab docker-reg-tool]# INSECURE_REGISTRY=true ./docker_reg_tool http://docker-local:5000 delete ircserver f30
DIGEST: sha256:355a3c2bd111b42ea7c1320085c6472ae42bc2de7b13cc1adf54a815ee74fa45
Failed to delete: 405

Refer back to my ‘docker run’ command above, and the parameter I said I would explain later, specifically the environment parameter ‘REGISTRY_STORAGE_DELETE_ENABLED="true"’. If that parameter is not set it is not possible to delete entries from the registry… so for a local registry you should probably set it, unless you intend to keep an infinite number of images in your local registry.

Reclaiming space in the registry filesystem

Using the registry ‘v2’ API to delete an image/tag from your local registry does not delete anything except the references to the object. This is normally ideal as it preserves layers; for example if you have seven images based on centos:8 they can share the same base layer to minimise space, and deleting one of them removes just the pointer. If all are deleted the layers remain, as you may push another centos:8 based image, in which case a pointer can be re-added rather than the push having to send all layers of the new image.

However there will be occasions when you do want to remove all unused objects from the registry. For example your local development may have been using f30 base images but all development has moved to f32 so you want to reclaim all space used by the old f30 layers.

In order to do so you need to run the registry garbage cleaner to remove the obsolete objects.

The registry garbage cleaner is provided with the docker registry image, and must be run from within the running container.

It should only be run when the registry is not in use: in read-only mode or with user access locked out in some other way. While there is nothing to stop you running it while the registry is in active use, any ‘push’ command in progress while the garbage cleanup is running will result in a corrupt image being stored.

With all that being said, changing to read-only mode is a case of stopping the registry container, deleting it, redefining and starting the registry in read-only mode, doing the garbage collect, then stopping, deleting and redefining/starting the registry in write mode again… and if you are going to do all that work you may as well throw in an extra set of stops/deletes/reconfigures to turn the ability to delete entries on and off. Personally I think for a local development registry that is too much complication, so I leave registry storage delete enabled and do not switch to read-only mode (if you need to ensure read-only it is much simpler to disable the firewall rule allowing access to port 5000 and add it back when done).
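
If you do take the firewall approach, omitting the --permanent flag makes the change runtime-only so it is easy to revert; a rough firewalld sketch would be:

# temporarily block client access while garbage collecting (runtime change only)
firewall-cmd --remove-port=5000/tcp
# ... run the garbage collection shown below ...
firewall-cmd --add-port=5000/tcp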

To actually perform the garbage reclamation, on the server hosting the Docker container for the registry simply

docker exec -ti registry /bin/sh

bin/registry garbage-collect --dry-run /etc/docker/registry/config.yml
exit

Obviously remove the ‘--dry-run’ flag when you are ready to really perform the garbage cleanup.

Some notes on restricting external access, apache proxy

As noted earlier, using an insecure registry supposedly prevents any authentication method being used. While it is easy to switch to TLS, that does not magically enable authentication; a lot of extra work is required.

Setting up basic authentication is reasonably easy, and it seems this can be done in clear traffic without TLS if you really want to. However that will then require all your users to authenticate (via ‘docker login’) not just for ‘push’ but also for ‘pull’ requests. That limits its usability, as ideally users should be able to pull anonymously or why bother making it available to them in the first place.

The simplest way of setting up authentication where users that ‘push’ must authenticate but anyone can ‘pull’ without authentication would seem to be using Apache as an authenticating proxy, by playing with the recipe provided at https://docs.docker.com/registry/recipes/apache/; changing the restriction on GET to allow everyone should do it. And of course create a conf.d file for the virtualhost in your existing web server configuration rather than use the httpd container the example uses. This however still uses basic htpasswd authentication, although on the web server itself; the registry remains insecure using this method, but as it would normally run on a separate machine to the webserver, with all public facing traffic having to go via the web server to reach it, that is not so much of an issue. Also notice that the example does not really make sense (htpasswd is run in a httpd docker container and the groups are created outside the container), but it does at least indicate that all the auth is done by apache in the apache container and not by the registry container.

One note on the Apache authentication proxy method linked to above is that the document lists as a drawback that the TLS certs must be moved to the webserver, making it the TLS endpoint instead of the registry, and the proxy passes to the registry at http://servername:5000/v2. Yes, it does mean you must leave your registry container configured as discussed above, with no certificates configured on the registry itself, but you no longer need to configure the “insecure-registries” entry on your clients if they pull via the proxy, as they will now get https responses (provided by the web server).

Also if you already have a public facing Apache webserver with valid certificates an apache proxy setup may be the way to go, as you do not need to obtain additional certificates. The issues and benefits of the apache proxy approach are

  • issue: if you have not used ‘docker login’ to log in to your proxy a ‘push’ request results in ‘no basic auth credentials’, however a ‘pull’ request returns the web server’s 400 error page (with ‘require valid-user’ for GET, even if the option to not proxy apache httpd error pages is commented out). However if you do not restrict GET access that is not an issue
  • benefit: after using ‘docker login’ users can push/pull as expected
  • benefit: configuring the GET method rule to “all granted” rather than “valid-user” allows any user to ‘pull’ from your registry, which would be the only point of exposing it to the internet. ‘push’ and delete requests are still denied via the proxy if the user has not issued a ‘docker login’ to your local registry
  • issue: the mapping must be to the path /v2 as that is what the docker command requires; an open invite to hackers? Exactly what can be done with registry GET requests, and whether any are destructive, is an unknown
  • benefit: you are not required to obtain any new ssl certificates, if your website already has working certificates you can configure the same certificates your website already uses in the virtual host entry for the registry

Using the apache proxy with “Require all granted” for GET and leaving the other http methods unchanged results in all ‘push’ requests being denied unless a user in the ‘pusher’ group you defined has used ‘docker login’, while all ‘pull’ requests are permitted; which is probably what you want if you expose it to the internet for end users to pull images from.

[root@gitlab docker]# docker tag busybox mywebproxy.myexample.org:5043/busybox
[root@gitlab docker]# docker push mywebproxy.myexample.org:5043/busybox
The push refers to repository [mywebproxy.myexample.org:5043/busybox]
514c3a3e64d4: Preparing 
unauthorized: authentication required

[root@gitlab docker]# docker pull mywebproxy.myexample.org:5043/mvs38j:f30
f30: Pulling from mvs38j
33fd06a469a7: Pull complete 
Digest: sha256:46eb3fb42e4c6ffbd98215ea5d008afc6f19be80d5c1b331f7ba23e07c9d8e46
Status: Downloaded newer image for mywebproxy.myexample.org:5043/mvs38j:f30
mywebproxy.myexample.org:5043/mvs38j:f30

While configuring users in a htpasswd and group file on an apache server providing a proxy service can separate users that can only use GET operations from those that can perform all operations (by requiring the latter to ‘docker login’ via the proxy), if you do intend to allow external users to pull images from your repository my recommendation would be to allow all users to ‘pull’ and no users to ‘push’ (simply have no users in the pusher group) via a public facing proxy configuration. Any ‘push’ processing should only be done from the trusted internal network anyway.

This is my replacement for the script provided on the docker site, to create a working Apache proxy configuration on an existing Apache web server.

#!/bin/bash
DOCKER_REGISTRY_HOST="docker-local" # hostname running the registry container, may be localhost if running it there
DOCKER_REGISTRY_PORT="5000" # port name used by the insecure container
APACHE_DOCKER_AUTH_DIR="/var/www/html/registry-auth" # directory to use for proxy vhost htpasswd and group data files
USERNAME_PUSH_AUTH="mark" # user to demo push access
USERNAME_PUSH_PASSWORD="pusherpwd" # password for above
SSL_CERT_PATH="/etc/letsencrypt/live/mywebserver.myexample.org" # where are the certs used by the existing website

# Did we have a valid cert directory ?, change nothing if not
if [ ! -d ${SSL_CERT_PATH} ];
then
echo "${SSL_CERT_PATH} is not a directory"
exit 1
fi

# Ensure the directories exist, create the needed files
if [ ! -d ${APACHE_DOCKER_AUTH_DIR} ];
then
mkdir -p ${APACHE_DOCKER_AUTH_DIR}
chown apache:apache ${APACHE_DOCKER_AUTH_DIR}
fi
htpasswd -Bbn ${USERNAME_PUSH_AUTH} ${USERNAME_PUSH_PASSWORD} >> ${APACHE_DOCKER_AUTH_DIR}/httpd.htpasswd
echo "pusher: ${USERNAME_PUSH_AUTH}" >> ${APACHE_DOCKER_AUTH_DIR}/httpd.groups
chown apache:apache ${APACHE_DOCKER_AUTH_DIR}/httpd.htpasswd
chown apache:apache ${APACHE_DOCKER_AUTH_DIR}/httpd.groups

# Create the proxy configuration file
cat << EOF > /etc/httpd/conf.d/docker-registry.conf
LoadModule headers_module modules/mod_headers.so
LoadModule authn_file_module modules/mod_authn_file.so
LoadModule authn_core_module modules/mod_authn_core.so
LoadModule authz_groupfile_module modules/mod_authz_groupfile.so
LoadModule authz_user_module modules/mod_authz_user.so
LoadModule authz_core_module modules/mod_authz_core.so
LoadModule auth_basic_module modules/mod_auth_basic.so
LoadModule access_compat_module modules/mod_access_compat.so
LoadModule ssl_module modules/mod_ssl.so
LoadModule proxy_module modules/mod_proxy.so
LoadModule proxy_http_module modules/mod_proxy_http.so
LoadModule unixd_module modules/mod_unixd.so

Listen 5043
<VirtualHost *:5043>
ServerName mywebproxy.myexample.org
SSLEngine on
SSLCertificateFile ${SSL_CERT_PATH}/fullchain.pem
SSLCertificateKeyFile ${SSL_CERT_PATH}/privkey.pem
SSLCertificateChainFile ${SSL_CERT_PATH}/fullchain.pem
SSLCompression off
SSLProtocol all -SSLv2 -SSLv3 -TLSv1
SSLCipherSuite EECDH+AESGCM:EDH+AESGCM:AES256+EECDH:AES256+EDH
SSLHonorCipherOrder on
Header always set "Docker-Distribution-Api-Version" "registry/2.0"
Header onsuccess set "Docker-Distribution-Api-Version" "registry/2.0"
RequestHeader set X-Forwarded-Proto "https"
ProxyRequests off
ProxyPreserveHost on
# no proxy for /error/ (Apache HTTPd errors messages)
ProxyPass /error/ !
ProxyPass /v2 http://${DOCKER_REGISTRY_HOST}:${DOCKER_REGISTRY_PORT}/v2
ProxyPassReverse /v2 http://${DOCKER_REGISTRY_HOST}:${DOCKER_REGISTRY_PORT}/v2
<Location /v2>
Order deny,allow
Allow from all
# MUST match realm to the 'basic-realm' used by default in the registry container
# If you change the realm value used by the registry container change this also
AuthName "basic-realm"
AuthType basic
AuthUserFile "${APACHE_DOCKER_AUTH_DIR}/httpd.htpasswd"
AuthGroupFile "${APACHE_DOCKER_AUTH_DIR}/httpd.groups"
# Read access to any users
<Limit GET HEAD>
Require all granted
</Limit>
# Write access to docker-deployer only
<Limit POST PUT DELETE PATCH>
Require group pusher
</Limit>
</Location>
</VirtualHost>
EOF

chown apache:apache /etc/httpd/conf.d/docker-registry.conf

# Implement !
# The below assumes that your apache configuration uses the default of loading all .conf files
# in the conf.d directory; if you selectively load files from that directory ensure you add
# the new conf file created by this script.
systemctl restart httpd

# What you MUST do
# your webserver host must accept traffic on port 5043
# your webserver must be able to pass traffic to port 5000 on the remote (or local) registry host
# Then it should all just work with docker images tagged mywebproxy.myexample.org:5043/imagename[:tag]
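
If you keep the push restriction, a client that needs to push via the proxy first logs in with the user created by the script above (‘mark’ in the example variables; obviously change that); something like:

docker login mywebproxy.myexample.org:5043
docker tag busybox:latest mywebproxy.myexample.org:5043/busybox:latest
docker push mywebproxy.myexample.org:5043/busybox:latest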

Have fun.


Running remote X-windows applications across ssh from a Fedora server

This post has been created as it may assist others who want to do the same thing. It is not complicated or world shaking.

Often you will run Fedora (or CentOS/RHEL) servers and do not want a GUI desktop installed. However occasionally you may want to run a X-windows GUI application on that server.

This is normally done with a normal SSH connection, with the remote server allowing X11Forwarding; if the remote server has all the desktop packages installed and boots into GUI mode there is no problem with that approach. However if you SSH into a remote server with “ssh -X user@remotehost” and get a message saying X11 forwarding failed, or the command “echo $DISPLAY” returns nothing, then congratulations, you have a perfectly normal server install… however you are not able to run GUI applications, which fortunately is easily fixed.

Before you do anything, check your /etc/ssh/sshd_config, make sure you have uncommented ‘X11Forwarding yes’, and restart sshd if you change it. If you had to make that change, retry the ssh and echo commands above, as (while unlikely on a server install) you may have had the needed packages all along. Only if you still get the errors do you need to continue reading this post.
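
As a quick sketch of that check and change on the remote server (the sed assumes the default commented-out line; check the file first):

grep -n X11Forwarding /etc/ssh/sshd_config
sed -i 's/^#X11Forwarding yes/X11Forwarding yes/' /etc/ssh/sshd_config
systemctl restart sshd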

It should also be noted that “X11Forwarding yes” can be set on a per-client basis rather than globally if you prefer; an example of that is commented out at the end of the /etc/ssh/sshd_config file on Fedora.

If you have been trying to get this working you will have already noticed that Fedora server-only installs already have a xorg-x11-server-Xorg package installed; for example F32 at the time of writing this post has “xorg-x11-server-Xorg-1.20.8-1.fc32.x86_64” installed by default on a server-only install. Obviously this is insufficient to run X applications across SSH or I would not be writing this post.

To get the required functionality working, on the server you wish to connect to remotely using ssh to run GUI apps, simply

dnf -y install gdm
systemctl start gdm.service     # note:is enabled by default so will start after reboots

You can then ssh into the server again, and ‘echo $DISPLAY’ will now have a value.

To test it works, while sshed onto that remote server, simply

sudo dnf -y install xterm
xterm

If you get an xterm window on your client machine with a prompt of the remote server you have it all working.

Important note: on the client machine you must have the X11 forwarding ports open in your firewall; at a minimum have port 6010 open on your client to accept the forwarded X11 session data.

Obviously installing the GDM package pulls in a few extra required packages. Below is a list of everything installed on my server when I needed to install the GDM package. As the server still boots up in non-GUI mode (unless for some reason you make the changes necessary to change the default runlevel), installed packages used by a desktop session such as pulseaudio and evolution do not start, so there is no impact on the server other than the gdm service being started.

[root@xxxx X11]# dnf install gdm
Last metadata expiration check: 1:52:25 ago on Mon 13 Jul 2020 13:44:13.
Dependencies resolved.
=================================================================================================================
 Package                                    Arch        Version                Repository                   Size
=================================================================================================================
Installing:
 gdm                                        x86_64      1:3.36.2-2.fc32        updates                     561 k
Installing dependencies:
 accountsservice                            x86_64      0.6.55-2.fc32          fedora                      118 k
 accountsservice-libs                       x86_64      0.6.55-2.fc32          fedora                       93 k
 bluez-obexd                                x86_64      5.54-1.fc32            fedora                      196 k
 bolt                                       x86_64      0.9-1.fc32             updates                     203 k
 bubblewrap                                 x86_64      0.4.1-1.fc32           fedora                       51 k
 cheese-libs                                x86_64      2:3.34.0-3.fc32        fedora                      814 k
 clutter                                    x86_64      1.26.4-1.fc32          fedora                      1.1 M
 clutter-gst3                               x86_64      3.0.27-3.fc32          fedora                       87 k
 clutter-gtk                                x86_64      1.8.4-7.fc32           fedora                       48 k
 cogl                                       x86_64      1.22.8-1.fc32          updates                     495 k
 colord-gtk                                 x86_64      0.2.0-3.fc32           fedora                       32 k
 cups-pk-helper                             x86_64      0.2.6-9.fc32           fedora                       91 k
 enchant2                                   x86_64      2.2.8-1.fc32           fedora                       63 k
 evolution-data-server                      x86_64      3.36.3-1.fc32          updates                     2.2 M
 evolution-data-server-langpacks            noarch      3.36.3-1.fc32          updates                     1.4 M
 fdk-aac-free                               x86_64      2.0.0-3.fc32           fedora                      401 k
 flatpak-selinux                            noarch      1.6.4-1.fc32           updates                      23 k
 flatpak-session-helper                     x86_64      1.6.4-1.fc32           updates                      77 k
 geoclue2                                   x86_64      2.5.6-1.fc32           fedora                      137 k
 geoclue2-libs                              x86_64      2.5.6-1.fc32           fedora                       51 k
 geocode-glib                               x86_64      3.26.2-1.fc32          fedora                       72 k
 gjs                                        x86_64      1.64.3-3.fc32          updates                     380 k
 gnome-autoar                               x86_64      0.2.4-2.fc32           fedora                       56 k
 gnome-bluetooth                            x86_64      1:3.34.1-1.fc32        fedora                       44 k
 gnome-bluetooth-libs                       x86_64      1:3.34.1-1.fc32        fedora                      318 k
 gnome-control-center                       x86_64      3.36.3-1.fc32          updates                     5.7 M
 gnome-control-center-filesystem            noarch      3.36.3-1.fc32          updates                      12 k
 gnome-desktop3                             x86_64      3.36.3.1-1.fc32        updates                     576 k
 gnome-keyring-pam                          x86_64      3.36.0-1.fc32          fedora                       29 k
 gnome-online-accounts                      x86_64      3.36.0-1.fc32          fedora                      484 k
 gnome-session                              x86_64      3.36.0-2.fc32          fedora                      383 k
 gnome-session-wayland-session              x86_64      3.36.0-2.fc32          fedora                       13 k
 gnome-session-xsession                     x86_64      3.36.0-2.fc32          fedora                       13 k
 gnome-settings-daemon                      x86_64      3.36.1-1.fc32          updates                     1.0 M
 gnome-shell                                x86_64      3.36.4-1.fc32          updates                     1.5 M
 gsound                                     x86_64      1.0.2-11.fc32          fedora                       34 k
 gssdp                                      x86_64      1.0.4-1.fc32           updates                      51 k
 gupnp                                      x86_64      1.0.5-1.fc32           updates                      96 k
 gupnp-av                                   x86_64      0.12.11-3.fc32         fedora                       92 k
 gupnp-dlna                                 x86_64      0.10.5-12.fc32         fedora                       89 k
 harfbuzz-icu                               x86_64      2.6.4-3.fc32           fedora                       16 k
 hyphen                                     x86_64      2.8.8-13.fc32          fedora                       29 k
 ibus                                       x86_64      1.5.22-7.fc32          updates                     7.4 M
 ibus-gtk2                                  x86_64      1.5.22-7.fc32          updates                      28 k
 ibus-gtk3                                  x86_64      1.5.22-7.fc32          updates                      29 k
 ibus-libs                                  x86_64      1.5.22-7.fc32          updates                     262 k
 ibus-setup                                 noarch      1.5.22-7.fc32          updates                      61 k
 iio-sensor-proxy                           x86_64      3.0-1.fc32             fedora                       57 k
 libappindicator-gtk3                       x86_64      12.10.0-28.fc32        updates                      42 k
 libcanberra                                x86_64      0.30-22.fc32           fedora                       87 k
 libcanberra-gtk3                           x86_64      0.30-22.fc32           fedora                       32 k
 libdbusmenu                                x86_64      16.04.0-15.fc32        fedora                      136 k
 libdbusmenu-gtk3                           x86_64      16.04.0-15.fc32        fedora                       41 k
 libgdata                                   x86_64      0.17.12-1.fc32         fedora                      472 k
 libgee                                     x86_64      0.20.3-1.fc32          fedora                      289 k
 libgnomekbd                                x86_64      3.26.1-3.fc32          fedora                      163 k
 libgtop2                                   x86_64      2.40.0-3.fc32          fedora                      150 k
 libgweather                                x86_64      3.36.1-1.fc32          updates                     2.9 M
 libhandy                                   x86_64      0.0.13-4.fc32          updates                     162 k
 libical-glib                               x86_64      3.0.8-1.fc32           fedora                      187 k
 libimobiledevice                           x86_64      1.2.1-0.3.fc32         fedora                       79 k
 libindicator-gtk3                          x86_64      12.10.1-17.fc32        fedora                       67 k
 libmediaart                                x86_64      1.9.4-9.fc32           fedora                       44 k
 libnma                                     x86_64      1.8.28-1.fc32          updates                     302 k
 libnotify                                  x86_64      0.7.9-1.fc32           fedora                       43 k
 libplist                                   x86_64      2.1.0-3.fc32           fedora                       78 k
 libsbc                                     x86_64      1.4-5.fc32             fedora                       44 k
 libusbmuxd                                 x86_64      2.0.0-2.fc32           fedora                       38 k
 libvncserver                               x86_64      0.9.11-11.fc32         fedora                      272 k
 libwpe                                     x86_64      1.6.0-1.fc32           fedora                       26 k
 libxklavier                                x86_64      5.4-15.fc32            fedora                       67 k
 low-memory-monitor                         x86_64      2.0-4.fc32             fedora                       34 k
 mesa-vulkan-drivers                        x86_64      20.1.2-1.fc32          updates                     3.3 M
 mobile-broadband-provider-info             noarch      20190618-3.fc32        fedora                       67 k
 mozjs68                                    x86_64      68.10.0-1.fc32         updates                     6.8 M
 mutter                                     x86_64      3.36.4-1.fc32          updates                     2.4 M
 network-manager-applet                     x86_64      1.16.0-2.fc32          updates                     208 k
 nm-connection-editor                       x86_64      1.16.0-2.fc32          updates                     864 k
 ostree-libs                                x86_64      2020.3-5.fc32          updates                     398 k
 pipewire                                   x86_64      0.3.6-1.fc32           updates                     110 k
 pipewire-libs                              x86_64      0.3.6-1.fc32           updates                     708 k
 pipewire0.2-libs                           x86_64      0.2.7-2.fc32           fedora                      354 k
 pulseaudio                                 x86_64      13.99.1-4.fc32         updates                     1.0 M
 pulseaudio-module-bluetooth-freeworld      x86_64      1.4-1.fc32             rpmfusion-free-updates      100 k
 python3-cairo                              x86_64      1.18.2-4.fc32          fedora                       94 k
 python3-gobject                            x86_64      3.36.1-1.fc32          updates                      17 k
 rtkit                                      x86_64      0.11-23.fc32           fedora                       58 k
 sound-theme-freedesktop                    noarch      0.8-13.fc32            fedora                      378 k
 speexdsp                                   x86_64      1.2.0-1.fc32           updates                     453 k
 startup-notification                       x86_64      0.12-19.fc32           fedora                       42 k
 switcheroo-control                         x86_64      2.2-1.fc32             updates                      38 k
 upower                                     x86_64      0.99.11-3.fc32         fedora                      176 k
 vulkan-loader                              x86_64      1.2.135.0-1.fc32       updates                     126 k
 webkit2gtk3                                x86_64      2.28.3-1.fc32          updates                      15 M
 webkit2gtk3-jsc                            x86_64      2.28.3-1.fc32          updates                     6.0 M
 webrtc-audio-processing                    x86_64      0.3.1-4.fc32           fedora                      313 k
 woff2                                      x86_64      1.0.2-8.fc32           fedora                       61 k
 wpebackend-fdo                             x86_64      1.6.0-1.fc32           fedora                       36 k
 xdg-dbus-proxy                             x86_64      0.1.2-2.fc32           fedora                       43 k
 xdg-desktop-portal                         x86_64      1.7.2-2.fc32           updates                     434 k
 xdg-desktop-portal-gtk                     x86_64      1.7.1-1.fc32           fedora                      239 k
 xorg-x11-server-Xwayland                   x86_64      1.20.8-1.fc32          fedora                      988 k
 xorg-x11-xauth                             x86_64      1:1.1-3.fc32           fedora                       36 k
 xorg-x11-xinit                             x86_64      1.4.0-6.fc32           fedora                       56 k
 zenity                                     x86_64      3.32.0-3.fc32          fedora                      4.3 M
Installing weak dependencies:
 flatpak                                    x86_64      1.6.4-1.fc32           updates                     1.5 M
 gnome-remote-desktop                       x86_64      0.1.8-2.fc32           updates                      69 k
 libldac                                    x86_64      2.0.2.3-5.fc32         fedora                       41 k
 p11-kit-server                             x86_64      0.23.20-1.fc32         fedora                      189 k
 pinentry-gtk                               x86_64      1.1.0-7.fc32           fedora                       48 k
 rygel                                      x86_64      0.36.2-5.fc32          fedora                      1.0 M
 vino                                       x86_64      3.22.0-17.fc32         fedora                      453 k

Transaction Summary
=================================================================================================================
Install  113 Packages

Total download size: 81 M
Installed size: 344 M
Is this ok [y/N]: y

Installing airgraph-ng on Kali Linux

There are many tutorials on using airgraph-ng on youtube; most omit the small detail that it cannot be installed using a package manager.

The airodump-ng utility within the aircrack-ng toolkit is used to scan wireless activity near your location, including not just wireless hot spots but also the devices connected to or trying to connect to those hot spots. As it only scans activity happening in the air and cannot be considered a hacking tool it can be considered legal to use anywhere. It can save results to a CSV file, which is not easily human readable.

The airgraph-ng utility covered here is one of the experimental scripts available in the aircrack-ng suite of tools. Its purpose is to take the CSV file generated by the airodump-ng utility and display the data captured as a pretty diagram showing the association between the devices captured. This is useful in determining what wireless devices are connected to what access points at a glance without having to write your own tools to parse the file.

As the airgraph-ng script is in the ‘experimental’ category it is not shipped as part of aircrack-ng packages that are available for most Linux distributions so needs to be installed from source. The aircrack-ng sources can be obtained with “git clone https://github.com/aircrack-ng/aircrack-ng.git”.

The airgraph-ng tool requires a bit of manual fiddling to get it to work after installation, which may be why it is bundled in the experimental category; but the fiddling required is covered in this post.

This post covers installing airgraph-ng (actually installing the entire package) on the normal full Kali OS server install, plus covers additional steps needed if you installed your Kali system from the Live DVD media rather than the server media.

The live DVD used was

Linux kali 4.19.0-kali3-amd64 #1 SMP Debian 4.19.20-1kali1 (2019-02-14) x86_64 GNU/Linux
gcc (Debian 8.2.0-14) 8.2.0

The server install used was

Linux kali 5.7.0-kali1-amd64 #1 SMP Debian 5.7.6-1kali2 (2020-07-01) x86_64 GNU/Linux
gcc (Debian 9.3.0-14) 9.3.0

Requirements:

  • A GCC compiler version below 10; as of July 18 2020 it is not possible to compile the aircrack-ng tools using GCC 10
  • Any Linux OS with an older version of the GCC Compiler available (Fedora 32 for example uses GCC 10 so cannot be used). This post uses the Kali OS
  • The OS must have python 2 installed (2.7 works OK); the airgraph-ng python script will not run under python3. Kali Linux includes both python and python3 so this is not an issue if you are using Kali Linux

The issue:

Most OS’s have the package aircrack-ng available in their repositories, Kali even comes with the package installed; this provides the standard airmon-ng and airodump-ng utilities. The issue is that the supplied packages do not include optional utilities such as airgraph-ng. If you want experimental features you must install from source.

This post was written on July 18 2020; issues with the source on github not compiling using GCC 10 may have been resolved by the time you read this.

Pre-requisites

If you installed from the live DVD:

If you installed a VM from the ‘live DVD’, after the install you will have no repositories configured; even the DVD media will have been commented out of the available repository list after installation. You need to be able to access repositories to install the packages required to compile the source. This is easily fixed with the below command.

cat << EOF >> /etc/apt/sources.list
deb http://http.kali.org/kali kali-rolling main non-free contrib
# For source package access, uncomment the following line
# deb-src http://http.kali.org/kali kali-rolling main non-free contrib
EOF

Additional packages required:

In order to compile the aircrack-ng utilities you need, at a minimum, the following packages installed in addition to the base Kali install.

apt-get clean
apt-get autoclean
apt update
apt upgrade
apt install build-essential
apt-get install automake autoconf autotools-dev 
apt install libtool pkg-config m4
apt install libssl-dev
apt install libnl-3-200 libnl-3-dev
apt install libpcap-dev
apt install libnl-genl-3-200 libnl-genl-3-dev

# The below are only needed if you installed Kali from the live DVD
apt install libstdc++-dev
apt install python3-graphviz graphviz
apt install python-pygraphviz
apt install python3-pydot python3-pydot-ng python3-pydotplus

Obtaining aircrack-ng, and Compiling

After all the pre-requisites have been satisfied you are ready to actually compile it. Do not just blindly copy/paste the below commands; run them one by one and fix errors as needed before running the next.

cd ~
mkdir git
cd git
git clone https://github.com/aircrack-ng/aircrack-ng.git
cd aircrack-ng
autoreconf -i
./configure --with-ext-scripts --disable-shared --enable-static
make
make install

If all went well you will have updated versions of the software in /usr/local/bin and /usr/local/sbin. The files installed by the default Kali install package reside in /usr/bin and /usr/sbin so both can co-exist at the same time. It is important to note however that on Kali the /usr/local directories are by default searched first so your newly compiled files will be chosen by default.

I would recommend you do not uninstall the aircrack-ng package and that you actually keep using the packaged versions of the utilities. The reason for that is that if you were paying attention during the configure step you will have noticed that many facilities were not implemented (for example pcap is available as libpcap-dev was one of the packages I stated you need to install, but pcre was not available as I could not locate a libpcre-dev package to provide it). It can be assumed that the packaged utilities have all features available so are probably the better ones to use. Remember this post is about obtaining the airgraph-ng command; if you want to obtain all the features and use the latest source you need to hunt down all the development libraries and header files needed to provide all the features.

airgraph-ng is now available, but it will not yet run

So, you think you are now ready to run airgraph-ng? Bad news, the script is broken. Fortunately not too badly; the issues are

  1. it requires a ‘support’ directory under your current working directory, which you need to manually create
  2. it wants to download a device list file from http://standards-oui.ieee.org/oui.txt into that directory, and fails to do so

The fix is simple; in your work directory

mkdir support
cd support
wget http://standards-oui.ieee.org/oui.txt
cd ..

Then run the airgraph-ng command again; with the ‘support’ directory and the required file within it in place it will finally work.

Other important things to note are:

  • The airgraph-ng script runs using python 2 (fortunately Kali has a 2.7 ‘python’ command as well as a python 3.x ‘python3’ command, so that is not an issue)
  • The airgraph-ng script will not run under python 3.x; if you try to install it on a server with only python 3.x forget it (at the current time), as the graphviz libraries it uses are for python2 and if airgraph-ng is run with ‘python3’ it cannot find them (this will obviously change in later releases, but for now it is a stopper)

An example usage, assuming wireless device is wlp1s0

--- On Terminal 1
airmon-ng start wlp1s0

--- On Terminal 2
airodump-ng wlp1s0mon -w /root/osint/data/airodump_scan_location_date
^C (control-C) when run for a while to capture data, at least 5mins

# take the wireless adapter out of monitor mode, should free up terminal 1 again
airmon-ng stop wlp1s0mon

# Then use the collected data
   # map as a png the wireless devices actually connected to the networks located
   airgraph-ng -i airodump_scan_location_date.csv -o 'airodump_scan_location_date_CAPR' -g CAPR
   # map as png devices trying to connect to networks, this can show what networks they
   # connected to in the past that they are trying to re-connect to.
   airgraph-ng -i airodump_scan_location_date.csv -o 'airodump_scan_location_date_CPG' -g CPG

The first graph allows connected wireless devices nearby to be mapped to the wireless hot spots they are connected to, which is useful for penetration testers, especially if ‘open’ unsecured ssids are found.

The details in the second graph can be used by hackers, using SSID spoofing to obtain the first two parts of a key handshake from a device trying to connect to the fake SSID. While the device will not connect, as the fake SSID does not have a valid key, there are tools that allow a key to be determined from those first two parts of the handshake; this allows a fully operational fake SSID to be created that routes all the traffic from any device connecting to it through that fake hot spot… so ensure your wireless devices are never set up to autoconnect (most phones do) and instead manually connect to networks you know are expected to be near you when needed, as all connection attempts are broadcast to every wireless hot spot within range, including those mapping other people’s networks.

Posted in Penetration Testing, Unix | Comments Off on Installing airgraph-ng on Kali Linux

Generating a new puppet client certificate

This is for the community edition of puppetserver, although it will work for all recent puppet releases.

I use puppetserver as it is supposed to be able to manage up to 100 agents without any issues and without having to hook it up to a web server such as apache, and I have nowhere near that many servers or VMs in my home lab. At the time of this post it is running on a CentOS7 VM with agents running on CentOS7, CentOS8, Fedora32 and Fedora31.

It should also be noted this has been posted primarily for my use, as I needed the info again and it took me a while to remember where I had placed the documentation; so I am posting it here so I can just search my blog as I need it.

I initially investigated this as I had an interesting issue where on a puppet master server a “puppet cert list --all” did not show a cert for one of the agent nodes, but the agent node had no problem updating from the puppet master server. It turns out that ‘cleaning’ a cert does not stop an agent node updating, as the node still has a signed cert. You have to revoke it on the puppet master and leave the revoked cert there, rather than clean (remove) it from the puppet master, to stop an agent updating.

So to stop a node updating just revoke.
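
For reference that is a single command on the puppet master, using the older ‘puppet cert’ syntax in use when this was written (newer releases use the ‘puppetserver ca’ commands noted in the update at the end of this post):

# on the puppet master - revoke (but do not clean) the agent certificate
puppet cert revoke youragenthostname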

For my initial problem I think the agent server in question somehow had another node’s certificate; but that’s another issue.

Anyway, I was looking for the documentation a second time because after a full OS upgrade of an agent the agent was unable to retrieve files from the puppet master; it seemed to connect and handshake OK but the errors were simply that it was unable to access file xxx, where xxx was every file relevant to that agent server. The solution in most cases where agent/master communication hits bumps is to generate a new certificate for the agent server.

It took me almost 10mins to find the documentation (after searching my blog first and not finding it). Therefore this blog post so I can find it easily next time by simply searching the blog, and there will be a next time.

To get a new certificate for an agent follow the steps below

To generate a new certificate for an agent node you must delete all existing certificates for the agent on both the agent and the master; a consolidated command sketch follows the list of steps.

  1. On the puppet agent as the root user, see the paragraph below the list of steps
    • systemctl stop puppet
    • puppet config print | grep ssldir
    • cd to the directory the ssl certificates are stored in identified by the ssldir entry
    • rm -rf *
  2. On the puppet master
    • puppet cert clean youragenthostname
  3. On the puppet agent
    • systemctl start puppet
  4. On the puppet master
    • use ‘puppet cert list’ occasionally to wait for the signing request, and ‘puppet cert sign youragenthostname’ when it eventually appears
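
A minimal sketch of the whole sequence, assuming the older ‘puppet cert’ commands used in the steps above and that ‘puppet config print ssldir’ reports the certificate directory to wipe:

# on the agent, as root
systemctl stop puppet
cd "$(puppet config print ssldir)" && rm -rf ./*

# on the master
puppet cert clean youragenthostname

# back on the agent
systemctl start puppet

# on the master, once the new signing request shows up in 'puppet cert list'
puppet cert sign youragenthostname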

The reason the work on the agent must be done as the root user is simply because that is the correct environment to work in. If you were to run the ‘puppet config print’ command as a non-root user you would see all the directory paths under the user’s own $HOME/.puppetlabs directory instead of the directories used by the puppet agent itself, and in most cases those paths under the user home directory would not exist.

If you have a complex environment where you do run copies of the puppet agent under multiple userids (pointless, as only the copy running under root can update system files), presumably overriding the hostname as a way of targeted testing, then you have missed the point of having the puppet master able to serve multiple environments (production, test1, test2, etc.) and I would not try to do such a thing. So, as this post is to remind me how to fix my environment, you are on your own sorting that out should you have gone down the path of running agents under non-root userids.

Special notes on cleaning up the master

Puppet does a good job of preventing you from messing up the certificates on the master, but if you try hard enough you can. Resolving that is similar; a short command sketch follows the steps below.

  1. On the puppet master as root
    • systemctl stop puppetserver
    • puppet config print | grep ssldir
    • cd to the directory the ssl certificates are stored in
    • rm -rf *
    • systemctl start puppetserver
  2. On every agent server follow the steps to be taken on the agent server mentioned in obtaining a new certificate above
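
A minimal sketch of the master-side part, again assuming ‘puppet config print ssldir’ reports the directory holding the certificates:

# on the master, as root - wipe the certificates and let puppetserver
# recreate the keys it needs when it is next started
systemctl stop puppetserver
cd "$(puppet config print ssldir)" && rm -rf ./*
systemctl start puppetserver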

When the puppetserver is started again it will recreate the keys it needs to function. The reason you must generate new certificates for every agent is simply that you will have deleted the signing key from the puppet master and it will have created a new one when it started, so all agent certificates are now invalid and will error when trying to use the puppet master; they must be recreated so they can be signed with the new key.

It is certainly possible to selectively delete certificates on the puppet master so the signing key remains; do not do that. If there is an issue on your puppet master severe enough to require manually deleting ssl keys, go for the ‘big bang’ and wipe them all; trying to fiddle about selectively could leave you in a worse unknown position, and it is simpler to regain a consistent environment by deleting them all and effectively starting from scratch.

Depending on how many agents you have you can configure the master to ‘autosign’ the new certificate requests coming in from the agents; however as you will be manually working through deleting old certificates from the agent servers one by one I see no benefit in that.

UPDATE: 8Jan2022 – a bit of a late update as the change was a while ago, but please note that more recent versions of puppet have retired the ‘puppet cert’ facility on the puppetserver/master and the puppet command is no longer used to manage puppetserver certificates; you now need to use ‘puppetserver ca list [--all]’, ‘puppetserver ca clean --certname=xxxx’ or ‘puppetserver ca sign --certname=xxxx’ on the puppetserver machine rather than the ‘puppet cert’ commands from the original post. All else remains the same.
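
For example, on a recent puppetserver the master-side commands from the steps above become something like the following (the agent hostname is of course a placeholder):

# on the puppetserver/master
puppetserver ca list --all
puppetserver ca clean --certname=youragenthostname
puppetserver ca sign --certname=youragenthostname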

Posted in Automation, Unix | Comments Off on Generating a new puppet client certificate

Installing a local GitLab instance using the supplied container

There are lots of complicated ways to install a local GitLab instance, however for small environments it is easiest to simply use the supplied container image. Instructions on installing that are at https://docs.gitlab.com/omnibus/docker/ and are simple to follow.

Install, start, configure

For my home lab environment installing the GitLab container was simply a case of

  1. create a new KVM based on CentOS-8 with 4Gb of memory, 2 vcpus, and a 50Gb qcow2 disk
  2. install docker
    dnf config-manager --add-repo=https://download.docker.com/linux/centos/docker-ce.repo
    dnf list docker-ce
    dnf install docker-ce --nobest
    usermod -aG docker root
    usermod -aG docker mark
    systemctl enable docker
    systemctl start docker
    
  3. in /etc/profile add the required ‘export GITLAB_HOME="/srv"’ line (it does not have to be in the global profile but I did not want to have to add it in multiple places)
  4. docker pull gitlab/gitlab-ce:latest
  5. start it
    docker run --detach \
       --hostname gitlab-local.xxx.xxx.org \
       --publish 443:443 --publish 80:80 --publish 5522:22 \
       --name gitlab \
       --restart always \
       --volume $GITLAB_HOME/gitlab/config:/etc/gitlab:Z \
       --volume $GITLAB_HOME/gitlab/logs:/var/log/gitlab:Z \
       --volume $GITLAB_HOME/gitlab/data:/var/opt/gitlab:Z \
       gitlab/gitlab-ce:latest
    

    note that I use port 5522 for ssh as the server running the container is already using port 22 for its own ssh server; also the :Z appended to the volume lines is because selinux is enabled on my server

Then point a web browser at the ip-address of the server (http://xxx.xxx.xxx.xxx rather than https) and, using the userid root, create a new password for the GitLab root user. At that point you can configure a few of the options, such as not letting new users create their own accounts.

Create a normal non-admin user, and setup ssh keys

You will then want to create a non-root/non-admin user for day to day use for your projects.

When defining a new user GitLab wants to send an email to the new user to allow them to create their new password, by default the GitLab container does not support outbound email. If that is an issue for you documentation on setting up email is at https://docs.gitlab.com/omnibus/settings/smtp.html#example-configuration, however it is not required (see next paragraph).

After defining a new user you just “edit” the user, which allows you to set an initial password for them; when they use it at first logon they are required to change it, so having email is not actually required to define a new user. If you are going to run the container in your home lab you probably do not want emails out of GitLab enabled anyway.

Once done log out from the root user.

Setup a ssh key for the new user, fully documented at https://docs.gitlab.com/ee/ssh/, but briefly covered in the following paragraphs.

Logon to the GitLab interface as the new user you created, with the password set when you edited the user above. You will be required to change the password, and then logon again with the new password.
Once logged on go to the user settings and select ssh keys.

On a terminal window on your Linux development desktop use ssh-keygen -t ed25519 -C “some meaningful comment” to create a new ssh key; note that I changed the name of the default file when prompted from id_ed25519 to id_ed25519_local_gitlab so I can clearly identify it, which raises issues also covered below.
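
A minimal sketch of that, assuming the same non-default key file name (the -f flag simply pre-answers the file name prompt):

# generate the key straight into the non-default file name
ssh-keygen -t ed25519 -C "home lab gitlab key" -f ~/.ssh/id_ed25519_local_gitlab
# display the public half, ready to paste into the GitLab web interface
cat ~/.ssh/id_ed25519_local_gitlab.pub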

Once you have generated the key, copy the contents of the .pub file created for the key to your clipboard.

Paste the ssh key in your clipboard into the ‘Key’ field provided, give it a meaningful title and add the new key.

Back in the terminal window test the key works with “ssh -T git@xxx.xxx.xxx.xxx”, or in my case as I mapped port 5522 “ssh -p 5522 -T git@xxx.xxx.xxx.xxx”; reply yes to accept the fingerprint; you should see a ‘Welcome to GitLab @username!’ message before being returned to your own hosts shell. Repeat to ensure the fingerprint has been accepted and prompts will not interfere with workflow anymore.

For those used to using ssh to jump around machines it is worth pointing out that the command is in fact git@xxx.xxx.xxx.xxx and not the username of the GitLab account; that can be confusing at first glance, but the userid used must be git, and GitLab magically determines what username to map to based on the key used.

It should also be noted that by default one of the key files ssh looks for is ~/.ssh/id_ed25519, which is the default filename used when generating the key. If like me you have many identity key files discretely separated, the command for me is “ssh -p 5522 -i ~/.ssh/id_ed25519_local_gitlab -T git@xxx.xxx.xxx.xxx” to select the key I want to use.

Create a project as the new user

While logged on as the newly created user you may want to create groups, but if it is for a home lab environment you probably want to jump directly to starting a new project.

The main reason new users should create a new project via the browser interface is that when a new project is created a ‘help page’ for the new project is displayed showing the git commands needed to setup the git environment and create a new local repository on your development desktop by cloning the new empty project or pushing an existing git repository to your new project. It does assume you are using default ssh identity files however.

Workarounds needed for using a container

The documentation recommends mapping port 22:22, which if used means the host running the container would have to provide ssh services on another port; to me that is silly. So as noted in my example of running the container above I mapped 5522:22 so the host running the container can still provide normal ssh services on port 22. It does mean port 5522 (or whatever port you chose) needs to be opened in your firewall.
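
On a CentOS-8 host running firewalld that is a couple of commands; the port is of course whatever mapping you chose, 5522 in my case:

# open the mapped GitLab ssh port on the host running the container
firewall-cmd --permanent --add-port=5522/tcp
firewall-cmd --reload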

If you need to use git with multiple ssh identity files as I do there are a few interesting answers at https://superuser.com/questions/232373/how-to-tell-git-which-private-key-to-use.

I prefer the use of ~/.ssh/config myself to provide different keys to different hosts, as that also allows the port to be changed. It does mean you need a separate dns entry for the hostname and gitlab-hostname, as you only want to switch ports when accessing the GitLab ssh port and not interfere with the host running the container providing its normal ssh service on port 22.

Assuming a /etc/hosts file entry as below (or dns entry where both hosts map to the same ip-address)

xxx.xxx.xxx.xxx gitlab-local.xxx.xxx.org realservername.xxx.xxx.org

Using the below example of a ~/.ssh/config file, any ssh to gitlab-local.xxx.xxx.org will use the specified key in the user’s .ssh directory instead of the default, plus use port 5522 to connect to the remote host’s ssh service. And ssh to any other host (including realservername.xxx.xxx.org) will (as no overrides have been added for the ‘Host *’ catch-all entry) use the normal port and all default ssh parameters.

Host gitlab-local.xxx.xxx.org
  IdentityFile ~/.ssh/id_ed25519_local_gitlab
  Port 5522
Host *

Actually using your project from the standard git command line

Then on your development server, pushing files you are already working on to the project you created earlier is simply a case of following the git instructions that were shown on the web page displayed when your project was created (obviously it would be easier if you added a hostname to your /etc/hosts or dns rather than use the ip-address, but either works).

cd existing_folder
git init
git remote add origin git@xxx.xxxx.xxxx.xxxx:user_name_you_created/project_name_you_created.git
git add .
git commit -m "Initial commit"
git push -u origin master

And your GitLab browser interface will show the files are in the project now.

To use any of the devops features you will need a bit more work, but for simply installing a local git-based software repository that users can work on, that’s all that is needed.

So excluding the time it takes to download a CentOS-8 install iso, it takes less than 15mins to create a new CentOS-8 VM (remember to assign a static ip-address), and about another 15mins to get it all working; so give it a try.

Posted in Unix | Comments Off on Installing a local GitLab instance using the supplied container

Re-initialising a bacula database

OK, first, on a well managed system you would never need to do this. Despite that, my system, which I thought was well managed and had 250Gb free in the backup storage pool rotating around quite happily, hit trouble when one of my servers suddenly needed over 500Gb of backup space; oops, 100% full backup filesystem. A windy day and the motion activated webcam was going wild from a tree waving about.

The normal activity of pruning jobs could not resolve the issue as multiple servers were using the backup volumes, so despite extensive pruning no volumes became free to release space. My only option was to start with a fresh bacula environment.

From previous experience I knew that simply recreating the SQL tables does not create a working environment, as flat files (non-database files in the filesystem) are also used and leaving those behind results in a non-working state. And I discovered I had not documented how I resolved it so many years ago; so for my future reference this is what is needed.

It should be an absolute last resort, as all existing backups are lost. Even if you use multiple storage daemons across multiple servers all backups are lost; re-initialising the database means you must delete all backup volumes, across all your distributed storage servers, associated with the database used by the bacula director affected.

This post is primarily for my use, as while I hope I will not need it again, I probably will, if only for moving existing servers to new storage pools.

Anyway, out with the sledge hammer; how to start from scratch.

On your bacula director server

systemctl stop bacula-dir

On all storage servers used by that director

systemctl stop bacula-sd
cd /your/storage/pool/dir
rm *

On your bacula director server

# Just drop and recreate the tables. Do not delete the database,
# leaving the database itself in existence means all user/grants
# for the database remain valid.
cd /usr/libexec/bacula
./drop_mysql_tables -u root -p
mysql -u root -p
  use bacula;
  show tables;
  drop table xxx;   # for each table remaining
  \q
./make_mysql_tables -u root -p
# remove everything from /var/spool/bacula, note I do not delete everything as
# I have a custom script in that directory for catalog backups; so this deletes
# everything apart from that script.
# Failure to do this step results in errors as backup details are stored
# here, which would conflict with an empty database.
cd /var/spool/bacula
/bin/rm *bsr
/bin/rm *state
/bin/rm *conmsg
/bin/rm log*

On all of your storage servers

systemctl start bacula-sd

On your bacula director server… after editing the exclude lists to exclude whatever directory caused the blowout

systemctl start bacula-dir

At this point all is well, and your scheduled backups will be able to run again. The issue of course is that the next time they are scheduled to run all incremental backups will now say something like ‘no full backup found, doing a full backup’, which, while exactly what you want, means your backups will take a lot longer than expected on their next run; plus if you had been staggering your full backups across multiple days, bad news, as they will now (assuming they all have the same retention period) all want to run on the same day in future.

Use of the ‘bconsole’ interface over a few days to delete/prune and rerun full backups can get the staggering back, but it is a bit of a pain; an example of forcing an early full backup is shown below.
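
A hedged example of the bconsole side of that; the job name is a placeholder for whatever the job is called in your director configuration:

bconsole
# inside the console, force an immediate full backup of one job so its full
# backups can be staggered away from the others
run job=BackupServer1 level=Full yes
quit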

It is the damn incremental backups that use all the space; the full backups of all my servers only used 22% of the storage pool.

Ideally, due to space limitations, I should revise my monthly full backup strategy to fortnightly so I can keep two weeks of incrementals rather than four weeks of them, and hope I never need to restore a file over two weeks old. However in the real world if a personal physical server dies it may take over two weeks to be repaired and slotted back in, so for now I’ll stick with monthly.

Posted in Automation, Unix | Comments Off on Re-initialising a bacula database