Ansible vs Puppet: why I chose Puppet

First off, let me make it clear that I looked at options for automating configuration quite a while ago and chose puppet as the most secure option. I still believe that to be the case.

The issue I have with options such as Ansible and Chef is that they need ssh keys lying about all over the place, whereas the CE release of puppet uses only agents. I have not really looked at Chef as I do not have the computing resources available to run it, but I have taken another look at ansible, which is what this post is about.

My reference for quickly installing ansible was https://www.redhat.com/sysadmin/configuring-ansible

It is also important to note that this is my first serious look at ansible; as such, many of the issues I have with it may have solutions I have not found yet.

This post is in no way meant to identify one solution as better than the other, as there are probably many ansible (and puppet) plugins and tools to emulate each other’s functionality that I have not had the need to find yet.

Ansible notes quick overview

Ansible requires, as well as the ‘ansible’ package installed on the control server:

  1. the ansible user defined on all servers to be managed
  2. ssh keys generated, and the public key of the ansible control server(s) installed on all servers to be managed
  3. an entry in the sudoers file to allow the ansible user on all servers to be managed to run commands as root without password prompting
  4. the sftp service to be available on all servers to be managed

The issues I see with this are

  1. no issues, a normal user
  2. anyone with access to logon to the ansible userid on any of the control servers can logon to any managed server without further authentication. Normally not much of an issue, as most users setup ssh keys to allow that between a few machines, but keys to all servers in the environment is not normal
  3. and once on any server they can issue any command they want with root authority
  4. and normally (and by default) the sftp service is disabled in /etc/ssh/sshd_config as most users use ‘scp’. This last is probably not a risk, but why require services disabled by default when there are alternatives that are enabled by default?

Anyway, I created a new class in puppet to define the new group and user on all servers I have puppet managing. I explicitly defined a group, then added the new user to that group, to ensure it was the same on all servers; I chose ids well away from existing user number ranges rather than defaulting to the next free gid/uid, as defaulting would have resulted in different numerical ids on my servers. This new puppet class also copied the public key for the ansible user to all servers being managed.
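A minimal sketch of such a class (the uid/gid of 2001 and the key string are placeholders for illustration; my real values differ):

class ansible_user {
   group { 'ansible':
      ensure => present,
      gid    => '2001',            # fixed gid so it is identical on all servers
   }
   user { 'ansible':
      ensure     => present,
      uid        => '2001',        # fixed uid, well away from existing ranges
      gid        => 'ansible',
      home       => '/home/ansible',
      managehome => true,
      require    => Group['ansible'],
   }
   ssh_authorized_key { 'ansible@controlserver':
      user => 'ansible',
      type => 'ssh-rsa',
      key  => 'AAAA...control-server-public-key...',   # placeholder key string
   }
}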

At that point I had to manually do the following (disclaimer: where I refer to ‘all servers’ I only chose a subset of servers to add to a group, as it was getting to be a lot of work for a simple test):

  1. from the control server, ssh to all the servers to be managed and reply ‘y’ to the new fingerprint message; I now know that could have been avoided by running the ansible ping command against a ‘group’ containing all servers with the --ssh-common-args="-o StrictHostKeyChecking=no" option to the ansible command, which has ssh automatically accept the fingerprint
  2. manually, on every server to be managed, add a new sudoers entry to allow the ansible group to issue any damn command it wants with root authority without needing password authentication (I used the new group rather than adding the new user to an existing group as some tutorials suggest, simply because I want existing groups to still have to enter a password when using ‘su’); an example entry is shown below this list
  3. manually, on all servers to be managed, uncomment the sftp entry in the sshd_config file and restart sshd
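The sudoers entry for step 2 amounts to something like the below (a sketch; I used a file under /etc/sudoers.d/ and the group created by the puppet class above):

# /etc/sudoers.d/ansible - members of the ansible group may run any
# command as root without a password prompt
%ansible ALL=(ALL) NOPASSWD: ALL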

Only then would a simple ansible ping of the servers work.
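That simple ping being something like the below (the inventory path matches the one I use later in this post; ‘all’ is ansible’s built-in group matching every host in the inventory):

ansible all -i /etc/ansible/hosts_local -m ping \
        --ssh-common-args="-o StrictHostKeyChecking=no"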

Ansible allows groupings of servers so commands can be issued against groups; it also allows for multiple inventories to be used, as long as you remember to use the ‘-i’ option to select the correct one. Output from commands appears to be returned in JSON format, so if you want to script the handling of responses there will be a few scripting tool packages you would need to install.
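An inventory file is just a list of hosts under group headings; a minimal sketch (the hostnames are from my environment, yours will differ):

# /etc/ansible/hosts_local - a minimal INI-style inventory
[desktops]
phoenix
vmhost3

[webservers]
somewebserver
anotherwebserver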

By default ansible works on a ‘push’ model where changes are pushed out from the control server; however documentation at https://docs.ansible.com/ansible/latest/user_guide/playbooks_intro.html#id16 describes an “ansible-pull” utility that can alter that to an environment where your managed nodes query the control node instead, which is apparently best for large environments. I have not tried that and presumably it would require more messing about with ssh keys.
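From the documentation the basic idea appears to be that each managed node runs something like the below, typically from cron (untested by me; the repository URL and playbook name are made up):

ansible-pull -U https://git.example.com/ansible-config.git local.yml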

Puppet notes quick overview

Obtaining a working puppet CE environment is simply a case of:

  1. ensuring your DNS can resolve the server name ‘puppet’ to the server you will install ‘puppetserver’ on
  2. opening firewall ports; agents poll the master on port 8140, so that needs to be open on the master server
  3. installing ‘puppetserver’ on the control node and starting it
  4. installing the ‘puppet’ (agent) package on each server to be managed and starting it

At this point, on the ‘puppet’ master a ‘puppet cert list’ will show all your servers waiting for you to ‘puppet cert sign hostname’ to allow them to use the master. It should also be noted that there is a puppet configuration option to permit ‘autosigning’, which makes it easier to enroll all your agent servers when first installing a puppet solution; I switched it on when first adding all my servers, then switched it off again.
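For reference, the commands on the master (the agent hostname is a placeholder), and the autosign setting:

puppet cert list                      # show agents waiting to be signed
puppet cert sign agent1.example.org   # sign one agent
# and to autosign, temporarily, in /etc/puppetlabs/puppet/puppet.conf:
#   [master]
#   autosign = true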

What the puppet CE solution does not provide is an equivalent of the ansible ‘--become’ option that allows anyone logged on to the ansible user on the control node to issue any command they desire as root, without authentication, on any of the managed server nodes… I personally think not being able to do so is a good thing.

However, if you really wanted that facility you could configure sshd with ‘PermitRootLogin yes’ on all your puppet managed nodes, set up ssh keys from the puppet master, and simply use ‘ssh hostname anycommand’ to issue commands as root on any managed server. So if you want to open the gaping hole ansible opens you can do so anyway… although I would suggest adding a new user and allowing it to sudo without a password exactly like ansible does; either way, you don’t need ansible for that dangerous feature.

Puppet’s equivalent of the ansible playbook groupings by function are puppet profiles and roles, and a server may have multiple of each. It also supports multiple environments (ie: a test/dev as well as the default ‘production’); there is no reason it could not also contain environments such as webservers, databases etc, but application specific grouping is probably better left to puppet’s use of role and profile functions to group application configurations.
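A sketch of that pattern (the class names here are illustrative, not from my environment):

class role::webserver {
   include profile::base      # configuration every server gets
   include profile::httpd     # the application specific profile
}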

Its grouping of the servers themselves is done in manifests/site.pp, where groups of servers can be defined by wildcard host name or selected as lists of individual hosts, and given the roles/modules associated with them.
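A sketch of what that looks like (the hostnames and the regex are examples):

# manifests/site.pp
node /^web\d+/ {                             # regex/wildcard host match
   include role::webserver
}
node 'somewebserver', 'anotherwebserver' {   # explicit list of hosts
   include httpd
}
node default {                               # everything else
   include profile::base
}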

Puppet works on a ‘pull’ model; the puppet agents on each managed node poll the master for any updates.

Usage differences

Ansible uses YAML syntax, which is very dependent on correct indenting. The latest versions of puppet have their own syntax (although they still support ruby for backward compatibility), and puppet class files do not care about indenting as long as you have the correct number of braces and brackets. I also find puppet configurations easier to read.

An example of an ansible playbook to install httpd:

---
- hosts: webservers
  remote_user: ansible
  become: yes
  tasks:
  - name: Installing apache
    yum:
      name: httpd
      state: latest
  - name: Enabling httpd service
    service:
      name: httpd
      enabled: yes
    notify:
      - restart httpd
  handlers:
  - name: restart httpd
    service:
      name: httpd
      state: restarted

Then run the command "ansible-playbook -i inventoryfile playbookname.yaml", where ‘-i’ names the inventory file to use.

The puppet equivalent:

class httpd {
   package { 'httpd':
     ensure => installed,
   }
   service { 'httpd':
     ensure => running,
     enable => true,
   }
} # end httpd class

Then ensure the new class is added to the site manifest:

node 'somewebserver','anotherwebserver' {
   ...some stuff
   include httpd
}

Deployment is automatic, although dependent upon the agents poll interval; an immediate refresh can be done from the agent side if impatient.
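That immediate refresh is done by running, on the agent itself:

puppet agent --test    # one-shot foreground run rather than waiting for the poll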

Both ansible and puppet provide a way to publish configuration files and restart services when a file changes.

Ansible is a bit confusing here; I am not sure if the below will work, or exactly where it belongs in a playbook (tasks and handlers both appear to be play-level sections, so presumably alongside each other as in the httpd example above).

tasks:
  - name: Copy httpd configuration file to client
    copy:
      src: /some/path/to/a/file/httpd.conf
      dest: /etc/httpd/httpd.conf
      owner: root
      group: root
      mode: '0644'
    notify:
      - restart apache
handlers:
  - name: restart apache
    service:
      name: httpd       # the service is named httpd on RedHat-family systems
      state: restarted

In Puppet there is no ‘handler’ required to be defined, as the ‘notify’ can be part of the file resource itself. And I personally bundle configuration files for an application within the application class to make them easy to find.

class httpd {
   package { 'httpd':
      ensure => installed,
   }
   service { 'httpd':
      ensure => running,
      enable => true,
   }
   file { '/etc/httpd/httpd.conf': # file resource title, the standard is to use the path
      path => '/etc/httpd/httpd.conf', # destination path
      ensure => file,
      owner => 'root',
      group => 'root',
      mode => '0644',
      source => 'puppet:///modules/httpd/httpd.conf', # source of file to be copied
      notify => Service['httpd'],
   }
} # end httpd class

Additional puppet features I use

One very useful feature is managing logs. While I am sure most sites have implemented log maintenance scripts, I found using puppet to manage its own logs easier than creating new scripts; an example is below.

node 'puppet' {
   ... some roles and classes being used
   tidy { "/opt/puppetlabs/server/data/puppetserver/reports":
      age => "1w",
      recurse => true,
   }
}

There is also templating, which allows files to be customised for each server; it is used to read a template file from which substitutions are made to provide the input contents for a file to be copied to an agent node. This means that configuration files that would be identical in all but a few key parameters can, when being copied to an agent, be built on the fly with the correct values set, rather than having to have a separate file for each agent to handle those small differences; the values can also be set using ‘facter’ information.

The below example is what I use to install bacula-fd (the bacula-client package) on all my servers; it ensures the FD name is unique by using the hostname as part of the FD name, and binds it to the default ip-address rather than the default of listening on all interfaces… the one template creates a unique configuration for all my servers as soon as the puppet agent starts.

For example, a snippet from a template (epp) file may be:

<%- | String $target_hostname,
String $target_ipaddr
| -%>
# Bacula File Daemon Configuration file
# FileDaemon name is set to agent hostname-fd
...lots of stuff
# "Global" File daemon configuration specifications
FileDaemon { # this is me
   Name = <%= $target_hostname %>-fd
   FDport = 9102 # where we listen for the director
   FDAddress = <%= $target_ipaddr %>
   WorkingDirectory = /var/spool/bacula
   Pid Directory = /var/run
   Maximum Concurrent Jobs = 20
}
...lots more stuff


And the class file would contain the below to get the facter hostname and use it in creating the file contents:

class bacula_fd (
  $target_hostname = $facts['hostname'],
  $target_ipaddr = $facts['networking']['ip'],
){
   ...lots of stuff
   # Use a template to create the FD configuration file as it uses
   # the hostname to customise the file.
   $template_hash = {
     target_hostname => $target_hostname,
     target_ipaddr => $target_ipaddr,
   }
   file { '/etc/bacula/bacula-fd.conf':           # file resource name
       path => '/etc/bacula/bacula-fd.conf',      # destination path
       ensure => file,
       owner => 'root',
       group => 'bacula',
       mode => '0640',
       content => epp('bacula_fd/bacula-fd.conf.epp', $template_hash),
       notify  => Service['bacula-fd'],
     }
   ...lots more stuff
} # end class

It should be noted that ansible documentation makes reference to templates. From what I can see that term is used in a different way for ansible, as I can’t see how they can interact with an ansible copy task. I have found an example of ansible using variables in a similar way, as below, so I assume it is possible, just hard to find documentation on.

   - name: create default page content
     copy:
       content: "Welcome to {{ ansible_fqdn}} on {{ ansible_default_ipv4.address }}"
       dest: /var/www/html/index.html
       owner: webadm
       group: web
       mode: u=rw,g=rw,o=r
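Digging further, templates in ansible appear to be handled by a separate ‘template’ module (using Jinja2 syntax) rather than the copy task. A sketch of what the bacula example above might look like, assuming a Jinja2 template file ‘bacula-fd.conf.j2’ and a ‘restart bacula-fd’ handler exist:

   - name: create bacula-fd config from a template
     template:
       src: bacula-fd.conf.j2             # assumed Jinja2 template file
       dest: /etc/bacula/bacula-fd.conf
       owner: root
       group: bacula
       mode: '0640'
     notify:
       - restart bacula-fd                # assumed handler, defined elsewhere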

One other ability of puppet I make heavy use of is its ability to query facter information. One class file can, with the use of if/else statements, run blocks of code depending on OS version; so an application class file can install the correct packages for CentOS7, the correct but different packages for CentOS8, and completely different packages for each of Fedora30/31/32, to result in the application installed and running (or skipped if the OS does not support it). I have not seen any ansible yaml files that provide that, so I assume multiple inventory files are needed, one for each OS type.

For servers with firewalld I can use a single class file with all common services and ports, and if/else to provide all customisation for different servers in one place using rich firewalld rules (note: ansible seems to have only the normal rules for services and ports but not rich rules, but it may just be another case of ansible documentation/examples being hard to find). It looks like for something similar in ansible you would have separate yaml files (playbooks) for each application type; in other words it is not possible to contain all firewall rules for the entire infrastructure in one file if using ansible.

The above two paragraphs highlight an issue for me, as I believe one of the key reasons for using a configuration product is that configuration information can be easily accessed in one place, and that one place can be deployed from. If multiple files are used you may as well just have those multiple files managed on their multiple servers, as ansible is then effectively just a backup copy doing pushes; if you have to maintain multiple files, exactly the same file placement can be achieved by editing the files on their individual servers and keeping copies on a backup server… which is pointless, as you of course backup your servers anyway.

Puppet examples of a class using if/else (now how would you put this into a single ansible yaml file? You don’t; I would assume you create lots of server groups based on OS, with separate playbooks):

   if $facts['hostname'] == 'phoenix' {
      ...do something unique for this server
   }
   # note: htmldoc is not available on CentOS8 except from snap, so need a check here
   if ( $facts['os']['name'] == "CentOS" and Integer($facts['os']['release']['major']) < 8 ) {
      package { 'htmldoc':
         ensure => installed,
      }
   } else {
      if ( $facts['os']['name'] == "Fedora" ) {
         package { 'htmldoc':
           ensure => installed,
         }
      } # else package not available so do nothing
   }

And of course a puppet class has case statements, which can either do actions or set variables to be used later in the class.

   # These rules below are specific to OS releases where the command syntax is different
   # note: 'facter -p' run on a server provides facter details
   case $facts['os']['name'] {
      'RedHat', 'CentOS': {
         case $facts['networking']['hostname'] {
            'region1server1': { $fname = "centos_openstack_controller.cfg" }
            'region1server2': { $fname = "centos_openstack_compute.cfg" }
            default:          { $fname = "centos.cfg" }
         }
      }
      'Fedora': { $fname = "fedora.cfg" }
      default:  { $fname = "fedora.cfg" }
   }
   file { 'nrpe_os_specific':
      path => "/etc/nrpe.d/${fname}",
      ensure => file,
      owner => 'root',
      group => 'root',
      mode => '0644',
      source => "puppet:///modules/nrpe/${fname}",
      notify  => Service['nrpe'],
    }

As a puppet class is effectively the equivalent of an ansible yaml file playbook, I consider puppet to be better for self documenting, as a single class can contain all the logic needed to deploy an application on multiple OSs where I believe ansible may require a playbook per OS; although I may later find I am incorrect in that assumption.

Features Ansible provides that Puppet CE does not

The most glaring difference is the way that the ansible command line can be used to issue commands to multiple hosts at once. I have never had a need to simultaneously shut down multiple databases or webservers on multiple hosts at once, although I can see the power of it.
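For example, an ad-hoc command along the lines of the below should stop httpd on every host in a ‘webservers’ group at once (a sketch, reusing the inventory layout from earlier):

ansible webservers -i /etc/ansible/hosts_local --become \
        -m service -a "name=httpd state=stopped"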

The most useful thing I can think of to do with such power is to have a script that runs a playbook on each server to do lots of netstats, pings, nmap etc and have a script process the results to build a map of your network and its responsiveness. But then there are probably existing tools for that.

Ansible also has a ‘cron’ module that can be used in tasks to add crontab entries. I can see how that would be useful when deploying new software.
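A minimal sketch of a task using it (the job name, script path and schedule here are made up for illustration):

   - name: Add a nightly cleanup crontab entry
     cron:
       name: "nightly cleanup"              # identifies the crontab entry
       minute: "0"
       hour: "2"
       job: "/usr/local/bin/cleanup.sh"     # placeholder script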

Where I can also see it being useful is the ability to ad-hoc copy a file to multiple servers with one command, although for config files that need ongoing management puppet does that well.

The documentation says that playbooks can orchestrate steps even if different steps must bounce between machines in a particular order. This will be useful to me as openstack yaml stack files are limited in the information they can pass to the instances they are creating, so ansible could replace some of my custom scripts… the damn ssh fingerprint prompt the first time a server is connected to by ansible, which totally breaks any automation, can be suppressed with the option --ssh-common-args="-o StrictHostKeyChecking=no" used the first time a command is run.

Compliance, consistency and summary

Ansible’s ‘push’ method requires user interaction, although I am sure a cron job could be set up to run every hour across all servers to ensure configuration files and software packages are in an expected state. It is entirely possible RedHat have a commercial dashboard and scheduling system to do just that, but you don’t get that from installing ansible.

Puppet on the other hand will poll the master at regular intervals and replace any configuration file it finds changed with the one from the puppet master; ensuring that if miscreants are modifying critical configuration files, all their changes are undone in a totally hands-off way.
Puppet will also, at the agent poll interval, start services that should be running but were stopped, which is nice to have done automatically rather than having to issue ansible commands periodically to see if state needs changing. It is also ‘not-nice’ when you want something stopped, and find stopping the puppet agent before stopping an app does not have the desired effect because nagios event handlers restart puppet, which restarts the app, which… is a reason to document every automation tool in place and what they manage in a pretty flow diagram.

An example of what not to do: in my environment if I want to modify iptables I need to stop docker; if I stop docker a nagios/nrpe event handler will see it is down, reload the iptables and start docker; so I have to stop docker and nrpe; then puppet pokes its nose in and, expecting nrpe up, starts it, resulting in nrpe loading iptables and starting docker again; so I have to stop puppet, nrpe and docker to make a change. Do I want a stray ansible command issued to start things again as well? No. So ideally there should only be one tool managing application persistence in use at a time; on the other hand, if nrpe crashed I would want it automatically restarted… where do you draw the line? Well, in a flow diagram, so you know what applications to stop in order to keep them stopped.

So my summary of the two is that Puppet is best for configuration management, and ansible is best for issuing commands to multiple places at once. I may leave ansible installed along with puppet a while to see if ansible can be of use to me. For ensuring services are running nagios/nrpe monitoring and nagios/nrpe event handlers to try to restart failed tasks is still the best opensource/free application persistence tool and configuration management tools like ansible/puppet/chef should avoid stepping on its toes.

Other notes

I did use ansible to install one package, using a ‘lynx.yaml’ file that used ‘hosts: desktops’, a group which had two hosts defined under it in the hosts_local inventory; after I updated the sudoers file on the two machines I was testing against, it worked.

The errors from a failed change (before I updated sudoers) are below… now I ask you, how could you automate scripting to check for errors in output like this? Also, Fedora 32 is the latest release of Fedora and ‘/usr/bin/python --version’ returns ‘Python 3.8.5’, so ansible really needs its checks updated, as not all OSs have renamed it to python3.
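(One note on scripting such checks I found later: ansible has a ‘json’ stdout callback plugin, so something like the below should produce machine-parseable output instead; untested by me.)

ANSIBLE_STDOUT_CALLBACK=json ansible-playbook -i /etc/ansible/hosts_local lynx.yaml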

[ansible@puppet desktops]$ ansible-playbook -i /etc/ansible/hosts_local lynx.yaml

PLAY [desktops] ***************************************************************************************************************************

TASK [Gathering Facts] ********************************************************************************************************************
[DEPRECATION WARNING]: Distribution fedora 32 on host vmhost3 should use /usr/bin/python3, but is using /usr/bin/python for backward 
compatibility with prior Ansible releases. A future Ansible release will default to using the discovered platform python for this host. 
See https://docs.ansible.com/ansible/2.9/reference_appendices/interpreter_discovery.html for more information. This feature will be 
removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
ok: [vmhost3]
[DEPRECATION WARNING]: Distribution fedora 32 on host phoenix should use /usr/bin/python3, but is using /usr/bin/python for backward 
compatibility with prior Ansible releases. A future Ansible release will default to using the discovered platform python for this host. 
See https://docs.ansible.com/ansible/2.9/reference_appendices/interpreter_discovery.html for more information. This feature will be 
removed in version 2.12. Deprecation warnings can be disabled by setting deprecation_warnings=False in ansible.cfg.
ok: [phoenix]

TASK [Ensure lynx is installed and updated] ***********************************************************************************************
fatal: [vmhost3]: FAILED! => {"msg": "Missing sudo password"}
fatal: [phoenix]: FAILED! => {"msg": "Missing sudo password"}

PLAY RECAP ********************************************************************************************************************************
phoenix                    : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
vmhost3                    : ok=1    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0   
