OpenStack: Upgrade from Mitaka to Newton was a pain, but completed

There were the expected issues with a system-wide update, where rabbitmq-server and all the erlang packages have to be removed before the update and re-installed afterward; but that was the only real issue (if you exclude the minor detail that config files had to be retweaked: openstacklocal was put back in as the domain, which stopped DNS working, and the extension_drivers = dns entry also had to be re-added; but minor things).

After the upgrade, instances would not boot correctly. In case it was an issue with my upgrade procedure I decided to use fresh vanilla installs for this post, so I built new VMs and installed Newton from the RDO distribution from scratch into those. Exactly the same problems appeared, so I guess my upgrade probably worked; but I have a new ‘clean’ install to play with now anyway.

So this post is about a fresh install: while I think the upgrade worked, I ended up working from a fresh Newton install. It covers the main issues I had getting a fresh Newton install working as correctly as the Mitaka one was. At least the issues I can remember; it took me a few months in my spare time, so some of the earlier issues may have faded.

The install was onto “CentOS Linux release 7.3.1611 (Core)” using the RDO repositories for Newton. The configuration used was a packstack generated and customised answers file for two servers; those being an all-in-one primary server plus a second compute only server using vxlan across openvswitch for the tenant networks and a “flat” interface for the external network provided by the primary server.

Instances do not boot, a known issue, manual config change to fix

But instances just would not boot (they would go to running state but never complete a boot). I ended up doing complete vanilla installs, both with --allinone and with my two compute node environment; the problem still exists in a vanilla RDO install.

This is a documented bug at https://bugzilla.redhat.com/show_bug.cgi?id=1404627 which specifically identifies the new default settings as causing issues with the RDO distribution (and with any install that is not onto bare metal). The workaround of setting the value “cpu_mode=none” on all compute nodes does resolve this issue, and instances start correctly again. (Note: this is for qemu-kvm instances; there is a lot of documentation on setting up nested kvm and xml passthrough if you want to run kvm under kvm under kvm; in a lab environment stick with the default qemu.)

On checking my Mitaka system, that cpu_mode value was set; I cannot remember if that was something I had to set manually for Mitaka or if it was something the RDO install scripts for Mitaka did. Regardless, it is a known issue, so you will need to manually change the value to launch instances if your installation was not onto bare metal.
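As a rough sketch of that manual change, the Python below sets cpu_mode to none in the text of a nova.conf without disturbing comments or other options. It assumes the option lives in the [libvirt] section of /etc/nova/nova.conf; the helper name set_option and the sample text are made up for illustration and are not part of any OpenStack tooling.

```python
def set_option(conf_text, section, key, value):
    """Return conf_text with 'key = value' set inside [section].

    Hypothetical helper: edits line by line so comments and the
    ordering of the other options are left alone.
    """
    out = []
    in_section = False
    done = False
    for line in conf_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("[") and stripped.endswith("]"):
            # Leaving the target section without finding the key: add it.
            if in_section and not done:
                out.append(f"{key} = {value}")
                done = True
            in_section = stripped[1:-1] == section
        elif in_section and not done and "=" in stripped:
            # Commented-out lines such as '#cpu_mode=...' do not match.
            name = stripped.split("=", 1)[0].strip()
            if name == key:
                line = f"{key} = {value}"
                done = True
        out.append(line)
    if not done:
        # Key never seen: create the section if it was missing too.
        if not in_section:
            out.append(f"[{section}]")
        out.append(f"{key} = {value}")
    return "\n".join(out) + "\n"


if __name__ == "__main__":
    # Assumed sample content, not a real packstack-generated file.
    sample = "[DEFAULT]\ndebug=False\n\n[libvirt]\nvirt_type=qemu\ncpu_mode=host-model\n"
    print(set_option(sample, "libvirt", "cpu_mode", "none"))
```

To use something like this for real you would read /etc/nova/nova.conf on each compute node, write the result back, and then restart the compute service (openstack-nova-compute on an RDO install) so the change is picked up. Keep a backup of the file first.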

Console connectivity issues, manual work to fix

This is an issue with more than one compute node. When installing the RDO Newton release with an “answers file” using multiple compute nodes, the novncproxy package is not installed onto the additional compute nodes. The visible effect is that the dashboard can only start console sessions to instances on the main control/api/network/compute node, and is not able to start console sessions to instances on additional compute nodes.
The fix is simply, on each additional compute node, to manually “yum install openstack-nova-novncproxy” and update the parameters in the nova.conf file to use the correct local interface address for that compute node… and enable and start the service on each compute node of course.
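The steps above could be scripted along these lines. This is only a dry-run sketch that prints what would be done. The package name comes from the fix above, and the service unit is assumed to share that name. The two [vnc] option names are my assumption of what the Newton-era nova.conf uses, so check them against your own file. The address 192.0.2.11 is a placeholder, and both function names are made up for illustration.

```python
import shlex

# Package name from the post; the service unit is assumed to match it.
NOVNC = "openstack-nova-novncproxy"


def novnc_commands():
    """Commands to run as root on each additional compute node."""
    return [
        ["yum", "install", "-y", NOVNC],
        ["systemctl", "enable", NOVNC],
        ["systemctl", "start", NOVNC],
    ]


def vnc_overrides(local_ip):
    """nova.conf [vnc] values for one compute node.

    ASSUMPTION: these option names are not given in the post; verify
    them against the nova.conf on your own compute nodes.
    """
    return {
        "vncserver_listen": local_ip,
        "vncserver_proxyclient_address": local_ip,
    }


if __name__ == "__main__":
    # Dry run only: print the plan rather than executing anything.
    for cmd in novnc_commands():
        print(" ".join(shlex.quote(part) for part in cmd))
    for key, value in vnc_overrides("192.0.2.11").items():
        print(f"[vnc] {key} = {value}")
```

Printing rather than executing is deliberate; in a lab with only a couple of compute nodes it is just as easy to paste the output into a root shell on each node and edit nova.conf by hand.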

And a curious bug

If the console webpage is left open, the session for that page never times out. Coming back to my lab machine after being at work for 9 hours, the console page was still displayed in the web browser, still logged on as root, and commands could still be entered; it had not timed out.
Moving to any other page triggers the timeout and the dashboard needs to be logged onto again. So just don’t leave your machine while it is on the console webpage.

And memory usage is a lot larger

This may be simply because the RDO Newton release correctly installs all the packages; aodh, gnocchi and ceilometer all install and work. Under Mitaka I had the first two disabled, and the last needed quick fingers to edit a config file in the middle of the install; conflicts like that do not exist (that I have found) in the RDO Newton install.

Mitaka would, at a very tight pinch, run in a 5Gb CentOS VM with 4Gb of swap, using around 1Gb of swap but working. Trying to run Newton in a VM with 5Gb of memory, and having to allocate 8Gb of swap, has swap usage at about 5Gb and the system basically stationary at 90%+ iowait time… with swap increasing constantly as long as the VM is running, but that may be a Linux (CentOS7/RHEL7) issue with memory/swap management when real memory is exhausted.
The “tight” pinch was on my laptop, where I could play with issues while out-and-about.

In my home lab, with memory to spare and using all-in-one configurations, Mitaka would run in a 10Gb memory VM using 4-5Gb of memory. Newton in the identical virsh (kvm) configuration uses 9Gb of memory, with no instances running.

Summary

With a lot of effort (and a lot of Google searches to find known problems) it works OK. It works as well as Mitaka, although I have not found any major feature benefits in Newton (other than the RDO release for Newton being able to successfully install more features).

I do not see that there is anything to be gained in going from Mitaka to Newton as far as usable functionality is concerned (for a home lab anyway). If you like trawling through the many log files OpenStack uses to hunt down problems it is, as always, a frustrating exercise… although the Newton Dashboard does seem to provide more detail on errors during deployment, so that may be a good reason to upgrade?

The only reason you should upgrade is that it will probably make it less of a hurdle to upgrade to the next release when that is available.
