OpenStack woes continue

Still playing with OpenStack; Mitaka currently.

After lots of swearing and reading, a second compute node now appears to be running correctly, using VXLAN and Open vSwitch as the transport layer between the two. At least the hypervisor and compute node displays show the second compute node as available now, and the ovs-vsctl show output on both servers shows they are finally pointing at each other. [ Getting the OpenStack openvswitch and nova tasks running on the second compute node was a pain, and I’m still not sure which of the many changes made got them working. I lost count of the reboots; or a patch may finally have been pushed. ]

Of course, to test it I will need to migrate an instance from the primary compute node to the second, as there does not appear to be a way of selecting which compute node to use when launching an instance.
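For what it is worth, the documented way to pin a new instance to a particular compute node is the ZONE:HOST form of the availability zone option, which by default is only permitted for admin users. The sketch below is untested against this setup: the host name "compute2", the image "cirros", the flavor "m1.tiny" and the function name are all hypothetical placeholders, "nova" is assumed to be the default availability zone name, and a --nic option may also be needed if more than one network exists.

```shell
# Sketch only: launch_on_host and every name passed to it are placeholders.
# With DRY_RUN set, the command is printed instead of executed.
launch_on_host() {
    local host="$1" name="$2"
    # "nova:<host>" asks the scheduler to place the instance on <host>
    # in the availability zone named "nova" (assumed default zone name).
    ${DRY_RUN:+echo} openstack server create \
        --image cirros --flavor m1.tiny \
        --availability-zone "nova:${host}" \
        "$name"
}
```

Running `DRY_RUN=1 launch_on_host compute2 test1` shows the command that would be issued without touching the cloud.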

Which leads on to a new problem, with the same frustrating issues in resolving it:

  • there are lots of log files, logging lots of information, but when there is an error message it is totally out of context, making it virtually useless in diagnosing a problem
  • the new problem is that migration does not work; and yes, you guessed it, the error message has no context

The nova compute log on the main server shows

2016-12-16 16:44:51.272 2094 INFO nova.compute.manager [req-2ee445a5-18c0-4087-93df-a9869e8a82de 83c4ed3d4df24391b4b1627e0ef69fc6 0b083d9b40a74894a8d841e16e888d2b - - -] [instance: e38855c4-b6cd-48b8-9288-6f59098b920b] Successfully reverted task state from None on failure for instance.
2016-12-16 16:44:51.277 2094 ERROR oslo_messaging.rpc.dispatcher [req-2ee445a5-18c0-4087-93df-a9869e8a82de 83c4ed3d4df24391b4b1627e0ef69fc6 0b083d9b40a74894a8d841e16e888d2b - - -] Exception during message handling: Resize error: not able to execute ssh command: Unexpected error while running command.
Command: ssh mkdir -p /var/lib/nova/instances/e38855c4-b6cd-48b8-9288-6f59098b920b
Exit code: 255
Stdout: u''
Stderr: u'Host key verification failed.\r\n'

The problem is that running the command manually works perfectly well; the SSH keys are perfectly correct! As shown below, commands via ssh just work as expected.

[root@region1server1 nova]# ssh mkdir -p /var/lib/nova/instances/e38855c4-b6cd-48b8-9288-6f59098b920b
[root@region1server1 nova]# ssh ls -la /var/lib/nova/instances/e38855c4-b6cd-48b8-9288-6f59098b920b
total 8
drwxr-xr-x. 2 root root 4096 Dec 16 16:46 .
drwxr-xr-x. 5 nova nova 4096 Dec 16 16:46 ..
[root@region1server1 nova]# ssh rmdir /var/lib/nova/instances/e38855c4-b6cd-48b8-9288-6f59098b920b
[root@region1server1 nova]# 
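One difference the manual test above does not rule out is the account: it ran from a root prompt (and the directory it created is owned by root), whereas the parent directory is owned by nova, which suggests the compute service runs under its own account with its own known_hosts file. A minimal sketch of repeating the probe as that account follows. The account name "nova", the host name "compute2" and the function name are assumptions, and this is only a guess at where to look, not a confirmed cause.

```shell
# Sketch only: check_ssh_as is a hypothetical helper; "nova" and "compute2"
# in the usage line below are assumed names.
# With DRY_RUN set, the command is printed instead of executed.
check_ssh_as() {
    local user="$1" host="$2"
    # BatchMode=yes stops ssh from prompting, so an unknown host key fails
    # straight away, the way it would for a daemon with no terminal.
    ${DRY_RUN:+echo} sudo -u "$user" ssh -o BatchMode=yes -o ConnectTimeout=5 "$host" true
}
```

Calling `check_ssh_as nova compute2` and then checking `$?` would show whether that account hits the same exit code 255 as the log; `DRY_RUN=1 check_ssh_as nova compute2` just prints the command.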

So what caused the error? Somewhere before the error the command must have changed environment somehow. Of course that is not recorded in the nova log that had the error message; it may be in one of the many other logs perhaps, if I am really lucky.
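One handle for recovering some context is the req-… token that both log lines above carry; as far as I understand it, the same request ID should appear in the logs of the other nova services that handled the request. A small sketch for pulling those lines together in time order follows. The function name is hypothetical, the log directory in the usage line is an assumption about a typical packaged install, and continuation lines of a multi-line error (such as the Command/Stderr lines above) will be missed because they do not repeat the ID.

```shell
# Sketch only: trace_request is a hypothetical helper, not an OpenStack tool.
trace_request() {
    local req_id="$1"; shift
    # -r: search directories recursively; -h: omit file names so that lines
    # from different files sort purely by their leading timestamp.
    grep -rh -- "$req_id" "$@" | sort
}
```

For example, `trace_request req-2ee445a5-18c0-4087-93df-a9869e8a82de /var/log/nova` would list every matching line under that directory, oldest first.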

When OpenStack is working it’s great. The smallest hiccup requires months of trawling through log files and configuration files looking for a needle in a haystack.

Anyway, I will play with this in my spare time over the next few months, as I would like to get migration (and ‘live migration’, although that’s not recommended) working.
It is a low priority for me as I am still using native KVM for real guest workloads so do not need this yet; and it is pointless in a home lab as I don’t have the hardware to HA the OpenStack region and network servers as well. It started as a curiosity thing; now I just don’t want to let it beat me. No hurry though; I will probably be on the next release of OpenStack before I get there.

About mark

At work, been working on Tandems for around 30yrs (programming + sysadmin), plus AIX and Solaris sysadmin also thrown in during the last 20yrs; also about 5yrs on MVS (mainly operations and automation but also smp/e work). At home I have been using linux for decades. Programming background is commercially in TAL/COBOL/SCOBOL/C(Tandem); 370 assembler(MVS); C, perl and shell scripting in *nix; and Microsoft Macro Assembler(windows).