RDO OpenStack disk space usage, Newton release

This is for those of you who have installed the RDO release using the default loopback device for cinder volume storage and swift storage. It is a brief set of notes, mostly for my own future reference, on where the space is actually going, to stop myself deleting files I should not.
It is specific to the Newton release; while I am aware Ocata is now available I have only just finished upgrading from Mitaka to Newton (which was a few very painful months, so the next jump can wait until I finish writing posts on the last one).

The key thing is that when you are running low on space you should not delete any large files, even if they appear not to be in use; they are probably important, as I discovered :-)

Another very important point is that this post is primarily for “all-in-one” RDO installed systems, as I will have to read a lot more documentation to find out what component(s)/servers should be managing these files. My setup is an “all-in-one” VM with a second compute node VM added, all (now, as far as I can tell) working 100% correctly; cinder and swift storage is not created on the second compute node.

Volume storage for cinder and swift is created by a minimal (or all-in-one) install as large disk files on the machine(s) the install is performed on; instance disk storage is placed in the compute node(s) filesystem. The sizes of those files are set in the packstack answers file, as sketched below.
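
The parameters controlling them are, if I remember the answers file correctly, along the lines of the below (names from memory, check your own generated answers file; changing them after the install will not resize the existing files):

CONFIG_CINDER_VOLUMES_SIZE=20G
CONFIG_SWIFT_STORAGE_SIZE=10G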

First a performance note for cinder volumes

When deleting a volume you will notice your system seems to come to a halt for a while. That is easily avoided, although avoiding it is not recommended if you are using VM(s) to run your test environment. The cause is the default entry in cinder.conf, ‘volume_clear = zero’, which means that when you delete a volume it is overwritten with zeros, presumably as a compliance/security setting, and that obviously takes a long time with large volumes. The considerations for changing it are listed below, with an example of changing the setting after the list.

  • setting it to “volume_clear = none” will greatly speed up deletion of a cinder volume; useful in a lab-only environment that is not itself running in a VM
  • leaving it set to write zeros is still recommended for a lab environment running within a VM, simply because if you occasionally compress your VM disk image, having lots of zeros together makes that compression far more effective
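
As a rough sketch of making the change (assuming crudini is available, which it normally is on an RDO box since packstack uses it; otherwise just edit cinder.conf by hand):

# check the current value
grep volume_clear /etc/cinder/cinder.conf
# set it to none (lab use only) and restart the volume service
crudini --set /etc/cinder/cinder.conf DEFAULT volume_clear none
systemctl restart openstack-cinder-volume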

Swift storage

A simple RDO install creates the file /srv/loopback-device/swiftloopback as a local virtual disk; this is mounted as a normal loopback device via /etc/fstab on mountpoint /srv/node/swiftloopback. As the size of that file limits the amount of space you will have for swift storage you need to define it as a reasonable size when doing the RDO install.

[root@region1server1 ~(keystone_mark)]# ls -la /srv/loopback-device/swiftloopback
-rw-r--r--. 1 root root 10737418240 Mar 20 00:20 /srv/loopback-device/swiftloopback

[root@region1server1 ~(keystone_mark)]# file /srv/loopback-device/swiftloopback
/srv/loopback-device/swiftloopback: Linux rev 1.0 ext4 filesystem data, UUID=e674ab98-5137-4895-a99d-ae92302fa035 (needs journal recovery) (extents) (64bit) (large files) (huge files)

As I do not use swift storage that is an empty filesystem for me… and as I have not (as far as I am aware) ever used it I’m surprised it needs journal recovery. A umount/e2fsck reported it clean, yet on remount it still shows journal recovery needed? It mounts OK anyway, so it is on the not-to-worry-about list for now as I don’t need it.
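
For reference, the check was roughly the below (stop the swift services first if you try this; service names may differ slightly on your install, and e2fsck can be run directly against the loopback file):

systemctl stop openstack-swift-object openstack-swift-container openstack-swift-account openstack-swift-proxy
umount /srv/node/swiftloopback
e2fsck -f /srv/loopback-device/swiftloopback
mount /srv/node/swiftloopback        # remounts using the /etc/fstab entry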

Cinder volume storage

Do not, as I did when trying to reclaim space, delete the file /var/lib/cinder/cinder-volumes. Unless of course you like rebuilding filesystems, as that file is a “disk”.

A simple RDO install creates the file /var/lib/cinder/cinder-volumes as a local virtual disk; a volume group is created and that file is added as a PV to the volume group. As the size of that file limits the amount of space you will have for volume storage you need to define it as a reasonable size when doing the RDO install.

[root@region1server1 cinder]# file /var/lib/cinder/cinder-volumes
/var/lib/cinder/cinder-volumes: LVM2 PV (Linux Logical Volume Manager), UUID: d76UV6-eKFo-LS5v-K0dU-Lvky-QTEy-kkKK0X, size: 22118662144

When a cinder volume is required it is created as a LV within that volume group. An example of that cinder-volumes file with two instances defined (requiring two volumes) is shown below

[root@region1server1 cinder]# ls -la /var/lib/cinder/cinder-volumes
-rw-r-----. 1 root root 22118662144 Mar 19 23:54 /var/lib/cinder/cinder-volumes
[root@region1server1 cinder]# file /var/lib/cinder/cinder-volumes
/var/lib/cinder/cinder-volumes: LVM2 PV (Linux Logical Volume Manager), UUID: d76UV6-eKFo-LS5v-K0dU-Lvky-QTEy-kkKK0X, size: 22118662144
[root@region1server1 cinder]# vgdisplay
  --- Volume group ---
  VG Name               cinder-volumes
  System ID             
  Format                lvm2
  Metadata Areas        1
  Metadata Sequence No  14
  VG Access             read/write
  VG Status             resizable
  MAX LV                0
  Cur LV                3
  Open LV               3
  Max PV                0
  Cur PV                1
  Act PV                1
  VG Size               20.60 GiB
  PE Size               4.00 MiB
  Total PE              5273
  Alloc PE / Size       2304 / 9.00 GiB
  Free  PE / Size       2969 / 11.60 GiB
  VG UUID               0zP0P3-V59L-CJBF-cbq2-2X5R-mNwd-wA4oO0
   
[root@region1server1 cinder]# pvdisplay
  --- Physical volume ---
  PV Name               /dev/loop1
  VG Name               cinder-volumes
  PV Size               20.60 GiB / not usable 2.00 MiB
  Allocatable           yes 
  PE Size               4.00 MiB
  Total PE              5273
  Free PE               2969
  Allocated PE          2304
  PV UUID               d76UV6-eKFo-LS5v-K0dU-Lvky-QTEy-kkKK0X
   
[root@region1server1 cinder]# lvdisplay
  --- Logical volume ---
  LV Path                /dev/cinder-volumes/volume-3957e356-880a-4c27-b18e-9ca2d24cfaad
  LV Name                volume-3957e356-880a-4c27-b18e-9ca2d24cfaad
  VG Name                cinder-volumes
  LV UUID                WH8p3K-Rujb-dl3p-QscE-BcxX-0dxA-qV0lz5
  LV Write Access        read/write
  LV Creation host, time region1server1.mdickinson.dyndns.org, 2017-02-07 17:11:38 -0500
  LV Status              available
  # open                 1
  LV Size                3.00 GiB
  Current LE             768
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           252:0
   
  --- Logical volume ---
  LV Path                /dev/cinder-volumes/volume-bfd6fe7f-d5a4-4720-a5fd-07acdd0e8ef0
  LV Name                volume-bfd6fe7f-d5a4-4720-a5fd-07acdd0e8ef0
  VG Name                cinder-volumes
  LV UUID                dSuD5D-NuYR-4dMh-GJai-AZDr-idND-fDPPGa
  LV Write Access        read/write
  LV Creation host, time region1server1.mdickinson.dyndns.org, 2017-03-14 00:13:19 -0400
  LV Status              available
  # open                 1
  LV Size                3.00 GiB
  Current LE             768
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           252:1
   
  --- Logical volume ---
  LV Path                /dev/cinder-volumes/volume-aecee04f-a1b2-4daa-bdf4-26da7c346495
  LV Name                volume-aecee04f-a1b2-4daa-bdf4-26da7c346495
  VG Name                cinder-volumes
  LV UUID                pBMTT2-IC1d-FPI4-wxSG-hxcd-6mIT-ZHgHmj
  LV Write Access        read/write
  LV Creation host, time region1server1.mdickinson.dyndns.org, 2017-03-14 00:51:49 -0400
  LV Status              available
  # open                 1
  LV Size                3.00 GiB
  Current LE             768
  Segments               1
  Allocation             inherit
  Read ahead sectors     auto
  - currently set to     8192
  Block device           252:3

[root@region1server1 ~(keystone_mark)]# cinder list
+--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
| ID                                   | Status | Name | Size | Volume Type | Bootable | Attached to                          |
+--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+
| 3957e356-880a-4c27-b18e-9ca2d24cfaad | in-use |      | 3    | iscsi       | true     | 07db7e7a-beef-46c6-8ca4-01331bb01a80 |
| aecee04f-a1b2-4daa-bdf4-26da7c346495 | in-use |      | 3    | iscsi       | true     | d99bf2a4-9d01-49a9-bb43-24389b73d711 |
+--------------------------------------+--------+------+------+-------------+----------+--------------------------------------+

Note one annoyance: the “cinder list” command only displays volumes assigned to the current tenant credentials, so running it using keystonerc_admin will return no volumes (unless you are using admin for all your projects of course). That is correct behaviour.
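
If you do want to see everything as admin the client has an all-tenants flag; something like the below should list volumes across all projects:

source ~/keystonerc_admin
cinder list --all-tenants 1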

And another annoyance is that it seems the only way (that I have found so far anyway) to check on free space in the cinder storage volume group is pvdisplay; if you have insufficient space to create a volume you will get a “block device mapping error” rather than a “you have run out of space” error.
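
The standard LVM reporting tools give a quicker view if you just want the numbers, for example:

vgs cinder-volumes        # the VFree column is the space left for new volumes
lvs cinder-volumes        # one LV per cinder volume (and snapshot)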

And a word of caution: the default in Mitaka was to boot from the image (not to create a volume) when launching an instance; that has changed in Newton to defaulting to creating a boot volume for instances when they are launched. Unless you specifically want a volume created you should remember to change the “create volume” flag when launching instances or you will soon run out of space.
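
From the command line the old behaviour is still easy to get; booting directly from an image (so no cinder volume is created) is just a plain nova boot, roughly as below (the net-id is a placeholder, use your own tenant network):

nova boot --flavor marks.tiny --image Fedora24-CloudBase \
  --key-name marks-keypair --nic net-id=<tenant-net-uuid> my-test-instance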

Storage of images

Images are stored as individual files on the filesystem of the control/region node; they are not expected to be on the compute nodes. So when you use “glance image-upload” or the dashboard to import new image files, they are stored as individual files on the main (control/region) server.

[root@region1server1 ~(keystone_mark)]# ls -la /var/lib/glance/images
total 214516
drwxr-x---. 2 glance glance      4096 Mar  9 22:17 .
drwxr-xr-x. 3 glance nobody      4096 Feb  4 19:52 ..
-rw-r-----. 1 glance glance 206359552 Feb  5 16:19 152753e9-0fe2-4cc9-8ab5-a9a61173f4b9
-rw-r-----. 1 glance glance  13287936 Feb  4 19:53 bcd2c760-295d-498b-9262-6a83eb3b8bfe
[root@region1server1 ~(keystone_mark)]# glance image-list
+--------------------------------------+--------------------+
| ID                                   | Name               |
+--------------------------------------+--------------------+
| bcd2c760-295d-498b-9262-6a83eb3b8bfe | cirros             |
| 152753e9-0fe2-4cc9-8ab5-a9a61173f4b9 | Fedora24-CloudBase |
+--------------------------------------+--------------------+

When an instance is launched requiring one of the images, and it is launched without a volume being created, a copy of the image is magically copied to the compute node the instance is to be launched on. If a volume is created the instance boots across the network using the volume created in the cinder service on the main control node, as discussed below.

Storage used by defined instances

Whether an instance is running or not it will be using storage.

if launched to use a volume as the boot source (Newton default)

If an instance is launched to use a boot volume as the boot source then the volume will be created in the cinder storage residing on the machine providing the cinder service. That is unlikely to be the compute node, meaning you need a fast network in order to access the volume remotely from the compute node. And this seems to be the default in Newton.

On the compute node an instance directory is created, but no local storage on the compute node is required as the disk image(s) are stored on the remote cinder storage server.

[root@compute2 ~]# ls -laR /var/lib/nova/instances
/var/lib/nova/instances:
total 24
drwxr-xr-x. 5 nova nova 4096 Mar 14 00:52 .
drwxr-xr-x. 9 nova nova 4096 Mar 14 00:13 ..
drwxr-xr-x. 2 nova nova 4096 Mar 14 23:30 _base
-rw-r--r--. 1 nova nova   52 Mar 20 23:50 compute_nodes
drwxr-xr-x. 2 nova nova 4096 Mar 14 00:52 d99bf2a4-9d01-49a9-bb43-24389b73d711
drwxr-xr-x. 2 nova nova 4096 Mar 14 23:30 locks

/var/lib/nova/instances/_base:
total 8
drwxr-xr-x. 2 nova nova 4096 Mar 14 23:30 .
drwxr-xr-x. 5 nova nova 4096 Mar 14 00:52 ..

/var/lib/nova/instances/d99bf2a4-9d01-49a9-bb43-24389b73d711:
total 36
drwxr-xr-x. 2 nova nova  4096 Mar 14 00:52 .
drwxr-xr-x. 5 nova nova  4096 Mar 14 00:52 ..
-rw-r--r--. 1 root root 26313 Mar 15 01:15 console.log

/var/lib/nova/instances/locks:
total 8
drwxr-xr-x. 2 nova nova 4096 Mar 14 23:30 .
drwxr-xr-x. 5 nova nova 4096 Mar 14 00:52 ..
-rw-r--r--. 1 nova nova    0 Feb  4 20:57 nova-storage-registry-lock
[root@compute2 ~]#
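
Assuming the default LVM/iSCSI cinder backend that packstack sets up, the volume itself is reached over iSCSI from the compute node, which you can confirm with:

iscsiadm -m session        # shows the iSCSI session(s) back to the cinder host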

[root@region1server1 ~(keystone_mark)]# nova show d99bf2a4-9d01-49a9-bb43-24389b73d711
+--------------------------------------+----------------------------------------------------------------------------------+
| Property                             | Value                                                                            |
+--------------------------------------+----------------------------------------------------------------------------------+
| OS-DCF:diskConfig                    | AUTO                                                                             |
| OS-EXT-AZ:availability_zone          | nova                                                                             |
| OS-EXT-SRV-ATTR:host                 | compute2.mdickinson.dyndns.org                                                   |
| OS-EXT-SRV-ATTR:hostname             | test-compute2                                                                    |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | compute2.mdickinson.dyndns.org                                                   |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000012                                                                |
| OS-EXT-SRV-ATTR:kernel_id            |                                                                                  |
| OS-EXT-SRV-ATTR:launch_index         | 0                                                                                |
| OS-EXT-SRV-ATTR:ramdisk_id           |                                                                                  |
| OS-EXT-SRV-ATTR:reservation_id       | r-gwvib2vv                                                                       |
| OS-EXT-SRV-ATTR:root_device_name     | /dev/vda                                                                         |
| OS-EXT-SRV-ATTR:user_data            | -                                                                                |
| OS-EXT-STS:power_state               | 4                                                                                |
| OS-EXT-STS:task_state                | -                                                                                |
| OS-EXT-STS:vm_state                  | stopped                                                                          |
| OS-SRV-USG:launched_at               | 2017-03-14T04:52:10.000000                                                       |
| OS-SRV-USG:terminated_at             | -                                                                                |
| accessIPv4                           |                                                                                  |
| accessIPv6                           |                                                                                  |
| config_drive                         |                                                                                  |
| created                              | 2017-03-14T04:51:40Z                                                             |
| description                          | test-compute2                                                                    |
| flavor                               | marks.tiny (2a10119c-b273-48d5-b727-57bda760a3d2)                                |
| hostId                               | 5afd74a02f0332b995921841d985ba3ea39e9a077acbdb3d8a8e9a55                         |
| host_status                          | UP                                                                               |
| id                                   | d99bf2a4-9d01-49a9-bb43-24389b73d711                                             |
| image                                | Attempt to boot from volume - no image supplied                                  |
| key_name                             | marks-keypair                                                                    |
| locked                               | False                                                                            |
| metadata                             | {}                                                                               |
| name                                 | test-compute2                                                                    |
| os-extended-volumes:volumes_attached | [{"id": "aecee04f-a1b2-4daa-bdf4-26da7c346495", "delete_on_termination": false}] |
| security_groups                      | default                                                                          |
| status                               | SHUTOFF                                                                          |
| tags                                 | []                                                                               |
| tenant-mark-10-0-3-0 network         | 10.0.3.19                                                                        |
| tenant_id                            | 325a12dcc6a7424aa1f96d63635c2913                                                 |
| updated                              | 2017-03-15T05:15:53Z                                                             |
| user_id                              | f833549171d94242b3af5d341b9270de                                                 |
+--------------------------------------+----------------------------------------------------------------------------------+

Snapshots of boot volumes are also written to cinder storage as LVM disks and show up with lvdisplay, so additional cinder storage space is needed for those as well. Interestingly the “zero” flag mentioned at the start of this post does not appear to apply to snapshots stored in cinder; they get deleted quickly.
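
A quick way to confirm snapshots really are using cinder space is to compare the cinder view with the LVM view on the cinder host, roughly:

cinder snapshot-list      # snapshots known to cinder for the current project
lvs cinder-volumes        # the snapshot LVs appear alongside the volume-* LVs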

They are displayed in the dashboard “volume snapshots” tab for the project, as well as being available as selectable images from the dashboard image list (which I do not think they should be, as I would have thought they need at least a virt-sysprep to be usable).

While they are visible in the project’s volume “snapshots” tab, if the snapshot is deleted via the dashboard the image entry added for the snapshot remains; it must be separately deleted from the dashboard images page.

if launched to use an image as the boot source

In a home lab environment, with a not necessarily fast network and limited storage on the control (cinder) machine but lots of storage on compute nodes, this is preferable (to me anyway). Boot from image must be manually selected, with the “create volume” switch flicked off, as the Newton default is to create a boot volume.

The required image is copied to the remote compute node the instance will be running on and placed in the /var/lib/nova/instances/_base directory on the compute node, where it is used as a backing disk. The actual instance does not modify this image as it is just a backing disk; changes unique to the instance are simply recorded in its own directory, as is normal when using qemu disks this way.

The major benefits are that instances should boot from local compute server storage, and if you are launching multiple instances using the same image then only that one backing file is needed as each instance booting from that image will only need disk space for its own changes. And of course there is no cinder storage used.

[root@compute2 ~]# ls -laR /var/lib/nova/instances
/var/lib/nova/instances:
total 24
drwxr-xr-x. 5 nova nova 4096 Mar 21 00:48 .
drwxr-xr-x. 9 nova nova 4096 Mar 14 00:13 ..
drwxr-xr-x. 2 nova nova 4096 Mar 21 00:37 4f0d7176-6573-415a-99e4-2846971679e4
drwxr-xr-x. 2 nova nova 4096 Mar 21 00:36 _base
-rw-r--r--. 1 nova nova   53 Mar 21 00:30 compute_nodes
drwxr-xr-x. 2 nova nova 4096 Mar 21 00:36 locks

/var/lib/nova/instances/4f0d7176-6573-415a-99e4-2846971679e4:
total 15024
drwxr-xr-x. 2 nova nova     4096 Mar 21 00:37 .
drwxr-xr-x. 5 nova nova     4096 Mar 21 00:48 ..
-rw-r--r--. 1 qemu qemu    10202 Mar 21 00:40 console.log
-rw-r--r--. 1 qemu qemu 15400960 Mar 21 00:46 disk
-rw-r--r--. 1 nova nova       79 Mar 21 00:36 disk.info


/var/lib/nova/instances/_base:
total 539220
drwxr-xr-x. 2 nova nova       4096 Mar 21 00:36 .
drwxr-xr-x. 5 nova nova       4096 Mar 21 00:48 ..
-rw-r--r--. 1 qemu qemu 3221225472 Mar 21 00:36 afbece9196679001187cff5e6e96ad5425b329e6


/var/lib/nova/instances/locks:
total 8
drwxr-xr-x. 2 nova nova 4096 Mar 21 00:36 .
drwxr-xr-x. 5 nova nova 4096 Mar 21 00:48 ..
-rw-r--r--. 1 nova nova    0 Mar 21 00:36 nova-afbece9196679001187cff5e6e96ad5425b329e6
-rw-r--r--. 1 nova nova    0 Feb  4 20:57 nova-storage-registry-lock
[root@compute2 ~]# 

[root@region1server1 ~(keystone_mark)]# nova show 4f0d7176-6573-415a-99e4-2846971679e4
+--------------------------------------+-----------------------------------------------------------+
| Property                             | Value                                                     |
+--------------------------------------+-----------------------------------------------------------+
| OS-DCF:diskConfig                    | AUTO                                                      |
| OS-EXT-AZ:availability_zone          | nova                                                      |
| OS-EXT-SRV-ATTR:host                 | compute2.mdickinson.dyndns.org                            |
| OS-EXT-SRV-ATTR:hostname             | test-compute2-novolume                                    |
| OS-EXT-SRV-ATTR:hypervisor_hostname  | compute2.mdickinson.dyndns.org                            |
| OS-EXT-SRV-ATTR:instance_name        | instance-00000014                                         |
| OS-EXT-SRV-ATTR:kernel_id            |                                                           |
| OS-EXT-SRV-ATTR:launch_index         | 0                                                         |
| OS-EXT-SRV-ATTR:ramdisk_id           |                                                           |
| OS-EXT-SRV-ATTR:reservation_id       | r-f90oaldt                                                |
| OS-EXT-SRV-ATTR:root_device_name     | /dev/vda                                                  |
| OS-EXT-SRV-ATTR:user_data            | -                                                         |
| OS-EXT-STS:power_state               | 1                                                         |
| OS-EXT-STS:task_state                | -                                                         |
| OS-EXT-STS:vm_state                  | active                                                    |
| OS-SRV-USG:launched_at               | 2017-03-21T04:39:23.000000                                |
| OS-SRV-USG:terminated_at             | -                                                         |
| accessIPv4                           |                                                           |
| accessIPv6                           |                                                           |
| config_drive                         |                                                           |
| created                              | 2017-03-21T04:36:28Z                                      |
| description                          | test-compute2-novolume                                    |
| flavor                               | marks.tiny (2a10119c-b273-48d5-b727-57bda760a3d2)         |
| hostId                               | 5afd74a02f0332b995921841d985ba3ea39e9a077acbdb3d8a8e9a55  |
| host_status                          | UP                                                        |
| id                                   | 4f0d7176-6573-415a-99e4-2846971679e4                      |
| image                                | Fedora24-CloudBase (152753e9-0fe2-4cc9-8ab5-a9a61173f4b9) |
| key_name                             | marks-keypair                                             |
| locked                               | False                                                     |
| metadata                             | {}                                                        |
| name                                 | test-compute2-novolume                                    |
| os-extended-volumes:volumes_attached | []                                                        |
| progress                             | 0                                                         |
| security_groups                      | default                                                   |
| status                               | ACTIVE                                                    |
| tags                                 | []                                                        |
| tenant-mark-10-0-3-0 network         | 10.0.3.22                                                 |
| tenant_id                            | 325a12dcc6a7424aa1f96d63635c2913                          |
| updated                              | 2017-03-21T04:39:23Z                                      |
| user_id                              | f833549171d94242b3af5d341b9270de                          |
+--------------------------------------+-----------------------------------------------------------+
[root@region1server1 ~(keystone_mark)]# 

As noted above, the image is copied across and made into a local boot image in the _base directory where it can be shared as a backing file for all instances that need it. The second command below shows the instance disk… showing it is using the backing file.

[root@compute2 ~]# file /var/lib/nova/instances/_base/afbece9196679001187cff5e6e96ad5425b329e6
/var/lib/nova/instances/_base/afbece9196679001187cff5e6e96ad5425b329e6: x86 boot sector; partition 1: ID=0x83, active, starthead 4, startsector 2048, 6289408 sectors, code offset 0xc0

[root@compute2 ~]# file /var/lib/nova/instances/4f0d7176-6573-415a-99e4-2846971679e4/disk
/var/lib/nova/instances/4f0d7176-6573-415a-99e4-2846971679e4/disk: QEMU QCOW Image (v3), has backing file (path /var/lib/nova/instances/_base/afbece9196679001187cff5e6e96ad542), 3221225472 bytes
[root@compute2 ~]# 

When all instances on the compute node stop using the image in the _base directory it hangs around for a while, presumably in case it is needed to launch other instances using the same image. I find it gets cleaned up within 30 minutes of inactivity.
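
I believe that cleanup is done by nova’s image cache manager and is tunable in nova.conf on the compute node; I have not changed these myself, but the options to look at are along these lines:

grep -E 'remove_unused_base_images|image_cache_manager_interval|remove_unused_original_minimum_age_seconds' /etc/nova/nova.conf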

Snapshots are not stored on the compute node the instance was on. A snapshots directory is created on the compute node but left empty; a snapshot in this scenario is saved as a new “image” file on the main control node. They are not displayed in the project’s volume “snapshots” tab.

They are only displayed in the dashboard as selectable bootable images in the image list (which I do not think they should be, as I would have thought they need at least a virt-sysprep to be usable… and users can lose track of snapshots). But at least in this case no cinder storage is used.

Summary

In a home lab environment don’t waste cinder space: launch instances using boot images not volumes, unless you either intend to keep them around for a long time or have a lot of storage allocated to cinder.

And when you are getting low on space do not delete large files from the filesystem directories :-)

When you do make an oops

Not a troubleshooting section, but if you do accidentally delete the cinder-volumes file the only way to delete the entries known to openstack is to remove them from the cinder database by hand (replacing the ids with your own volume ids of course)…

MariaDB [(none)]> use cinder;
MariaDB [cinder]> delete from volume_glance_metadata where volume_id="aecee04f-a1b2-4daa-bdf4-26da7c346495";
Query OK, 8 rows affected (0.05 sec)
MariaDB [cinder]> delete from volume_attachment where volume_id="aecee04f-a1b2-4daa-bdf4-26da7c346495";
Query OK, 1 row affected (0.03 sec)
MariaDB [cinder]> delete from volume_admin_metadata where volume_id="aecee04f-a1b2-4daa-bdf4-26da7c346495";
Query OK, 2 rows affected (0.02 sec)
MariaDB [cinder]> delete from volumes where id="aecee04f-a1b2-4daa-bdf4-26da7c346495";
Query OK, 1 row affected (0.22 sec)
MariaDB [cinder]> commit;
Query OK, 0 rows affected (0.01 sec)
MariaDB [cinder]> \q
Bye

And then recreate the cinder-volumes file, recreate the pv entry to use it (and kill off any stray lv entries); basically recreate the vg environment by hand. It is not hard, and this is not a linux tutorial so it is not covered in detail, but a rough sketch is below. I would recommend not deleting the file in the first place however.
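
For completeness, the rebuild is roughly the below (a sketch only; the loop device number and size will differ on your system, and packstack normally arranges the losetup at boot time so check how yours is attached):

truncate -s 20G /var/lib/cinder/cinder-volumes      # recreate the backing file
losetup /dev/loop1 /var/lib/cinder/cinder-volumes   # attach it as a loop device
pvcreate /dev/loop1
vgcreate cinder-volumes /dev/loop1
systemctl restart openstack-cinder-volume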


Openstack:Upgrade from Mitaka to Newton was a pain, but completed

There were the expected issues with a system-wide update, where rabbitmq-server and all the erlang packages have to be removed before the update and re-installed afterwards; but that was the only real issue (if you exclude the minor detail that config files had to be re-tweaked: openstacklocal was put back in as the domain, which stopped DNS working, and the extension_drivers = dns entry also had to be re-added; minor things though).

After the upgrade instances would not boot correctly. In case it was an issue with my upgrade procedure I decided to use fresh vanilla installs for this post, so I built new VMs and installed Newton from the RDO distribution from scratch into those. Exactly the same problems, so I guess my upgrade probably worked; but I have a new ‘clean’ install to play with now anyway.

So this post is about a fresh install, as while I think the upgrade worked I ended up working from a fresh Newton install. It covers the main issues I had getting a fresh Newton install working as correctly as the Mitaka one was. At least the issues I can remember; it took me a few months in my spare time so some of the earlier issues may have faded.

The install was onto “CentOS Linux release 7.3.1611 (Core)” using the RDO repositories for Newton. The configuration used was a packstack generated and customised answers file for two servers; those being an all-in-one primary server plus a second compute only server using vxlan across openvswitch for the tenant networks and a “flat” interface for the external network provided by the primary server.

Instances do not boot, a known issue, manual config change to fix

But instances just would not boot (they would go to running state but never complete a boot). I ended up doing a complete vanilla install using both --allinone and my two compute node environment; that problem still exists in a vanilla RDO install.

This is a documented bug at https://bugzilla.redhat.com/show_bug.cgi?id=1404627 which specifically identifies the new default settings as causing issues with the RDO (and with any non-bare-metal) distribution. The workaround of setting the value “cpu_mode=none” on all compute nodes does resolve this issue and instances start correctly again (note: for qemu instances; there is a lot of documentation on setting up nested kvm and xml passthrough if you want to run kvm under kvm under kvm, but in a lab environment stick with the default qemu).

On checking my Mitaka system that cpu_mode value was set; I cannot remember if that was something I had to set manually for Mitaka or if it was something the RDO install scripts for Mitaka did. Regardless, it is a known issue, so you will need to manually change the value to launch instances if your installation was not onto bare metal.
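
For reference the workaround boils down to the below on each compute node (section and option names as I understand them for Newton; adjust if your nova.conf differs):

crudini --set /etc/nova/nova.conf libvirt cpu_mode none
systemctl restart openstack-nova-compute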

Console connectivity issues, manual work to fix

An issue with more than one compute node: when installing the RDO Newton release with an “answers file” using multiple compute nodes, the novncproxy package is not installed onto the additional compute nodes. The visible effect is that the dashboard can only start console sessions to instances on the main control/api/network/compute node, but is not able to start console sessions to instances on additional compute nodes.
The fix is simply, on each additional compute node, to manually “yum install openstack-nova-novncproxy”, update the parameters in the nova.conf file on each compute node to use the correct local interface address for that compute node… and enable and start the service on each compute node of course.
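
As a sketch of what that works out to on each extra compute node (the [vnc] section parameter names are as I understand them for Newton; the controller and compute addresses are placeholders):

yum install -y openstack-nova-novncproxy
crudini --set /etc/nova/nova.conf vnc vncserver_proxyclient_address <this-compute-node-ip>
crudini --set /etc/nova/nova.conf vnc novncproxy_base_url http://<controller-ip>:6080/vnc_auto.html
systemctl enable openstack-nova-novncproxy
systemctl start openstack-nova-novncproxy
systemctl restart openstack-nova-compute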

And a curious bug

If the console webpage is left open the session for that page never times out. Coming back to my lab machine after being at work for 9 hours, the console page was still displayed in the web browser, still logged on as root, and commands could still be entered; it had not timed out.
Moving to any other page triggers the timeout and the dashboard needs to be logged onto again. So just don’t leave your machine while it is on the console webpage.

And memory usage is a lot larger

This may be simply because the RDO Newton release correctly installs all the packages: aodh, gnocchi and ceilometer all install and work. Under Mitaka I had the first two disabled and the last needed quick fingers to edit a config file in the middle of the install; conflicts like that do not exist (that I have found) in the RDO Newton install.

Mitaka would, at a very tight pinch, run in a 5GB CentOS VM with 4GB of swap, using around 1GB of swap but working. Trying to run Newton in a VM with 5GB of memory and 8GB of swap has swap usage at about 5GB and the system basically stationary at 90%+ iowait time… with swap usage increasing constantly for as long as the VM is running, although that may be a linux (CentOS7/RHEL7) issue with memory/swap management when real memory is exhausted.
The “tight” pinch was on my laptop where I could play with issues while out-and-about.

In my home lab with memory to spare, using all-in-one configurations, Mitaka would run in a 10GB memory VM using 4-5GB of memory; Newton in the identical virsh (kvm) configuration uses 9GB of memory, with no instances running.

Summary

With a lot of effort (and a lot of google searches to find known problems) it works OK. It works as well as Mitaka, although I have not found any major feature benefits in Newton (other than the RDO release for Newton being able to successfully install more features).

I do not see that there is anything to be gained in going from Mitaka to Newton as far as usable functionality is concerned (for a home lab anyway). However if you like trawling through the many log files openstack uses to hunt down problems it is, as always, a frustrating exercise… actually the Newton dashboard seems to provide more detail on errors during deployment, so that may be a good reason to upgrade?

The only real reason to upgrade is that it will probably make it less of a hurdle to upgrade to the next release when that is available.


Accessing openstack instances, proxy gateways and proxy clients

While I treat my home lab openstack instances as proper cloud instances, specifically that they are supposed to be thrown up and torn down rather than exist for a long time, I don’t like assigning more than one floating ip per tenant network as most of the instances should never be accessible from my main network.

Basically, on each tenant network I give one instance a floating ip address and use that as a gateway to all the others. When only using a few instances on the internal network that is fine, but it gets a little irritating when I have quite a few instances I need to bounce around.

The main problem is that the private key needed to access each instance has to be copied onto that gateway server and used to access all the others. Should I need to ssh from one of the internal instances to another, that key also has to be copied onto the server(s) I wish to ssh from.
OK, that is not a problem, it is by design in cloud images, but it gets frustrating: not just having to copy the key to the gateway server as step one, but having to copy it to any instance I do a lot of ssh’ing from.

I did add a network route for a tenant subnet via the gateway floating ip-address, but network traffic to the internal network only gets as far as the internal address of the gateway server; it probably just needs something like IPv4 forwarding enabled on the gateway, but I want to minimise changes to images.

While any internal instance can easily be accessed from the openstack network server simply by ssh’ing from the correct network namespace on the network node for the tenant network, that ability is supposed to be hidden from customers, so we will not mention that anybody with access to the network node can login to instances on your private network (if they have a correct ssh key, or if you stupidly allow password logons).

The ideal solution would be to just proxy all traffic from any of my desktops via the gateway instance to any of the internal network servers. Hard though it may be to believe, the hardest part of getting that working was finding a socks5 client that would work.

My proxy server solution

My proxy solution is not perfect, and cannot be automatically configured, because like using the gateway instance as a jumphost to the other internal servers this also requires the private key (the key assigned to the gateway instance only; you can use other keys on other instances in the internal network if desired) to be copied onto the gateway server. In this case however it is to allow ssh to localhost, as ssh logins are still only permitted by the default configuration using keys. Yes, you could reconfigure sshd to allow login without keys, but in either case manual configuration would be required on the gateway instance, and allowing logins without keys makes it less secure.

Anyway, ssh can act as a proxy server (source document referenced http://www.catonmat.net/blog/linux-socks5-proxy/). Once the private key is copied to the gateway instance, ssh can be used to start a proxy server with the command below (use your own key of course).

ssh -i ./marks_keypair.pem -N -D 0.0.0.0:1080 localhost

The key point about this solution is that no additional software needs to be installed onto your ‘cloud base’ instance. You can also, if you wish, use iptables rules to limit which external addresses can connect to the proxy port, so it is no less secure than commercial packages. Plus ssh is common across all *nix distributions, so this should work with any flavour of linux I choose to use on a gateway instance.

A correctly configured proxy client can then access all internal tenant network instances. This means I only need to copy the one private key to the gateway server and, with ssh running as a proxy there, all the internal instances can be accessed. It also means the key used on the gateway server can be different from the user keys added to instances at launch time, user keys can live on their own workstations as I don’t need to copy them anywhere, and from any workstation with a valid key I can access all the internal instances without having to logon to the gateway server… just as cloud instances should be accessed :-)

My proxy client solution

As noted above, the hardest part of getting a proxied solution going was finding a working proxy client solution. tsocks is now in the fedora repository (I am using F25 and the version available is tsocks-1.8-0.16.beta5.fc24.x86_64). As it is in the repository that was immediately my preferred solution.

The biggest issue is that the BETA tag definitely includes the man pages. A “man tsocks.conf” undoubtedly shows you what will be supported in a configuration file one day, but attempting to build a configuration file using the man page will get you nowhere.

I found a working configuration via google that got everything working for me. My ‘gateway’ server floating ip-address is 192.168.1.235 and the tenant network range the floating address is attached to is 10.0.3.0/24, and the below configuration in /etc/tsocks.conf allows me to ssh directly into any of the tenant private network machines; client problem solved.

# default server
local = 192.168.1.0/255.255.255.0
server = 192.168.1.235
server_type = 5
server_port = 1080

# explicit ranges accessed via specific proxy servers
path {
  server = 192.168.1.235
  server_port = 1080
  server_type = 5
  reaches = 10.0.3.0/255.255.0.0
}
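
Usage is then just a matter of prefixing the command with tsocks; for example (10.0.3.15 is a made-up internal instance address, and the key is whichever one that instance was launched with):

tsocks ssh -i ~/marks_keypair.pem fedora@10.0.3.15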

My reasoning for this solution

It is fair to say I was looking at full function dedicated proxy servers for a while, and was looking at dante and srelay. However all products advertising themselves as proxy servers are either commercial or provided as discrete installs… not in distribution repositories. Anything not in a distribution repository requires manually watching for security updates and manual updates when there are any.

I am playing with too many things at any given time to even consider all that extra work. I chose a solution where a simple dnf/yum update will keep me up-to-date. Well, the images are kept up-to-date; running instances are supposed to be considered temporary in a cloud environment so can look after themselves until torn down.

I will use repository provided packages wherever possible, and this achieves that goal. Also, if a utility is not going to be supported by a repository package that utility has no future (unless they have taken it commercial-only, in which case 90% of linux users will never use it again).


OpenStack Mitaka DNS resolution, or lack of

Unfortunately even at the Mitaka release neutron networking is unable to do DNS lookup by instance name. The issue is that DNS entries (if DNS is configured to be used) are not created for the instance hostname, but instead are created as “host-ipaddress”; which is totally useless, as you have to know the ip-address anyway to look up the DNS entry as “host-ipaddress”… why bother with DNS if you already know the ip-address!

This is a known irritation for openstack users and was documented as a feature enhancement (obviously still in progress) at https://specs.openstack.org/openstack/neutron-specs/specs/liberty/internal-dns-resolution.html.

The good news is that the enhancement (as seen from the link URL) was identified as an issue with liberty, and a lot of the foundation work seems to be in place now in Mitaka. For instances created via the dashboard the port details now contain the correct dns_assignment and dns_name values using the instance hostname, which is a definite improvement.

It may even have been resolved in the “newton” release of openstack which has been available for a while now.

[root@region1server1 ~(keystone_admin)]# neutron port-list
+--------------------------------------+------+-------------------+--------------------------------------------------------------------------------------+
| id                                   | name | mac_address       | fixed_ips                                                                            |
+--------------------------------------+------+-------------------+--------------------------------------------------------------------------------------+
| 12442c2f-2d2f-4126-bd76-58a4087b2e71 |      | fa:16:3e:02:e3:30 | {"subnet_id": "9de25f6d-4c53-47d4-9dab-7d19f0bab113", "ip_address": "10.0.3.1"}      |
| 4bb68efc-e3d9-42b7-9885-ddc78918e8b5 |      | fa:16:3e:1b:be:8a | {"subnet_id": "9de25f6d-4c53-47d4-9dab-7d19f0bab113", "ip_address": "10.0.3.10"}     |
| 549ab7b2-317f-4dbb-98bb-cca911dc4eef |      | fa:16:3e:8f:42:b9 | {"subnet_id": "9de25f6d-4c53-47d4-9dab-7d19f0bab113", "ip_address": "10.0.3.12"}     |
| 60e26ef5-ad19-4713-a7b8-82fb6885bc90 |      | fa:16:3e:33:36:b1 | {"subnet_id": "2c88cd9a-623b-474b-98d1-65c5e482ae50", "ip_address": "192.168.1.235"} |
| 729c5a6c-211b-4df8-8ea7-bddf4c50abe3 |      | fa:16:3e:f8:11:f1 | {"subnet_id": "2c88cd9a-623b-474b-98d1-65c5e482ae50", "ip_address": "192.168.1.236"} |
| efa70f39-66de-4c10-88b7-c7cac61ffe21 |      | fa:16:3e:db:4c:11 | {"subnet_id": "2c88cd9a-623b-474b-98d1-65c5e482ae50", "ip_address": "192.168.1.234"} |
+--------------------------------------+------+-------------------+--------------------------------------------------------------------------------------+
[root@region1server1 ~(keystone_admin)]# neutron port-show 549ab7b2-317f-4dbb-98bb-cca911dc4eef
+-----------------------+----------------------------------------------------------------------------------------------------------------+
| Field                 | Value                                                                                                          |
+-----------------------+----------------------------------------------------------------------------------------------------------------+
| admin_state_up        | True                                                                                                           |
| allowed_address_pairs |                                                                                                                |
| binding:host_id       | region1server1.mdickinson.dyndns.org                                                                           |
| binding:profile       | {}                                                                                                             |
| binding:vif_details   | {"port_filter": true, "ovs_hybrid_plug": false}                                                                |
| binding:vif_type      | ovs                                                                                                            |
| binding:vnic_type     | normal                                                                                                         |
| created_at            | 2017-01-31T03:08:26                                                                                            |
| description           |                                                                                                                |
| device_id             | 76db9e9d-3b53-4822-bf1c-b251a1899113                                                                           |
| device_owner          | compute:nova                                                                                                   |
| dns_assignment        | {"hostname": "gateway-10-0-3-0", "ip_address": "10.0.3.12", "fqdn": "gateway-10-0-3-0.mdickinson.dyndns.org."} |
| dns_name              | gateway-10-0-3-0                                                                                               |
| extra_dhcp_opts       |                                                                                                                |
| fixed_ips             | {"subnet_id": "9de25f6d-4c53-47d4-9dab-7d19f0bab113", "ip_address": "10.0.3.12"}                               |
| id                    | 549ab7b2-317f-4dbb-98bb-cca911dc4eef                                                                           |
| mac_address           | fa:16:3e:8f:42:b9                                                                                              |
| name                  |                                                                                                                |
| network_id            | 831c2bb0-cf40-49eb-a0e4-22d0c4a2bacf                                                                           |
| security_groups       | 83080bb5-0bb4-4c24-a723-20a4d847a319                                                                           |
| status                | ACTIVE                                                                                                         |
| tenant_id             | 0b083d9b40a74894a8d841e16e888d2b                                                                               |
| updated_at            | 2017-01-31T03:08:31                                                                                            |
+-----------------------+----------------------------------------------------------------------------------------------------------------+
[root@region1server1 ~(keystone_admin)]#

The bad news is that, as of the Mitaka release, those details are not yet being set in the DNS resolver; for DNS lookups only the rather useless “host-ipaddr” entry is still being used.

[fedora@gateway-10-0-3-0 ~]$ ping gateway-10-0-3-0
ping: gateway-10-0-3-0: Name or service not known
[fedora@gateway-10-0-3-0 ~]$ ping gateway-10-0-3-0.mdickinson.dyndns.org
ping: gateway-10-0-3-0.mdickinson.dyndns.org: Name or service not known

[fedora@gateway-10-0-3-0 ~]$ ifconfig -a | grep "inet 10"
        inet 10.0.3.12  netmask 255.255.255.0  broadcast 10.0.3.255
[fedora@gateway-10-0-3-0 ~]$ ping host-10-0-3-12
PING host-10-0-3-12.mdickinson.dyndns.org (10.0.3.12) 56(84) bytes of data.
64 bytes from host-10-0-3-12.mdickinson.dyndns.org (10.0.3.12): icmp_seq=1 ttl=64 time=1.78 ms
64 bytes from host-10-0-3-12.mdickinson.dyndns.org (10.0.3.12): icmp_seq=2 ttl=64 time=0.168 ms
64 bytes from host-10-0-3-12.mdickinson.dyndns.org (10.0.3.12): icmp_seq=3 ttl=64 time=0.178 ms
64 bytes from host-10-0-3-12.mdickinson.dyndns.org (10.0.3.12): icmp_seq=4 ttl=64 time=0.195 ms
^C
--- host-10-0-3-12.mdickinson.dyndns.org ping statistics ---
4 packets transmitted, 4 received, 0% packet loss, time 3005ms

While this is irritating for manually created one-off instances, it is even more painful when a group of servers is started as a stack, as until the servers are built/spawned/running the private ip-address is not known, so it is not possible to even use cloud-init scripts to create an /etc/hosts file for each instance in the stack.

As the port information is now correctly populated in Mitaka I suppose it would be possible to automate updates to each instance’s /etc/hosts file periodically…

  • probably in a cloud-init script chown the /etc/hosts file to the default cloud image user on each instance built
  • you don’t want to assign a floating ip to every instance, so you would need to run a proxy service on one of the instances to allow access to other instances on the tenant private network (or access all the instances directly from the network namespace for the tenant network, if using linux namespaces)
  • periodically use neutron port commands to get all the hostname/ip-addr info to build a master “hosts” file, and push that out to every instance (a sketch of pulling that info is below the list)
  • issues would be making sure only instances on the correct tenant subnet are updated that way; heat stacks would use a different keypair to manually created instances (as admin users cannot be heat_owners) so you would have to know how each instance was created to know which keypairs to use (you may need individual user keypairs if users have been creating instances; no, do not put a common root keypair in your images!); and generally, why bother, as it is on the todo path and you would have to remove all those hosts files when it finally gets implemented
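
If I ever did automate it, the raw data is easy enough to pull from neutron; something like the below gives the dns_name/ip-address pairs to build a hosts file from (column selection with -c should work with the neutron client, though I have not scripted this):

neutron port-list -c dns_name -c fixed_ips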

I don’t have any complicated stacks so can manage hosts files manually. Also the RDO project site now has the newton release available which may have already resolved this issue.

It may be fixed in newton, so…

One of my next posts will be on a quick and dirty upgrade from Mitaka to the “newton” release on my test nodes; all my nodes are being upgraded as I type this post. I will mention if the issue is resolved.

The most likely result will be that overwrite/upgrade in place while all openstack components are running is going to make a mess :-).


OpenStack RDO using packstack again, Mitaka release

Apologies in advance for the formatting. The WP template I am using at the moment does not handle PRE correctly and wraps it around rather than letting it scroll. But onward.

I finally got my second compute node working correctly with the RDO release of OpenStack Mitaka.

  • an instance on the second node is assigned a private ip-address correctly
  • an instance on the second node can contact the metadata namespace on the control node (which is also the first compute node)
  • instances on the second compute node can ping instances on the first compute node across the tenant private network, so the VXLAN configuration is now working
  • instance migration still does not work

The only real issue in getting it working is that I really cannot see any difference between my original non-working setup and the working one, apart from having to disable the install of some of the components originally installed. And that seems to have been due to packaging changes breaking the packstack install rather than errors in setup.
One change I did make was to remove bridging from the internal interface on the second compute node. I did leave bridging on the first compute node, even though it does break the packstack install (the workaround is mentioned a long way below), as all the documentation I can find indicates it should be bridged.

Anyway, the packages that had to be changed from install to non-install (y to n) in the answers file were Gnocchi (metering) and Aodh (alarming). I also disabled Ceilometer as I was getting sick of having to try to get in and edit one of the files puppet installed in the few seconds between puppet installing it and the install trying to start httpd, simply because if that few-second window was missed the entire install had to be started again for another attempt to edit the damn file. All the issues are briefly covered below; the answers-file flags I ended up changing are sketched immediately below.
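
The answers-file changes amount to the below (parameter names from memory, check your own generated answers file):

CONFIG_CEILOMETER_INSTALL=n
CONFIG_AODH_INSTALL=n
CONFIG_GNOCCHI_INSTALL=n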

Aodh errors

The issue with Aodh is that it now issues an invalid command as part of the install attempt.

ERROR : Error appeared during Puppet run: 192.168.1.172_keystone.pp
Error: /Stage[main]/Aodh::Keystone::Auth/Keystone::Resource::Service_identity[aodh]/Keystone_user[aodh]: Could not evaluate: Execution of '/usr/bin/openstack user show --format shell aodh --domain default' returned 1: Could not find resource default (tried 0, for a total of 0 seconds)
You will find full trace in log /var/tmp/packstack/20170119-155608-RVbObm/manifests/192.168.1.172_keystone.pp.log
Please check log file /var/tmp/packstack/20170119-155608-RVbObm/openstack-setup.log for more information

[root@region1server1 RDO(keystone_admin)]# /usr/bin/openstack user show --format shell aodh --domain default
usage: openstack user show [-h]
                           [-f {html,json,json,shell,table,value,yaml,yaml}]
                           [-c COLUMN] [--max-width ] [--noindent]
                           [--prefix PREFIX]
                           
openstack user show: error: unrecognized arguments: --domain default
[root@region1server1 RDO(keystone_admin)]# 

GNOCCHI errors

This was an extremely weird one. I changed the answers file to give it a DB password; after running packstack the answers file had reverted to no password, so the install failed. So I changed the answers file to give it a DB password again, rebooted, checked that a password was still set in the answers file, but after running packstack again the answers file had reverted back to having no password. I had to set the Gnocchi install to “n” as well to get past that error.

Errors caused during the Ceilometer install

This is a minor issue; well, minor apart from it causing the install to fail. It is a conflict introduced into the configuration during the install.

Puppet creates a /etc/httpd/ports.conf file and one of the entries inserted is 8777, which just happens to also be the port used by ceilometer. Once ceilometer starts it is no longer possible to start httpd until that entry is removed.

During puppet reruns the easiest workaround I found was to constantly grep the ports.conf file and, as soon as puppet updated it, vi it and comment the line out; if this was done quickly enough, before the install script got to the point of starting httpd, the install would continue to completion (a rough way of automating that is sketched below). An alternative I suppose would be to stop all ceilometer services prior to every packstack run. Either way the root issue is that if you want ceilometer you need to be aware the packstack install does cause this conflict.
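
The “quick fingers” part can be semi-automated with a throwaway loop in a second terminal while packstack runs; a rough sketch only, using the ports.conf path mentioned above (adjust if yours differs):

while true; do
  # comment out the conflicting 8777 listener as soon as puppet writes it
  sed -i '/^Listen 8777/ s/^/#/' /etc/httpd/ports.conf 2>/dev/null
  sleep 2
done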

Another issue to be aware of

All of the web sites google pointed me at when I was trying to get my second node correctly networked indicated that if L2 propagation is to be used the parameter CONFIG_NEUTRON_ML2_VXLAN_GROUP in the answers file should be given a value (ie: CONFIG_NEUTRON_ML2_VXLAN_GROUP=239.1.1.2).

Do not set a value! While I am sure that in a commercial network with gigabit pipes everywhere it might be useful, my home network uses 10/100 switches, and setting that value in the answers file and running packstack resulted in total non-responsiveness across my network. “iftop” on a machine two hops away from the server I was running packstack on showed everything being swamped by broadcast traffic; even ping packets were only reaching other servers about 5% of the time, and useful connectivity was non-existent.

Everything works fine without setting that value, so don’t set it in the answers file unless you have a commercial-grade network that can cope and a real need for it.
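
With the value left blank the vxlan tunnels come up as plain unicast point-to-point links between the nodes; after the install you can confirm that, and that the answers file entry really is blank, with something like:

grep CONFIG_NEUTRON_ML2_VXLAN_GROUP answers5.txt   # should show nothing after the "="
ovs-vsctl show | grep -A1 "type: vxlan"            # each tunnel should just show local_ip/remote_ip options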

And an issue perhaps of my own making, conflicting documentation?

All the documentation I have come across seems to indicate that the internal interface to be used for GRE or VXLAN networking should be bridged, so my initial setup used a bridged eth1 on the combined controller/network/compute1 node, and when I was adding the second compute node I made it bridged there as well. The issue is that when installing openstack RDO using packstack it cannot handle bridged interfaces on compute nodes.

My solution was to retain the bridged interface on the controller/network/compute1 node and use a normal non-bridged interface on the compute2 node, so I only had to fiddle about on the first compute node when rerunning packstack.

The exact issue is that using br-eth1 in the answers file results in an error saying the install script was unable to obtain an ip-address from the interface, so I assume it uses some command other than ifconfig to get it. My solution is simply that before running packstack, on the main node I use ifconfig to set an address on eth1 matching the address on br-eth1; briefly there are two identical ip-addresses on the server, but as the internal network is not used for the install that doesn’t seem to cause any problems, and I always reboot after the install so it gets cleaned up again anyway.
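
So in practice the sequence on the controller/network/compute1 node is something like this (the addresses and answers file name are the ones used elsewhere in this post):

# give eth1 the same address the bridge already has, just for the duration of the install
ifconfig eth1 172.16.0.172 netmask 255.255.255.0 up
packstack --answer-file=answers5.txt
# reboot afterwards so the duplicate address on eth1 is cleaned up again
reboot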

The final issue to be aware of, if using openvswitch

If you run packstack more than once, ensure that you use ovs-vsctl to del-port all the vxlan interfaces before running packstack again. It will quite happily create new vxlan entries for you, and it can get damn confusing trying to figure out which are obsolete. So delete them all, run packstack, reboot; and after the reboot only the “real” ones will be re-created.

If you don’t do that, no matter how many times you reboot the obsolete entries will not be removed and will cause problems.
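
A minimal way of doing that cleanup, using the bridge and port naming from my setup (run it on every node that has a br-tun):

# remove every vxlan port from br-tun before rerunning packstack
for port in $(ovs-vsctl list-ports br-tun | grep "^vxlan")
do
    ovs-vsctl del-port br-tun ${port}
done
# then rerun packstack and reboot; only the "real" vxlan ports come back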

My working two node setup

controller/network/compute1

[root@region1server1 ~]# ifconfig -a
br-eth1: flags=4163  mtu 1500
        inet 172.16.0.172  netmask 255.255.255.0  broadcast 172.16.0.255
        inet6 fe80::78bc:faff:fe6e:4143  prefixlen 64  scopeid 0x20
        ether 7a:bc:fa:6e:41:43  txqueuelen 0  (Ethernet)
        RX packets 14631  bytes 1066026 (1.0 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 19  bytes 1502 (1.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

br-ex: flags=4163  mtu 1500
        inet 192.168.1.172  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::50fe:34ff:fee8:a444  prefixlen 64  scopeid 0x20
        ether 52:fe:34:e8:a4:44  txqueuelen 0  (Ethernet)
        RX packets 17737  bytes 1578768 (1.5 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 3067  bytes 436548 (426.3 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

br-int: flags=4098  mtu 1500
        ether 42:b3:b5:7f:b7:47  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 14414  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

br-tun: flags=4098  mtu 1500
        ether 9a:ae:38:59:b8:4e  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4163  mtu 1500
        inet6 fe80::5054:ff:fecb:ef4c  prefixlen 64  scopeid 0x20
        ether 52:54:00:cb:ef:4c  txqueuelen 1000  (Ethernet)
        RX packets 14820  bytes 1366266 (1.3 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 5982  bytes 648978 (633.7 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1: flags=4163  mtu 1500
        inet6 fe80::5054:ff:fe13:c79  prefixlen 64  scopeid 0x20
        ether 52:54:00:13:0c:79  txqueuelen 1000  (Ethernet)
        RX packets 3363  bytes 256154 (250.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 11287  bytes 811456 (792.4 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10
        loop  txqueuelen 0  (Local Loopback)
        RX packets 108122  bytes 17743740 (16.9 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 108122  bytes 17743740 (16.9 MiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ovs-system: flags=4098  mtu 1500
        ether b2:aa:a6:05:1d:4c  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@region1server1 ~]# ovs-vsctl show
2eaca3fa-3b18-49e7-baf4-af2293ee59a6
    Bridge br-tun
        fail_mode: secure
        Port br-tun
            Interface br-tun
                type: internal
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
        Port "vxlan-ac1000a2"
            Interface "vxlan-ac1000a2"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.16.0.172", out_key=flow, remote_ip="172.16.0.162"}
    Bridge "br-eth1"
        Port "eth1"
            Interface "eth1"
        Port "phy-br-eth1"
            Interface "phy-br-eth1"
                type: patch
                options: {peer="int-br-eth1"}
        Port "br-eth1"
            Interface "br-eth1"
                type: internal
    Bridge br-int
        fail_mode: secure
        Port "int-br-eth1"
            Interface "int-br-eth1"
                type: patch
                options: {peer="phy-br-eth1"}
        Port br-int
            Interface br-int
                type: internal
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port "tapb013c23e-53"
            tag: 1
            Interface "tapb013c23e-53"
                type: internal
        Port int-br-ex
            Interface int-br-ex
                type: patch
                options: {peer=phy-br-ex}
        Port "qr-06edd289-55"
            tag: 1
            Interface "qr-06edd289-55"
                type: internal
    Bridge br-ex
        Port "eth0"
            Interface "eth0"
        Port "qg-a07dff9d-1a"
            Interface "qg-a07dff9d-1a"
                type: internal
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
        Port br-ex
            Interface br-ex
                type: internal
    ovs_version: "2.4.0"
[root@region1server1 ~]#
[root@region1server1 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG    0      0        0 br-ex
169.254.0.0     0.0.0.0         255.255.0.0     U     1002   0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     1003   0        0 eth1
169.254.0.0     0.0.0.0         255.255.0.0     U     1005   0        0 br-ex
169.254.0.0     0.0.0.0         255.255.0.0     U     1008   0        0 br-eth1
172.16.0.0      0.0.0.0         255.255.255.0   U     0      0        0 br-eth1
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 br-ex
[root@region1server1 ~]# ip netns
qrouter-b1a8ae3c-f48d-4688-8fb9-823d4e3717d8
qdhcp-d5c92bf4-8831-4ac8-9465-4da50e71435e
[root@region1server1 ~]# ip net exec qrouter-b1a8ae3c-f48d-4688-8fb9-823d4e3717d8 ifconfig -a
lo: flags=73  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10
        loop  txqueuelen 0  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

qg-a07dff9d-1a: flags=4163  mtu 1500
        inet 192.168.1.234  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::f816:3eff:fe01:b402  prefixlen 64  scopeid 0x20
        ether fa:16:3e:01:b4:02  txqueuelen 0  (Ethernet)
        RX packets 315  bytes 19458 (19.0 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 16  bytes 1200 (1.1 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

qr-06edd289-55: flags=4163  mtu 1450
        inet 10.0.3.1  netmask 255.255.255.0  broadcast 10.0.3.255
        inet6 fe80::f816:3eff:fe17:1643  prefixlen 64  scopeid 0x20
        ether fa:16:3e:17:16:43  txqueuelen 0  (Ethernet)
        RX packets 17  bytes 1212 (1.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 10  bytes 864 (864.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@region1server1 ~]# ip net exec qrouter-b1a8ae3c-f48d-4688-8fb9-823d4e3717d8 route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG    0      0        0 qg-a07dff9d-1a
10.0.3.0        0.0.0.0         255.255.255.0   U     0      0        0 qr-06edd289-55
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 qg-a07dff9d-1a

[root@region1server1 ~]# ip net exec qdhcp-d5c92bf4-8831-4ac8-9465-4da50e71435e ifconfig -a
lo: flags=73  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10
        loop  txqueuelen 0  (Local Loopback)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

tapb013c23e-53: flags=4163  mtu 1450
        inet 10.0.3.50  netmask 255.255.255.0  broadcast 10.0.3.255
        inet6 fe80::f816:3eff:fe01:3e1c  prefixlen 64  scopeid 0x20
        ether fa:16:3e:01:3e:1c  txqueuelen 0  (Ethernet)
        RX packets 2  bytes 220 (220.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 8  bytes 648 (648.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@region1server1 ~]# ip net exec qdhcp-d5c92bf4-8831-4ac8-9465-4da50e71435e route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         10.0.3.1        0.0.0.0         UG    0      0        0 tapb013c23e-53
10.0.3.0        0.0.0.0         255.255.255.0   U     0      0        0 tapb013c23e-53
[root@region1server1 ~]# 
[root@region1server1 ~]# source keystonerc_admin
[root@region1server1 ~(keystone_admin)]# neutron agent-list
+---------------------------+--------------------+---------------------------+-------------------+-------+----------------+---------------------------+
| id                        | agent_type         | host                      | availability_zone | alive | admin_state_up | binary                    |
+---------------------------+--------------------+---------------------------+-------------------+-------+----------------+---------------------------+
| 704aaa62-94ea-4be0-8aff-  | DHCP agent         | region1server1.mdickinson | nova              | :-)   | True           | neutron-dhcp-agent        |
| 6e9c702e0132              |                    | .dyndns.org               |                   |       |                |                           |
| 8447df3d-f681-4f1e-       | L3 agent           | region1server1.mdickinson | nova              | :-)   | True           | neutron-l3-agent          |
| af90-b1693474a115         |                    | .dyndns.org               |                   |       |                |                           |
| 92766a15-6902-42a5-85db-  | Open vSwitch agent | region1server1.mdickinson |                   | :-)   | True           | neutron-openvswitch-agent |
| 3b128468666b              |                    | .dyndns.org               |                   |       |                |                           |
| d6f1c717-8db4-4c80-b435-0 | Metering agent     | region1server1.mdickinson |                   | :-)   | True           | neutron-metering-agent    |
| 437de81b8e3               |                    | .dyndns.org               |                   |       |                |                           |
| ef8c7c37-f973-4db0-8041-b | Metadata agent     | region1server1.mdickinson |                   | :-)   | True           | neutron-metadata-agent    |
| 43feb1b4841               |                    | .dyndns.org               |                   |       |                |                           |
| fe280959-f0f7-4c3b-8376-d | Open vSwitch agent | region1compute2.mdickinso |                   | :-)   | True           | neutron-openvswitch-agent |
| 8b9a462034b               |                    | n.dyndns.org              |                   |       |                |                           |
+---------------------------+--------------------+---------------------------+-------------------+-------+----------------+---------------------------+
[root@region1server1 ~(keystone_admin)]# 

compute2

[root@region1compute2 ~]# ifconfig -a
br-int: flags=4098  mtu 1500
        ether 6a:57:6d:ac:ba:43  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

br-tun: flags=4098  mtu 1500
        ether 36:4b:86:9c:bd:46  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth0: flags=4163  mtu 1500
        inet 192.168.1.162  netmask 255.255.255.0  broadcast 192.168.1.255
        inet6 fe80::5054:ff:fea3:16f7  prefixlen 64  scopeid 0x20
        ether 52:54:00:a3:16:f7  txqueuelen 1000  (Ethernet)
        RX packets 19536  bytes 1754775 (1.6 MiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 5739  bytes 928722 (906.9 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

eth1: flags=4163  mtu 1500
        inet 172.16.0.162  netmask 255.255.255.0  broadcast 172.16.0.255
        inet6 fe80::5054:ff:fe78:a3bf  prefixlen 64  scopeid 0x20
        ether 52:54:00:78:a3:bf  txqueuelen 1000  (Ethernet)
        RX packets 1640  bytes 138116 (134.8 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 12  bytes 816 (816.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

lo: flags=73  mtu 65536
        inet 127.0.0.1  netmask 255.0.0.0
        inet6 ::1  prefixlen 128  scopeid 0x10
        loop  txqueuelen 0  (Local Loopback)
        RX packets 2285  bytes 119977 (117.1 KiB)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 2285  bytes 119977 (117.1 KiB)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

ovs-system: flags=4098  mtu 1500
        ether 6e:8b:ea:a5:6b:55  txqueuelen 0  (Ethernet)
        RX packets 0  bytes 0 (0.0 B)
        RX errors 0  dropped 0  overruns 0  frame 0
        TX packets 0  bytes 0 (0.0 B)
        TX errors 0  dropped 0 overruns 0  carrier 0  collisions 0

[root@region1compute2 ~]# ovs-vsctl show
8e09970f-3505-4edd-92ed-c9f00de03dad
    Bridge br-tun
        fail_mode: secure
        Port "vxlan-ac1000ac"
            Interface "vxlan-ac1000ac"
                type: vxlan
                options: {df_default="true", in_key=flow, local_ip="172.16.0.162", out_key=flow, remote_ip="172.16.0.172"}
        Port br-tun
            Interface br-tun
                type: internal
        Port patch-int
            Interface patch-int
                type: patch
                options: {peer=patch-tun}
    Bridge br-int
        fail_mode: secure
        Port patch-tun
            Interface patch-tun
                type: patch
                options: {peer=patch-int}
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "2.5.0"
[root@region1compute2 ~]# route -n
Kernel IP routing table
Destination     Gateway         Genmask         Flags Metric Ref    Use Iface
0.0.0.0         192.168.1.1     0.0.0.0         UG    0      0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     1002   0        0 eth0
169.254.0.0     0.0.0.0         255.255.0.0     U     1003   0        0 eth1
172.16.0.0      0.0.0.0         255.255.255.0   U     0      0        0 eth1
192.168.1.0     0.0.0.0         255.255.255.0   U     0      0        0 eth0
[root@region1compute2 ~]# ip netns
[root@region1compute2 ~]# 

answers file on the controller/network/compute1 server

[root@region1server1 RDO(keystone_admin)]# cat answers5.txt | grep -v "^#" | while read xx
> do
> if [ "${xx}." != "." ];
> then
> echo "${xx}"
> fi
> done
[general]
CONFIG_SSH_KEY=/root/.ssh/id_rsa.pub
CONFIG_DEFAULT_PASSWORD=password
CONFIG_SERVICE_WORKERS=%{::processorcount}
CONFIG_MARIADB_INSTALL=y
CONFIG_GLANCE_INSTALL=y
CONFIG_CINDER_INSTALL=y
CONFIG_MANILA_INSTALL=n
CONFIG_NOVA_INSTALL=y
CONFIG_NEUTRON_INSTALL=y
CONFIG_HORIZON_INSTALL=y
CONFIG_SWIFT_INSTALL=y
CONFIG_CEILOMETER_INSTALL=n
CONFIG_AODH_INSTALL=n
CONFIG_GNOCCHI_INSTALL=n
CONFIG_SAHARA_INSTALL=n
CONFIG_HEAT_INSTALL=y
CONFIG_TROVE_INSTALL=n
CONFIG_IRONIC_INSTALL=n
CONFIG_CLIENT_INSTALL=y
CONFIG_NTP_SERVERS=pool.ntp.org
CONFIG_NAGIOS_INSTALL=y
EXCLUDE_SERVERS=
CONFIG_DEBUG_MODE=n
CONFIG_CONTROLLER_HOST=192.168.1.172
CONFIG_COMPUTE_HOSTS=192.168.1.172,192.168.1.162
CONFIG_NETWORK_HOSTS=192.168.1.172
CONFIG_VMWARE_BACKEND=n
CONFIG_UNSUPPORTED=n
CONFIG_USE_SUBNETS=n
CONFIG_VCENTER_HOST=
CONFIG_VCENTER_USER=
CONFIG_VCENTER_PASSWORD=
CONFIG_VCENTER_CLUSTER_NAMES=
CONFIG_STORAGE_HOST=192.168.1.172
CONFIG_SAHARA_HOST=192.168.1.172
CONFIG_USE_EPEL=n
CONFIG_REPO=
CONFIG_ENABLE_RDO_TESTING=n
CONFIG_RH_USER=
CONFIG_SATELLITE_URL=
CONFIG_RH_SAT6_SERVER=
CONFIG_RH_PW=
CONFIG_RH_OPTIONAL=y
CONFIG_RH_PROXY=
CONFIG_RH_SAT6_ORG=
CONFIG_RH_SAT6_KEY=
CONFIG_RH_PROXY_PORT=
CONFIG_RH_PROXY_USER=
CONFIG_RH_PROXY_PW=
CONFIG_SATELLITE_USER=
CONFIG_SATELLITE_PW=
CONFIG_SATELLITE_AKEY=
CONFIG_SATELLITE_CACERT=
CONFIG_SATELLITE_PROFILE=
CONFIG_SATELLITE_FLAGS=
CONFIG_SATELLITE_PROXY=
CONFIG_SATELLITE_PROXY_USER=
CONFIG_SATELLITE_PROXY_PW=
CONFIG_SSL_CACERT_FILE=/etc/pki/tls/certs/selfcert.crt
CONFIG_SSL_CACERT_KEY_FILE=/etc/pki/tls/private/selfkey.key
CONFIG_SSL_CERT_DIR=~/packstackca/
CONFIG_SSL_CACERT_SELFSIGN=y
CONFIG_SELFSIGN_CACERT_SUBJECT_C=--
CONFIG_SELFSIGN_CACERT_SUBJECT_ST=State
CONFIG_SELFSIGN_CACERT_SUBJECT_L=City
CONFIG_SELFSIGN_CACERT_SUBJECT_O=openstack
CONFIG_SELFSIGN_CACERT_SUBJECT_OU=packstack
CONFIG_SELFSIGN_CACERT_SUBJECT_CN=region1_server1.mdickinson.dyndns.org
CONFIG_SELFSIGN_CACERT_SUBJECT_MAIL=admin@region1_server1.mdickinson.dyndns.org
CONFIG_AMQP_BACKEND=rabbitmq
CONFIG_AMQP_HOST=192.168.1.172
CONFIG_AMQP_ENABLE_SSL=n
CONFIG_AMQP_ENABLE_AUTH=n
CONFIG_AMQP_NSS_CERTDB_PW=PW_PLACEHOLDER
CONFIG_AMQP_AUTH_USER=amqp_user
CONFIG_AMQP_AUTH_PASSWORD=PW_PLACEHOLDER
CONFIG_MARIADB_HOST=192.168.1.172
CONFIG_MARIADB_USER=root
CONFIG_MARIADB_PW=cd4c212457984dc7
CONFIG_KEYSTONE_DB_PW=91881e2de37e4a57
CONFIG_KEYSTONE_DB_PURGE_ENABLE=True
CONFIG_KEYSTONE_REGION=RegionOne
CONFIG_KEYSTONE_ADMIN_TOKEN=186993d2e8e644ed809d886dfc1a48b8
CONFIG_KEYSTONE_ADMIN_EMAIL=root@localhost
CONFIG_KEYSTONE_ADMIN_USERNAME=admin
CONFIG_KEYSTONE_ADMIN_PW=password
CONFIG_KEYSTONE_DEMO_PW=8eb8f7da8e434ce4
CONFIG_KEYSTONE_API_VERSION=v2.0
CONFIG_KEYSTONE_TOKEN_FORMAT=UUID
CONFIG_KEYSTONE_SERVICE_NAME=httpd
CONFIG_KEYSTONE_IDENTITY_BACKEND=sql
CONFIG_KEYSTONE_LDAP_URL=ldap://192.168.1.172
CONFIG_KEYSTONE_LDAP_USER_DN=
CONFIG_KEYSTONE_LDAP_USER_PASSWORD=
CONFIG_KEYSTONE_LDAP_SUFFIX=
CONFIG_KEYSTONE_LDAP_QUERY_SCOPE=one
CONFIG_KEYSTONE_LDAP_PAGE_SIZE=-1
CONFIG_KEYSTONE_LDAP_USER_SUBTREE=
CONFIG_KEYSTONE_LDAP_USER_FILTER=
CONFIG_KEYSTONE_LDAP_USER_OBJECTCLASS=
CONFIG_KEYSTONE_LDAP_USER_ID_ATTRIBUTE=
CONFIG_KEYSTONE_LDAP_USER_NAME_ATTRIBUTE=
CONFIG_KEYSTONE_LDAP_USER_MAIL_ATTRIBUTE=
CONFIG_KEYSTONE_LDAP_USER_ENABLED_ATTRIBUTE=
CONFIG_KEYSTONE_LDAP_USER_ENABLED_MASK=-1
CONFIG_KEYSTONE_LDAP_USER_ENABLED_DEFAULT=TRUE
CONFIG_KEYSTONE_LDAP_USER_ENABLED_INVERT=n
CONFIG_KEYSTONE_LDAP_USER_ATTRIBUTE_IGNORE=
CONFIG_KEYSTONE_LDAP_USER_DEFAULT_PROJECT_ID_ATTRIBUTE=
CONFIG_KEYSTONE_LDAP_USER_ALLOW_CREATE=n
CONFIG_KEYSTONE_LDAP_USER_ALLOW_UPDATE=n
CONFIG_KEYSTONE_LDAP_USER_ALLOW_DELETE=n
CONFIG_KEYSTONE_LDAP_USER_PASS_ATTRIBUTE=
CONFIG_KEYSTONE_LDAP_USER_ENABLED_EMULATION_DN=
CONFIG_KEYSTONE_LDAP_USER_ADDITIONAL_ATTRIBUTE_MAPPING=
CONFIG_KEYSTONE_LDAP_GROUP_SUBTREE=
CONFIG_KEYSTONE_LDAP_GROUP_FILTER=
CONFIG_KEYSTONE_LDAP_GROUP_OBJECTCLASS=
CONFIG_KEYSTONE_LDAP_GROUP_ID_ATTRIBUTE=
CONFIG_KEYSTONE_LDAP_GROUP_NAME_ATTRIBUTE=
CONFIG_KEYSTONE_LDAP_GROUP_MEMBER_ATTRIBUTE=
CONFIG_KEYSTONE_LDAP_GROUP_DESC_ATTRIBUTE=
CONFIG_KEYSTONE_LDAP_GROUP_ATTRIBUTE_IGNORE=
CONFIG_KEYSTONE_LDAP_GROUP_ALLOW_CREATE=n
CONFIG_KEYSTONE_LDAP_GROUP_ALLOW_UPDATE=n
CONFIG_KEYSTONE_LDAP_GROUP_ALLOW_DELETE=n
CONFIG_KEYSTONE_LDAP_GROUP_ADDITIONAL_ATTRIBUTE_MAPPING=
CONFIG_KEYSTONE_LDAP_USE_TLS=n
CONFIG_KEYSTONE_LDAP_TLS_CACERTDIR=
CONFIG_KEYSTONE_LDAP_TLS_CACERTFILE=
CONFIG_KEYSTONE_LDAP_TLS_REQ_CERT=demand
CONFIG_GLANCE_DB_PW=2422b6ba1b3d4cd1
CONFIG_GLANCE_KS_PW=fd5814c32bc54791
CONFIG_GLANCE_BACKEND=file
CONFIG_CINDER_DB_PW=ea18a88c9ab24b93
CONFIG_CINDER_DB_PURGE_ENABLE=True
CONFIG_CINDER_KS_PW=f331e1eb6a69434a
CONFIG_CINDER_BACKEND=lvm
CONFIG_CINDER_VOLUMES_CREATE=y
CONFIG_CINDER_VOLUMES_SIZE=20G
CONFIG_CINDER_GLUSTER_MOUNTS=
CONFIG_CINDER_NFS_MOUNTS=
CONFIG_CINDER_NETAPP_LOGIN=
CONFIG_CINDER_NETAPP_PASSWORD=
CONFIG_CINDER_NETAPP_HOSTNAME=
CONFIG_CINDER_NETAPP_SERVER_PORT=80
CONFIG_CINDER_NETAPP_STORAGE_FAMILY=ontap_cluster
CONFIG_CINDER_NETAPP_TRANSPORT_TYPE=http
CONFIG_CINDER_NETAPP_STORAGE_PROTOCOL=nfs
CONFIG_CINDER_NETAPP_SIZE_MULTIPLIER=1.0
CONFIG_CINDER_NETAPP_EXPIRY_THRES_MINUTES=720
CONFIG_CINDER_NETAPP_THRES_AVL_SIZE_PERC_START=20
CONFIG_CINDER_NETAPP_THRES_AVL_SIZE_PERC_STOP=60
CONFIG_CINDER_NETAPP_NFS_SHARES=
CONFIG_CINDER_NETAPP_NFS_SHARES_CONFIG=/etc/cinder/shares.conf
CONFIG_CINDER_NETAPP_VOLUME_LIST=
CONFIG_CINDER_NETAPP_VFILER=
CONFIG_CINDER_NETAPP_PARTNER_BACKEND_NAME=
CONFIG_CINDER_NETAPP_VSERVER=
CONFIG_CINDER_NETAPP_CONTROLLER_IPS=
CONFIG_CINDER_NETAPP_SA_PASSWORD=
CONFIG_CINDER_NETAPP_ESERIES_HOST_TYPE=linux_dm_mp
CONFIG_CINDER_NETAPP_WEBSERVICE_PATH=/devmgr/v2
CONFIG_CINDER_NETAPP_STORAGE_POOLS=
CONFIG_IRONIC_DB_PW=PW_PLACEHOLDER
CONFIG_IRONIC_KS_PW=PW_PLACEHOLDER
CONFIG_NOVA_DB_PURGE_ENABLE=True
CONFIG_NOVA_DB_PW=ae248be65adb4e7c
CONFIG_NOVA_KS_PW=3abacd67d9094bde
CONFIG_NOVA_SCHED_CPU_ALLOC_RATIO=16.0
CONFIG_NOVA_SCHED_RAM_ALLOC_RATIO=1.5
CONFIG_NOVA_COMPUTE_MIGRATE_PROTOCOL=tcp
CONFIG_NOVA_COMPUTE_MANAGER=nova.compute.manager.ComputeManager
CONFIG_VNC_SSL_CERT=
CONFIG_VNC_SSL_KEY=
CONFIG_NOVA_PCI_ALIAS=
CONFIG_NOVA_PCI_PASSTHROUGH_WHITELIST=
CONFIG_NOVA_COMPUTE_PRIVIF=
CONFIG_NOVA_NETWORK_MANAGER=nova.network.manager.FlatDHCPManager
CONFIG_NOVA_NETWORK_PUBIF=eth0
CONFIG_NOVA_NETWORK_PRIVIF=
CONFIG_NOVA_NETWORK_FIXEDRANGE=192.168.32.0/22
CONFIG_NOVA_NETWORK_FLOATRANGE=10.3.4.0/22
CONFIG_NOVA_NETWORK_AUTOASSIGNFLOATINGIP=n
CONFIG_NOVA_NETWORK_VLAN_START=100
CONFIG_NOVA_NETWORK_NUMBER=1
CONFIG_NOVA_NETWORK_SIZE=255
CONFIG_NEUTRON_KS_PW=3241291348464592
CONFIG_NEUTRON_DB_PW=18245d0413aa4fff
CONFIG_NEUTRON_L3_EXT_BRIDGE=br-ex
CONFIG_NEUTRON_METADATA_PW=f6886849c84d4fc8
CONFIG_LBAAS_INSTALL=n
CONFIG_NEUTRON_METERING_AGENT_INSTALL=y
CONFIG_NEUTRON_FWAAS=n
CONFIG_NEUTRON_VPNAAS=n
CONFIG_NEUTRON_ML2_TYPE_DRIVERS=vxlan
CONFIG_NEUTRON_ML2_TENANT_NETWORK_TYPES=vxlan
CONFIG_NEUTRON_ML2_MECHANISM_DRIVERS=openvswitch
CONFIG_NEUTRON_ML2_FLAT_NETWORKS=*
CONFIG_NEUTRON_ML2_VLAN_RANGES=
CONFIG_NEUTRON_ML2_TUNNEL_ID_RANGES=
CONFIG_NEUTRON_ML2_VXLAN_GROUP=
CONFIG_NEUTRON_ML2_VNI_RANGES=10:100
CONFIG_NEUTRON_L2_AGENT=openvswitch
CONFIG_NEUTRON_ML2_SUPPORTED_PCI_VENDOR_DEVS=['15b3:1004', '8086:10ca']
CONFIG_NEUTRON_ML2_SRIOV_AGENT_REQUIRED=n
CONFIG_NEUTRON_ML2_SRIOV_INTERFACE_MAPPINGS=
CONFIG_NEUTRON_LB_INTERFACE_MAPPINGS=
CONFIG_NEUTRON_OVS_BRIDGE_MAPPINGS=physnet1:br-eth1
CONFIG_NEUTRON_OVS_BRIDGE_IFACES=br-eth1:eth1
CONFIG_NEUTRON_OVS_TUNNEL_IF=eth1
CONFIG_NEUTRON_OVS_TUNNEL_SUBNETS=
CONFIG_NEUTRON_OVS_VXLAN_UDP_PORT=4789
CONFIG_MANILA_DB_PW=PW_PLACEHOLDER
CONFIG_MANILA_KS_PW=PW_PLACEHOLDER
CONFIG_MANILA_BACKEND=generic
CONFIG_MANILA_NETAPP_DRV_HANDLES_SHARE_SERVERS=false
CONFIG_MANILA_NETAPP_TRANSPORT_TYPE=https
CONFIG_MANILA_NETAPP_LOGIN=admin
CONFIG_MANILA_NETAPP_PASSWORD=
CONFIG_MANILA_NETAPP_SERVER_HOSTNAME=
CONFIG_MANILA_NETAPP_STORAGE_FAMILY=ontap_cluster
CONFIG_MANILA_NETAPP_SERVER_PORT=443
CONFIG_MANILA_NETAPP_AGGREGATE_NAME_SEARCH_PATTERN=(.*)
CONFIG_MANILA_NETAPP_ROOT_VOLUME_AGGREGATE=
CONFIG_MANILA_NETAPP_ROOT_VOLUME_NAME=root
CONFIG_MANILA_NETAPP_VSERVER=
CONFIG_MANILA_GENERIC_DRV_HANDLES_SHARE_SERVERS=true
CONFIG_MANILA_GENERIC_VOLUME_NAME_TEMPLATE=manila-share-%s
CONFIG_MANILA_GENERIC_SHARE_MOUNT_PATH=/shares
CONFIG_MANILA_SERVICE_IMAGE_LOCATION=https://www.dropbox.com/s/vi5oeh10q1qkckh/ubuntu_1204_nfs_cifs.qcow2
CONFIG_MANILA_SERVICE_INSTANCE_USER=ubuntu
CONFIG_MANILA_SERVICE_INSTANCE_PASSWORD=ubuntu
CONFIG_MANILA_NETWORK_TYPE=neutron
CONFIG_MANILA_NETWORK_STANDALONE_GATEWAY=
CONFIG_MANILA_NETWORK_STANDALONE_NETMASK=
CONFIG_MANILA_NETWORK_STANDALONE_SEG_ID=
CONFIG_MANILA_NETWORK_STANDALONE_IP_RANGE=
CONFIG_MANILA_NETWORK_STANDALONE_IP_VERSION=4
CONFIG_MANILA_GLUSTERFS_SERVERS=
CONFIG_MANILA_GLUSTERFS_NATIVE_PATH_TO_PRIVATE_KEY=
CONFIG_MANILA_GLUSTERFS_VOLUME_PATTERN=
CONFIG_MANILA_GLUSTERFS_TARGET=
CONFIG_MANILA_GLUSTERFS_MOUNT_POINT_BASE=
CONFIG_MANILA_GLUSTERFS_NFS_SERVER_TYPE=gluster
CONFIG_MANILA_GLUSTERFS_PATH_TO_PRIVATE_KEY=
CONFIG_MANILA_GLUSTERFS_GANESHA_SERVER_IP=
CONFIG_HORIZON_SSL=n
CONFIG_HORIZON_SECRET_KEY=f62f4e70de1a4ce3a1f1bf0b77801615
CONFIG_HORIZON_SSL_CERT=
CONFIG_HORIZON_SSL_KEY=
CONFIG_HORIZON_SSL_CACERT=
CONFIG_SWIFT_KS_PW=7422d67090b14226
CONFIG_SWIFT_STORAGES=
CONFIG_SWIFT_STORAGE_ZONES=1
CONFIG_SWIFT_STORAGE_REPLICAS=1
CONFIG_SWIFT_STORAGE_FSTYPE=ext4
CONFIG_SWIFT_HASH=0b12807a286040c1
CONFIG_SWIFT_STORAGE_SIZE=2G
CONFIG_HEAT_DB_PW=password
CONFIG_HEAT_AUTH_ENC_KEY=5100379e6c4b41f6
CONFIG_HEAT_KS_PW=password
CONFIG_HEAT_CLOUDWATCH_INSTALL=n
CONFIG_HEAT_CFN_INSTALL=n
CONFIG_HEAT_DOMAIN=heat
CONFIG_HEAT_DOMAIN_ADMIN=heat_admin
CONFIG_HEAT_DOMAIN_PASSWORD=password
CONFIG_PROVISION_DEMO=n
CONFIG_PROVISION_TEMPEST=n
CONFIG_PROVISION_DEMO_FLOATRANGE=172.24.4.224/28
CONFIG_PROVISION_IMAGE_NAME=cirros
CONFIG_PROVISION_IMAGE_URL=http://download.cirros-cloud.net/0.3.4/cirros-0.3.4-x86_64-disk.img
CONFIG_PROVISION_IMAGE_FORMAT=qcow2
CONFIG_PROVISION_IMAGE_SSH_USER=cirros
CONFIG_TEMPEST_HOST=
CONFIG_PROVISION_TEMPEST_USER=
CONFIG_PROVISION_TEMPEST_USER_PW=PW_PLACEHOLDER
CONFIG_PROVISION_TEMPEST_FLOATRANGE=172.24.4.224/28
CONFIG_PROVISION_TEMPEST_REPO_URI=https://github.com/openstack/tempest.git
CONFIG_PROVISION_TEMPEST_REPO_REVISION=master
CONFIG_RUN_TEMPEST=n
CONFIG_RUN_TEMPEST_TESTS=smoke
CONFIG_PROVISION_OVS_BRIDGE=y
CONFIG_GNOCCHI_DB_PW=PW_PLACEHOLDER
CONFIG_GNOCCHI_KS_PW=PW_PLACEHOLDER
CONFIG_CEILOMETER_SECRET=a1ad745f12b94af2
CONFIG_CEILOMETER_KS_PW=PW_PLACEHOLDER
CONFIG_CEILOMETER_SERVICE_NAME=httpd
CONFIG_CEILOMETER_COORDINATION_BACKEND=redis
CONFIG_CEILOMETER_METERING_BACKEND=database
CONFIG_MONGODB_HOST=192.168.1.172
CONFIG_REDIS_MASTER_HOST=192.168.1.172
CONFIG_REDIS_PORT=6379
CONFIG_REDIS_HA=n
CONFIG_REDIS_SLAVE_HOSTS=
CONFIG_REDIS_SENTINEL_HOSTS=
CONFIG_REDIS_SENTINEL_CONTACT_HOST=
CONFIG_REDIS_SENTINEL_PORT=26379
CONFIG_REDIS_SENTINEL_QUORUM=2
CONFIG_REDIS_MASTER_NAME=mymaster
CONFIG_AODH_KS_PW=PW_PLACEHOLDER
CONFIG_TROVE_DB_PW=PW_PLACEHOLDER
CONFIG_TROVE_KS_PW=PW_PLACEHOLDER
CONFIG_TROVE_NOVA_USER=trove
CONFIG_TROVE_NOVA_TENANT=services
CONFIG_TROVE_NOVA_PW=PW_PLACEHOLDER
CONFIG_SAHARA_DB_PW=PW_PLACEHOLDER
CONFIG_SAHARA_KS_PW=PW_PLACEHOLDER
CONFIG_NAGIOS_PW=8b42e5beb2444ec0
Posted in OpenStack | Comments Off on OpenStack RDO using packstack again, Matika release

OpenStack – Playing with heat patterns

My attempts to figure out OpenStack are happily producing more headaches for me. This is still on the Mitaka release.

I have temporarily suspended my attempts to get migration between compute nodes working, as I have been sidetracked by heat templates and stacks. While testing for this post I discovered some instances were being deployed on the second compute node; everything starting on that node complained it could not contact the metadata service, so the vxlan setup may not be correct even though instances can be started there, just without networking. I have disabled that compute node while I play with templates; that will be another post. This one is about using a template to create a stack.

Template issues in general

The major unexpected issues in using heat templates I have encountered already are…

  • users, even those with admin authority on an openstack project, are not permitted to create stacks for that project with the “openstack stack create” command; it errors with ‘not heat_stack_owner’
  • the only workaround I found is to create a new user, with the ‘heat_stack_owner’ role, for the project the stack is being created in; and of course a new ssh key-pair for that user, as an ssh key is needed to access servers created by the stack and the new user has no access to any of the existing keys
  • once the stack is created it seems other members of the project can manage it; at least other members with admin roles can delete the stack after it is created

So it seems that ‘stacks’ are by design intended to be used by users separate from the day-to-day management of instances; at least in my setup admin users cannot create stacks, so in a test environment where I am the only user I need a minimum of two userids just to do admin functions. That is just inconvenient for a single-user system, and I guess in the real world there may be a need to prevent admin users from creating stacks; but personally I think the admin role should be renamed from ‘admin’ if it is not a god role, to avoid confusion.
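
So before doing anything with stacks I had to do something along these lines as the admin user (the user and project names here are just examples, and the keypair has to be created while authenticated as the new user or it will not be available to that user):

# create a user for stack work and give it the heat_stack_owner role on the project
openstack user create --password-prompt mark_heatstack
openstack role add --user mark_heatstack --project my_project heat_stack_owner
# then, with a keystonerc sourced for the new user, create the keypair it will use
openstack keypair create mark_heatstack_owner_keypair > mark_heatstack_owner_keypair.pem
chmod 600 mark_heatstack_owner_keypair.pem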

YAML syntax is a bit of a pain as well; it took me about an hour to get rid of unexpected start/end block errors in my test template for ‘openstack stack create’ playtime. The solution was… I was missing a SPACE character before a parameter; indentation onto exact columns is a requirement, so don’t miss out those leading spaces.
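
A quick way to catch pure YAML syntax errors before handing the file to heat is to run it through PyYAML; this only checks the YAML syntax, not that heat likes the contents, and assumes PyYAML is installed (it was on my packstack-built node):

python -c 'import sys,yaml; yaml.safe_load(open(sys.argv[1])); print("yaml parses ok")' test_two_servers.yaml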

Fedora supplied cloud image (Fedora-Cloud-Base-24-1.2.x86_64)

The only other major issue encountered was not with OpenStack itself but with the Fedora supplied cloud image (Fedora-Cloud-Base-24-1.2.x86_64) for use with OpenStack. The issue there is that the F24 cloud image ships with python3 while the cloud-init scripts that handle user-provided data try to use python, so they all fail to run complaining that the ‘python’ command is not found.

The fix is to create, in a new image, a symbolic link for python pointing to the python3 symbolic link. Which, painfully, means firing up an instance from the Fedora supplied F24 cloud image, logging in to it, and as root running “cd /usr/bin; ln -s python3 python”; then shutting the instance down, snapshotting it, using the “glance image-download” command to retrieve the snapshot as a qcow2 disk image, running virt-sysprep on that new qcow2 file, and using glance to install it as a new image with a name meaningful to you; then use that new image to launch future instances that need user data passed to them. However, since while doing that you can also install additional packages to make the image more useful to you, it is not really a wasted effort.
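
The whole round trip looks roughly like this; the image and file names are my own and the snapshot id is whatever glance reports for your snapshot:

# inside the running F24 instance, as root
cd /usr/bin && ln -s python3 python

# back on the openstack node, after shutting the instance down and snapshotting it
# from the dashboard (or with "nova image-create <instance> <snapshot-name>")
glance image-download --file F24-python.qcow2 <snapshot-id>
virt-sysprep -a F24-python.qcow2
glance image-create --name F24-CloudBase-24-1.2.x86_64-python \
    --disk-format qcow2 --container-format bare --file F24-python.qcow2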

The cloud image also does not have the “wc_notify” command, so it is presumably missing some heat helper packages; that is not causing me any issues as long as I make sure I do not use that command in any of the user_data scripts.

My example template

This template is for my customised environment; all the parameters can be overridden at stack creation time. The reasons some parameters differ from the users and flavours supplied in a new openstack install are

  • as noted earlier in the post, a new user with the heat_stack_owner role had to be created along with a new keypair, so the key pair used is mark_heatstack_owner_keypair as that is the key created for that user, and the only key the user has access to
  • it uses my OpenStack network configuration, the project is my personal project so the private network is tenant-mark-10-0-3-0 with subnet tenant-mark-10-0-3-0-subnet1; and the external network available to this project is ext-net-192-flat
  • as noted earlier in the post the Fedora supplied F24 cloud image cannot process user_data passed to cloud-init, so the image used in my example is F24-CloudBase-24-1.2.x86_64-python which is a copy of the F24 image with the only change from the original being that the python link has been created so cloud-init can find python and process the user_data
  • the instance types are my customised ones, so override them with your own; the cloud base virtual disk needs 3Gb minimum, and as the image has no swap partition it needs 256Mb of memory minimum, or 128Mb of memory with a swap device defined in the flavor

This template example creates two instances and customises them as follows to make them usable. The idea is that all access to instances in the stack is via the public ip-address on server1; in the real world there would probably be a proxy service on server1 providing pass-through to the other instances in the stack if they were to be running for more than a few days… but cloud instances are supposed to be thrown up and torn down, not long running.

  • server1 is assigned a floating ip-address as well as a private ip-address, server2 only has a private ip-address; so access to the instances in the stack is via ssh key to server1
  • the root password is set for both servers (this is for troubleshooting from the console, you would not normally do that)
  • server2 has the sshd_config altered to allow login using a password instead of the default of only allowing login via ssh keys; this is to avoid having to scp/sftp the ssh keys to server1 just to be able to login to server2 from server1
  • server2 has a password set for the fedora user, that can then be used when logging on from server1
  • after the stack is created the dashboard stack overview page shows, as outputs, the ip-addresses assigned and the passwords used

The filename is test_two_servers.yaml and the command to create the stack is therefore “openstack stack create --template test_two_servers.yaml mark_test_stack01”, run with environment settings for a user with the heat_stack_owner role. The contents of the yaml file are below.


heat_template_version: 2016-04-08

description: >
  Test of stack deploy that sets root password,
  using my private network and a floating ip assigned.

parameters:
  key_name:
    type: string
    label: Key Name
    description: Name of key-pair to be used for compute instance
    default: mark_heatstack_owner_keypair
  image_id:
    type: string
    label: Image ID
    description: Image to be used for compute instance
    default: F24-CloudBase-24-1.2.x86_64-python
  instance_type:
    type: string
    label: Instance Type
    description: Type of instance (flavor) to be used
    default: marks.tiny
    constraints:
      - allowed_values: [ marks.tiny, marks.small, m1.tiny ]
        description: Value must be one of marks.tiny, marks.small or m1.tiny.
  root_password:
    type: string
    label: Root User Password
    description: Password to be used for root user
    hidden: true
    default: letmein
    constraints:
      - length: { min: 6, max: 8 }
        description: Password length must be between 6 and 8 characters.
      - allowed_pattern: "[a-zA-Z0-9]+"
        description: Password must consist of characters and numbers only.
  user_password:
    type: string
    label: Fedora Cloud User Password
    description: Password to be used for fedora user on servers with only private ips
    hidden: true
    default: password
    constraints:
      - length: { min: 6, max: 8 }
        description: Password length must be between 6 and 8 characters.
      - allowed_pattern: "[a-zA-Z0-9]+"
        description: Password must consist of characters and numbers only.
  net:
    description: name of network used to launch instance.
    type: string
    default: tenant-mark-10-0-3-0
  subnet:
    description: name of subnet within network used to launch instance.
    type: string
    default: tenant-mark-10-0-3-0-subnet1
  public_network:
    description: name of the public network to associate floating ip from.
    type: string
    default: ext-net-192-flat

resources:
  my_server1:
    type: OS::Nova::Server
    properties:
      name: mark-server1
      key_name: { get_param: key_name }
      image: { get_param: image_id }
      flavor: { get_param: instance_type }
      networks: 
        - network: { get_param: net }
      user_data: 
         str_replace:
            template: |
              #!/bin/bash
              echo "Customising system image..."
              # For troubleshooting use a known password for console login
              echo "$ROOTPSWD" | passwd root --stdin
              echo "...end of install"
              # wc_notify not in F24 cloud image base
              # wc_notify --data-binary '{"status": "SUCCESS"}'
              exit 0
              # ... done
            params: 
              $ROOTPSWD: { get_param: root_password }
  floating_ip:
    type: OS::Neutron::FloatingIP
    properties:
      floating_network: {get_param: public_network}
  association:
    type: OS::Neutron::FloatingIPAssociation
    properties:
      floatingip_id: { get_resource: floating_ip }
      port_id: {get_attr: [my_server1, addresses, {get_param: net}, 0, port]}

  my_server2:
    type: OS::Nova::Server
    properties:
      name: mark-server2
      key_name: { get_param: key_name }
      image: { get_param: image_id }
      flavor: { get_param: instance_type }
      networks: 
        - network: { get_param: net }
      user_data: 
         str_replace:
            template: |
              #!/bin/bash
              echo "Customising system image..."
              # For troubleshooting use a known password for console login
              echo "$ROOTPSWD" | passwd root --stdin
              # Servers with only private ips can allow password logins
              # (or all the ssh keys would have to be copied to all servers
              # in the private network that might want to logon to this server).
              cd /etc/ssh
              cat sshd_config | sed -e 's/PasswordAuthentication no/PasswordAuthentication yes/' > sshd_config.new
              datenow=`date +"%Y%m%d"`
              mv sshd_config sshd_config.${datenow}
              mv sshd_config.new sshd_config
              chmod 644 sshd_config
              service sshd restart
              # But we leave "PermitRootLogin no" so set the Fedora cloud
              # user password here as that is the user they ssh keys would
              # normally be used to login to, so we can use that
              echo "$USERPSWD" | passwd fedora --stdin
              #
              echo "...end of install"
              exit 0
              # ... done
            params: 
              $ROOTPSWD: { get_param: root_password }
              $USERPSWD: { get_param: user_password }

outputs:
  instance_private_ip_server1:
    description: Private IP address of server1
    value: { get_attr: [my_server1, networks, {get_param: net}, 0] }
  instance_public_ip_server1:
    description: Public IP address of server1
    value: { get_attr: [my_server1, networks, {get_param: net}, 1] }
  instance_private_ip_server2:
    description: Private IP address of server2
    value: { get_attr: [my_server2, networks, {get_param: net}, 0] }
  instance_keypair:
    description: SSH Key-Pair to be used to access the instances
    value: { get_param: key_name }
  instance_rootpw:
    description: Root password for both servers at instance creation
    value: { get_param: root_password }
  instance_userpw:
    description: User fedora password for servers with only private ips
    value: { get_param: user_password }

Top tip: for testing use the --dry-run option (ie: openstack stack create --dry-run --template test_two_servers.yaml mark_test_stack01), which will (if the template passes validation) produce a detailed list of all the settings available should you want to tweak it further.
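
For example, to see what the stack would look like with a different root password and flavour without creating anything (the parameter values here are just examples that satisfy the constraints in the template):

openstack stack create --dry-run \
    --parameter root_password=Secret12 \
    --parameter instance_type=m1.tiny \
    --template test_two_servers.yaml mark_test_stack01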

Posted in Home Life | Comments Off on OpenStack – Playing with heat patterns

qemu qcow2 converting to old format

This post is mainly to remind me how to do it without having to bring up the qemu-img help display all the time.

In one of the later releases of Fedora (24 I think) the qcow2 image format changed. The only reason that was an issue for me (or that I even noticed) was that the work-provided laptop runs the company standard, which is a customised RedHat 6, and all of a sudden disk images I built at home were unusable on RedHat 6.

So there is an occasional need to convert disk images to the older format; this is how.

[root@vmhost3 kvm]# qemu-img info fc23-x64-cloud-3G-ext4.qcow2 
image: fc23-x64-cloud-3G-ext4.qcow2
file format: qcow2
virtual size: 3.0G (3221225472 bytes)
disk size: 1.4G
cluster_size: 65536
Format specific information:
    compat: 1.1
    lazy refcounts: false
    refcount bits: 16
    corrupt: false

[root@vmhost3 kvm]# qemu-img convert \
  -f qcow2 fc23-x64-cloud-3G-ext4.qcow2 \
  -O qcow2 -o compat=0.10 fc23-x64-cloud-3G-ext4-oldformat.qcow2

[root@vmhost3 kvm]# qemu-img info fc23-x64-cloud-3G-ext4-oldformat.qcow2
image: fc23-x64-cloud-3G-ext4-oldformat.qcow2
file format: qcow2
virtual size: 3.0G (3221225472 bytes)
disk size: 1.3G
cluster_size: 65536
Format specific information:
    compat: 0.10
    refcount bits: 16

Obviously it can only be run on an OS that understands the new format, as the option to convert between formats does not exist on machines that do not support the newer format.

Posted in Virtual Machines | Comments Off on qemu qcow2 converting to old format

F24 to F25 upgrade notes

Fedora 25 released, most things seem to work apart from…

My target environment was my main test machine, intel x64 with 8 cores and 32Gb of memory and LUKS encrypted disks; and a lot of KVM VMs on it. But the upgrade was to the host machine itself not to one of the guests.

Upgrade/Downgrade notes

For a change I used the recommended method to upgrade from F24 to F25, the DNF system-upgrade plugin, and for a change it worked flawlessly.

It is not possible to roll back to F24 using

dnf distro-sync --releasever=24 --allowerasing

It downloaded 1.7Gb and then failed with dependency problems on gdb and gdb-libs, plus a few other unimportant libs. I tried to remove those packages but it was a no-go, too many other things depended on them (systemd being one of them, and I have an idea that trying to remove that package would have hundreds of dependencies stopping it being removed).

So if you upgrade to F25 you cannot get back to F24 (unless you have a Clonezilla backup or similar to bare-metal restore from).

All Issues Found after two weeks use

These are the issues found running it on a server that is also a backup desktop, so it includes a GUI interface I can use.

No real ‘desktop’ applications have been tested as the machine is primarily used as a server; the only GUI application tested was the synergy client. I have not tested the synergy server component on F25 as my server is a client :-).

  1. Synergy Client (needs QT4, does Gnome now use QT5 ?)

    • distro version does not work under the new Gnome desktop (no keyboard/mouse events processed)
    • latest version from github does not compile on F25 (complains header files that do exist cannot be found)
    • tried ‘dnf groupinstall kde’; and synergy works logging on using plasma (logon prompt option for kde)
    • works if logging on using Gnome classic (logon prompt option)
    • works if logging on using Gnome using X-Org (logon prompt option)
    • So if using synergy do not use the new ‘default’ Gnome desktop
  2. Hypervisor running KVM instances (or killing the host machine)

    • virsh commands will occasionally hang, causing a GUI interface hang
    • hang can also be triggered occasionally by running virt-manager, it will freeze unable to list the virsh instances and also hang the GUI
    • when the system has hung
    • it is possible to ssh into the machine but the commands ‘shutdown -h now’, ‘reboot’ and ‘halt’ just return to the command prompt; ‘systemctl reboot’ logs a few timeout messages as it tries to chat to something and returns to the command prompt. Using the physical machine reset/power button is the only way to resolve this which is a pain if multiple VMs are running

  3. note this is intermittent but has happened randomly multiple times, at ‘virsh start’, virt-install and virt-manager commands. After the physical machine has been reset or power-cycled exactly the same commands work; so it is not repeatable on demand but it is happening a lot
  4. via the normal ‘dnf update’ I have installed all available updates (including kernel) many times over the last few weeks but the problem persists. I am sure it will eventually be sorted out, but in the meantime F25 is not production ready (as of 14Dec2017) for any server using KVM guests
  5. Not yet tested

    • hercules, that will be tested in a VM as I run it in KVM machines now
    • 99% of GUI applications. On my test machine the only apps I start under Gnome are ‘terminal’, the synergy client and occasionally virt-manager to get a local console to a VM if remote VNC console connections fail for some reason
    • custom SELinux rules. My existing rules survived the upgrade and I have no need to compile new rules at this time

Summary

I will leave F25 on my main test machine and live with the ‘freezes’. I will not install it onto any of my other machines until the ‘freezes’ stop via one of the eventual updates that are bound to occur.

All the core functions seem to work just fine.

Posted in Home Life | Comments Off on F24 to F25 upgrade notes

OpenStack woes continue

Still playing with OpenStack; Mitaka currently.

After lots of swearing, and reading, a second compute node now appears to be running correctly, using vxlan and openvswitch as the transport layer between the two. At least the hypervisor and compute node displays show the second compute node as available now, and the ovs-vsctl show commands on both servers show they are finally pointing at each other. [ Getting the openstack openvswitch and nova tasks running on the second compute node was a pain; I’m still not sure which of the many changes I made got them working (I lost count of reboots), or they may have finally pushed a patch ]

Of course to test it I will need to migrate an instance from the primary compute node to the second, as there does not appear to be a way of selecting which compute node to use when launching an instance.

Which leads on to a new problem, with the same frustrating issues in resolving it

  • the same frustrating issue with resolving it: there are lots of log files, logging lots of information, but when there is an error message it is totally out of context, making it virtually useless for diagnosing the problem
  • the new problem is that migration does not work; and yes, you guessed it, the error message has no context

The nova compute log on the main server shows

2016-12-16 16:44:51.272 2094 INFO nova.compute.manager [req-2ee445a5-18c0-4087-93df-a9869e8a82de 83c4ed3d4df24391b4b1627e0ef69fc6 0b083d9b40a74894a8d841e16e888d2b - - -] [instance: e38855c4-b6cd-48b8-9288-6f59098b920b] Successfully reverted task state from None on failure for instance.
2016-12-16 16:44:51.277 2094 ERROR oslo_messaging.rpc.dispatcher [req-2ee445a5-18c0-4087-93df-a9869e8a82de 83c4ed3d4df24391b4b1627e0ef69fc6 0b083d9b40a74894a8d841e16e888d2b - - -] Exception during message handling: Resize error: not able to execute ssh command: Unexpected error while running command.
Command: ssh 192.168.1.162 mkdir -p /var/lib/nova/instances/e38855c4-b6cd-48b8-9288-6f59098b920b
Exit code: 255
Stdout: u''
Stderr: u'Host key verification failed.\r\n'

The problem is that running the command manually works perfectly well; the SSH keys are perfectly correct! As shown below, commands via ssh just work as expected.

[root@region1server1 nova]# ssh 192.168.1.162 mkdir -p /var/lib/nova/instances/e38855c4-b6cd-48b8-9288-6f59098b920b
[root@region1server1 nova]# ssh 192.168.1.162 ls -la /var/lib/nova/instances/e38855c4-b6cd-48b8-9288-6f59098b920b
total 8
drwxr-xr-x. 2 root root 4096 Dec 16 16:46 .
drwxr-xr-x. 5 nova nova 4096 Dec 16 16:46 ..
[root@region1server1 nova]# ssh 192.168.1.162 rmdir /var/lib/nova/instances/e38855c4-b6cd-48b8-9288-6f59098b920b
[root@region1server1 nova]# 

So what caused the error? Somewhere before the error the command must have changed environment somehow; of course that is not recorded in the nova log that had the error message, perhaps in one of the many other logs, if I am really lucky.
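
One thing I still have to check, and this is purely a guess on my part rather than anything the logs say: the resize code may be running that ssh as the nova service user rather than as root, in which case root’s known_hosts would not help at all. Something like this, run as root on the controller, would show whether the nova user itself gets past host key verification:

# run the same ssh but as the nova service user (a guess at the cause, not confirmed)
su -s /bin/bash nova -c "ssh 192.168.1.162 true"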

When OpenStack is working it’s great. The smallest hiccup requires months of trawling through log files and configuration files looking for a needle in a haystack.

Anyway, I will play with this in my spare time over the next few months as I would like to get migration (and ‘live migration’, although that’s not recommended) working.
It is a low priority for me as I am still using native KVM for real guest workloads so do not need this yet; and it is pointless in a home lab as I don’t have the hardware to make the openstack region and network servers highly available as well. It started as a curiosity thing; now I just don’t want to let it beat me; but there is no hurry, I will probably be on the next release of openstack before I get there.

Posted in Virtual Machines | Comments Off on OpenStack woes continue

Still trying to get the latest RDO openstack working

Actually, it is probably not the latest release now, I have been working on this for a while.

This post will be updated/deleted/re-added a lot as I work through it. Maybe a final post on what I did wrong for others to avoid when I figure it out.

I installed the RDO release using packstack (not a default install; I generated and heavily edited the answers file) and made a lot of tweaks to the config as recommended by a jolly old ‘here is how it works’ tutorial in training videos on Safari online.

Lots of tweaks for vxlan and the ML2 configuration; nova networking disabled and only neutron used, with no local or flat networking, as I want to figure out vxlan and openvswitch.

Initial problem was instances unable to contact metadata service

That issue was… if, when launching an instance, you allocate both an internal and an external network to the instance it just cannot contact the metadata service. If you only assign an internal tenant network it can contact the metadata service. The problem was repeatable with both the cirros test image and the F23 cloud image.
Note: I had enabled auto-allocation so an external ip was assigned to the instances tested if the external network was assigned to the instance.

So launching a new F23 cloud image or cirros instance with only the internal network assigned gave no problems contacting the metadata service; once it was running I associated a floating ip with the instance, shut it down and restarted it, and it was still able to contact the metadata service.

That is still possibly an issue, as the auto-allocation of an ip when the external net was also attached at creation time worked correctly. Query: does it screw up the instance network routing?
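
If I get back to it, the check would be from the console of a freshly booted instance that has both networks attached; 169.254.169.254 is the standard metadata address, so something like this from inside the instance would show whether the routing is the problem:

ip route                                                     # or: route -n
curl -s http://169.254.169.254/latest/meta-data/instance-id ; echo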

Anyway, never assigning the external network but just using floating ips still works.

Lack of external connectivity using vxlan

That was user (my) error. A typo in one of the config files somewhere set the br-tun vxlan port remote_ip to 176… instead of 172…, which is used everywhere else, like on the actual bridge for the interface.
ovs-vsctl does not have an “update” function, so I deleted and re-added the port to fix that.

[root@region1server1 ~]# ovs-vsctl del-port vxlan-b01000ac
[root@region1server1 ~]# ovs-vsctl add-port br-tun vxlan-b01000ac -- set Interface vxlan-b01000ac type=vxlan options:{df_default="true",in_key=flow,local_ip="172.16.0.172",out_key=flow,remote_ip="172.6.0.172"}

That allowed the qrouter namespace to ping the external network ip-address.
However the ping test did not survive a reboot even though the config came back with my changes… so still troubleshooting here. Damn, so close.

Posted in Virtual Machines | Comments Off on Still trying to get the latest RDO openstack working