Adventures in OpenStack: Intro to OpenFlow, and Network Namespaces

I’m digging into the backlog today! I’ve had these thoughts jotted down since trying to solve a problem on another OpenStack all-in-one box a few weeks ago, and I’m glad to finally get it finished. So without further ado, let’s jump in!

The Questions

I have already covered Open vSwitch and OpenStack networking in the following two articles:

There have been some unanswered questions for me however:

Why VLAN trunking anomalies seem to be present on patch ports

If one looks at the output of ovs-vsctl show, some confusion may ensue. For example, there are several VLAN tags there, but if all of them are trunked across (as is the behaviour of a patch port), which VLAN wins? Do any? How is this even working?

    Bridge br-int
        fail_mode: secure
        Port "foo"
            tag: 4
            Interface "foo"
                type: internal
        Port "bar"
            tag: 3
            Interface "bar"
                type: internal
        Port "jack"
            tag: 1
            Interface "jack"
        Port "jill"
            tag: 2
            Interface "jill"
                type: internal
        Port br-int
            Interface br-int
                type: internal
        Port "int-br-ex"
            Interface "int-br-ex"
                type: patch
                options: {peer="phy-br-ex"}
    Bridge br-ex
        Port "enp2s0"
            Interface "enp2s0"
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}
        Port br-ex
            Interface br-ex
                type: internal
    ovs_version: "2.1.3"

How does OpenStack handle several layer 3 networks over the same router

My other question was – observing that OpenStack does not create any sort of VM for routing or what not – how does routing even work? I mean, managing ultimately what could be thousands of tenant networks and possibly dozens or even hundreds of external networks can get pretty messy, I would imagine.

The answers were pretty clear, once I dug a bit deeper.

Open vSwitch, and Integration to External Bridge Mapping

The OpenStack integration bridge will maps to two kinds of bridges, depending on where in the architecture is looked at:

  • The external bridge (as shown above) – this is generally done on network nodes and my all-in-one setup
  • The tunnel bridge (not shown above to save space) – this is done on regular compute nodes, for example

This is specifically denoted by the two patch ports in each bridge:

        # br-int
        Port "int-br-ex"
            Interface "int-br-ex"
                type: patch
                options: {peer="phy-br-ex"}
        # br-ex
        Port phy-br-ex
            Interface phy-br-ex
                type: patch
                options: {peer=int-br-ex}

As mentioned, all VLANs are passed over a bridge. Think of it as a trunk port on a physical switch that is set to match all VLANs. The vSwitch layer of OVS does not perform any sort of selective VLAN mapping.

So if all VLANs going over this port are tagged, then how do we make sense of what we see in the external bridge, which has no tags at all? All ports are either untagged or are trunks, so just looking at this at face value, it would seem that like a bad configuration.

Not necessarily.

OpenFlow Magic on External Bridge

The switch layer is only half the story when deal with Open vSwitch. The second part is what happens with OpenFlow on the external bridge:

# ovs-ofctl dump-flows br-ex
NXST_FLOW reply (xid=0x4):
 cookie=0x0, duration=3776.846s, table=0, n_packets=5106, n_bytes=1142456, idle_age=0, priority=1 actions=NORMAL
 cookie=0x0, duration=3654.201s, table=0, n_packets=0, n_bytes=0, idle_age=3654, priority=4,in_port=2,dl_vlan=1 actions=strip_vlan,NORMAL
 cookie=0x0, duration=3776.341s, table=0, n_packets=132, n_bytes=10608, idle_age=3703, priority=2,in_port=2 actions=drop

The second rule is the specific one we want to pay attention to. This rule contains the strip_vlan action, which actually removes any tags outgoing on this port, matching off VLAN 1. So any traffic coming into port 2 on the external bridge (basically the peer port to the integration bridge) off of VLAN 1 (which one would assume is the external network), will have its VLAN stripped before being forwarded.

And hence, mystery solved! Now moving on to the other issue – routing.

Network Namespaces

As previously mentioned, one would imagine that networking would get pretty messy when implementing the routing of several several tenant networks over a single router – consider the amount of networks, interfaces, and routes (including default routes) that these nodes would have to manage, and the head may spin pretty quickly.

So how to manage all of these routes in a sane fashion? Enter network namespaces.

Network namespaces are a fairly recent addition to the Linux kernel. Introduced in version 2.6.24, I have found the easiest way to think about the feature is to think about it in the context of the work that has been done on containers in the last few years (to support things like LXC, CoreOS, and Docker). Each network namespace is its own individual pseudo-container, an island of networking, pretty much its own individual virtual router.

These map to OpenStack pretty visibly. For example:

# neutron router-list -F id -F name
+--------------------------------------+---------------------------+
| id                                   | name                      |
+--------------------------------------+---------------------------+
| f44651e2-0aab-435b-ad11-7ad4255825c7 | r.lab.vcts.local          |
+--------------------------------------+---------------------------+

Above is the router ID for my current lab network. Perhaps, in the name of good convention, this has a matching namespace?

# ip netns show | grep f44651e2-0aab-435b-ad11-7ad4255825c7
qrouter-f44651e2-0aab-435b-ad11-7ad4255825c7

Why yes, yes it does!

Now, there are tons of things that can be done within a network namespace, but I’m not going to cover them all, as they are not necessarily relevant within the context of a fully working OpenStack implementation, as everything is already going to be set up.

One of the best ways to troubleshoot a namespace is to enter it using ip netns exec. Note that this is not a fully separate container. Instead, commands are just executed within the context of that specific network namespace, the idea being that commands can be run that are not necessarily namespace aware.

Commands can be ran individually, but it may just be easier to run a shell within the target context, like so:

# ip netns exec qrouter-f44651e2-0aab-435b-ad11-7ad4255825c7 /bin/bash
# ip route show
default via 192.168.0.1 dev qg-0c4c9d04-f0 
172.16.0.0/24 dev qr-da3efe6d-a2  proto kernel  scope link  src 172.16.0.1 
192.168.0.0/24 dev qg-0c4c9d04-f0  proto kernel  scope link  src 192.168.0.99 

When the above is looked at, some pieces may start fitting together. And even though I haven’t covered it here, it will make sense from the above: There is the internal interface qr-da3efe6d-a2, which has the internal network 172.16.0.0/24. The external interface has been bound thru OpenStack controls to 192.168.0.99/24 on qg-0c4c9d04-f0, which then allows general outbound through the default route, and 1-1 nat for floating IP addresses.

Plenty of other commands can be run within this bash shell to get useful information, such as ip addr, ifconfig, and iptables, to get information on which IP addresses are bound to the router and how the firewall and NAT is set up.

Additional Reading

Hopefully the above gives you lots of insight into how networking works on OpenStack. For further reading, check out Networking in too Much Detail, the page that served as a starting point for a lot of this research. LWN.net also has a pretty awesome article explaining namespaces here.

Adventures in OpenStack – Nova Limits

This morning I was setting up some test instances on my home OpenStack deployment and ran into some issues. The first instance launched okay, but the second failed, telling me that no suitable host was found.

I have not really oversubscribed, or so I thought. Turns out I was oversubscribed on both RAM and Disk.

OpenStack has defaults surrounding this:
* Disk is not oversubscribed by default. There is a 1-1 subscription per host. In a production setup, this should never be changed – an instance launching on a host that could run out of space and take down all the instances on that node is definitely a bad thing.
* Memory is oversubscribed 1.5 times.
* CPU is oversubscribed 16 times (cores).

Disk and memory allocation are not so much as critical on standalone setups. These will inevitably need changing as more instances are stood up as OpenStack still counts shut down instances towards the limit, even for RAM and CPU.

Changing the defaults

The defaults be changed: just edit your Nova configuration (ie: /etc/nova/nova.conf) and look for:
* disk_allocation_ratio for disk
* ram_allocation_ratio for Memory
* cpu_allocation_ratio for CPU.

Change these limits as you see fit and then just restart Nova (ie: openstack-service restart nova).

Reference: http://docs.openstack.org/developer/nova/devref/filter_scheduler.html

Adventures in OpenStack – Neutron Setup Errata and Multiple External Networks

So, a bit of a confession. The article Adventures in OpenStack – Networking has a an issue in how to set up external networks. The section that mentioned what to enter in the [ovs] section actually does nothing.

Why? Because by default, the Layer 3 (L3) agent only allows one external network, hardcoded in the config and defaulted to br-ex:

# Name of bridge used for external network traffic. This should be set to
# empty value for the linux bridge. when this parameter is set, each L3 agent
# can be associated with no more than one external network.
# external_network_bridge = br-ex

As such, the first network flagged as “external”, as long as the name is available as a valid network name in the ML2 plugin config, will function and be mapped to the external bridge listed above.

This came up for me when I tried to add a second external network. Before a few versions of OpenStack ago, this needed to be done by running a second Layer 3 agent. But now with the ML2 plugin it is possible to run them out of the same agent.

The right way to do this is:

  • Make sure that have all the external networks listed in the ML2 config. If this is the flat network type, it would look like so:
[ml2_type_flat]
flat_networks = external,external2
  • Then, head over to l3_agent.ini and ensure that you have cleared the following as blank: 1) the external gateway ID, and the external network bridge name:
gateway_external_network_id =
external_network_bridge =

I make sure these are explicitly defined (uncommented and set correctly), to avoid issues.

  • Finally, add the OVS bits to the right config which should be in /etc/neutron/plugins/openvswitch/ovs_neutron_plugin.ini:
network_vlan_ranges = external,external2
bridge_mappings = external:br-ex,external2:br-ex2

Even though network_vlan_ranges might seem redundant or misleading for flat networks, the definition is still required.

You are now ready to restart Neutron. There is a shortcut for this, by the way:

openstack-service restart neutron

One Last Thing

If you are having issues with your existing external networks after performing the above, check for this in your Neutron logs:

openvswitch-agent.log:2015-02-22 18:27:39.369 22377 WARNING neutron.plugins.openvswitch.agent.ovs_neutron_agent [req-2fb3e41a-f068-4737-97f8-580af5ddad27 None] Device 598f6434-06ff-4280-af5c-6ce7c312ca5b not defined on plugin
server.log:2015-02-22 18:27:37.393 22455 WARNING neutron.plugins.ml2.rpc [req-2fb3e41a-f068-4737-97f8-580af5ddad27 None] Device 598f6434-06ff-4280-af5c-6ce7c312ca5b requested by agent ovs-agent-beefhost.vcts.local on network d8f7930b-d4e9-422d-b789-7c2866ab81e8 not bound, vif_type: binding_failed

If you see this, you may need to re-create your external networks. There may be another way to fix this, seeing as it may have to do with a bad external network/bridge association, but if you are in a (small enough) lab, it should be trivial to just tear down and rebuild the network and router gateway bindings (only the external networks, subnets, and gateway associations need to be fixed).

Adventures in OpenStack – On-Disk Anatomy of an Instance

When an instance is set up and started on OpenStack, the base image will be copied over and stored the local compute node from Glance. Here is what it looks like on disk, from a test instance of mine:

/var/lib/nova/instances/646d10cb-a5f5-4669-a155-cc78b5d59406
# ls -al 
total 65628
drwxr-xr-x. 2 nova nova       69 Feb 12 23:30 .
drwxr-xr-x. 5 nova nova       93 Feb 12 23:30 ..
-rw-rw----. 1 root root    14796 Feb 13 00:17 console.log
-rw-r--r--. 1 root root 67174400 Feb 13 00:17 disk
-rw-r--r--. 1 nova nova       79 Feb 12 23:29 disk.info
-rw-r--r--. 1 nova nova     2629 Feb 12 23:43 libvirt.xml

And checking the disk file, we see the following:

# qemu-img info disk
image: disk
file format: qcow2
virtual size: 5.0G (5368709120 bytes)
disk size: 64M
cluster_size: 65536
backing file: /var/lib/nova/instances/_base/acc45e1bf0ad2336505dcad48418ce2564b701c4
Format specific information:
    compat: 1.1
    lazy refcounts: false

Also, while the user instance disk is qcow2, the backing file is *not*:

image: /var/lib/nova/instances/_base/acc45e1bf0ad2336505dcad48418ce2564b701c4
file format: raw
virtual size: 2.0G (2147483648 bytes)
disk size: 1.1G

Remember that instance storage for Nova needs to be engineered properly: /var/lib/nova/instances could be local, but should be shared or replicated storage to ensure proper availability levels.

packstack – Cinder and Nagios Options, and Advanced Operation

After getting a grasp on all of the concepts that I have discussed regarding OpenStack, I have decided to give my dev server a re-deploy, namely to set up LVM for proper use with Cinder.

I noticed a few things that should probably be noted when using packstack to configure an OpenStack host.

First off, packstack is not necessarily just for setting up lab environments. By running packstack –help, the program will output a plethora of options that can be used for controlling how packstack runs. This can be used to limit deployment to just a few options, so that, for example, it only deploys a compute node, or a storage node, etc. It also allows for answer file use. With fine-tuning of the options, packstack can be used to configure entire clouds, or at the very least more sophisticated multi-server labs.

Another thing to note is that there are several options that are set by default that may not be desired. For example, ultimately my second run at at an OpenStack all-in-one is looking like this:

packstack --allinone --provision-all-in-one-ovs-bridge=n --cinder-volumes-create=n --nagios-install=n

This excludes the following:

  • The Open vSwitch bridge setup. This is because I want to re-configure the Open vSwitch bridges again as per this article.
  • Creation of the test Cinder volume group – I already have one created this time around that I want to use with Cinder. This is named cinder-volumes as per the default volume group that Cinder looks for, and is also the volume group that packstack will create with raw file on the file system, which is not suitable for production use. If you have this volume group set up already and do not select this flag, packstack will ultimately fail.
  • Disabling the installation of a full Nagios monitoring suite on the host, as I plan to set up monitoring later – and not with Nagios, mind you!

Remember that you can check out a brief rundown on how to install and use packstack at the quickstart page on RDO.

Adventures in OpenStack – Launching an Instance, and Attaching a Floating IP

Now that I have all the basic building blocks set up, I can proceed to setting up an instance.

Incidentally, packstack set most of this up already, but I actually ripped out most of the networking stuff created by packstack, and also removed the demo tenant and set all that stuff up from scratch. I’m glad I did, as I have a lot better understanding as to what is going on with that part now. Understanding of Cinder and Swift is not necessarily nearly as important on a single server cloud as far as I’m concerned, and can be looked into later.

In any case, time to get started.

Make sure that the correct tenant is selected before doing any of this – admin is not the correct one! I may address this in a later article, but you either want to make sure that admin has correct permission for the project, or log in as the project’s admin user.

Adding a Key Pair

Instances cannot be logged into over SSH if there is no key pair.

Go to Projects -> Compute -> Access and Security. Under Key Pairs, a key pair can either be created, or imported. Importing brings up this window which you can put in your own OpenSSH public key:

OpenStack - Import Key Pair

Creating a key pair will create a key pair, and then automatically download it.

Adding SSH Access Rules

Under Projects -> Compute -> Access and Security -> Security Groups, a firewall rule needs to be added for SSH. Click Add Rule after selecting the default security group and enter the following:

OpenStack - Security Group Rule

This allows SSH globally. Rules can also be made to link security groups, allowing you to create more sophisticated access schemes across different networks.

Instance Creation

Time to create the instance.

Head to Project -> Compute -> Instances and click Launch Instance:

OpenStack - Create Instance 1

OpenStack - Create Instance 2

OpenStack - Create Instance 3

Note the options. I am using preconfigured flavours with the Debian Jessie image I uploaded. I am also using a test key pair, and the network I created.

Post-Creation gives you a place to add any post-installation scripts and what not, and Advanced Options allows you to set up custom partitioning if you want. I don’t need either of those, so I did not include them or modify them.

After clicking Launch, and waiting a bit of time, the instance should be visible:

OpenStack - Running Instance

I can now proceed to attaching a floating IP, so that the instance is publicly accessible.

Adding a Floating IP Address

Click the drop down box at the far right of the instance listing and select Associate Floating IP:

OpenStack - Floating IP 1

Since I do not have any IPs allocated, I have to request one, by clicking the + (plus) button:

OpenStack - Floating IP 2

I can then use the allocation given to me:

OpenStack - Floating IP 3

And now the IP address will be visible in the instance, after clicking Associate of course:

OpenStack - Floating IP 4This instance is now ready to log into! By connecting using the SSH key I was given, I can connect to 172.16.0.13 over SSH and start working on the instance.

The Debian image’s default login at this time is debian. Most instances will NOT let you log in as root, so consult the image documentation for any specific default user that is created. Sometimes, logging in as root will give you the correct user to log in as and then disconnect you (such is the case with the Debian image).

Adventures in OpenStack – External Networks

When I left off yesterday, I just finished discussing the basics of setting up Neutron on a single-server OpenStack deployment. I set up a flat network – external – and mapped it to my external bridge – br-ex.

I am now going to discuss how to set up an external network for use by OpenStack tenants. Note that these are not usually used directly, but are taken by tenants as floating IP addresses, that are NATed from specific hosts in tenant networks. How floating IP addresses are specifically allocated will be discussed in a later article when I start to launch instances.

Planning the Network

There’s only a few things to consider here. Remember that the general best practice for OpenStack right now assumes that external traffic is sent to a router that is independent of the OpenStack deployment.

Hence, in a production setup, there are a couple of real-world scenarios:

  • IP space is fully managed and routed by the cloud administrator and realistically a full range will be available to assign to the external network, which can then be given out as floating IPs.
  • IP space is managed by a hosting provider that provides services to the cloud administrator, and a specific, probably smaller, range will be able to be assigned to the external network.

In both of these scenarios the setup is the same, the only thing that is really different is how the traffic is handled after it leaves the cloud, which is out of the scope of this article.

If a development server is being set up, on the other hand, sometimes other considerations need to be taken into account, such as any existing DHCP ranges that will affect the range that I give to OpenStack.

Assuming a network of 172.16.0.0/24:

Network address		- 172.16.0.0
Network mask		- 255.255.255.0
Broadcast address	- 172.16.0.255
Addresses in network	- 256
Network range		- 172.16.0.0 - 172.16.0.255
Usable range		- 172.16.0.1 - 172.16.0.254

Generally, the low addresses (less than or possibly equal to 10) are reserved for network devices. Say this is an existing network as well with a DHCP range of 172.16.0.100-172.16.0.199, which cannot be encroached on. I need a significant range of addresses for floating IP addresses, and router IPs for tenant networks.

Based off of this, 172.16.0.11-172.16.0.50 may be a good start.

External Network Setup in the OpenStack Web Interface

This can be done in both the CLI and also the web interface. The manuals discuss how to do it in the CLI (see here), but I will discuss the web interface, as it is perfectly capable of doing everything that the CLI can do for this task.

Head over to Admin -> System -> Networks and click Create Network;

OpenStack - External Net Creation

Make sure the admin project owns this network as it is crucial for routing. Mark the network as external, and ensure the physical network reads external as well as this is the physical network that was mapped to the external bridge in the Neutron configuration. If another flat network name was chosen for the mapping, use that name. Admin state should be up, unless you need it disabled at creation time for admin purposes.

After the network is created, I can proceed to subnet creation. Click on the newly created network and then click Create Subnet:OpenStack External Network - Subnets 1OpenStack External Networks - Subnets 2

Note that DHCP is not enabled. This is not needed for external networks and is generally kept off. Otherwise, the rest of the setup is pretty straightforward. One thing to note is how the ranges are entered: ranges are entered on separate lines, with a comma separating the first and the last IP in the range.

Name servers and host routes are generally entered on the tenant network, as those settings are added to instances.

After the subnet is created, network deployment can proceed to creation of the internal network.