Diving into OpenStack Network Architecture - Part 2 - Basic Use Cases

Posted by Ronen Kofman on Oracle Blogs
Published on Thu, 5 Jun 2014 05:36:51 +0000


In the previous post we reviewed several network components including Open vSwitch, Network Namespaces, Linux Bridges and veth pairs. In this post we will take three simple use cases and see how those basic components come together to create a complete SDN solution in OpenStack. With those three use cases we will review almost the entire network setup and see how all the pieces work together. The use cases we will use are:

1. Create network – what happens when we create a network, and how we can create multiple isolated networks.

2. Launch a VM – once we have networks we can launch VMs and connect them to networks.

3. DHCP request from a VM – OpenStack can automatically assign IP addresses to VMs. This is done through a local DHCP service controlled by OpenStack Neutron. We will see how this service runs and what a DHCP request and response look like.

In this post we will show connectivity: how packets get from point A to point B. We first focus on what a configured deployment looks like, and only later will we discuss how and when the configuration is created. Personally, I found it very valuable to see the actual interfaces and how they connect to each other through examples and hands-on experiments. Once the end game is clear and we know how the connectivity works, a later post will take a step back and explain how Neutron configures the components to provide such connectivity.

We are going to get pretty technical shortly and I recommend trying these examples on your own deployment or using the Oracle OpenStack Tech Preview. Understanding these three use cases thoroughly and how to look at them will be very helpful when trying to debug a deployment in case something does not work.

Use case #1: Create Network

Creating a network is a simple operation that can be performed from the GUI or the command line. A network in OpenStack is available only to the tenant who created it, unless it is defined as “shared”, in which case all tenants can use it. A network can have multiple subnets, but for simplicity we will assume in this demonstration that each network has exactly one subnet. Creating a network from the command line looks like this:

# neutron net-create net1
Created a new network:
+---------------------------+--------------------------------------+
| Field                     | Value                                |
+---------------------------+--------------------------------------+
| admin_state_up            | True                                 |
| id                        | 5f833617-6179-4797-b7c0-7d420d84040c |
| name                      | net1                                 |
| provider:network_type     | vlan                                 |
| provider:physical_network | default                              |
| provider:segmentation_id  | 1000                                 |
| shared                    | False                                |
| status                    | ACTIVE                               |
| subnets                   |                                      |
| tenant_id                 | 9796e5145ee546508939cd49ad59d51f     |
+---------------------------+--------------------------------------+

Creating a subnet for this network will look like this:

# neutron subnet-create net1 10.10.10.0/24
Created a new subnet:
+------------------+------------------------------------------------+
| Field            | Value                                          |
+------------------+------------------------------------------------+
| allocation_pools | {"start": "10.10.10.2", "end": "10.10.10.254"} |
| cidr             | 10.10.10.0/24                                  |
| dns_nameservers  |                                                |
| enable_dhcp      | True                                           |
| gateway_ip       | 10.10.10.1                                     |
| host_routes      |                                                |
| id               | 2d7a0a58-0674-439a-ad23-d6471aaae9bc           |
| ip_version       | 4                                              |
| name             |                                                |
| network_id       | 5f833617-6179-4797-b7c0-7d420d84040c           |
| tenant_id        | 9796e5145ee546508939cd49ad59d51f               |
+------------------+------------------------------------------------+
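
The gateway and allocation pool in the subnet output can be reproduced with a few lines of Python. This is only a sketch of the arithmetic: it assumes the gateway is the first usable host address and the pool is everything after it, which matches the output above but is not taken from Neutron's code. The function name default_pool is made up for this illustration.

```python
import ipaddress


def default_pool(cidr):
    """Return (gateway, pool_start, pool_end) for an IPv4 subnet.

    Assumption for illustration: gateway = first usable host,
    pool = all remaining usable hosts.
    """
    net = ipaddress.ip_network(cidr)
    hosts = list(net.hosts())  # usable addresses, no network/broadcast
    return str(hosts[0]), str(hosts[1]), str(hosts[-1])


gateway, start, end = default_pool("10.10.10.0/24")
print(gateway, start, end)  # 10.10.10.1 10.10.10.2 10.10.10.254
```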

We now have a network and a subnet. In the network topology view it looks like this:

Now let’s dive in and see what happened under the hood. Looking at the control node we will discover that a new namespace was created:

# ip netns list

qdhcp-5f833617-6179-4797-b7c0-7d420d84040c

The name of the namespace is qdhcp-<network id> (see above). Let’s look into the namespace and see what’s in it:

# ip netns exec qdhcp-5f833617-6179-4797-b7c0-7d420d84040c ip addr
1: lo: <LOOPBACK,UP,LOWER_UP> mtu 65536 qdisc noqueue state UNKNOWN
    link/loopback 00:00:00:00:00:00 brd 00:00:00:00:00:00
    inet 127.0.0.1/8 scope host lo
    inet6 ::1/128 scope host
       valid_lft forever preferred_lft forever
12: tap26c9b807-7c: <BROADCAST,UP,LOWER_UP> mtu 1500 qdisc noqueue state UNKNOWN
    link/ether fa:16:3e:1d:5c:81 brd ff:ff:ff:ff:ff:ff
    inet 10.10.10.3/24 brd 10.10.10.255 scope global tap26c9b807-7c
    inet6 fe80::f816:3eff:fe1d:5c81/64 scope link
       valid_lft forever preferred_lft forever
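
When debugging several networks it helps to script the qdhcp-<network id> naming shown above. The sketch below builds the namespace name from a network ID, filters the output of ip netns list, and assembles the ip netns exec command line. The helper names are invented for this example, and the commands are only built as lists, not executed.

```python
def dhcp_namespace(network_id):
    """Name of the DHCP namespace for a network, as seen in `ip netns list`."""
    return "qdhcp-" + network_id


def find_dhcp_namespaces(ip_netns_list_output):
    """Pick the qdhcp-* entries out of `ip netns list` output."""
    names = []
    for line in ip_netns_list_output.splitlines():
        fields = line.split()
        if fields and fields[0].startswith("qdhcp-"):
            names.append(fields[0])
    return names


def netns_exec(namespace, *command):
    """Argument list for running a command inside a namespace."""
    return ["ip", "netns", "exec", namespace, *command]


ns = dhcp_namespace("5f833617-6179-4797-b7c0-7d420d84040c")
print(" ".join(netns_exec(ns, "ip", "addr")))
```

The resulting list can be handed to subprocess.run on the control node (as root) to reproduce the command above.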

We see two interfaces in the namespace: one is the loopback and the other is an interface called “tap26c9b807-7c”. This interface has the IP address 10.10.10.3 and it will also serve DHCP requests in a way we will see later. Let’s trace the connectivity of the “tap26c9b807-7c” interface from the namespace. The first stop is OVS, where we see that the interface connects to the bridge “br-int”:

# ovs-vsctl show
8a069c7c-ea05-4375-93e2-b9fc9e4b3ca1
    Bridge "br-eth2"
        Port "br-eth2"
            Interface "br-eth2"
                type: internal
        Port "eth2"
            Interface "eth2"
        Port "phy-br-eth2"
            Interface "phy-br-eth2"
    Bridge br-ex
        Port br-ex
            Interface br-ex
                type: internal
    Bridge br-int
        Port "int-br-eth2"
            Interface "int-br-eth2"
        Port "tap26c9b807-7c"
            tag: 1
            Interface "tap26c9b807-7c"
                type: internal
        Port br-int
            Interface br-int
                type: internal
    ovs_version: "1.11.0"
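
On a busy node the ovs-vsctl show output gets long, so a small parser can answer "which bridge is this port on, and what is its local tag?". The sketch below relies only on the Bridge, Port and tag: keywords visible in the output above; the function name parse_ovs_show is an invention for this example, not part of any OVS tooling.

```python
def parse_ovs_show(text):
    """Map port name -> {'bridge': name, 'tag': int or None}."""
    ports = {}
    bridge = port = None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("Bridge "):
            bridge = line.split(None, 1)[1].strip('"')
            port = None
        elif line.startswith("Port "):
            port = line.split(None, 1)[1].strip('"')
            ports[port] = {"bridge": bridge, "tag": None}
        elif line.startswith("tag:") and port is not None:
            ports[port]["tag"] = int(line.split(":", 1)[1])
    return ports


sample = '''
    Bridge "br-eth2"
        Port "eth2"
            Interface "eth2"
    Bridge br-int
        Port "tap26c9b807-7c"
            tag: 1
            Interface "tap26c9b807-7c"
                type: internal
'''
print(parse_ovs_show(sample)["tap26c9b807-7c"])
```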

In the output above we have a veth pair with two ends called “int-br-eth2” and “phy-br-eth2”. This veth pair connects two OVS bridges, “br-eth2” and “br-int”. In the previous post we explained how to check veth connectivity using the ethtool command. It shows that the two are indeed a pair:

# ethtool -S int-br-eth2

NIC statistics:

peer_ifindex: 10

.

# ip link

.

10: phy-br-eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 qdisc pfifo_fast state UP qlen 1000

.
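
The same check can be scripted: take peer_ifindex from the ethtool -S output and look that index up in the ip link output. This is a rough sketch that parses only the two line shapes shown above; the function names are made up for illustration.

```python
import re


def peer_ifindex(ethtool_output):
    """Extract peer_ifindex from `ethtool -S <veth>` output, or None."""
    match = re.search(r"peer_ifindex:\s*(\d+)", ethtool_output)
    return int(match.group(1)) if match else None


def ifindex_to_name(ip_link_output):
    """Map interface index -> name from `ip link` output."""
    names = {}
    for line in ip_link_output.splitlines():
        match = re.match(r"\s*(\d+):\s+([^:@\s]+)", line)
        if match:
            names[int(match.group(1))] = match.group(2)
    return names


ethtool_out = "NIC statistics:\n     peer_ifindex: 10\n"
ip_link_out = (
    "10: phy-br-eth2: <BROADCAST,MULTICAST,UP,LOWER_UP> mtu 1500 "
    "qdisc pfifo_fast state UP qlen 1000\n"
)
print(ifindex_to_name(ip_link_out)[peer_ifindex(ethtool_out)])  # phy-br-eth2
```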

Note that “phy-br-eth2” is connected to a bridge called “br-eth2”, and one of this bridge’s interfaces is the physical link eth2. This means that creating the network has created a namespace which is connected to the physical interface eth2. eth2 is the “VM network”, the physical interface to which all the virtual machines connect.

About network isolation:

OpenStack supports the creation of multiple isolated networks and can use several mechanisms to isolate them from one another. The isolation mechanism can be VLANs, VXLANs or GRE tunnels. This is configured as part of the initial setup; in our deployment we use VLANs. When VLAN tagging is the isolation mechanism, Neutron allocates a VLAN tag from a pre-defined pool of VLAN tags and assigns it to the newly created network. By provisioning VLAN tags to the networks, Neutron allows multiple isolated networks to be created on the same physical link. The big difference between this and other platforms is that the user does not have to deal with allocating and managing VLANs for networks. VLAN allocation and provisioning is handled by Neutron, which keeps track of the VLAN tags and is responsible for allocating and reclaiming them. In the example above net1 has the VLAN tag 1000, which means that whenever a VM is created and connected to this network, the packets from that VM have to be tagged with VLAN tag 1000 to go on this particular network. This is true for a namespace as well: if we would like to connect a namespace to a particular network, we have to make sure that the packets to and from the namespace are correctly tagged when they reach the VM network.

In the example above we see that the namespace interface “tap26c9b807-7c” has VLAN tag 1 assigned to it. If we examine OVS we see that it has flows which modify VLAN tag 1 to VLAN tag 1000 when a packet goes to the VM network on eth2, and vice versa. We can see this using the dump-flows command on OVS. For packets going to the VM network, the modification is done on br-eth2:

# ovs-ofctl dump-flows br-eth2

NXST_FLOW reply (xid=0x4):

cookie=0x0, duration=18669.401s, table=0, n_packets=857, n_bytes=163350, idle_age=25, priority=4,in_port=2,dl_vlan=1 actions=mod_vlan_vid:1000,NORMAL

cookie=0x0, duration=165108.226s, table=0, n_packets=14, n_bytes=1000, idle_age=5343, hard_age=65534, priority=2,in_port=2 actions=drop

cookie=0x0, duration=165109.813s, table=0, n_packets=1671, n_bytes=213304, idle_age=25, hard_age=65534, priority=1 actions=NORMAL

For packets coming from the VM network to the namespace we see the following modification on br-int:

# ovs-ofctl dump-flows br-int

NXST_FLOW reply (xid=0x4):

cookie=0x0, duration=18690.876s, table=0, n_packets=1610, n_bytes=210752, idle_age=1, priority=3,in_port=1,dl_vlan=1000 actions=mod_vlan_vid:1,NORMAL

cookie=0x0, duration=165130.01s, table=0, n_packets=75, n_bytes=3686, idle_age=4212, hard_age=65534, priority=2,in_port=1 actions=drop

cookie=0x0, duration=165131.96s, table=0, n_packets=863, n_bytes=160727, idle_age=1, hard_age=65534, priority=1 actions=NORMAL
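
To check the translation on a node with many networks, the VLAN rewrite rules can be pulled out of the dump-flows output with a regular expression. The sketch below only recognizes flows of the exact shape shown above (a dl_vlan match followed by a mod_vlan_vid action); the name vlan_rewrites is invented for this example.

```python
import re

# Matches "...,dl_vlan=1 actions=mod_vlan_vid:1000,NORMAL"
FLOW_RE = re.compile(r"dl_vlan=(\d+)\s+actions=mod_vlan_vid:(\d+)")


def vlan_rewrites(dump_flows_output):
    """Return a list of (matched_vlan, rewritten_vlan) pairs."""
    return [(int(a), int(b)) for a, b in FLOW_RE.findall(dump_flows_output)]


br_eth2 = (
    "cookie=0x0, table=0, priority=4,in_port=2,dl_vlan=1 "
    "actions=mod_vlan_vid:1000,NORMAL\n"
    "cookie=0x0, table=0, priority=2,in_port=2 actions=drop\n"
)
br_int = (
    "cookie=0x0, table=0, priority=3,in_port=1,dl_vlan=1000 "
    "actions=mod_vlan_vid:1,NORMAL\n"
)
print(vlan_rewrites(br_eth2))  # [(1, 1000)]
print(vlan_rewrites(br_int))   # [(1000, 1)]
```

If the two lists are not mirror images of each other, traffic is being retagged in one direction only, which is a good place to start looking.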

To summarize, when a user creates a network Neutron creates a namespace, and this namespace is connected through OVS to the “VM network”. OVS also takes care of tagging the packets from the namespace to the VM network with the correct VLAN tag, and modifies the VLAN tag for packets coming from the VM network to the namespace. Now let’s see what happens when a VM is launched and how it is connected to the “VM network”.

Use case #2: Launch a VM

Launching a VM can be done from Horizon or from the command line. This is how we do it from Horizon:

Attach the network:

And Launch

Once the virtual machine is up and running we can see the associated IP using the nova list command:

# nova list

+--------------------------------------+--------------+--------+------------+-------------+-----------------+

+--------------------------------------+--------------+--------+------------+-------------+-----------------+

+--------------------------------------+--------------+--------+------------+-------------+-----------------+

The nova list command shows us that the VM is running and that the IP 10.10.10.2 is assigned to this VM. Let’s trace the connectivity from the VM to the VM network on eth2, starting with the VM definition file. The configuration files of the VM, including the virtual disk(s) in the case of ephemeral storage, are stored on the compute node at /var/lib/nova/instances/<instance-id>/. Looking into the VM definition file, libvirt.xml, we see that the VM is connected to an interface called “tap53903a95-82” which is connected to a Linux bridge called “qbr53903a95-82”:

</interface>

Looking at the bridge using the brctl show command we see this:

# brctl show
bridge name     bridge id               STP enabled     interfaces
qbr53903a95-82  8000.7e7f3282b836       no              qvb53903a95-82
                                                        tap53903a95-82

The bridge has two interfaces, one connected to the VM (“tap53903a95-82”) and another one (“qvb53903a95-82”) connected to the “br-int” bridge on OVS:

# ovs-vsctl show
83c42f80-77e9-46c8-8560-7697d76de51c
    Bridge "br-eth2"
        Port "br-eth2"
            Interface "br-eth2"
                type: internal
        Port "eth2"
            Interface "eth2"
        Port "phy-br-eth2"
            Interface "phy-br-eth2"
    Bridge br-int
        Port br-int
            Interface br-int
                type: internal
        Port "int-br-eth2"
            Interface "int-br-eth2"
        Port "qvo53903a95-82"
            tag: 3
            Interface "qvo53903a95-82"
    ovs_version: "1.11.0"

As we showed earlier, “br-int” is connected to “br-eth2” on OVS using the veth pair int-br-eth2/phy-br-eth2, and br-eth2 is connected to the physical interface eth2. The whole flow end to end looks like this:

VM → tap53903a95-82 (virtual interface) → qbr53903a95-82 (Linux bridge) → qvb53903a95-82 (interface connected from Linux bridge to OVS bridge br-int) → int-br-eth2 (veth one end) → phy-br-eth2 (veth the other end) → eth2 physical interface.
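
All the per-VM devices in this chain share the suffix 53903a95-82. A minimal sketch of that naming, assuming the suffix is the first 11 characters of the VM's Neutron port ID: the port ID used below is hypothetical (only its first 11 characters come from this post), and the helper name is invented for illustration.

```python
def vif_device_names(port_id):
    """Device names for one VM port, assuming suffix = first 11 chars of the port ID."""
    suffix = port_id[:11]
    return {
        "tap": "tap" + suffix,  # VM's virtual interface
        "qbr": "qbr" + suffix,  # Linux bridge
        "qvb": "qvb" + suffix,  # Linux bridge side of the link to OVS
        "qvo": "qvo" + suffix,  # port on br-int
    }


# Hypothetical port ID; only "53903a95-82" is taken from the output above.
names = vif_device_names("53903a95-8200-0000-0000-000000000000")
print(names["qbr"], names["qvo"])  # qbr53903a95-82 qvo53903a95-82
```

Given a port ID, this makes it easy to grep for the right devices in the brctl show and ovs-vsctl show output on a compute node that hosts many VMs.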

The purpose of the Linux bridge connecting to the VM is to allow security group enforcement with iptables. Security groups are enforced at the edge point, which is the interface of the VM. Since iptables cannot be applied to OVS bridges, we use a Linux bridge to apply the rules. In the future we hope to see this Linux bridge going away.

VLAN tags: As we discussed in the first use case, net1 is using VLAN tag 1000. Looking at OVS above, we see that qvo53903a95-82 is tagged with VLAN tag 3. The modification from VLAN tag 3 to 1000 as we go to the physical network is done by OVS as part of the packet flow of br-eth2, in the same way we showed before.

To summarize, when a VM is launched it is connected to the VM network through the chain of elements described here. As a packet travels from the VM to the network and back, its VLAN tag is modified.

Use case #3: Serving a DHCP request coming from the virtual machine

In the previous use cases we have shown that both the namespace called qdhcp-<network id> and the VM end up connecting to the physical interface eth2 on their respective nodes, and both will tag their packets with VLAN tag 1000. We saw that the namespace has an interface with the IP 10.10.10.3. Since the VM and the namespace are connected to each other and have interfaces on the same subnet, they can ping each other. In this picture we see a ping from the VM, which was assigned 10.10.10.2, to the namespace:

The fact that they are connected and can ping each other can come in very handy when something doesn’t work right and we need to isolate the problem. In such a case, knowing that we should be able to ping from the VM to the namespace and back can be used to trace the disconnect using tcpdump or other monitoring tools.

To serve DHCP requests coming from VMs on the network, Neutron uses a Linux tool called “dnsmasq”, a lightweight DNS and DHCP service. If we look at dnsmasq on the control node with the ps command we see this:

dnsmasq --no-hosts --no-resolv --strict-order --bind-interfaces --interface=tap26c9b807-7c --except-interface=lo --pid-file=/var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/pid --dhcp-hostsfile=/var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/host --dhcp-optsfile=/var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/opts --leasefile-ro --dhcp-range=tag0,10.10.10.0,static,120s --dhcp-lease-max=256 --conf-file= --domain=openstacklocal
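
The command line is long, so it can help to split it into options before reading it. The sketch below is a generic "--key=value" splitter, not a dnsmasq-aware parser; the function name is invented and the sample string is a shortened copy of the line above.

```python
import shlex


def dnsmasq_options(command_line):
    """Map option name -> list of values for the --key[=value] arguments."""
    options = {}
    for arg in shlex.split(command_line)[1:]:  # skip the program name
        if arg.startswith("--"):
            key, _, value = arg[2:].partition("=")
            options.setdefault(key, []).append(value)
    return options


cmd = (
    "dnsmasq --no-hosts --bind-interfaces --interface=tap26c9b807-7c "
    "--dhcp-range=tag0,10.10.10.0,static,120s --conf-file= "
    "--domain=openstacklocal"
)
opts = dnsmasq_options(cmd)
print(opts["interface"], opts["dhcp-range"])
```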

The service connects to the tap interface in the namespace (“--interface=tap26c9b807-7c”). If we look at the hosts file we see this:

# cat /var/lib/neutron/dhcp/5f833617-6179-4797-b7c0-7d420d84040c/host

fa:16:3e:fe:c7:87,host-10-10-10-2.openstacklocal,10.10.10.2

If you look at the console output above you can see the MAC address fa:16:3e:fe:c7:87, which is the VM’s MAC. This MAC address is mapped to IP 10.10.10.2, so when a DHCP request comes from this MAC dnsmasq will return 10.10.10.2. If we look into the namespace at the time we initiate a DHCP request from the VM (this can be done by simply restarting the network service in the VM) we see the following:

# ip netns exec qdhcp-5f833617-6179-4797-b7c0-7d420d84040c tcpdump -n

19:27:12.191280 IP 0.0.0.0.bootpc > 255.255.255.255.bootps: BOOTP/DHCP, Request from fa:16:3e:fe:c7:87, length 310

19:27:12.191666 IP 10.10.10.3.bootps > 10.10.10.2.bootpc: BOOTP/DHCP, Reply, length 325

To summarize, the DHCP service is handled by dnsmasq, which is configured by Neutron to listen on the interface in the DHCP namespace. Neutron also configures dnsmasq with the combination of MAC and IP, so when a DHCP request comes along the VM will receive its assigned IP.
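
The MAC-to-IP mapping in the host file is easy to inspect programmatically. The sketch below parses lines of the "MAC,hostname,IP" form shown earlier and rebuilds the host-10-10-10-2.openstacklocal style of name. Both helper names are made up, and the hostname pattern is inferred from the single entry in this post rather than from Neutron's code.

```python
def parse_dhcp_hosts(text):
    """Map MAC address -> (hostname, ip) from a dnsmasq host file."""
    entries = {}
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        mac, hostname, ip = line.split(",")[:3]
        entries[mac.lower()] = (hostname, ip)
    return entries


def default_hostname(ip, domain="openstacklocal"):
    """Hostname in the style seen above, e.g. host-10-10-10-2.openstacklocal."""
    return "host-" + ip.replace(".", "-") + "." + domain


hosts = parse_dhcp_hosts(
    "fa:16:3e:fe:c7:87,host-10-10-10-2.openstacklocal,10.10.10.2\n"
)
print(hosts["fa:16:3e:fe:c7:87"])
```

If a VM does not get its address, checking that its MAC appears here with the expected IP quickly separates a configuration problem from a connectivity problem.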

Summary

In this post we relied on the components described in the previous post and saw how network connectivity is achieved using three simple use cases. These use cases gave a good view of the entire network stack and helped us understand how an end-to-end connection is made between a VM on a compute node and the DHCP namespace on the control node. One conclusion we can draw from what we saw here is that if we launch a VM and it is able to perform a DHCP request and receive a correct IP, then there is reason to believe that the network is working as expected. We saw that a packet has to travel through a long list of components before reaching its destination, and if it has done so successfully this means that many components are functioning properly.

In the next post we will look at some more sophisticated services Neutron supports and see how they work. We will see that while there are some more components involved for the most part the concepts are the same.

@RonenKofman
