In this post we will talk about how to use Vagrant, Ansible, and Cumulus Linux to build a virtual layer 2 extension across an IP fabric. We will use the standard protocol MP-BGP and its EVPN address family for the control plane, and VXLAN to provide the virtual data plane.
Each technology deserves a brief introduction before we make things happen.
Vagrant up!
Vagrant is an orchestration tool that uses “providers” to deploy virtual machines to a variety of platforms. Vagrant is probably most famous for its integration with VirtualBox. It’s quite common for cloud and application developers to use Vagrant to build their application stacks locally; Vagrant then makes it easier to deploy them to production in cloud environments, in an automated fashion. We will use Vagrant to deploy the switches and servers in our topology.
Ansible.
Ansible is another orchestration tool that helps you manage your infrastructure as code. It is agentless and probably has the best toolbox for network automation of the various orchestration tools. In this post we will use Ansible to provision, or configure, our devices.
Cumulus Linux.
Cumulus Networks is a networking company that publishes its network operating system, Cumulus Linux, as a virtual machine image. That Linux image is what we will run as the switches in our topology. In the real world, you can run this software on physical switches and operate the topology in the same way; just replace Vagrant with your ops team installing and cabling them. Cumulus publishes the Cumulus VX image, along with several automated reference topologies, so that you can do exactly what we are doing in this post: automate your network with Cumulus.
Last but not least….EVPN.
EVPN stands for Ethernet Virtual Private Network. It’s another address family under the MP-BGP standard. The technology is used as the control plane for transporting and updating MAC address reachability between your switches across an IP fabric.
Another technology that should be mentioned, though it is not in the title, is VXLAN. We will use VXLAN in the data plane to encapsulate and transport our Ethernet frames.
Ok, here is the topology we are going to build:
On the bottom are our servers. Each server is connected to a single leaf switch in a mode 2 port channel. Many of the labs that Cumulus publishes dual-attach the servers to two leaf switches and bond them using their MLAG implementation. That requires LACP, which, unfortunately, I was not able to get working locally. To keep the project moving forward, I implemented a workaround: change the physical connectivity and configure all of the bonds as mode 2 (balance-xor).
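For reference, here is a minimal sketch of what a balance-xor bond looks like in /etc/network/interfaces terms on the server side. The interface names, bond name, and addressing are illustrative assumptions, not a copy of the repo’s helper scripts.

# mode 2 (balance-xor) bond across both uplinks to the leaf -- no LACP required
auto bond0
iface bond0 inet static
    address 10.1.1.101/24
    bond-slaves eth1 eth2
    bond-mode balance-xor
    bond-miimon 100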
Cumulus provides some great tools for building these topologies; one of them is their topology converter.
Using their Python script topology_converter.py, I took the following “.dot” file and converted it into a Vagrantfile. Vagrant will use this to build, and most importantly connect, all of the instances.
graph dc1 {
 "spine1" [function="spine" config="./helper_scripts/extra_switch_config.sh"]
 "spine2" [function="spine" config="./helper_scripts/extra_switch_config.sh"]
 "leaf1" [function="leaf" config="./helper_scripts/extra_switch_config.sh"]
 "leaf2" [function="leaf" config="./helper_scripts/extra_switch_config.sh"]
 "server1" [function="host" config="./helper_scripts/extra_server_config.sh"]
 "server2" [function="host" config="./helper_scripts/extra_server_config.sh"]
 "spine1":"swp3" -- "leaf1":"swp3"
 "spine1":"swp4" -- "leaf2":"swp4"
 "spine2":"swp5" -- "leaf1":"swp5"
 "spine2":"swp6" -- "leaf2":"swp6"
 "leaf1":"swp40" -- "leaf2":"swp40"
 "leaf1":"swp50" -- "leaf2":"swp50"
 "server1":"eth1" -- "leaf1":"swp1"
 "server1":"eth2" -- "leaf1":"swp2"
 "server2":"eth1" -- "leaf2":"swp1"
 "server2":"eth2" -- "leaf2":"swp2"
}
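If you want to regenerate the Vagrantfile yourself, the conversion is a single command along these lines (the .dot file name here is an assumption; use whatever you named the file above, and check the converter’s README for provider flags on your version):

~$ python topology_converter.py topology.dot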
The Vagrantfile that it creates is fairly long and needed a number of modifications to do what I wanted. Luckily, I’m going to share this with you… so clone this repo and take a look at it yourself. You will need the repo to complete the build anyway.
The modifications that I made to the file involve commenting out some of the helper scripts and using Vagrant’s Ansible integration to run the playbooks.
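As a rough sketch (the real block in the repo may name the playbook and inventory differently), the per-machine Ansible hook in a Vagrantfile looks something like this:

config.vm.define "leaf1" do |device|
  # ... box, interface definitions, etc. ...
  # Run the playbooks against this machine as soon as it finishes booting.
  device.vm.provision "ansible" do |ansible|
    ansible.playbook       = "main.yml"      # assumed playbook name
    ansible.inventory_path = "./inventory"   # assumed inventory location
  end
end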
~$ git clone https://github.com/tsimson/cumulus_evpn.git
~$
~$ cd cumulus_evpn/
~/cumulus_evpn$
~/cumulus_evpn$
Let’s run a few commands:
~/cumulus_evpn$ vagrant status
Current machine states:

spine1                    not created (virtualbox)
spine2                    not created (virtualbox)
leaf1                     not created (virtualbox)
leaf2                     not created (virtualbox)
server1                   not created (virtualbox)
server2                   not created (virtualbox)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.
This shows the instances we are about to create… so let’s build and provision them!
~/cumulus_evpn$ vagrant up
... output omitted ...
~/cumulus_evpn$ vagrant status
Current machine states:

spine1                    running (virtualbox)
spine2                    running (virtualbox)
leaf1                     running (virtualbox)
leaf2                     running (virtualbox)
server1                   running (virtualbox)
server2                   running (virtualbox)

This environment represents multiple VMs. The VMs are all listed
above with their current state. For more information about a specific
VM, run `vagrant status NAME`.
~/cumulus_evpn$
~/cumulus_evpn$
Our VMs are up…cool!
If you watched Vagrant do its magic, you will have seen that it also ran the Ansible playbooks as each machine came up. Each time a machine build completes, the playbooks run, the dynamic inventory is matched against the new machine, and Ansible deploys the configuration to it. The next machine that is built is provisioned in the same way, but Ansible does not touch any of the previous instances because it has already completed them.
Let’s jump onto one of our servers and verify it’s working as expected. Here we log into server1 (10.1.1.101) and ping server2 (10.1.1.102).
~/cumulus_evpn$ vagrant ssh server1
Welcome to Ubuntu 16.04 LTS (GNU/Linux 4.4.0-22-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

271 packages can be updated.
159 updates are security updates.

New release '18.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Tue Jan 29 06:24:03 2019 from 10.0.2.2
vagrant@server1:~$
vagrant@server1:~$
vagrant@server1:~$ ping 10.1.1.102
PING 10.1.1.102 (10.1.1.102) 56(84) bytes of data.
64 bytes from 10.1.1.102: icmp_seq=1 ttl=64 time=1.55 ms
64 bytes from 10.1.1.102: icmp_seq=2 ttl=64 time=1.88 ms
^C
--- 10.1.1.102 ping statistics ---
2 packets transmitted, 2 received, 0% packet loss, time 1002ms
rtt min/avg/max/mdev = 1.555/1.719/1.884/0.169 ms
vagrant@server1:~$
It works! If you’ve gotten this far, congratulations! You have layer-2 connected a couple of hosts across a layer-3 IP fabric… all using EVPN as the control plane and VXLAN as the virtual data plane!
Now it’s time to let our networking geek-birds fly. Let’s view the configurations, verify the control plane and data plane functionality from the command line, and dig even deeper with a review of packet captures from the IP fabric.
Here is the leaf control plane configuration. Notice how the neighbor statements refer to interfaces, not actual neighbor IP addresses. This is BGP unnumbered: the sessions are established over IPv6 link-local addressing, so you simply specify the interface and the peer address is learned from IPv6 ND/RA.
vrf RED
 vni 104001
 exit-vrf
!
vrf BLUE
 vni 104002
 exit-vrf
!
router bgp 65101
 bgp router-id 10.255.255.11
 neighbor swp4 interface remote-as external
 neighbor swp5 interface remote-as external
 !
 address-family ipv4 unicast
  redistribute connected route-map LOOPBACK_ROUTES
 exit-address-family
 !
 address-family l2vpn evpn
  neighbor swp4 activate
  neighbor swp5 activate
  advertise-all-vni
 exit-address-family
!
route-map LOOPBACK_ROUTES permit 10
 match interface lo
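If you are curious which link-local address a given unnumbered session resolved to, a couple of standard commands should show it (output omitted here; the spine’s fe80:: address appears in both):

vagrant@leaf1:~$ ip -6 neighbor show dev swp4
vagrant@leaf1:~$ sudo vtysh -c 'show bgp neighbors swp4'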
Here is the interface configuration:
vagrant@leaf1:~$ cat /etc/network/interfaces

auto lo
iface lo inet loopback
    address 10.255.255.11/32

auto vagrant
iface vagrant inet dhcp

auto eth0
iface eth0 inet dhcp

auto swp1
iface swp1

auto swp2
iface swp2

...

######
### Define VRF and L3VNI
######
auto RED
iface RED
    vrf-table auto

auto SERVER01
iface SERVER01
    bond-slaves swp2 swp3
    bond-mode balance-xor
    bridge-access 10

#####
### Define bridge
#####
auto bridge
iface bridge
    bridge-ports SERVER01 VXLAN10
    bridge-vids 10
    bridge-vlan-aware yes

#####
### Define VXLAN interfaces
#####
auto VXLAN10
iface VXLAN10
    bridge-access 10
    bridge-arp-nd-suppress on
    bridge-learning off
    mstpctl-bpduguard yes
    mstpctl-portbpdufilter yes
    vxlan-id 10010
    vxlan-local-tunnelip 10.255.255.11

auto vlan10
iface vlan10
    address 10.1.1.2/24
    address-virtual 00:00:00:00:00:1a 10.1.1.1/24
    vlan-id 10
    vlan-raw-device bridge
    vrf RED
In the above snippet, we combine our vlan10, VXLAN10, and SERVER01 interfaces into one bridge domain… named “bridge”. Our Layer 3 interface, vlan10, is assigned to the RED VRF.
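If you want to sanity-check the bridge membership and the MACs it has learned before moving on, a couple of quick commands on the leaf should do it (output varies by Cumulus release, so take these as a suggestion):

vagrant@leaf1:~$ sudo brctl show
vagrant@leaf1:~$ sudo net show bridge macs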
So there’s the config; let’s verify it at the command line.
~/cumulus_evpn$ vagrant ssh leaf1

Welcome to Cumulus VX (TM)

Cumulus VX (TM) is a community supported virtual appliance designed for
experiencing, testing and prototyping Cumulus Networks' latest technology.
For any questions or technical support, visit our community site at:
http://community.cumulusnetworks.com

The registered trademark Linux (R) is used pursuant to a sublicense from LMI,
the exclusive licensee of Linus Torvalds, owner of the mark on a world-wide
basis.
vagrant@leaf1:~$ sudo net show bgp evpn route
BGP table version is 20, local router ID is 10.255.255.11
Status codes: s suppressed, d damped, h history, * valid, > best, i - internal
Origin codes: i - IGP, e - EGP, ? - incomplete
EVPN type-2 prefix: [2]:[ESI]:[EthTag]:[MAClen]:[MAC]:[IPlen]:[IP]
EVPN type-3 prefix: [3]:[EthTag]:[IPlen]:[OrigIP]
EVPN type-5 prefix: [5]:[ESI]:[EthTag]:[IPlen]:[IP]

   Network          Next Hop            Metric LocPrf Weight Path
Route Distinguisher: 10.255.255.11:3
*> [2]:[0]:[0]:[48]:[44:38:39:00:00:09]
                    10.255.255.11                      32768 i
*> [2]:[0]:[0]:[48]:[44:38:39:00:00:09]:[32]:[10.1.1.101]
                    10.255.255.11                      32768 i
*> [2]:[0]:[0]:[48]:[44:38:39:00:00:09]:[128]:[fe80::4638:39ff:fe00:9]
                    10.255.255.11                      32768 i
*> [2]:[0]:[0]:[48]:[46:38:39:00:00:07]
                    10.255.255.11                      32768 i
*> [2]:[0]:[0]:[48]:[46:38:39:00:00:09]
                    10.255.255.11                      32768 i
*> [3]:[0]:[32]:[10.255.255.11]
                    10.255.255.11                      32768 i
Route Distinguisher: 10.255.255.12:3
*> [2]:[0]:[0]:[48]:[44:38:39:00:00:0d]
                    10.255.255.12                          0 65201 65102 i
*  [2]:[0]:[0]:[48]:[44:38:39:00:00:0d]
                    10.255.255.12                          0 65201 65102 i
*> [2]:[0]:[0]:[48]:[44:38:39:00:00:0d]:[32]:[10.1.1.102]
                    10.255.255.12                          0 65201 65102 i
*  [2]:[0]:[0]:[48]:[44:38:39:00:00:0d]:[32]:[10.1.1.102]
                    10.255.255.12                          0 65201 65102 i
*> [2]:[0]:[0]:[48]:[44:38:39:00:00:0d]:[128]:[fe80::4638:39ff:fe00:d]
                    10.255.255.12                          0 65201 65102 i
*  [2]:[0]:[0]:[48]:[44:38:39:00:00:0d]:[128]:[fe80::4638:39ff:fe00:d]
                    10.255.255.12                          0 65201 65102 i
*  [2]:[0]:[0]:[48]:[46:38:39:00:00:0d]
                    10.255.255.12                          0 65201 65102 i
*> [2]:[0]:[0]:[48]:[46:38:39:00:00:0d]
                    10.255.255.12                          0 65201 65102 i
*  [2]:[0]:[0]:[48]:[46:38:39:00:00:11]
                    10.255.255.12                          0 65201 65102 i
*> [2]:[0]:[0]:[48]:[46:38:39:00:00:11]
                    10.255.255.12                          0 65201 65102 i
*  [3]:[0]:[32]:[10.255.255.12]
                    10.255.255.12                          0 65201 65102 i
*> [3]:[0]:[32]:[10.255.255.12]
                    10.255.255.12                          0 65201 65102 i

Displayed 12 prefixes (18 paths)
vagrant@leaf1:~$
Look closely: the BGP EVPN address family is talking about MAC addresses. It’s associating each MAC with a loopback address as the next hop, and VXLAN will use this information to establish the overlay in the data plane. Based on this output, the control plane appears to be working.
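Depending on the Cumulus release, NCLU also has EVPN-specific views that summarize the same information per VNI; something along these lines should show the local and remote VTEPs and the MACs learned for our VNI 10010:

vagrant@leaf1:~$ sudo net show evpn vni
vagrant@leaf1:~$ sudo net show evpn mac vni 10010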
Next… we capture the data plane and test fault tolerance. We are using ECMP across our uplinks, so we will have to shut one path down to make sure everything we want to capture flows through a single interface.
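Before breaking anything, you can confirm both uplinks are actually in play by looking at the route toward leaf2’s loopback; with ECMP working, both uplink interfaces should appear as next hops (output omitted):

vagrant@leaf1:~$ ip route show 10.255.255.12/32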
~/cumulus_evpn$ vagrant destroy spine1 -f
==> spine1: Forcing shutdown of VM...
==> spine1: Destroying VM and associated drives...
tsimson@GMTI-Desktop-Tims-MacBook-Pro:~/cumulus_evpn$ vagrant ssh server1
Welcome to Ubuntu 16.04 LTS (GNU/Linux 4.4.0-22-generic x86_64)

 * Documentation:  https://help.ubuntu.com/

271 packages can be updated.
159 updates are security updates.

New release '18.04.1 LTS' available.
Run 'do-release-upgrade' to upgrade to it.

Last login: Tue Jan 29 08:35:20 2019 from 10.0.2.2
vagrant@server1:~$ ping 10.1.1.1
PING 10.1.1.1 (10.1.1.1) 56(84) bytes of data.
64 bytes from 10.1.1.1: icmp_seq=1 ttl=64 time=0.388 ms
^C
--- 10.1.1.1 ping statistics ---
1 packets transmitted, 1 received, 0% packet loss, time 0ms
rtt min/avg/max/mdev = 0.388/0.388/0.388/0.000 ms
vagrant@server1:~$ ping 10.1.1.102
PING 10.1.1.102 (10.1.1.102) 56(84) bytes of data.
64 bytes from 10.1.1.102: icmp_seq=1 ttl=64 time=1.93 ms
64 bytes from 10.1.1.102: icmp_seq=2 ttl=64 time=1.66 ms
We toasted spine1 and the network continues to roll…awesome.
We can now see that we are single-threaded through spine2… and can be sure we are capturing everything through a single interface.
~/cumulus_evpn$ vagrant ssh leaf1
vagrant@leaf1:~$ sudo net show bgp sum

show bgp ipv4 unicast summary
=============================
BGP router identifier 10.255.255.11, local AS number 65101 vrf-id 0
BGP table version 7
RIB entries 3, using 456 bytes of memory
Peers 2, using 39 KiB of memory

Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
spine1(swp4)    4      65201    2354    2363        0    0    0 00:15:48      Connect
spine2(swp5)    4      65201    2648    2657        0    0    0 01:07:25            1

Total number of neighbors 2


show bgp ipv6 unicast summary
=============================
% No BGP neighbors found


show bgp l2vpn evpn summary
===========================
BGP router identifier 10.255.255.11, local AS number 65101 vrf-id 0
BGP table version 0
RIB entries 3, using 456 bytes of memory
Peers 2, using 39 KiB of memory

Neighbor        V         AS MsgRcvd MsgSent   TblVer  InQ OutQ  Up/Down State/PfxRcd
spine1(swp4)    4      65201    2354    2363        0    0    0 00:15:48      Connect
spine2(swp5)    4      65201    2648    2657        0    0    0 01:07:25            6

Total number of neighbors 2
vagrant@leaf1:~$
Time to take captures.
vagrant@leaf1:~$ sudo tcpdump -npi swp5 -w BB5.pcap && ping 10.1.1.102
tcpdump: listening on swp5, link-type EN10MB (Ethernet), capture size 262144 bytes
...

vagrant@leaf1:~$ scp BB5.pcap tsimson@10.0.2.2:
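VXLAN rides on UDP port 4789 (per RFC 7348), so if you only want the encapsulated traffic you can narrow the capture with a filter; the output file name here is just an example:

vagrant@leaf1:~$ sudo tcpdump -npi swp5 -w vxlan_only.pcap udp port 4789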
Here’s our capture. I’ve filtered everything but some ICMP traffic. Check out the VXLAN header, more specifically the VNI. As defined by RFC 7348:
“Each VXLAN segment is identified through a 24-bit segment ID, termed the “VXLAN Network Identifier (VNI)”. This allows up to 16 M VXLAN segments to coexist within the same administrative domain. The VNI identifies the scope of the inner MAC frame originated by the individual VM.”
Once again we made it to the end of another pretty cool post. The use cases for combining Vagrant, Ansible, and Cumulus Linux are vast. In future posts, I hope to build on this topology by establishing routing between networks, to external networks, and by implementing security within the fabric.
I had an absolute blast building and sharing this environment. I… and hopefully we… learned a ton!