Introduction to Virtual Extensible LAN (VXLAN)

I fixed this. Somehow a draft post got published. Thanks for letting us know!

Hi
Thanks for this lesson.

<Traditional layer 2 networks have issues because of three main reasons:

Spanning-tree.
Limited amount of VLANs.
Large MAC address tables.>

My understanding is as follows. Could you please explain?

  1. The spine-leaf topology solved the spanning-tree issue, so STP is not an issue in VXLAN.
  2. Limited amount of VLANs: This is the essential reason why VXLAN is needed.
  3. Large MAC address tables: I’m not sure how VXLAN solves this problem. This issue is common in virtualization.

Spine-leaf was not an IP network. It becomes the IP underlay to bind MAC learning to a multicast group and to deliver VXLAN frames efficiently. Is that true?

Thanks
Michael

Hello Michael

First of all, we must make a distinction between VXLAN and spine-leaf. The first is a tunnelling protocol while the second is a network topology/architecture. They can work together, but they don’t necessarily have to be implemented together.

Actually, the spine-leaf topology doesn’t solve the inefficiency of STP. If you create an L2 spine-leaf topology without any additional configuration, STP will block redundant links and severely limit the available bandwidth. The solutions to the STP problem (regardless of the topology underneath) are:

  • to employ VXLAN
  • to break the network into smaller network segments/subnets
  • to introduce what is known as Transparent Interconnection of Lots of Links (TRILL) or shortest path bridging (SPB) to replace STP.

You can choose which solution to use depending on your requirements and what your equipment supports.

The limited number of VLANs is one of the basic problems that VXLAN solves, allowing very large datacenters to have more than the 4096 VLANs that normal Ethernet provides.

Because you are able to segment the datacenter network into tens of thousands of subnets, you can keep the number of hosts per subnet very small. The result is that the MAC address tables created for each subnet remain small.

Again, spine-leaf is a particular architecture, not necessarily associated with VXLAN. If you choose to use it, you can. If not, you don’t have to. But if you do, yes, it becomes the underlay routed network.

I hope this has been helpful!

Laz

Hello,
Thank you for the lesson. It is very informative. This is a very new topic for me.
I have a question about the VXLAN frame format. With those added headers, the total frame size should exceed 1500 bytes. How do VTEP devices handle this?
My other question is about the Packet Walkthrough picture. In the picture, it looks like Host1 is doing the tunneling and encapsulating the packet with the tunnel IP address (192.168.12.101). Shouldn’t the tunneling be done by the VTEP?
Thank you.

Hello Ike

That’s a very good observation, and yes, the underlay infrastructure must be configured to accommodate this overhead. RFC 7348, which defines VXLAN, states the following:

VTEPs MUST NOT fragment VXLAN packets. Intermediate routers may
fragment encapsulated VXLAN packets due to the larger frame size.
The destination VTEP MAY silently discard such VXLAN fragments. To
ensure end to end traffic delivery without fragmentation, it is
RECOMMENDED that the MTUs (Maximum Transmission Units) across the
physical network infrastructure be set to a value that accommodates
the larger frame size due to the encapsulation. Other techniques
like Path MTU discovery (see [RFC1191] and [RFC1981]) MAY be used to
address this requirement as well.

Specifically, the underlay network must be able to accommodate at least 1554 bytes:

  • 1518 bytes (max) for the inner Ethernet frame
  • 8 bytes for the VXLAN header
  • 8 bytes for the outer UDP header
  • 20 bytes for the outer IPv4 header
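
If it helps, the arithmetic above can be checked with a few lines of Python. This is only a sketch of the sum in this post: the function name `required_underlay_mtu` is made up for illustration, and the 40-byte IPv6 header is my own addition in case your underlay runs IPv6.

```python
# Header sizes in bytes. The first four values are the ones listed above.
INNER_ETHERNET_FRAME_MAX = 1518  # inner Ethernet frame, as counted in this post
VXLAN_HEADER = 8
OUTER_UDP_HEADER = 8
OUTER_IPV4_HEADER = 20
OUTER_IPV6_HEADER = 40  # assumption: fixed IPv6 header, not part of the list above


def required_underlay_mtu(inner_frame: int = INNER_ETHERNET_FRAME_MAX,
                          ipv6: bool = False) -> int:
    """Return the packet size the underlay must carry without fragmentation."""
    outer_ip = OUTER_IPV6_HEADER if ipv6 else OUTER_IPV4_HEADER
    return inner_frame + VXLAN_HEADER + OUTER_UDP_HEADER + outer_ip


print(required_underlay_mtu())           # 1554
print(required_underlay_mtu(ipv6=True))  # 1574
```

In practice, many people simply enable jumbo frames on the underlay so there is plenty of headroom.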

As for the Packet Walkthrough picture, yes, you are correct: the tunneling is done by the VTEP, and this seems to be a typo. I’ll let Rene know to correct this. Thanks for pointing that out!

I hope this has been helpful!

Laz

That is a typo indeed. I just fixed it, thanks!


Thank you very much, Laz, for clarifying the issue.


I believe there’s a typo in the diagram at 9:30. The host IPs are shown as
192.168.1.101 should be -> 192.168.12.101
192.168.2.102 should be -> 192.168.12.102

Hello Doug

Thanks for pointing that out. I will let Rene know!

Laz

Hi Team,

I joined as a member to understand basic concepts in networking. The lessons are really helpful.

Coming to VXLAN, I have one query.

In a multisite environment, stretched VLANs are generally needed for some clusters. At the same time, they are not recommended. What I understood is that we can overcome this using VXLAN.

Using VXLAN, we encapsulate the L2 frame and transport it over an L3 network across sites. While encapsulating the L2 frame, the VTEP (Virtual Tunnel Endpoint) IP and MAC addresses are added as headers. However, the VTEP IP addresses at both the source and target sites are shown as being in the “same IP segment”. That means the frame will be forwarded across sites without routing, which will be faster. Is my understanding correct?

Then again, do we need to have the same IP segment at both the source and target sites for the VTEP devices? Is that segment also stretched across sites? Wouldn’t this conflict with the basic idea of not stretching an IP segment (VLAN) across sites? Can you help clarify this concept?

Hello Ramachandra

You’re right that best practice dictates that you don’t span your VLANs across multiple sites. However, as you say, it is sometimes necessary. There’s really no “solution” to this: either you span your VLANs or you don’t. You must, however, try to keep it to a minimum.

VXLAN runs over an underlay network that is composed of routers and switches, where in most cases, you will never have a VLAN span multiple sites. Site-to-site communication in the underlay network will take place using L3 communication with routing. However, the VXLAN segment in the overlay network, which is being tunneled over that underlay network, is still spanning multiple sites. So from the point of view of the overlay network, you still do have a broadcast domain (VLAN) that is spanning multiple sites. This cannot always be avoided, but it should be minimized as much as possible.

Typically in a campus network, you should be able to avoid it completely. It is in distributed datacenters that you most often need to span VLANs across sites. VXLAN doesn’t “solve” the issue, but it does provide you with a framework where you can more easily apply such a configuration. You should still do it sparingly.

I hope this has been helpful!

Laz

Thanks Laz for explaining it.


Hello Team,
Why is a UDP header added at the VTEP in VXLAN?

Hello Vijay

Take a detailed look at this lesson:

In it you’ll find details about the headers and where they are added.

Remember that VXLANs use an overlay and an underlay network. The VXLAN header actually sits between the header information of the overlay and underlay networks. The underlay network carries the “tunneled” traffic and uses the UDP header with port 4789 or 8472 to carry that traffic.

Below you can see the headers involved with the implementation of the VXLAN.

The headers that are circled in red are those that belong to the underlay network. Like any network, it needs a layer 3 header and a layer 4 header, and that is why the UDP header is used there.
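
To make the layering a bit more concrete, here is a minimal Python sketch of the 8-byte VXLAN header that sits between the outer UDP header and the inner Ethernet frame. The layout (a flags byte with the “I” bit set, 24 reserved bits, a 24-bit VNI, 8 reserved bits) follows RFC 7348. The function names are made up for illustration, and the sketch does not build the outer Ethernet, IP, or UDP headers.

```python
import struct

VXLAN_UDP_PORT = 4789        # IANA-assigned destination port mentioned above
VXLAN_FLAG_VNI_VALID = 0x08  # the "I" flag: the VNI field is valid


def build_vxlan_header(vni: int) -> bytes:
    """Pack the 8-byte VXLAN header for the given 24-bit VNI."""
    if not 0 <= vni < 2**24:
        raise ValueError("VNI must fit in 24 bits")
    # Word 1: flags in the top byte, followed by 24 reserved bits.
    # Word 2: VNI in the top 24 bits, followed by 8 reserved bits.
    return struct.pack("!II", VXLAN_FLAG_VNI_VALID << 24, vni << 8)


def parse_vni(header: bytes) -> int:
    """Extract the VNI from the first 8 bytes of a VXLAN payload."""
    flags_word, vni_word = struct.unpack("!II", header[:8])
    if not (flags_word >> 24) & VXLAN_FLAG_VNI_VALID:
        raise ValueError("VNI flag is not set")
    return vni_word >> 8


def encapsulate(inner_frame: bytes, vni: int) -> bytes:
    """Return the UDP payload: VXLAN header followed by the inner frame."""
    return build_vxlan_header(vni) + inner_frame


payload = encapsulate(b"\x00" * 64, vni=123)  # dummy 64-byte inner frame
print(len(payload), parse_vni(payload))
```

The VTEP would then send this payload in a UDP datagram to the remote VTEP’s IP address, which is why the underlay needs both an L3 and an L4 header.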

I hope this has been helpful!

Laz

Thanks, Rene, for these videos. Do you always have to map a VLAN to a VNI? How does VXLAN make use of these 16 million VNIs when I have to map a VLAN to a VNI? Please help me understand, I am really confused about it. Thanks

Hello Patrick

Remember that the purpose of VXLANs is not only to surpass the limitation of 4K VLAN IDs, but also to add flexibility to deployments, allowing for the simpler spanning of VLANs across sites. When there is no need to expand beyond that 4K limit, typical customer deployments will use a 1:1 mapping of available VLANs for simplicity.

However, when scalability to reach beyond this limit is needed, then it is possible to do so by assigning the same VLAN IDs to multiple unique VNIs found on different VTEPs within the fabric. So for example, you could have

  • VLAN 456 mapped to VNI 123 using a subnet of 10.1.1.0/24
  • VLAN 456 mapped to VNI 321 using a subnet of 10.2.2.0/24

The same VLAN ID is used, but it maps to a different VNI on each VTEP. The VLAN ID is then only locally significant to the switch, while the 24-bit VNI gives you roughly 16 million segments across the fabric.

A single physical switch can still only map as many VNIs as it has VLAN IDs, so you do have a limitation of about 4K per physical device, but that is typically not a limiting factor.
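
The VLAN 456 example above can be sketched as a per-VTEP lookup table. This is just an illustration in Python: the VTEP names `leaf1` and `leaf2` and the helper `vni_for` are hypothetical, and a real switch holds this mapping in its own configuration.

```python
VLAN_ID_BITS = 12  # 802.1Q VLAN ID field
VNI_BITS = 24      # VXLAN Network Identifier field

# Hypothetical VTEPs, each with its own locally significant VLAN-to-VNI table.
vlan_to_vni = {
    "leaf1": {456: 123},  # VLAN 456 -> VNI 123 (10.1.1.0/24)
    "leaf2": {456: 321},  # VLAN 456 -> VNI 321 (10.2.2.0/24)
}


def vni_for(vtep: str, vlan_id: int) -> int:
    """Look up the VNI that a given VTEP uses for a local VLAN ID."""
    return vlan_to_vni[vtep][vlan_id]


print(2 ** VLAN_ID_BITS)      # 4096 VLAN IDs per switch
print(2 ** VNI_BITS)          # 16777216 VNIs across the fabric
print(vni_for("leaf1", 456))  # 123
print(vni_for("leaf2", 456))  # 321
```

The same VLAN ID resolves to a different VNI depending on which VTEP you ask, which is how the fabric as a whole gets past the 4K limit.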

I hope this has been helpful!

Laz

Thanks so much for replying. This post is helpful, but I have a few more questions. Do the VNI and VTEP need to match on both sides? Do you need a VTEP for every VNI? Can you map a VNI to a VNI with no VLAN involved? Your help is really appreciated. Thanks

Hello Patrick

The VNI is a globally significant value. Much like the VLAN ID, it must be the same throughout the communication between the VTEPs that serve it.

A VTEP is a device that terminates the VXLAN fabric. That is where you configure the VNIs, so VNIs owe their existence to the VTEP devices.

Take a look at this image:


When H1 sends its data, it is oblivious to any VXLAN VNI configuration. It sends its data, and when it reaches the VTEP, it is there that any VLAN IDs are added, based on the VLAN ID of the access port to which it is connected. (The VLAN ID may be added at a switch somewhere between the host and the VTEP if it exists). That VLAN ID information is encapsulated and mapped to a particular VNI.

Remember that VXLAN is, at its most basic level, a tunneling or encapsulation mechanism. Whatever VLAN ID is in the frame, this will be encapsulated and used within the VXLAN infrastructure.

I hope this has been helpful!

Laz

Hi LAZ/RENE,

Could you please brief us on VXLAN with MP-BGP EVPN?

Please

BR//-
Ajay

Hello Ajay

Cisco has an excellent and comprehensive document on VXLAN and its use of MP-BGP EVPN as a control plane. The following is an excerpt from that document which gives a brief but very clear understanding of how these technologies work together:

The initial IETF VXLAN standards (RFC 7348) defined a multicast-based flood-and-learn VXLAN without a control plane. It relies on data-driven flood-and-learn behavior for remote VXLAN tunnel endpoint (VTEP) peer discovery and remote end-host learning. The overlay broadcast, unknown unicast, and multicast traffic is encapsulated into multicast VXLAN packets and transported to remote VTEP switches through the underlay multicast forwarding. Flooding in such a deployment can present a challenge for the scalability of the solution. The requirement to enable multicast capabilities in the underlay network also presents a challenge because some organizations do not want to enable multicast in their data centers or WAN networks.

To overcome the limitations of the flood-and-learn VXLAN as defined in RFC 7348, organizations can use Multiprotocol Border Gateway Protocol Ethernet Virtual Private Network (MP-BGP EVPN) as the control plane for VXLAN. MP-BGP EVPN has been defined by IETF as the standards-based control plane for VXLAN overlays. The MP-BGP EVPN control plane provides protocol-based VTEP peer discovery and end-host reachability information distribution that allows more scalable VXLAN overlay network designs suitable for private and public clouds. The MP-BGP EVPN control plane introduces a set of features that reduces or eliminates traffic flooding in the overlay network and enables optimal forwarding for both west-east and south-north traffic.

The following is the link to this documentation, where you can find more information:

If you have any more specific questions about this technology, let us know!

I hope this has been helpful!

Laz