CEF (Cisco Express Forwarding)

Hello Giovanni

In general, software-based network devices, such as firewalls, SBCs, or routers that run on off-the-shelf servers (virtual or physical), typically don’t have the same capacity, speed, or resources as purpose-built appliances such as dedicated routers and firewalls.

This is because all of the processing of packets is done by the CPU itself using software rather than specialized independent high-speed hardware. Features such as CEF, and even fast switching are only possible because of the hardware architecture of the routers, an architecture that is not present on generic servers. Thus, CEF by definition cannot run on generic servers.

So as a rule of thumb, network devices running on off-the-shelf servers are almost always slower and more resource-intensive than specialized appliances such as routers, firewalls and others…

I hope this has been helpful!

Laz

Super useful, Thanks :grinning:

1 Like

Hello Laz,

In a data center server and ToR switch architecture,
does a packet sent out from a server reach the ToR switch and come back again to the server? I am talking about southbound traffic.
Where does the MAC address play its role?
I need the in-packet destination MAC & source MAC matched along with source & destination MAC…

Also, does the mac-learning limit command restrict DHCP behavior on servers for IPv4 or IPv6?

BR//
Nitin Arora

Hello Nitin

In a Top of Rack (ToR) datacenter architecture, the servers within the rack are typically connected directly to the ToR switch. This means that packets sent out of any server will indeed go to the ToR switch, even if that traffic is destined for another server within the same rack.

Now where does the MAC table come into play here? Well, it depends upon what role the ToR switch plays. Is it an L2 or L3 switch? If it’s L2, then the MAC address table will be used to determine the egress port. If it is L3, then routing will be employed to determine the egress port.

Hmm, I’m not sure what you mean here. Can you clarify your question?

I hope this has been helpful!

Laz

Okay… it’s L3, and routing will be employed.

Then I suppose mac-learning-limit will not have any role here in the DHCP assignment of IPv4 or IPv6 addresses on servers.

Correct me if I am wrong in the above understanding.

BR//
Nitin Arora

Hello Nitin

Yes, you are correct. The MAC address learning limit on a switch will not affect the assignment of IPv4 or IPv6 addresses using DHCP.

I hope this has been helpful!

Laz

Thanks Laz, now understood

1 Like

Hello, everyone!

An amazing lesson! However, I have some questions to strengthen my understanding.

The routing table isn’t very suitable for fast forwarding because we have to deal with recursive routing.

Technically, if we configure an IGP or a fully-specified static route, we could prevent recursive routing, correct?

Most of the IP packets can be forwarded by the data plane. However there are some “special” IP packets that can’t be forwarded by the data plane immediately and they are sent to the control plane, here are some examples: IP packets that are destined for one of the IP addresses of the multilayer switch.

If we were to send an SSH packet to our device, shouldn’t this technically also be sent to the Management Plane? I believe that Rene mentioned that the Management plane is actually a subset of the Control plane, so are both actually involved?

Back in the days…switching was done at hardware speed while routing was done in software. Nowadays both switching and routing is done at hardware speed. In the remaining of this lesson you’ll learn why.

What specifically does “software switching/forwarding” mean? Even with software switching, isn’t there the CPU which does all the necessary instructions and calculations which is a hardware component? The word “software” is what confuses me a little.

Information like MAC addresses, the routing table or access-lists are stored into these ASICs. The tables are stored in content-addressable memory (CAM) and ternary content addressable memory (TCAM).

I am a little confused. Are all these tables like the RIB and the MAC address table stored in ASICs or in the CAM?

And to confirm one more thing, anything destined to the router or any IGP/EGP-related traffic is sent to the Control Plane and the CPU for processing, correct? Is this also why we implement “Control Plane Policing”? To prevent potential DoS attacks which would cause the CPU to become overwhelmed and drop packets?

Thank you in advance.

Hello David

Here are my responses:

Yes that is correct. More about fully specified static routes can be found at this NetworkLessons note.
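As a rough illustration of the recursion point (the table, prefixes, and interface names below are invented, and real IOS internals differ), here is a Python sketch: a recursive static route costs an extra lookup to find the egress interface, while a fully specified one resolves in a single pass.

```python
# Illustrative sketch, not IOS internals: a recursive next hop needs a
# second lookup; a fully specified route already carries the interface.
import ipaddress

# Route entries: prefix -> (next_hop, egress_interface)
routes = {
    "0.0.0.0/0": ("203.0.113.1", None),      # recursive: interface unknown
    "203.0.113.0/24": (None, "Gi0/1"),       # connected network
    "10.0.0.0/8": ("192.0.2.1", "Gi0/2"),    # fully specified
}

def resolve(dest, lookups=0):
    """Return (egress_interface, number_of_lookups_needed)."""
    addr = ipaddress.ip_address(dest)
    # Longest-prefix match over the table
    best = max((p for p in routes if addr in ipaddress.ip_network(p)),
               key=lambda p: ipaddress.ip_network(p).prefixlen)
    next_hop, iface = routes[best]
    lookups += 1
    if iface is not None:
        return iface, lookups
    # Recursive route: resolve the next hop with another lookup
    return resolve(next_hop, lookups)

print(resolve("8.8.8.8"))    # default route recurses: ('Gi0/1', 2)
print(resolve("10.1.1.1"))   # fully specified: ('Gi0/2', 1)
```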

First of all, the term “management plane” is used in a different context and should not be confused with the idea of the control and data planes. The control and data planes in the context of what is being expressed in the lesson are the only “planes” that operate, and they deal with the processing of packets. The management plane is viewed in the context of network configuration, monitoring, management, updates, and other administrative tasks. If an SSH session with the local switch is to take place, the packets received will be sent to the control plane as described in the lesson. Looking at it from an administrative point of view, yes, this would use the management plane, but this is outside of the context we’re talking about here.

“Software forwarding” means that when a packet is received, it is processed using software that has been loaded into the RAM. The CPU will be involved in the processing, of course, however, such processing is inherently slow.

Hardware processing is all done on small specialized chips with hardwired lines of code that deal with how packets will be processed. These have everything (code, CPU, memory etc) on chip, so there is no delay in the communication between separate entities. Such hardware components are specially designed to do a single task, and cannot be modified or configured.

CAM and TCAM are distinct components of memory that are used within Cisco switches and routers. However, ASICs can be designed with integrated CAM and TCAM. This is done to further speed up the processing time.

Yes, this is correct. CoPP is indeed used to protect the resources made available to the control plane on networking devices.

I hope this has been helpful!

Laz

1 Like

Hello Laz

Really appreciate that you took the time to answer literally all of my questions, thank you :slight_smile: I would just like you to elaborate more on this:

First of all, the term “management plane” is used in a different context and should not be confused with the idea of the control and data planes. The control and data planes in the context of what is being expressed in the lesson are the only “planes” that operate, and they deal with the processing of packets. The management plane is viewed in the context of network configuration, monitoring, management, updates, and other administrative tasks. If an SSH session with the local switch is to take place, the packets received will be sent to the control plane as described in the lesson. Looking at it from an administrative point of view, yes, this would use the management plane, but this is outside of the context we’re talking about here.

So if I understand this correctly, only the data plane and the control plane are tasked with handling and processing packets? The management plane is not actually performing any packet-related functions, its just something that we refer to when we talk about configuring, monitoring or managing our devices in general from an administrative perspective?

David

Hello David

Yes, that is the case. The actual definitions of the planes involved really depend upon the context as well as who you ask. Others may have slightly different interpretations and approaches to some nuanced meaning of each term.

However, in the context of CEF and strictly speaking, the control and data planes are the only relevant entities concerning the processing and forwarding of packets.

The management plane is only relevant in the context of operations involving accessing devices via SSH, Telnet, or SNMP, as well as processes involving logging, software updates, and network monitoring systems. The management plane is typically created as part of a network design, ensuring the appropriate management VLANs, management interfaces, CLI connectivity, and NMS services are all correctly established.

I hope this has been helpful!

Laz

1 Like

Hello, everyone.

I have a lot of questions regarding CEF because I am not quite sure how the internal operation works…

  1. The FIB is derived from the RIB and is optimized for faster lookups and easier instruction handling, so how is the FIB a data plane concept and not a control plane concept, then? Or in other words, I don’t think I understand why the RIB is a control plane thing while the FIB is data plane when they achieve and do the same thing.

If we are running process-switching, isn’t the RIB both a control and a data plane thing? I can only logically think of it being a control plane thing only when we are running CEF.

  2. Where exactly is the FIB stored? I understand that the ASIC chip is responsible for super-fast forwarding unlike a regular CPU, but it has to load the data from the FIB somehow, right? So the FIB must be stored in some kind of memory. Is this the TCAM? If so, is the TCAM a separate hardware component or is it integrated into the ASIC?

  3. If I disable CEF, why is everything process-switched? Is the ASIC inside the device just completely ignored? I thought that disabling CEF would only remove the FIB and the adjacency table, but the ASIC would still be able to perform these functions.

  4. I don’t quite get the three possible results that the TCAM can provide. If you check the forwarding table for an entry, let’s say 192.168.1.1, when would it return 1, when 0, and when “X”? And what does the X even mean in this context? I am not quite understanding the “don’t care” or “anything” explanations.

  5. Can a device have only an ASIC chip without the TCAM? So we would have something like ASIC-to-RAM communication?

  6. Where is the routing table stored? The RIB is stored in the RAM while the FIB should be in the TCAM, right?

  7. Not trying to go too deep, but from a simple perspective, why exactly and how is the TCAM faster than RAM? How do the lookups differ?

Thank you.

David

Hello David

I’ll try to respond to each question as best I can.

The distinction between the control plane and the data plane lies in their functionality. The control plane is responsible for network-wide logic such as routing protocols and the creation of the Routing Information Base (RIB), while the data plane is responsible for the actual forwarding of packets, which is where the Forwarding Information Base (FIB) comes in.
The RIB and FIB don’t achieve the same thing. RIB is involved in making decisions about which path to use for data forwarding based on routing protocols (OSPF, EIGRP etc). Once these decisions are made, they are then used to populate the FIB, which is used for the actual forwarding of the data packets. The FIB can be considered a “hard wired set of rules” that are followed blindly. No processing takes place other than a single lookup. Therefore, RIB is a control plane concept because it deals with network-wide logic and decision making using control plane protocols for this purpose, while FIB is a data plane concept because it deals with the actual forwarding of the packets.
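As a rough illustration of this division of labor (the tables and interface names below are invented, not actual IOS data structures), here is a Python sketch: the control plane resolves any recursion once while building the FIB, so the data plane only ever performs a single lookup per packet.

```python
# Illustrative sketch, not IOS internals: the control plane flattens the
# RIB into a FIB once; the data plane then does one lookup per packet.
import ipaddress

# RIB as built by routing protocols: prefix -> next hop (may be recursive)
rib = {
    "0.0.0.0/0": "203.0.113.1",
    "203.0.113.0/24": "Gi0/1",   # connected: already an interface
    "10.0.0.0/8": "192.0.2.1",
    "192.0.2.0/24": "Gi0/2",     # connected
}

def build_fib(rib):
    """Control plane: resolve every RIB entry to a final egress interface."""
    fib = {}
    for prefix, hop in rib.items():
        while not hop.startswith("Gi"):   # still an IP next hop? recurse now
            addr = ipaddress.ip_address(hop)
            match = max((p for p in rib if addr in ipaddress.ip_network(p)),
                        key=lambda p: ipaddress.ip_network(p).prefixlen)
            hop = rib[match]
        fib[prefix] = hop
    return fib

def forward(fib, dest):
    """Data plane: a single longest-prefix lookup, no recursion."""
    addr = ipaddress.ip_address(dest)
    match = max((p for p in fib if addr in ipaddress.ip_network(p)),
                key=lambda p: ipaddress.ip_network(p).prefixlen)
    return fib[match]

fib = build_fib(rib)
print(forward(fib, "10.9.9.9"))   # Gi0/2: recursion was resolved at build time
```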

The FIB is typically stored in high-speed memory for quick access, and this is often implemented using TCAM. The TCAM can be a separate hardware component or it can be integrated into the ASIC, depending on the specific hardware design of the router or switch.

This is done by definition. The ASIC is still there and capable of performing its functions, but by disabling CEF, you’re telling the switch to send all packets to be examined by the CPU. You typically wouldn’t do this in any production network unless it is required for troubleshooting purposes.

Take a look at this NetworkLessons note about TCAM Lookups.

Yes, a device can indeed have only an ASIC chip without TCAM. In such a setup, the ASIC would typically interact with regular RAM or other types of memory/storage to perform its functions. But this depends highly on the design and architecture of the device.

Yes, in modern networking devices, the RIB and the FIB are typically stored in different types of memory to optimize performance and functionality. The RIB is stored in RAM because it is part of the control plane, and requires the main CPU to perform the related processes to populate it and keep it up to date. The FIB is stored in TCAM for the reasons we mentioned before (primarily speed).

Without going too deep into the topic, data structures in RAM will typically be searched sequentially. The search time in RAM depends on the complexity of the data structure. Typically, you would use one CPU cycle for each entry, so the search time corresponds to the number of entries in the table.

TCAM (and CAM) on the other hand performs parallel searches across all entries in the table simultaneously, rather than sequentially. This is due to the hardware design of TCAM, which allows each bit of the search word to be compared against the corresponding bit in all entries simultaneously. This parallel search typically takes one CPU cycle. This allows for very fast lookup times regardless of the number of entries.
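A toy model of the ternary match may help here (8-bit patterns for readability; real TCAM entries cover full prefixes, and the comparison happens in parallel hardware, which software can only emulate):

```python
# Toy model of a TCAM entry: each bit position is '0', '1', or 'X'
# (don't care). In hardware every entry is compared in parallel in a
# single cycle; here we can only emulate the match logic sequentially.

def tcam_match(entry, key):
    """Return True if every non-'X' bit of the entry matches the key."""
    return all(e in ("X", k) for e, k in zip(entry, key))

# Invented 8-bit entries: a prefix-style entry with two don't-care bits,
# and an exact-match entry.
entries = {
    "110010XX": "port 1",   # matches 11001000 through 11001011
    "11001111": "port 2",
}

key = "11001001"
hits = [action for pattern, action in entries.items()
        if tcam_match(pattern, key)]
print(hits)   # ['port 1']: the 'X' bits let the prefix entry match
```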

I hope this has been helpful!

Laz

Hello.

So the RAM follows a certain order, and it typically compares one entry per clock cycle. So the more entries there are, the more processing needs to be done in order to successfully perform a lookup.

However, this is basically impossible for the human eye to perceive, right? Since processors and forwarders can perform billions of cycles per second.

And so the TCAM then basically can compare everything at once, correct? It doesn’t follow a certain order.

I’ve also read somewhere that the RAM requires a search algorithm while with TCAM, the ASIC feeds the entry into it and the TCAM sends back a matching entry.

David

Hello David

Yes this is correct.

You can’t really “see” it; however, it begins to become perceptible if you have a routing table with 50,000 entries and your router receives thousands of packets per second. You will notice traffic slowing down as the CPU becomes overburdened with sequential routing table lookups.

With RAM, the user supplies a memory address and the RAM returns the data word stored at that address. To search all of those addresses, you can apply search algorithms of various types in software to search the contents of those memory addresses.

CAM, on the other hand, is memory with a hardwired search algorithm built into the hardware itself. Each memory address in CAM has a dedicated, fully parallel circuit that simultaneously compares all addresses within the memory to detect a match between the stored bit and the input bit. The CAM reports back, in a single cycle, whether or not a match has been found. So CAM (and TCAM, of course) is implemented largely in hardware, within the ASIC, and this is what makes it so fast.

I hope this has been helpful!

Laz

Hello Laz,

I have a few questions below:

  1. Assume we have a multilayer switch with an SVI on VLAN 10 of 192.168.10.254/24. We also have a host, connected to the same switch, on an access port in VLAN 10 with IP 192.168.10.1/24. If the host wants to send traffic to that SVI (a ping, for example), I understand that since the host and SVI are on the same VLAN and have IPs in the same subnet, the host would just ARP for the MAC of the SVI so that it can encapsulate whatever packets it’s sending into a frame with the SVI’s MAC address as the destination. My question revolves around the response from the multilayer switch back to the host. When the multilayer switch responds back to the host (192.168.10.1), doesn’t it have to look at its routing table to make a forwarding decision, even though communication between the two is happening within the same VLAN and thus no routing is taking place? So, the switch would find an entry in its routing table that matches the destination IP address of the host (since it’s responding to the ICMP echo request the host sent), and since we have an SVI on VLAN 10 of 192.168.10.254/24, we have a directly connected network that matches the destination IP of the packet we are generating. Then, since it’s directly connected, the switch should just be able to ARP for the MAC of that destination IP, encapsulate the packet into a frame, and forward it using its MAC address table, correct?

  2. I understand that the FIB is the data plane equivalent of the RIB, which exists in the control plane, and that the FIB also includes the adjacency table derived from the ARP table in the control plane. The lesson explains how it’s used for faster forwarding when a multilayer switch or router has to forward a packet from one subnet to the next. But how is the CEF mechanism involved when communication is happening only within the same Layer 2 network (VLAN)? Will multilayer switches forwarding frames within the same VLAN still just use their MAC tables stored in the CAM, or is the FIB involved here as well? An example would be a switched network with a core multilayer switch and multiple access layer switches connected to the core in a star topology. If hosts on the same VLAN, but on different access switches, want to communicate with each other, would the FIB be involved, or are the switches just using their MAC address tables to forward frames? This is of course assuming we have our Layer 2 links configured properly for connectivity within the same VLAN across the access layer switches.

  3. I never see an SVI’s MAC address stored in the MAC table of the switch with that specific SVI. How does a switch know that the destination MAC of a frame it received is for itself if the SVI is not in the MAC address table? Is there some other mechanism to determine this?

Thank You Laz

Hello Paul

I’ll do my best to answer each of your questions:

In such a case, the SVI would not be required to look up the destination IP address in the routing table. The important thing to note here is that the SVI is the actual destination of the ping, and it is also the source of the echo response. As such, it will function just like any other host. Since the destination IP is on the same subnet as the SVI itself, it will “realize” that no routing is necessary, and thus it will simply respond as you described, without referencing the routing table.

CEF is not involved in this case, and that’s because no routing takes place in such a scenario. Remember, the routing table in a Layer 3 switch will be used only if an IP packet reaches a Layer 3 interface (such as an SVI) with a destination IP address other than that of the SVI itself. The Layer 3 switch must determine the egress interface based on that destination IP address.

The scenario you describe in this question never has an IP packet reach a Layer 3 interface such as an SVI. Therefore, no routing occurs, and CEF is not involved in such communications.

You will never see a switch’s SVI MAC address in the MAC address table of that switch.
Remember, the MAC address table is populated by reading the SOURCE MAC address field in frames arriving on switchports. You should never see the MAC of your own SVI in the source address field of an incoming frame.

The MAC address table is used to determine out of which switchport a frame should be sent to reach a particular MAC address. You will never send traffic out of a switchport that is destined for the SVI of the switch.

So how does a switch know the destination MAC of a frame it receives is for itself? When it arrives on the switchport, the MAC address is read and if it matches the MAC of the SVI, it is simply sent to the SVI (assuming the switchport is on the same VLAN as the SVI). This is just the fundamental function of a switch.
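The per-frame logic described above can be sketched roughly like this (all MACs and port names are made up, and a real switch performs this in ASIC hardware, not software):

```python
# Rough sketch of a switch's per-frame decision; illustrative only.

SVI_MAC = "aa:aa:aa:aa:aa:aa"   # hypothetical MAC of the switch's SVI
mac_table = {}                  # learned MAC -> ingress port

def handle_frame(src_mac, dst_mac, in_port):
    # Always learn the source MAC. This is also why the SVI's own MAC is
    # never in the table: it never arrives as a *source* on a switchport.
    mac_table[src_mac] = in_port
    if dst_mac == SVI_MAC:
        return "punt to SVI"            # frame is for the switch itself
    if dst_mac in mac_table:
        return f"forward out {mac_table[dst_mac]}"
    return "flood"                      # unknown unicast

print(handle_frame("bb:bb:bb:bb:bb:bb", SVI_MAC, "Gi0/1"))
# punt to SVI
print(handle_frame("cc:cc:cc:cc:cc:cc", "bb:bb:bb:bb:bb:bb", "Gi0/2"))
# forward out Gi0/1 (learned from the first frame)
```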

I hope this has been helpful!

Laz

Thank You for the response Laz,

Given the answer to my first question, allow me to rephrase the process described in the scenario I mentioned:

  1. A host on VLAN 10 wants to send an ICMP echo request to the SVI on VLAN 10. The destination IP of the SVI is on the same subnet as the host, so it ARPs for the MAC of the SVI (assuming it does not already have this mapping in its ARP cache). The ARP broadcast frame is received on an access port in VLAN 10, so the switch forwards the frame out every other access port in VLAN 10, any trunk ports allowing frames from VLAN 10, and, in this case, also to the SVI on VLAN 10, since it’s a virtual interface in VLAN 10. At this point the switch also populates the MAC table with the source MAC of the incoming frame from the host, if it hasn’t done so already.

  2. The SVI on VLAN 10 responds with its MAC address, and the host proceeds to encapsulate an ICMP echo request packet in an Ethernet frame with the destination MAC of the SVI and sends the frame out.

  3. Upon ingress on the switch, the destination MAC of the frame matches the SVI on VLAN 10, so it’s forwarded to the SVI and de-encapsulated to look at the IP packet inside. The destination IP address of the packet matches the IP address of the SVI, so the packet is not sent to the RIB for further processing.

  4. The L3 switch simply uses the SVI to source the ICMP echo reply, and since the destination IP of the host is on the same subnet as the IP address of the SVI, no routing table lookup is necessary. The switch just checks its ARP cache for a MAC address that matches the destination IP address of the host, and it encapsulates the response packet into a frame with the host’s MAC address as the destination. Then the switch compares the destination MAC address of the frame against the entries in its MAC table in order to forward the frame towards the host, and all is good.

Assuming a different scenario, in which inter-VLAN routing does take place: suppose we have a topology with a core switch and multiple access layer switches, with multiple VLANs configured, VLAN 10 and VLAN 20 for example. If host A on VLAN 10 wants to send a packet to host B on VLAN 20, both hosts are on separate access switches, and the uplinks from the access switches towards the core switch are configured as trunk ports allowing traffic for all VLANs, the following steps will occur:

  1. Let’s assume host A already ARP’d for the MAC address of the SVI on VLAN 10. Host A encapsulates the packet in a frame with the destination MAC of the SVI on VLAN 10 and sends it out.

  2. The access switch receives the frame, looks at the source MAC of the frame, and makes an entry in the MAC table if it hasn’t already. Then the switch looks at the destination MAC, compares it to the MAC address table entries, and forwards the frame towards the core switch. Let’s assume there’s an entry for the SVI’s MAC in the table already.

  3. The core switch receives the frame and populates the source MAC in the MAC table if it hasn’t already. It then notices the frame is for the SVI, since the destination MAC matches the MAC of the SVI. Upon de-encapsulation, the destination IP is not the IP of the SVI, so the packet is processed by the RIB and routed.

  4. VLAN 10 and 20 have directly connected networks on the core switch, so the switch can just ARP directly for the MAC address that corresponds to the destination IP of host B on VLAN 20. Then finally the switch encapsulates the packet that host A sent, into a new frame with the source MAC of the SVI on VLAN 20, and the destination MAC of host B. The MAC table is then referenced for the destination MAC of this frame and forwarded appropriately to the access switch host B is on.

  5. The access switch host B is on receives the frame, populates the source MAC in the MAC table if it hasn’t already, then looks at the destination MAC of the frame and forwards the frame out of whatever port host B is on. All is good.

Correct?

As for the SVI’s MAC not being in the MAC table in my original third question, I can’t believe I forgot something so basic for a second!

Thank You Laz

Hello Paul

The explanation of the first scenario is perfect and clear. It’s great how you are incorporating so many different aspects and functionalities in this process, describing them in detail. Good job. Just one clarification. You mention it, but I’d like to elaborate a bit more.

When the SVI of VLAN 10 receives the packet, decapsulates it, and reads the destination IP address, it sees that it is not the IP address of the SVI, so the packet is routed. Essentially, the routing process determines the egress interface for that packet. Based on the routing table, the destination address belongs to a directly connected network, that of VLAN 20, and the egress interface is the SVI of VLAN 20.
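This decision on the core switch can be sketched roughly as follows (the MACs, subnets, and ARP cache entries are invented for illustration; a real switch does this in hardware):

```python
# Illustrative sketch of the core switch's decision for an inter-VLAN frame.
import ipaddress

# Hypothetical SVIs on the core switch
svis = {
    "vlan10": {"mac": "aa:10", "net": ipaddress.ip_network("192.168.10.0/24")},
    "vlan20": {"mac": "aa:20", "net": ipaddress.ip_network("192.168.20.0/24")},
}
arp_cache = {"192.168.20.5": "bb:20"}   # host B's MAC, learned via ARP

def core_switch(dst_mac, dst_ip):
    # Is this frame addressed to one of our SVIs? If not, plain L2 forwarding.
    if dst_mac not in (s["mac"] for s in svis.values()):
        return "L2 forward via MAC table"
    # The destination belongs to a directly connected VLAN, so
    # re-encapsulate with the egress SVI's MAC as the new source MAC.
    addr = ipaddress.ip_address(dst_ip)
    for svi in svis.values():
        if addr in svi["net"]:
            return f"re-encapsulate: src MAC {svi['mac']}, dst MAC {arp_cache[dst_ip]}"
    return "RIB lookup for a non-connected destination"

print(core_switch("aa:10", "192.168.20.5"))
# re-encapsulate: src MAC aa:20, dst MAC bb:20
```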

Good job!

I hope this has been helpful!

Laz

Hello Laz,

I wanted to introduce a different scenario related to the discussion we were having, and discuss the process involved here.

Scenario 1:

Assume we have two routers (R11 and R12), directly connected to each other on their G0/0 interfaces on a 10.1.1.0/24 subnet. R11 has IP 10.1.1.1/24 on its G0/0 interface, and R12 has 10.1.1.2/24 configured on its G0/0 interface. R11 also has a loopback address configured, 11.11.11.11/32, and R12 has a loopback as well, 12.12.12.12/32.

My question is this: we spoke about how, when a multilayer switch/router receives a frame with a destination MAC that matches its receiving interface, it de-encapsulates the frame, and if the destination IP of the packet is not the IP of the receiving interface, then it has to route the packet. But what if a packet originates from R11 itself, destined to the 10.1.1.2 interface of R12? For example, if I were to log in to the CLI of R11 and ping 10.1.1.2, would the RIB be involved here? I am sure that when R12 receives the packet, since it was destined for that interface, and the source IP of R12’s response is on the same subnet as the destination IP of the response packet, it just replies without using the RIB. But I was curious about R11’s process and whether or not it should reference the routing table. I don’t believe it does, since the destination IP of the packet is on the same subnet as R11’s G0/0 interface configured with 10.1.1.1/24. So even though both routers have a directly connected network of 10.1.1.0/24 in their routing tables, any packets generated from either router going to either of their interfaces would not use the RIB.

The source of these questions is that when I have IP packet debugging enabled and I try this scenario on R11, I get: “IP: tableid=0, s=10.1.1.2 (GigabitEthernet0/1), d=10.1.1.1 (GigabitEthernet0/1), routed via RIB”

So while it’s clear how the routing process begins from the perspective of a router receiving a packet from a host that needs to be routed, how would we describe the routing process when the routers themselves generate a packet? My take on this is that when the router generates a packet with a destination IP that’s on the same subnet as one of its interfaces, we do not need to reference the routing table, and we just send the packet sourced from whatever interface is on the same subnet as the destination IP. But when the destination IP of the generated packet is not in the same subnet as one of the router’s interfaces, then we have to reference the routing table and find a match.

Are these explanations correct?
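For what it’s worth, one way to reconcile my take with the debug output is that the connected subnet itself is an entry in the table, so a table lookup and a “same subnet as my interface” check produce the same result: a connected-route match resolves straight to the egress interface. A toy sketch (invented entries, using Python’s ipaddress module, not IOS internals):

```python
# Hedged sketch of a locally generated packet's egress decision. The
# connected route is in the table, so looking it up gives the same
# answer as checking the router's own interface subnets.
import ipaddress

# R11's table from the scenario: prefix -> (route type, egress interface)
rib = {
    "10.1.1.0/24": ("connected", "Gi0/0"),
    "12.12.12.12/32": ("via 10.1.1.2", "Gi0/0"),
}

def egress(dest):
    addr = ipaddress.ip_address(dest)
    candidates = [p for p in rib if addr in ipaddress.ip_network(p)]
    if not candidates:
        return "drop: no route"
    best = max(candidates, key=lambda p: ipaddress.ip_network(p).prefixlen)
    kind, iface = rib[best]
    return f"{kind} -> {iface}"

print(egress("10.1.1.2"))      # connected -> Gi0/0 (ARP directly for 10.1.1.2)
print(egress("12.12.12.12"))   # via 10.1.1.2 -> Gi0/0 (ARP for the next hop)
```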

Scenario 2

What if the multilayer switch/router receives a frame on one of its interfaces, and upon de-encapsulation the destination IP of the packet is not for the receiving interface itself but for another interface on that same router? Would this scenario cause a routing table lookup?

Scenario 3

I wanted to shift and ask about the actual end-user devices that send a packet to their default gateways. We always talk about how end-user devices determine whether to send a packet to their gateway or to ARP directly for the device they want to communicate with, by comparing the destination IP of the packet they generate against their own IP address and subnet mask. If the destination IP is on the same subnet, the host ARPs for the MAC of that IP address; if it’s on a different subnet, it ARPs for the default gateway’s MAC and sends the packet to the default gateway. My question is specifically about a host running Windows. Aren’t these functions performed by checking the Windows routing table that we get by doing “route print” in the command terminal? For example, a Windows host generating a packet would check its routing table to see if it has an entry that matches the destination IP. If there’s a matching entry marked “on-link”, then we know the destination we want to reach is on the same subnet as one of our interfaces, so we ARP for it. If there are no matching “on-link” entries, then the destination IP is not on the same subnet as any of our interfaces (since we may be using more than one network interface card), and we use the 0.0.0.0 0.0.0.0 entry, which is our gateway. So am I correct in understanding that the Windows routing table is always used for this process? I understand that the Windows routing table lookup provides the same end result as the explanation that a device compares the destination IP of the packet it generates to its own IP address and subnet mask, but I just wanted to discuss this.
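The on-link decision I’m describing can be sketched as a longest-prefix lookup over a “route print”-style table (the entries below are made up for illustration):

```python
# Illustrative sketch of the host-side "who do I ARP for?" decision,
# modeled as a longest-prefix lookup over a Windows-style route table.
import ipaddress

host_table = {
    "192.168.1.0/24": "on-link",       # our own subnet
    "0.0.0.0/0": "192.168.1.254",      # default gateway
}

def arp_target(dest):
    """Which IP do we ARP for before sending the frame?"""
    addr = ipaddress.ip_address(dest)
    best = max((p for p in host_table if addr in ipaddress.ip_network(p)),
               key=lambda p: ipaddress.ip_network(p).prefixlen)
    hop = host_table[best]
    return dest if hop == "on-link" else hop

print(arp_target("192.168.1.50"))   # 192.168.1.50: same subnet, ARP directly
print(arp_target("8.8.8.8"))        # 192.168.1.254: ARP for the gateway
```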

Thank You Laz