CEF (Cisco Express Forwarding)

Hello Giovanni

In general, software-based network devices, such as firewalls, SBCs, or even routers that run on “off the shelf” servers, either virtual or physical, typically don’t have the same capacity, speed, or resources as purpose-built appliances such as hardware routers and firewalls.

This is because all of the processing of packets is done by the CPU itself using software rather than specialized independent high-speed hardware. Features such as CEF, and even fast switching are only possible because of the hardware architecture of the routers, an architecture that is not present on generic servers. Thus, CEF by definition cannot run on generic servers.

So as a rule of thumb, network devices running on off-the-shelf servers are almost always slower and more resource-intensive than specialized appliances such as routers, firewalls and others…

I hope this has been helpful!

Laz

Super useful, Thanks :grinning:


Hello Laz,

In a data center server and ToR switch architecture,
does a packet sent out from a server reach the ToR switch and come back again to the server? I am talking about southbound traffic.
Where does the MAC address play its role?
i need in packet destination mac & source mac matches along with source & destination mac…

Also, does the mac-learning limit command restrict DHCP behaviour on the server for IPv4 or IPv6?

BR//
Nitin Arora

Hello Nitin

In a Top of Rack (ToR) datacenter architecture, the servers within the rack are typically connected directly to the ToR switch. This means that packets sent out of any server will indeed go to the ToR switch, even if that traffic is destined for another server within the same rack.

Now where does the MAC table come into play here? Well, it depends upon what role the ToR switch plays. Is it an L2 or L3 switch? If it’s L2, then the MAC address table will be used to determine the egress port. If it is L3, then routing will be employed to determine the egress port.
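As a rough illustration of the difference, the sketch below models an L2 switch choosing an egress port from its MAC address table and an L3 switch choosing one by longest-prefix match on the destination IP. The table contents, port names, and function names are made up for the example, and real switches perform these lookups in hardware.

```python
import ipaddress

# Hypothetical MAC address table of an L2 ToR switch: MAC -> egress port.
MAC_TABLE = {
    "aa:aa:aa:aa:aa:01": "Eth1/1",
    "aa:aa:aa:aa:aa:02": "Eth1/2",
}

# Hypothetical routing entries of an L3 ToR switch: prefix -> egress port.
ROUTES = {
    ipaddress.ip_network("10.0.1.0/24"): "Eth1/1",
    ipaddress.ip_network("10.0.2.0/24"): "Eth1/2",
    ipaddress.ip_network("0.0.0.0/0"): "Uplink1",
}


def l2_egress(dst_mac):
    """L2: look up the destination MAC; an unknown unicast frame is flooded."""
    return MAC_TABLE.get(dst_mac.lower(), "flood")


def l3_egress(dst_ip):
    """L3: pick the most specific (longest) prefix that contains dst_ip."""
    addr = ipaddress.ip_address(dst_ip)
    matches = [net for net in ROUTES if addr in net]
    best = max(matches, key=lambda net: net.prefixlen)
    return ROUTES[best]


print(l2_egress("AA:AA:AA:AA:AA:02"))
print(l3_egress("10.0.2.15"))
print(l3_egress("8.8.8.8"))
```

The default route in the toy table guarantees that the L3 lookup always finds at least one match.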

Hmm, I’m not sure what you mean here. Can you clarify your question?

I hope this has been helpful!

Laz

Okay, it's L3, so routing will be employed.

Then I suppose mac-learning-limit will not play any role in the DHCP assignment of IPv4 or IPv6 addresses on the servers.

Correct me if my understanding above is wrong.

BR//
Nitin Arora

Hello Nitin

Yes, you are correct. The MAC address learning limit on a switch will not affect the assignment of IPv4 or IPv6 addresses using DHCP.

I hope this has been helpful!

Laz

Thanks Laz, now understood


Hello, everyone!

An amazing lesson. However, I have some questions to strengthen my understanding.

The routing table isn’t very suitable for fast forwarding because we have to deal with recursive routing.

Technically, if we configure an IGP or a fully-specified static route, we could prevent recursive routing, correct?

Most of the IP packets can be forwarded by the data plane. However there are some “special” IP packets that can’t be forwarded by the data plane immediately and they are sent to the control plane, here are some examples: IP packets that are destined for one of the IP addresses of the multilayer switch.

If we were to send an SSH packet to our device, shouldn’t this technically also be sent to the Management Plane? I believe that Rene mentioned that the Management plane is actually a sub-set of the Control plane, so are both actually involved?

Back in the days…switching was done at hardware speed while routing was done in software. Nowadays both switching and routing is done at hardware speed. In the remaining of this lesson you’ll learn why.

What specifically does “software switching/forwarding” mean? Even with software switching, isn’t there the CPU which does all the necessary instructions and calculations which is a hardware component? The word “software” is what confuses me a little.

Information like MAC addresses, the routing table or access-lists are stored into these ASICs. The tables are stored in content-addressable memory (CAM) and ternary content addressable memory (TCAM).

I am a little confused. Are all these tables like the RIB and the MAC address table stored in ASICs or in the CAM?

And to confirm one more thing, anything destined to the router or any IGP/EGP-related traffic is sent to the Control Plane and the CPU for processing, correct? Is this also why we implement “Control Plane Policing”? To prevent potential DoS attacks which would cause the CPU to become overwhelmed and drop packets?

Thank you in advance.

Hello David

Here are my responses:

Yes that is correct. More about fully specified static routes can be found at this NetworkLessons note.
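To make the recursive lookup concrete, here is a minimal sketch, assuming a toy routing table in which some routes name only a next-hop IP (which must itself be looked up) and others also name the exit interface. The prefixes, addresses, and the `resolve` helper are hypothetical.

```python
import ipaddress

# Toy routing table: prefix -> (next_hop_ip or None, exit_interface or None).
# A route with only a next hop forces another lookup for that next hop.
RIB = {
    "172.16.0.0/16": ("10.0.0.2", None),       # next hop only: recursive
    "10.0.0.0/30": (None, "Gi0/1"),            # connected: interface known
    "192.168.5.0/24": ("10.0.0.2", "Gi0/1"),   # fully specified static route
}


def longest_match(dst):
    """Return the most specific prefix in RIB containing dst, or None."""
    addr = ipaddress.ip_address(dst)
    hits = [ipaddress.ip_network(p) for p in RIB]
    hits = [n for n in hits if addr in n]
    if not hits:
        return None
    return str(max(hits, key=lambda n: n.prefixlen))


def resolve(dst):
    """Return (exit_interface, number_of_table_lookups) for dst."""
    lookups = 0
    target = dst
    while True:
        prefix = longest_match(target)
        lookups += 1
        if prefix is None:
            return None, lookups
        next_hop, interface = RIB[prefix]
        if interface is not None:
            return interface, lookups
        target = next_hop  # look up the next hop itself


print(resolve("172.16.1.1"))    # needs a second lookup for 10.0.0.2
print(resolve("192.168.5.9"))   # fully specified: resolved in one lookup
```

In this toy model the fully specified route resolves in a single lookup, while the next-hop-only route needs two.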

First of all, the term “management plane” is used in a different context and should not be confused with the idea of the control and data planes. The control and data planes in the context of what is being expressed in the lesson are the only “planes” that operate, and they deal with the processing of packets. The management plane is viewed in the context of network configuration, monitoring, management, updates, and other administrative tasks. If an SSH session with the local switch is to take place, the packets received will be sent to the control plane as described in the lesson. Looking at it from an administrative point of view, yes, this would use the management plane, but this is outside of the context we’re talking about here.

“Software forwarding” means that when a packet is received, it is processed using software that has been loaded into the RAM. The CPU is involved in the processing, of course, but it is a general-purpose component that executes the forwarding logic as a series of instructions, so such processing is inherently slow.

Hardware processing is all done on small specialized chips in which the logic for handling packets is hardwired into the circuitry. These have everything (logic, processing, memory etc) on chip, so there is no delay in the communication between separate entities. Such hardware components are specially designed to do a single task, and their logic cannot be modified.

CAM and TCAM are distinct components of memory that are used within Cisco switches and routers. However, ASICs can be designed with integrated CAM and TCAM. This is done to further speed up the processing time.

Yes, this is correct. CoPP is indeed used to protect the resources made available to the control plane on networking devices.

I hope this has been helpful!

Laz


Hello Laz

I really appreciate that you took the time to answer literally all of my questions, thank you :slight_smile: I would just like you to elaborate more on this:

First of all, the term “management plane” is used in a different context and should not be confused with the idea of the control and data planes. The control and data planes in the context of what is being expressed in the lesson are the only “planes” that operate, and they deal with the processing of packets. The management plane is viewed in the context of network configuration, monitoring, management, updates, and other administrative tasks. If an SSH session with the local switch is to take place, the packets received will be sent to the control plane as described in the lesson. Looking at it from an administrative point of view, yes, this would use the management plane, but this is outside of the context we’re talking about here.

So if I understand this correctly, only the data plane and the control plane are tasked with handling and processing packets? The management plane is not actually performing any packet-related functions, it’s just something that we refer to when we talk about configuring, monitoring or managing our devices in general from an administrative perspective?

David

Hello David

Yes, that is the case. The actual definitions of the planes involved really depend upon the context as well as who you ask. Others may have slightly different interpretations and approaches to some nuanced meaning of each term.

However, in the context of CEF and strictly speaking, the control and data planes are the only relevant entities concerning the processing and forwarding of packets.

The management plane is only relevant in the context of operations involving the accessing of devices via SSH, Telnet, SNMP, as well as processes involving logging, software updates, and network monitoring systems. The management plane is typically created as part of a network design, ensuring the appropriate management VLANs, management interfaces, CLI connectivity and NMS services are all correctly established.

I hope this has been helpful!

Laz


Hello, everyone.

I have a lot of questions regarding CEF because I am not quite sure how the internal operation works…

  1. The FIB is derived from the RIB and is optimized for faster lookups and easier instruction handling, so how is the FIB a data plane concept and not a control plane concept, then? Or in other words, I don’t think I understand why the RIB is a control plane thing while the FIB is data plane when they achieve and do the same thing.

If we are running process-switching, isn’t the RIB both a control and a data plane thing? I can only logically think of it being a control plane thing only when we are running CEF.

  2. Where exactly is the FIB stored? I understand that the ASIC chip is responsible for super-fast forwarding unlike a regular CPU, but it has to load the data from the FIB somehow, right? So the FIB must be stored in some kind of memory. Is this the TCAM? If so, is the TCAM a separate hardware component or is it integrated into the ASIC?

  3. If I disable CEF, why is everything process-switched? Is the ASIC inside the device just completely ignored? I thought that disabling CEF would only remove the FIB and the adjacency table but the ASIC would still be able to perform these functions.

  4. I don’t quite get the three possible results that the TCAM can provide. If you check the forwarding table for an entry, let’s say 192.168.1.1, when would it return 1, when 0, and when “X”? And what does the X even mean in this context? I am not quite understanding the “don’t care” or “anything” explanations.

  5. Can a device have only an ASIC chip without the TCAM? So we would have something like an ASIC - RAM communication.

  6. Where is the routing table stored? The RIB is stored in the RAM while the FIB should be in the TCAM, right?

  7. Not trying to go too deep but from a simple perspective, why exactly and how is the TCAM faster than a RAM? How do the lookups differ?

Thank you.

David

Hello David

I’ll try to respond to each question as best I can.

The distinction between the control plane and the data plane lies in their functionality. The control plane is responsible for network-wide logic such as routing protocols and the creation of the Routing Information Base (RIB), while the data plane is responsible for the actual forwarding of packets, which is where the Forwarding Information Base (FIB) comes in.
The RIB and FIB don’t achieve the same thing. RIB is involved in making decisions about which path to use for data forwarding based on routing protocols (OSPF, EIGRP etc). Once these decisions are made, they are then used to populate the FIB, which is used for the actual forwarding of the data packets. The FIB can be considered a “hard wired set of rules” that are followed blindly. No processing takes place other than a single lookup. Therefore, RIB is a control plane concept because it deals with network-wide logic and decision making using control plane protocols for this purpose, while FIB is a data plane concept because it deals with the actual forwarding of the packets.
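A minimal sketch of that division of labour, assuming a toy model: the control plane compares candidate routes for the same prefix (here by administrative distance, then metric) and installs only the winner into the FIB, which the data plane then consults without any further decision making. The route values and the `build_fib` name are illustrative; the administrative distances shown are the Cisco defaults for OSPF and internal EIGRP.

```python
from collections import namedtuple

Route = namedtuple("Route", "prefix source ad metric next_hop interface")

# Candidate routes learned by the control plane (values are made up).
CANDIDATES = [
    Route("10.1.0.0/16", "OSPF", 110, 20, "192.0.2.1", "Gi0/1"),
    Route("10.1.0.0/16", "EIGRP", 90, 3072, "192.0.2.5", "Gi0/2"),
    Route("10.2.0.0/16", "OSPF", 110, 30, "192.0.2.1", "Gi0/1"),
    Route("10.2.0.0/16", "OSPF", 110, 10, "192.0.2.9", "Gi0/3"),
]


def build_fib(candidates):
    """Control plane: choose one best route per prefix, emit forwarding entries."""
    best = {}
    for route in candidates:
        current = best.get(route.prefix)
        # Lower administrative distance wins; metric breaks ties.
        if current is None or (route.ad, route.metric) < (current.ad, current.metric):
            best[route.prefix] = route
    # The FIB keeps only what forwarding needs: next hop and exit interface.
    return {p: (r.next_hop, r.interface) for p, r in best.items()}


FIB = build_fib(CANDIDATES)
for prefix, entry in FIB.items():
    print(prefix, entry)
```

All of the comparison logic lives in `build_fib`. Once the FIB exists, forwarding is a plain lookup with no knowledge of which protocol supplied the route.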

The FIB is typically stored in high-speed memory for quick access, and this is often implemented using TCAM. The TCAM can be a separate hardware component or it can be integrated into the ASIC, depending on the specific hardware design of the router or switch.

This is done by definition. The ASIC is still there and capable of performing its functions, but by disabling CEF, you’re telling the switch to send all packets to be examined by the CPU. You typically wouldn’t do this in any production network unless it is required for troubleshooting purposes.

Take a look at this NetworkLessons note about TCAM Lookups.
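As a small illustration of the three states, the sketch below models a TCAM entry as a string of `0`, `1` and `X` characters, where `X` matches either bit value. It assumes prefix-style entries (the network bits are fixed and the host bits are `X`). The helper names are invented, and real TCAM compares all entries in parallel in hardware rather than in a software loop.

```python
import ipaddress


def to_bits(ip):
    """Dotted IPv4 address -> 32-character bit string."""
    return format(int(ipaddress.IPv4Address(ip)), "032b")


def prefix_to_ternary(prefix):
    """'192.168.1.0/24' -> 24 fixed bits followed by 8 'X' (don't care) bits."""
    net = ipaddress.ip_network(prefix)
    fixed = format(int(net.network_address), "032b")[: net.prefixlen]
    return fixed + "X" * (32 - net.prefixlen)


def ternary_match(entry, key_bits):
    """True if every non-X position in the entry equals the key bit."""
    return all(e == "X" or e == k for e, k in zip(entry, key_bits))


entry = prefix_to_ternary("192.168.1.0/24")
print(entry)
print(ternary_match(entry, to_bits("192.168.1.1")))  # host bits are ignored
print(ternary_match(entry, to_bits("192.168.2.1")))  # a fixed bit differs
```

So a stored bit of 0 or 1 must match the packet's bit exactly, while X means the position is not checked at all. That is how a single entry can represent an entire prefix.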

Yes, a device can indeed have only an ASIC chip without TCAM. In such a setup, the ASIC would typically interact with regular RAM or other types of memory/storage to perform its functions. But this depends highly on the design and architecture of the device.

Yes, in modern networking devices, the RIB and the FIB are typically stored in different types of memory to optimize performance and functionality. The RIB is stored in RAM because it is part of the control plane, and requires the main CPU to perform the related processes to populate it and keep it up to date. The FIB is stored in TCAM for the reasons we mentioned before (primarily speed).

Without going too deep into the topic, data structures in RAM are searched by software, one step at a time. The search time depends on the data structure used. In the simplest case, a linear scan, one entry is compared per step, so the search time grows with the number of entries in the table.

TCAM (and CAM), on the other hand, performs parallel searches across all entries in the table simultaneously, rather than sequentially. This is due to the hardware design of TCAM, which allows each bit of the search word to be compared against the corresponding bit in all entries simultaneously. This parallel search typically takes a single clock cycle, which allows for very fast lookup times regardless of the number of entries.
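The scaling difference can be sketched by counting comparisons. This is a simplified model, assuming an unsorted table searched entry by entry; software routers normally use smarter structures such as tries or hash tables, so treat it as an illustration of the idea rather than a measurement. Both function names are made up for the example.

```python
def linear_lookup(table, key):
    """Scan entries in order; return (index or None, comparisons made)."""
    comparisons = 0
    for index, entry in enumerate(table):
        comparisons += 1
        if entry == key:
            return index, comparisons
    return None, comparisons


def cam_style_lookup(table, key):
    """Model of a CAM: every entry is compared in the same lookup step.

    Python still loops internally here; the point being modelled is that
    the hardware cost is one step regardless of table size.
    """
    hits = [i for i, entry in enumerate(table) if entry == key]
    return (hits[0] if hits else None), 1


small = list(range(100))
large = list(range(50_000))
print(linear_lookup(small, 99))
print(linear_lookup(large, 49_999))
print(cam_style_lookup(large, 49_999))
```

In the worst case the linear scan's comparison count equals the table size, while the modelled CAM lookup stays at one step.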

I hope this has been helpful!

Laz

Hello.

So the RAM follows a certain order and typically compares one entry per clock cycle. So the more entries there are, the more processing is needed to successfully perform a lookup.

However, this is basically impossible for the human eye to see, right? Since processors and forwarders can perform billions of cycles per second.

And so the TCAM then basically can compare everything at once, correct? It doesn’t follow a certain order.

I’ve also read somewhere that the RAM requires a search algorithm while with TCAM, the ASIC feeds the entry into it and the TCAM sends back a matching entry.

David

Hello David

Yes this is correct.

You can’t really “see” it, however, it begins to become perceivable if you have a routing table with 50000 entries, and if your router receives thousands of packets per second. You will notice a slowing down of traffic as the CPU would become overburdened with sequential routing table lookups.

With RAM, the user supplies a memory address and the RAM returns the data word stored at that address. To search all of those addresses, you can apply search algorithms of various types in software to search the contents of those memory addresses.

CAM, on the other hand, is memory that has a hardwired search function built into the hardware itself. Each memory location in CAM has its own dedicated comparison circuit, so the input word is compared against all stored entries simultaneously. The CAM sends back, in a single cycle, whether or not a match has been found. So CAM (and TCAM of course) is implemented largely in hardware, within the ASIC, and this is what makes it so fast.
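The difference in how the two memories are addressed can be sketched as follows. This is only an analogy, assuming a Python list stands in for RAM and a dictionary stands in for the CAM's parallel comparison circuits; the stored MAC addresses and function names are made up.

```python
# RAM: you supply an address and get back the word stored there.
ram = ["aa:aa:aa:aa:aa:01", "aa:aa:aa:aa:aa:02", "aa:aa:aa:aa:aa:03"]


def ram_read(address):
    """Address in, data out."""
    return ram[address]


# CAM: you supply the word and get back the address where it is stored.
# The dict is a stand-in for hardware that compares every location at once.
cam = {word: address for address, word in enumerate(ram)}


def cam_search(word):
    """Data in, address out; None means no match was found."""
    return cam.get(word)


print(ram_read(1))
print(cam_search("aa:aa:aa:aa:aa:03"))
print(cam_search("ff:ff:ff:ff:ff:ff"))
```

To answer "where is this word stored?" with plain RAM, software would have to call `ram_read` for one address after another, which is the search algorithm mentioned above.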

I hope this has been helpful!

Laz