Designing scalable networks and automation (Meta Interview Question)

neeraj.menon.m · March 26, 2024, 10:41pm

Hello All
This is my first post here, so I’m not sure if it’s in the right category! I have final interview rounds with Meta for a Network Production Engineer role coming up next week, and the design interview will have questions like the two examples I have shared below. I’m a CCNA R&S who worked at Cisco for 3 years, but mostly in a troubleshooting role, and I barely did any design. I’m looking for advice on how one would approach and brainstorm the questions:

45 minute interview - will require you to design a network component or network related large-scale system.
Example 1: Design a network for 100 hosts. (Ask about economics, scale, redundancy models etc). Follow up: upgrade to 1000 hosts. How would you monitor the network?
Example 2: Build an automated network design framework

Any help with this (even things like questions I might ask for clarifying the scenario or core ideas to keep in mind) would be greatly appreciated! I’m aware of architectures like 3 layer for campus and spine-leaf for data center, but not sure how to structure my approach since it is a pretty open ended question.

lagapidis · March 28, 2024, 6:00am

Hello Neeraj

It’s great to see you’re preparing for the interview. It’s a great opportunity and I wish you success. Concerning the interview questions, from what I understand in your post, you will be able to ask clarification questions, correct? If so, here’s how I would approach these questions. A lot of the below is me just thinking aloud:

Example 1: Design a network for 100 hosts.
Firstly, you need to clarify the nature of the hosts - are they servers, workstations, IoT devices, etc.? Also, clarify what kind of traffic will be served by the network - time-sensitive traffic like VoIP and video and/or mission-critical services, file sharing, databases, or simple Internet surfing. These factors will influence the network design greatly because they will affect bandwidth requirements, QoS, security considerations, and the topology.

Network Topology: A typical design for 100 hosts could be based on a hierarchical network model with a core layer, distribution layer, and access layer. This design provides scalability, redundancy, and manageability.
IP Addressing: You could use a single subnet (e.g., 192.168.1.0/24) for 100 hosts. A /24 provides 254 usable host addresses, which leaves room for growth. However, you may want to divide it into smaller subnets based on some other categorization of hosts (e.g., services, departments, type of devices, security requirements).
Redundancy: You could use protocols like HSRP/VRRP for gateway redundancy, and implement redundant links between switches to avoid single points of failure.
Security: If you have multiple subnets, implement VLANs to segment the network and reduce broadcast traffic. Also, consider firewalls, intrusion prevention systems, and access control lists for enhanced security.
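To make the IP addressing point above concrete, here is a minimal sketch using Python's standard `ipaddress` module. The split into four /26 subnets and the group names (servers, staff, voice, guest) are assumptions for illustration only; the real split depends on the answers to the clarifying questions.

```python
import ipaddress

# The /24 mentioned above; the group names below are hypothetical.
site = ipaddress.ip_network("192.168.1.0/24")

# Usable host addresses exclude the network and broadcast addresses.
usable = site.num_addresses - 2
print(usable)  # 254

# Split the /24 into four /26 subnets, for example one per host category.
subnets = list(site.subnets(new_prefix=26))
for name, net in zip(["servers", "staff", "voice", "guest"], subnets):
    print(name, net, net.num_addresses - 2)
```

Each /26 holds 62 usable addresses, so this particular split only works if no single category needs more than that.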

Follow up: Upgrade to 1000 hosts.
This follow-up question seems to be for evaluating the scalability built into the original design. You will need to consider:

your addressing scheme,
bandwidth requirements,
expansion of the access layer of the network while ensuring the distribution and core have enough bandwidth to handle the traffic
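As a rough sketch of the first and last points, the snippet below computes the smallest IPv4 subnet that fits a given host count and the oversubscription ratio of an access switch. The port counts and speeds (48 x 1 Gbps downlinks, 2 x 10 Gbps uplinks) are assumed values for illustration, not a recommendation.

```python
def smallest_prefix(hosts: int) -> int:
    """Return the longest IPv4 prefix length that fits `hosts` usable addresses."""
    # We need 2**host_bits >= hosts + 2 (network and broadcast addresses).
    host_bits = (hosts + 1).bit_length()
    return 32 - host_bits


def oversubscription(ports: int, port_gbps: float,
                     uplinks: int, uplink_gbps: float) -> float:
    """Ratio of total downlink capacity to total uplink capacity."""
    return (ports * port_gbps) / (uplinks * uplink_gbps)


print(smallest_prefix(100))   # 25, i.e. a /25 with 126 usable addresses
print(smallest_prefix(1000))  # 22, i.e. a /22 with 1022 usable addresses

# Hypothetical access switch: 48 x 1 Gbps ports, 2 x 10 Gbps uplinks.
ratio = oversubscription(48, 1, 2, 10)
print(f"{ratio:.1f}:1")  # 2.4:1
```

In practice you would rarely put 1000 hosts in one /22 broadcast domain; the calculation is mainly useful for sizing the overall address block before carving it into per-VLAN subnets.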

Monitoring: Network monitoring can be done using SNMP-based tools like SolarWinds, PRTG, etc. You can monitor network performance and device status, and receive alerts for any anomalies. What you monitor will of course depend upon answers to some of the initial questions, like what kind of hosts and what kind of traffic you can expect.
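For a sense of what such tools compute under the hood, here is a minimal sketch that turns two readings of an interface octet counter (such as IF-MIB's `ifHCInOctets`) into a utilization percentage. The sample values and the 80% alert threshold are assumptions for illustration; the polling itself is left out.

```python
def utilization_pct(octets_t0: int, octets_t1: int, interval_s: int,
                    speed_bps: int, counter_bits: int = 64) -> float:
    """Link utilization in percent from two octet-counter samples."""
    # The modulo handles a single counter wrap between the two samples.
    delta_octets = (octets_t1 - octets_t0) % (2 ** counter_bits)
    return delta_octets * 8 * 100 / (interval_s * speed_bps)


ALERT_THRESHOLD_PCT = 80  # hypothetical threshold

# Hypothetical samples taken 60 seconds apart on a 1 Gbps link.
pct = utilization_pct(1_000_000, 376_000_000, 60, 1_000_000_000)
print(pct)  # 5.0
if pct > ALERT_THRESHOLD_PCT:
    print("ALERT: link utilization above threshold")
```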

Example 2: Build an automated network design framework.
This is a broad question, and you might need to clarify what exactly they mean by an “automated network design framework”. If they’re talking about automating network configuration and management, then you could discuss technologies like:

Software-defined networking (SDN): This allows for centralized network management and configuration.
Network function virtualization (NFV): This can help to automate the deployment of network services.
Configuration management tools like Ansible, Puppet, Chef, etc. These tools can automate the deployment and configuration of network devices.
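Intent-based networking (IBN): This is a form of network automation in which you declare the desired outcome (the intent), and the system translates it into device configuration and continuously verifies that the network still matches it. Some implementations also use machine learning and AI for this.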
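Whichever tool is used, the core idea is usually the same: keep the design as data and render device configuration from templates. Below is a minimal sketch using only Python's standard library. The template text is IOS-style but purely illustrative, and the hostname, interface names, and VLAN numbers are hypothetical; tools like Ansible do this with richer templating and inventory handling.

```python
from string import Template

# Illustrative, IOS-style template; real syntax depends on the platform.
INTERFACE_TEMPLATE = Template(
    "interface $name\n"
    " description $description\n"
    " switchport access vlan $vlan"
)


def render_switch(hostname: str, ports: list[dict]) -> str:
    """Render a config snippet for one switch from a simple data model."""
    blocks = [f"hostname {hostname}"]
    for port in ports:
        blocks.append(INTERFACE_TEMPLATE.substitute(port))
    return "\n".join(blocks)


# Hypothetical design data for one access switch.
design = [
    {"name": "Gi1/0/1", "description": "staff", "vlan": 10},
    {"name": "Gi1/0/2", "description": "voice", "vlan": 20},
]
config = render_switch("access-sw1", design)
print(config)
```

Keeping the design data separate from the templates means that growing from 100 to 1000 hosts is mostly a change to the data, not to the code.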

Remember, the key to these design questions is to clarify the requirements first, and then propose a solution based on best practices and your own experience.

Out of the whole process, I, as an interviewer, would evaluate the clarification questions asked by the interviewee more than the final network design. The questions indicate the level of critical thinking involved in the process.

If you have further clarification questions, feel free to respond so that we can continue the conversation…

I hope this has been helpful!

Laz

neeraj.menon.m · March 30, 2024, 12:02am

Thank you for the detailed response!
I have a query regarding routing protocols. What questions would you ask when comparing two protocols to pick for an IGP? I’m looking for pros vs. cons and suitable scenarios for each. I’m aware of a few comparisons, like EIGRP converging faster than OSPF because it already has a feasible successor, but also that its potential to suffer from stuck-in-active cases makes it tougher to manage. What design criteria, practical considerations, or real-world use cases could influence choosing between EIGRP vs. OSPF, or distance vector vs. link state, for an IGP? I’m not sure if this is an important question from an interview standpoint, but I’ve been very curious about this!

I’m expected to be able to generally describe and contrast protocols. I’m prepared for questions like how loop prevention is achieved differently in eBGP, iBGP, OSPF, and EIGRP, and differences in types of packets, states, tables, neighborship, and update mechanisms. With BGP, I’ve also covered things like the need for an iBGP full mesh and using route reflectors or confederations to overcome it, using attributes for influencing paths, and a little bit about communities. Are there any other major concepts I might be missing? Also, as an interviewer, what kind of questions would you ask to check for competency in these topics?

Also, considering this is for a position at Meta, what considerations would you keep in mind when designing a network that caters to a social media platform like Facebook? With millions of globally distributed users to serve and large amounts of data and media going through the data centers, what would an end-to-end solution look like at a high level?

lagapidis · April 2, 2024, 5:53am

Hello Neeraj

Before I address your specific questions, keep in mind that the key to an interview is that the interviewer is attempting to determine if the interviewee has the skill set that they’re looking for. That’s the number one thing that determines the interview questions. If I were the interviewer, I would want the candidate to have a good balance between critical thinking and knowledge required for the job. When I ask a question, I want first to see that the candidate has at least the baseline level of knowledge that I’m looking for. If they’re not sure about a detail about a particular routing protocol, that’s not a dealbreaker. They can always look it up in a real-world scenario!

The more important thing I would look for is critical thinking. The kinds of clarification questions they ask, the thought process they would go through, and the way they would approach a problem are all vital to understanding how well they’d behave in the job position.

Like I said, if the job position demands it, I would try to find out what knowledge they have of routing protocols.

Here are some criteria that you can use to approach this issue. By no means is this exhaustive, but it’s a good start:

Network Size - OSPF is preferable for very large networks because its hierarchical, multi-area design keeps each router’s link-state database smaller.
Network topology - If the topology lends itself to naturally occurring areas, then OSPF would be better. If it does not have any naturally occurring separation into areas, EIGRP may be a better bet.
Vendor compatibility - OSPF is an open protocol while EIGRP, although supported by some vendors other than Cisco, is generally more a Cisco-oriented protocol.
Resource usage - On larger scale networks, OSPF can be more resource intensive, especially if you have many subnets.
Convergence time - EIGRP generally converges faster in most cases. To achieve similar results with OSPF, you need additional tuning and more complex configuration.
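The convergence point is tied to EIGRP’s feasibility condition: a neighbor qualifies as a feasible successor (a pre-computed, loop-free backup) only if its reported distance is strictly lower than the current feasible distance. Here is a simplified sketch of that check. The router names and costs are made-up abstract numbers rather than real EIGRP composite metrics, and the feasible distance is taken to be the current best total, which ignores how it is tracked over time.

```python
def pick_paths(neighbors: dict[str, tuple[int, int]]) -> tuple[str, list[str]]:
    """neighbors maps name -> (reported_distance, cost_of_link_to_neighbor).

    Returns the successor and the list of feasible successors.
    """
    totals = {name: rd + link for name, (rd, link) in neighbors.items()}
    successor = min(totals, key=totals.get)
    feasible_distance = totals[successor]  # simplification, see lead-in
    backups = sorted(
        name for name, (rd, _) in neighbors.items()
        if name != successor and rd < feasible_distance
    )
    return successor, backups


# Hypothetical neighbors of one router for a single destination.
successor, backups = pick_paths({"R2": (10, 5), "R3": (12, 10), "R4": (20, 5)})
print(successor, backups)  # R2 ['R3']
```

If R2 fails, the router can switch to R3 immediately without querying its neighbors. R4 does not qualify, so if it were the only alternative, the route would go active and queries would be sent, which is where stuck-in-active problems can arise.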

Those are some of the differences that can affect what you would use. This NetworkLessons note I just created also lists some more details on these differences. You may also find some useful information here:

I believe that the most important thing at this point is for them to see that you are someone who has a good baseline understanding of things. Typically, companies like Meta will want to invest in their workforce, so they will train you in the areas they are interested in. They want to know how quickly they can get an ROI on that investment. Because there is quite a bit of turnover in the industry, they want to make sure that you are quick to pick things up and that you will use the knowledge and skill set you obtained in their training for several years.

Show them your competency and your willingness and capacity to learn. I think that’s the key. I hope I’ve helped you out and responded to most of your questions.

I hope this has been helpful!

Laz