Telecom IQ

Upgrade to Carrier Grade: The Nuts and Bolts of NFV


By: Tariq Khan - Chief Technologist SDN/NFV

Arun Thulasi - WW Lead Technologist, NFV POC Center of Excellence & Master Solution Architect, Global Solution Engineering


Imagine you are a certain John Smith, living in the suburbs of New York City. If you needed a car, you would buy one that fits your needs and constraints – commuting from home to work, picking up the kids from school and soccer, gas mileage and the like. A car that would meet your needs would look a lot like this.







On the other hand, if you are Bruce Wayne, watching over Gotham City and moonlighting as its Caped Crusader, your car would share only its name with its less illustrious cousin above and would be a completely different animal, as shown below.

[Image: car 2.jpg]


Enterprise clouds are akin to the John Smiths of our world. Their needs are centered on virtualizing large pools of infrastructure that scale rapidly to meet elastic workloads and provide significant cost reductions. Telco clouds, like Bruce Wayne, are tellingly different.


A checklist for carrier grade


Communications service providers (CSPs) are exploring network functions virtualization (NFV) as a means to achieve leaner cost structures and cloud-style agility. By choosing industry-standard physical infrastructure components and overlaying them with a variety of industry-tested open source technologies, CSPs can reduce capital and operating expenses.


However, unlike enterprise IT, Telco environments demand significantly higher levels of reliability, availability and resilience. Telco clouds require an enhanced architectural framework that provides these features at various levels of the architecture. This includes providing predictable levels of service, typically five to six nines of availability (99.999% to 99.9999%), for both the control and compute planes. The control plane encompasses the elements that manage the physical and virtual infrastructure, while the compute plane hosts the entities that form the target workload.
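To make the availability target concrete, the downtime budget implied by a given number of nines can be worked out with simple arithmetic. The sketch below is only an illustration: the function name and the use of an average 365.25-day year are assumptions, not part of any standard.

```python
# Average year length in minutes (assumption: 365.25 days to account for leap years).
MINUTES_PER_YEAR = 365.25 * 24 * 60


def downtime_minutes_per_year(nines: int) -> float:
    """Maximum yearly downtime allowed by an availability of `nines` nines."""
    unavailability = 10 ** (-nines)  # e.g. 5 nines -> 0.00001
    return MINUTES_PER_YEAR * unavailability


if __name__ == "__main__":
    for n in (3, 5, 6):
        print(f"{n} nines: {downtime_minutes_per_year(n):.2f} minutes per year")
```

Under these assumptions, five nines leaves a budget of roughly five and a quarter minutes of downtime per year, and six nines leaves about half a minute.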


Beyond the various availability and reliability needs, Telco cloud environments also require higher levels of performance than enterprise clouds. This includes performance enhancements needed at the compute, memory, storage and networking layers.


To be considered carrier grade by CSPs, a platform must provide IT-driven cost structures while also offering Telco-grade reliability and performance.


Highly available control plane


CSPs expect a uniform availability metric across the control and compute planes. This requires that the control plane be deployed in a redundant manner similar to how virtual network functions (VNFs) are deployed redundantly.


To ensure an appropriate level of availability, a carrier grade platform needs to ensure that the critical services it deploys are stateless, backed by an equally available persistence layer that tracks state. In a typical carrier grade environment, this would be a SQL or NoSQL database cluster deployed in a way that avoids split-brain situations. To ensure consistent levels of availability across the virtualized platform, carrier grade environments require a common, widely used hypervisor. The Kernel-based Virtual Machine (KVM) is an industry-tested hypervisor that fits the need.
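The text does not say how split-brain is avoided. One common approach, assumed here purely for illustration, is a majority quorum: a partition of the cluster may accept writes only if it can reach a strict majority of the configured members. The function name below is hypothetical.

```python
def has_quorum(reachable_members: int, cluster_size: int) -> bool:
    """Return True if this partition holds a strict majority of the cluster.

    `reachable_members` counts the local node plus every peer it can reach.
    """
    return reachable_members > cluster_size // 2


if __name__ == "__main__":
    # A three-node cluster split 2/1: only the two-node side keeps quorum.
    print(has_quorum(2, 3), has_quorum(1, 3))
    # A four-node cluster split 2/2: neither side keeps quorum.
    print(has_quorum(2, 4), has_quorum(2, 4))
```

Because two disjoint partitions can never both hold a strict majority, at most one side keeps accepting writes. This is also why such clusters are usually sized with an odd number of members: an even split leaves no side with quorum.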


Real-time extensions with pre-emptible kernel


Telco workloads require deterministic behavior from application processes, which can be specified and tuned by assigning priorities. In a traditional operating system, processes are allocated a time-slice (a maximum execution time) in which they execute their tasks. A user-space process with a higher priority has to wait until the time-slice of the currently executing user-space process expires. Additionally, with a non-pre-emptible kernel, a thread executing in the kernel has to either finish or voluntarily relinquish control before another process can take over. This behavior undermines the reliability and real-time capabilities of a Telco-grade platform.


A carrier grade eligible operating system needs to provide real-time extensions that make the kernel fully pre-emptible. The scheduler can then honor real-time priorities by pre-empting a running process to execute another process with a higher priority. For Linux, real-time extensions are available for the 2.6 and 3.x kernel series.
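The effect of pre-emption on latency can be illustrated with a toy single-CPU simulation. This is not the Linux scheduler, only a minimal sketch: the `Task` fields, the tick-based model and the "higher number means more urgent" convention are all assumptions made for the example.

```python
from dataclasses import dataclass, replace


@dataclass
class Task:
    name: str
    priority: int   # higher number = more urgent (convention chosen for this sketch)
    arrival: int    # tick at which the task becomes runnable
    remaining: int  # ticks of CPU time still needed


def first_run_tick(tasks, target, preemptive):
    """Simulate one CPU and return the tick at which `target` first gets to run."""
    tasks = [replace(t) for t in tasks]  # work on copies, leave the inputs untouched
    current = None
    tick = 0
    while any(t.remaining > 0 for t in tasks):
        ready = [t for t in tasks if t.arrival <= tick and t.remaining > 0]
        if ready:
            best = max(ready, key=lambda t: t.priority)
            # A pre-emptive scheduler re-evaluates priorities on every tick;
            # a non-pre-emptive one keeps the running task until it finishes.
            if preemptive or current is None or current.remaining == 0:
                current = best
            if current.name == target:
                return tick
            current.remaining -= 1
        tick += 1
    return None


if __name__ == "__main__":
    workload = [
        Task("batch", priority=1, arrival=0, remaining=10),
        Task("packet", priority=9, arrival=2, remaining=3),
    ]
    print("pre-emptive:", first_run_tick(workload, "packet", preemptive=True))
    print("non-pre-emptive:", first_run_tick(workload, "packet", preemptive=False))
```

In this model the urgent task runs as soon as it arrives when pre-emption is enabled, but has to wait for the low-priority task to finish when it is not.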


CPU and memory enhancements for workload VMs


Cloud environments add a hypervisor layer, which introduces overhead compared with traditional bare-metal systems. To ensure that the hypervisor does not hamper performance, cloud environments provide various CPU and memory enhancements for workload VMs.


A carrier grade environment allows the user to pin specific VMs to specific CPUs to ensure deterministic performance. Host operating systems also ideally enable large pages and reserve contiguous memory pools for VMs to reduce or eliminate memory fragmentation. Current-day servers have NUMA capabilities, which provide low-latency access to memory within the same cell. When libvirt understands the host's CPU maps, VMs can be placed on specific cells, thereby achieving memory locality and increased performance.
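As a rough illustration of how these three ideas look in a libvirt domain definition, the sketch below generates a partial XML fragment using the `memoryBacking`/`hugepages`, `cputune`/`vcpupin` and `numatune`/`memory` elements. The helper name and the example CPU and node numbers are assumptions, and the fragment is deliberately incomplete: a real domain also needs a name, memory size, devices and so on.

```python
import xml.etree.ElementTree as ET


def pinning_fragment(vcpu_to_pcpu: dict, numa_node: int) -> str:
    """Build a partial libvirt domain XML with huge pages, vCPU pinning and NUMA placement."""
    domain = ET.Element("domain", type="kvm")

    # Back guest memory with huge (large) pages.
    backing = ET.SubElement(domain, "memoryBacking")
    ET.SubElement(backing, "hugepages")

    # Pin each virtual CPU to one physical CPU.
    cputune = ET.SubElement(domain, "cputune")
    for vcpu, pcpu in sorted(vcpu_to_pcpu.items()):
        ET.SubElement(cputune, "vcpupin", vcpu=str(vcpu), cpuset=str(pcpu))

    # Allocate guest memory strictly from one NUMA cell.
    numatune = ET.SubElement(domain, "numatune")
    ET.SubElement(numatune, "memory", mode="strict", nodeset=str(numa_node))

    return ET.tostring(domain, encoding="unicode")


if __name__ == "__main__":
    # Example values only: vCPUs 0 and 1 on physical CPUs 2 and 3, memory from cell 0.
    print(pinning_fragment({0: 2, 1: 3}, numa_node=0))
```

For the pinning to help, the chosen physical CPUs should belong to the same NUMA cell as the memory node. That mapping is host specific and is not checked by this sketch.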


Networking enhancements through virtual functions and kernel bypass for virtual switches


A virtualized environment relies on virtual switches to provide rich and secure networking capabilities to the workload VMs. Data plane performance is critical to the success of a network-intensive environment like the Telco cloud. However, virtualization adds another layer of processing to the flow of network traffic, which reduces throughput. The latency added by the hypervisor layer is a key performance challenge that a carrier grade platform has to address.


The two widely used options to reduce networking latency in a carrier grade platform are:

  1. Single root I/O virtualization (SR-IOV): a mechanism that allows a single networking device, which typically implements one physical function (PF), to appear to the system as multiple networking devices by exposing multiple virtual functions (VFs)
  2. Data Plane Development Kit (DPDK)-enabled vSwitch: a kernel-bypass mechanism that allows the virtual switch to bypass the kernel and communicate directly with a compatible NIC for fast packet processing
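On a Linux host, whether a NIC exposes virtual functions can be checked through sysfs, where SR-IOV capable devices publish `sriov_numvfs` (VFs currently enabled) and `sriov_totalvfs` (maximum supported). The helper below is a minimal sketch; its name and the `sysfs_root` parameter, included so that it can be pointed at a test directory, are assumptions.

```python
from pathlib import Path


def sriov_vf_counts(iface: str, sysfs_root: str = "/sys/class/net"):
    """Return (enabled_vfs, max_vfs) for a NIC, or None if it exposes no SR-IOV files."""
    device = Path(sysfs_root) / iface / "device"
    try:
        enabled = int((device / "sriov_numvfs").read_text())
        total = int((device / "sriov_totalvfs").read_text())
    except (FileNotFoundError, NotADirectoryError):
        # Interface missing, or its device does not support SR-IOV.
        return None
    return enabled, total


if __name__ == "__main__":
    # "eth0" is only an example interface name.
    print(sriov_vf_counts("eth0"))
```

A result such as `(0, 64)` would mean the device supports virtual functions but none are enabled yet. Enabling them and attaching them to VMs is a separate, privileged step that this sketch does not cover.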

IPv6-enabled Telco clouds


IPv4 addresses have either officially run out or are on the verge of running out. IPv6 is required to meet the burgeoning need for address space, and CSPs urgently need it to be operational.


The cloud operating system, which forms a key component of the carrier grade platform, needs to be IPv6 enabled to achieve at least the following key requirements:

  1. IPv6-enabled control infrastructure: All elements of the control plane must have and listen on IPv6 addresses
  2. IPv6-enabled tenant infrastructure: Tenants must have the ability to define private and shared IPv6 networks
  3. Integration with existing IPv6 clouds: An NFV-enabled IPv6 environment must be able to communicate with existing IPv6 environments outside of its own
  4. Integration with legacy IPv4 clouds: An NFV-enabled IPv6 environment must be able to communicate with legacy IPv4 environments
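Two of these requirements can be illustrated with Python's standard `ipaddress` module: carving per-tenant /64 networks out of a larger block, and recognizing an IPv4-mapped IPv6 address when dealing with legacy IPv4 peers. The function names and the example prefixes are assumptions, and IPv4-mapped addresses are only one of several ways IPv4 and IPv6 environments can interoperate; the text does not prescribe a specific mechanism.

```python
import ipaddress
from itertools import islice


def tenant_subnets(block: str, count: int):
    """Carve the first `count` /64 tenant networks out of a larger IPv6 block."""
    network = ipaddress.IPv6Network(block)
    return list(islice(network.subnets(new_prefix=64), count))


def legacy_ipv4(address: str):
    """Return the embedded IPv4 address of an IPv4-mapped IPv6 address, else None."""
    return ipaddress.IPv6Address(address).ipv4_mapped


if __name__ == "__main__":
    # Example unique local (private) block; a real deployment would use its own prefix.
    for net in tenant_subnets("fd00:1234:5678::/48", 3):
        print(net)
    print(legacy_ipv4("::ffff:192.0.2.10"))
    print(legacy_ipv4("2001:db8::1"))
```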



A carrier grade environment not only provides an open, cost-efficient, industry-standard platform for CSPs to host their converged clouds, but also meets the CSPs' stringent availability, reliability and manageability needs. By leveraging a bevy of technologies conceived and built by the open source community and adhering to the specifications set by the various standards bodies, a carrier grade platform provides a high-performing, standards-compliant environment for hosting network services.
