Take Me to the Cloud!
November 23, 2021
By Teresa Monteiro
Director of Solutions, Software and Automation
Cloudifying embedded infrastructure
If you had the chance to attend one of the recent events focusing on network transformation and network automation, such as FutureNet Asia a few weeks ago or Layer123 World Congress 2021 just last week, you will have noticed one common shift. The terms SDN and NFV, which have reigned at these types of events for almost a decade, are heard less and less. But these concepts and ideas have not simply been discarded or replaced; on the contrary, they have evolved and matured. Networks that were expanding from physical to virtual are now moving to the cloud. And cloud is indeed the latest buzzword in automation!
There is much discussion on how cloud-native network functions can help service providers move beyond connectivity services. There is also debate on whether the use of multi-cloud and hybrid cloud architectures for distributed cloud computing is key to achieving the scale and agility demanded by 5G and IoT.
In this blog, I will also focus on cloud in network automation – but I will do it from yet another perspective, that of cloudifying embedded infrastructure.
Cloud-native down to the network infrastructure
When we think of network automation, we typically think of the network management and control layer, and of automation applications enabled by an optical domain controller – automation applications such as network discovery and inventory, path computation, or service restoration.
But let’s not forget that there are also important automation applications implemented in the network operating system (NOS), the operating system running on the individual network elements. And when that NOS adopts a cloud-native architecture, it facilitates agile delivery and deployment of such applications.
Figure 1: Cloud-native network operating system
By cloud-native architectures and technologies, we specifically mean a NOS that is microservices-based and can be deployed in containers with the support of a container management system. This choice of software architecture has many well-known benefits; today, I will focus on the fact that it allows for software modules, developed and compiled elsewhere, to run autonomously in a network element environment, deployed in what is called a guest container.
In simple words, a guest container is an isolated component within the NOS that can host and execute a software agent. This software agent has access to open, exposed interfaces, but not to any other internal NOS parameters.
The deployment of software agents within guest containers enables the extension of NOS features, accelerates the introduction of innovative automation applications, and supports the development of customized functionality.
A NOS-agnostic software agent can be implemented and compiled independently, in a foreign development environment, by an operator or a third-party supplier, and, once downloaded, will run smoothly in a cloud-native NOS.
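To make the idea of a NOS-agnostic agent more tangible, here is a minimal sketch in Go. It is not how any particular NOS exposes its interfaces: `ExposedAPI`, `Agent`, `mapNOS`, and the path names are all hypothetical, chosen only to show an agent that depends on an exposed interface and nothing else.

```go
package main

import "fmt"

// ExposedAPI is a hypothetical stand-in for the open interfaces a NOS
// offers to a guest container. The agent can see nothing beyond it.
type ExposedAPI interface {
	Get(path string) (float64, bool)
}

// Agent is NOS-agnostic: it depends only on ExposedAPI.
type Agent struct{ api ExposedAPI }

// Report reads one exposed value and formats it for display.
func (a Agent) Report(path string) string {
	v, ok := a.api.Get(path)
	if !ok {
		return path + ": not exposed"
	}
	return fmt.Sprintf("%s = %.1f", path, v)
}

// mapNOS is a toy backend; a different NOS would supply its own implementation.
type mapNOS map[string]float64

func (m mapNOS) Get(path string) (float64, bool) {
	v, ok := m[path]
	return v, ok
}

func main() {
	agent := Agent{api: mapNOS{"optical-power/rx": -3.2}}
	fmt.Println(agent.Report("optical-power/rx"))
	fmt.Println(agent.Report("internal/scheduler-state"))
}
```

Because the agent is written against the interface rather than a specific NOS, the same compiled logic could in principle run wherever an implementation of that interface is available.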
Furthermore, since a containerized architecture offers a variety of deployment options, the same agent can be deployed on the fly and run locally on a network element, on a server, or in the cloud. It can be ported easily across platforms: to the cloud when an application needs to scale, or to the network element processor when there are latency constraints.
What about adaptive streaming telemetry?
Let me describe a concrete example: an automation application named adaptive streaming telemetry that extends and improves the standard streaming telemetry mechanism and has been successfully implemented as a NOS-agnostic software agent.
Streaming telemetry is a network monitoring methodology in which an external system subscribes to a specific network element data stream, selected from all the monitoring data that the equipment is able to expose. From then on, the network element pushes the corresponding data, in an almost continuous manner, to the server that subscribed to it. Streaming telemetry ensures low-latency, high-performance data collection, enabling near real-time access to large volumes of network data.
Figure 2: The principles of streaming telemetry
However, standard streaming telemetry can still be improved. In a modern network, there are plenty of network parameters available to be streamed; under normal operation, many are redundant and monitoring them does not add meaningful information, while also imposing an unnecessary load on the system. This is where adaptive streaming telemetry, a solution that adjusts dynamically to the network status and evolves with the network’s needs, is useful:
- Under normal network operations, a fixed, limited set of parameters is included in each data stream and pushed to the data collectors at a moderate frequency. These are the parameters that best summarize the network status.
- Upon network status change, the streaming frequency and the content of the data stream are adjusted: the push frequency may be increased, or more parameters can be added to the telemetry stream, for further insight.
This approach decreases the load on the data communication network compared to standard telemetry, while continuing to support fine-grained visibility when and where needed. It also contributes to better overall data quality, which, in turn, allows for better compliance with SLAs, improves characterization of a network element’s health, and unleashes the predictive power of analytics and machine learning.
Figure 3: The power of adaptive streaming telemetry
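The two operating modes can be illustrated with a small Go sketch. It is a simplification under stated assumptions, not the agent demonstrated at OFC: the two-state `Status`, the parameter names, and the 30-second and 1-second intervals are all hypothetical values picked to make the arithmetic easy to follow.

```go
package main

import (
	"fmt"
	"time"
)

// Status is a hypothetical, simplified view of network health.
type Status int

const (
	Normal Status = iota
	Degraded
)

// Plan describes what to stream and how often.
type Plan struct {
	Paths    []string
	Interval time.Duration
}

// Hypothetical parameter names, chosen only for illustration.
var (
	summaryPaths = []string{"optical-power/rx", "pre-fec-ber"}
	detailPaths  = []string{"chromatic-dispersion", "osnr", "laser-temperature"}
)

// PlanFor returns the streaming plan for a given status.
func PlanFor(s Status) Plan {
	if s == Normal {
		// Limited summary set at a moderate push frequency.
		return Plan{Paths: summaryPaths, Interval: 30 * time.Second}
	}
	// On a status change: push more often and add parameters.
	paths := append(append([]string{}, summaryPaths...), detailPaths...)
	return Plan{Paths: paths, Interval: time.Second}
}

// SamplesPerMinute is the number of values pushed per minute under a plan.
func SamplesPerMinute(p Plan) float64 {
	return float64(len(p.Paths)) * float64(time.Minute) / float64(p.Interval)
}

func main() {
	for _, s := range []Status{Normal, Degraded} {
		p := PlanFor(s)
		fmt.Printf("status=%d paths=%d interval=%s samples/min=%.0f\n",
			s, len(p.Paths), p.Interval, SamplesPerMinute(p))
	}
}
```

With these made-up numbers, the normal plan pushes 4 values per minute and the degraded plan pushes 300. A fixed stream that always ran at the detailed rate would carry the higher load permanently, which is the overhead the adaptive approach is meant to avoid.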
Infinera has worked jointly with Oracle and Microsoft on adaptive streaming telemetry solutions. Earlier this year at OFC, we demonstrated an extension to standard gRPC-based streaming telemetry, implemented as a NOS-agnostic software agent written in Go, an open-source programming language, and running in a NOS guest container. The same agent was successfully deployed and tested in Infinera’s optical network operating systems and in SONiC, an open-source network operating system that includes strong support for routing protocols. The use of the same software across various technologies and equipment vendors ensures that the behavior of adaptive streaming telemetry is uniform and consistent.
Many faces, common technologies
Automation applications that, like adaptive streaming telemetry, intelligently observe the network are key ingredients for implementing intent-based cognitive networking. I had the pleasure of presenting this innovative cloud-native application at Layer123 World Congress 2021, and I encourage you to watch the replay here. However, what really impressed me at the Congress this year was seeing that adaptive streaming telemetry is part of a fast-growing ecosystem of automation applications that are leveraging cloud technologies to bring operators closer to the vision of a network that is self-adapting, self-healing, and self-optimizing.