How Not to Fail at Network Automation
By Christian Uremovic
Director of Marketing
A Strategic Approach to Successfully Deploying Network Automation
The global network automation market is expected to grow at a CAGR of over 48.7% because the benefits can rapidly transform your entire organization and your bottom line. But many network operators are growing frustrated as they struggle to implement a cohesive automation plan. After years of working with small and large companies alike, we have noticed that network operators typically fail at network automation when they are:
- Not building a step-by-step plan, instead assuming an “all-or-nothing” mindset
- Not having a systematic approach to collecting data from the network
- Not having good debugging processes
- Starting with high-risk, complex problems
- Over-engineering the solution
But let’s take a step back and look at the overall goals of automating your network. When implemented properly, network automation should:
- Reduce OpEx by improving operational efficiency
- Reduce CapEx by more effectively utilizing network resources
- Improve the customer experience and reduce churn with fewer service and network outages
- Sharpen competitive edge through service velocity and performance
Now that we know the benefits and challenges of implementing network automation, let’s create a step-by-step plan that can help get any network automation project off to a strong start.
Step 1 – Data Collection and Preparation
Getting the right data from the network in a unified and systematic way is a must in network automation, as a database of information will provide a benchmarking tool that can be used to properly evaluate performance and network behavior over time. But getting the data can be complicated. Different vendors have different information and data models.
The industry is tackling this problem; however, competing standards are still being developed, which further complicates the task. Open ROADM and OpenConfig are two data models that have shown great progress in transport networking, and by utilizing NETCONF and REST interfaces, equipment data can be collected in a unified way.
Moreover, streaming telemetry and gRPC are further improving data collection from the network. For older, legacy network devices with proprietary software management tools, you may be able to capture data through software controllers or directly through network element interfaces. One tip: when evaluating software systems for your network, look for a solution that can harvest data not only from new, more standardized models but also from proprietary legacy devices.
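To make the idea of a unified collection layer concrete, here is a minimal sketch of normalizing data from two kinds of sources into one vendor-neutral record. Everything in it is an assumption made for illustration: the PortSample fields, the payload layout of the model-driven reply, and the legacy text format are hypothetical and do not correspond to any real schema or product.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class PortSample:
    """Vendor-neutral record for one port measurement (hypothetical schema)."""
    device: str
    port: str
    rx_power_dbm: float
    source: str  # which kind of collector produced the sample


def from_model_driven(payload: dict) -> PortSample:
    # Assumed shape of an already-decoded NETCONF/REST reply; not a real data model.
    interface = payload["interface"]
    return PortSample(
        device=payload["node-id"],
        port=interface["name"],
        rx_power_dbm=float(interface["input-power"]),
        source="model-driven",
    )


def from_legacy_cli(device: str, line: str) -> PortSample:
    # Assumed legacy output such as "PORT 1/3 RXPWR -7.25".
    _, port, _, power = line.split()
    return PortSample(device=device, port=port,
                      rx_power_dbm=float(power), source="legacy")


# Both sources end up in the same list, ready to store in one database.
samples = [
    from_model_driven({"node-id": "roadm-a",
                       "interface": {"name": "1/1", "input-power": "-5.5"}}),
    from_legacy_cli("mux-b", "PORT 1/3 RXPWR -7.25"),
]
```

The point of the design is that downstream tools only ever see PortSample, so adding another vendor means adding one more small adapter function rather than changing every consumer.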
Step 2 – Network and Services Visualization
Once a database is created, you can create your first network automation. A typical first step is to visualize the multi-layer and multi-vendor network in its various views: logical views, layered views, abstracted views, views around synchronization, link topology views, and finally, most importantly, the network services view. Simply being able to visualize your network is a huge step forward, and it is valuable because your engineers and technicians are no longer “flying blind.” With a visual view of the network, they can:
- Make decisions faster as they recognize visual patterns within the network
- Make more accurate decisions by having all the data in one tool
- Improve intelligence because they can see interactivity between devices and layers
- Improve reporting and overall organization as sharing data is now easy
- Improve network migration planning
Step 3 – Creating Your First Automation
Now that you have a database and can see the network and services from end to end, it is time to create your first dynamic automation.
Part of your software should include a multi-layer, multi-vendor path computation element (PCE) that utilizes live network information to calculate the best path for a given network goal or condition. PCEs can look for the shortest paths, lowest latency paths, paths with the least congestion, underutilized paths, and lowest cost paths. More advanced PCEs even support weighting and prioritizing multiple path selection criteria to improve overall efficiency.
Create your dynamic automation based on your own criteria for a given goal across a section of your network. One good example might be to find the shortest path in the network with the least amount of congestion, prioritizing congestion as the more important criterion, as we will see later.
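As a rough illustration of weighting two path selection criteria, the sketch below runs Dijkstra's algorithm over a link cost that blends hop count with congestion. It is not how any particular PCE works: the function name best_path, the weights w_hops and w_congestion, the use of hop count as a stand-in for path length, and the sample topology are all assumptions chosen for the example.

```python
import heapq


def best_path(links, src, dst, w_hops=1.0, w_congestion=3.0):
    """Dijkstra over a cost that blends hop count and congestion.

    links maps (a, b) -> congestion in [0, 1]; links are treated as bidirectional.
    Returns (path, total_cost), or (None, inf) if dst is unreachable.
    """
    graph = {}
    for (a, b), congestion in links.items():
        cost = w_hops + w_congestion * congestion  # each link counts as one hop
        graph.setdefault(a, []).append((b, cost))
        graph.setdefault(b, []).append((a, cost))

    queue = [(0.0, src, [src])]
    settled = set()
    while queue:
        cost, node, path = heapq.heappop(queue)
        if node == dst:
            return path, cost
        if node in settled:
            continue
        settled.add(node)
        for neighbor, link_cost in graph.get(node, []):
            if neighbor not in settled:
                heapq.heappush(queue, (cost + link_cost, neighbor, path + [neighbor]))
    return None, float("inf")


# Hypothetical topology: a short but congested route and a longer, quiet one.
links = {
    ("A", "B"): 0.9,
    ("B", "D"): 0.9,
    ("A", "C"): 0.1,
    ("C", "E"): 0.1,
    ("E", "D"): 0.1,
}
path, cost = best_path(links, "A", "D")
```

With congestion weighted three times as heavily as hops, the longer route through C and E wins; setting w_congestion to 0 makes the same function return the two-hop route through B.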
Once your first dynamic automation is behind you, many choose to create network slices by assigning ports, nodes, wavelengths, OTN channels, Ethernet tunnels, and more into groups or slices. Network slicing allows operators to simplify operations by creating multiple virtual networks for different purposes or applications on the same network hardware devices.
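One simple way to picture slicing is as a registry that records which slice each resource belongs to. The sketch below assumes, purely for illustration, that a resource may belong to only one slice at a time; the SliceRegistry class and the resource identifiers are hypothetical.

```python
class SliceRegistry:
    """Tracks which slice owns each resource (assumes exclusive ownership)."""

    def __init__(self):
        self._owner = {}  # resource id -> slice name

    def assign(self, slice_name, resources):
        # Refuse the whole request if any resource already sits in another slice.
        conflicts = sorted(
            r for r in resources
            if r in self._owner and self._owner[r] != slice_name
        )
        if conflicts:
            raise ValueError(f"already assigned to another slice: {conflicts}")
        for r in resources:
            self._owner[r] = slice_name

    def members(self, slice_name):
        return sorted(r for r, owner in self._owner.items() if owner == slice_name)


registry = SliceRegistry()
registry.assign("enterprise", ["node1/port3", "lambda-1550.12"])
registry.assign("mobile-backhaul", ["node1/port4", "odu2-7"])
```

Calling registry.members("enterprise") lists the resources in that slice, and trying to assign node1/port3 to mobile-backhaul raises a ValueError instead of silently overlapping the two virtual networks.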
Step 4 – Closed Loop Automation
Now the real fun starts!
In the past you could set up a new 10G service and forget about it as the traffic was consistent and predictable. But as network demands are getting more dynamic, with the massive increase in the amount of traffic and the sheer growth in the number of devices coming onto the network, the old “set it and forget it” approach is obsolete. And think about what happens as the Internet of Things (IoT) and 5G services continue to grow – more devices demanding greater capacity and higher performance.
Now is the time to start dynamically allocating bandwidth across various points in your network to keep ahead of demand and performance requirements.
Closed-loop automation can be your answer, but don’t overdo it! A closed-loop automation is built on a set of rules that, once set up, are continuously evaluated in real time so the network can be adjusted as conditions change. Start with some low-risk use cases. Pick a single domain or network segment to focus on and monitor its performance for a few days or weeks.
Recently, a Tier 1 network operator created an automation that dynamically balanced the traffic load evenly across a multi-vendor network segment.
By simply using congestion and delay as the path selection criteria, the operator was able to program the network to reroute services once a specific congestion level was reached. By doing so, it was able to eliminate frame loss and load balance the network automatically, without any human interaction. As shown below, the operator was able to more evenly balance traffic and improve utilization by as much as 30 to 60 percentage points on some routes.
As an added benefit, the total traffic load was increased by 25% in this part of the network without having to purchase or install any new hardware.
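A rule of this kind can be sketched in a few lines. The example below is not the operator's implementation; it is a minimal illustration of one pass of a congestion rule, and the function closed_loop_step, the 0.8 threshold, the one-backup-per-link table, and the sample loads are all assumptions.

```python
def closed_loop_step(utilization, services, alternates, threshold=0.8):
    """One pass of a congestion rule: move services off links above threshold.

    utilization: link -> load fraction in [0, 1] (updated in place)
    services:    service name -> (current link, load fraction it contributes)
    alternates:  link -> backup link to try
    Returns the list of (service, old_link, new_link) moves made.
    """
    moves = []
    for name, (link, load) in sorted(services.items()):
        if utilization[link] <= threshold:
            continue  # link is healthy, nothing to do
        backup = alternates.get(link)
        if backup is None or utilization[backup] + load > threshold:
            continue  # no safe alternative, leave the service where it is
        utilization[link] -= load
        utilization[backup] += load
        services[name] = (backup, load)
        moves.append((name, link, backup))
    return moves


# Hypothetical segment with one congested link and one lightly loaded link.
utilization = {"west": 0.95, "east": 0.30}
services = {
    "svc-1": ("west", 0.20),
    "svc-2": ("west", 0.10),
    "svc-3": ("east", 0.30),
}
alternates = {"west": "east", "east": "west"}
moves = closed_loop_step(utilization, services, alternates)
```

In this run only svc-1 is moved, which brings the west link back under the threshold, so a second pass makes no further changes. A real closed loop would call a step like this every time fresh telemetry arrives, which is why starting with a single low-risk segment is sensible.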
Once you gain some comfort level, look for ways to reach through to other domains. The door is now open for all kinds of creative thinking, from designing new revenue-generating services to improving time to market. The options are endless, but we’re not done yet.
Step 5 – Machine Learning and AI
If humans can do it, so can machines and learning algorithms.
Analytics, machine learning, and artificial intelligence (AI) are starting to play increasingly important roles in network automation. While still early in their application to networking, machine learning techniques can be used to analyze network telemetry and traffic information to identify and predict short-term traffic bursts and potential congestion events.
Armed with this predictive information, closed-loop automation software can then take proactive measures, including rerouting traffic, switching services, and load balancing the network before a service-affecting event even happens.
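To show the shape of a predictive trigger without a full machine learning pipeline, the sketch below uses Holt's linear (double) exponential smoothing, a simple statistical forecasting method swapped in here as a stand-in for a learned model. The function names, smoothing factors, threshold, and sample utilization values are assumptions made for the example.

```python
def forecast_next(samples, alpha=0.5, beta=0.5):
    """One-step-ahead forecast using Holt's linear exponential smoothing."""
    if len(samples) < 2:
        raise ValueError("need at least two samples")
    level = samples[0]
    trend = samples[1] - samples[0]
    for x in samples[1:]:
        previous_level = level
        # Blend the new observation with what the model expected to see.
        level = alpha * x + (1 - alpha) * (previous_level + trend)
        # Blend the newly observed slope with the previous trend estimate.
        trend = beta * (level - previous_level) + (1 - beta) * trend
    return level + trend


def should_act(samples, threshold=0.85):
    """True if the forecast for the next interval crosses the threshold."""
    return forecast_next(samples) > threshold


# Hypothetical link utilization, one reading per polling interval.
rising = [0.50, 0.60, 0.70, 0.80]
steady = [0.40, 0.40, 0.40]
```

For the rising series the forecast continues the upward trend past the threshold, so should_act(rising) is true even though the latest reading is still below it, while the steady series triggers nothing. That early signal is what lets closed-loop software act before congestion occurs.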
Operators should be taking steps to build their skills in network automation and develop a path toward a fully autonomous network now.