Technical Issues

Thanks to the rapid growth in network bandwidth and connectivity, networks and distributed systems have become critical infrastructures that underpin much of today’s Internet services. However, networks are highly complex, dynamic, and time-varying systems, so the statistical properties of networks and network traffic cannot be easily modeled. In the last few years, the push to make networks highly automated and more operationally efficient has been driven by both academia and industry.

Network automation requires i) smart network components and ii) an efficient orchestration of the decisions taken by the adopted Artificial Intelligence (AI)/Machine Learning (ML) models. Although many smart network elements with enhanced capabilities have emerged in recent years, e.g., elements adopting ML techniques such as Deep Reinforcement Learning (DRL) to provide smart congestion control or improved traffic classification and prediction, such capabilities are exploited individually, on a per-device basis. A correct orchestration of such smart devices, in terms of protocols and architectural designs and without impacting network performance, is required to take an effective step toward network automation.

Providing intelligence to computer networks yields many benefits, such as the ability to take decisions before critical events happen (e.g., link faults, router malfunctioning) or to adapt the network configuration according to traffic predictions in order to improve network performance (a minimal traffic-prediction sketch follows the questions below). However, the application of AI/ML techniques to networking raises several questions, such as the following:

  1. Is the network operating correctly after the application of the AI/ML techniques? Is it robust? Does it make fair decisions?
  2. What is the impact of a wrong decision? What is the reduction in network performance caused by such a wrong decision?
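
As a concrete (and deliberately simplified) illustration of the proactive behavior mentioned above, the sketch below forecasts link utilization one step ahead with a plain autoregressive model, so that a reconfiguration could be triggered before congestion occurs. Everything here is an assumption made for illustration: the traffic trace is synthetic, and the window size (8 samples) and the 0.8 utilization threshold are arbitrary choices, not recommended settings.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic link utilization: a periodic daily pattern plus noise.
t = np.arange(1000)
traffic = 0.5 + 0.3 * np.sin(2 * np.pi * t / 96) + 0.05 * rng.standard_normal(t.size)

WINDOW = 8  # hypothetical number of past samples fed to the predictor

# Build (past window -> next value) training pairs.
X = np.stack([traffic[i:i + WINDOW] for i in range(t.size - WINDOW)])
y = traffic[WINDOW:]

# Fit an ordinary-least-squares autoregressive predictor (with a bias term).
coef, *_ = np.linalg.lstsq(np.c_[X, np.ones(len(X))], y, rcond=None)

# One-step-ahead forecast from the most recent window.
forecast = np.r_[traffic[-WINDOW:], 1.0] @ coef

# A proactive policy could, e.g., reroute flows when the forecast
# exceeds a utilization threshold (0.8 here is an arbitrary example).
if forecast > 0.8:
    print(f"forecast {forecast:.2f}: trigger reconfiguration in advance")
else:
    print(f"forecast {forecast:.2f}: no action needed")
```

In a real deployment the predictor would of course be trained on measured telemetry and validated before driving any control action.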

Furthermore, the training process of AI/ML techniques and the real world need to be aligned. Training is normally performed in simulation environments, and even slightly different real-world scenarios can reduce the accuracy of the resulting predictions. In this sense, several issues need to be addressed:

  1. What is the cost of obtaining a highly accurate AI/ML model? The best possible accuracy is desired when defining an ML model, but achieving it places a tremendous burden on the infrastructure, both on the computing side and on the data side.
  2. What is the environmental cost (in terms of carbon footprint) of training and re-training the ML models because of data drift? The sustainability of AI/ML solutions is an important challenge to be tackled (see the first sketch after this list).
  3. Is it really necessary to apply a supervised model with a very large number of features? Or could a simpler model, e.g., an unsupervised one, achieve similar results while reducing the computational complexity (see the second sketch after this list)?
  4. Is it possible to train on edge devices, given the high complexity of large models? Hierarchical models can be used to reduce the overall overhead and the memory bottlenecks.
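
To make question 2 above more concrete, one common way to bound re-training cost (and hence its carbon footprint) is to re-train only when a statistical drift test fires, instead of on a fixed schedule. The sketch below is a minimal illustration of this idea, assuming a single synthetic feature, a two-sample Kolmogorov–Smirnov test, and an arbitrary 0.01 significance level; none of these choices is prescribed by the workshop material.

```python
import numpy as np
from scipy.stats import ks_2samp

rng = np.random.default_rng(1)

# Distribution of one input feature as seen at training time.
reference = rng.normal(loc=0.0, scale=1.0, size=5000)

def should_retrain(window: np.ndarray, alpha: float = 0.01) -> bool:
    """Flag drift when a live window no longer matches the training data."""
    return ks_2samp(reference, window).pvalue < alpha

# Live feature values: first a stable window, then a shifted (drifted) one.
stable = rng.normal(0.0, 1.0, size=1000)
drifted = rng.normal(0.7, 1.0, size=1000)

print(should_retrain(stable))   # typically False: skip the re-training cost
print(should_retrain(drifted))  # True: the distribution shift justifies re-training
```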
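Similarly, question 3 can be explored empirically: train a supervised classifier on the full feature set and compare it against a much simpler unsupervised alternative (clustering followed by majority-vote labeling of the clusters). The sketch below does this on synthetic data with scikit-learn; the dataset, models, and train/test split are illustrative assumptions, and whether the simpler model is "good enough" must be checked case by case.

```python
import numpy as np
from sklearn.cluster import KMeans
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score

# Synthetic stand-in for a labeled traffic dataset with many features.
X, y = make_classification(n_samples=2000, n_features=40, n_informative=5,
                           random_state=0)
X_train, X_test = X[:1500], X[1500:]
y_train, y_test = y[:1500], y[1500:]

# Supervised baseline trained on the full 40-dimensional feature set.
clf = LogisticRegression(max_iter=1000).fit(X_train, y_train)
print("supervised accuracy:  ", accuracy_score(y_test, clf.predict(X_test)))

# Unsupervised alternative: cluster without labels, then map each
# cluster to its majority class using the (cheaply obtained) labels.
km = KMeans(n_clusters=2, n_init=10, random_state=0).fit(X_train)
cluster_to_label = {
    c: np.bincount(y_train[km.labels_ == c]).argmax() for c in (0, 1)
}
pred = np.array([cluster_to_label[c] for c in km.predict(X_test)])
print("unsupervised accuracy:", accuracy_score(y_test, pred))
```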

The 3rd IPSN workshop aims to shed light on all these questions and on others that will come up from the participants. The extensive experience of the NOMS/IM community in network operations and optimization is invaluable to our research efforts to identify opportunities to improve the management of computer networks when AI/ML techniques are applied, while maintaining a consistently high QoS. Indeed, the main topic of the conference, “Towards intelligent, reliable, and sustainable network and service management”, perfectly fits the main goal of the proposed workshop. By exploiting AI/ML techniques, it is possible to act in advance when data gathered from softwarized networks suggest potential performance degradation, hence providing resilience to the network and to the services deployed on it.