
Revised and Updated 2nd Edition

Advanced Networks Set

coordinated by
Guy Pujolle

Volume 1

Software Networks

Virtualization, SDN, 5G and Security

Guy Pujolle


Introduction

Currently, networking technology is experiencing its third major wave of revolution. The first was the move from circuit-switched mode to packet-switched mode; the second, the move from hardwired to wireless mode; and the third, which is the subject of this book, is the move from hardware to software mode. Let us briefly examine these three revolutions, before focusing on the third, which will be studied in detail throughout the book.

I.1. The first two revolutions

A circuit is a collection of hardware and software elements, allocated to two users – one at each end of the circuit. The resources of that circuit belong exclusively to those two users; nobody else can use them. In particular, this mode has been used in the context of the public switched telephone network (PSTN). Indeed, telephone voice communication is a continuous application for which circuits are very appropriate.

A major change in traffic patterns brought about the first great revolution in the world of networks, pertaining to asynchronous and non-uniform applications. The data transported for these applications make very inefficient use of circuits, but are well suited to packet-switched mode. When a message needs to be sent from a transmitter to a receiver, the data for transmission are grouped together in one or more packets, depending on the total size of the message. For a short message, a single packet may be sufficient; for a long message, several packets are needed. The packets pass through intermediate transfer nodes between the transmitter and the receiver, and ultimately make their way to the endpoint. The resources needed to handle the packets include memories, the links between the nodes and the sender/receiver equipment. These resources are shared between all users. Packet-switched mode requires a physical architecture and protocols – i.e. rules – to achieve end-to-end communication. Many different architectural arrangements have been proposed, using protocol layers and associated algorithms. In the early days, each hardware manufacturer had its own architecture (e.g. SNA, DNA, DECnet, etc.). Then, the OSI (Open Systems Interconnection) model was introduced in an attempt to make all these different architectures mutually compatible. The failure to achieve compatibility between hardware manufacturers, even with a common model, led to the re-adoption of one of the very first architectures introduced for packet-switched mode: TCP/IP (Transmission Control Protocol/Internet Protocol).

The second revolution was the switch from hardwired mode to wireless mode. Figure I.1 shows that, by 2020, terminal connections should be essentially wireless, established using Wi-Fi or 3G/4G/5G technology. In fact, the two techniques are increasingly used together, as they are becoming complementary rather than competing with one another. In addition, when we look at the curve shown in Figure I.2, plotting worldwide user demand against the growth of what 3G/4G/5G technology is capable of delivering, we see that the gap is so significant that only Wi-Fi technology is capable of absorbing the demand – very strongly until 2020, and then less and less as large amounts of new spectrum open up, especially at frequencies above 20 GHz. We will come back to wireless architectures, because the third revolution also has a significant impact on this transition towards radio-based technologies, especially 5G technology.


Figure I.1. Terminal connection by 2020


Figure I.2. The gap between technological progress and user demand. For a color version of the figure, see www.iste.co.uk/pujolle/software2.zip

I.2. The third revolution

The third revolution, which is our focus in this book, pertains to the move from hardware-based mode to software-based mode. This transition is taking place because of virtualization, whereby physical networking equipment is replaced by software fulfilling the same function.

Let us take a look at the various elements which are creating a new generation of networks. To begin with, we can cite the Cloud. The Cloud is a set of resources which, instead of being held at the premises of a particular company or individual, are hosted on the Internet. The resources are de-localized and brought together in resource centers, known as datacenters.

The reasons for the Cloud’s creation stem from the low degree of use of server resources worldwide: only 10–20% of servers’ capacities are actually being used. This low utilization derives from the fact that servers are hardly used at all at night-time, and see relatively little use outside of peak hours, which represent no more than 4–5 hours each day. In addition, the relatively low cost of hardware meant that, generally, servers were greatly oversized. Another factor that needs to be taken into account is the rising cost of the personnel needed to manage and control the resources. In order to optimize the cost of both resources and engineers, those resources need to be shared. The purpose of Clouds is to facilitate such sharing in an efficient manner.

Figure I.3 shows the growth of the public Cloud services market. Certainly, that growth is impressive, but in the final analysis, it is relatively low in comparison to what it could have been if there were no problems of security. Indeed, as the security of the data uploaded to such systems is rather lax, there has been a massive increase in private Clouds, taking the place of public Cloud services. In Chapter 11, we will examine the advances made in terms of security, with the advent of secure Clouds.


Figure I.3. Public Cloud services market and their annual growth rate

Virtualization is also a key factor, as indicated at the start of this chapter. The increase in the number of virtual machines is undeniable, and in 2019, three quarters of the servers available throughout the world are virtual machines. Physical machines are able to host increasing numbers of virtual machines. This trend is shown in Figure I.4. In 2019, each physical server hosts approximately 10 virtual machines.

The use of Cloud services has meant a significant increase in the data rates being sent over the networks. Indeed, processing is now done in datacenters, and both the data and the signaling must be sent to these datacenters and then returned to the user after processing. We can see this increase in data rate requirement by examining the market of Ethernet ports for datacenters. Figure I.5 plots shipments of 1 Gbps Ethernet ports against those of 10, 40 and 100 Gbps ports. As we can see, 1 Gbps ports, which are already fairly fast, are being replaced by ports that are ever more powerful.


Figure I.4. Number of virtual machines per physical server


Figure I.5. Ethernet port shipment

The world of the Cloud is, in fact, rather diverse, if we look at the number of functions which it can fulfill. There are numerous types of Clouds available, but three categories, which are indicated in Figure I.6, are sufficient to clearly differentiate them. The category that offers the greatest potential is the SaaS (Software as a Service) Cloud. SaaS makes all services available to the user – processing, storage and networking. With this solution, a company asks its Cloud provider to supply all necessary applications; in effect, the company subcontracts its IT system to the Cloud provider. With the second solution – PaaS (Platform as a Service) – the company remains responsible for the applications. The Cloud provider offers a complete platform, leaving only the management of the applications to the company. Finally, the third solution – IaaS (Infrastructure as a Service) – leaves a great deal more initiative in the hands of the client company. The provider offers the processing, storage and networking, but the client remains responsible for the applications and the environments necessary for those applications, such as the operating systems and databases.


Figure I.6. The three main types of Cloud

More specifically, we can define the three Cloud architectures as follows:

– IaaS (Infrastructure as a Service): this is the very first approach, with part of the infrastructure being virtualized in the Cloud, such as the network servers, the storage servers and the network itself. The network is used to host PABX-type machines, firewalls or storage servers and, more generally, the servers connected to the network infrastructure.
– PaaS (Platform as a Service): this is the second Cloud model whereby, in addition to the infrastructure, an intermediary software layer corresponding to the platform is provided. The client company’s own servers only handle the applications.
– SaaS (Software as a Service): with SaaS, in addition to the infrastructure and the platform, the Cloud provider actually provides the applications themselves. Ultimately, nothing is left to the company, apart from the Internet access ports. This solution, which is also called Cloud computing, outsources almost all of the company’s IT and networks.

Figure I.7 shows the functions of the different types of Cloud in comparison with the classical model in operation today.


Figure I.7. The different types of Clouds

The main issue for a company using a Cloud is security. Indeed, there is nothing to prevent the Cloud provider from scrutinizing the data, or – as much more commonly happens – the data from being requisitioned by the countries in which the physical servers are located, and the providers must comply. The rise of sovereign Clouds is also noteworthy: here, the data are not allowed to pass beyond the country’s geographical borders. Most states insist on this for their own data.

The advantage of the Cloud lies in the power of the datacenters, which are able to handle a great many virtual machines and provide the power necessary for their execution. Multiplexing between a large number of users greatly decreases costs. Datacenters may also serve as hubs for software networks and host virtual machines to create such networks. For this reason, numerous telecommunications operators have set up companies that provide Cloud services for the operators themselves and also for their customers.

In the techniques that we will examine in detail hereafter, we find SDN (Software-Defined Networking), whereby multiple forwarding tables are defined, and only datacenters have sufficient processing power to perform all the operations necessary to manage these tables. One of the problems is determining the necessary size of the datacenters, and where to build them. Very roughly, there is a whole range of sizes, from absolutely enormous datacenters, with a million servers, to femto-datacenters, with the equivalent of only a few servers, and everything in between.

I.3. “Cloudification” of networks

Figure I.8 shows the rise of infrastructure costs over time. We can see that increasing speeds imply a rise in infrastructure costs, whereas the income of telecommunication operators stagnates, partly due to the very strong competition to acquire new markets. It is therefore absolutely necessary to find ways to close the gap between costs and income. Among other reasons, two aspects are essential to launching a new generation of networks: network automation using an autopilot, and the choice of open source software, in order to reduce the number of network engineers required and to avoid license costs for commercial software. Let us examine these two aspects before studying the reasons for turning to this new software network solution.

The automation of network control is the very first reason for the new generation. The concept of the autopilot created here is similar to that of a plane’s autopilot. However, unlike a plane, a network is very much a distributed system. To achieve an autopilot, we must gather all the knowledge about the network – that is, contextualized information – either in all nodes, if we want to distribute the autopilot, or in a single node, if we want to centralize it. Centralization was chosen for obvious reasons: simplicity, and avoiding congesting the network with the packets carrying this knowledge. This is the most important paradigm of this new generation of networks: centralization. In this way, the network is no longer a decentralized system; it becomes centralized. It will be necessary to pay attention to the security of the center by doubling or tripling the controller, which is the name given to this central system.

The controller is the control device that must contain all the knowledge about users, applications, nodes and network connections. From there, smart systems will be able to steer packets through the infrastructure to provide the best possible quality of service for all the clients using the network. As we will see later on, the most promising autopilot for the 2020s is being finalized: the open source ONAP (Open Network Automation Platform).

The second important aspect of the new generation of networks is open source software. The rise of open source software always stems from a need to reduce costs, and also to implement standards that can easily be followed by companies. The Linux Foundation is one of the major organizations in this area, and most of the software shaping future networks comes from this Foundation, including the OPNFV (Open Platform for Network Functions Virtualization) platform. This is the most important one, since it gathers the open source software that will act as a basic framework.

This tendency towards open source software raises questions such as: what will become of network and telecom suppliers if everything comes from open source software? Is security ensured with these millions of lines of code, in which bugs will inevitably occur? And so on. We will answer these questions in Chapter 4, on open source software.

The rise of this new generation of networks, based on datacenters, has an impact on the energy consumption of the world of ICT. In 2019, this consumption is estimated to account for 7% of the total carbon footprint. However, this proportion is increasing very quickly with the rapid rollout of datacenters and antennas for mobile networks. By way of example, a datacenter containing a million servers consumes approximately 100 MW. A Cloud provider with 10 such datacenters would consume 1 GW, which is the equivalent of one unit of a nuclear power plant. This number of servers has already been reached or surpassed by some 10 well-known major companies. Similarly, the number of 2G/3G/4G antennas in the world is already more than 10 million. Given that, on average, consumption is 1,500 W per antenna (2,000 W for 3G/4G antennas but significantly less for 2G antennas), this represents around 15 GW worldwide.
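These orders of magnitude are easy to check with a back-of-the-envelope calculation. The short Python sketch below (an illustration only) simply reproduces the figures quoted above, assuming roughly 100 W per server – the value implied by 100 MW for a million servers.

servers_per_datacenter = 1_000_000
watts_per_server = 100                      # assumption implied by the 100 MW figure
datacenter_mw = servers_per_datacenter * watts_per_server / 1e6
print(f"One datacenter: {datacenter_mw:.0f} MW")            # ~100 MW

provider_gw = 10 * datacenter_mw / 1_000                    # a provider with 10 datacenters
print(f"Ten datacenters: {provider_gw:.0f} GW")             # ~1 GW

antennas = 10_000_000
watts_per_antenna = 1_500                   # average over 2G/3G/4G antennas
antennas_gw = antennas * watts_per_antenna / 1e9
print(f"Mobile antennas worldwide: {antennas_gw:.0f} GW")   # ~15 GW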

Continuing in the same vein, the carbon footprint produced by energy consumption in the world of ICT is projected to reach 20% of the total by 2025 if nothing is done to control the current growth. It is therefore absolutely crucial to find solutions to offset this rise. We will come back to this in the last chapter of this book, but there are solutions that already exist and are beginning to be used. Virtualization is a good example, whereby multiple virtual machines are hosted on a common physical machine, and a large number of servers are placed in standby mode (low power) when not in use. Processors also need to be able to drop to very low speeds of operation whenever necessary. Indeed, power consumption rises sharply with processor speed. When the processor has nothing to do, it should almost stop, and speed up again when the workload increases.

Mobility is another argument in favor of adopting a new form of network architecture. Figure I.8 shows that, in 2020, the average speed of wireless connections will be several tens of Mbit/s. We therefore need to manage the mobility problem. The first order of business is the management of multi-homing – i.e. being able to connect a terminal to several networks simultaneously. The word “multi-homing” stems from the fact that the terminal receives several IP addresses, assigned by the different networks it is connected to. These multiple addresses are complex to manage, and the task requires specific functionalities. Mobility must also make it possible to handle simultaneous connections to several networks. On the basis of certain criteria (to be determined), the packets of the same message can be separated and sent via different networks. They then need to be re-ordered when they arrive at their destination, which can cause numerous problems.


Figure I.8. Speed of terminals based on the network used

Mobility also raises the issues of addressing and identification. If we use the IP address, it can be interpreted in two different ways: for identification purposes, to determine who the user is, and also for localization purposes, to determine the user’s position. The difficulty lies in dealing with these two functionalities simultaneously. Thus, when a customer moves sufficiently far to go beyond the sub-network with which he/she is registered, it is necessary to assign a new IP address to the device. This is fairly complex from the point of view of identification. One possible solution, as we can see, is to give two IP addresses to the same user: one reflecting his/her identity and the other the location.

Another revolution that is currently under way pertains to the “Internet of Things” (IoT): billions of things will be connected within the next few years. The prediction is that 50 billion will be connected to the IoT by 2020. In other words, the number of connections will likely increase tenfold in the space of only a few years. The “things” belong to a variety of domains: 1) domestic, with household electrical goods, home health care, domotics, etc.; 2) medicine, with all sorts of sensors both on and in the body to measure, analyze and perform actions; 3) business, with light level sensors, temperature sensors, security sensors, etc. Numerous problems arise in this new universe, such as identity management and the security of communications with the sensors. The price of identification is often set at $40 per object, which is absolutely incompatible with the cost of a sensor which is often less than $1. Security is also a complex factor, because the sensor has very little power, and is incapable of performing sufficiently sophisticated encryption to ensure the confidentiality of the transmissions.

Finally, there is one last reason to favor migration to a new network: security. Security requires a precise view and understanding of the problems at hand, which range from physical security to computer security, with the need to lay contingency plans for attacks that are sometimes entirely unforeseeable. The world of the Internet today is like a bicycle tire which is made up entirely of patches (having been punctured and repaired numerous times). Every time an attack succeeds, a new patch is added. Such a tire is still roadworthy at the moment, but there is a danger that it will burst if no new solution is envisaged in the next few years. Near the end of this book, in Chapter 15, we will look at the secure Cloud, whereby, in a datacenter, a whole set of solutions is built around specialized virtual machines to provide new elements, the aim of which is to enhance the security of the applications and networks.

An effective security mechanism must include a physical element: a safe in which to protect the important elements of the security arsenal necessary to ensure confidentiality, authentication, etc. Software security is a reality and, to a certain extent, may be sufficient for numerous applications. However, secure elements can always be circumvented when all of the defenses are software-based. This means that, for the new generations, there must be a physical element, either local or remote. This hardware element is a secure microprocessor known as a “secure element”. A classic example of this type of device is the smartcard, used particularly widely by telecom operators and banks.

Depending on whether it belongs to the world of business or of consumer electronics, the secure element may be found in the terminal, near to it, or far away from it. We will examine the different solutions in the subsequent chapters of this book.

Virtualization also has an impact on security: the Cloud, with specialized virtual machines, means that attackers have remarkable striking force at their disposal. In the last few years, hackers’ ability to break encryption algorithms has increased by a factor of 10^6.

Another important point that absolutely must be integrated in networks is “intelligence”. So-called “intelligent networks” have had their day, but the intelligence in this case was not really what we mean by “intelligence” in this field. Rather, it was a set of automatic mechanisms, used to deal with problems perfectly defined in advance, such as a signaling protocol for providing additional features in the telephone system. In the new generation of networks, intelligence pertains to learning mechanisms and intelligent decisions based on the network status and user requests. The network needs to become an intelligent system, which is capable of making decisions on its own. One solution to help move in this direction was introduced by IBM in the early 2000s: “autonomic”. “Autonomic” means autonomous and spontaneous – autonomous in the sense that every device in the network must be able to independently make decisions with knowledge of the situated view, i.e. the state of the nodes surrounding it within a certain number of hops. The solutions that have been put forward to increase the smartness of the networks are influenced by Cloud technology. We will discuss them in detail in the chapter about MEC (Mobile Edge Computing) and more generally about “smart edge” (Chapter 5).

Finally, one last point, which could be viewed as the fourth revolution, is concretization – i.e. the opposite of virtualization. Indeed, the problem with virtualization is a significant reduction in performance, stemming from the replacement of hardware with software. A variety of solutions have been put forward to regain that performance: software accelerators and, in particular, the replacement of software with hardware, in a step of concretization. The software is replaced by reconfigurable hardware, which can transform itself depending on the software needing to be executed. This approach is likely to give rise to morphware networks, which will be described in Chapter 16.

I.4. Conclusion

The world of networks is changing greatly, for the reasons listed above. It is changing more quickly than might have been expected a few years ago. One suggestion for redefining network architectures was put forward, but failed: starting again from scratch. This is known as the “Clean Slate Approach”: forgetting everything we know and starting over. Unfortunately, no concrete proposition of this kind has been adopted, and the transfer of IP packets continues to be the solution for data transport. However, among the numerous propositions, virtualization and the Cloud are the two main avenues which are widely used today and upon which this book focuses.

1
Virtualization

In this chapter, we introduce virtualization, which is at the root of the revolution in the networking world, as it involves constructing software networks to replace hardware networks.

Figure 1.1 shows the process of virtualization. We simply need to write software that performs exactly the same function as the hardware component. With only a few exceptions, which we will explore later on, all hardware machines can be transformed into software machines. The basic problem associated with virtualization is the significant reduction in performance. On average (although the reality is extremely diverse), virtualization reduces performance by a factor of 100: i.e. the resulting software, executed on a machine similar to the machine that has been virtualized, runs 100 times more slowly. In order to recover from this loss of performance, we simply need to run the program on a machine that is 100 times more powerful. This power is to be found in the datacenters hosted in the Cloud environments under development in all corners of the globe.

It is not possible to virtualize certain elements, such as an antenna or a sensor, since there is no piece of software capable of picking up electromagnetic signals or detecting temperature. Thus, we still need to keep hardware elements such as the metal wires and optical links, or the transmission/reception ports of a router or a switch. Nevertheless, all of the signal-processing operations can be virtualized perfectly well. Increasingly, we find virtualization in wireless systems.

In order to speed up the software processing, one solution is to move to a mode of concretization, i.e. the reverse of virtualization, but with one very significant difference: the hardware must behave like software. It is possible to replace the software, which is typically executed on a general-purpose machine, with a machine that can be reconfigured almost instantly, and thus behaves like a software program. The components used are derived from FPGAs (Field-Programmable Gate Arrays) and, more generally, reconfigurable microprocessors. A great deal of progress still needs to be made in order to obtain extremely fast concretizations, but this is only a question of a few years.

The virtualization of networking equipment means we can replace hardware routers with software routers, and do the same for any other piece of hardware that can be made into software, such as switches, LSRs (Label Switching Routers), firewalls, diverse and varied boxes, DPI (Deep Packet Inspection) devices, SIP servers, IP PBXs, etc. These new machines are superior in a number of ways. To begin with, one advantage is their flexibility. Let us look at the example given in Figure 1.1, where three hardware routers have been integrated in software form on a single server. The size of the three virtual routers can change depending on their workload. A router uses few resources at night-time, when there is little traffic, and very large resources at peak times, in order to handle all the traffic.


Figure 1.1. Virtualization of three routers. For a color version of the figure, see www.iste.co.uk/pujolle/software2.zip

Energy consumption is another argument in favor of virtualization. Although consumption initially rises, because we are adding an extra piece of software (the hypervisor or the container manager), it becomes possible to share resources more effectively, to move those resources around and group them together on fewer physical machines, and to put the machines that have become idle on standby.

A physical machine can accommodate virtual machines if, as mentioned above, we add a hypervisor or a container manager, which is a software program that enables multiple containers, hence multiple virtual machines, to run simultaneously. In fact, the word “simultaneously” implies a macroscopic scale: on a microscopic scale, the virtual machines are executed sequentially one after another. In the context of virtual servers, this serial execution is not a problem. In the area of networks, it may become a problem for real-time applications, which require a very short response time. Each virtual machine’s processing time must be sufficiently short to give the impression that all the virtual machines are being executed in parallel. Figure 1.2 shows the architecture of virtualization.


Figure 1.2. A virtualized machine. For a color version of the figure, see www.iste.co.uk/pujolle/software2.zip
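As a rough illustration of this constraint (the figures below are arbitrary assumptions, not values from the book), the following Python calculation shows how the worst-case delay seen by one virtual machine grows with the number of virtual machines sharing the server and with the length of the time slice.

num_vms = 10           # virtual machines sharing one physical server (assumption)
slice_ms = 5.0         # time slice granted to each virtual machine, in milliseconds

# Worst case: a packet arrives just after the VM's slice has ended, so it must
# wait for every other VM to run once before being processed.
worst_wait_ms = (num_vms - 1) * slice_ms
print(f"Worst-case scheduling delay: {worst_wait_ms:.1f} ms")

# For a real-time function with, say, a 10 ms response-time budget, the slice
# must satisfy (num_vms - 1) * slice <= budget.
budget_ms = 10.0
max_slice_ms = budget_ms / (num_vms - 1)
print(f"Largest slice compatible with a {budget_ms:.0f} ms budget: {max_slice_ms:.2f} ms")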

In this section, we will go over the two solutions for obtaining virtual machines, as shown in Figure 1.2. The hypervisor is a virtual machine monitor (VMM), which is often open source. Hypervisors operate on standard hardware platforms. In addition to the VMM, running directly on the physical hardware, the architecture generally comprises a number of domains running simultaneously. These domains execute virtual machines that are isolated from one another. Each virtual machine may have its own operating system and applications. The VMM controls access to the hardware from the various domains, and manages the sharing of the resources between them. Thus, one of the VMM’s main tasks is to isolate the different virtual machines, so that the execution of one virtual machine does not affect the performance of the others.

All peripheral drivers are kept in an isolated domain specific to them. Known as “domain zero” (dom0), this domain provides reliable and efficient access to the physical hardware. Dom0 has special privileges in comparison to the other domains, known as “user domains” (domU), and, for example, has unfettered access to the hardware of the physical machine. User domains have virtual drivers and operate as though they had direct access to the hardware. However, in reality, those virtual drivers communicate with dom0 in order to access the physical hardware.

The hypervisor virtualizes a single physical network interface, de-multiplexing the incoming packets from the physical interface to the user domains and, conversely, multiplexing the outgoing packets generated by those user domains. In this procedure, known as virtualization of the network input/output, dom0 directly accesses the input/output peripherals using their native drivers, and performs the input/output operations on behalf of the domUs.

The user domains use virtual input/output peripherals, controlled by virtual drivers, to ask dom0 for access to the peripheral. Each user domain has its own virtual network interfaces, known as foreground (frontend) interfaces, which are required for network communications. The background (backend) interfaces are created in dom0, corresponding to each foreground interface in a user domain, and act as proxies for the virtual interfaces in dom0. The foreground and background interfaces are connected to one another via an input/output channel, which uses a zero-copy mechanism to map the physical page containing the packet into the target domain. Thus, the packets are exchanged between the background and foreground interfaces. The foreground interfaces are perceived by the operating systems running in the user domains as real interfaces. The background interfaces in dom0, however, are connected to the physical interface and to one another via a virtual network bridge. This is the default architecture, called “bridge mode”, used, for instance, by the Xen hypervisor, which was certainly one of the first to appear. Thus, both the input/output channel and the network bridge establish a path for communication between the virtual interfaces created in the user domains and the physical interface.
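The following Python sketch is a highly simplified model of this bridge mode (a model for illustration only, not Xen code; the class names and MAC addresses are invented): dom0 holds the physical interface and one backend per user domain, and a software bridge demultiplexes each incoming packet to the backend of the target domain, which hands it to the corresponding foreground interface.

class ForegroundInterface:
    """Virtual interface seen as a real NIC by the guest OS in a user domain."""
    def __init__(self, name):
        self.name = name
        self.received = []

    def deliver(self, packet):
        # In Xen, the page holding the packet would be remapped (zero-copy)
        # into the user domain over the input/output channel.
        self.received.append(packet)


class Dom0Bridge:
    """dom0-side virtual bridge connecting the physical interface to the backends."""
    def __init__(self):
        self.backends = {}                      # destination MAC -> foreground interface

    def attach(self, mac, foreground):
        self.backends[mac] = foreground

    def on_physical_rx(self, packet):
        # Demultiplex on the destination MAC address of the incoming packet.
        target = self.backends.get(packet["dst_mac"])
        if target is not None:
            target.deliver(packet)


bridge = Dom0Bridge()
vif1 = ForegroundInterface("domU1-eth0")
vif2 = ForegroundInterface("domU2-eth0")
bridge.attach("00:16:3e:00:00:01", vif1)
bridge.attach("00:16:3e:00:00:02", vif2)

bridge.on_physical_rx({"dst_mac": "00:16:3e:00:00:01", "payload": b"hello"})
print(vif1.received)                            # the packet reached domU1's interface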

We will come back to hypervisor techniques later in this chapter. Before that, let us introduce the second solution for supporting virtual machines, which seems to be taking the lead thanks to its simplicity and efficiency, while offering somewhat less functionality. This solution is based on a single operating system supporting containers that host the virtual machines. More precisely, a container packages an application together with everything it needs to run, and executes it in isolation on the shared operating system kernel. The container technique is more flexible and simpler than embedding an operating system in each virtual machine. Containers can migrate from one physical machine to another, thus performing virtual machine migrations. The open source software called Kubernetes, which we will study later in this chapter, makes it possible to orchestrate the migration of containers from one physical machine to another within the same cluster. The Kubernetes orchestrator seems set to become the standard for implementing virtual machines in the new generation of networks.

1.1. Software networks

Virtual machines, in turn, can be used to create virtual networks, which are also known as software networks. For this purpose, we need to link virtual machines together in the same way as we would connect different physical machines. Of course, the communication links must be shared between the different software networks. A set of software networks are represented in Figure 1.3.


Figure 1.3. A set of software networks. For a color version of the figure, see www.iste.co.uk/pujolle/software2.zip

Each software network may have its own architecture and its own characteristics. One software network could be devoted to a VoIP service, another to an IPTV service, a third to a highly secure application, a fourth to support professional applications, a fifth to asynchronous applications such as electronic messaging, etc. We could, in fact, practically create a software network for each user. The personalized software network would be set up at the moment when the user connects, and eliminated when the user signs out. However, this solution does not scale up and, today, we are limited to a number of software networks suited to the hardware capacity of the underlying physical infrastructure. Each software network receives resources allocated to it on the basis of user demands. However, the resources remain shared, using techniques that allow a virtual network to recover resources left unused by the other virtual networks.

It should be noted that, in general, the virtual nodes are found in datacenters, which may be of varying size and importance: enormous central datacenters, regional datacenters, local datacenters and small datacenters such as femto-datacenters. We will come back later on to the choices which may be made in this field.

One of the characteristics of software networks is that the virtual machines can be migrated from one physical machine to another. This migration may be automated based on whether a node is overloaded or out of order.

In the physical nodes that support the software networks, we can add other types of virtual machines, such as firewalls, SIP servers for VoIP, middleboxes, etc. The networks themselves, as stated above, may obey a variety of different protocol architectures, such as TCP/IPv4, UDP/IPv4, IPv6, MPLS, Carrier-Grade Ethernet, TRILL, LISP, etc.

Isolation is, of course, a crucial property, because it is essential to prevent a problem on one software network from having repercussions on the other networks. The handover of streams from one software network to another must take place via a secure gateway outside of the data plane. This is absolutely necessary to prevent contamination between networks – for example, the complete shutdown of a network attacked by a distributed denial of service (DDoS).

1.2. Hypervisors and containers

Clearly, virtualization needs hardware, which can be standard. We speak of commodity hardware (white box), with open specifications, produced en masse to achieve particularly low prices. We will talk further about it in the chapter on open source software (Chapter 4). There are various ways of placing virtual machines on physical equipment, and they can be classified into three broad categories, as shown in Figures 1.4–1.6. The first two figures correspond to hypervisors and the third figure corresponds to containers.


Figure 1.4. Paravirtualization. For a color version of the figure, see www.iste.co.uk/pujolle/software2.zip

A paravirtualization hypervisor is a program that is executed directly on a hardware platform and hosts virtual machines whose operating systems have been modified so that their instructions are executed directly on that hardware. The platform is able to support the guest operating systems with their drivers. The classic hypervisors in this category, also known as type-1 or bare-metal hypervisors, include Citrix XenServer (open source), VMware vSphere, VMware ESX, Microsoft Hyper-V Server and KVM (open source).

The second category of hypervisor, or type-2 hypervisor, is a program that is executed on the hardware platform and supports native operating systems, i.e. operating systems without any modification. The native operating system is executed, under the control of the hypervisor, through an emulation layer, so that the underlying device can handle all of its instructions. The guest operating systems are unaware that they are virtualized, so they do not require any modification, as opposed to paravirtualization. Examples of this type of virtualization include Microsoft Virtual PC, Microsoft Virtual Server, Parallels Desktop, Parallels Server, Oracle VM VirtualBox (free), VMware Fusion, VMware Player, VMware Server, VMware Workstation and QEMU (open source).


Figure 1.5. Virtualization by emulation. For a color version of the figure, see www.iste.co.uk/pujolle/software2.zip

The third type leaves behind the previous hypervisor systems, running several machines simultaneously as containers. In this case, we speak of an isolator. An isolator is a program that isolates the execution of applications in environments called contexts, or zones of execution. The isolator is thus able to run the same application multiple times, in a multi-instance mode. This solution performs very well, because it does not introduce any overhead, but the environments are more difficult to isolate.


Figure 1.6. Virtualization by containers. For a color version of the figure, see www.iste.co.uk/pujolle/software2.zip

In summary, this last solution facilitates the execution of applications in execution zones. In this category, we can cite Linux-VServer, chroot, BSD Jails and OpenVZ, as well as most container solutions, such as Docker.

1.3. Kubernetes

Kubernetes (also called K8s) is an open source system that allows the deployment, scaling and management of containerized applications. It was originally created by Google, which donated it to the Cloud Native Computing Foundation. The platform automates the deployment, scaling and operation of application containers across clusters of servers. This open source software works with a whole range of container technologies, such as Docker.

The Kubernetes architecture is shown in Figure 1.7. We can see the Pods, which are groups of one or more containers hosted on the servers belonging to a hardware cluster. etcd is the persistent store for the cluster’s configuration data. The scheduler’s goal is to share the workload across the servers, thus placing and running the Pods in the best possible way. Finally, the kubelet is responsible for the execution state of each server.


Figure 1.7. Architecture of the Kubernetes orchestrator
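To make this more concrete, here is a minimal sketch (not taken from the book) showing how a Pod might be declared and submitted with the official Kubernetes Python client; the Pod name, container image and namespace are illustrative assumptions. Once submitted, the scheduler places the Pod on one of the cluster’s servers, and the kubelet on that server starts and monitors its containers.

from kubernetes import client, config

def create_demo_pod():
    # Load the credentials from the local kubeconfig (e.g. ~/.kube/config).
    config.load_kube_config()
    api = client.CoreV1Api()

    # A Pod wrapping a single nginx container, used here as an example workload.
    pod = client.V1Pod(
        metadata=client.V1ObjectMeta(name="demo-pod", labels={"app": "demo"}),
        spec=client.V1PodSpec(
            containers=[
                client.V1Container(
                    name="web",
                    image="nginx:1.25",
                    ports=[client.V1ContainerPort(container_port=80)],
                )
            ]
        ),
    )

    # Hand the Pod over to the cluster; scheduling and execution are then
    # handled by the scheduler and the kubelet, as described above.
    api.create_namespaced_pod(namespace="default", body=pod)

if __name__ == "__main__":
    create_demo_pod()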

1.4. Properties of software networks

Software networks have numerous properties that are novel in comparison to hardware networks. To begin with, we can easily move virtual machines around, because they are simply programs. Thus, we can migrate a router from one physical node to another. Migration may occur when a physical node begins to fail, or when a node is overloaded, or for any other reason decided on in advance. Migration of a node does not actually involve transporting the whole of the code for the machine, which would, in certain cases, be rather cumbersome and time-consuming. In general, the program needing to be migrated is already present in the remote node, but it is idle. Therefore, we merely need to begin running the program and send it the configuration of the node to be moved. This requires the transmission of relatively little data, so the latency before the migrated machine starts up is short. In general, we can even let both machines run at once, and change the routing so that the data only flow through the migrated node. We can then shut down the first router.
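Schematically, the migration procedure described above can be sketched as follows in Python (the node objects and method names are hypothetical, introduced only to illustrate the sequence of steps):

def migrate_virtual_router(old_node, new_node, network_ctl):
    """Migrate a virtual router by moving only its configuration."""
    # The router program is assumed to be already installed, but idle, on the
    # destination node; only a small amount of configuration data is moved.
    config = old_node.export_router_config()
    new_node.start_idle_router()
    new_node.load_router_config(config)

    # Both instances run at once; the routing is then changed so that the data
    # only flow through the migrated node.
    network_ctl.reroute_traffic(to_node=new_node, from_node=old_node)

    # Finally, the first router can be shut down.
    old_node.stop_router()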

More generally, we carry out what is known as urbanization: we migrate the virtual machines between physical machines until we obtain optimal performance. Urbanization is widely used for optimization in terms of energy consumption or workload distribution, as well as to optimize the cost of the software networks or to make the network highly reliable or resilient. For example, in order to optimize energy consumption, we need to bring the virtual machines together on shared nodes and switch off all the nodes that are no longer active. In actual fact, these machines would not be shut down, but rather placed on standby, which still consumes a small amount of energy, but only a very small amount. The major difficulty with urbanization arises when it is necessary to optimize all the operational criteria at the same time, because they are often incompatible – for example, optimizing consumption and performance at the same time.

A very important characteristic mentioned earlier is isolation: the software networks must be isolated from one another, so that an attack on one network does not affect the other networks. Isolation is complex, because simultaneously, we need to share the common resources and be sure that, at all times, each network has access to its own resources, negotiated at the time of establishment of the software network. In general, a token-based algorithm is used. Every virtual device on every software network receives tokens according to the resources attributed to it. For example, for a physical node, ten tokens might be distributed to network 1, five tokens to network 2 and one token to network 3. The networks spend their tokens on the basis of certain tasks performed, such as the transmission of n bytes. At all times, each device can have its own tokens and thus have a minimum data rate determined when the resources were allocated. However, a problem arises if a network does not have packets to send, because then it does not spend its tokens. A network may have all of its tokens when the other networks have already spent all of theirs. In this case, so as not to immobilize the system, we allocate negative tokens to the other two networks, which can then surpass the usage rate defined when their resources were allocated. When the sum of the remaining tokens less the negative tokens is equal to zero, then the machine’s basic tokens are redistributed. This enables us to maintain isolation while still sharing the hardware resources. In addition, we can attach a certain priority to a software network while preserving the isolation, by allowing that particular network to spend its tokens as a matter of priority over the other networks. This is relative priority, because each network can, at any moment, recoup its basic resources. However, the priority can be accentuated by distributing any excess resources to the priority networks, which will then always have a token available to handle a packet. Of course, isolation requires other characteristics of the hypervisors and the virtualization techniques, which we will not discuss in this book.
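The following Python sketch illustrates one possible bookkeeping for this token-based isolation scheme (a sketch under simplifying assumptions, not the book’s implementation): each network receives a basic allocation of tokens, an idle network keeps its unspent tokens, an active network that has exhausted its share may borrow (go “negative”) as long as another network still holds tokens, and the basic allocation is redistributed when the remaining tokens minus the negative tokens fall to zero.

class TokenIsolation:
    def __init__(self, allocation):
        # allocation: basic tokens per network, e.g. {"net1": 10, "net2": 5, "net3": 1}
        self.allocation = dict(allocation)
        self.tokens = dict(allocation)               # tokens still available
        self.negative = {n: 0 for n in allocation}   # tokens spent beyond the basic share

    def can_send(self, network):
        """A network may always spend its own tokens; once they are exhausted,
        it may borrow only while some other network still holds unspent tokens."""
        if self.tokens[network] > 0:
            return True
        return any(t > 0 for n, t in self.tokens.items() if n != network)

    def send(self, network, cost=1):
        """Spend `cost` tokens for one task (e.g. the transmission of n bytes)."""
        if not self.can_send(network):
            return False
        if self.tokens[network] >= cost:
            self.tokens[network] -= cost
        else:
            self.negative[network] += cost - self.tokens[network]
            self.tokens[network] = 0
        self._maybe_redistribute()
        return True

    def _maybe_redistribute(self):
        # Redistribute the basic tokens when the remaining tokens minus the
        # borrowed (negative) tokens fall to zero.
        if sum(self.tokens.values()) - sum(self.negative.values()) <= 0:
            self.tokens = dict(self.allocation)
            self.negative = {n: 0 for n in self.allocation}


# Example: 10 tokens for network 1, 5 for network 2 and 1 for network 3.
node = TokenIsolation({"net1": 10, "net2": 5, "net3": 1})
for _ in range(12):
    node.send("net1")        # net1 exhausts its share, then borrows from the idle networks
print(node.tokens, node.negative)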

Virtualization needs to be linked to other features in order to fully make sense. SDN (Software-Defined Networking) is one of the paradigms strongly linked to virtualization, because it involves uncoupling the physical part from the control part. The control part can be virtualized and deported onto another machine, which enables us, for example, to have both far greater processing power than that of the original machine and a much larger memory available.

1.5. Virtual devices

Another interesting application of virtualization is expanding: the digital twin. A piece of hardware is associated with a virtual machine executed in a datacenter located either near to or far from the hardware. The virtual machine executes exactly what the hardware does. Obviously, the hardware must feed the virtual machine with its data whenever its parameters change. The virtual machine should produce the same results as the hardware. If the results are not the same, this indicates a malfunction of the hardware, and this malfunction can be studied in real time on the virtual machine. This solution makes it possible to spot malfunctions in real time and, in most cases, to correct them.

Examples of digital twins are already in use or under development, such as the twin of an aircraft engine executed in a datacenter. Similarly, vehicles will soon have a twin, allowing us to detect malfunctions or to understand an accident. Manufacturers are also developing digital twins for objects; in this case, the digital twin can have far more computing power than the object itself, and can perform actions which the object is not powerful enough to perform.

Scientists dream of human digital twins which could keep working while the human sleeps.

1.6. Conclusion

Virtualization is the fundamental property of the new generation of networks, in which we make the move from hardware to software. While there is a noticeable reduction in performance at the start, it is compensated for by more powerful, less costly physical machines. Nonetheless, the opposite move to virtualization is also crucial: concretization, i.e. enabling the software to be executed on reconfigurable machines, so that the properties of the software are retained and top-of-the-range performance can again be achieved.

Software networks form the backbone of the new means of data transport. They are agile, simple to implement and not costly. They can be modified or changed at will. Virtualization also enables us to uncouple functions and to use shared machines to host algorithms, which offers substantial savings in terms of resources and of qualified personnel.