Ansible, part VI – Network Automation and some optimizations

Ansible is great for managing network devices. This applies to both simple and more advanced changes that can be made directly or through various controllers, such as Cisco NSO (Network Services Orchestrator), Cisco ACI (Application Centric Infrastructure), Cisco SD-WAN or Cisco DNA (Digital Network Architecture). In this article, we will focus on direct automation of selected Cisco Systems network devices.

When we use Ansible to manage GNU/Linux hosts, Ansible copies a module written in Python to the managed host and runs it there. This has a great advantage, especially when tasks are complex and time-consuming: the node used for management carries very little of the load, so even with modest resources it can serve a very large number of hosts. Network automation looks a bit different, at least on the inside, because from the operational point of view nothing changes.

There are several reasons for this. A module that runs on a host usually operates directly on its configuration files. With network devices, you do not work with files directly – the configuration is available through an appropriate interface, such as CLI over SSH, XML over SSH or an API over HTTP/HTTPS. A module running on the host also creates various directories and files, which is often neither possible nor efficient on network devices due to the type of memory they use. The module also requires Python to run, which most older network devices unfortunately lack. As a result, it is most convenient to execute the Ansible module locally and then connect to the device through one of the interfaces it exposes.

Ansible establishes an SSH connection by default and tries to copy its module to the remote node. To change this behavior, we need to use one of the three connection types available (selecting one is sketched just after the list):

  • “network_cli” which uses CLI over SSH (persistent connection).
  • “netconf” which uses XML over SSH (persistent connection).
  • “httpapi” which uses the API over HTTP / HTTPS (persistent connection).
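
The connection type is chosen with the “ansible_connection” variable, set per host or per group. A minimal sketch of such a group_vars file (the file name and group assignment are only illustrative):

    # group_vars/switches.yml - connection settings for an illustrative "switches" group
    ansible_connection: ansible.netcommon.network_cli   # CLI over SSH
    # ansible_connection: ansible.netcommon.netconf     # XML over SSH
    # ansible_connection: ansible.netcommon.httpapi     # API over HTTP/HTTPS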

Previously, the “local” connection type was used for this purpose. It indicated to Ansible that the module had to be started locally and that the connection to the managed node should be established in the manner specified in the “provider” dictionary. Unfortunately, this was not efficient, as the connection was re-opened and closed for every task. Using the “local” connection type is no longer recommended.

In the case of network automation, we must assume that the module will run directly on the managing node, not on the network devices. Often it will send commands to the network devices, wait for the information they return, process it locally, and then send something back to them. Therefore, with a large number of network devices, more Ansible nodes may be required. Such clusters can be managed from one place using the Red Hat Ansible Automation Platform.

A playbook’s tasks are performed on each node sequentially, in the order in which they are arranged, but they can be performed on many nodes at the same time. How many at once depends on our configuration – by default 5. This can be changed with the “-f” or “--forks” option of the “ansible” and “ansible-playbook” commands, or with the “forks” parameter in the “ansible.cfg” file. This parameter determines the number of parallel processes available to run the tasks – the main process splits, or in other words forks, into many smaller ones. By default, Ansible waits until all nodes have finished the current task before starting the next task on any of them, even when some of the processes are idle and doing nothing. This default behavior comes from the default setting of the “strategy” parameter, which is “linear”.
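
For example, the number of forks can be raised in “ansible.cfg” or per run on the command line (the value 20 is just illustrative):

    # ansible.cfg
    [defaults]
    forks = 20

    # equivalent one-off setting on the command line:
    # ansible-playbook --forks 20 site.yml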

When a small group of devices is significantly slower than the rest, it is worth considering changing this parameter. With the “strategy” parameter set to “free”, Ansible executes the next tasks as soon as there are free processes to use. It still executes the tasks on each node in the correct order, but processes no longer needed for the current task start handling subsequent tasks on the nodes that have already completed the previous one. This makes better use of the available processes, which translates into better overall performance.
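
A play that uses this strategy might look like this (the group name and task are only illustrative):

    - name: Example play that does not wait for slower devices
      hosts: switches            # illustrative group name
      gather_facts: no
      strategy: free             # nodes move on to the next task as soon as a fork is free
      tasks:
        - name: Collect the software version
          ansible.netcommon.cli_command:
            command: show version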

Additionally, if performance is important to us, we should avoid issuing the “show running-config” command. It is one of the most resource-consuming commands that can be sent to a network device. Processing it takes so long that we advise against running it, or any commands based on it, across a large number of network devices or using it routinely in playbooks.
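
If we only need a specific piece of information, it is usually better to ask for just that. A small sketch (the command and register name are illustrative):

    - name: Check interface status without pulling the whole running configuration
      cisco.ios.ios_command:
        commands: show ip interface brief
      register: interfaces_output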

Now let’s go through a few examples of using Ansible to automate Cisco Systems devices. We will use two switches for this purpose:

  • Cisco Nexus 3172TQ with NXOS version 7 (Ansible modules prefixed with “nxos_“).
  • Cisco Catalyst 9300 running IOS-XE version 16 (Ansible modules prefixed with “ios_“).

Although we perform our tasks on only two switches, there could just as well be many, many more. These are simple activities, but performing them manually on a larger number of devices, or every time a service is created or deleted, is inefficient, time-consuming and prone to errors or omissions.

Our inventory file has been slightly modified. With “ansible_network_os” we have indicated which operating systems our network devices run. We also used the simplest way of logging in to them: a user name and an explicit (clear-text) password. Ansible allows you to store sensitive information such as passwords in special credential containers (Ansible Vault), so that it is not kept in clear text. However, since our goal here is to show the easiest way to manage network devices, and we have not shown this connection method yet, we will use it now.
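
Such an inventory might look more or less like this (host names, IP addresses and credentials are purely illustrative):

    # inventory.ini - illustrative hosts and clear-text credentials
    [ios_switches]
    cat9300 ansible_host=192.0.2.10

    [nexus_switches]
    nexus3172 ansible_host=192.0.2.20

    [ios_switches:vars]
    ansible_network_os=cisco.ios.ios

    [nexus_switches:vars]
    ansible_network_os=cisco.nxos.nxos

    [switches:children]
    ios_switches
    nexus_switches

    [switches:vars]
    ansible_connection=ansible.netcommon.network_cli
    ansible_user=admin
    ansible_password=SuperSecretPassword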

We will use the “network_cli” method to establish connections. We’ll start with something simple and useful: collecting configurations from all network devices. For this purpose we will use the “backup” parameter of the “ios_config” and “nxos_config” modules.
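
A sketch of such a playbook (the group names are the ones from our illustrative inventory):

    - name: Back up the configuration of IOS-XE devices
      hosts: ios_switches
      gather_facts: no
      tasks:
        - name: Save the running configuration to the local backup/ directory
          cisco.ios.ios_config:
            backup: yes

    - name: Back up the configuration of NXOS devices
      hosts: nexus_switches
      gather_facts: no
      tasks:
        - name: Save the running configuration to the local backup/ directory
          cisco.nxos.nxos_config:
            backup: yes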

If we do not explicitly indicate where the backups should be stored and do not provide file names for them, Ansible will create a directory called “backup/” in the current directory and place the configurations pulled from the network devices there.

Now let’s do something more demanding. On the Cisco Catalyst 9300 switch, we will secure the VTY lines through which the switch is managed. Management will be possible only from two IP addresses and only via SSH. Additionally, idle sessions will be automatically logged out after 15 minutes of inactivity. If the configuration changes as a result, it will be permanently saved.
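
A sketch of how this could be written (the ACL name and the permitted IP addresses are illustrative); the individual parameters are discussed in the following paragraphs:

    - name: Secure management access on the Catalyst 9300
      hosts: ios_switches          # illustrative group name
      gather_facts: no
      tasks:
        - name: Rebuild the management ACL from scratch
          cisco.ios.ios_config:
            before:
              - no ip access-list standard MGMT-ONLY
            parents: ip access-list standard MGMT-ONLY
            lines:
              - permit 192.0.2.100
              - permit 192.0.2.101
            match: strict
            save_when: changed

        - name: Allow access to the VTY lines only via SSH and log out idle sessions
          cisco.ios.ios_config:
            parents: line vty 0 15
            lines:
              - access-class MGMT-ONLY in
              - transport input ssh
              - exec-timeout 15 0
            save_when: changed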

CLI commands available in Cisco IOS-XE and Cisco NXOS are tied to the section, or context, in which they are issued. This means that the same command can do quite different things depending on where it is issued. The commands we want to issue are placed in the “lines” list. The “parents” parameter allows you to set the appropriate context or section in which they will be issued.

It also happens that we want to remove parts of the existing configuration before issuing new commands. The “before” parameter is used for this purpose. In our example, we first delete the entire ACL and then build a new one from scratch. The commands in the “before” list are issued only if the module has to make changes.

Whether changes are needed is determined by the “match” parameter, which compares the current configuration with what we want to achieve within the given context. This parameter can take the following values:

  • “none” – nothing in the current configuration is checked, so the commands will always be sent.
  • “strict” – checks whether the requested commands are present in the current configuration and in the correct order.
  • “exact” – similar to “strict”, but the given section cannot contain anything else.
  • “line” – compares each command with the configuration line by line; lines that already match are not sent.

In our example, we chose “strict”, which means that the commands in the “lines” list are checked not only for their presence, but also for the position they should be in, which is crucial for ACLs.

Further on, thanks to the “parents” parameter, the commands in the “lines” list that secure VTY lines 0 through 15 are entered in the configuration mode of those lines.

On the Cisco Nexus 3172TQ switch, we will set the hostname, domain name, DNS servers, the MOTD (Message Of The Day) banner that appears during login, and one VLAN. Additionally, if the configuration changes, it will be permanently saved.
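
Again, a sketch of what such a playbook might look like (the hostname, domain, DNS addresses, VLAN and banner text are all illustrative):

    - name: Basic system settings on the Nexus 3172TQ
      hosts: nexus_switches
      gather_facts: no
      tasks:
        - name: Set hostname, domain name and DNS servers
          cisco.nxos.nxos_system:
            hostname: nexus3172
            domain_name: lab.example.com
            name_servers:
              - 192.0.2.53
              - 192.0.2.54

        - name: Set the MOTD banner shown during login
          cisco.nxos.nxos_banner:
            banner: motd
            text: |
              Authorized access only!
            state: present

        - name: Create VLAN 100
          cisco.nxos.nxos_vlans:
            config:
              - vlan_id: 100
                name: SERVERS
            state: merged

        - name: Save the configuration if it has changed
          cisco.nxos.nxos_config:
            save_when: modified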

Network devices can also be managed using ad hoc commands. This applies both to making changes and to collecting certain data. The former may be useful, for example, when changing a password on a large number of devices, and the latter for inventorying or verifying the status of certain elements or services of a device.
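
A couple of illustrative ad hoc invocations (group names come from the sketched inventory; the secret is, of course, made up):

    # read data ad hoc - software version of all Nexus switches
    ansible nexus_switches -i inventory.ini -m cisco.nxos.nxos_command -a "commands='show version'"

    # make a change ad hoc - set a new enable secret on all IOS-XE switches
    ansible ios_switches -i inventory.ini -m cisco.ios.ios_config -a "lines='enable secret NewSecret123'"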

Separate specialized modules are used to gather facts from network devices. For devices with the IOS or IOS-XE system, the “ios_facts” module should be used for this purpose, and for devices with the NXOS system, the “nxos_facts” module.
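
A minimal sketch of gathering facts and displaying a couple of them (group name and subset are illustrative):

    - name: Gather facts from IOS-XE devices
      hosts: ios_switches
      gather_facts: no
      tasks:
        - name: Collect device facts
          cisco.ios.ios_facts:
            gather_subset: min

        - name: Show the software version and the device model
          ansible.builtin.debug:
            msg: "{{ ansible_net_version }} on {{ ansible_net_model }}"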

Below you can see our MOTD banner, which we set up on the Cisco Nexus 3172TQ switch:

It is also possible to use Jinja2 templates to generate configurations that will then be sent to network devices. What’s nice about them is that Jinja2 templates are supported directly by the network modules, so you don’t need to run them through the “template” module first.
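
For example, the “src” parameter of the “ios_config” module accepts a configuration template, which is rendered before being pushed to the device. The template file (the path, variables and NTP addresses below are illustrative):

    {% for server in ntp_servers %}
    ntp server {{ server }}
    {% endfor %}

And a task that renders it and sends the result to the device:

    - name: Configure NTP servers from a Jinja2 template
      cisco.ios.ios_config:
        src: templates/ntp.j2
      vars:
        ntp_servers:
          - 192.0.2.123
          - 192.0.2.124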

Finally, it is worth adding that the Red Hat Automation Hub includes many ready-made roles, plugins, modules and collections created by various manufacturers of network solutions. An example is the well-developed Cisco ACI (Application Centric Infrastructure) automation kit for network and policy management in data centers.

If you are interested in automating and orchestrating your network and its services on a larger scale, not necessarily only in data centers or enterprises, we encourage you to have a look at Cisco NSO (Network Services Orchestrator), which integrates with Ansible and can apply changes atomically across the entire network infrastructure.

 


Author: Marcin Ślęczek

Marcin works as CEO, Network Engineer and System Administrator at networkers.pl, which designs and implements IT Systems, Data Centers and DevOps environments. networkers.pl also sells software and hardware for building such environments and is a partner of well-known manufacturers such as Red Hat, Cisco Systems, IBM, Storware and VMware.