Today I Passed the CCNA Wireless Exam

Wait, wasn’t I just studying for the CCIE? After my lab attempt, I decided it was important to branch out a little bit and develop a more T-shaped skillset. I came from a generalist background (read: jack-of-all-trades), then specialized in expert-level routing & switching, which serves as a great foundation for other networking and infrastructure-related skills.

Passing the CCNA Wireless is one of a few moves I’ve decided to make to ensure my realm of knowledge is not too narrow. I applied the skills I developed while studying for the CCIE toward learning the CCNA Wireless blueprint. I went from zero to passing the exam in about a month.

Well, that’s not completely true. About a year and a half ago I passed the vendor-neutral CWNA exam, which prepared me for the RF and 802.11 fundamentals sections. A year and a half is plenty of time to lose the specifics, but I was fortunate in that most of the details came right back to me. The Cisco WIFUND exam takes many of the fundamentals from the CWNA and introduces several Cisco wireless specifics.

My sole source of study for this was the CCNA Wireless Official Cert Guide. I read it cover-to-cover and made Q&A-style flash cards for details I wanted to remember as I was reading (links at the bottom). The OCG is very well written; however, I do not recommend using it as the sole source of study, as I did, if you are still relatively new to networking.

My experience taking the actual exam today was a relatively poor one, unfortunately. For starters, there were a few questions that were not covered at all in the OCG. The OCG does touch every part of the official Cisco blueprint, but it unfortunately does not go deep enough in some areas. In all fairness, this is a difficult feat to accomplish, especially when you are dealing with products featuring multiple GUIs; there is only so much room available in a textbook.

To that end, I’ve heard really good things about Jerome Henry’s CCNA Wireless video series (available on Safari), but I did not take the time to go through it. I highly recommend working through Jerome’s video course in addition to the OCG.

I had a couple of questions that were not specifically on the blueprint. This is allowed, though, as Cisco always states for every exam “The following topics are general guidelines for the content that is likely to be included on the exam. However, other related topics may also appear on any specific instance of the exam. To better reflect the contents of the exam and for clarity purposes, these guidelines may change at any time without notice.” This is their loophole so that you could never legally claim the exam is unfair.

I had the usual few questions that were worded very poorly, where the answers could change depending on exactly how you interpreted specific words. The worst offense, though, was a single-answer multiple-choice question in which two of the answers were provably correct according to Cisco’s own official documentation! On all of the questions I had issues with, I took my best guess and left detailed comments explaining why I thought the particular questions were poor. This is where some luck was involved in passing the exam. Cisco really does read the comments you leave on exam questions, so if you feel a question was poor in any way, do leave a comment, and be as descriptive as possible.

The bottom line is that this exam is definitely achievable with self-study, though as with all exams, real-world experience helps a tremendous amount. The exam blueprint provides a nice overview of wireless networking as it pertains to Cisco, and many of the concepts can no doubt be applied to other vendors as well.

You can download my flashcard deck here. As always, I recommend you create your own flashcards because they will be more meaningful to you; however, it is sometimes helpful to have an outside perspective. This deck contains 415 flash cards:

MikroTik Automated MPLS L3VPN Lab

I am breaking out of the Cisco wheelhouse a little bit by using MikroTik RouterOS to build on my previous work of automating a base-level lab configuration. Working with another network operating system that uses a completely different syntax allows you to learn the various protocols in a more meaningful way (in my opinion). When you configure a single vendor’s equipment, it is easy to get in the habit of pushing out configurations without taking the time to understand the meaning behind the commands.

This lab demonstrates an automated configuration of a basic MPLS L3VPN service, including four SP core routers, four SP PE routers, and four CE routers representing two different customers.

The building blocks of a basic MPLS L3VPN include:

  • Loopback IP addresses assigned to all SP routers.
  • Reachability between the loopbacks, whether through static or dynamic routing. A flat OSPF topology was chosen for this lab.
  • LDP to advertise MPLS labels between the SP routers.
  • BGP sessions between PE routers to distribute the VPNv4 NLRI.
  • VRFs for customers defined on the PE routers.
  • Redistribution of customer routing information into BGP for transport across the SP network.

My previous lab automation relied on telnet connections to devices using netmiko, which includes logic to handle Cisco devices (such as entering privileged EXEC and configuration modes). With this lab, I am still using telnet since the hypervisor (EVE-NG) acts as a telnet-based terminal server, but netmiko does not currently work with MikroTik in the way that I need it to. Through trial and error, I worked out a script using Python3’s built-in telnet library.

The Python3 script and associated YAML and Jinja2 files that accompany this post are available on my GitHub page.

Let’s dissect this lab a little bit. The Python3 script is based on the work I did previously with the multi-threaded configuration. The script imports and creates structured data out of the YAML file, then processes the Jinja2 template to build complete configurations. This process is exactly the same as what I had done before. The difference this time is in how the Python3 script handles sending the completed configuration to the MikroTik devices.
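
In case it helps to visualize that first step, here is a minimal sketch of the render logic. The file names routers.yml and mikrotik.j2 are placeholders for illustration only; the real names and the exact YAML layout are in the GitHub repo.

import yaml
from jinja2 import Environment, FileSystemLoader

# Placeholder file names -- see the GitHub repo for the actual ones.
with open('routers.yml') as f:
    devices = yaml.safe_load(f)  # assumes one mapping of template variables per device

env = Environment(loader=FileSystemLoader('.'),
                  trim_blocks=True, lstrip_blocks=True)
template = env.get_template('mikrotik.j2')

configs = {}
for device in devices:
    # Each device supplies host, router_id, interfaces, bgp_asn, bgp_peers,
    # and vrfs -- the variables referenced in the Jinja2 template below.
    configs[device['host']] = template.render(**device)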

Netmiko has had extensive testing and verification done for Cisco IOS behaviors, but not yet for MikroTik RouterOS. Configuration syntax aside, the general behavior of RouterOS is different from Cisco IOS as well. Upon booting an unconfigured instance of the CHR version of RouterOS, you are presented with a default login (admin / no password), and asked if you wish to view the software license agreement. My Python3 script accounts for this.

I also discovered that when pushing configurations this way (where the telnet connection represents a serial console port), MikroTik does not seem to respond to the newline character (“\n”), so I had to use “\r” instead. Finally, I discovered that the MikroTik CHR had trouble accepting commands in rapid succession: the pushed commands were applied inconsistently. My workaround was to introduce a one-second pause between sending each command. I also had to open a telnet console to every device in advance, even though the script establishes its own connections. For some reason, the configurations would not apply accurately and consistently unless I already had a telnet session pre-established in another window.
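
Putting those quirks together, the send routine ends up looking something like the sketch below, using Python3’s built-in telnetlib. The prompt strings and the login/license handling are approximations of what the CHR console presents, and the console IP and port are whatever EVE-NG assigns to the node; the actual script on GitHub is the authoritative version.

import telnetlib
import time

def push_config(console_ip, console_port, commands):
    """Push a list of RouterOS commands over an EVE-NG telnet console."""
    tn = telnetlib.Telnet(console_ip, console_port, timeout=10)

    # Unconfigured CHR: log in as 'admin' with a blank password, then
    # decline to view the software license agreement (prompt text approximate).
    tn.read_until(b'Login:')
    tn.write(b'admin\r')
    tn.read_until(b'Password:')
    tn.write(b'\r')
    tn.read_until(b'license')
    tn.write(b'n\r')

    for command in commands:
        # RouterOS over the console ignores '\n'; terminate with '\r' instead.
        tn.write(command.encode('ascii') + b'\r')
        # The CHR applies rapid-fire commands inconsistently, so pace them out.
        time.sleep(1)

    tn.close()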

The other issue set me back a few hours’ worth of troubleshooting, until I finally had to ask someone for help. Using the current version of EVE-NG as the hypervisor, I was running version 6.42.3 of the CHR MikroTik RouterOS. Every time I booted the routers, the interfaces were not connected as they were displayed in the lab topology. It was almost as if the actual interfaces were scrambled, and it seemed to be random with every boot. Fortunately, my friend Kevin Myers came to the rescue! There seems to be an unknown bug somewhere in the interaction between EVE-NG and RouterOS versions beyond 6.39. After talking to Kevin, I tried RouterOS version 6.38.7, and that resolved the issue.

Now on to the automated configuration, which is represented in the Jinja2 file.

The following configuration section applies to all routers in the lab. It sets the hostname (as defined in the YAML file) and adds a loopback interface. MikroTik treats a loopback interface no differently than a regular bridged Layer 3 interface (similar to an SVI or a BVI on Cisco equipment). I named the interface “lo1” by convention, but the name is arbitrary and can be anything.

/system identity set name {{ host }}
/interface bridge add name=lo1

The next section goes through all interfaces defined for each device and assigns the IP addresses. If an interface is to be enabled for LDP, the appropriate LDP configuration is added as well.

{%- for iface in interfaces %}
/ip address add address={{ iface[1] }}/{{ iface[2] }} interface={{ iface[0] }}
{% if iface[3] %}
/mpls ldp interface add interface={{ iface[0] }}
{%- endif %}
{%- endfor %}

This lab uses eBGP as the PE-CE routing protocol. If the device is a CE router (as indicated by having “CE” in its hostname), enable BGP for the defined ASN, redistribute connected routes, and add the PE BGP peer.

{%- if 'CE' in host %}
/routing bgp instance set default as={{ bgp_asn }} redistribute-connected=yes
{%- for peer in bgp_peers %}
/routing bgp peer add remote-address={{ peer[1] }} remote-as={{ peer[0] }}
{%- endfor %}
{%- endif %}

If the router belongs to the service provider (whether Core or PE), enable OSPF and define the router-id. Since I kept all interfaces in the global routing table within the 10.0.0.0/8 network, I added a blanket statement to enable those interfaces in OSPF area 0. This could easily be driven by variables in the YAML file if necessary; I did not do that here in the interest of keeping things simple. Next, LDP is enabled with the transport address set to the same IP address as defined on the loopback interface.

{%- if 'Core' in host or 'PE' in host %}
/routing ospf instance set 0 router-id={{ router_id }}
/routing ospf network add network=10.0.0.0/8 area=backbone
/mpls ldp set enabled=yes transport-address={{ router_id }}
{%- endif %}

Finally, the remaining configuration applies only to the PE routers to enable the MPLS L3VPN service.

This enables BGP globally and defines the local ASN:

{%- if 'PE' in host %}
/routing bgp instance set default as={{ bgp_asn }}

This configuration set goes through all defined VRFs for each PE. In the interest of keeping things simple, only a single VRF is defined on each PE, though more VRFs can be added simply by extending the YAML file. Each command begins with “/” (the trailing “\” just wraps long lines). The first command names the VRF, specifies which interfaces will participate, and sets the Route Distinguisher and the import and export Route Targets. The second command creates a BGP instance for the particular VRF and associates it with the VRF. The third enables redistribution of IPv4 routes from the BGP VRF RIB into global BGP VPNv4 routes to be sent to the other PE routers.

{%- for vrf in vrfs %}
/ip route vrf add routing-mark={{ vrf[0] }} interfaces={{ vrf[1] }} \
 route-distinguisher={{ vrf[2] }} export-route-targets={{ vrf[3] }} \
 import-route-targets={{ vrf[4] }}
/routing bgp instance add name={{ vrf[0] }} router-id={{ router_id }} \
 as={{ bgp_asn }} routing-table={{ vrf[0] }}
/routing bgp instance vrf add instance=default routing-mark={{ vrf[0] }} \
 redistribute-connected=yes redistribute-other-bgp=yes
{%- endfor %}

This final configuration section defines the PE BGP peers. The configuration is separated into two parts: is the peer part of a VRF or not? If the peer is part of a VRF, regular IPv4 BGP NLRI is exchanged. If the peer is not part of a VRF, we are only interested in exchanging VPNv4 NLRI.

{%- for peer in bgp_peers %}
{%- if peer[2] %}
/routing bgp peer add instance={{ peer[2] }} remote-address={{ peer[1] }} remote-as={{ peer[0] }}
{%- endif %}
{%- if not peer[2] %}
/routing bgp peer add remote-address={{ peer[1] }} remote-as={{ peer[0] }} \
 address-families=vpnv4 update-source=lo1
{%- endif %}
{%- endfor %}
{%- endif %}

Assuming all variables are correctly defined in the YAML file, running the Python3 script should produce successful results for a baseline configuration. Here is an example of the output on PE8:

This is the output of the corresponding CE router (CE1b):

On all routers, you can verify the IP routing table with:

/ip route print

This shows the current base-level routing table from CE1b. It has reachability to the routes in CE1a:

If we go to CE1a and create a new bridged interface, it will appear in the routing table of CE1b after an IP address is assigned to it. This is because our baseline configuration automatically redistributes connected interfaces.

Verified from CE1b:

On the PE routers, you can verify the LDP neighbors with:

/mpls ldp neighbor print

Verify the MPLS FIB with:

/mpls forwarding-table print

Verify BGP peers:

/routing bgp peer print

This lab keeps things simple to demonstrate the concepts. More advanced configurations such as the ability to have the same BGP ASN present at multiple customer sites, dual-homed CE routers, BGP route reflectors, and more complex route-target import/export scenarios are not demonstrated in this lab. However, I have shown how you can easily deploy a base configuration for which you can then dive into more advanced scenarios.

If you’re building an MPLS L3VPN, you’re always going to need the building blocks in place. By automating the base-level configuration, you save yourself from the monotony of typing the same commands over and over again before you can get to the more interesting stuff. As always, make sure you understand what you are automating before you do it, otherwise you will waste valuable time troubleshooting when things go wrong.

Easy Disaster Recovery Plan

DR plans encompass everything from no plan whatsoever (failing to plan is planning to fail), to active/active workloads distributed among several geo-redundant datacenters. This spectrum, just like nearly everything else in business, goes from zero to enormous cost and complexity. In the interest of keeping things simple, I designed a relatively inexpensive and uncomplicated enterprise DR plan that can be adapted and scaled with organizational requirements.

The initial design starts with two datacenters (or campuses, or a couple of boxes in a rented colo cage, whatever your situation may be). One is primary (Datacenter A), and the other functions as a cold standby with some secondary production traffic (Datacenter B). In this example, I am using a single router and a single VRF-capable Layer 3 switch in Datacenter B.

With this design, Datacenter A serves Site X and Site Y for normal production traffic. Datacenter B also serves normal production traffic, but mostly for secondary systems that take over when individual primary systems at Datacenter A become unavailable, which is different from a full disaster recovery scenario. This demonstrates the cost savings of starting out with partial redundancy. Services that are deemed critical to the infrastructure (such as DNS) have live systems simultaneously active in both datacenters. However, Datacenter B is not a full replacement for all systems should Datacenter A become completely unavailable.

The Layer 3 switch in Datacenter B is divided into two VRFs: one for secondary production traffic, and one for disaster recovery traffic which mirrors the prefixes available in the primary datacenter. The DR VRF is always active, and can be prepared and maintained using whatever resources are required before an actual DR event occurs.

During normal production, the router at Datacenter B (indicated in the diagram as BGP 64514) sees both the production 10.100.x.x/16 and DR 10.100.x.x/16 networks. When a router receives multiple routes from different protocols, it needs to decide which route to believe. This router is configured to install the production 10.100.x.x/16 routes from Datacenter A in its FIB as long as it is receiving them from BGP. If those routes disappear, the DR 10.100.x.x/16 routes are installed instead.

This means the DR 10.100.x.x/16 routes will be inaccessible from the rest of the network during normal production. With this design using a Layer 3 switch and VRFs, the SVIs are assigned to either the production or DR VRF. You can access the DR routes during normal production by configuring a jump host that contains at least two network interfaces: one in the production VRF, and one in the DR VRF.

Finally, a disaster recovery plan is incomplete if you do not test it. On the BGP router in Datacenter B, you can set up filtering to prevent the production 10.100.x.x/16 routes from being learned by BGP. This will cause the Datacenter B router to believe the DR 10.100.x.x/16 routes are the “real” routes. Likewise, for DR testing purposes, you should ensure that the Datacenter B BGP router does not advertise the 10.100.x.x/16 routes back to the rest of the network. You would only do that during an actual DR event. With the bidirectional filtering in place, you can use one or more hosts on the secondary production 10.200.x.x/16 network to test the validity of the recovery. After the results of the test are verified, you can remove the inbound BGP filtering to restore Datacenter B to normal production.

During a DR test, hosts in other parts of the network (such as Site X and Site Y) can still access hosts in the 10.200.x.x/16 network, and they will still use the 10.100.x.x/16 routes present in Datacenter A. Hosts in Datacenter A can reach hosts on the 10.200.x.x/16 network, but the replies will be sent to the DR VRF. You can use static routes on the Datacenter B BGP router to override this behavior. For example, if during DR testing a system in the DR environment absolutely must reach another system in the production network in Datacenter A, a static /32 route can be entered into the Datacenter B router pointing toward the MPLS L3VPN.

This design starts small and simple, which is sometimes all you need to meet the business goals. It also represents an older way of doing things by using an active/standby model, based on systems that are not prepared to have their IP addresses changed. The tradeoff is a much lower cost, since additional hardware and software licensing are not required until an actual disaster occurs (depending on what is being recovered).

A more modern (albeit more expensive) model is active/active using different subnets and load-balancers. The client-facing IP prefixes could be advertised from both datacenters, fronted by the load-balancers. The clients would reach their closest load-balancer (anycast style), and the load-balancer would redirect to the proper server to complete the connection (depending on the application architecture, of course). This is more expensive because it requires a duplication of hardware and software (including licensing) to maintain the active/active environment.

The tradeoff is a much quicker recovery time. In fact, users might not even be aware of an outage if proper planning and testing are performed. It all depends on the business weighing the risks against the costs. Some businesses can tolerate the active/standby model; many cannot.