How to Install Oxidized for Network Configuration Backup

Oxidized is an open-source project started by Saku Ytti and Samer Abdel-Hafez as an alternative to the very popular RANCID software. A little over a year ago, I created a RANCID server to back up the configurations of my network devices. It has been a good, stable piece of software that has done the job very well across hundreds of devices.

When I set up the RANCID server, I had heard of Oxidized, but the project wasn’t yet as far along as it is now. A few days ago, I decided to take another look at it. One of the things that made Oxidized more appealing to me right away is its companion web interface. While RANCID can be “web-enabled” with the viewvc interface, it is pretty limited in functionality. I found the Oxidized web interface to be exactly what I was looking for. It also supports a very wide range of network devices and network operating systems.

Like viewvc with RANCID, Oxidized lets you view current configurations and diffs between versions. Unlike viewvc, however, Oxidized also lets you search for terms across all of the configurations. If only some of your devices have a very specific configuration or inventory item, you can search for it and only the matching devices will be displayed. For example, in my environment, I can search for “PVDM” and quickly see which of my Cisco routers contain DSPs.

The web interface is also very fast! I have approximately 500 devices being backed up, and the web interface is always extremely responsive. Another feature of the web interface is the status of the last configuration poll for each device. You can see how long it takes on average to pull a configuration from the device, the number of times the configuration backup has failed, the failure rate, and the time of the last failure. This helped me identify a broadband link that was consistently slow, because its average run time was much higher than that of the other devices.

As wonderful as Oxidized is, one of its current drawbacks is a lack of good, complete documentation. When I set up an Oxidized server for my environment, I documented all of the steps I took, including the caveats I encountered along the way, to achieve a successful install. The following is a guide to setting up an Oxidized server on CentOS 7 with basic web authentication. As with many Linux-related installations, there are multiple ways to reach the ultimate goal, and what I have done may not be the best, most secure, or most optimized way, but I reached the end goal of a working installation.

Continue reading “How to Install Oxidized for Network Configuration Backup”

Today I Passed the CWNA Exam

I have been involved with both wired and wireless networking for many years. My original wireless setups were from the early 2000s, shortly after 802.11b became popular. I remember at one point I had a PCMCIA card with a pigtail and external antenna attached to it.

As my career started taking a focus more toward networking, I became intimately familiar with just about every aspect of wired networking. Having worked with wireless for so long, I knew a decent amount about how the technology works, but not nearly to the level of familiarity I have with Ethernet.

Occasionally, I look at various job listings just to see what employers generally expect within different levels of networking careers. I kept seeing wireless networking as a general skill, and in many listings, I saw the CWNA as either a requirement, or a “nice to have”. I decided it was time to finally bridge the divide in my networking knowledge and learn some wireless topics at a deeper level.

I feel like the CWNA exam is absolutely perfect for this. This exam is not so introductory as to have no value whatsoever, but it is not so deep that you have to devote a significant amount of time toward it to pass. I am not yet looking to devote myself to wireless networking, but the CWNP program does offer more advanced certifications for those who are. If I ever decided to pursue an even deeper level of wireless networking knowledge, I would definitely come back to the CWNP program and work on those additional certifications.

I started studying for this certification, and took and passed the exam on the first attempt, all within a little over a month. I will admit that because I already have CCNP-level knowledge, there were a lot of topics on the CWNA that I was already familiar with (and even a few that I disagreed with!). This made studying for the exam go a little faster.

My process was to first read the Official CWNA Study Guide all the way through. This took a couple of weeks, reading one or two chapters each day. In the past, when studying for a certification, I would take tons of notes, which ended up being somewhat useless to me. It took me a long time to break this habit. Note-taking may work well for some people, but I found out over time that the process doesn’t work for me. I still have all of the notes I’ve ever taken for every certification I’ve studied for, but simply reading my notes doesn’t really do much for me. This time, I took no notes while reading.

For this certification, after reading the entire certification guide, I took all of the chapter questions from the book, and all of the entries in the glossary, and made flash cards out of them in Anki. Using Anki, I was able to very quickly separate what I already knew from what I still needed to retain. After two weeks of spending an hour or so each day reviewing flash cards, I took the first of three online practice tests. I made new flash cards out of the questions that I missed, and continued to study. A week later, I took the second of three practice tests and did much better. Once again, I made cards out of the questions I missed.

Since I did so well on the second practice exam, I decided to schedule the real exam for the following week. I continued to review cards, and a few days before taking the test, I took the third of three practice exams. I did well, though not as well as on the second, which shook my confidence a little bit. It was still a passing score, so I continued reviewing the cards and kept the exam as scheduled. In the end, my flash card deck contained about 1100 cards.

The online practice exams are included with access to the textbook. I have a subscription to Safari Books Online (the best money I’ve ever spent in my life!), and I was able to register for access to the practice exams on the Sybex website. These official practice tests, along with Anki, absolutely transformed my method of studying and, more importantly, my retention of the information. I actually found the practice tests to be a little more difficult to pass than the actual exam, which was a nice bonus.

There are a lot of little details that you need to memorize to pass the CWNA exam. These are details that will definitely be forgotten after the test is over, unless you keep reviewing the material. But, the CWNA also teaches many different concepts and methodologies that revolve around the world of wireless networking, and this is the most important information that I believe will stick with you if you study for and pass the exam.

For example, suppose you are setting up a brand new 802.11ac wireless network where previously there was no wireless network at all (a Greenfield installation). You might not need to remember which Modulation and Coding Schemes 802.11ac uses, but knowing the essentials, such as the fact that 802.11ac operates only in the 5 GHz bands and that the 5 GHz frequency bands behave a little differently than the 2.4 GHz bands, will be excellent knowledge to have when you need to troubleshoot the wireless network post-installation.

The pricing of the CWNA-106 exam isn’t too bad ($175 as I write this), at least not compared to Cisco’s recent price hikes, and the process of studying for and earning the credential has been well worth it to me. I will now be able to discuss wireless networking much more intelligently, troubleshoot it more effectively, and plan and make appropriate proposals when needed.

General Network Challenges, and IP/TCP/UDP Operations

Having fundamental knowledge of what affects TCP, UDP, and IP itself helps you to better troubleshoot the network when things go wrong. I feel like most of the lower-level network-oriented certifications barely touch on these topics, if at all. However, the current Cisco CCNP and CCIE Routing & Switching exams do expect you to know this. This post is geared toward Cisco’s implementation and defaults regarding the various topics. However, whether you are studying for a certification or not, this is all good information to have.

This mega-post covers the following topics:

Continue reading “General Network Challenges, and IP/TCP/UDP Operations”

QoS in Action

Quality of Service is a value-added network infrastructure service that is still very important within the scope of private networks. Some might argue that QoS is not as important as it once was, now that we are seeing more SD-WAN deployments that use the general Internet (which has no inherent QoS) for transport. Additionally, many private networks do not utilize QoS whatsoever, and their operators essentially just “hope for the best” as all the different types of traffic traverse the various links. This may be due to a lack of awareness or training on the part of the operators, or it may simply be that the business has not placed enough value on it.

One of the ideas behind an SD-WAN deployment is that since the Internet does not offer QoS, you can attempt to work around this by having multiple connections, ideally from different service providers, and monitoring the end-to-end quality of the links through metrics such as bandwidth utilization, delay, and jitter. A good SD-WAN solution will monitor the links and could be configured, for example, to send voice and other delay-sensitive traffic over the link that is the least congested and/or has the lowest delay and jitter, while sending bulk data over a different link.

Even if you are using the general Internet for your transport, QoS may still be important if you consistently use all or the majority of your available bandwidth. You can’t control how your data will flow across the Internet after it leaves your private network, but you can control all aspects of your data until it reaches your private edge. One of the major benefits of using QoS is queuing/scheduling your traffic through classification and marking.

At a high level, you implement QoS by first classifying your traffic. This can be as simple as two classes, such as delay-sensitive traffic and everything else. The most common model uses four classes, and there is also a standardized eight-class model. Most networking equipment that supports QoS allows you to get even more granular if you wish. You determine classes based on characteristics such as the type of treatment or the relative importance of the traffic. You can also simply classify traffic based on its source or destination (such as all traffic to or from a particular server).
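On Cisco IOS, this kind of classification is typically done with MQC class maps. Here is a minimal sketch of the idea (the class names, ACL name, and server address are hypothetical, not from my actual configuration): one class matches traffic already marked DSCP EF, and another matches traffic destined for a particular server via an access list.

    ! One class matches existing DSCP EF markings; another matches traffic
    ! to a specific (made-up) server; everything else falls into class-default
    ip access-list extended BACKUP-SERVER
     permit ip any host 192.0.2.10
    !
    class-map match-all VOICE
     match dscp ef
    class-map match-all BACKUP
     match access-group name BACKUP-SERVER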

After classifying traffic, actions can be taken on the different traffic classes, such as marking or specialized treatment. Classified traffic is often marked using CoS at Layer 2 (such as Ethernet) and DSCP at Layer 3 (IP). A CoS marking is significant only on the local link, whereas a DSCP marking can be carried across the entire IP network. For example, traffic coming from an IP phone may be marked as CoS 5 by the switch the phone is connected to. Then, when the traffic crosses the first-hop router (which could very well be the same switch), the Layer 2 CoS marking may be mapped to DSCP “EF” at Layer 3. The DSCP marking may be ignored at various points in the network, but it will remain inside the packet header unless a network device purposely changes it. With QoS marked in the IP header, any device along the path that processes IP packets can examine the header and possibly take action, such as offering that particular packet different treatment.
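To make the CoS-to-DSCP example concrete, here is a hedged Cisco IOS sketch of what that first-hop marking might look like. The class name, policy name, and interface are made up, and whether you can match CoS at all depends on the platform and on the ingress link carrying 802.1Q tags.

    ! Hypothetical ingress marking policy: map the phone's Layer 2 CoS 5
    ! marking to DSCP EF at Layer 3, and re-mark everything else to best effort
    class-map match-all VOICE-COS
     match cos 5
    !
    policy-map MARK-INGRESS
     class VOICE-COS
      set dscp ef
     class class-default
      set dscp default
    !
    interface GigabitEthernet0/1
     service-policy input MARK-INGRESS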

The ultimate purpose of classifying and marking traffic is for queuing/scheduling, which is the process of determining which traffic is sent first. Network interfaces will normally use FIFO (first-in, first-out) scheduling when the link is not congested. However, when the link is congested, traffic that has been classified and marked as more important can be scheduled to be sent first.
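On Cisco IOS, this scheduling is expressed as an egress queuing policy. A minimal sketch, reusing the hypothetical classes from earlier (the percentages and interface are made up): VOICE gets a strict-priority queue capped at a percentage of the link, BACKUP gets a bandwidth guarantee, and everything else is fair-queued.

    ! Hypothetical egress LLQ/CBWFQ policy for a WAN-facing link
    policy-map WAN-OUT
     class VOICE
      priority percent 10
     class BACKUP
      bandwidth percent 20
     class class-default
      fair-queue
    !
    interface GigabitEthernet0/0
     service-policy output WAN-OUT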

When using the Internet for transport, you can’t control the treatment of your most important data once it leaves your network, but you can make sure at your Internet edge that the most important traffic gets sent out before any other traffic does. This is one of the main reasons why QoS is as important as it ever was, even with SD-WAN solutions that use the Internet for transport.

QoS scheduling is also important when data transits from a higher-speed link to a lower-speed link. For example, a company’s data center will almost always have much higher WAN-facing bandwidth than a branch-office WAN link. QoS scheduling once again ensures that higher-priority traffic makes it to the branch WAN link first. In MPLS L3VPN environments, for example, the service provider can offer QoS capabilities as a service (usually for an extra fee). If your data center has a 1 Gbps pipe toward your MPLS WAN, but your branch office is on a 1.5 Mbps T1, subscribing to the service provider’s QoS service can ensure that when a large file is blasted out to the branch office, the VoIP traffic will still receive preferential treatment because it will be scheduled first as it leaves the service provider’s router on the other end of the T1.

Another aspect of QoS is policing and shaping. A service provider will often use policing to create “sub-rate” links. For example, the SP may provide you a physical Gigabit Ethernet link, but you may only be paying for 200 Mbps of service. The SP uses policing to turn the gigabit link into an effective 200 Mbps link by dropping any traffic that exceeds the 200 Mbps mark. Policing is typically used on the ingress to a network. Conversely, shaping is typically used on the egress of a network. Shaping works by temporarily buffering excess traffic and then transmitting it when possible, which helps avoid dropping the traffic.
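If you are on the customer side of that sub-rate handoff, shaping your own egress is how you keep the provider’s policer from dropping your traffic. A hedged Cisco IOS sketch, nesting the hypothetical WAN-OUT policy from earlier inside a shaper (hierarchical QoS); this attachment would take the place of the flat WAN-OUT attachment shown above.

    ! Shape all egress traffic to the 200 Mbps contracted rate, and run the
    ! queuing policy inside the shaper so priority traffic still goes out first
    policy-map SHAPE-200M
     class class-default
      shape average 200000000
      service-policy WAN-OUT
    !
    interface GigabitEthernet0/0
     service-policy output SHAPE-200M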

Policing can also be very useful within your private network to prevent a source of traffic from overwhelming a particular destination. For example, if you have a server in your data center that provides some kind of updates to the computers in your network (such as a WSUS server), you could use granular policing to prevent it from overwhelming the slower branch-office links during regular business hours, while still offering it the full available capacity after hours.
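A minimal sketch of that kind of policer on Cisco IOS; the server address, branch subnets, rate, and interface are all hypothetical, and the after-hours exception could be handled with a time-range on the ACL.

    ! Limit traffic from the (made-up) update server toward the branch
    ! subnets to roughly 1.5 Mbps; excess traffic is simply dropped
    ip access-list extended WSUS-TO-BRANCHES
     permit ip host 10.1.1.50 10.200.0.0 0.0.255.255
    !
    class-map match-all WSUS
     match access-group name WSUS-TO-BRANCHES
    !
    policy-map LIMIT-WSUS
     class WSUS
      police 1500000 conform-action transmit exceed-action drop
    !
    interface GigabitEthernet0/2
     service-policy output LIMIT-WSUS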

As important as QoS is, I find it pretty amazing that it is not covered at all in the current Cisco CCNP R&S curriculum. It’s covered at earlier levels under Cisco’s Collaboration, Wireless, and Service Provider tracks, but the general R&S track does not mention QoS at all until the CCIE level (as of this writing).

Getting into QoS can seem very daunting at first. Like most technologies (or sub-technologies), there’s a new lexicon to learn, and not everything may seem obvious at first. When I first started exploring QoS as part of reading the CCIE OCG a couple of years ago, it did seem a bit overwhelming. Even though I could follow along and understand what I was reading as I read it, I wasn’t really able to retain much of it afterward, because at that point in time I’d never experienced it for myself. Working at my current job has changed that, fortunately.

Like so many things, witnessing it in action (especially in production) and repeated exposure to books and documentation have helped solidify the major concepts of QoS for me. Experience is great, and it really solidifies the things you learn when you study. But I am still a firm believer that you need to obtain the knowledge first (at least in a general sense) and then build the experience afterward. If I had not taken it upon myself to move past the CCNP R&S curriculum and explore the content within the scope of the CCIE, there are several things I would not even know about: QoS, the service provider side of MPLS, and working with VRFs, for example. These things represent tools in a toolbox, and knowing what tools you have to work with is the key to solving business problems and making you a success.

Update: I found out that QoS is indeed introduced now on the current CCNA exam. This is excellent news, and I would expect it to be covered somewhere on the next revision of the CCNP R&S.

The Data Center Move, Part 4

Part 1  |  Part 2  |  Part 3  |  Part 4

Over the next couple of weeks, we continued to migrate more portions of the network and less-critical systems over to the new data center. One of the issues we experienced was temporary route instability due to accidentally advertising the same routes through multiple BGP ASNs because of redistribution.

The overall WAN and Internet design of our network is hub-and-spoke. We use an MPLS L3VPN service and peer with the PE routers using eBGP, so all of the spokes can talk directly to each other (which is useful for all of the inter-office VoIP traffic), but both the primary business data and the Internet connectivity for the entire company flow back to the hub data center.

Over time, for various reasons, we ended up with multiple MPLS routers at the old data center facing the rest of the WAN. All the MPLS routers use the same BGP ASN, speak iBGP with each other, and peer with our MPLS provider using eBGP. Even though all the MPLS routers have equal access to the WAN (differing bandwidths aside), different routing policies had been put into place for various reasons over the years. For instance, all of our SIP sessions went over just one of the routers. We advertised the company-wide default route from two of the routers, but not all of them. We advertised our data center’s private /16 network out of all the routers, but advertised more-specific /24s for just a couple of subnets on only a couple of the routers. Nearly all of these routing patterns were established before I got here. Some of them made absolutely no sense to me, so I had to question their history, which often pointed back to one-off fixes that were supposed to be temporary but, of course, became permanent.

We’re primarily a Cisco shop as far as the routing and switching infrastructure goes, so we use EIGRP on the internal network. Both the new data center and the old one were connected together using the same EIGRP ASN. We perform mutual redistribution of BGP and EIGRP on all the MPLS routers, and we use route tagging to prevent routing loops. However, at the new data center, we used a different private BGP ASN.
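The tagging pattern I’m referring to looks roughly like this on Cisco IOS. This is a generic sketch rather than our actual configuration, with made-up tag values and AS numbers: routes that entered EIGRP from BGP carry a tag, and anything carrying that tag is blocked from being redistributed back into BGP.

    ! Tag BGP-originated routes as they are redistributed into EIGRP
    route-map BGP-TO-EIGRP permit 10
     set tag 100
    !
    ! Deny anything carrying that tag from going back into BGP
    route-map EIGRP-TO-BGP deny 10
     match tag 100
    route-map EIGRP-TO-BGP permit 20
    !
    router eigrp 100
     redistribute bgp 65001 metric 1000 100 255 1 1500 route-map BGP-TO-EIGRP
    !
    router bgp 65001
     redistribute eigrp 100 route-map EIGRP-TO-BGP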

I knew that because of the two different BGP ASNs, I would have to be careful about outbound route advertisements appearing to the rest of the network from multiple sources. So I used AS-path prepending to make some paths more preferable than others, while still allowing for redundancy in case one of the routers went down. But since all of the MPLS routers at both data centers were joined together by the same EIGRP ASN, and they were configured to do mutual redistribution of BGP and EIGRP, it ended up causing a problem that I didn’t see at the time but can now see very clearly in hindsight.
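For reference, AS-path prepending on Cisco IOS looks something like the sketch below; the AS number and neighbor address are hypothetical. Padding extra copies of your own ASN onto outbound advertisements makes that path appear longer, so the other data center’s advertisement wins while this one remains available as a backup.

    ! Prepend our own ASN toward the PE so this path is less preferred
    route-map PREPEND-OUT permit 10
     set as-path prepend 65002 65002 65002
    !
    router bgp 65002
     neighbor 192.0.2.1 route-map PREPEND-OUT out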

The routing table seemed stable, and everything was flowing properly. Then a couple of our remote branches were having WAN troubles, and their circuits went down. When they came back up, they could no longer reach any of our data center subnets. Or more correctly, they could reach us, but we couldn’t reach them. I didn’t think that a routing loop would have occurred, because at all points of redistribution, I set and matched tags to prevent that from happening.

The part that I can now see clearly with the benefit of hindsight is that when those branches went down, their routes were removed from the overall routing table. When they came back up and the routes were re-introduced, however, they were learned by BGP at our old data center, redistributed into EIGRP and passed on to the new data center, and then redistributed back into BGP under the different ASN, which caused a loop, but only for routes that had been removed from and re-introduced into the overall routing table.

Luckily, we were able to catch this very quickly and correct it, with only a couple of small branches experiencing the issue. As I write this, we are still in the middle of migrating the WAN over, and the way we chose to deal with this is to not redistribute EIGRP back into BGP at the new data center yet, and instead just advertise the routes we desire directly through BGP. It’s true that we could have just used filtering, but we only need to do this temporarily, and I thought this would be a much quicker and easier way to solve the problem.
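One way to do that on Cisco IOS is with network statements in place of the redistribution. A rough sketch of the idea (the ASN and prefixes are made up, and each network statement needs a matching route in the routing table):

    router bgp 65002
     no redistribute eigrp 100
     network 10.20.0.0 mask 255.255.0.0
     network 10.20.50.0 mask 255.255.255.0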

Having this job has been very exciting so far, and I feel like I came onboard at just the right time. Since this is my first “real” enterprise-level networking job, I spent the first several months getting up to speed and seeing where all the things I read about and studied for fit into place. Now I’ve reached a point of being able to put the more advanced things I’ve studied into action, and gain real practical experience from it which will propel me forward in my career.

As I said, I am well aware of the fact that the majority of people who are early on in their network careers will not have had the opportunities that I’ve had in experiencing this data center move. That is why I made sure to make the most of it and write down some of the many things I experienced and learned. The static EtherChannel problem, in particular, was a real nightmare and I don’t think I will ever forget it. More importantly, I now know exactly what symptoms to look for and how to fix it if something like that ever happens again.

I feel like I have gained a lot of really important experience in a short amount of time. I’m very grateful for it, and I’m always continuing to study and look toward the future with what this career may bring me, and to help solve the many problems of business.

Part 1  |  Part 2  |  Part 3  |  Part 4