The 5 Dangers of Poor Network Timekeeping

We know there are issues with getting NTP timing from the Internet. These are the five key risks associated with taking time from the public internet:

  1. Operations Failure
  2. Data Loss
  3. Security Holes
  4. Potential Legal Liability
  5. Loss of Credibility

This White Paper details these 5 risks and what you can do to remediate them.

Implementing an NTP Server in your datacenter is one of the most cost-effective ways to increase the reliability, performance, and security of your network for many years to come.

For more details, visit https://empowerednetworks.com/timing/ntp-servers.

Want to know more? Simply reply to this email, or complete our secure online form.


Leveraging Wire Data Visibility for Monitoring

Visibility, at its simplest level, is the ability to instrument or observe an environment, giving us access to data that describes its state and performance characteristics. Monitoring can be described as the use of this data to identify situations of interest and then drive operational and business processes to address them. In essence, visibility and monitoring answer the question of how an environment is performing.

Traditionally, the focus of visibility and monitoring has been on the individual elements that make up the environment. The assumption is that these collections of data will somehow synthesize into an integrated view. This is rarely, if ever, achieved. As the services provided by IT move from being necessary evils to the primary interface point between organizations and their customers, this focus is changing. What is the quality of service delivery? What is the customer experience? These are the questions being asked. The business is demanding visibility that is relevant to it, and it is no longer willing to accept the opaque, disjointed, and technology-focused monitoring of the past.

The New Service Delivery Challenge

Successfully providing this visibility to the business can be challenging. Without question, technological advances such as Cloud Computing, the “Internet of Things” (IoT), and “Software Defined Networking” (SDN) are enabling innovation at a blazing pace. But these same technologies also make the environment vastly more complex with respect to monitoring.

Elastic computing, virtualization, and cloud computing have an obvious impact with respect to scale. But more important, they make the components that services are built from far more transient. Components may be spun up for days, hours, or even minutes in response to capacity demands, then spun down again just as quickly as the demand passes. The transient nature of these components invalidates many of the fundamental assumptions made by traditional monitoring systems. The IoT presents similar or even greater scaling challenges. More important, it introduces a huge population of devices that do not support monitoring in the style of legacy servers and network equipment. In many cases, they do not support monitoring at all.

The moves to agile development and continuous delivery introduce further challenges. Because both are associated with an accelerated rate of change, they contribute to the increasingly dynamic nature of these environments. Add to this the fact that in the continuous delivery paradigm, visibility becomes core functionality that can make or break the success of the delivery pipeline.

Watching Who’s Talking

Addressing these challenges requires fundamental changes in our approach. One of the more promising strategies comes from the realization that, for all the complexity, there remain sources of consistent data that can be leveraged. One of these is the information traversing the network, also referred to as wire data. The network carries all communication between the customer and a service, as well as between all the components that participate in providing that service. With appropriate technologies such as taps and packet brokers, this data can be accessed via completely non-intrusive mechanisms that place no load on network devices. Better yet, these mechanisms require no changes at all to the applications and other components participating in service delivery. Furthermore, the data itself is intrinsic to the delivery of the service, not additional overhead added to enable visibility.

Wire Data has Challenges

The value of wire data is not a new discovery. Network engineers have leveraged it for years to provide insight into the behavior of the network itself. But the amount of data is vast. As network capacity has expanded, the traditional focus on capturing all of the information in the form of raw packets has imposed a greater and greater “storage tax”. The signal-to-noise ratio is minuscule given the limited tooling and post-processing approach of traditional packet capture solutions. The difficulty of extracting usable knowledge from that volume of data has limited its successful use to a small number of highly technical network engineers.

Big Data Spawns Opportunities

It is only recently that advances in big data theory and tooling have begun to allow real-time normalization and analytics of this wire data in motion. This allows us to answer questions about the services as a whole. More important, these capabilities do not require in-house teams of data scientists and custom code development to realize. New products from companies like NetScout and ExtraHop provide this technical capability while still stressing ease of implementation and use. It is now possible for any organization to transform the data flowing through its network into knowledge that can be visualized and interacted with. This access is available to IT as well as business domain experts throughout the entire organization. It has even spawned a new area of practice in the form of “IT Operational Intelligence”.

Conclusion

Is wire data a panacea for visibility and monitoring in the modern age? Does it obviate the need for all other monitoring? Of course not. But it does provide one of the more promising visibility strategies for dealing with the scale, transience, and ambiguity of today’s services. This makes it a solid contributor to any modern monitoring strategy.

4 Reasons to Assess Your Monitoring Strategy and 1 Reason To Not

Monitoring is an interesting subject. You need to have it, but it is rarely core business. For many organizations, it is just another cost of doing business. As a result, it doesn’t always get the focus it should, despite the level of inefficiency that can result if you’re doing it wrong. A scant few years ago, this wasn’t such a big issue. Things were simpler, more straightforward, and, let’s be honest, a whole lot slower. But the world is changing, and it’s time to consider whether your monitoring strategy and implementation can cope with it. Here are 4 reasons you should assess your monitoring strategy, and 1 reason you shouldn’t.

Reason to Assess #1 – You have a lot more “things” to monitor.

The way that organizations deliver services has changed. The move from monolithic applications to microservices makes services more understandable and maintainable. It also cranks up the number of moving parts to levels not seen before. The Internet of Things (IoT) is characterized by unprecedented control over, and interaction with, what used to be dumb endpoints. It also brings equally unprecedented growth in the number of endpoints you need to manage. These factors, among others, are driving up the number of “things” you need to monitor, with increases often measured in orders of magnitude. For operations groups already drowning in a sea of red, the challenge of scale is only going up.

Reason to Assess #2 – You’re producing a lot more monitoring data.

Instrumentation of the monitored endpoints is the base of any monitoring strategy. Greater access to instrumentation options, both commercial and open source, combined with an “Instrument Everything” mentality, means we’re producing an ever-increasing number of metrics and events from each monitored element. The idea is that if anything happens, you’ll have captured everything about it. Unfortunately, most of this extra data is noise that drowns out the useful information, making it impossible to reasonably track what is going on. It seems a bit ironic, but adding more instrumentation to a legacy monitoring strategy can actually cause monitoring efficacy to drop rather than improve.

Reason to Assess #3 – The nature of modern service delivery is ephemeral.

Everyone understands the benefits of a “Just in Time” approach to the resources supporting your services. There’s no point in running a server 24/7 just to handle an occasionally intensive activity like end-of-month reporting. With the advent of cloud, container, and most recently serverless models, the ability to allocate resources on an as-needed basis is becoming a reality. The move to the “Software Defined Data Center” means infrastructure is only a logical construct defined at run-time. Even traditional infrastructure delivery around things like network and storage is moving into the realm of the transitory. Unfortunately, many traditional monitoring solutions assume that components under management are long-lived and, once in the system, will remain available to attach monitoring data to. In this new world of the ephemeral, these assumptions break at a fundamental level.

Reason to Assess #4 – Adoption of CI/CD

There’s no question that implementing CI/CD improves your ability to deliver functionality to users in a more timely manner while, at the same time, improving the quality of your product. But the corollary to this increased rate of deployment is a similarly increased rate of change. Older monitoring strategies assume relatively static services and require time to retool and adjust after a change. That time has now disappeared. When something goes wrong, operational personnel are forced into ad hoc manual processes and “Superhero” activities to try and save the day.

Reason to Not #1 – Hats, Pens, Shirts, and Other Vendor Swag

As we’ve seen, there are a lot of good reasons to assess your monitoring strategy, but there is also one very good reason not to: vendors trying to entice you to “add on” additional software. That is not to say there aren’t opportunities to leverage your existing investments as a base for improving monitoring in your environment. However, your driver should be a monitoring assessment that establishes a clearly defined need and strategic response, not simply a new software feature that a current vendor has tacked on. This is particularly true of vendors that provide legacy solutions. All of the reasons to assess listed here require fundamental changes to existing monitoring paradigms; tweaking the old ones just won’t cut it.

Conclusion

The world of IT is changing or, more realistically, has already changed. As with many things, the time to assess your monitoring strategy was actually some time ago. But, as the ancient Chinese proverb points out: “The best time to plant a tree was twenty years ago, the second best time is now”. By honestly evaluating where you are today, and where you want to go tomorrow, you can start to understand your organization’s needs and put the right monitoring in place not just for now but for the future.

HowTo: Start Using NetMRI Policy

Image showing two arrows on a chalkboard. The top arrow points to the left and reads "Insanity" and the bottom arrow points right and reads "Sanity." A hand at the bottom right holds a piece of chalk.

Sanity: We in network operations desire it.  Crave it, really.  And it seems tricky to find sometimes in complex network implementations.  When we deploy networks in the real world of mergers and acquisitions, upgrades and new vendors, it can be awfully hard to keep your sanity.  When I work with our customers to deploy Infoblox NetMRI, one of the tools I use to help improve sanity is NetMRI Policy.

What is NetMRI Policy?

Policy is a feature that’s a part of the full NetMRI license. The settings are found in Configuration Management -> Policy Design Center.  NetMRI provides policy as a way to test a managed device’s configuration and attributes.  A policy is a group of one or more rules. Rules test for the presence or absence of values in a device’s config.

Rules come in three flavours:

  • Simple Rule – Looks for config values that either must or must not appear in the config
  • Rule Logic Builder Rule – Combines several Simple Rules together using logical operators (and, or, not, If-Then-Else)
  • XML Rule – NetMRI stores all of its rules as XML, so the third option is to write rules as XML documents (there’s a separate blog post on XML Rules coming soon!)

Sanity Tests in NetMRI Policy

Our Sanity Tests check the most basic configuration items on a device.  We’ll create six to eight rules that test:

  • DNS
  • NTP
  • TACACS+
  • Time zone settings
  • Banners

There may be other configuration items that are important in your environment that you’d want to add.  Start by making a list of the most important items you’d expect to see on any device in your environment.

Let’s first look at the DNS rule, because it’s a good example of a Simple Rule.

DNS Rule

Image showing a NetMRI Simple Rule from the Policy module checking the configuration of DNS servers on a Cisco device.

It’s pretty simple, right?  We’re looking at the device’s configuration for a match on the configured DNS servers.  If they appear, the rule passes; otherwise it fails.  Notice the “Config File Must Contain” dropdown?  That lets us fine-tune the matching.  In this case, we need to find all of the lines, but we don’t care about the order.  We can also set this to look for:

Image showing the contents of NetMRI's Simple Rule ordering selector.

Also, notice the bottom section of the rule, “Config File May Not Contain.”  If your environment is like ours, there’ve likely been some changes over time, and you may have old DNS settings stuck in your devices.  Using the May Not list, you can test for these old settings.  The rule will tell you if it finds any.
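To make this concrete, the kinds of lines a DNS rule like this matches on a Cisco device look something like the following (the server addresses are purely illustrative):

    ip name-server 192.0.2.53
    ip name-server 192.0.2.54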

We Do Regexes Here, Too.

In the DNS example, we’re doing a straight text match: Either the whole line matches or it doesn’t.  But what if we need to match something like an encrypted password?  There’s no way to match that across devices.  Fortunately, all the NetMRI rule types allow for the use of Regular Expressions (regexes) for this kind of match.  Take, for example, matching an NTP authentication key.  You configure an NTP key on a Cisco device like this:

Screen shot of a terminal session to a Cisco device configuring an NTP authentication key.
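If the screenshot isn’t visible, the command in question looks something like this (the key number and key string are invented for illustration):

    ntp authentication-key 1 md5 MySecretKey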

If you have “service password-encryption” enabled on the device, IOS will hide the key in the configuration, so doing a “show running-config” will show you:

Screen shot of a terminal session to a Cisco device showing the running configuration.
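Again, for readers without the screenshot, the stored line comes back looking something like this (the encrypted string here is invented):

    ntp authentication-key 1 md5 06150E2F4A5C0817 7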

The value after “md5” will be different from device to device.  We can use a rule like this to check that there’s a key present:

Image showing an example of using a regular expression to match dynamic text in a device's configuration.

The “.*” in the first row is a regular expression meaning “any character, repeated zero or more times.”  Now we can match across many devices.
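Put together, the match line in such a rule reads something like this (key number again illustrative):

    ntp authentication-key 1 md5 .*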

Regexes are a powerful tool in rule building, but many people find them difficult to master.  I use Regex101 when developing rules to check that my expressions are right.

Rule Logic Builder

The next option in rule building is the Rule Logic Builder.  Basically, these rules take a bunch of Simple Rules and combine them using logical operations (and, or, not, and if-then-else).  I generally choose RLB rules when I want to address a single rule theme across several environments.  For instance, you might use RLB rules when there are different settings for a service per datacentre.

Take a look at this example:

This RLB will check the NTP and log server settings, taking into account an East Coast vs. West Coast configuration.  Each of the four rules is a Simple Rule like the ones we built above.  The expression in “Enforce This Rule” at the top of the box sets how the rules are combined.  In this case, either the East Coast NTP and Syslog rules both need to be true, or the West Coast pair both need to be true, for the overall rule to pass.
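In pseudo-logic, the “Enforce This Rule” expression works out to something like this (the rule names are invented for illustration):

    (NTP East and Syslog East) or (NTP West and Syslog West)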

XML Rules

The final option for building rules is native XML.  Under the covers, NetMRI saves all its rules as XML documents, and that’s also what the internal parser uses to execute tests.  XML rules can be complex, and they’re not the first tool to reach for.  They do, however, offer some benefits not found in the other rule types:

  • Access to Lists – Lists are a feature in NetMRI Job Management that lets you create data structures that work like dictionaries in Python or hashes in Perl
  • Iterators – XML rules can iterate through collections of blocks in device configs and perform an operation on each

There’s a separate post coming soon on XML rules, but here’s an example of one in our Sanity Check Policy.  To enable SSH access to Cisco devices, all of the vty lines must be configured with “transport input ssh.” 
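For reference, a vty block that satisfies this test looks something like the following on a typical IOS device (the line numbers vary by platform):

    line vty 0 4
     transport input ssh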

XML rules are perfect for this kind of problem.  Here’s an example:

Image of an example NetMRI XML rule that checks all vty lines on a Cisco device to make sure SSH is enabled.

OK, what does this do?  Lines 3 through 6 set up some variables to hold information as we process the config.  Line 8 breaks the configuration up into blocks, using “line (vty digit digit)” as the start boundary.  The rule feeds each block into a search looking for exactly one instance of “transport input ssh.”  Each match increments the pass-count counter, while each miss increments the fail-count counter.  The name of the vty is also added to a list called interface_names if the block fails to match.

If fail-count is greater than zero, we didn’t find “transport input ssh” on some of the vtys.  In that case, the rule fails, and the output will include a list of the vtys that didn’t match.  Otherwise, the rule passes.

Rules to Policy

After you’ve created your individual rules, you create a policy to contain them on the Policies tab.  The NetMRI docs cover this well, so I won’t go into details here.  Here’s what a finished policy looks like for our network:

Image of our NetMRI sanity check policy showing nine rules as well as the policy description and filter.

Finally, it’s time to deploy the policy on a device group.  Again, the NetMRI docs cover this well, so I won’t go into details.  I will, however, advise that it’s best to try the policy out on a small group before it gets into wide circulation, just to make sure it’s working the way you expect.

How This Helps

The example content that comes with NetMRI (DISA STIG v7 and v8, PCI DSS, SANS, IANA, and NSA best practices) covers some pretty large compliance regimes.  PCI DSS v3.0, for example, contains 39 individual rules.  DISA STIG v8 for L2 devices has 71 rules.  In both cases, the rules are meant as a starting place and require some massaging for use in a real-world network.  Starting with such large policies, especially if you don’t have a good sense of how NetMRI rules work, just ends up being frustrating.  By starting with a simple set of rules that reflect how your network works, you can quickly learn how to build good rules.  After you’ve got the skills, the sky is the limit.

Happy rule building!

Got DDoS? How to stop DDoS attacks on your NTP Server

DDoS attacks cost you time and money. Is your NTP server immune? Why take the chance?

Microsemi’s S600 and S650 NTP Servers, with the NTP Reflector option, are effectively immune to DoS (Denial of Service) attacks via NTP.

Check out Microsemi’s Application Note, which details the functionality that provides secure, accurate, and reliable NTP services to applications and systems exposed to the Internet.
