I work with a lot of clients, helping them understand the monitoring capabilities they have, how they are using them, and where they may have deficiencies or duplication in their monitoring tooling. In many cases, these clients have entire collections of tools that look like the market analysts’ top 10 favorites list – tools from acknowledged leaders in the monitoring space, happily crowded into the upper right of magic quadrants. Yet I’m generally involved with them because, despite all these tools – complete with the associated administrative overhead and maintenance costs – they don’t feel that they are able to address their monitoring needs.
Successfully choosing and implementing the right tools for monitoring is a multipart process, but most organizations focus on only one aspect. Let’s face it, the world of IT monitoring has a lot of technical challenges, and it is these challenges that interest monitoring folks. We like the technical stuff. We tend to be fully engaged in deciding which tool is best from a technical point of view, including all the familiar activities such as scouring analyst reports, doing technical evaluations, proofs of concept, bake-offs, and the like.
These technical considerations are all important pieces of the decision process and ensure that the chosen tool has all the bells and whistles. However, if they are the primary considerations, which they all too often are, it can be challenging for an organization to later clearly define the specific functions that the tools serve in its environment, or how to measure whether they are fulfilling those functions. Lacking a better understanding, you can end up measuring the success of the monitoring environment in technology-related dimensions, such as the proportion of the monitoring tool’s features that are being used. It’s not uncommon to hear a client say that they are only using a small proportion of a given tool’s capabilities. Sure, the proportion of features used is a measure of a tool, but it is one that has little relationship to the capabilities needed to solve real business problems. The result can be tools in search of a problem, where the monitoring teams spend their time trying to justify the tool spend to a business that doesn’t understand their role.
Avoiding this situation starts with a focus on defining what is to be monitored before talking about the how. This means getting the service owners involved, as they’re in the best position to understand what is needed. You may not even want to use the term “monitoring” initially. Asking the service owners something like “What do you want to monitor?” or “What features do you need?” is opening yourself up to responses like “You’re the monitoring guys, not us – you tell us”.
Instead, the discussion should be about understanding the service itself. What work does a given service do, and how does it do it? How can you tell that it’s working as expected, or working at all for that matter? Are there any limits that the owners are aware of? How do they get to this information today? These are questions that the service owners should be able to answer and take responsibility for (if they can’t, you may have deeper problems).
Now you have a target: a set of information that the monitoring tools need to make visible, information that is defined by and important to the business. Because it’s linked to the business, you may also have a way of figuring out what that information is worth, based on the value of the service it is associated with. With this target in hand, you can go ahead and do the fun technical stuff of figuring out which tools are best placed to provide the information. Measuring the success of monitoring becomes much easier, as it can be focused on service delivery outcomes rather than the technical features of the monitoring tools.
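One way to make that target concrete is to write the service owners’ answers down in a form you can measure against. The Python sketch below is purely illustrative and rests on assumptions of my own: the class names, the `business_value` weight, and the two example services are all hypothetical, not features of any monitoring product. It records each piece of information an owner says they need, notes whether your tooling can show it today, and reports coverage weighted by the value of the service rather than by tool features used.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class ServiceSignal:
    """One piece of information a service owner says they need (hypothetical model)."""
    question: str                      # what was asked, e.g. "How can you tell it is working?"
    answer: str                        # the owner's answer, in their own words
    visible_in_tooling: bool = False   # can the current monitoring tools show this today?


@dataclass
class Service:
    name: str
    business_value: float              # assumed relative weight agreed with the business
    signals: List[ServiceSignal] = field(default_factory=list)

    def coverage(self) -> float:
        """Fraction of the owner's signals that the tooling makes visible."""
        if not self.signals:
            return 0.0
        visible = sum(1 for s in self.signals if s.visible_in_tooling)
        return visible / len(self.signals)


def weighted_coverage(services: List[Service]) -> float:
    """Coverage across services, weighted by each service's business value."""
    total_value = sum(s.business_value for s in services)
    if total_value == 0:
        return 0.0
    return sum(s.business_value * s.coverage() for s in services) / total_value


def example_services() -> List[Service]:
    """Made-up services and answers, for illustration only."""
    checkout = Service("order-checkout", business_value=3.0, signals=[
        ServiceSignal("What work does it do?", "Orders completed per minute", True),
        ServiceSignal("Is it working as expected?", "Payment success rate", True),
        ServiceSignal("Is it working at all?", "Checkout page responds", True),
        ServiceSignal("Any known limits?", "Payment gateway connection cap", False),
    ])
    wiki = Service("internal-wiki", business_value=1.0, signals=[
        ServiceSignal("Is it working at all?", "Home page loads", True),
        ServiceSignal("Any known limits?", "Disk space for attachments", False),
    ])
    return [checkout, wiki]


if __name__ == "__main__":
    services = example_services()
    for svc in services:
        print(f"{svc.name}: {svc.coverage():.0%} of needed information visible")
    print(f"Value-weighted coverage: {weighted_coverage(services):.1%}")
```

The signals with `visible_in_tooling` set to `False` are the interesting part: they are the gaps that a tool evaluation should be trying to close, and they come from the business rather than from a feature checklist.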
Simple, right? Well, no – and I won’t pretend it is. Doing this stuff is hard work. It requires communication, commitment, and empathy (things you’ll be familiar with if DevOps is part of your vocabulary). You may need to talk to groups you traditionally haven’t had much direct interaction with. In many cases, it represents a fundamental cultural change, and you may even want to bring in a partner to help mediate the interactions. But it is essential, and it is the only way to ensure that the monitoring tools you have are the ones you need, not just the ones with the shiniest technical goodies.