

Are you ready for IBM SmartCloud Analytics - Predictive Insights?

02 Oct 2014

(first published on the IBM Service Management 360 site)

With all the buzz about analytics being the next big thing in performance management (PM), you owe it to yourself and your company to assess and understand the reality of this technology. This “next big thing” is here today, and it is already being deployed by your competitors to improve the lives of their clients.

There is incredible pressure to investigate this technology and understand what it can do for your business. In order to maximize your chances of success when carrying out such investigations, I recommend that you pause for a few moments and think about whether you and your organization are actually ready to play.

There are many offerings and components that fall under the broad umbrella of analytics. My own immediate experience is with IBM SmartCloud Analytics – Predictive Insights and the anomaly detection it provides. This is but one kind of technology, and though my comments below are based on this offering, the basic considerations are more widely applicable and can guide you as you seek to understand and eventually incorporate these capabilities into your environment.

Prior to deploying SmartCloud Analytics – Predictive Insights or a similar offering, there is generally an expectation that the target environment has a certain level of PM maturity. The areas of concern are:

  • Data collection mechanisms
  • Analytic output presentation and visualization mechanisms
  • Processes (human) to deal with analytic output

A basic requirement for success is to have reasonably up-to-date systems and processes in each of these areas. However, let’s look a little more deeply at what is implied by that.

Where’s the data?

Analytics starts with data, so the initial question is: “From where do the analytics get the data?” Analytics systems typically focus on the analysis of data, not on its day-to-day collection. That collection activity is better left to existing performance management systems, with the analytics systems consuming data from those PM systems. The analytics tools themselves generally expect to connect to the various data repositories in the PM systems and extract the desired data, either in near real time or in backlog mode, depending on the use case. An entry-level requirement for analytics, then, is an existing PM infrastructure in which data is regularly collected, stored, summarized and more. This might seem obvious, but we have had a few interested clients come to us wanting to explore our analytics technology, only to find that they did not have basic, robust data collection regimes in place.
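
To make the data question concrete, here is a minimal sketch of the kind of extraction an analytics tool might perform against an existing PM metrics store. The SQLite store, table name and column names are illustrative assumptions on my part, not the actual mediation layer used by Predictive Insights.

```python
# Hypothetical extraction of recent KPI samples from an existing PM metrics store.
# The store, schema and lookback window are assumptions for illustration only.
import sqlite3
from datetime import datetime, timedelta

def extract_recent_metrics(db_path, lookback_minutes=15):
    """Pull the latest KPI samples from a (hypothetical) PM metrics repository."""
    since = datetime.utcnow() - timedelta(minutes=lookback_minutes)
    conn = sqlite3.connect(db_path)
    try:
        cursor = conn.execute(
            "SELECT resource, metric, timestamp, value "
            "FROM kpi_samples WHERE timestamp >= ?",
            (since.isoformat(),),
        )
        return cursor.fetchall()
    finally:
        conn.close()

# In near-real-time mode this query would run on a short schedule; in backlog
# mode the same query would cover a long historical window for model building.
```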

Who – or what – gets the output?

The insights derived from the analytic processing need to be made available to operators to act upon, so the next key question is: “Where do the analytics tools send their output?” Often, insights are delivered by the analytics tools as events (for example, an event indicating an emerging problem with a particular server) sent to your event management system. Beyond that, the analytics systems often come with their own user interface (UI) to supply the details and help the operator decide what to do with the information (see my previous blog post, Anomalies, alarms and actionability, for a sense of the kinds of discrimination the operator needs to make).
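
As an illustration of that event hand-off, here is a minimal sketch that packages an anomaly insight as an event and posts it to an event management endpoint. The field names and the HTTP endpoint are hypothetical stand-ins; a real integration would use whatever probe or API your event management system actually exposes.

```python
# Hypothetical translation of an anomaly insight into an event for an event manager.
import json
import urllib.request

def send_anomaly_event(endpoint_url, resource, metric, severity, summary):
    event = {
        "node": resource,           # the server or component showing the anomaly
        "alert_group": "analytics",
        "alert_key": metric,
        "severity": severity,       # e.g. 3 = minor, 4 = major (assumed scale)
        "summary": summary,
    }
    req = urllib.request.Request(
        endpoint_url,
        data=json.dumps(event).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(req) as resp:
        return resp.status
```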

In order to seamlessly integrate the analytics into your environment, you should have a sense of how the output information can be integrated with other existing systems, typically those from your other PM systems. For example, do you want the events to go to your existing event handling mechanisms, or will you create new ones specifically for these analytics? Do you want the analytics UI to be integrated into your existing portals? What will you do with the insights?

Finally, the analytic insights gathered are generally new, and they often span more than just one tier; for example, both operating system and application perspectives may be presented together. These deeper insights and broader perspectives are different from what is usually handled by, for example, operators dealing with anomaly events from a single application. Therefore, to interpret and act upon these new insights, new organizational processes and wider skills may be required. I say “may” because this very much depends on the skills available in the organization. Minimally, some thought needs to be given as to who will be the recipient of this new analytic information, the human-centric processes for dealing with the information and how those processes will relate to and complement existing ones. This is probably one of the more interesting areas given that the processes are generally very particular to each individual environment.

Maturity in all the above areas is key to successful deployment and integration of new analytic capabilities in general. However, even before we get to the stage of attempting a full formal deployment, a proof of concept (POC) or trial is often required. Although these are necessarily limited in scope, the three areas of concern and the implied questions above still need to be answered before POC progress can be made. Of course, the answers for a POC may be a little different than those for production deployment, but more often than not they are actually the same answers!

POCs tend to be much smaller than a full production deployment, so one additional question emerges: “Which subset of your environment will the analytics focus on?” It is important to focus on a subset for many reasons, and scale is just one consideration. Another is the availability of other information, such as trouble tickets and issue reports, so that you can cross-reference the analytics output with known information and assess its usefulness (sketched below). It is also important that the chosen scope is actually experiencing issues worth solving. Why else would you be exploring this technology? Believe it or not, I have participated in trials where we learned in hindsight that the selected area of the environment was not having problems! In some cases there had been problems many months earlier, but those problems had already been resolved. After a trial period in which not much of interest turned up, we realized what had happened. It seems silly in hindsight, and filtering out such situations early on became a best practice for us.
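
For the cross-referencing idea, a minimal sketch might look like the following. The record layouts and the 30-minute matching window are illustrative assumptions, not part of any product API.

```python
# Hypothetical cross-referencing of analytic anomalies against trouble tickets
# by resource and time proximity, to judge whether the analytics output is useful.
from datetime import timedelta

def match_anomalies_to_tickets(anomalies, tickets, window=timedelta(minutes=30)):
    """Return (anomaly, ticket) pairs where the same resource had a ticket
    opened within `window` of the anomaly being raised."""
    matches = []
    for a in anomalies:      # each: {"resource": ..., "time": datetime, ...}
        for t in tickets:    # each: {"resource": ..., "opened": datetime, ...}
            if a["resource"] == t["resource"] and abs(a["time"] - t["opened"]) <= window:
                matches.append((a, t))
    return matches
```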

Finally, beyond POCs, the scoping question applies to full deployments too. You’ll always be trying to pick some area of your environment to focus on, and choosing the right area, with the right kind of problems that resonate with your chosen analytic technology, is vital.

If you can convince yourself that you are in good shape with respect to the areas mentioned above, and can answer the implied questions, then you are well on your way to being ready to give this technology a try. I wish you luck in your explorations, and I welcome your comments and thoughts on this topic either here or on Twitter @rmckeown.
