Tech Monitors On The Lookout For Trouble

Tools can help insurers, agents guarantee IT systems remain up and running

Organizations with a large information technology infrastructure require a way to remotely monitor their equipment and applications to ensure that they are up and running.

The most common and simplest monitoring tool is the "ping" test. When most people hear the word "ping," they likely think of golf. However, in this case, we are referring to a network test that is used to determine if a Web site or Web address is reachable, or to check if an Internet connection is up and running.

In a company's monitoring environment, a ping test is also used to determine if various network devices or servers are available. Although a ping test will indicate whether equipment is up or down, it definitely has its drawbacks–one of which is that it gives no indication as to the health of the device or applications running on it.

Because the test indicates only whether the device is reachable and responding, it is best used as a sort of weather vane. In other words, it is a quick diagnostic test that can help narrow down the general location of a problem and, ultimately, lead to more advanced troubleshooting techniques.

Within the monitoring environment, ping tests are automated and test most devices on a set schedule. If a failure does occur, the device controlling the ping test will send out an alert by e-mail or page, or record the event in a log file.
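
As a rough illustration of such an automated sweep, here is a minimal Python sketch. It is not taken from any particular monitoring product. It assumes a Unix-style `ping` command that accepts `-c 1`, and the names `ping_once` and `sweep` are hypothetical. A scheduler would call `sweep` at whatever interval the organization chooses, and the `alert` argument stands in for an e-mail, a page, or a log entry.

```python
import subprocess
from datetime import datetime, timezone


def ping_once(host, timeout_s=2.0):
    """Return True if a single ping to host gets a reply.

    Assumption: a Unix-style `ping` that accepts `-c 1` is on the PATH.
    """
    try:
        result = subprocess.run(
            ["ping", "-c", "1", host],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
            timeout=timeout_s,
        )
    except (subprocess.TimeoutExpired, OSError):
        # No reply in time, or the ping command could not be started.
        return False
    return result.returncode == 0


def sweep(hosts, probe=ping_once, alert=print):
    """Probe each host once; call alert() for every host that does not answer."""
    down = []
    for host in hosts:
        if not probe(host):
            down.append(host)
            stamp = datetime.now(timezone.utc).isoformat(timespec="seconds")
            alert(f"{stamp} ALERT {host} did not answer ping")
    return down
```

Note that a device can pass this check and still be unhealthy, which is exactly the limitation described above.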

Due to the limited usefulness of ping tests, more advanced diagnostics are necessary. Most organizations will install third-party server and device-based monitoring software.

These comprehensive, high-end monitoring systems will include modules for most popular operating systems and network devices. They collect very detailed information about the environment and report the data to a centralized monitoring server, which can send out alerts when a device fails or is about to fail.

For example, if disk space gets too low on a server or if CPU (central processing unit) utilization goes too high, a technician will receive a page to address the problem.
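
The threshold logic behind such a page can be sketched in a few lines of Python. The limits used here (10 percent free disk space, 90 percent CPU utilization) and the function names are assumptions chosen for illustration, not settings from any specific monitoring package.

```python
import shutil


def disk_free_percent(path="/"):
    """Percentage of free space on the volume that holds path."""
    usage = shutil.disk_usage(path)
    return 100.0 * usage.free / usage.total


def threshold_alerts(metrics, min_disk_free_pct=10.0, max_cpu_pct=90.0):
    """Return one alert message per metric that is outside its limit.

    metrics is a dict with "disk_free_pct" and "cpu_pct" keys; both
    limits are illustrative defaults, not vendor recommendations.
    """
    alerts = []
    if metrics["disk_free_pct"] < min_disk_free_pct:
        alerts.append(
            f"disk space low: {metrics['disk_free_pct']:.1f}% free"
        )
    if metrics["cpu_pct"] > max_cpu_pct:
        alerts.append(
            f"CPU utilization high: {metrics['cpu_pct']:.1f}%"
        )
    return alerts
```

In practice an agent on each server would gather the readings and a central server would apply the limits, so that a technician is paged only when a limit is crossed.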

The real benefit of having a comprehensive monitoring package is not only to provide 24/7 health checks but also to help technicians identify potential trouble areas long before equipment actually fails.

Although we have great tools that enable the IT staff to know whether there are problems within the computing environment, the question remains–do these tools really benefit the end-user? The short answer is yes. However, we can do much better.

Obviously, when a server fails, this directly impacts the end-user. In some cases, however, all the servers and network devices can be up and running while the users cannot work because of a slow or failed application.

So, how do we really know if the end-user experience is satisfactory? This is where application monitoring comes in.

Network and server monitoring gives a good indication of the health of our environment, whereas application monitoring measures the end-user experience. Application monitoring allows us to track two of the most vital indicators of the user experience–application response time and availability.
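
To make those two indicators concrete, the following Python sketch times a scripted user transaction and then summarizes a series of such checks. The names `timed_check` and `summarize` are hypothetical, and the `transaction` argument is assumed to be any function that performs one user action (for example, logging in and opening a mailbox) and raises an error if it fails.

```python
import time


def timed_check(transaction):
    """Run one scripted user action; return (succeeded, seconds_taken)."""
    start = time.perf_counter()
    try:
        transaction()
        ok = True
    except Exception:
        # Any error counts as the application being unavailable to the user.
        ok = False
    return ok, time.perf_counter() - start


def summarize(samples):
    """Return (availability_pct, average_response_s) for a list of checks.

    The average covers successful checks only and is None if there were none.
    """
    total = len(samples)
    good = [elapsed for ok, elapsed in samples if ok]
    availability = 100.0 * len(good) / total if total else 0.0
    average = sum(good) / len(good) if good else None
    return availability, average
```

The point of the design is that the check exercises the application the way a user would, rather than asking whether the server underneath it is switched on.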

I recently received a call from one of my staff members reporting end-user response time problems with our e-mail system.

According to our hardware monitoring tools, the e-mail server was available and working fine. However, our application monitoring system identified a one-to-two-minute response time problem, then alerted our support team to address it immediately.

Without this monitoring, it might have taken at least an hour to determine if there really was an e-mail problem, because the equipment appeared to be functioning properly.

An application monitoring environment can have several modes of deployment. In some instances it can be completely automated, where notifications are sent out at the first sign of trouble. In other cases, it can be a simple indicator, where a technician decides if support is needed.

We use a combination of automated alerts and technician-interpreted data. We have a central dashboard that gives a red, yellow or green status on any one of the 67 applications we monitor. An unexpected benefit that comes from our application monitoring deployment is that it also allows us to identify some hardware problems.

For example, we have several applications that are hosted in a third-party mainframe environment. If all the mainframe applications fail, we're pretty sure the error is either on the mainframe or on the link between our company and the third party's site.

In the past, our support teams would troubleshoot a single application, not knowing that other applications were having the same problem. In doing so, valuable time would be lost.
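
The reasoning in the mainframe example can be expressed as a simple grouping rule: if every application hosted in the same environment is red at once, suspect the environment or the link to it rather than any single application. The Python sketch below is only an illustration of that rule. The function name, the application names and the environment labels are all hypothetical.

```python
from collections import defaultdict


def suspect_environments(app_status, app_env):
    """Return environments in which every hosted application is red.

    app_status maps application name to "red", "yellow" or "green".
    app_env maps application name to the environment that hosts it.
    Environments with only one application are skipped, since a single
    failure says nothing about shared infrastructure.
    """
    by_env = defaultdict(list)
    for app, env in app_env.items():
        by_env[env].append(app_status.get(app))
    return sorted(
        env
        for env, statuses in by_env.items()
        if len(statuses) > 1 and all(s == "red" for s in statuses)
    )


# Hypothetical dashboard snapshot.
statuses = {
    "claims": "red",
    "billing": "red",
    "policy-admin": "red",
    "e-mail": "green",
    "intranet": "red",
}
hosting = {
    "claims": "mainframe",
    "billing": "mainframe",
    "policy-admin": "mainframe",
    "e-mail": "local",
    "intranet": "local",
}
print(suspect_environments(statuses, hosting))
```

With this snapshot only the mainframe environment is flagged, which points the support team at the shared host or link instead of three separate applications.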

Most people realize that equipment and application failures are inevitable in any organization, but what really sets a world-class IT department apart is how well it can respond to system issues and prevent them in the future.

We see plenty of server and network uptime reports in our IT department. In most cases, these reports mean little, simply because it is the experience of the end-user that serves as the ultimate gauge of how well an IT department performs.

For us to really measure the performance of an IT support team, our measures need to focus on the users. Server and application monitoring is a step in the right direction.

Tony Hashem is the chief technology officer for the life business at Genworth Financial in Lynchburg, Va. He can be reached at Tony.Hashem@genworth.com.

Caption:

Monitoring packages not only provide 24/7 health checks but also identify trouble areas long before equipment actually fails.

"Most people realize that equipment and application failures are inevitable, but what sets a world-class IT department apart is how well it responds to problems and prevents them in the future."

Tony Hashem
