Troubleshoot to repair, or predict and prevent?

By Steve Henning, Network World, 06/10/2008
This vendor-written tech primer has been edited by Network World to eliminate product promotion, but readers should note it will likely favor the submitter's approach.

It sounds simple. Instead of spending hours or days troubleshooting an application slowdown or system outage, why not just avoid it to begin with?

Until recently, the only way for IT organizations to resolve problems was to sift through alerts, log files and trouble tickets and burn the midnight oil on conference calls. Today, powerful analytics and automation capabilities built into system management tools can help organizations identify and resolve issues before they become problems.

Interconnected business services have made management exponentially more difficult. Collecting more data isn’t the answer because:

* Monitoring static thresholds triggers a flood of alerts, most of which do not represent actual problems.
* Problems are identified by groups of abnormal behaviors, not a solitary metric.
* With tens of thousands of devices and millions of metrics, the manual correlation effort required to identify problems becomes impractical.
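The second point in the list above, that a problem shows up as a group of abnormal behaviors rather than a single metric, can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the function name `group_abnormal`, the `(window, resource, metric)` tuple layout and the sample resource names are all hypothetical and do not come from any particular management product.

```python
from collections import defaultdict


def group_abnormal(events, min_metrics=2):
    """Group already-flagged abnormal readings by time window and resource.

    events: iterable of (window, resource, metric) tuples (hypothetical layout).
    Returns {(window, resource): sorted metric names} for every group in which
    at least `min_metrics` distinct metrics were abnormal together.
    """
    groups = defaultdict(set)
    for window, resource, metric in events:
        groups[(window, resource)].add(metric)
    return {
        key: sorted(metrics)
        for key, metrics in groups.items()
        if len(metrics) >= min_metrics
    }


# Illustrative data: five abnormal readings, only three of which cluster.
events = [
    (0, "web-1", "cpu"),
    (1, "db-1", "latency"),
    (1, "db-1", "lock_waits"),
    (1, "db-1", "queue_depth"),
    (2, "app-1", "heap"),
]

incidents = group_abnormal(events)
print(incidents)
```

In this toy data the two isolated readings are dropped and only the three database metrics that misbehave in the same window are reported together, which is the kind of reduction in alert volume the list describes.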

The deterministic approach of static thresholds and manual correlation is not only ineffective but also cannot scale to accommodate increasing complexity. Highly complex service infrastructures demand a different, probabilistic approach.

Intelligent system-management solutions now employ sophisticated correlation algorithms to sample subsets of metric data and deliver accurate information about potential system behavior. In addition, new learning technologies continuously refine alert thresholds, producing dynamic thresholds that recognize and accommodate the normal ebbs and flows of business. A probabilistic approach allows organizations to solve problems faster and with far less manual effort.

Intelligent management solutions integrate with existing monitoring infrastructures, automatically collecting and analyzing metrics from across all tiers of an application, such as the Web server, application server and database tiers.

The first job for the intelligent management solution is to learn the normal behavior of the application. By continuously collecting data and applying dynamic thresholding algorithms, it should be possible to build a behavior model for each resource in your infrastructure. The solution can then compare real-time measurements of each metric with its expected range of values to determine when a measurement should trigger a threshold violation.
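As a rough illustration of what such a behavior model might look like, the following Python sketch learns a separate expected range for each hour of the day and flags a measurement only when it falls outside that range. It is a minimal sketch, not a description of any vendor's algorithm: the class name `HourlyBaseline`, the mean plus or minus `k` standard deviations rule, the `min_samples` cutoff and the sample values are all assumptions made for the example.

```python
from collections import defaultdict
from statistics import mean, stdev


class HourlyBaseline:
    """Hypothetical per-metric behavior model keyed by hour of day."""

    def __init__(self, k=3.0, min_samples=5):
        self.k = k                        # width of the band in standard deviations (assumed)
        self.min_samples = min_samples    # samples required before alerting (assumed)
        self.samples = defaultdict(list)  # hour of day -> observed values

    def learn(self, hour, value):
        """Record one observation for the given hour of day."""
        self.samples[hour].append(value)

    def expected_range(self, hour):
        """Return (low, high) for the hour, or None if too little data."""
        values = self.samples[hour]
        if len(values) < self.min_samples:
            return None
        mu = mean(values)
        sigma = stdev(values)
        return (mu - self.k * sigma, mu + self.k * sigma)

    def is_violation(self, hour, value):
        """True only when a learned range exists and the value is outside it."""
        bounds = self.expected_range(hour)
        if bounds is None:
            return False
        low, high = bounds
        return value < low or value > high


# Illustrative history: requests per second at 09:00 (busy) and 03:00 (quiet).
model = HourlyBaseline()
for v in [100, 105, 98, 102, 101, 99]:
    model.learn(9, v)
for v in [10, 12, 9, 11, 10, 11]:
    model.learn(3, v)

busy_hour_alert = model.is_violation(9, 100)   # normal for 09:00
quiet_hour_alert = model.is_violation(3, 100)  # far above the 03:00 range
print(busy_hour_alert, quiet_hour_alert)
```

The same reading of 100 is treated as normal during the busy hour and as a violation during the quiet hour, which a single static threshold cannot express. A production model would also need to age out old samples and handle weekly or seasonal patterns; those details are left out here.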
