Working as an IT pro might not sound like the most exciting role, but I like to think of it as being a detective: solving IT mysteries and making end users feel secure. Whether their network is running slowly or they have discovered malware on their system, we’ll be there to find the source of the problem and resolve it quickly, with little or no impact on their productivity.
But the super-sleuthing really comes into play when technology decides to throw a red herring your way. Although you are expecting – and prepared – to solve one issue, it turns out to be something else entirely. At this point, you have to act quickly to pinpoint the real problem and ensure your network remains safe and protected.
I’ve experienced this on several occasions, although the most memorable one was on a rather wet and dreary Monday in the network operations center. Now, every once in a while, you come across a day you both love and hate: Random Crisis Day. You love it, because the thrill of finding and resolving these issues truly makes you feel like the Sherlock Holmes of the IT world, but on the other hand, doing so can be quite a frustrating process.
Initially tasked with some traffic analytics implementation queries, I soon found myself tackling quite a different problem. This was a new office with just over 100 employees, new gear and a decent WAN link, but employees out in the branches weren’t able to send or receive emails. They were helpless, reduced to running around the office and unexpectedly longing for the days of snail mail and pigeon post. I would have to use my investigative powers to get these emails back on track – and quickly!
The server admin bounced the Exchange server to no avail, so the next step was checking the various Exchange settings. Despite all the new hardware, they hadn’t fully implemented application monitoring, so I spent an hour manually combing through settings and performance reports. The only things that stood out were long send queues and a low delivery rate. Memory, CPU and Windows performance counters all seemed normal.
While monitoring a client PC and running low on ideas, I looked into a recurring ‘ping’ that kept cropping up, and there it was: incredibly slow response times, even for a WAN. Luckily, the network administrator arrived to ask if we were fixing “the frozen network interface controller (NIC).” He’d received an alert on his phone the previous day from his network monitor. And so we discovered that several workstations had been running BitTorrent and were hogging the WAN link.
With another mystery solved, the lesson here is that while network monitoring solutions might be able to provide visibility of business processes and the status of inter-related systems, you can’t ignore the foundation. Otherwise, the simplest of problems can become the hardest to resolve and wreak havoc in the meantime.
A network monitoring solution gives you real-time analysis of the packets on the network and calculates the time it takes for a user to get information back from an internal system. Tools like this are vital to maintaining a secure and protected network, and network monitoring must go beyond ‘ping’. It must also include bandwidth information, dropped and errored packets, the state of WAN interfaces, and the status of network hardware resources such as CPU and RAM.
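To make "beyond ping" a little more concrete, here is a minimal sketch of the kind of check a monitor might run on a WAN interface. It assumes the monitor already polls cumulative interface counters (similar in spirit to the SNMP IF-MIB octet, error and discard counters) at two points in time. The `InterfaceSample` structure, the `interface_health` function and the threshold values are all hypothetical names and numbers chosen for illustration, not part of any particular product.

```python
from dataclasses import dataclass


@dataclass
class InterfaceSample:
    """One poll of an interface's cumulative counters (hypothetical structure)."""
    timestamp: float   # seconds since some reference point
    in_octets: int     # total bytes received so far
    out_octets: int    # total bytes sent so far
    in_packets: int    # total packets delivered so far
    in_errors: int     # total errored inbound packets so far
    in_discards: int   # total dropped inbound packets so far
    oper_up: bool      # whether the interface is operationally up


def interface_health(prev, curr, link_bps,
                     util_threshold=0.8, error_threshold=0.01):
    """Compare two samples and flag saturation, packet loss or a down link.

    Thresholds are illustrative assumptions. Counter wrap-around is
    ignored to keep the sketch short.
    """
    elapsed = curr.timestamp - prev.timestamp
    if elapsed <= 0:
        raise ValueError("samples must be in chronological order")

    # Convert byte deltas to bits per second over the polling interval.
    in_bps = (curr.in_octets - prev.in_octets) * 8 / elapsed
    out_bps = (curr.out_octets - prev.out_octets) * 8 / elapsed
    utilization = max(in_bps, out_bps) / link_bps

    # Share of inbound packets that were errored or dropped.
    bad = ((curr.in_errors - prev.in_errors)
           + (curr.in_discards - prev.in_discards))
    total = (curr.in_packets - prev.in_packets) + bad
    error_rate = bad / total if total else 0.0

    alerts = []
    if not curr.oper_up:
        alerts.append("interface down")
    if utilization > util_threshold:
        alerts.append("high utilization")
    if error_rate > error_threshold:
        alerts.append("errored or dropped packets")

    return {"utilization": utilization,
            "error_rate": error_rate,
            "alerts": alerts}
```

A link saturated by file sharing, as in the story above, would show up here as a "high utilization" alert on the WAN interface well before anyone started combing through mail server settings.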
With this kind of overview, you can resolve problems quickly and easily, be it a slow-responding application, a network error or a security breach.