Simple Network Management Protocol (SNMP for short) is a well-established technology for monitoring devices on your network. The protocol has been around for a long time, which means it can be a bit arcane to work with, but also that it is available on almost all modern computer and networking hardware. The details of how SNMP works are far beyond the scope of this article, but a knowledge of what SNMP is and does is essential for any manager who oversees an IT department.

SNMP has two basic functions. The first is the ability to connect to a device over the network and read data about that device. The available data is often extremely detailed and granular, which can make working with SNMP a bit challenging. On the positive side, that level of detail lets you monitor for almost any error condition. For example, you can read the temperature of the CPU in your servers to know whether they are running hot and might overheat.

The second feature of SNMP is the ability to send what is known as a “trap”. A trap is a notification that a device sends on its own when a defined condition is met, usually to some sort of monitoring system. For example, a trap can be sent to your monitoring system when the CPU on your server hits a certain temperature.
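The difference between reading a value and receiving a trap can be sketched in a few lines of Python. This is a toy model rather than real SNMP: the `ToyAgent` class, the identifier in `CPU_TEMP_OID`, and the 80-degree threshold are all assumptions made for illustration, and nothing is sent over a network.

```python
from typing import Callable, Dict, List, Tuple

# Hypothetical identifier for a CPU temperature reading. Real devices
# publish their own identifiers (OIDs) in vendor documentation.
CPU_TEMP_OID = "1.3.6.1.4.1.99999.1.1"


class ToyAgent:
    """Stands in for the SNMP software running on a monitored device."""

    def __init__(self, trap_threshold: float,
                 send_trap: Callable[[str, float], None]) -> None:
        self.values: Dict[str, float] = {CPU_TEMP_OID: 40.0}
        self.trap_threshold = trap_threshold
        self.send_trap = send_trap

    def get(self, oid: str) -> float:
        """Function one: the monitoring system asks, the device answers."""
        return self.values[oid]

    def update(self, oid: str, value: float) -> None:
        """Function two: the device itself sends a trap when the condition is met."""
        self.values[oid] = value
        if value >= self.trap_threshold:
            self.send_trap(oid, value)


# The list plays the role of the monitoring system that receives traps.
received: List[Tuple[str, float]] = []
agent = ToyAgent(trap_threshold=80.0,
                 send_trap=lambda oid, value: received.append((oid, value)))

print(agent.get(CPU_TEMP_OID))     # a read on request
agent.update(CPU_TEMP_OID, 65.0)   # below the threshold: nothing is sent
agent.update(CPU_TEMP_OID, 85.0)   # at or above the threshold: a trap is sent
print(received)
```

The point of the sketch is who starts the conversation: with a read, the monitoring system asks; with a trap, the device speaks up without being asked.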

Although SNMP was designed for network management, the technology has permeated almost all network-attached hardware. This means that with a good SNMP monitoring system you can monitor for problems throughout your organization. Here are just a few items you can monitor with SNMP:

  • Networking equipment like switches and routers. This lets you see and correct problems like switch failures or network congestion before they start impacting your business.

  • Servers. With modern servers you can monitor everything from the CPU to the hard drives. This means you can see small hard drive errors in time to swap a bad drive before it fails completely and you lose data.

  • End User Computers. With SNMP monitoring you can look for a host of problems on PCs, either to get a bigger picture of your business or to proactively resolve problems. From hardware failures to network congestion, you can see out into the field and know what might be coming.

  • Environmental Monitoring. You can use various monitoring equipment for things like:

    • Temperature, flooding and smoke alarms

    • Security Alarms

    • Camera Systems

    • Business specific needs like monitoring fluid levels or temperatures within equipment.

SNMP data can be monitored with many different types of tools. There are many SNMP-specific packages available, but these are often too big and complex for small to mid-sized businesses. SMBs often monitor using systems that also provide some other function, like manageability or remote monitoring and management. Often these systems have built-in tools that automate the collection of SNMP data in such a way that the SNMP components are hidden. What is important is to make sure that your staff knows how to add monitoring for problems that are specific to your business.

With today’s tools we often forgo the use of “traps”, since networks are fast enough to allow monitoring systems to pull data at frequent intervals. This allows data to be collected and alerts to be generated at the monitoring system, without the additional configuration required to set up traps. In addition, this method of alerting allows the alert conditions to be defined on the centralized monitoring platform rather than in hardware scattered throughout the environment.
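As a rough illustration of this polling approach, the sketch below keeps all alert conditions in one place on the monitoring side. The device names, the temperature limits, and the `poll_once` helper are hypothetical, and canned readings stand in for real SNMP reads over the network.

```python
from typing import Callable, Dict, List

# Hypothetical alert rules kept on the monitoring platform:
# device name -> maximum acceptable CPU temperature.
THRESHOLDS: Dict[str, float] = {"server-01": 80.0, "server-02": 75.0}


def poll_once(read_temperature: Callable[[str], float],
              thresholds: Dict[str, float]) -> List[str]:
    """Read every device once and return an alert for each one over its limit."""
    alerts = []
    for device, limit in thresholds.items():
        reading = read_temperature(device)
        if reading > limit:
            alerts.append(f"{device}: {reading:.1f} exceeds limit {limit:.1f}")
    return alerts


# Canned readings stand in for values pulled from the devices.
fake_readings = {"server-01": 62.5, "server-02": 78.0}
alerts = poll_once(fake_readings.__getitem__, THRESHOLDS)
print(alerts)
```

A real monitoring system would run a check like this on a schedule and store every reading. Changing a limit then means editing one table on the monitoring platform rather than reconfiguring each device.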

My final thought on SNMP has to do with manufacturing companies and other companies that may have complex machines or equipment. You may find SNMP in devices ranging from scales to robots. This means that your IT department should be working with your business units to look for opportunities where monitoring could benefit the business unit. It’s easy for the IT department to miss an opportunity to monitor for a condition on a robot that may cause a quality problem or might be a precursor to a failure in expensive equipment.