2012 Evaluation Team Report: Server/Storage Monitoring
The server and storage monitoring product evaluation group evaluated several monitoring solutions. We focused our evaluation on the following areas:
- Platform Neutrality
- Application-level monitoring functionality
- Alerting / monitoring / remediation capabilities
- Reporting functionality and customizability
- Multi-mastered / no single point of failure
- Multi-location scanning
Two of the products evaluated, Nagios Core and Icinga, were identical in nearly all of the above areas of functionality; consequently, they are grouped together, except where noted separately. Below is a summary of five of the products that our group evaluated.
Our group also evaluated two other products, DSRazor and CPTrax, which were determined to be out of the scope of this evaluation group and will not be included in the group's findings.
Microsoft System Center Operations Manager (SCOM)
- Great for environments with a large Microsoft infrastructure
- Server infrastructure is scalable by simply adding management servers to the Management Group
- Established Management Packs, developed and maintained by the vendors, monitor for application-level problems and provide great out-of-the-box monitoring
- Server components must run on Windows, but can be used to monitor any OS.
- SNMP monitoring capabilities for network equipment.
- Agent based monitors available for Linux and Windows systems alike.
- Can do (albeit more limited) agent-less monitoring.
- Select Licensing costs for the server backend are relatively inexpensive (separate licenses for SQL and 1 management server total under $2000). An additional licensing cost for each monitored system ($6/computer for server OSes; $6/user for client OSes) causes the price to increase with the installed base.
Big Brother
- Agent-based monitoring for Windows; monitoring for Linux systems is limited to SNMP. Consequently, the monitoring capabilities for Linux systems are more of an up/down nature.
- Server components can run on Windows or Linux platform
- Basic monitoring functionality is built in, and the platform is very flexible; with some tweaking more advanced monitoring is feasible.
- Affordable startup cost ($1225)
- Great way to get started with monitoring.
Nagios Core / Icinga
- Open-source, with a community-driven environment
- Free to get started
- The server components must run on a *nix environment, but can monitor any OS.
- Icinga is a little more full-featured out of the box but the overwhelming majority of plug-ins will work on both platforms.
- With some tweaking, can be used to monitor SAN volume disk space
- Nagios: $2400 / year for enterprise version with support, which has a higher feature set
- Icinga: Community driven support; no enterprise level support
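The note above that Nagios and Icinga can monitor SAN volume disk space "with some tweaking" could, for example, take the form of a small check plug-in. The following is a minimal sketch, not a tested production check. It assumes the SAN volume is mounted at a local path on the machine running the check, the 80% and 90% thresholds are illustrative only, and the function names (classify_usage, check_volume, main) are hypothetical. The exit codes follow the standard Nagios plug-in convention (0 OK, 1 WARNING, 2 CRITICAL, 3 UNKNOWN).

```python
import shutil
import sys

# Exit codes defined by the Nagios plug-in convention.
OK, WARNING, CRITICAL, UNKNOWN = 0, 1, 2, 3
LABELS = {OK: "OK", WARNING: "WARNING", CRITICAL: "CRITICAL"}


def classify_usage(used_pct, warn_pct=80.0, crit_pct=90.0):
    """Map a used-space percentage to a plug-in status code.

    The default thresholds are illustrative assumptions.
    """
    if used_pct >= crit_pct:
        return CRITICAL
    if used_pct >= warn_pct:
        return WARNING
    return OK


def check_volume(path, warn_pct=80.0, crit_pct=90.0):
    """Return (status, message) for the volume mounted at path."""
    try:
        usage = shutil.disk_usage(path)
    except OSError as exc:
        return UNKNOWN, f"DISK UNKNOWN - cannot read {path}: {exc}"
    if usage.total == 0:
        return UNKNOWN, f"DISK UNKNOWN - {path} reports zero capacity"
    used_pct = 100.0 * usage.used / usage.total
    status = classify_usage(used_pct, warn_pct, crit_pct)
    return status, f"DISK {LABELS[status]} - {path} is {used_pct:.1f}% full"


def main(argv):
    """Print one status line and return the exit code for the scheduler."""
    mount_point = argv[1] if len(argv) > 1 else "/"
    status, message = check_volume(mount_point)
    print(message)
    return status

# To run this file as a plug-in, end it with:
#     raise SystemExit(main(sys.argv))
```

The monitoring server reads the single line of output and the exit code, so the same script should work unchanged under either Nagios Core or Icinga, in keeping with the plug-in compatibility noted above.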
Dell OpenManage Essentials
- For groups using primarily Dell equipment, OpenManage Essentials should definitely be deployed to monitor the health of your Dell hardware and alert you to hardware failures.
- Can also be used to monitor and push out updates for BIOS and firmware.
- In future releases, it will also incorporate e-mail home functionality, to streamline the replacement of failed components.
- SNMP monitoring of client machines is possible, but beyond that, it is not a one-stop shop for monitoring.
- Consequently, it should be used in conjunction with an OS and application monitoring platform.
- The server components must run on a Windows server (VM or physical machine), but can be used to monitor hardware on servers running any OS, as long as the underlying hardware is Dell.
Storage Monitoring Products
Our group did not meet to evaluate any specific storage monitoring products. However, our general consensus, based on experience, is that storage vendors usually provide a monitoring solution that is either built in or available for purchase. These products seem to meet the needs of the storage infrastructures they ship with.
Our group found that storage solutions from Dell (including EqualLogic and Compellent), Promise's offerings, and the DS series Fibre Channel storage offerings from IBM all had solid built-in monitoring solutions. At a bare minimum, these products will alert you to issues with the arrays; others will send e-mail home alerts to start the tech support process without any admin interaction.
While the same most likely holds true for storage solutions from other major vendors, if you are planning to refresh your storage environment, we recommend you first verify with the technical sales team that the built-in monitoring offerings meet your particular needs.
Less is more in terms of alerting. When configuring alerting thresholds, only send alerts for the most critical events, and rely on the product's monitoring views for less severe issues. Triggering e-mails or pages for less severe issues can and will lead to alert fatigue, which can cause more critical issues to be overlooked.
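As a rough illustration of this advice, the sketch below routes events by severity so that only critical events page an administrator, while everything else stays on the monitoring dashboard. It is not taken from any of the evaluated products; the severity levels, the route_event function, and the host names are all hypothetical.

```python
from enum import IntEnum


class Severity(IntEnum):
    """Hypothetical severity scale; real products define their own."""
    INFO = 0
    WARNING = 1
    CRITICAL = 2


def route_event(severity, page_threshold=Severity.CRITICAL):
    """Return "page" for events at or above the threshold, else "dashboard"."""
    return "page" if severity >= page_threshold else "dashboard"


# Illustrative events: (host, description, severity).
events = [
    ("web01", "disk 82% full", Severity.WARNING),
    ("db01", "database service down", Severity.CRITICAL),
    ("web02", "agent restarted", Severity.INFO),
]

# Only events that cross the paging threshold generate an e-mail or page.
paged = [(host, text) for host, text, sev in events
         if route_event(sev) == "page"]
print(paged)  # [('db01', 'database service down')]
```

Lowering page_threshold to Severity.WARNING would triple nothing here, but in a real environment each step down the scale tends to multiply alert volume, which is the fatigue risk described above.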
Products that use domain-based authentication for agent deployment and communication (e.g., SCOM) may run into obstacles when dealing with multiple domains. This could complicate deployment in multi-tenant environments.
If you are a Dell shop, strongly consider running OpenManage Essentials alongside whichever OS/application stack monitoring solution you select. It provides great monitoring of the health of the hardware, and can help automate firmware and BIOS update deployment. Other manufacturers may have similar products as well; if you run other hardware, contact your technical sales reps to inquire about their offerings.
There is no one turnkey monitoring product that will suit everyone's needs. When selecting a monitoring platform, first consider your environment, and determine what your needs are. This will help you when selecting a product and when architecting the prerequisite server infrastructure.
Because they are both powerful and free to get started with, Icinga and Nagios are attractive options for small to large deployments, although both require a *nix environment to run on, which may prove problematic for Windows-only environments. Both also require a fair amount of configuration expertise to reach their full potential, so they are not simple deploy-and-forget solutions. Out of the box, Icinga is the more powerful and flexible solution, but it has no enterprise-level support; departments desiring official support may wish to consider investing in Nagios' Enterprise support program.
Big Brother and SCOM are options that will be more attractive to Windows-heavy environments as they can both be installed on Windows servers. Since both are paid supported solutions, system administrators deploying either would not be on their own.
SCOM is a good solution if you have only one domain, especially for large departments. The startup cost is fairly low, but because each monitored system carries its own license, the total cost scales with the breadth of the installation base. Consider the size of your deployment when making a decision.
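To make the scaling concrete, the sketch below estimates SCOM licensing from the figures quoted in the product summary: under $2000 for the server backend, plus $6 per monitored server OS and $6 per client OS user. The function name is hypothetical, the $2000 backend figure is treated as an upper bound, the deployment sizes are invented for illustration, and the report does not state whether the per-system fees are one-time or recurring.

```python
def scom_license_estimate(server_count, client_user_count,
                          backend_cost=2000, per_server=6, per_client_user=6):
    """Rough upper-bound estimate in dollars, using the report's 2012 figures.

    backend_cost covers the SQL and single management server licenses
    (quoted as "under $2000", so this is a ceiling, not an exact price).
    """
    return (backend_cost
            + per_server * server_count
            + per_client_user * client_user_count)


# Hypothetical deployment sizes: (monitored servers, client OS users).
for servers, users in [(50, 200), (500, 2000)]:
    cost = scom_license_estimate(servers, users)
    print(f"{servers} servers, {users} users: ${cost:,}")
```

With these illustrative numbers, the backend dominates the small deployment (2000 of 3500), while per-system licenses dominate the large one (15000 of 17000).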
Big Brother is another strong paid option, with a robust client and an intuitive interface. However, its Linux monitoring is more limited, so it is better suited to a Windows-heavy environment.
Among the products we investigated, Dell OpenManage Essentials is in something of a class of its own: unlike the other solutions, it is oriented more toward hardware monitoring than service and OS monitoring. For large Dell install bases, it is an excellent solution to use alongside another platform. Although it is early in its development, OpenManage Essentials also shows great promise for future releases, including self-remediation functions such as e-mail home, in which Dell is automatically notified of hardware problems and replacement parts can be dispatched without any interaction from the administrators.