LinuxConf Europe 2007
Conference and Tutorials
---------------------------------------------------
Sunday 2nd - Wednesday 5th September
University Arms Hotel, Cambridge, England

Matthias Rechenburg - Freelancer / Project Manager of the openQRM Project

Automated system and service monitoring with openQRM and Nagios

The first step in making sure that all systems and services in a data-center are running well is to monitor them. A well-known, proven and widely used monitoring tool is Nagios, which is available for openQRM as an additional plug-in. The second, equally essential, step is the automatic handling of errors, which is what openQRM is known for. Combining the monitoring utility Nagios with the automated error-handling, high-availability and fail-over features of the openQRM data-center management platform creates a powerful and dynamic environment that reduces the down-time of systems and services in a modern data-center to a minimum.

openQRM takes care of high-availability at application and system level, either by initiating failover of an application to a hot-standby or by failing over the full server, including its applications, to available resources using openQRM's rapid-deployment features. The openQRM-server calculates a high-availability pool according to the HA-requirements of the virtual environments. The pool gives an overview of the resources that are available and selected for possible system or service fail-over, and of the high-availability status of the complete data-center.

To limit possible human error during manual failover situations, Nagios monitoring alerts are transferred to the openQRM-server as events. These are then evaluated and handled automatically, e.g. by initiating a system and/or service fail-over to one or more resources available in the high-availability pool. This mechanism directly connects critical services in the data-center, through Nagios, to the error-handling procedure on the openQRM-server.
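The alert-to-event flow described above can be illustrated with a small Python sketch. It is not openQRM code: the `Alert`, `HAPool` and `handle_alert` names, the host names and the three possible actions are hypothetical and chosen only for illustration. The only detail taken from Nagios itself is the plugin return-code convention (0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN).

```python
from dataclasses import dataclass, field

# Nagios plugin return codes: 0 = OK, 1 = WARNING, 2 = CRITICAL, 3 = UNKNOWN
WARNING = 1
CRITICAL = 2


@dataclass
class Alert:
    """A monitoring alert as it might arrive from Nagios (hypothetical shape)."""
    host: str
    service: str
    state: int


@dataclass
class HAPool:
    """Hypothetical pool of idle resources reserved for fail-over."""
    available: list = field(default_factory=list)

    def take(self):
        """Return one standby resource, or None if the pool is empty."""
        return self.available.pop(0) if self.available else None


def handle_alert(alert, pool):
    """Evaluate an alert as an event and decide on an action.

    Returns a tuple (action, failed_host, target_resource).
    """
    if alert.state != CRITICAL:
        # Only critical states trigger fail-over in this sketch.
        return ("ignore", alert.host, None)
    target = pool.take()
    if target is None:
        # No standby resource left: hand the problem to an administrator.
        return ("escalate", alert.host, None)
    return ("failover", alert.host, target)


pool = HAPool(available=["standby-1"])
first = handle_alert(Alert("web-1", "HTTP", CRITICAL), pool)
second = handle_alert(Alert("db-1", "MySQL", CRITICAL), pool)
third = handle_alert(Alert("web-2", "HTTP", WARNING), pool)
```

In this sketch the first critical alert consumes the only standby resource, the second critical alert is escalated because the pool is empty, and the warning is ignored. A real deployment would presumably apply richer policies than this single rule.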

The integration of Nagios within openQRM provides automated setup and configuration of Nagios for the managed servers, and embeds the Nagios web console in the openQRM GUI via a plugin-extension hook. On an additional Nagios-service configuration page, the system-administrator can easily select the services to be checked per system. The Nagios-plugin also features real-time and history reports and graphs for network-traffic, CPU-utilization and memory usage of the virtual environments. Setup of the Nagios-client for the systems managed by openQRM is also fully automated by a resource boot-service, which is another plugin-extension hook provided by the openQRM-server.
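As a rough illustration of what automated per-system service configuration could produce, the Python sketch below renders Nagios `define service` objects from a per-host selection. It does not show how the openQRM plugin actually generates its configuration. The `service_definition` and `render_config` helpers, the host name and the selection data are hypothetical, and the `generic-service` template and the `check_http` and `check_ping` commands are assumed to be defined elsewhere, as they are in the sample configuration shipped with Nagios.

```python
def service_definition(host, description, check_command):
    """Render one Nagios 'define service' object as text."""
    return (
        "define service {\n"
        "    use                  generic-service\n"  # assumed template name
        f"    host_name            {host}\n"
        f"    service_description  {description}\n"
        f"    check_command        {check_command}\n"
        "}\n"
    )


def render_config(selection):
    """Render service objects for every host in the selection.

    `selection` maps a host name to a list of
    (service_description, check_command) pairs.
    """
    blocks = []
    for host in sorted(selection):
        for description, command in selection[host]:
            blocks.append(service_definition(host, description, command))
    return "\n".join(blocks)


# Hypothetical selection, as an administrator might make it per system.
selection = {
    "web-1": [
        ("HTTP", "check_http"),
        ("PING", "check_ping!100.0,20%!500.0,60%"),
    ],
}
config = render_config(selection)
```

The resulting text could be written to a file in a directory that Nagios reads through a `cfg_dir` entry in its main configuration, after which Nagios would need to reload its configuration.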

This presentation shows how to move to a dynamic, flexible, scalable and fully monitored infrastructure, including automated error-handling, in a data-center managed by the combination of openQRM and Nagios. It gives an overview of a typical reference installation and explains in detail how Nagios is integrated within openQRM.

Submitted paper

Paper (PDF) and Paper (tgz).


Gold Sponsor: Intel
Silver Sponsor: Google

Sponsors: Bytemark, Sun, Novell, The Positive Internet Company, Collabora

Media Sponsors: Linux User & Developer, Linux Magazine, The USENIX Association

© Copyright 2007 UKUUG Ltd