
UKUUG LISA/Winter Conference

High-Availability and Reliability

25-26 February 2004

Bournemouth



All Speakers and their Abstracts



Cluster filesystems

Steve Whitehouse

ChyGwyn Ltd



Experiences with the Sobig worms
and how we combatted them (and other spam)

Niall Mansfield

UIT Cambridge Ltd


In August 2003 our incoming mail increased from a few hundred messages per day (a few megabytes) to 10,000 messages per day (2 gigabytes). The increase was due to the Sobig.F worm.

This traffic overloaded just about everything -- our network, our servers, our disk space, and our network admins who had to handle all the junk.

To overcome the problem we:

- added new hardware and upgraded the network
- installed the Exim mail server software
- used procmail and SpamAssassin to filter spam (see the sketch after this list)
- used the Exiscan add-on for Exim, to scan messages for viruses and other rubbish
- installed SMTP virus-scanning software to work in conjunction with Exiscan
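
As a flavour of the filtering involved (the path and spam folder name here are illustrative, not from the paper), a minimal procmail recipe runs each message through SpamAssassin and files anything it tags:

    # Pass each message through spamassassin, which adds X-Spam-* headers.
    :0fw
    | /usr/bin/spamassassin

    # Deliver tagged messages to a spam folder instead of the inbox.
    :0:
    * ^X-Spam-Status: Yes
    caught-spam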

This paper describes how we went about this, the problems we encountered, and how some of the old Unix tools (like "mailx") proved surprisingly useful. We also give some statistics about the behaviour of the Sobig.F worm. (By the time the Conference comes along, we expect to have more experience with, and information on, new variants of the Sobig worm.)


Getting the best from your server with User-Mode Linux

Matthew Bloch

Bytemark Hosting


Serious software virtualization has only recently become available to the free software community, despite being one of the most valuable techniques for server consolidation and for developing kernel code. Until very recently it was too demanding a job for free software authors to tackle practically for a modern OS, and where it has been done well, the results are normally guarded by expensive and restrictive licences.

User-Mode Linux has broken this rule and permits a network administrator to run 20 to 30 independent Linux kernels on a £1000 commodity server, making much better use of expensive space in a data centre for medium-demand applications. It has also allowed kernel developers to write kernel code which can be quickly and frequently tested as an unprivileged user, and is allowing students and tinkerers to set up virtual networks to try out techniques that would otherwise require an unwieldy quantity of hardware.

Back in August 2002, Bytemark owned a single cheap server in Telehouse serving its clients' needs, and system administration was shared haphazardly between three parties. We had recently installed a second, more powerful server and wanted to save on rack rental by keeping only one piece of hardware running in the face of failures from the old server. Within two days we had copied our old drive image to the new hardware, and put it into production use again running under a User-Mode Linux kernel. What's more, we made three copies of the old server so that everyone could have root access to their own independent version of it. Eventually our clients' applications were also running inside a virtual Linux kernel, still without taxing the hardware! The experience impressed us so much that we decided User-Mode Linux would be the future of our business, and we developed our hosting product around it.

This talk will cover the new challenges a Linux system administrator faces in maintaining hundreds of independent UML nodes: setting up IPv4 and IPv6 routing, boilerplate filesystems, techniques for quick setup, movement of virtual machines between physical hosts, external monitoring tricks, and (un)suitable hardware choices.
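
To give a feel for this (file names and addresses below are illustrative), each UML instance is booted as an ordinary process from the host's command line:

    # Boot one UML node as an unprivileged process: a copy-on-write
    # overlay over a shared root filesystem image, 64MB of RAM, and a
    # tuntap-backed virtual NIC that the host routes for.
    ./linux umid=node1 ubd0=node1.cow,root_fs mem=64M \
            eth0=tuntap,,,10.0.0.254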


Hardware for high availability

Stephen Mayo

Pre-sales consulting group, Hewlett-Packard


We review how hardware has evolved to provide greatly improved resilience to component failure. Servers, storage and cluster interconnect technologies are discussed.

Today's servers use a combination of duplicated components, parity checks, hot-pluggable components, RAID disks, NIC teaming, etc., to provide high availability (HA) at the hardware level. Bus technologies such as PCI-X and USB have built-in HA support with features such as parity checking and hot-pluggability, which allows hot-plug removal, replacement and addition of expansion cards and peripherals, subject to operating system support. The benefits of emerging bus technologies such as InfiniBand will be summarised. The latest hot-pluggable RAID-protected memory technology is also described.

Equally important is the ability to detect, report and even prevent hardware failures via system management tools which are now routinely integrated into component firmware.

As well as dramatic increases in the speed and capacity of hard drives, storage has evolved from single and mirrored disks through simple SCSI and FC-AL RAID arrays to today's high-end switched-fabric Storage Area Networks (SANs), which support zoning, multi-hosting, multi-pathing, storage virtualisation, and SAN-based virtual copy and data replication.

Clusters are widely deployed to increase both availability and performance. Shared-nothing architectures still dominate both the high-end 'big iron' market and commodity server clustering, but shared-disk architectures, as originally developed by DEC (now HP), have recently seen a resurgence in products such as Oracle's 9iRAC cluster. Compute clusters consisting of large numbers of small UNIX or Linux servers are increasingly replacing large 'supercomputers'. Cluster interconnects range from low-cost industry-standard Ethernet to purpose-designed proprietary interconnects such as Memory Channel and HyperFabric. The demands on the cluster interconnect vary for different types of clusters and applications. Key differences between, and benefits of, some common cluster interconnects are described.


The Evolution of the Linux-HA Project

Alan Robertson

IBM Linux Technology Centre

The Linux-HA project is used in thousands of sites across the world. It provides software-based high-availability clustering services which are used to improve the availability of many kinds of services.

Although the project is named Linux-HA, it is quite portable, and the software runs on other OSes including FreeBSD and Solaris.

This paper will present an overview of the current capabilities of the Linux-HA software, the near-term new capabilities anticipated for 2004, and a longer-term vision for a comprehensive framework for high-availability services for the future.
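
As a flavour of the software's classic two-node configuration (node, address and resource names here are illustrative, not from the paper), a minimal /etc/ha.d/haresources file reads:

    # node1 normally owns this resource group: a floating service IP
    # plus the web server init script. heartbeat migrates the whole
    # group to the surviving node when node1 fails.
    node1 IPaddr::10.0.0.50/24 apache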


IPv6 Deployment Status

Tim Chown

University of Southampton

IPv6 adoption has to date been slow in Europe. While the research networks have paved the way for validating the technology, commercial deployment has so far been relatively insignificant. However, a number of recent developments, in particular the US DoD's announcement that it requires all procurements to be IPv6-capable, are likely to accelerate uptake. In this overview talk we describe UK/European IPv6 deployment status and the support for IPv6 in Unix systems, and briefly discuss the drivers and barriers to further deployment.


javaGMA: A lightweight implementation of the Grid Monitoring Architecture

Mark Baker & Matthew Grove

Distributed Systems Group, University of Portsmouth


Wide-area distributed systems require scalable mechanisms that can be used to gather and distribute system information to a variety of endpoints. The emerging Grid infrastructure is rapidly being taken up for technical computing as well as in business and commerce. The Distributed Systems Group at the University of Portsmouth has, for the last few years, been developing a resource monitoring system for the Grid that can gather data from endpoints, then filter and fuse this data for subsequent use by a variety of clients. The monitoring system, known as GridRM [1], needs to distribute information over the wide area between so-called GridRM gateways. The software for distributing this information around GridRM needs to be lightweight, modular, fast, and efficient. There are a number of ways to do this; we have, however, decided to use the Grid Monitoring Architecture (GMA) [2], which is the mechanism recommended by the Global Grid Forum (GGF) [3]. The GMA specification sets out the requirements and constraints of any implementation. The GMA is based on a simple consumer/producer architecture with an integrated system registry. There are several implementations of the GMA, including the R-GMA from the European DataGrid project [4], and PyGMA from LBNL [5].

We have investigated the existing and emerging GMA implementations with reference to the needs of GridRM and found that none currently fulfils our requirements. With this in mind we have developed javaGMA [6], a lightweight, GMA-compliant reference implementation in Java, with a simple API that is easy to use, deploy and extend.

This paper describes javaGMA, a Java-based GMA reference implementation. The first part of the paper outlines our motivation, and the particular features we need in order to fulfil the requirements of GridRM and be GMA-compliant. We then briefly outline the existing GMA implementations with reference to their features and functionality. The second part of the paper describes javaGMA, its constituent components, and their features. We then present the results of our performance comparisons of javaGMA against R-GMA and PyGMA; our tests were carried out over the wide area between multiple Linux clusters. We review our findings, and discuss the potential advantages and disadvantages of javaGMA. In the final part of the paper we draw a number of conclusions about javaGMA and outline our future work.
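
As a rough, hypothetical sketch of the producer/consumer pattern that the GMA prescribes (these interface names are ours for illustration and are not javaGMA's actual API):

    import java.util.List;

    // Producers advertise themselves in a registry; consumers look
    // them up by metric name and then receive events directly.
    interface Registry {
        void register(String producerUrl, String metric);
        List<String> lookup(String metric);
    }

    interface Producer {
        void publish(String metric, Object value);  // push an event
    }

    interface Consumer {
        void receive(String metric, Object value);  // per-event callback
    }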

[1] GridRM, http://www.gridrm.org/
[2] GMA, http://www-didc.lbl.gov/GGF-PERF/GMA-WG/
[3] Global Grid Forum, http://www.ggf.org/
[4] R-GMA, http://www.r-gma.org/
[5] PyGMA, http://www-didc.lbl.gov/pyGMA/
[6] JavaGMA, http://dsg.port.ac.uk/projects/javaGMA/


A new Cluster Resource Manager for Heartbeat

Lars Marowsky-Bree

SUSE Labs, SUSE LINUX AG


The Linux-HA project has made a lot of progress since 1999. Its main application, heartbeat, is probably one of the most widely deployed two-node failover solutions on Linux, and has proven to be very robust. Linux-HA not only has a large user base, but its modular nature has also attracted many developers. However, a lot of work remains to be done and some is on-going (though patches are always accepted).

This talk outlines the key features and design of a cluster resource manager that runs on top of, and enhances, the Open Clustering Framework infrastructure provided by heartbeat.

The goal is to allow flexible resource allocation and globally ordered recovery actions in a cluster of N nodes, with dynamic reallocation of resources in case of failures (fail-over) or in response to administrative changes to the cluster (switch-over).

It will provide insight for potential contributors looking for a great project to work on.


MailScanner

Julian Field

University of Southampton


E-mail viruses cost businesses millions of pounds every year. Spam accounts for around 60% of all e-mail traffic, wasting large quantities of network bandwidth and resources.

There are many commercial e-mail systems available, at high cost, that claim to help stop spam, but they are rarely effective in real-world environments. They support a very restricted set of virus scanners, usually one or two, forcing the use of particular products in what should be a separate purchasing decision.

MailScanner is a highly-respected open source e-mail security system, with more users than AOL and Hotmail combined. It is used at over 30,000 sites and processes over 750 million messages per day. It supports the use of any combination of 17 different commercial anti-virus engines for reliability and good coverage. Its anti-spam system incorporates SpamAssassin, which without doubt is the best anti-spam engine available at any price.
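
As a sketch of how this looks in practice (these option names are from memory of contemporary MailScanner versions and should be checked against the local installation), chaining scanners and enabling spam checking is a matter of configuration:

    # Excerpt from MailScanner.conf: run two virus scanners in
    # combination and enable SpamAssassin-based spam checking.
    Virus Scanning = yes
    Virus Scanners = sophos clamav
    Use SpamAssassin = yes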


MySQL High Availability Features

David Axmark

MySQL


MySQL is different from other FOSS (Free and Open Source) projects in that it was started by a commercial company that developed an open-source product from day one. Most other Free Software/Open Source databases were either introduced by 'normal' proprietary commercial companies at the end of their commercial life, or were the result of university research.

The main benefits of using MySQL are Speed, Robustness and Practicality/Usability.

The talk will start with a history of MySQL and then continue with an overview of the current MySQL functionality (versions 4.0, 4.1, 5.0 & 5.1).

MySQL has with these new releases started to add all the "standard" functionality (subqueries, stored procedures, triggers, Unicode, etc). We have also added some non-standard features like full-text search and geographical data, with more in the pipeline. MySQL is already used in mission-critical applications at companies like Yahoo!, Cisco and Google.
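
For context (host, user and log coordinates below are illustrative), the stock high-availability mechanism in these versions is MySQL's built-in master/slave replication; configuring a slave looks roughly like this:

    -- On the slave: point at the master's binary log, then start
    -- the replication threads.
    CHANGE MASTER TO
        MASTER_HOST='master.example.com',
        MASTER_USER='repl',
        MASTER_PASSWORD='secret',
        MASTER_LOG_FILE='master-bin.001',
        MASTER_LOG_POS=4;
    START SLAVE;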

MySQL AB uses a dual-licensing scheme whereby the same source code is available under both the GPL and a non-GPL commercial licence. This unique way of mixing a normal software business with Free Software will also be covered in the talk.

With over 35,000 server downloads per day through MySQL.com alone, and an estimated 4 million installations, the MySQL Database System (TM) is one of the world's most used SQL databases.


Open Source Capitalism: Innovative Business Models for an Innovative Development Methodology

Matt Asay

Director, Linux Business Office & Open Source Review Board, Novell


While the software industry struggles to figure out ways to squeeze old wine (intellectual property) into a new bottle (open source development), the patterns for monetizing this new breed of software are all around us. Outside of IT.

This presentation takes the software industry to task, arguing that its problem is not how to make intellectual property relevant to a changing software landscape, but rather how to innovate payment models to keep pace with technology. Several alternative models will be discussed.


NetRAID

Peter Breuer

Universidad Carlos III de Madrid


This talk will detail necessary adaptations of RAID for the context of networked storage resources, as prototyped in and for the Linux kernel. In that context, "network block devices" substitute for the usual hard disks in the RAID array, and special mechanisms for reporting, recording and recovery are required which act together to lower resynchronisation transfer totals to levels the network can bear, lower latency, and provide intelligent control that recognises and compensates for temporary network brown-outs.
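
As a point of reference using stock tools (device names are illustrative and vary by kernel; this is the standard kernel NBD plus the md driver, not the NetRAID prototype itself), a mirror spanning the network can already be assembled from one local disk and one network block device:

    # Attach a remote export as a local block device, then mirror a
    # local partition against it with md RAID-1.
    nbd-client storage-server 2000 /dev/nbd0
    mdadm --create /dev/md0 --level=1 --raid-devices=2 /dev/sda1 /dev/nbd0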

Standard failover techniques are easily adapted to make best use of NetRAID mirrors. Combined with journalling over NetRAID, zero takeover delays can be achieved at the file system level, albeit at the cost of increased write latencies in normal operation. Asynchronous write-behind in NetRAID offsets such delays.

In this talk I will discuss the engineering considerations in choosing between different possible configurations of RAID and failover strategies.


Preparing Linux for the Enterprise

Richard Moore

IBM Linux Technology Centre


A lack of readiness for the Enterprise, with respect to availability and serviceability, is often cited against Linux. This talk discusses the state of readiness of Linux for prime-time use, and in so doing roundly refutes the accusation. The speaker will also discuss some of the leading-edge work being done in the IT industry for Linux availability, such as the Common Diagnostic Model industry standard and the On-line Diagnostics Framework which IBM is working on. The high-end systems of the future will be those that are not only highly available but also operate on principles of self-management, such as those defined by IBM's Autonomic Computing architecture. The speaker will conclude with an assessment of how well Linux is doing, and will do, in the Autonomic Computing arena.


Scaling up Cambridge University's email service

Tony Finch

University of Cambridge


One of the central services provided by the University of Cambridge Computing Service is the email system, Hermes. Since its introduction in 1993 it has been based on UW IMAP and a shared Unix filestore. This architecture has a number of inherent limitations in performance, scalability, resilience, and cost-effectiveness. In order to address these problems, and particularly users' desire for larger message sizes and disk quotas, the system is being replaced with a mostly-compatible reimplementation based on multiple CMU Cyrus message stores.

This paper explains the effects of the limitations of the old system, including the restrictions imposed on our users, and the implementation of our webmail software, Prayer. I describe how Cyrus has been modified to make the transition to the new system easier, and to address the anticipated reliability and administration problems caused by using more inexpensive hardware. Finally, after covering the changing role of our central email relay in the new architecture, I discuss some possible development projects for the future.

Contents:
- Old Hermes: software and hardware architecture; limitations
- Prayer Webmail: architecture; performance
- New Hermes: architecture; compatibility; data replication
- Central Email Switch: functions old & new
- Future Directions

Storage Replication with DRBD

Philipp Reisner

LINBIT Information Technologies GmbH


High availability is a hot topic for service companies which offer their services via electronic networks. As Linux is in a position to gain market share in newly built systems, there is a lot of interest in HA solutions for Linux-based systems.

Quite a number of conventional HA solutions are being brought to Linux these days, and most of them are based on a shared storage device.

DRBD, however, is a more Linux-like approach to the field. As Linux brought PC hardware into the server market, DRBD brings HA clustering to PC hardware. High availability in storage is realised by replicating the storage to a second machine via off-the-shelf networking equipment.

Since prices for Gigabit Ethernet hardware have declined, and since 10 Gigabit Ethernet has been standardised, currently available off-the-shelf hardware offers the required performance.
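
For orientation (hostnames, devices and addresses are illustrative), a replicated resource is declared identically on both nodes in a DRBD 0.7-style drbd.conf:

    # Replicate a local partition to a peer over TCP. Protocol C is
    # synchronous: a write completes only once the peer also has it.
    resource r0 {
      protocol C;
      on node1 {
        device    /dev/drbd0;
        disk      /dev/sda7;
        address   10.0.0.1:7788;
        meta-disk internal;
      }
      on node2 {
        device    /dev/drbd0;
        disk      /dev/sda7;
        address   10.0.0.2:7788;
        meta-disk internal;
      }
    }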

Basic problems and algorithms

In this abstract only the areas of interest are mentioned; the complete algorithms are described in the final paper:

- In order to achieve shared-disk-like semantics over a networking protocol, DRBD has to guarantee not to violate write-ordering constraints imposed by upper software layers such as journalling file systems.

- In the event of a cluster restart, the nodes have to find the node with the up-to-date data (or establish that all nodes are up to date).

- Background resynchronisation in the case of a node joining a degraded cluster.

New requirements in DRBD 0.7

- Quicker resynchronisation: if no bitmap of out-of-date blocks is available, use hash values of blocks to speed up resynchronisation.

- In embedded applications and unmonitored installations, the remaining node must be able to continue to offer the service if its peer is permanently damaged.

- Decoupling of resynchronisation from the node's role assignment. Real-world use shows that this is often required, and it eases the interaction between DRBD and the cluster manager. This of course requires that DRBD offers a consistent view of the data even while the node's backing storage is the target of a resynchronisation process.

- Shared-disk mode for use with OpenGFS. Although the block device will support active/active configurations, it remains to be seen whether it is usable without a common view of the cluster membership.

Conclusion

DRBD 0.6 proved its applicability for a number of purposes, mainly deployments in data-centers. Its replication performance was sufficient, but resynchronization was too slow. DRBD 0.7 will be applicable in embedded environments as well and will increase the possible storage size by speeding up resynchronization.

