The free Unix world has traditionally treated userspace environments as tightly tied to the kernel. The vast majority of Linux-based distributions use a GNU userland, while BSD environments (unsurprisingly) tend to have BSD-derived userlands. This causes problems when a user wishes to switch kernels: rather than having to learn a small set of kernel-specific features, the user must instead come to terms with what is potentially an entirely different userland. This paper investigates the practicality of porting the Debian userland environment to multiple kernels, allowing for easy transitions between them.
The number of ABIs available to Linux users has been increasing steadily.
The current Filesystem Hierarchy Standard (FHS) makes supporting several ABIs on one system difficult. All libraries must be placed in separate directories depending on their ABI, but the naming scheme of these directories is not well-defined. /lib is defined as the directory containing the libraries required to boot, with the FHS grudgingly allowing this to be a symlink to another directory. Each ABI's library directory must be of the form /lib(suffix), with the example cases being /lib32 and /lib64 for architectures that have both a 32-bit and a 64-bit ABI.
This presents two significant problems. Firstly, the suggested naming scheme is so coarse-grained that seamless multi-arch support is impractical. In the above example, the best-case scenario would be that both PPC32 and x86 would expect their libraries to be in /lib32. This would, unsurprisingly, fail. To some extent this can be worked around by using different run-time dynamic linkers with different library search paths, but this will still fail for binaries built with the -rpath argument, since their library paths are fixed at link time. The second problem is that a sufficiently fine-grained namespace would lead to a proliferation of long directory names in the root directory.
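The first problem can be made concrete with a minimal sketch. Only the /lib(suffix) form comes from the FHS description above; the function names `suffix_libdir` and `find_collisions` and the ABI table are hypothetical, chosen purely for illustration.

```python
from collections import defaultdict

# Hypothetical table of ABIs and their word sizes (illustrative only).
ABI_BITS = {
    "x86": 32,
    "ppc32": 32,
    "amd64": 64,
    "ppc64": 64,
}


def suffix_libdir(abi):
    """Return the /lib(suffix) directory for an ABI, keyed on word size alone."""
    return "/lib%d" % ABI_BITS[abi]


def find_collisions(abis):
    """Group ABIs by library directory; keep directories claimed more than once."""
    by_dir = defaultdict(list)
    for abi in abis:
        by_dir[suffix_libdir(abi)].append(abi)
    return {d: sorted(names) for d, names in by_dir.items() if len(names) > 1}


if __name__ == "__main__":
    # Report every directory that more than one ABI would have to share.
    for directory, names in sorted(find_collisions(ABI_BITS).items()):
        print(directory, "is claimed by", ", ".join(names))
```

Because the directory name encodes nothing but the word size, any two ABIs of the same width land in the same place, regardless of instruction set or kernel.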
The basic concept of the Debian multiarch proposal is to provide a fine-grained library namespace without polluting the directory structure, while also allowing for integration with the package management system. A user should be able to install any binary capable of running on the system without having to be aware of the multiple ABIs involved.
To this end, libraries would instead be installed in a subdirectory of /lib defined by their ABI. Binary packages would depend on the appropriate library packages. Any binary package with an ABI supported by the OS would be installable, bringing with it the dependent libraries. Since the subdirectory names would be standardised, this would avoid any difficulties with -rpath.
On an AMD64 NetBSD system, perl would link against libraries in /lib/amd64-netbsd. A 64-bit version of Intel's C compiler would link against libraries in /lib/amd64-linux. Legacy NetBSD applications would link against libraries in /lib/i386-netbsd, and legacy Linux applications (such as Unreal Tournament) would link against libraries in /lib/i386-linux.
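The naming convention used in these examples can be sketched in a few lines. The function name `multiarch_libdir` is an assumption made for illustration, and the sketch assumes the subdirectory is simply the architecture and kernel names joined by a hyphen, as the four paths above suggest.

```python
def multiarch_libdir(arch, kernel):
    """Return the per-ABI library directory, e.g. /lib/amd64-netbsd."""
    return "/lib/%s-%s" % (arch, kernel)


# (architecture, kernel) pairs taken from the examples in the text.
EXAMPLES = {
    "perl": ("amd64", "netbsd"),
    "64-bit Intel C compiler": ("amd64", "linux"),
    "legacy NetBSD application": ("i386", "netbsd"),
    "legacy Linux application": ("i386", "linux"),
}


if __name__ == "__main__":
    # Print the library directory each example binary would link against.
    for name, (arch, kernel) in EXAMPLES.items():
        print("%-28s %s" % (name, multiarch_libdir(arch, kernel)))
```

Since both the architecture and the kernel appear in the name, each of the four ABIs receives its own directory, and all of them live beneath /lib rather than in the root directory.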
However, no matter how good the emulation, this is unlikely to be the preferred mechanism for performance-critical applications. Linux binaries running on NetBSD 2.0 would be unable to take advantage of the new kernel threading support, for example. It will be necessary to determine what level of porting effort is appropriate to maximise the functionality payoff.
Regrettably, Debian is not designed with the aim of being easy to port. As a consequence, cyclical build dependencies are not uncommon. The net result is that much of the early porting work must be done by hand. An assumption will be made here that a GNU toolchain already exists for the platform concerned.
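To illustrate why cyclical build dependencies force manual work, the following sketch finds one cycle in a build-dependency graph using a depth-first search. It is not a description of any Debian tool; the function name `find_cycle` and the package names are hypothetical.

```python
def find_cycle(build_deps):
    """Return one build-dependency cycle as a list of names, or None.

    build_deps maps each package to the packages needed to build it.
    """
    UNSEEN, IN_PROGRESS, DONE = 0, 1, 2
    state = {}
    stack = []  # packages on the current depth-first path

    def visit(pkg):
        state[pkg] = IN_PROGRESS
        stack.append(pkg)
        for dep in build_deps.get(pkg, ()):
            dep_state = state.get(dep, UNSEEN)
            if dep_state == IN_PROGRESS:
                # dep is already on the current path: a cycle is closed.
                return stack[stack.index(dep):] + [dep]
            if dep_state == UNSEEN:
                cycle = visit(dep)
                if cycle:
                    return cycle
        stack.pop()
        state[pkg] = DONE
        return None

    for pkg in build_deps:
        if state.get(pkg, UNSEEN) == UNSEEN:
            cycle = visit(pkg)
            if cycle:
                return cycle
    return None


if __name__ == "__main__":
    # Hypothetical packages: each needs the next one in order to build.
    deps = {
        "pkg-a": ["pkg-b"],
        "pkg-b": ["pkg-c"],
        "pkg-c": ["pkg-a"],
        "pkg-d": [],
    }
    print(find_cycle(deps))
```

None of the packages in a reported cycle can be built from source until one of them is supplied by other means, which is the point at which a porter has to step in by hand. The sketch reports only the first cycle it encounters.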
The vast majority of packages will build without modification. However, the packages that are problematic often turn out to be awkward infrastructural packages, and their absence will hold up the building of a large number of other packages. Much manual intervention will be necessary at this stage. Gradually the required patches will be merged into the standard Debian source, allowing autobuilding to take place without manual intervention. Modifications that increase portability to one platform tend to help porting to others, as trouble spots are highlighted.
However, there are generally still one or two underlying awkwardnesses. For example, the BSDs have an approach to password databases that is markedly different from that of Linux. Passwords and user information are stored in a binary database, and the traditional plain text files are generated from this. This could be dealt with in two ways. Firstly, all applications that attempt to manipulate the password database could be ported to the BSD mechanism. Since the vast majority of password manipulation and checking is now done via PAM, this would require relatively little modification. Alternatively, an implementation of Linux-style password management could be written and used. This would merely require that all relevant applications be modified to link against the new implementation, or the implementation could be merged into the C library.
An especially tricky area is that of trademarks and branding. There is generally a desire that a derived product not be confused with the "real thing", and trademark law potentially allows this to become a legal issue if not resolved to the satisfaction of all concerned. Finding an appropriate name that clearly describes the derivation of the port without conflicting with upstream's desire to avoid brand confusion is a job that should not be underestimated.
Porting Debian to further kernels allows for more widespread use, both on hardware not well supported by the existing ports and in areas where a specific kernel is deemed desirable. The work involved is difficult but not insurmountable, and the potential for a larger userbase is likely to make this worthwhile.
This document was generated on 21 July 2004 using texi2html 1.56k.