
Using Test Suites to Validate the Linux Kernel

Test suites provide an easy-to-use way to give wide test coverage to new Linux kernels. You can use tests written for individual software packages, or tests written for Linux itself.

Michael D. Crawford
crawford@goingware.com

Copyright © 2000 Michael D. Crawford.

Creative Commons License
This work is licensed under a Creative Commons Attribution-ShareAlike 2.5 License.

What People are Saying about This Article

God sent you right? :) Been looking for something along this nature.

-- David D.W. Downey on the linux-kernel mailing list

Revision History

27-Jul-2003

Relicensed from the GNU Free Documentation License to the Creative Commons Attribution-Sharealike 2.5 License so that it will be compliant with the Debian Free Software Guidelines.

21-Jan-2002

Checked and updated hyperlinks (many pages had moved), added Charles Cazabon's memtester and used the tasteful green stylesheet.

22-Feb-2001

Fixed spelling, grammar and HTML errors, added PostgreSQL, VA Linux Cerberus, Mauve, cpuburn and Lucifer. Many thanks to those who wrote in with comments, corrections and test suite suggestions.

Introduction

The simplest way to test a new Linux kernel is to just install the kernel, reboot your machine, and try out the software you normally run on it.

This is an important test, because it most quickly tests the things that matter most to you, and you're most likely to notice things that are out of the ordinary from your normal way of working.

But it doesn't give very widespread test coverage; each of us tends to use the GNU/Linux system only for a very limited range of the available functions it offers.

One might think a better test would be to download lots of different software packages and just try them out, but this can be a lot of work for limited results. We have to spend a lot of time learning to build and use each software package, and if we're not experts in them we might not notice when small things go wrong.

Another alternative is to run test suites. These are software packages written for the express purpose of testing, and they are written to cover a wide range of functions and often to expose things that are likely to go wrong.

These test suites are useful for more than just testing the kernel. They can help test commonly used system libraries while they are under development. If you're working on building a new version of a Linux distribution, test suites can also help you determine which versions of the Linux kernel and system libraries will work well in combination to provide reliable performance.

There are two main kinds of test suites to consider:

What I'll call "application test suites" were not written originally to test the system itself, but to test the correct functioning of the program they come with. Examples of these are the test programs distributed with some programming language implementations: one runs a lot of programs written in that programming language, and if there is an error result then we know something is wrong.

We also have what I'll call "system test suites". These are test tools written explicitly to test Linux itself.

There are advantages to using each. The system test suites likely cover the overall functioning of the system in more breadth and probably check the correct functioning of each individual system call more rigorously. The application test suites will test a narrower portion of the system in more depth, likely by doing more complicated things with it.

When tests can be run unattended, you might try running several of them simultaneously to test the functioning of a heavily loaded system. You should check that each test passes when run by itself first, though.
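The "alone first, then together" order can be sketched in a few lines of Python. This is only an illustration: the function names are made up for this example, and it assumes each unattended test can be started as a command that exits with status 0 when it passes.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_one(command):
    """Run one test command; return True if it exited with status 0."""
    result = subprocess.run(command, stdout=subprocess.DEVNULL,
                            stderr=subprocess.DEVNULL)
    return result.returncode == 0


def run_alone_then_together(commands):
    """Run each command by itself, then run the passing ones concurrently.

    Returns (alone, together), two lists of booleans. `together` covers
    only the commands that passed alone, in their original order.
    """
    alone = [run_one(cmd) for cmd in commands]
    survivors = [cmd for cmd, ok in zip(commands, alone) if ok]
    if not survivors:
        return alone, []
    # One worker thread per command, so all of them load the system at once.
    with ThreadPoolExecutor(max_workers=len(survivors)) as pool:
        together = list(pool.map(run_one, survivors))
    return alone, together
```

A test that passes alone but fails in the second list is the interesting case: it only misbehaves when the system is loaded.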

I am actively seeking new test suites that can be included in this list. If you know of any, please mail me (Michael Crawford) at crawford@goingware.com and let me know where I can find them.

Should All the Tests Pass?

Maybe not.

It's probably quickest to build and run all the tests just under the kernels that you're testing, but if you get results that you don't expect you should determine if this is really a new bug. This is one form of what's called "regression testing" or "bug regression".

It may be that the test was written in such a way that it would pass if some feature of the software under test were correctly implemented, but that has not been done yet. Or it may be that the test fails due to some interaction with the system libraries on your Linux distribution, rather than with the kernel.

So at the very least, if you get an unexpected failure, build and run the tests while running under a stable kernel on what should otherwise be the same setup as the one you're trying to test. If the test still fails there, you should probably take it up first with people who know about the software package in question - its mailing list would be a good place to ask, or you could ask the test suite maintainers.

Report the bug to the Linux Kernel Mailing List only if the test fails under one kernel version but not the other, or if the suite maintainers tell you that the test should always pass under the stable kernel.

If the test fails under both kernels you try it with, but the suite maintainers tell you it is expected to pass, please also report it to the technical support or developers for the Linux distribution you use. They will be better prepared to determine if the problem is with a system library such as glibc. It could be that the library in general works OK, but the build you've got on your machine is buggy.
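The reporting advice in this section amounts to a small decision procedure. Here is one way to write it down in Python. The function name and the wording of the returned strings are mine, and it assumes you start from a failure seen under the kernel you are testing.

```python
def where_to_report(fails_on_stable, maintainers_expect_pass=None):
    """Suggest where to take a failure seen under the kernel being tested.

    fails_on_stable: True if the same test also fails under a stable kernel
    on an otherwise identical setup.
    maintainers_expect_pass: None until the suite maintainers have answered,
    then True or False.
    """
    if not fails_on_stable:
        # Fails under one kernel but not the other.
        return "linux-kernel mailing list"
    if maintainers_expect_pass is None:
        # Fails under both kernels: ask the people who know the package first.
        return "package mailing list or test suite maintainers"
    if maintainers_expect_pass:
        # Fails under both although it should pass: a library may be at fault.
        return "linux-kernel mailing list and your distribution's developers"
    return "expected failure, nothing to report"
```

For example, where_to_report(fails_on_stable=True) sends you to the package's own list before anyone else hears about it.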

Test the Basic Health of Your System

These tests are not so much meant to test Linux itself but to test that your hardware is basically working correctly. It's probably a good idea to start with memtest86 if you're running an x86 machine:

memtest86

Memtest86 is a rigorous, low-level memory tester. It doesn't run under Linux. Rather, it is a boot sector and kernel that can be installed under LILO, GRUB or on a floppy. You boot off the memory test and it runs as its own little operating system.

Memtest86 spends much more time testing your memory than the startup BIOS tests do, and it uses a number of different techniques to stimulate and detect memory errors. These tests would take too much time if they were run each time you started your machine. In fact, it found a problem with the memory on my PC that is repeatable with one of its tests but is not detected by the machine's own BIOS test and does not appear to cause trouble in practice.

This test is pretty hardwired to run on x86 PCs with an industry standard BIOS, so it wouldn't run as written on other platforms. I think the tests could be easily ported to a new platform, though, if one were to write new code for discovering such things as the size and location of installed physical RAM as well as for handling screen display.

Charles Cazabon's memtester - download here

Memtester serves a similar purpose to memtest86, except that it runs as a user-mode process under most Unix-like operating systems. It attempts to allocate and lock into physical RAM as many pages of virtual memory as it can, then tests the memory it has locked. (Because it uses a privileged system call to do the locking, it must run as root.)
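To give a feel for what a user-mode memory test does, here is a toy fill-and-verify loop in Python. It is not memtester's algorithm. The function name and the choice of patterns are assumptions made for this illustration, and the buffer is not locked into physical RAM, so it says nothing reliable about your hardware.

```python
def check_buffer(size_bytes, patterns=(0x00, 0xFF, 0x55, 0xAA)):
    """Fill a buffer with each byte pattern, read it back, report mismatches.

    Returns a list of (offset, pattern_written, value_read) tuples; an empty
    list means every byte read back as written.
    """
    buf = bytearray(size_bytes)
    bad = []
    for pattern in patterns:
        # Write the pattern to every byte of the buffer.
        buf[:] = bytes([pattern]) * size_bytes
        # Read every byte back and compare it with what was written.
        for offset, value in enumerate(buf):
            if value != pattern:
                bad.append((offset, pattern, value))
    return bad
```

Calling check_buffer(1024 * 1024) exercises one megabyte. A real tester also has to pin the pages in RAM, which is the step that needs root.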

I understand that on Linux a single process can lock no more than half the physical RAM, so that's as much as you can test with a single instance of memtester. It may be possible to test more by running multiple instances of it simultaneously, or perhaps by patching the kernel to allow more memory to be locked. It would also be useful to provide an argument to the kernel to use memtester as the init program so that no RAM is used for other processes.

Use memtester if you want to test your RAM without rebooting, or if you run on some architecture other than PC-clone x86 (memtest86 uses the PC BIOS, so it may not run on non-PC x86 hardware like some embedded systems). I've had good results with it running under Debian PowerPC on my Mac 8500.

Doug Ledford's Memory Test Script - Danger, Will Robinson!

For an even more rigorous memory test (and one that will run on any Linux platform), try this one. It combines CPU memory accesses with disk DMA accesses, which typically consumes much more memory bandwidth than a purely CPU-based test.

Be careful with this one. I've been told it has been known to damage cheap hardware, and that it could even make your computer so hot it catches fire. This is probably not something you'd want to run on a production machine or on the machine you keep for your personal use, but if you're comfortable with taking the risk it's probably a good way to test the basic health of a brand-new machine you feel should be of sound quality.

Cerberus Test Control System Download Here or did I mean this one?

You'll have to smoke your own box to find out, but I think maybe this is the test tool that I heard was so hard on cheap hardware. From the README.FIRST file:

THE CERBERUS TEST CONTROL SYSTEM CAN BE CONFIGURED IN WAYS THAT MAY DAMAGE HARDWARE, SOFTWARE, OR DATA. IT IS NOT RECOMMENDED TO RUN THIS SOFTWARE ON A MACHINE USED FOR PRODUCTIVE PURPOSES, OR ON HARDWARE YOU DO NOT WANT TO DESTROY.

But significantly, this software has already revealed bugs in the Linux kernel:

In fact, one of the reasons behind the release of the test control system under the GPL is that some critical filesystem corruption bugs were revealed by using the VA test generator in "pathological mode."

cpuburn

This is a CPU chip load tester, written in assembly to cause maximum heat output on P6 and P5 grade Intel-architecture chips. From the author's page:

Undercooled, overclocked or otherwise weak systems may fail causing data loss (filesystem corruption) and possibly permanent damage to electronic components.

Lucifer scroll down the page to "Linux Software"

Lucifer's Freshmeat entry says:

Lucifer is a burn-in program supporting Linux and DOS. It tests RAM, hard disks, processors, and floating-point processors by running stress tests to ensure that the hardware is not likely to have trouble as it ages. Lucifer should be portable, although it is only tested with Linux and DOS.

Application Test Suites

Python's make check - download here

After you build the Python programming language package, you can run a test suite by giving the command make check. This runs a number of test programs. Most of these test the Python language itself, but many of them also test Linux by making system calls.

Some of the tests won't run because they aren't configured - this may be because they needed a feature of Python you didn't choose to build, or they need a feature that's not available for the Linux platform (Python is supported on other platforms like Mac OS and Windows, and has platform-specific extensions for each system it runs on).

At the completion of make check you should see a summary that lists the number of tests passed as well as the number skipped; none should have failed.

Mesa's make exec - download here

Mesa is a free implementation written to the SGI OpenGL specification for 3D graphics. (Since it is not an OpenGL licensee it cannot claim to be an OpenGL implementation.) In recent versions it supports the Direct Rendering Infrastructure, which works in conjunction with software recently added to the Linux kernel and XFree86 to do hardware accelerated 3D graphics under the X Window System.

Whether or not you use DRI, after building Mesa you can give the command make exec and a number of graphic demo programs will run. You need to quit each one manually, usually by pressing the ESC key, and some programs require you to manipulate them via keystrokes or a joystick to fully test them (to navigate around a scene, for example).

You get no announcement whether the tests have passed or failed. If they run OK and the images on the screen look appropriate, you should consider the tests to have passed. There are a couple of benchmarks, and if you don't mind jotting down the benchmark results you could use these to verify that performance stays good as new kernel builds appear.
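If you do jot the benchmark numbers down, comparing them between kernel builds is easy to automate. The following Python sketch assumes you have recorded each result by hand as a name and a score where higher is better (frames per second, say). The function name, the data layout and the 10% tolerance are all arbitrary choices for this example and have nothing to do with Mesa itself.

```python
def find_slowdowns(baseline, current, tolerance=0.10):
    """Return the benchmarks whose score dropped by more than `tolerance`.

    Both arguments map a benchmark name to a score where higher is better.
    The result maps each slower benchmark to its fractional drop.
    """
    slow = {}
    for name, old in baseline.items():
        new = current.get(name)
        if new is None or old <= 0:
            # Not measured on the new kernel, or no usable baseline.
            continue
        drop = (old - new) / old
        if drop > tolerance:
            slow[name] = round(drop, 3)
    return slow
```

A benchmark that loses a few percent from run to run is probably noise, which is why small drops are ignored.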

I cannot pretend to be able to help you get DRI working on your system in the first place, but to do so you will need Linux kernel version 2.4.0 or later, XFree86 4.0.2 or later, and the /dev/agpgart and drm support selected in your kernel. (Some folks say these must be built as modules and others claim this is not true; I don't know.) It is also helpful to join the dri-users and newbie@xfree86.org mailing lists.

Kaffe's make check

Kaffe is a Java virtual machine and set of class libraries available under the GNU General Public License. It is developed independently of Sun's Java and so, in my opinion, provides an important check on the validity of the Java language and the specified requirements of the Java virtual machine. (Most other available Java implementations are derived from Sun's product. Without independent competing implementations like Kaffe, it would be difficult to determine whether there was a bug in the design of Java when that bug was present in all the available versions because they came from the same codebase.)

After building and installing Kaffe, give the command make check. You will see the names of many tests preceded by PASS, FAIL or XPASS (I think this means an unexpected pass, where a test passes although it was expected to fail). A summary will be printed at the end.
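When the output scrolls by too fast to read, you can save it to a file and tally it afterwards. The Python sketch below assumes each result line looks like "PASS: SomeTest", a status word followed by an optional colon and the test name. That layout is my assumption, so check it against what your build actually prints. The function name is made up for this example.

```python
import re
from collections import Counter

# Assumed line layout: a status word, an optional colon, then the test name.
STATUS_LINE = re.compile(r"^(PASS|FAIL|XPASS)\b:?\s*(\S.*)$")


def tally(log_text):
    """Count PASS, FAIL and XPASS lines and list the tests worth a look.

    Returns (counts, unexpected), where `unexpected` is a list of
    (status, test_name) pairs for every FAIL and XPASS line.
    """
    counts = Counter()
    unexpected = []
    for line in log_text.splitlines():
        match = STATUS_LINE.match(line.strip())
        if not match:
            continue  # compiler chatter, make output, blank lines
        status, name = match.groups()
        counts[status] += 1
        if status in ("FAIL", "XPASS"):
            unexpected.append((status, name.strip()))
    return counts, unexpected
```

Running the same tally on logs from a stable kernel and from the kernel under test gives you two short lists to compare.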

In my experience with a couple of versions of Kaffe, some unexpected results will occur even with a stable kernel, so you should investigate with the Kaffe developers before raising the alarm with the Linux kernel developers.

The Mauve Java Test Suite - download here

"The Mauve Project is a collaborative effort to write a free test suite for the Java™ class libraries. The current collaborators come from the Kaffe project, the GNU Classpath project, and the Cygnus Java project."

Hewlett-Packard has contributed code; see the CNET article about it.

PostgreSQL's Regression Tests - download here

From the regression test page:

The regression tests are a comprehensive set of tests for the SQL implementation in PostgreSQL. They test standard SQL operations as well as the extended capabilities of PostgreSQL. The test suite was originally developed by Jolly Chen and Andrew Yu, and was extensively revised and repackaged by Marc Fournier and Thomas Lockhart. From PostgreSQL 6.1 onward the regression tests are current for every official release.

The regression test can be run against an already installed and running server, or using a temporary installation within the build tree. Furthermore, there is a "parallel" and a "sequential" mode for running the tests. The sequential method runs each test script in turn, whereas the parallel method starts up multiple server processes to run groups of tests in parallel. Parallel testing gives confidence that interprocess communication and locking are working correctly. For historical reasons, the sequential test is usually run against an existing installation and the parallel method against a temporary installation, but there are no technical reasons for this.

System Test Suites

Linux Standard Base Test Suite

These are formal test suites to test the compliance of the Linux OS with different standards. So far the tests provided cover the Linux Filesystem Hierarchy Standard and the core POSIX.1 standard.

Compliance of your distribution with the filesystem standard allows software you install to work correctly if it depends on system files being in a certain place (for example, that there is a file named "passwd" in a directory named "/etc" that is in a format that can be parsed to get user account information).
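As an illustration of software that depends on such a file, here is a small Python parser for the traditional passwd format of seven colon-separated fields per line: name, password, UID, GID, comment, home directory and shell. The Account type and the function name are inventions for this example. A real program would more likely use Python's standard pwd module than parse the file by hand.

```python
from typing import NamedTuple


class Account(NamedTuple):
    name: str
    uid: int
    gid: int
    home: str
    shell: str


def parse_passwd(text):
    """Parse passwd-format text into a list of Account records."""
    accounts = []
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        if len(fields) != 7:
            continue  # not a well-formed entry; skip it
        name, _password, uid, gid, _comment, home, shell = fields
        accounts.append(Account(name, int(uid), int(gid), home, shell))
    return accounts
```

Reading the real file is then a matter of passing open("/etc/passwd").read() to parse_passwd, which only works if the file is where the standard says it should be.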

Compliance with the POSIX.1 specification enables programs written for other Unix variants and workalikes that also comply with the POSIX spec to be source code compatible with Linux. It's also the tightest way to test those Linux system calls that are part of POSIX.

The Linux Test Project - download here

According to their page, the Linux Test Project was created to answer the very question I had when I began to work on this article - "what tests are there for Linux?"

So far there are about 550 tests, a test driver, and a result scanning tool. It is possible to run multiple tests in parallel (which therefore tests a more heavily loaded system).
