From Steve Snyder on 20 Sep 1998
Is there a validation test suite for glibc v2.0.x? I mean a more comprehensive set of tests than are run by "make check" after building the runtime libraries.
Not that I know of. I guess the conventional wisdom is that if I install glibc and a bunch of sources, compile the sources against the new libc, and run all of the resulting programs --- then any failures will somehow be "obvious."
Personally I think that this is stupid. Obviously it mostly works for most of us most of the time. However, it would be nice to have full test and regression suites that exercise a large range of functions for each package --- and to include these (and the input and reference data sets) in the sources.
It would also be nice if every one of them included a "fuzz" script (calling the program with random combinations of the available options, switches and inputs --- perhaps driven by a list of the available options). This could test the programs for robustness in the face of errors and might even find some buffer overflows and other bugs.
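A fuzz driver along those lines can be sketched in a few lines of shell. Everything here is a placeholder sketch, not a real harness: TARGET and OPTS are illustrative (the example picks on 'sort', whose listed flags all take no arguments), and the crash check just looks for signal exits:

```shell
#!/bin/sh
## Minimal fuzz-driver sketch: run a target command with random
## combinations of its switches and report any crashes.
## TARGET and OPTS are placeholders -- fill them in per package.
TARGET="sort"
OPTS="-r -u -n -b -f"
i=0
while [ $i -lt 20 ] ; do
    picked=""
    for o in $OPTS ; do
        ## flip a coin for each available option
        ## (RANDOM is a bash/ksh feature; plain sh treats it as 0)
        [ $(( RANDOM % 2 )) -eq 0 ] && picked="$picked $o"
    done
    echo "some input" | $TARGET $picked > /dev/null 2>&1
    status=$?
    ## exit codes >= 128 usually mean the program died on a signal
    [ $status -ge 128 ] && echo "CRASH: $TARGET $picked (exit $status)"
    i=$(( i + 1 ))
done
```

A real harness would also feed random input files and record the failing command lines for replay.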
However, I'm not a programmer. I used to do some quality assurance --- and that whole segment of the market seems to be in a sad state. I learned (when I studied programming) that the documentation and the test suite should be developed as part of the project. User and programmer documentation should lead the coding (with QA cycles and user reviews of the proposed user interfaces, command sets, etc., prior to coding).
The "whitebox" test suites should be developed incrementally as parts of the code are delivered (if I write a function that will be used in the project, some QA whiteboxer should write a small specialized program that calls this function with a range of valid and invalid inputs and tests the function's behaviour against a suite that just applies to it).
Recently I decided that md5sum really needs an option to read filenames from stdin. I want to write some scripts that essentially do:
'find .... -print0 | md5sum -0f '
... kind of like 'cpio'. Actually I really want to do:
'rpm -qal | regular.files | md5sum -f'
... to generate some relatively large checksum files for later use with the '-c' option. This 'rpm' command will "Query All packages for a List of all files." The regular.files filter is just a short shell script that does:
#!/bin/sh
## This uses the test command to filter out filenames that
## refer to anything other than regular files (directories,
## Unix domain sockets, device nodes, FIFO/named pipes, etc)
while read i ; do
    [ -f "$i" ] && echo "$i"
done
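To see the filter in action, feed it a mixed list of names and confirm that only the regular file survives. The function below inlines the script above so the sketch is self-contained; the scratch paths are illustrative:

```shell
#!/bin/sh
## Demonstration of the regular.files filter: a directory and a
## nonexistent name go in, only the plain file comes out.
filter() {
    while read i ; do
        [ -f "$i" ] && echo "$i"
    done
}
tmp=/tmp/rf-demo.$$
mkdir -p "$tmp/subdir"
echo "hello" > "$tmp/plain.txt"
printf '%s\n' "$tmp/subdir" "$tmp/plain.txt" "$tmp/missing" | filter
```

Only the line naming plain.txt is printed; the directory and the missing file are silently dropped.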
So I grabbed the textutils sources, created a few sets of md5sum files from my local files (using 'xargs'). Those are my test data sets.
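The generation step can be done entirely with stock tools. Here is a sketch over a scratch directory (the paths and file contents are illustrative); the resulting set is then verified against itself with 'md5sum -c':

```shell
#!/bin/sh
## Build a small test data set: checksum every regular file under a
## scratch tree via xargs, saving the list for later 'md5sum -c' use.
dir=/tmp/md5-testset.$$
mkdir -p "$dir"
echo "alpha" > "$dir/a.txt"
echo "beta"  > "$dir/b.txt"
## exclude the output file itself, which find would otherwise list
find "$dir" -type f ! -name SUMS.md5 | xargs md5sum > "$dir/SUMS.md5"
## every file should now report OK
md5sum -c "$dir/SUMS.md5"
```

This is exactly the kind of fixed, reproducible data set that makes before-and-after comparisons meaningful.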
Then I got into md5sum.c, added the command options, cut and pasted some parts of the existing functions into a new function, and was able to get it compiling cleanly in a couple of hours. I said I'm not a programmer, didn't I? I think a decent programmer could have done this in about an hour.
Then I ran several tests. I ran the "make check" tests, and used the new version's -c to check my test sets. I then used the same script that generated those sets to generate a new set using the new binary. I compared the two sets (using 'cmp' and 'diff') and checked them with the old version. Then I generated new ones (with the new switch I'd added, and again with the old version) and cross-checked them again.
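The cross-check procedure itself is scriptable. In this sketch both variables point at the stock md5sum so the example runs anywhere; in a real test NEW would name the freshly patched build:

```shell
#!/bin/sh
## Cross-check sketch: generate checksum sets with an "old" and a
## "new" binary over the same inputs, compare the sets byte-for-byte,
## then verify each set with the other binary's -c mode.
OLD=md5sum
NEW=md5sum   ## placeholder -- point this at the patched binary
dir=/tmp/xcheck.$$
mkdir -p "$dir"
echo "sample data" > "$dir/file1"
echo "more data"   > "$dir/file2"
( cd "$dir" && $OLD file1 file2 > old.md5 && $NEW file1 file2 > new.md5 )
cmp -s "$dir/old.md5" "$dir/new.md5" && echo "sets identical"
( cd "$dir" && $NEW -c old.md5 && $OLD -c new.md5 ) > /dev/null \
    && echo "cross-check passed"
```

Any divergence between the two binaries shows up either as a cmp mismatch or as a FAILED line from one of the -c runs.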
This new version allows you to use stdin or to pass a filename which contains a list of files to checksum --- it uses the --filelist long argument as well as the -f short form; and you can use -f - or just -f to use stdin. I didn't implement the -0 (--null) option --- but I did put placeholders in the code where it could be done.
The point here is that I had a test suite that was longer than the code. I also spent more time testing and documenting (writing a note to Ulrich Drepper, the original author of this package to offer the patches to him) than I did on coding.
Though a benchmarking component would be nice, my main concern is to verify that all (or at least the vast majority) of the library functions work correctly. What I want to know is, given a specific compiler and a specific version of the glibc source files, how can I verify that the libraries built are reliable?
By testing them. Unfortunately, that may mean that you'll have to write your own test suites. You may even have to start a new GNU project to create them.
It is likely that most of the developers and maintainers of these packages have test suites that they run before they post their new versions. It would be nice if they posted the test suites as part of the source package --- and opened the testing part of the project to the open development model.
In addition, these test suites and harnesses (the scripts that create isolated and sample directory structures, etc., to run a program or library through its paces) would serve as a great addition to the documentation.
I find 'man' pages to be incredibly dense. They are fine if you know enough about the package that you are just looking for a specific feature that you think might be there, or one that you know is in there somewhere --- but you don't remember the switch or the syntax. However, a test harness, script, and set of associated inputs, outputs, and configuration files would give plenty of examples of how the bloody thing is supposed to work. I often have to hunt for examples --- this would help.
The specific version I want to test is the glibc v2.0.7 that comes with RH Linux v5.1 and was updated after the 5.1 release by package glibc-2.0.7-19.src.rpm. I think, though, that such a test suite, if it exists, would be applicable to any platform.
I agree. I just wish I could really co-ordinate such a project. I think this is another example where our academic communities could really help. I've said before that I would like to see an "adopt a 'man' page" project --- where college and university professors, and even high school teachers, from around the world assign a job to their students:
Find a command or package for Linux, FreeBSD, etc. Read the man pages and other docs. Find one way that the command is used or useful that is not listed in the "examples" section of that man page. Write a canonical example of that command variant.
... they would get graded on their work --- and anyone earning an A would be encouraged (solely at their option) to submit the recommended example as a patch to the maintainer of the package.
Similar assignments would be given for system calls, library functions, etc. (as appropriate to the various classes and class segments).
Along with this, we could have a process by which students are encouraged to find bugs in existing real world software --- write test suites and scripts to test for the recurrence of these bugs in future versions (regressions), and submit the tests to that package's maintainer.
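Such a regression test can be tiny: preserve the input that once exposed the bug and assert the fixed behaviour on every future version. The bug below is hypothetical --- suppose a report that 'md5sum -c' once exited 0 even when a checksum in the list did not match:

```shell
#!/bin/sh
## Regression-test sketch for a hypothetical past bug: record a
## file's checksum, tamper with the file, and insist that -c
## notices the mismatch (exits nonzero) forever after.
dir=/tmp/regress.$$
mkdir -p "$dir"
echo "good data" > "$dir/good"
md5sum "$dir/good" > "$dir/list.md5"
## corrupt the file after recording its checksum
echo "tampered" > "$dir/good"
if md5sum -c "$dir/list.md5" > /dev/null 2>&1 ; then
    echo "FAIL: -c missed a bad checksum (regression)"
    exit 1
else
    echo "PASS: -c caught the mismatch"
fi
```

Collected into a directory and run by a driver script, a pile of these becomes exactly the recurrence check described above.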
The problem with all of this is that testing is not glamorous. It is boring for most people. Everyone knows Richard M. Stallman's and Linus Torvalds' names --- but fewer people remember the names of the other programmers that they work with, and no one knows who contributed "just the testing."
There are methods that can be used to detect many bugs more quickly and reliably than waiting until users "bump into" them. These won't be comprehensive. They won't catch "all" of the bugs. People will still "bump into" plenty of bugs in normal usage, even if we employ the best principles of QA practice across the board.
Unfortunately I don't have the time to really devote to such a project. I devote most of my "free" time to the tech support department. I do have spare machine cycles, though, and could gladly devote time to running these tests and reporting results. Obviously some tests require whole networks, preferably disconnected ones, on which to run safely. Setting up such test beds, and designing tests that return meaningful results, is difficult work.
I personally think that good test harnesses are often harder to design than the programs that they are designed to test.
***** Steve Snyder *****