Instructions for contributing statistical code

PSPP needs more statistical functionality. We, the PSPP developers, would not mind writing all of this code ourselves, but we do not have enough time, so we are asking volunteers to submit statistical code for inclusion in PSPP. We will accept and review such submissions. Below are guidelines and requirements for authors, and instructions for submitting code.

What we need most are basic, "back-end" statistical routines. Adding new statistical functionality requires two steps: writing the statistical part of the program, then making it run within PSPP. We are asking for volunteers to work on the first task: writing modules that run statistical procedures independently of PSPP. Most statisticians, even those with plenty of programming experience, will find writing the statistical subroutines difficult enough. The added difficulty of hooking those routines into PSPP, and of writing the necessary I/O functions, would likely discourage them. To avoid this problem, we ask authors to write just the statistical subroutines, then to submit them to us for editing, review and eventual inclusion in PSPP.

Guidelines for contributions

  1. Right now, our greatest need is for basic subroutines for estimation and testing: logistic regression, Poisson regression, smoothing splines, factor analysis, n-way analysis of variance (including random-effects models), clustering, neural networks, classification and regression trees, and non-parametric tests. This list is incomplete, so if the procedure you want to submit is not listed above, ask us whether we would like to include it. If it is something that PSPP lacks but that exists in other statistical software, we probably want to include it.
  2. If you want to submit a more esoteric statistical procedure, for example, one that appeared in a recent journal article but has not yet appeared widely in the statistical literature, please contact us to ask about its inclusion. If it is something users want, we probably would like to include it.
  3. We welcome improvements to existing statistical functionality.
  4. Before you write something to send to us, make sure it is not already part of PSPP.

Requirements for authors

Before submitting your code for inclusion in PSPP, make sure your program satisfies the following requirements:

  1. Include with your subroutines a test program, and directions telling us how to compile and run it on a GNU/Linux system. The whole package must compile and run as submitted.
  2. In a comment at the top of at least one of the files, include one or two short paragraphs that describe what the program does.
  3. In a prominent place in your source, include a comment that describes all inputs necessary to your program. Also include a comment that describes all of the program's output.
  4. Include a comment containing a bibliography relevant to your code. We may need these references to fix bugs or to alter the code later.
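
As an illustration only, the four requirements above could be met by a source file laid out roughly as follows. This is a minimal sketch, not an official PSPP template: the file name, the function names running_stats and running_stats_self_test, and the choice of a one-pass mean and variance routine are all assumptions made for the example.

```c
/* running_stats.c (hypothetical example file, not part of PSPP).

   Description:
     Computes the mean and the unbiased sample variance of a sequence
     of doubles in a single pass, using Welford's update, which avoids
     the cancellation error of the "sum of squares" formula.

   Inputs:
     x  - array of n observations, with no missing values
     n  - number of observations; must be at least 2

   Outputs:
     *mean      - arithmetic mean of x[0] .. x[n-1]
     *variance  - sample variance with divisor n - 1
     Return value: 0 on success, -1 if the arguments are invalid.

   Bibliography:
     Welford, B. P. (1962).  Note on a method for calculating corrected
     sums of squares and products.  Technometrics 4(3), 419-420.  */

#include <stddef.h>

int
running_stats (const double *x, size_t n, double *mean, double *variance)
{
  double m = 0.0;   /* Running mean. */
  double s = 0.0;   /* Running sum of squared deviations from the mean. */
  size_t i;

  if (x == NULL || mean == NULL || variance == NULL || n < 2)
    return -1;

  for (i = 0; i < n; i++)
    {
      double delta = x[i] - m;
      m += delta / (double) (i + 1);
      s += delta * (x[i] - m);
    }

  *mean = m;
  *variance = s / (double) (n - 1);
  return 0;
}

/* Returns nonzero if A and B differ by at most TOL. */
static int
close_enough (double a, double b, double tol)
{
  double diff = a - b;
  if (diff < 0.0)
    diff = -diff;
  return diff <= tol;
}

/* Test routine: returns the number of failed checks. */
int
running_stats_self_test (void)
{
  const double x[] = { 2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0 };
  double mean = 0.0, variance = 0.0;
  int failures = 0;

  if (running_stats (x, 8, &mean, &variance) != 0)
    return 1;
  if (!close_enough (mean, 5.0, 1e-12))
    failures++;
  if (!close_enough (variance, 32.0 / 7.0, 1e-12))
    failures++;

  /* A single observation is rejected as invalid input. */
  if (running_stats (x, 1, &mean, &variance) != -1)
    failures++;

  return failures;
}
```

A test program for this file could be a main function that simply returns running_stats_self_test (), accompanied by a one-line note giving the compiler command used to build it on a GNU/Linux system.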

If your submission meets these requirements, a reviewer will check it for correctness, and contact the author to resolve any technical problems.

Guidelines for writing software

The following list is a set of guidelines we would like authors to follow when writing code for PSPP. We will not always reject submissions that do not conform to these guidelines, but submissions that do conform to them will make our job easier.

Remember that we will have to modify the submitted program substantially to make it run inside PSPP. These modifications can take considerable work, so following the guidelines below could shorten the time to inclusion.

  1. When possible, please follow the GNU coding standards.
  2. Please use the GNU Scientific Library (GSL) when appropriate. GSL supports linear algebra, functions for computing probabilities and related values, random number generators, optimization, and much more. So there is no need to write, for example, your own functions for computing the gamma density.
  3. Please write a struct to store the results of your routines, along with some accessor functions for that struct. For example, the PSPP linear regression library has a "pspp_linreg_cache" which holds information about the fitted model including parameter estimates, mean squared error, and pointers to functions to compute predicted values and residuals.
  4. Include with your program a file that describes how one should use the program, from the viewpoint of a PSPP user. The format of this file should be either plain text or Texinfo. We will use this file to create a chapter in the PSPP user's manual.
  5. Please write your program in C if you can.
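
To illustrate guideline 3, the sketch below shows one way a result struct and its accessor functions might look for a simple one-predictor least-squares fit. Every name here (slr_result, slr_fit, slr_get_slope, and so on) is hypothetical and chosen for the example; this is not the layout or interface of pspp_linreg_cache, and it uses plain C rather than GSL to stay self-contained.

```c
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical result type for the model y = intercept + slope * x.
   Callers are expected to read it through the accessors below.  */
struct slr_result
  {
    size_t n;          /* Number of observations used in the fit. */
    double intercept;  /* Estimated intercept. */
    double slope;      /* Estimated slope. */
    double mse;        /* Residual sum of squares divided by n - 2. */
  };

/* Fits the model by ordinary least squares.  Returns a newly allocated
   result, or NULL on invalid input, constant x, or allocation failure.  */
struct slr_result *
slr_fit (const double *x, const double *y, size_t n)
{
  struct slr_result *r;
  double xbar = 0.0, ybar = 0.0, sxx = 0.0, sxy = 0.0, sse = 0.0;
  size_t i;

  if (x == NULL || y == NULL || n < 3)
    return NULL;

  for (i = 0; i < n; i++)
    {
      xbar += x[i];
      ybar += y[i];
    }
  xbar /= (double) n;
  ybar /= (double) n;

  for (i = 0; i < n; i++)
    {
      sxx += (x[i] - xbar) * (x[i] - xbar);
      sxy += (x[i] - xbar) * (y[i] - ybar);
    }
  if (sxx == 0.0)
    return NULL;

  r = malloc (sizeof *r);
  if (r == NULL)
    return NULL;

  r->n = n;
  r->slope = sxy / sxx;
  r->intercept = ybar - r->slope * xbar;
  for (i = 0; i < n; i++)
    {
      double resid = y[i] - (r->intercept + r->slope * x[i]);
      sse += resid * resid;
    }
  r->mse = sse / (double) (n - 2);
  return r;
}

/* Accessor functions. */
size_t slr_get_n (const struct slr_result *r) { return r->n; }
double slr_get_intercept (const struct slr_result *r) { return r->intercept; }
double slr_get_slope (const struct slr_result *r) { return r->slope; }
double slr_get_mse (const struct slr_result *r) { return r->mse; }

/* Predicted value at X under the fitted model. */
double
slr_predict (const struct slr_result *r, double x)
{
  return r->intercept + r->slope * x;
}

/* Releases a result returned by slr_fit. */
void
slr_destroy (struct slr_result *r)
{
  free (r);
}
```

Keeping the results behind accessors means the code that later connects the routine to PSPP does not depend on the struct's fields, so the struct can change without breaking its callers.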

Submission and review

To submit your code, go to http://savannah.gnu.org/projects/pspp/ and open a new task. Include your source files as an archive, and assign the task to jstover. If your submission meets the requirements above, it will be reviewed and you will be contacted later about its status. Remember to give us your email address so we can contact you.

If your submission is accepted for inclusion in PSPP, you, and possibly your employer, will have to assign copyright to the Free Software Foundation to allow us to include your program.