mapSoN

Name

mapson -- An automatic spam filter

Synopsis

mapson [-d | --debug] [-c config | --config-file config] [mail...]

Description

The amount of unsolicited commercial e-mail ("spam") circulating on the Internet today has become unbearable for most people. Many approaches have been proposed to stop this junk from filling up your mailboxes, such as the Realtime Blackhole List, Teergruben, or procmail-based anti-spam recipes.

mapSoN is another anti-spam system, but it uses an approach entirely different from the systems named above. Instead of trying to recognize spam by the IP address of the SMTP dialog's peer or by certain patterns in the mail's body, mapSoN uses the sender's e-mail address to decide whether the e-mail is delivered to your mailbox or not: any e-mail that comes from a "known" address may pass; any e-mail that comes from an address seen for the first time needs special confirmation before it may pass.

"Special confirmation" means that mapSoN will generate an MD5 checksum of the to-be-confirmed mail and store the mail in a temporary spool directory. Then it sends a request for confirmation to the address the mail came from. In this request, it includes the MD5 checksum and asks the recipient to reply and quote that MD5 hash. Once mapSoN sees that MD5 hash again, it considers that a confirmation of the original mail, delivers the deferred mail from the spool to your mailbox, and adds the sender's address to the database of known addresses, so that the next time the sender tries to contact you, the mail will pass through mapSoN immediately.

This heuristic catches almost any spam mail, because spammers have to fake their sender addresses in order to avoid being held responsible for their abuse. Hence, their address will most likely not be in the database of known addresses, nor will they ever receive the request for confirmation mail!
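The confirmation workflow described above can be sketched as a small in-memory model. This is only an illustration under stated assumptions, not mapSoN's implementation: the class name ConfirmationFilter, the in-memory set, dict and list standing in for the address database, spool directory and mailbox, and the choice to discard the confirmation reply itself are all hypothetical.

```python
import hashlib


class ConfirmationFilter:
    """Toy in-memory model of the confirmation workflow (not mapSoN's code)."""

    def __init__(self):
        self.known = set()   # stands in for the database of known addresses
        self.spool = {}      # stands in for the spool directory: digest -> (sender, mail)
        self.mailbox = []    # stands in for the mailbox file
        self.requests = []   # (address, digest) pairs a request would be sent for

    def receive(self, sender, mail):
        """Process one incoming mail, given its sender address and raw bytes."""
        # A mail quoting the digest of a spooled mail counts as confirmation.
        # Assumption: the reply itself is dropped; the text does not say.
        for digest in list(self.spool):
            if digest.encode("ascii") in mail:
                original_sender, original = self.spool.pop(digest)
                self.mailbox.append(original)
                self.known.add(original_sender)
                return "confirmed"
        # Mail from a known address passes immediately.
        if sender in self.known:
            self.mailbox.append(mail)
            return "delivered"
        # First contact: defer the mail and ask the sender to confirm.
        digest = hashlib.md5(mail).hexdigest()
        self.spool[digest] = (sender, mail)
        self.requests.append((sender, digest))
        return "deferred"
```

In this model, a first mail from an unknown address returns "deferred", a later mail quoting the recorded digest returns "confirmed" and releases the original, and any further mail from that address returns "delivered".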

For a complete description of what mapSoN does and how to configure it, please take a look at the user manual that comes with it. This document is only a reference page and does not explain all options in detail.

Command Line Syntax

mapSoN understands the following command line parameters:

-h

Show mapSoN usage information.

-d, --debug

Enable debugging. Please note that debugging is only available if mapSoN has been compiled with the define DEBUG. Otherwise, the debug code is not included in the binary.

-c config, --config-file config

Use the configuration file config rather than the default.

mail …

If any parameter is specified on the command line that is not an option, mapSoN will go into gather addresses mode. The parameters are interpreted as filenames, each of the files containing an e-mail that mapSoN will parse. Any sender address mapSoN finds in these mails will be added to the database of known addresses. This mode is meant to import addresses from your mail archive to the database.
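What gather addresses mode does can be approximated with Python's standard email package. This is a hedged sketch, not mapSoN's parser: the function name gather_addresses is hypothetical, and both the set of headers inspected (From, Reply-To, Sender) and the lower-casing of addresses are assumptions, since this page does not say which headers mapSoN reads or how it normalizes addresses.

```python
from email import message_from_bytes
from email.utils import getaddresses


def gather_addresses(raw_mails, headers=("From", "Reply-To", "Sender")):
    """Collect sender addresses from an iterable of raw mails (bytes)."""
    known = set()
    for raw in raw_mails:
        message = message_from_bytes(raw)
        for name in headers:
            # get_all returns every occurrence of the header, or [] if absent.
            values = message.get_all(name, [])
            for _display_name, address in getaddresses(values):
                if address:
                    # Assumption: addresses are compared case-insensitively.
                    known.add(address.lower())
    return known
```

To mirror the command line, a caller would read each file named on the command line in binary mode and pass the contents to the function.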

Configuration File

The configuration file may contain the following directives:

Mailbox file

This directive sets the complete path of the mailbox file, where mapSoN stores approved mails.

SpoolDir directory

This directive sets the complete path to the directory in which deferred mails are spooled until a confirmation arrives for them.

AddressDB file

This directive sets the complete path of the file mapSoN uses to store the "known" addresses.

ReqConfirmTemplate file

This directive sets the complete path to the request-for-confirmation template file mapSoN uses to generate the request-for-confirmation mail sent to first-time originators.

An arbitrary number of alternate paths can be specified, if they're separated by colons, for example:

$HOME/.mapson/reqmail.template:$DATADIR/reqmail.template:…

In this setup, mapSoN would first try to load the file $HOME/.mapson/reqmail.template. If that failed, it would try $DATADIR/reqmail.template, and so on, until one of the files can be loaded successfully.
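The fallback over colon-separated alternatives can be sketched as follows. The function name load_first_available is hypothetical, and os.path.expandvars merely stands in for mapSoN's own variable expansion; unlike mapSoN, it leaves undefined variables untouched instead of aborting.

```python
import os


def load_first_available(path_list):
    """Return (path, contents) of the first loadable file in a colon-separated list."""
    for candidate in path_list.split(":"):
        # Stand-in for mapSoN's variable expansion (an assumption).
        path = os.path.expandvars(candidate)
        try:
            with open(path, "r", encoding="utf-8") as handle:
                return path, handle.read()
        except OSError:
            continue  # This alternative could not be loaded; try the next one.
    raise FileNotFoundError(f"no file could be loaded from: {path_list}")
```

Because the loop stops at the first success, the order of the alternatives determines which file takes precedence.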

This is an extremely useful feature if you are a system administrator who wishes to allow all users of the system to use mapSoN without having to create a request-for-confirmation template of their own: configure mapSoN to first try the request-for-confirmation template located in the user's home directory and, if that file does not exist, to fall back to the system-wide file.

In effect, users can simply use mapSoN to filter their mail, and if they ever feel like it, they can create a request-for-confirmation template file of their own, which will then be preferred over the system-wide one.

MTA command

This directive sets the command mapSoN uses to send out a request-for-confirmation mail. The actual mail is piped into the standard input of the started process.
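Piping a mail into a configured command looks roughly like this. It is a minimal sketch: the function name send_via_mta is hypothetical, and splitting the command string with shlex rather than handing it to a shell is an assumption about how the command line is interpreted.

```python
import shlex
import subprocess


def send_via_mta(command, message):
    """Start `command` and pipe `message` (bytes) to its standard input.

    Returns the exit status of the started process.
    """
    # Assumption: the directive's value is split into arguments like a
    # shell would, but no shell is actually involved.
    argv = shlex.split(command)
    result = subprocess.run(argv, input=message)
    return result.returncode
```

A non-zero return value would indicate that the command did not accept the mail.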

PassIncorrectMails boolean

When mapSoN parses the incoming mail's headers for the addresses, it may detect syntax errors in the mail header that do not cause a fatal error, but that strongly suggest the mail was not created by an RFC822-conformant mail client.

Many spam mails contain incorrect header lines, so you may choose to have mapSoN fail on any syntax error -- even non-fatal ones. "Failing" means that mapSoN will abort and return the return code configured with SyntaxErrorRC to the MTA. Depending on the setting of the return code, the MTA will then bounce the mail.

The parameter given to this option is a boolean, meaning that you may specify either yes or no.

StrictRFCParser boolean

If you enable this option by specifying yes, mapSoN will perform additional syntax checks on the incoming mail. If you say no, it will check only those headers that mapSoN needs in order to operate at all.

Enabling this option makes little sense unless you disable the PassIncorrectMails option.

RuntimeErrorRC integer

This directive sets the return code mapSoN exits with in case it had to abort with a run-time error. Possible run-time errors are failure to open file, lack of available memory, etc.

The default choice is "75", which sendmail will interpret as a temporary system error, so it will queue the mail and re-try.

A valid return code is a positive integer up to 128.

SyntaxErrorRC integer

This directive sets the return code mapSoN exits with in case it encountered a fatal syntax error in the e-mail. If PassIncorrectMails is disabled, non-fatal syntax errors will also cause mapSoN to abort with this return code.

The default choice is "65", which sendmail will interpret as a permanent error that causes the mail to bounce.

A valid return code is a positive integer up to 128.
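How RuntimeErrorRC, SyntaxErrorRC and PassIncorrectMails interact can be summarized in a small decision function. This is a sketch of the rules stated above, not mapSoN's code: the function names are hypothetical, exit status 0 for success is the usual Unix convention rather than something this page states, and giving run-time errors precedence over syntax errors is an assumption.

```python
RUNTIME_ERROR_RC = 75  # documented default of RuntimeErrorRC
SYNTAX_ERROR_RC = 65   # documented default of SyntaxErrorRC


def validate_rc(value):
    """Accept only a positive integer up to 128, as the directives require."""
    if not isinstance(value, int) or not 1 <= value <= 128:
        raise ValueError(f"invalid return code: {value!r}")
    return value


def exit_code(runtime_error=False, fatal_syntax_error=False,
              nonfatal_syntax_error=False, pass_incorrect_mails=True,
              runtime_rc=RUNTIME_ERROR_RC, syntax_rc=SYNTAX_ERROR_RC):
    """Return the exit status for one processed mail."""
    if runtime_error:
        # Assumption: run-time errors take precedence over syntax errors.
        return validate_rc(runtime_rc)
    if fatal_syntax_error:
        return validate_rc(syntax_rc)
    if nonfatal_syntax_error and not pass_incorrect_mails:
        # With PassIncorrectMails disabled, non-fatal errors abort as well.
        return validate_rc(syntax_rc)
    return 0  # Assumption: conventional success status.
```

With the defaults, sendmail would queue and retry a mail that produced 75 and bounce one that produced 65.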

Debug boolean

If you enable debugging messages by saying yes here, mapSoN will log additional information about its processing of the mail. If you say no, mapSoN will log only very few messages.

Debugging is available only when the binary has been compiled with the DEBUG symbol defined. Currently that is the default, so unless you explicitly disabled it, debugging will be available.

In order to make the contents of the configuration file as independent from the system's directory structure as possible, mapSoN provides a set of environment variables, which are guaranteed to be defined. You can use them anywhere in the data part of a configuration directive, and you can use the usual manipulations on them.

Environment variables are looked up case-sensitively, so $home is not the same thing as $HOME. This behavior is different in the request-for-confirmation template, where you can spell the variables in upper or lower case as you wish. That's because the variables there do not come from the environment, but are mapSoN's internal variables. Be sure not to confuse the two, because an undefined variable in the configuration file will cause mapSoN to abort with an error.

mapSoN will not overwrite already existing variables, though! If your system defines, for example, the $HOME variable, then you'll get the value from the system's variable.

Here is the complete list:

$MAILBOXDIR

This variable contains the complete path of the directory in which the system's mailboxes are located, usually /var/spool/mail. Please note that the value provided here is the one determined at compile-time, so if you changed your system's installation and want to rely on this variable, you'll have to re-compile.

$MTA

This variable contains the path to the system's mail transport agent. Please note that this is only the path of the executable -- for example /usr/sbin/sendmail --; the variable does not contain the flags that must be passed to the MTA in order to do something useful.

$DATADIR

This variable contains the complete path of the directory that has been compiled into mapSoN as the location for read-only, architecture-independent data. You will, for example, find the system-wide request-for-confirmation template file here.

$USER

This variable contains the name of the user under which mapSoN is running. Depending on your MTA, this is not necessarily the user who is receiving the mail! If you're using sendmail, though, you're on the safe side.

$HOME

This variable contains the complete path of $USER's home directory.
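The variable rules described above (built-in values never override variables the system already defines, lookups are case-sensitive, and an undefined variable is an error) can be illustrated with Python's string.Template. This is only an analogy: the function names and the example value /usr/share/mapson are hypothetical, and the sketch covers plain $NAME substitution only, not the further manipulations that mapSoN's Variable Expression Library offers.

```python
from string import Template


def build_environment(system_env, builtin_defaults):
    """Merge built-in defaults with the system environment; the system wins."""
    env = dict(builtin_defaults)
    env.update(system_env)  # existing system variables are never overwritten
    return env


def expand(value, env):
    """Expand $NAME references case-sensitively.

    Raises KeyError for an undefined variable, mirroring the documented
    abort on undefined variables in the configuration file.
    """
    return Template(value).substitute(env)
```

For example, with HOME defined by the system as /home/alice, the value $HOME/.mapson/reqmail.template expands to /home/alice/.mapson/reqmail.template, whereas $home/.mapson/reqmail.template raises an error.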

License

This software is copyrighted by Peter Simons. Permission is granted to use it under the terms of the GNU General Public License. For further details, refer to the file LICENSE included in the software distribution or see http://www.gnu.org/licenses/gpl.html in case that file is missing.

mapSoN uses the "Variable Expression Library", which is included in the distribution for convenience. This library is not part of the mapSoN package and is licensed under the terms described in the file libvarexp/LICENSE. Should that file be missing in your distribution, contact me for a copy.

The MD5 library included in this distribution has been taken from the "GNU C Library" and is licensed under the terms of the GNU Library General Public License. The license can be found in the file libmd5/COPYING.LIB.

The getopt() and getopt_long() implementations used by mapSoN in case the routines are not available in the system libraries come from the GNU C Library, too, and are licensed under the terms of the GNU Lesser General Public License, which is effectively just a newer version of the Library General Public License. A copy of the license can be found in libgetopt/COPYING.