NAME
Apache::ClickPath - Apache WEB Server User Tracking
SYNOPSIS
LoadModule perl_module ".../mod_perl.so"
PerlLoadModule Apache::ClickPath
<ClickPathUAExceptions>
Google Googlebot
MSN msnbot
Mirago HeinrichderMiragoRobot
Yahoo Yahoo-MMCrawler
Seekbot Seekbot
Picsearch psbot
Globalspec Ocelli
Naver NaverBot
Turnitin TurnitinBot
dir.com Pompos
search.ch search\.ch
IBM http://www\.almaden\.ibm\.com/cs/crawler/
</ClickPathUAExceptions>
ClickPathSessionPrefix "-S:"
ClickPathMaxSessionAge 18000
PerlTransHandler Apache::ClickPath
PerlOutputFilterHandler Apache::ClickPath::OutputFilter
LogFormat "%h %l %u %t \"%m %U%q %H\" %>s %b \"%{Referer}i\" \"%{User-agent}i\" \"%{SESSION}e\""
ABSTRACT
"Apache::ClickPath" can be used to track user activity on your web
server and gather click streams. Unlike mod_usertrack it does not use a
cookie. Instead, the session identifier is transferred as the first part
of the URI.
Furthermore, in conjunction with a load balancer it can be used to
direct all requests belonging to a session to the same server.
DESCRIPTION
"Apache::ClickPath" adds a PerlTransHandler and an output filter to
Apache's request cycle. The transhandler inspects the requested URI to
decide if an existing session is used or a new one has to be created.
The Translation Handler
If the requested URI starts with a slash followed by the session prefix
(see "ClickPathSessionPrefix" below), the rest of the URI up to the next
slash is treated as the session identifier. For example, if the requested
URI is "/-S:s9NNNd:doBAYNNNiaNQOtNNNNNM/index.html" and
"ClickPathSessionPrefix" is set to "-S:", the session identifier is
"s9NNNd:doBAYNNNiaNQOtNNNNNM".
If no session identifier is found, a new one is created.
Then the session prefix and identifier are stripped from the current
URI. A session that may be present in the incoming "Referer" header is
stripped as well.
There are several exceptions to this scheme. Even if the incoming URI
contains a session, a new one is created if the existing one is older
than "ClickPathMaxSessionAge" seconds. This prevents link collections,
bookmarks or search engines from generating endless click streams.
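The URI handling described above can be sketched in a few lines. The
following Python is an illustration only, not the module's Perl code. The
function names, the treatment of a URI that has no path after the session,
and the strict "greater than" age comparison are assumptions. How the start
time is recovered from a session identifier is not covered here, so it is
passed in directly.

```python
import time


def split_session(uri, prefix="-S:"):
    """Return (session_id, stripped_uri); session_id is None if absent."""
    marker = "/" + prefix
    if not uri.startswith(marker):
        return None, uri
    # Everything up to the next slash is the session identifier.
    session_id, slash, tail = uri[len(marker):].partition("/")
    if not session_id:
        return None, uri
    # Assumption: a URI consisting only of the session maps to "/".
    stripped = (slash + tail) if slash else "/"
    return session_id, stripped


def session_too_old(session_start, max_age=18000, now=None):
    """True if the session started more than max_age seconds ago."""
    if now is None:
        now = time.time()
    return (now - session_start) > max_age
```

With the example URI from above, split_session() yields the identifier
"s9NNNd:doBAYNNNiaNQOtNNNNNM" and the stripped URI "/index.html". The
default max_age mirrors the "ClickPathMaxSessionAge 18000" line in the
SYNOPSIS.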
If the incoming "User-Agent" header matches a configurable regular
expression, no session identifier is generated and no output filtering
is done. That way search engine crawlers do not create sessions, and
links to your site remain readable (without the session prefix and
identifier).
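As a rough illustration of this check, the sketch below pairs each
exception name with its regular expression, using three entries from the
SYNOPSIS. It is written in Python for brevity and is not the module's
implementation. The unanchored, case-sensitive search and the "first match
wins" order are assumptions, and matching_exception() is a hypothetical
name.

```python
import re

# (name, regular expression) pairs, taken from the SYNOPSIS.
UA_EXCEPTIONS = [
    ("Google", r"Googlebot"),
    ("MSN", r"msnbot"),
    ("search.ch", r"search\.ch"),
]


def matching_exception(user_agent, exceptions=UA_EXCEPTIONS):
    """Return the name of the first matching exception, or None."""
    for name, pattern in exceptions:
        if re.search(pattern, user_agent):
            return name
    return None
```

A request whose header contains "Googlebot" would return "Google" here,
which corresponds to the name that ends up in the SESSION variable
described in the list of environment variables.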
The translation handler sets the following environment variables that
can be used in CGI programs or template systems (e.g. SSI):
SESSION
the session identifier itself. In the example above
"s9NNNd:doBAYNNNiaNQOtNNNNNM" is assigned. If the "User-Agent" header
prevents session generation, the name of the matching regular
expression is assigned (see "ClickPathUAExceptions").
CGI_SESSION
a slash followed by the session prefix and the session identifier. In
the example above "/-S:s9NNNd:doBAYNNNiaNQOtNNNNNM" is assigned. If the
"User-Agent" header prevents session generation, "CGI_SESSION" is empty.
SESSION_START
the request time of the request that started the session, in seconds
since the epoch (1 January 1970).
CGI_SESSION_AGE
the session age in seconds, i.e. CURRENT_TIME - SESSION_START.
REMOTE_SESSION
if a friendly session was caught, this variable contains it (see
below).
REMOTE_SESSION_HOST
if a friendly session was caught, this variable contains the host it
belongs to (see below).
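A CGI program could read these variables from its environment, for example
to build a link that carries the session itself. The sketch below is
Python and purely illustrative. session_link() is a hypothetical helper,
and passing the environment as a plain mapping is a convenience for
testing.

```python
import os


def session_link(path, environ=None):
    """Prefix a site-relative path with CGI_SESSION, if one is set."""
    if environ is None:
        environ = os.environ
    # CGI_SESSION is empty when the User-Agent prevented a session,
    # so the path is then returned unchanged.
    return environ.get("CGI_SESSION", "") + path
```

With CGI_SESSION set to "/-S:s9NNNd:doBAYNNNiaNQOtNNNNNM", the call
session_link("/index.html") gives
"/-S:s9NNNd:doBAYNNNiaNQOtNNNNNM/index.html".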
The Output Filter
The output filter is skipped entirely if the translation handler has not
set the "CGI_SESSION" environment variable.
It prepends the session prefix and identifier to any "Location" and
"Refresh" output headers.
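The effect on a "Location" header can be pictured with the following
Python sketch. It is not the filter's actual code. This section does not
say how absolute URLs are treated, so leaving them untouched is an
assumption, as is the hypothetical name add_session_to_location().

```python
def add_session_to_location(location, cgi_session):
    """Prepend CGI_SESSION to a site-relative Location value."""
    # Skip when no session is set, mirroring the skipped filter.
    if not cgi_session:
        return location
    # Assumption: only paths on this site ("/...") are rewritten.
    if location.startswith("/") and not location.startswith("//"):
        return cgi_session + location
    return location
```

For a redirect to "/next.html" in a session whose CGI_SESSION is
"/-S:abc", the rewritten value would be "/-S:abc/next.html".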
If the output "Content-Type" is "text/html" the body part is modified.
In this case the filter patches the following HTML tags: