NAME
    ApacheBench - Perl API for Apache benchmarking and regression
    testing.

SYNOPSIS
      use ApacheBench;

      my $b = ApacheBench->new;

      # global configuration
      $b->config({
                  concurrency  => 5,
                  priority     => "run_priority",
                 });

      # add sequence(s) of URLs to request
      $b->add({
               repeat    => 10,
               cookie    => ["Login_Cookie=b3dcc9bac34b7e60;"],
               urls      => ["http://localhost/one", "http://localhost/two"],
               postdata  => [undef, undef],
               order     => "depth_first",
              });

      my $regress = $b->execute;

      # calculate hits/sec == (number of urls) * repeat * 1000 / total_time
      print 2*10*1000/$regress->{"total_time"}, " req/sec\n";

      # dump the entire regression hash (WARNING, this could be a LOT OF DATA)
      use Data::Dumper;
      my $d = Data::Dumper->new([$regress]);
      print $d->Dumpxs;

GOALS
    This project is meant to be the foundation for a complete
    benchmarking and regression testing suite for an advanced,
    transaction-based mod_perl site. We need to be able to stress
    our server to its limit while also having a way to verify the
    HTTP responses for correctness. Since our site is transaction-
    based (as opposed to content-based), we needed to extend the
    single-URL ab model to a multiple-URL sequence model.

    ApacheBench is based on the Apache 1.3.12 ab code
    (src/support/ab.c).

    Note: although this tool was designed to be used on an Apache
    mod_perl site, it is generally applicable to any HTTP-compliant
    server. Beware, however, that it sends a high volume of HTTP
    requests in a very short period of time, which may overwhelm
    some weaker HTTP server platforms like NT/IIS.

DESCRIPTION
    ApacheBench sends sequences of HTTP requests to an HTTP server
    and keeps track of the time taken to receive a response, the
    data that was returned, the size of the data that was returned,
    and various other bits of information.

    Since it is implemented in C, it sends HTTP requests in a tight
    loop which can stress your server to 100% capacity, especially
    if invoked in multiple concurrent instances. It gives accurate
    time measurements down to the millisecond for each HTTP request-
    response interval.

    Included is a simplified re-implementation of ab using the
    ApacheBench Perl API. This should help get you started with
    ApacheBench.

METHODS
    new()
        The constructor. It takes no arguments.

    config({ %global_params })
        Global configuration method. It should normally be invoked
        only once, because each call clobbers the previous
        configuration parameters. See the global configuration
        section for details on how %global_params should be
        structured.

    add({ %run_params })
        Run configuration method. Can be invoked multiple times.
        Each invocation will register a new benchmark run to be
        executed. See the run configuration section for details on
        how %run_params should be structured.

    execute()
        The execute method takes no arguments. It will send the HTTP
        requests and return a hash reference to the regression data.

CONFIGURING
    You need to tell ApacheBench what the requests will be, in what
    order to send them, and how to prioritize sending them.
    ApacheBench was designed to simulate many users logged in
    simultaneously, each of whom may be doing many different
    transactions at once.

  Object constructor

    First, you will need to construct an ApacheBench object.

    Example:

      use ApacheBench;
      my $b = ApacheBench->new;

  Global configuration

    Next, you need to set up global configuration parameters. These
    apply to all benchmarking runs associated with this ApacheBench
    object.

    Global configuration is done by calling the config() method with
    a reference to a hash containing the configuration parameters.
    The global configuration parameters are: concurrency, priority,
    repeat, and filesize.

    concurrency
        Number of requests to send simultaneously (default: 1)

    priority
        Either equal_opportunity or run_priority.

        If set to equal_opportunity, all benchmark runs that are
        configured (see below) under this ApacheBench object are
        given equal access to the concurrency level. This means
        requests are taken from each run and sent in parallel (the
        level of parallelism defined by concurrency).

        If set to run_priority, the benchmark runs that are
        configured first get the highest priority. This means all
        requests in the first run will be sent before any requests
        in the second run are sent, if the first run is configured
        as order => breadth_first (see below).

        (default: equal_opportunity)

    repeat
        The number of times to repeat the request sequence in each
        run. This can be overridden on a per-run basis (see below).

        (default: 1)

    filesize
        The maximum size of the buffer to store HTTP responses. If
        an HTTP response is received with a size larger than this
        limit, the content is truncated at length filesize, and a
        warning is issued. This can be overridden on a per-run basis
        (see below).

        (default: 16556)

    Example:

      $b->config({
                  concurrency => 4,
                  priority    => "run_priority",
                 });

  Run configuration

    Finally, you need to set up one or more benchmark runs. Each run
    is defined as an ordered sequence of HTTP requests to be sent to
    the server, which can be repeated multiple times, and scheduled
    to be sent in different ways. For each run you want to
    configure, call the add() method with a hash reference
    containing the following configuration parameters:

    repeat
        Number of times to repeat this request sequence.

        (default: 1, or whatever is specified in global config())

    cookie
        An array reference of length repeat containing HTTP cookies
        to send. This is meant to be used for login IDs. You could
        simulate N users all doing the same transaction
        simultaneously by giving N different login cookies here. If
        this option is omitted, no cookies will be sent in any of
        the requests for this run.

    urls
        An array reference containing the URLs in the sequence, in
        the order they should be requested.

    postdata
        An array reference containing data to POST to the
        corresponding URL in the urls sequence. For GET requests use
        undef. If this option is omitted, all requests for this run
        will be GET requests.

    order
        Either depth_first or breadth_first.

        breadth_first mode sends repeat of the first request in the
        urls list, then repeat of the second request in the urls
        list, then repeat of the third... and so on. (e.g. If repeat
        == 3 and urls contains two requests, then ApacheBench would
        send the first request 3 times, and then the second request
        3 times.)

        depth_first mode ensures that HTTP requests in the sequence
        are sent in order, completing a full sequence before
        starting again for the next repeat iteration. (e.g. If
        repeat == 3 and urls contains two requests, then ApacheBench
        would send the urls sequence in order, then send it again,
        and then again. A total of six requests would be sent.)

        (default: breadth_first)

        Note: if repeat == 1, or the length of urls is 1, then the
        order option has no effect.

    filesize
        The maximum size of the buffer to store HTTP responses. If
        an HTTP response is received with a size larger than this
        limit, the content is truncated at length filesize, and a
        warning is issued.

        (default: 16556, or whatever is specified in global
        config())

    Example:

      $b->add({
               repeat    => 10,
               cookie    => ["LoginID=b3dcc9bac34b7e60;"],
               urls      => ["http://localhost/one",
                             "http://localhost/two"],
               postdata  => [undef, undef],
               order     => "depth_first",
              });
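
    As a rough illustration of how the cookie, urls and postdata
    lists line up, the following plain-Perl sketch builds the hash
    reference for one run from a list of login IDs and a list of
    steps. build_run is a hypothetical helper, not part of
    ApacheBench. The second login ID and the form-encoded POST string
    are made-up placeholders, and the format of the POST data is an
    assumption.

```perl
use strict;
use warnings;

# Hypothetical helper: build the hash reference expected by add() for a
# set of simulated users. $ids is an array ref of login IDs and $steps is
# an array ref of [url, postdata] pairs (postdata undef means GET).
sub build_run {
    my ($ids, $steps, $order) = @_;
    return {
        repeat   => scalar @$ids,                      # one iteration per user
        cookie   => [ map { "LoginID=$_;" } @$ids ],   # length == repeat
        urls     => [ map { $_->[0] } @$steps ],
        postdata => [ map { $_->[1] } @$steps ],       # parallel to urls
        # This helper's own default (the module's default is breadth_first):
        # each simulated user walks the whole sequence in order.
        order    => $order || "depth_first",
    };
}

my $run = build_run(
    [ "b3dcc9bac34b7e60", "51f0a7d2c9e84b13" ],        # placeholder IDs
    [ [ "http://localhost/one", undef ],               # GET
      [ "http://localhost/two", "item=42&qty=1" ] ],   # POST (assumed format)
);
printf "%d iterations, %d URLs per iteration\n",
    $run->{repeat}, scalar @{ $run->{urls} };
```

    With the module installed, the result could then be passed
    straight to the add() method as $b->add($run).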

EXECUTING
    Simply call the execute() method, which returns a single hash
    reference after all runs are completed.

    Example:

      my $regress = $b->execute;

REGRESSION
    All data about the benchmark runs are stored in the hash
    reference returned by the execute() method.

  Top-level of regression hash

    The following global regression data is returned in the top-
    level of the hash.

    total_time
        Total time, in milliseconds, between the start of the first
        request in the first run, and the end of the final response
        in the final run.

    bytes_received
        Total bytes received from all responses in all runs.

    warn
        Various warning messages.

    run0 .. runN
        Hash reference to regression data for each run (see below).
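
    Since total_time is reported in milliseconds, overall throughput
    can be derived from it and the number of requests you
    configured. The following is a minimal sketch; the
    requests_per_second helper is hypothetical, and the $regress
    hash is a hand-built stand-in with made-up values rather than
    real output from execute().

```perl
use strict;
use warnings;

# Requests per second, given total_time in milliseconds and a list of
# [number_of_urls, repeat] pairs, one pair per configured run.
sub requests_per_second {
    my ($total_time_ms, @runs) = @_;
    return 0 unless $total_time_ms;          # avoid division by zero
    my $requests = 0;
    $requests += $_->[0] * $_->[1] for @runs;
    return $requests * 1000 / $total_time_ms;
}

# Stand-in for the hash returned by execute(); the values are made up.
my $regress = { total_time => 800, bytes_received => 108_590 };

# One run of 2 URLs repeated 10 times.
printf "%.1f req/sec\n", requests_per_second($regress->{total_time}, [2, 10]);
```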

  Regression data for each run

    Regression data for each run are separated into individual
    hashes labelled runI where I is the run number. Run numbers
    start at zero, so run0 will always exist. Each runI hash has the
    following keys: threads, plus one key for each individual URL in
    the urls sequence.

    threads
        This is an array reference containing data about each
        iteration of the URL sequence. A thread refers to a single
        iteration of the URL sequence for this run.

        Each element in the threads array reference is a hash
        reference containing the following data:

    connect_time
                An array reference the same length as urls which
                contains connection times, in milliseconds, for each
                URL in the sequence.

    total_time  An array reference the same length as urls which
                contains the total times taken to receive a response
                from the server, in milliseconds, for each URL in
                the sequence.

    headers     An array reference the same length as urls which
                contains the HTTP response headers returned by the
                server for each URL in the sequence.

    page_content
                An array reference the same length as urls which
                contains the full page content (including HTTP
                headers) returned by the server for each URL in the
                sequence.

                page_content is very useful for regression testing.
                You can use a parser relevant to your application
                (e.g. HTML::Parser, XML::Parser) to verify that the
                HTTP response is correct and expected for each step
                in the request sequence.

    http://url.first/ .. http://url.last/
        Each individual URL in the urls sequence is a key whose
        value is a hash reference containing the following keys:

    hostname    The hostname this request was sent to.

    port        The port this request was sent to.

    software    The Server: line from the HTTP response headers.

    doc_length  The length of the HTTP response document (not including
                headers). This should be equivalent to the
                Content-Length: line in the response headers, if the
                headers were set correctly.

    total_read  Total bytes read from the server for all repetitions of
                this URL.

    header      The HTTP response headers.

    page_content
                The full HTTP response, including headers.

    completed_requests
                Number of requests for this URL that completed
                successfully.

    failed_requests
                Number of requests for this URL that failed
                miserably.

    min_time    The minimum response time of all repetitions of this
                URL.

    max_time    The maximum response time of all repetitions of this
                URL.

    average_time
                The average response time of all repetitions of this
                URL.

    min_connect_time
                The minimum connect time of all repetitions of this
                URL.

    max_connect_time
                The maximum connect time of all repetitions of this
                URL.

    average_connect_time
                The average connect time of all repetitions of this
                URL.
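
    To show how the per-run data described above might be used for
    regression testing, here is a small sketch that walks one run
    and reports anything unexpected. check_run is a hypothetical
    helper, the expected patterns are arbitrary, and $run0 is a
    hand-built stand-in containing only a few of the documented
    keys, with made-up values.

```perl
use strict;
use warnings;

# Check one run of a regression hash. $expect is an array ref of regexes,
# one per position in the urls sequence. Returns a list of problem strings.
sub check_run {
    my ($run, $expect) = @_;
    my @problems;

    # Per-iteration check: does each response look as expected?
    my $threads = $run->{threads} || [];
    for my $t (0 .. $#$threads) {
        my $pages = $threads->[$t]{page_content} || [];
        for my $i (0 .. $#$expect) {
            my $page = defined $pages->[$i] ? $pages->[$i] : "";
            push @problems, "iteration $t, url $i: unexpected content"
                unless $page =~ $expect->[$i];
        }
    }

    # Per-URL check: every key other than "threads" is a URL.
    for my $url (sort grep { $_ ne "threads" } keys %$run) {
        my $failed = $run->{$url}{failed_requests} || 0;
        push @problems, "$url: $failed failed request(s)" if $failed;
    }
    return @problems;
}

# Hand-built stand-in for $regress->{run0}; all values are made up.
my $run0 = {
    threads => [
        { total_time   => [ 69, 39 ],
          page_content => [ "HTTP/1.1 200 OK\r\n\r\nWelcome home",
                            "HTTP/1.1 500 Internal Server Error\r\n\r\n" ] },
    ],
    "http://localhost/one" => { completed_requests => 1, failed_requests => 0 },
    "http://localhost/two" => { completed_requests => 0, failed_requests => 1 },
};

print "$_\n" for check_run($run0, [ qr/Welcome home/, qr/^HTTP\/\S+ 200/ ]);
```

    For real applications a proper parser is likely a better choice
    than the regular expressions used here.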

EXAMPLES
    The following examples of ApacheBench usage are paired with the
    resulting output from an Apache access_log. This should give you
    a feel for how the global priority parameter and the per-run
    order parameter affect how HTTP requests are sent.

    First, let's set global priority to its default
    equal_opportunity.

      my $b = ApacheBench->new;
      $b->config({
                  concurrency => 1,
                  priority    => "equal_opportunity",
                 });

    Add a single run and execute, then look at what gets sent to
    Apache.

      $b->add({
               repeat       => 3,
               urls         => [ "http://localhost/",
                                 "http://localhost/server-status" ],
               order        => "breadth_first",
              });
      $b->execute;

    Apache access_log output:

      127.0.0.1 - - [20/Sep/2000:18:43:32 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:18:43:32 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:18:43:32 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:18:43:32 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:18:43:32 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:18:43:32 -0400] "GET /server-status HTTP/1.0" 200 5294
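
    The order seen in this log follows from the order parameter. As
    a rough illustration, the following plain-Perl sketch expands a
    single run into its request sequence for both settings. It does
    not use ApacheBench at all: run_order is a hypothetical helper
    written for this example, and it models only a single run at
    concurrency 1.

```perl
use strict;
use warnings;

# Expand one run into the order its requests are sent, following the
# description of the order option: breadth_first repeats each URL in
# turn, depth_first repeats the whole sequence.
sub run_order {
    my ($order, $repeat, @urls) = @_;
    my @sent;
    if ($order eq "breadth_first") {
        push @sent, ($_) x $repeat for @urls;
    }
    else {  # depth_first
        push @sent, @urls for 1 .. $repeat;
    }
    return @sent;
}

for my $order ("breadth_first", "depth_first") {
    print "$order: ",
          join(" ", run_order($order, 3, "/", "/server-status")), "\n";
}
```

    The breadth_first line lists / three times and then
    /server-status three times, as in the log above. The depth_first
    line alternates between the two URLs.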

    Let's add another run and execute, and see what Apache sees.

      $b->add({
               repeat       => 3,
               urls         => [ "http://localhost/perl-status",
                                 "http://localhost/proxy-status" ],
               order        => "breadth_first",
              });
      $b->execute;

    Notice that both the first and second runs get equal
    opportunity. Apache access_log output:

      127.0.0.1 - - [20/Sep/2000:18:49:10 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:18:49:11 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:18:49:11 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:18:49:11 -0400] "GET /perl-status HTTP/1.0" 200 848
      127.0.0.1 - - [20/Sep/2000:18:49:11 -0400] "GET /perl-status HTTP/1.0" 200 848
      127.0.0.1 - - [20/Sep/2000:18:49:11 -0400] "GET /perl-status HTTP/1.0" 200 848
      127.0.0.1 - - [20/Sep/2000:18:49:11 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:18:49:11 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:18:49:11 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:18:49:11 -0400] "GET /proxy-status HTTP/1.0" 200 5886
      127.0.0.1 - - [20/Sep/2000:18:49:11 -0400] "GET /proxy-status HTTP/1.0" 200 5888
      127.0.0.1 - - [20/Sep/2000:18:49:11 -0400] "GET /proxy-status HTTP/1.0" 200 5889

    Now let's set global priority to run_priority.

      $b->config({
                  concurrency => 1,
                  priority    => "run_priority",
                 });
      $b->execute;

    Notice that now ApacheBench completes the entire first run
    before it starts the second. Here's the Apache access_log
    output:

      127.0.0.1 - - [20/Sep/2000:18:52:47 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:18:52:47 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:18:52:47 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:18:52:47 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:18:52:47 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:18:52:47 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:18:52:47 -0400] "GET /perl-status HTTP/1.0" 200 848
      127.0.0.1 - - [20/Sep/2000:18:52:47 -0400] "GET /perl-status HTTP/1.0" 200 848
      127.0.0.1 - - [20/Sep/2000:18:52:47 -0400] "GET /perl-status HTTP/1.0" 200 848
      127.0.0.1 - - [20/Sep/2000:18:52:47 -0400] "GET /proxy-status HTTP/1.0" 200 5858
      127.0.0.1 - - [20/Sep/2000:18:52:47 -0400] "GET /proxy-status HTTP/1.0" 200 5861
      127.0.0.1 - - [20/Sep/2000:18:52:47 -0400] "GET /proxy-status HTTP/1.0" 200 5864
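
    The difference between the two priority settings can be
    summarized with a toy model. The sketch below reproduces the
    request order of the two preceding logs for breadth_first runs
    at concurrency 1. It is inferred from those logs only, not from
    the ApacheBench source, and send_order is a hypothetical helper;
    behavior at higher concurrency levels is not modelled.

```perl
use strict;
use warnings;

# Toy model of the send order at concurrency 1 for breadth_first runs.
# Each run is { repeat => N, urls => [...] }. This mirrors the access_log
# listings in this section; it is not taken from the ApacheBench source.
sub send_order {
    my ($priority, @runs) = @_;
    my @sent;
    if ($priority eq "run_priority") {
        # Finish every URL position of a run before touching the next run.
        for my $run (@runs) {
            for my $url (@{ $run->{urls} }) {
                push @sent, ($url) x $run->{repeat};
            }
        }
    }
    else {  # equal_opportunity
        # Take one URL position from each run in turn.
        my $longest = 0;
        for my $run (@runs) {
            my $n = @{ $run->{urls} };
            $longest = $n if $n > $longest;
        }
        for my $i (0 .. $longest - 1) {
            for my $run (@runs) {
                next if $i > $#{ $run->{urls} };
                push @sent, ($run->{urls}[$i]) x $run->{repeat};
            }
        }
    }
    return @sent;
}

my @runs = (
    { repeat => 3, urls => [ "/", "/server-status" ] },
    { repeat => 3, urls => [ "/perl-status", "/proxy-status" ] },
);
print "$_: ", join(" ", send_order($_, @runs)), "\n"
    for "equal_opportunity", "run_priority";
```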

    Let's now create a new ApacheBench object with runs set to
    depth_first instead of breadth_first. With depth_first, the
    global priority option has no effect, since each run can only
    use a maximum of one concurrent server (by definition, it can
    only be sending one request at a time). So we'll just leave it
    set to run_priority.

      my $b = ApacheBench->new;
      $b->config({
                  concurrency => 1,
                  priority    => "run_priority",
                 });
      $b->add({
               repeat       => 3,
               urls         => [ "http://localhost/",
                                 "http://localhost/server-status" ],
               order        => "depth_first",
              });
      $b->add({
               repeat       => 3,
               urls         => [ "http://localhost/perl-status",
                                 "http://localhost/proxy-status" ],
               order        => "depth_first",
              });
      $b->execute;

    Notice each sequence gets sent in full before it repeats. Apache
    access_log output:

      127.0.0.1 - - [20/Sep/2000:19:02:01 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:19:02:01 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:19:02:01 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:19:02:01 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:19:02:01 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:19:02:01 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:19:02:01 -0400] "GET /perl-status HTTP/1.0" 200 848
      127.0.0.1 - - [20/Sep/2000:19:02:01 -0400] "GET /proxy-status HTTP/1.0" 200 5858
      127.0.0.1 - - [20/Sep/2000:19:02:01 -0400] "GET /perl-status HTTP/1.0" 200 848
      127.0.0.1 - - [20/Sep/2000:19:02:01 -0400] "GET /proxy-status HTTP/1.0" 200 5860
      127.0.0.1 - - [20/Sep/2000:19:02:01 -0400] "GET /perl-status HTTP/1.0" 200 848
      127.0.0.1 - - [20/Sep/2000:19:02:01 -0400] "GET /proxy-status HTTP/1.0" 200 5860

    Now let's send the same runs, but with a higher concurrency
    level.

      $b->config({
                  concurrency => 2,
                  priority    => "run_priority",
                 });
      my $regress = $b->execute;

    Notice that ApacheBench sends requests from all runs in order to
    fill up the specified level of concurrent requests. Apache
    access_log output:

      127.0.0.1 - - [20/Sep/2000:19:04:38 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:19:04:38 -0400] "GET /perl-status HTTP/1.0" 200 848
      127.0.0.1 - - [20/Sep/2000:19:04:38 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:19:04:38 -0400] "GET /proxy-status HTTP/1.0" 200 5891
      127.0.0.1 - - [20/Sep/2000:19:04:38 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:19:04:38 -0400] "GET /perl-status HTTP/1.0" 200 848
      127.0.0.1 - - [20/Sep/2000:19:04:38 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:19:04:38 -0400] "GET /proxy-status HTTP/1.0" 200 5878
      127.0.0.1 - - [20/Sep/2000:19:04:38 -0400] "GET / HTTP/1.0" 200 5565
      127.0.0.1 - - [20/Sep/2000:19:04:38 -0400] "GET /perl-status HTTP/1.0" 200 848
      127.0.0.1 - - [20/Sep/2000:19:04:38 -0400] "GET /server-status HTTP/1.0" 200 5294
      127.0.0.1 - - [20/Sep/2000:19:04:38 -0400] "GET /proxy-status HTTP/1.0" 200 5878

    We captured the regression data on that last execute() call, so
    let's take a look at it.

      my @ordinal = ("1st", "2nd", "3rd");
      for my $run (0 .. 1) {
          for my $i (0 .. 2) {
              print "response times (in ms) for run $run, $ordinal[$i] iteration:\n  ";
              print join("\n  ", @{$regress->{"run$run"}->{'threads'}->[$i]->{'total_time'}});
              print "\n";
          }
      }

    Perl output:

      response times (in ms) for run 0, 1st iteration:
        69
        39
      response times (in ms) for run 0, 2nd iteration:
        67
        39
      response times (in ms) for run 0, 3rd iteration:
        65
        41
      response times (in ms) for run 1, 1st iteration:
        67
        40
      response times (in ms) for run 1, 2nd iteration:
        66
        39
      response times (in ms) for run 1, 3rd iteration:
        65
        39

BUGS
    Error checking is a bit poor, so if you call the config() or
    add() methods incorrectly (e.g. with insufficient configuration
    parameters) you may experience a segmentation fault on
    execute().
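
    Until error checking improves, one way to reduce the risk is to
    sanity-check run parameters in Perl before handing them to
    add(). The sketch below is only an illustration:
    check_run_params is a hypothetical helper, and the rules it
    enforces (a non-empty urls list, a positive integer repeat,
    postdata the same length as urls, a known order value) are
    assumptions drawn from the run configuration section, not a
    list of what the C code actually requires.

```perl
use strict;
use warnings;

# Hypothetical pre-flight check for a hash reference destined for add().
# Returns a list of problems; an empty list means nothing was spotted.
sub check_run_params {
    my ($p) = @_;
    my @problems;

    my $urls = $p->{urls};
    unless (ref $urls eq "ARRAY" && @$urls) {
        return ("urls must be a non-empty array reference");
    }

    my $repeat = defined $p->{repeat} ? $p->{repeat} : 1;
    push @problems, "repeat must be a positive integer"
        unless $repeat =~ /^[1-9][0-9]*$/;

    # Assumption: postdata, when given, has one entry per URL.
    if (defined $p->{postdata}) {
        push @problems, "postdata must have one entry per URL"
            unless ref $p->{postdata} eq "ARRAY"
                && @{ $p->{postdata} } == @$urls;
    }

    if (defined $p->{order}) {
        push @problems, "order must be depth_first or breadth_first"
            unless $p->{order} =~ /^(?:depth|breadth)_first$/;
    }
    return @problems;
}

my @problems = check_run_params({
    repeat   => 10,
    urls     => [ "http://localhost/one", "http://localhost/two" ],
    postdata => [ undef ],          # too short: two URLs, one entry
    order    => "depth-first",      # typo: hyphen instead of underscore
});
print "$_\n" for @problems;
```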

    The page_content of any response which is larger than the
    filesize applicable to it will be truncated to zero length. This
    is contrary to the filesize documentation above, which says the
    content is truncated at length filesize, and it should be fixed
    ASAP. For now, just set your filesize big enough for the
    largest page you anticipate receiving in any run.

    ApacheBench may consume quite a lot of memory in some cases,
    depending on how big your runs are, because it stores all HTTP
    response data in memory.

    It has not been tested on any platforms other than Linux /
    Apache. Please send ports to other platforms to me (Adi
    Fairbank).

AUTHOR
    The ApacheBench Perl API was written by Ling Wu
    <ling@certsite.com> with guidance and moral support from Adi
    Fairbank <adi@certsite.com>.

    The ApacheBench Perl API is based on code from Apache 1.3.12 ab
    (src/support/ab.c), by the Apache group.

    The simplified re-implementation of ab, included with this
    distribution, was written by Adi Fairbank.

    Documentation for the ApacheBench Perl API was written by Adi
    Fairbank.

    Please e-mail either Adi or Ling with bug reports, or preferably
    patches.

LICENSE
    This package is free software and is provided AS IS without
    express or implied warranty. It may be used, redistributed
    and/or modified under the terms of the Perl Artistic License
    (http://www.perl.com/perl/misc/Artistic.html).

LAST MODIFIED
    Sep 20, 2000