Apache HTTP Server Version 2.4

Available Languages: en
There are a number of common pitfalls encountered when writing output filters; this page aims to document best practice for authors of new or existing filters.
This document is applicable to both version 2.0 and version 2.2
    of the Apache HTTP Server; it specifically targets
    RESOURCE-level or CONTENT_SET-level
    filters, though some advice is generic to all types of filter.

 Filters and bucket brigades
 Filter invocation
 Brigade structure
 Processing buckets
 Filtering brigades
 Maintaining state
 Buffering buckets
 Non-blocking bucket reads
 Ten rules for output filters
 Use case: buffering in mod_ratelimit

Each time a filter is invoked, it is passed a bucket
    brigade, containing a sequence of buckets which
    represent both data content and metadata.  Every bucket has a
    bucket type; a number of bucket types are defined and
    used by the httpd core modules (and the
    apr-util library which provides the bucket brigade
    interface), but modules are free to define their own types.
A filter can tell whether a bucket represents either data or
    metadata using the APR_BUCKET_IS_METADATA macro.
    Generally, all metadata buckets should be passed down the filter
    chain by an output filter.  Filters may transform, delete, and
    insert data buckets as appropriate.
There are two metadata bucket types which all filters must pay
    attention to: the EOS bucket type, and the
    FLUSH bucket type.  An EOS bucket
    indicates that the end of the response has been reached and no
    further buckets need be processed.  A FLUSH bucket
    indicates that the filter should flush any buffered buckets (if
    applicable) down the filter chain immediately.
FLUSH buckets are sent when the
    content generator (or an upstream filter) knows that there may be
    a delay before more content can be sent.  By passing
    FLUSH buckets down the filter chain immediately,
    filters ensure that the client is not kept waiting for pending
    data longer than necessary.
Filters can create FLUSH buckets and pass these
    down the filter chain if desired.  Generating FLUSH
    buckets unnecessarily, or too frequently, can harm network
    utilisation since it may force large numbers of small packets to
    be sent, rather than a small number of larger packets.  The
    section on Non-blocking bucket reads
    covers a case where filters are encouraged to generate
    FLUSH buckets.
    HEAP FLUSH FILE EOS
This shows a bucket brigade which may be passed to a filter; it
    contains two metadata buckets (FLUSH and
    EOS), and two data buckets (HEAP and
    FILE).
For any given request, an output filter might be invoked only once and be given a single brigade representing the entire response. It is also possible that the number of times a filter is invoked for a single response is proportional to the size of the content being filtered, with the filter being passed a brigade containing a single bucket each time. Filters must operate correctly in either case.
An output filter can distinguish the final invocation for a
    given response by the presence of an EOS bucket in
    the brigade.  Any buckets in the brigade after an EOS should be
    ignored.
An output filter should never pass an empty brigade down the filter chain. To be defensive, filters should be prepared to accept an empty brigade, and should return success without passing this brigade on down the filter chain. The handling of an empty brigade should have no side effects (such as changing any state private to the filter).
apr_status_t dummy_filter(ap_filter_t *f, apr_bucket_brigade *bb)
{
    if (APR_BRIGADE_EMPTY(bb)) {
        return APR_SUCCESS;
    }
    ...
A bucket brigade is a doubly-linked list of buckets.  The list
    is terminated (at both ends) by a sentinel which can be
    distinguished from a normal bucket by comparing it with the
    pointer returned by APR_BRIGADE_SENTINEL.  The list
    sentinel is in fact not a valid bucket structure; any attempt to
    call normal bucket functions (such as
    apr_bucket_read) on the sentinel will have undefined
    behaviour (typically a crash of the process).
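The sentinel-terminated layout described above can be sketched with a small toy ring in plain C. This is only an illustration of the idea, not the APR implementation: the names toy_node, toy_ring, toy_ring_init, toy_ring_insert_tail and toy_ring_sum are hypothetical, and unlike the real brigade sentinel, the toy sentinel is a full node whose value field is simply never read.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types are NOT the APR bucket brigade API. */
struct toy_node {
    struct toy_node *next;
    struct toy_node *prev;
    int value;
};

struct toy_ring {
    struct toy_node sentinel;   /* marks both ends; holds no data */
};

/* An empty ring is a sentinel that points at itself in both directions. */
void toy_ring_init(struct toy_ring *r)
{
    r->sentinel.next = &r->sentinel;
    r->sentinel.prev = &r->sentinel;
}

/* Link n in just before the sentinel, i.e. at the tail of the list. */
void toy_ring_insert_tail(struct toy_ring *r, struct toy_node *n)
{
    assert(n != NULL);
    n->prev = r->sentinel.prev;
    n->next = &r->sentinel;
    r->sentinel.prev->next = n;
    r->sentinel.prev = n;
}

/* Walk from the first element until the sentinel is reached again.
 * The sentinel itself is compared by address and never dereferenced
 * as if it carried data. */
int toy_ring_sum(const struct toy_ring *r)
{
    const struct toy_node *e;
    int sum = 0;

    for (e = r->sentinel.next; e != &r->sentinel; e = e->next) {
        sum += e->value;
    }
    return sum;
}
```

The loop condition plays the same role as comparing against APR_BRIGADE_SENTINEL(bb) in a real filter: the traversal stops on the sentinel's address rather than on a NULL pointer.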
There are a variety of functions and macros for traversing and manipulating bucket brigades; see the apr_buckets.h header for complete coverage. Commonly used macros include:
    APR_BRIGADE_FIRST(bb)
    APR_BRIGADE_LAST(bb)
    APR_BUCKET_NEXT(e)
    APR_BUCKET_PREV(e)
The apr_bucket_brigade structure itself is
    allocated out of a pool, so if a filter creates a new brigade, it
    must ensure that memory use is correctly bounded.  A filter which
    allocates a new brigade out of the request pool
    (r->pool) on every invocation, for example, will fall
    foul of the warning above concerning
    memory use.  Such a filter should instead create a brigade on the
    first invocation per request, and store that brigade in its state structure.
It is rarely advisable to use
    apr_brigade_destroy to "destroy" a brigade unless
    you know for certain that the brigade will never be used
    again; even then, it should be used sparingly.  The
    memory used by the brigade structure will not be released by
    calling this function (since it comes from a pool), but the
    associated pool cleanup is unregistered.  Using
    apr_brigade_destroy can in fact cause memory leaks;
    if a "destroyed" brigade contains buckets when its
    containing pool is destroyed, those buckets will not be
    immediately destroyed.
In general, filters should use apr_brigade_cleanup
    in preference to apr_brigade_destroy.
When dealing with non-metadata buckets, it is important to
    understand that the "apr_bucket *" object is an
    abstract representation of data:
- The length of a bucket may be unknown, in which case the ->length field is set to
      the value (apr_size_t)-1.  For example, buckets of
      the PIPE bucket type have an indeterminate length;
      they represent the output from a pipe.
- The data represented by a bucket may not be mapped into memory.  The FILE bucket type, for example,
      represents data stored in a file on disk.
Filters read the data from a bucket using the
    apr_bucket_read function.  When this function is
    invoked, the bucket may morph into a different bucket
    type, and may also insert a new bucket into the bucket brigade.
    This must happen for buckets which represent data not mapped into
    memory.
For example, consider a bucket brigade containing a
    single FILE bucket representing an entire file, 24
    kilobytes in size:
FILE(0K-24K)
When this bucket is read, it will read a block of data from the
    file, morph into a HEAP bucket to represent that
    data, and return the data to the caller.  It also inserts a new
    FILE bucket representing the remainder of the file;
    after the apr_bucket_read call, the brigade looks
    like:
HEAP(8K) FILE(8K-24K)
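The morphing step can be mimicked with a small toy model in plain C. This is a sketch under stated assumptions rather than the APR implementation: toy_bucket, toy_read and TOY_BLOCK_SIZE are hypothetical names, the brigade is modelled as a plain array, and the block size is set to 8 KB only to match the figures in the example above.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these types are NOT the APR bucket API. */
enum toy_type { TOY_FILE, TOY_HEAP };

struct toy_bucket {
    enum toy_type type;
    size_t start;    /* offset of the first byte represented */
    size_t length;   /* number of bytes represented */
};

/* Hypothetical read block size, chosen to match the 8K in the text. */
#define TOY_BLOCK_SIZE ((size_t)8 * 1024)

/*
 * "Read" the bucket at brigade[i].  A FILE bucket morphs into a HEAP
 * bucket covering at most TOY_BLOCK_SIZE bytes; if part of the file is
 * left over, a new FILE bucket for the remainder is inserted right
 * after it.  Returns the new number of buckets in the brigade.
 */
size_t toy_read(struct toy_bucket *brigade, size_t count,
                size_t capacity, size_t i)
{
    struct toy_bucket *b;
    size_t j;

    assert(i < count);
    b = &brigade[i];
    if (b->type != TOY_FILE || b->length <= TOY_BLOCK_SIZE) {
        b->type = TOY_HEAP;          /* the whole bucket is now in memory */
        return count;
    }

    assert(count < capacity);
    /* Shift later buckets up to make room for the remainder bucket. */
    for (j = count; j > i + 1; j--) {
        brigade[j] = brigade[j - 1];
    }
    brigade[i + 1].type = TOY_FILE;
    brigade[i + 1].start = b->start + TOY_BLOCK_SIZE;
    brigade[i + 1].length = b->length - TOY_BLOCK_SIZE;

    b->type = TOY_HEAP;
    b->length = TOY_BLOCK_SIZE;
    return count + 1;
}
```

Starting from a single FILE bucket of 24 KB, one call leaves a HEAP bucket of 8 KB followed by a FILE bucket covering the remaining 16 KB, which is the HEAP(8K) FILE(8K-24K) layout shown above.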
The basic function of any output filter will be to iterate through the passed-in brigade and transform (or simply examine) the content in some manner. The implementation of the iteration loop is critical to producing a well-behaved output filter.
Consider an example which loops through the entire brigade as follows:
apr_bucket *e = APR_BRIGADE_FIRST(bb);
const char *data;
apr_size_t length;
while (e != APR_BRIGADE_SENTINEL(bb)) {
    apr_bucket_read(e, &data, &length, APR_BLOCK_READ);
    e = APR_BUCKET_NEXT(e);
}
return ap_pass_brigade(f->next, bb);
The above implementation would consume memory proportional to
    content size.  If passed a FILE bucket, for example,
    the entire file contents would be read into memory as each
    apr_bucket_read call morphed a FILE
    bucket into a HEAP bucket.
In contrast, the implementation below will consume a fixed amount of memory to filter any brigade. It needs a temporary brigade, which must be allocated only once per response; see the Maintaining state section.
apr_bucket *e;
const char *data;
apr_size_t length;
apr_status_t rv;
while ((e = APR_BRIGADE_FIRST(bb)) != APR_BRIGADE_SENTINEL(bb)) {
    rv = apr_bucket_read(e, &data, &length, APR_BLOCK_READ);
    if (rv) ...;
    /* Remove bucket e from bb. */
    APR_BUCKET_REMOVE(e);
    /* Insert it into the temporary brigade. */
    APR_BRIGADE_INSERT_HEAD(tmpbb, e);
    /* Pass brigade downstream. */
    rv = ap_pass_brigade(f->next, tmpbb);
    if (rv) ...;
    apr_brigade_cleanup(tmpbb);
}
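The difference in memory behaviour between the two loops can be illustrated with a rough byte-counting model in plain C. This is a sketch only: peak_read_all, peak_read_and_pass and TOY_BLOCK_SIZE are hypothetical names, the 8 KB block size is an assumption taken from the earlier FILE bucket example, and the model assumes that downstream filters release each block once it has been passed on.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical read block size; not an APR constant. */
#define TOY_BLOCK_SIZE ((size_t)8 * 1024)

/* Size of the next block read from a file with `remaining` bytes left. */
static size_t toy_next_block(size_t remaining)
{
    return remaining < TOY_BLOCK_SIZE ? remaining : TOY_BLOCK_SIZE;
}

/* First loop: every block is read before the brigade is passed on,
 * so nothing is released until the whole file is in memory. */
size_t peak_read_all(size_t file_size)
{
    size_t in_memory = 0;
    size_t remaining = file_size;

    while (remaining > 0) {
        size_t n = toy_next_block(remaining);
        in_memory += n;          /* block stays in the brigade */
        remaining -= n;
    }
    return in_memory;
}

/* Second loop: each block is passed downstream and cleaned up before
 * the next one is read, so only one block is held at a time. */
size_t peak_read_and_pass(size_t file_size)
{
    size_t peak = 0;
    size_t remaining = file_size;

    while (remaining > 0) {
        size_t n = toy_next_block(remaining);
        assert(n > 0);
        if (n > peak) {
            peak = n;            /* only the current block is held */
        }
        remaining -= n;          /* block passed on and released */
    }
    return peak;
}
```

For a 24 KB file the first function reports 24576 bytes held at the end, while the second never exceeds 8192 bytes regardless of the file size.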
A filter which needs to maintain state over multiple
    invocations per response can use the ->ctx field of
    its ap_filter_t structure.  It is typical to store a
    temporary brigade in such a structure, to avoid having to allocate
    a new brigade per invocation as described in the Brigade structure section.
struct dummy_state {
    apr_bucket_brigade *tmpbb;
    int filter_state;
    ...
};
apr_status_t dummy_filter(ap_filter_t *f, apr_bucket_brigade *bb)
{
    struct dummy_state *state;
    state = f->ctx;
    if (state == NULL) {
        /* First invocation for this response: initialise state structure.
         */
        f->ctx = state = apr_palloc(f->r->pool, sizeof *state);
        state->tmpbb = apr_brigade_create(f->r->pool, f->c->bucket_alloc);
        state->filter_state = ...;
    }
    ...
If a filter decides to store buckets beyond the duration of a
    single filter function invocation (for example storing them in its
    ->ctx state structure), those buckets must be set
    aside.  This is necessary because some bucket types provide
    buckets which represent temporary resources (such as stack memory)
    which will fall out of scope as soon as the filter chain completes
    processing the brigade.
To set aside a bucket, the apr_bucket_setaside
    function can be called.  Not all bucket types can be set aside, but
    if successful, the bucket will have morphed to ensure it has a
    lifetime at least as long as the pool given as an argument to the
    apr_bucket_setaside function.
Alternatively, the ap_save_brigade function can be
    used, which will move all the buckets into a separate brigade
    containing buckets with a lifetime as long as the given pool
    argument.  This function must be used with care, taking into
    account the following points:
- On return, ap_save_brigade guarantees that all
      the buckets in the returned brigade will represent data mapped
      into memory.  If given an input brigade containing, for example,
      a PIPE bucket, ap_save_brigade will
      consume an arbitrary amount of memory to store the entire output
      of the pipe.
- When ap_save_brigade reads from buckets which
      cannot be set aside, it will always perform blocking reads,
      removing the opportunity to use Non-blocking
      bucket reads.
- If ap_save_brigade is used without passing a
      non-NULL "saveto" (destination) brigade parameter,
      the function will create a new brigade, which may cause memory
      use to be proportional to content size as described in the Brigade structure section.
The apr_bucket_read function takes an
    apr_read_type_e argument which determines whether a
    blocking or non-blocking read will be performed
    from the data source.  A good filter will first attempt to read
    from every data bucket using a non-blocking read; if that fails
    with APR_EAGAIN, then send a FLUSH
    bucket down the filter chain, and retry using a blocking read.
This mode of operation ensures that any filters further down the filter chain will flush any buffered buckets if a slow content source is being used.
A CGI script is an example of a slow content source which is
    implemented as a bucket type. mod_cgi will send
    PIPE buckets which represent the output from a CGI
    script; reading from such a bucket will block when waiting for the
    CGI script to produce more output.
apr_bucket *e;
const char *data;
apr_size_t length;
apr_read_type_e mode = APR_NONBLOCK_READ;
while ((e = APR_BRIGADE_FIRST(bb)) != APR_BRIGADE_SENTINEL(bb)) {
    apr_status_t rv;
    rv = apr_bucket_read(e, &data, &length, mode);
    if (rv == APR_EAGAIN && mode == APR_NONBLOCK_READ) {
        /* Pass down a brigade containing a flush bucket: */
        APR_BRIGADE_INSERT_TAIL(tmpbb, apr_bucket_flush_create(...));
        rv = ap_pass_brigade(f->next, tmpbb);
        apr_brigade_cleanup(tmpbb);
        if (rv != APR_SUCCESS) return rv;
        /* Retry, using a blocking read. */
        mode = APR_BLOCK_READ;
        continue;
    }
    else if (rv != APR_SUCCESS) {
        /* handle errors */
    }
    /* Next time, try a non-blocking read first. */
    mode = APR_NONBLOCK_READ;
    ...
}
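The control flow of the loop above can be exercised without httpd using a toy source in plain C. This is a simplified sketch, not the APR API: toy_source, toy_read and toy_drain are hypothetical names, a "flush" is just a counter increment, and a blocking read is assumed to always succeed after waiting.

```c
#include <assert.h>
#include <stddef.h>

/* Toy model only: these are NOT the APR read modes or status codes. */
enum toy_mode { TOY_NONBLOCK, TOY_BLOCK };
enum toy_status { TOY_OK, TOY_EAGAIN };

/* A source of `count` chunks; ready[i] is nonzero if chunk i is already
 * available at the moment it is first asked for. */
struct toy_source {
    const int *ready;
    size_t count;
    size_t pos;
};

static enum toy_status toy_read(struct toy_source *s, enum toy_mode mode)
{
    assert(s->pos < s->count);
    if (mode == TOY_NONBLOCK && !s->ready[s->pos]) {
        return TOY_EAGAIN;       /* the read would have to wait */
    }
    s->pos++;                    /* a blocking read waits, then succeeds */
    return TOY_OK;
}

/* Drain the source; return how many flushes were sent downstream. */
size_t toy_drain(struct toy_source *s)
{
    enum toy_mode mode = TOY_NONBLOCK;
    size_t flushes = 0;

    while (s->pos < s->count) {
        if (toy_read(s, mode) == TOY_EAGAIN) {
            flushes++;           /* stands in for passing a FLUSH bucket */
            mode = TOY_BLOCK;    /* retry with a blocking read */
            continue;
        }
        mode = TOY_NONBLOCK;     /* next chunk: try non-blocking first */
    }
    return flushes;
}
```

In this model one flush is issued for each chunk that was not ready when first requested, and none at all when the source never stalls, which is the behaviour the rule is meant to achieve.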
In summary, here is a set of rules for all output filters to follow:
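1. Output filters should not pass empty brigades down the filter chain, but should be tolerant of being passed empty brigades.
2. Output filters must pass all metadata buckets down the filter chain; FLUSH buckets should be respected by passing any pending or buffered buckets down the filter chain.
3. Output filters should ignore any buckets following an EOS bucket.
4. Output filters must process a fixed amount of data at a time, to ensure that memory consumption is not proportional to the size of the content being filtered.
5. Output filters should be agnostic with respect to bucket types, and must be able to process buckets of unfamiliar type.
6. After calling ap_pass_brigade to pass a brigade down the filter chain, output filters should call apr_brigade_cleanup to ensure the brigade is empty before reusing that brigade structure; output filters should never use apr_brigade_destroy to "destroy" brigades.
7. Output filters must set aside any buckets which are preserved beyond the duration of the filter function.
8. Output filters must not ignore the return value of ap_pass_brigade, and must return appropriate errors back up the filter chain.
9. Output filters must only create a fixed number of bucket brigades for each response, rather than one per invocation.
10. Output filters should first attempt non-blocking reads from each data bucket, and send a FLUSH bucket down the filter chain if the read blocks, before retrying with a blocking read.

The r1833875 change is a good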
    example to show what buffering and keeping state means in the context of an
    output filter. In this use case, a user asked on the users' mailing list an
    interesting question about why mod_ratelimit seemed not to
    honor its setting with proxied content (either rate limiting at a different
    speed or simply not doing it at all). Before diving deep into the solution,
    it is better to explain at a high level how mod_ratelimit works.
    The trick is really simple: take the rate limit settings and calculate a
    chunk size of data to flush every 200ms to the client. For example, suppose
    we set rate-limit 60 in our config; these are the high level
    steps to find the chunk size:
/* milliseconds to wait between each flush of data */
RATE_INTERVAL_MS = 200;
/* rate limit speed in bytes/s */
speed = 60 * 1024;
/* final chunk size is 12288 bytes */
chunk_size = (speed / (1000 / RATE_INTERVAL_MS));
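The arithmetic in the steps above can be checked with a few lines of plain C. This is a minimal sketch: toy_chunk_size is a hypothetical helper, not a function from mod_ratelimit, and it assumes the rate-limit value is expressed in KiB per second, as the multiplication by 1024 in the steps suggests.

```c
#include <assert.h>

/* Milliseconds between flushes, as given in the text. */
#define RATE_INTERVAL_MS 200

/* Hypothetical helper: bytes to flush per interval for a limit given
 * in KiB/s (e.g. 60 for "rate-limit 60"). */
long toy_chunk_size(long rate_kib)
{
    long speed;

    assert(rate_kib > 0);
    speed = rate_kib * 1024;                    /* bytes per second */
    return speed / (1000 / RATE_INTERVAL_MS);   /* 5 intervals per second */
}
```

For a limit of 60 this gives 61440 bytes per second spread over five intervals, so toy_chunk_size(60) evaluates to 12288 bytes per flush.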
If we apply this calculation to a bucket brigade carrying 38400 bytes, it means that the filter will try to do the following:
The above pseudo code works fine if the output filter handles only one brigade
    for each response, but it might happen that it needs to be called multiple times
    with different brigade sizes as well. The former use case is for example when
    httpd directly serves some content, like a static file: the bucket brigade
    abstraction takes care of handling the whole content, and rate limiting
    works nicely. But if the same static content is served via mod_proxy_http (for
    example a backend is serving it rather than httpd) then the content generator
    (in this case mod_proxy_http) may use a maximum buffer size and then send data
    as bucket brigades to the output filters chain regularly, triggering of course
    multiple calls to mod_ratelimit. If the reader tries to execute the pseudo code
    assuming multiple calls to the output filter, each one processing
    a bucket brigade of 38400 bytes, then it is easy to spot some
    anomalies:
In this case, two things might help:
- Use the filter's ctx state structure, initialised by mod_ratelimit
        for each response handling cycle, to "remember" when the last sleep was
        performed across multiple invocations, and act accordingly.
- If a brigade cannot be split into a whole number of chunks, store the remaining bytes in a temporary brigade and use ap_save_brigade to set them aside.
        These bytes will be prepended to the next bucket brigade that will be handled
        in the subsequent invocation.
The commit linked at the beginning of the section also contains some code refactoring, so it is not trivial to read on the first pass, but the overall idea is basically what has been described so far. The goal of this section is not to give the reader a headache from reading C code, but to put them into the right mindset needed to use the tools offered by httpd's filter chain efficiently.
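The idea of carrying leftover bytes from one invocation to the next can be sketched as a byte-counting model in plain C. This is not the code from mod_ratelimit: toy_rl_ctx and toy_rl_invoke are hypothetical names, brigades are reduced to byte counts, and the sleep between chunks (and remembering when the last sleep happened) is deliberately left out.

```c
#include <assert.h>
#include <stddef.h>

/* Toy stand-in for the per-response state kept in the filter's ctx. */
struct toy_rl_ctx {
    size_t chunk_size;    /* bytes to flush per interval */
    size_t held;          /* bytes set aside by the previous invocation */
    size_t chunks_sent;   /* full chunks flushed so far */
};

/*
 * One filter invocation for a brigade of `len` bytes.  Bytes held back
 * earlier are counted first, full chunks are flushed, and the tail is
 * set aside again.  At end of stream (eos != 0) the tail is flushed as
 * a final short chunk.  Returns the bytes flushed by this invocation.
 */
size_t toy_rl_invoke(struct toy_rl_ctx *ctx, size_t len, int eos)
{
    size_t total, full, sent;

    assert(ctx->chunk_size > 0);
    total = ctx->held + len;           /* prepend the set-aside bytes */
    full = total / ctx->chunk_size;    /* whole chunks available now */
    sent = full * ctx->chunk_size;

    ctx->chunks_sent += full;
    ctx->held = total - sent;          /* tail waits for the next call */

    if (eos && ctx->held > 0) {
        sent += ctx->held;             /* nothing more will arrive */
        ctx->held = 0;
    }
    return sent;
}
```

With a chunk size of 12288 bytes and two brigades of 38400 bytes each, the first invocation flushes three full chunks and holds back 1536 bytes; the second starts from those 1536 bytes, so the chunk boundaries continue across invocations instead of restarting with every brigade.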