Filters

A filter specification defines what conditions have to be met and what actions are taken when a condition is met.

Filter specification

A filter specification is usually written as a single-line YAML mapping, for example:

{filter: rbl, org: bl.spamcop.net, action: delete}

where the filter clause specifies the name of the filter to apply, in this case “rbl”, a filter that checks whether the sending mail server’s IP is registered as a spam source. The remaining clauses are parameters for the filter. The action clause is mandatory for all filters. In the example, the org clause, which rbl filters require, specifies which RBL service to query.
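As a rough illustration of how such a specification breaks down, the Python sketch below separates the filter name, the mandatory action, and the remaining parameters. It assumes the YAML has already been parsed into a dictionary; `split_spec` is a hypothetical helper and not part of the program.

```python
def split_spec(spec):
    """Split a parsed filter specification into (name, action, parameters).

    Hypothetical helper for illustration only.
    """
    params = dict(spec)  # copy so the caller's mapping is left untouched
    if "filter" not in params:
        raise ValueError("filter clause is missing")
    if "action" not in params:
        # The action clause is mandatory for all filters.
        raise ValueError("action clause is missing")
    name = params.pop("filter")
    action = params.pop("action")
    return name, action, params


name, action, params = split_spec(
    {"filter": "rbl", "org": "bl.spamcop.net", "action": "delete"}
)
# name is "rbl", action is "delete", params is {"org": "bl.spamcop.net"}
```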

Filter lists

Since a filter definition can easily occupy a whole line, it is more convenient to put each one on its own line when specifying multiple filters for a folder. YAML’s block style is better suited to this. A list of filters would then look like this:

folders:
    - INBOX
    -
        - {filter: header, part: subject, check: "has cana?dia?n\s+pha?rma?cy", action: delete}
        - {filter: header, part: from, check: "is-not .*@goodguy.org", action: delete}
        - {filter: header, part: subject, check: "is \[mail-list\].*", action: move-to mail-list}

But nothing prevents you from writing the above list in flow style, for example:

folders: [INBOX, [
        {filter: header, part: subject, check: "has cana?dia?n\s+pha?rma?cy", action: delete},
        {filter: header, part: from, check: "is-not .*@goodguy.org", action: delete},
        {filter: header, part: subject, check: "is \[mail-list\].*", action: move-to mail-list}]]

Nested filter lists

YAML provides a good deal of flexibility in data definitions. One particularly useful feature is anchors. To make use of them, filter lists can be nested. Combining the two makes it easy to define switchable alternatives, for example:

my-alternatives-list:
    - &alternative1
        - {filter: header, part: subject, check: "has cana?dia?n\s+pha?rma?cy", action: delete}
        - {filter: header, part: from, check: "is-not .*@goodguy.org", action: delete}
        - {filter: header, part: subject, check: "is \[mail-list\].*", action: move-to mail-list}
    - &alternative2
        - {filter: header, part: subject, check: "has cana?dia?n\s+pha?rma?cy", action: move-to INBOX/Spam}
        - {filter: header, part: from, check: "is-not .*@goodguy.org", action: move-to INBOX/Spam}
        - {filter: header, part: subject, check: "is \[mail-list\].*", action: move-to INBOX/Spam}

accounts:
    Google Mail:
        address: imap.gmail.com
        username: john.doe@gmail.com
        password: _My_PaSswOrd_
        ssl: true
        folders:
            - [INBOX, *alternative1]

Thus the filter rules in the Google Mail account can be switched quickly from alternative1 to alternative2 by changing the alias. The program sees the filter rules as a list of lists, which serves no purpose in itself; the nesting is only there to make use of YAML’s anchor feature. The possibilities are endless. Consider:

my-sub-alternatives:
    - &all-rules
        - &special-rule
            - {filter: header, part: subject, check: "has cana?dia?n\s+pha?rma?cy", action: delete}
        - &main-rules
            - {filter: header, part: from, check: "is-not .*@goodguy.org", action: move-to INBOX/Spam}
            - {filter: header, part: subject, check: "is \[mail-list\].*", action: move-to INBOX/Spam}

You can switch between all-rules, main-rules and special-rule in an instant.
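Because anchors produce lists nested inside lists, a consumer of the parsed configuration has to cope with arbitrary nesting. The Python sketch below shows one way this could be handled, by flattening the structure into a single list of filter mappings. It is an assumption about how nested lists might be processed, and `flatten_filters` is a hypothetical name.

```python
def flatten_filters(rules):
    """Recursively flatten nested filter lists into one flat list of mappings."""
    flat = []
    for item in rules:
        if isinstance(item, list):
            # A nested list (for example the target of an anchor): descend into it.
            flat.extend(flatten_filters(item))
        else:
            flat.append(item)
    return flat


# Structure a YAML parser would produce for the all-rules anchor above,
# abbreviated to the keys needed here.
all_rules = [
    [{"filter": "header", "part": "subject", "action": "delete"}],
    [
        {"filter": "header", "part": "from", "action": "move-to INBOX/Spam"},
        {"filter": "header", "part": "subject", "action": "move-to INBOX/Spam"},
    ],
]

flat = flatten_filters(all_rules)  # three filter mappings, original order kept
```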

Action routines

Notice the repetition in the previous examples’ action arguments. It would get worse if a list of actions had to be run. Fortunately, YAML anchors solve this problem as well:

my-actions: &my-actions
    - report-pyzor
    - report-badips
    - delete

my-rules:
    - {filter: header, part: from, check: "is-not .*@goodguy.org", action: *my-actions}
    - {filter: header, part: subject, check: "is \[mail-list\].*", action: *my-actions}

Action clause

The action clause, required by all filters, may be a single item or a list of items. Since it is the same for all filters, it is not mentioned in the descriptions below. See Actions for a description of actions.
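Since the action clause may hold either a single item or a list, code that consumes it needs to treat both forms alike. The Python sketch below normalizes the clause to a list and splits an action such as `move-to INBOX/Spam` into a name and an argument. Both helpers are hypothetical, and splitting at the first space is an assumption based on the examples in this document.

```python
def normalize_actions(action):
    """Return the action clause as a list, whether it was one item or several."""
    return [action] if isinstance(action, str) else list(action)


def split_action(action):
    """Split an action string into (name, argument); argument is None if absent."""
    name, _, argument = action.partition(" ")
    return name, (argument.strip() or None)


single = normalize_actions("delete")
several = normalize_actions(["report-badips", "report-pyzor", "delete"])
move = split_action("move-to INBOX/Spam")
```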

Built-in filters

The following filters are included in this package. More filters can be added through extensions.

Filter: header

Arguments: part, check

The part argument specifies which header is to be examined. Common values are from (sender), to (recipient), subject, and date (actually the time), but in general any field occurring in the header section of a mail can be used.

For an explanation of the check argument, see The check argument in header and body filters.

Examples:

{filter: header, part: subject, check: "has cana?dia?n\s+pha?rma?cy", action: delete}
{filter: header, part: from, check: "is-not .*@goodguy.org", action: delete}
{filter: header, part: subject, check: "is \[mail-list\].*", action: move-to mail-list}

Filter: body

Arguments: check

Checks for contents in the body of the e-mail. For an explanation of the check argument, see The check argument in header and body filters.

Examples:

{filter: body, check: "has cana?dia?n\s+pha?rma?cy", action: move-to Spam}

Note

The is and is-not operators are less useful in the body filter. Use has instead.

Filter: rbl

Arguments: org

Checks whether the IP of the sending server is registered with an RBL (real-time block list). A large number of organizations provide such block lists; the org parameter specifies which one to use. An RBL is queried through a DNS lookup of a specially formed domain name, and the org parameter supplies the right-hand part of that name. The following table lists the org values for some prominent RBLs:

RBL organization    org value
Spamcop             bl.spamcop.net
Spamhaus            zen.spamhaus.org
Barracuda           b.barracudacentral.org

Examples:

{filter: rbl, org: bl.spamcop.net, action: delete}
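To show how the org value ends up as the right-hand part of a DNS query, the Python sketch below follows the common DNS block list convention: the octets of the IPv4 address are reversed and the RBL zone is appended. Whether the rbl filter builds its queries exactly this way is an assumption, and `rbl_query_name` and `is_listed` are hypothetical names. The address 192.0.2.99 comes from a range reserved for documentation.

```python
import ipaddress
import socket


def rbl_query_name(ip, org):
    """Build the DNS name to look up: reversed IPv4 octets followed by the RBL zone."""
    octets = str(ipaddress.IPv4Address(ip)).split(".")  # also validates the address
    return ".".join(reversed(octets)) + "." + org


def is_listed(ip, org):
    """Return True if the RBL resolves the query name (requires network access).

    Simplification: any resolver failure is treated as "not listed".
    """
    try:
        socket.gethostbyname(rbl_query_name(ip, org))
        return True
    except socket.gaierror:
        return False


query = rbl_query_name("192.0.2.99", "bl.spamcop.net")
# query is "99.2.0.192.bl.spamcop.net"
```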

Filter: pyzor

Checks whether there is a signature of the whole mail message registered with Pyzor. These signatures designate spam messages previously reported to Pyzor.

Examples:

{filter: pyzor, action: delete}

Filter: url

Checks whether the mail contains URLs known to be spam sources.

Examples:

{filter: url, action: delete}

Filter: all

This is not exactly a filter. It applies actions to all messages encountered during the scan process.

Examples:

{filter: all, action: [report-badips, report-pyzor, delete]}

Note

Be careful with this filter. It is intended to be used on folders where messages are copied or moved under some control. Do not use it on the INBOX folder, unless the account is a honeypot anyway.

The check argument in header and body filters

The check argument specifies how to compare the value. The format is operator operands, where operator is one of: is, is-not, has, has-no or has-all. For operands, a single value or a list of values can be given, depending on the requirements of the operator.

The is operator does a regular expression match of a single value against the pattern in the operand. The is-not operator succeeds if the pattern does not match.

To search for a regular expression anywhere in the header or body, use the has operator. If you want to check against more than one pattern and you want to make sure all of them are found, use has-all. If you want to make sure none of a number of patterns are found, use has-no.
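The five operators can be summarized in a short Python sketch. It is an approximation of the behaviour described above and not the program's actual code: `evaluate_check` is a hypothetical helper that receives the operator and operands already separated, the `is` operator is assumed to require a match of the whole value, and case-insensitive matching is likewise an assumption.

```python
import re

FLAGS = re.IGNORECASE  # assumption: the text does not state case sensitivity


def evaluate_check(operator, operands, value):
    """Return True if `value` passes the check (illustrative sketch)."""
    # Accept either a single pattern or a list of patterns.
    patterns = [operands] if isinstance(operands, str) else list(operands)

    def found(pattern):
        return re.search(pattern, value, FLAGS) is not None

    if operator == "is":
        # Assumption: the pattern has to match the complete value.
        return re.fullmatch(patterns[0], value, FLAGS) is not None
    if operator == "is-not":
        return re.fullmatch(patterns[0], value, FLAGS) is None
    if operator == "has":
        return any(found(p) for p in patterns)  # normally a single pattern
    if operator == "has-all":
        return all(found(p) for p in patterns)
    if operator == "has-no":
        return not any(found(p) for p in patterns)
    raise ValueError("unknown operator: " + operator)


spam = evaluate_check("has", r"cana?dia?n\s+pha?rma?cy", "Cheap Canadian Pharmacy")
listed = evaluate_check("is", r"\[mail-list\].*", "[mail-list] Weekly digest")
```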

Note

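Since regular expressions tend to use characters also used by YAML, it is a good idea to quote the value. Strict YAML parsers also treat the backslash inside double quotes as an escape character, so patterns containing backslashes may need single quotes instead.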