dccproc(8)            Distributed Checksum Clearinghouse            dccproc(8)


NAME

     dccproc -- Distributed Checksum Clearinghouse Procmail Interface


SYNOPSIS

     dccproc [-VdAQCHER] [-h homedir] [-m map] [-w whiteclnt] [-T tmpdir]
             [-a IP-address] [-f env_from] [-t targets] [-x exitcode]
             [-c type,[log-thold,]rej-thold] [-g [not-]type] [-S header]
             [-i infile] [-o outfile] [-l logdir] [-B dnsbl-option]
             [-L ltype,facility.level]


DESCRIPTION

     Dccproc copies a complete SMTP message from standard input or a file to
     standard output or another file.  As it copies the message, it computes
     the DCC checksums for the message, reports them to a DCC server, and adds
     a header line to the message.  Another program such as procmail(1) can
     use the added header line to filter mail.  Dccproc does not support any
     thresholds of its own, because equivalent effects can be achieved with
     regular expressions and you can apply dccproc several times using
     different DCC servers and then score mail based on what all of the DCC
     servers say.

     Error messages are sent to stderr as well as the system log.  Connect
     stderr and stdout to the same file to see errors in context, but direct
     stderr to /dev/null to keep DCC error messages out of the mail.  The -i
     option can also be used to separate the error messages.

     Dccproc sends reports of checksums related to mail received by DCC
     clients and queries about the total number of reports of particular
     checksums.  A DCC server receives no mail, address, headers, or other
     information, but only cryptographically secure checksums of such informa-
     tion.  A DCC server cannot determine the text or other information that
     corresponds to the checksums it receives.  It only acts as a clearing-
     house of counts of checksums computed by clients.

     For the sake of privacy for even the checksums of private mail, the
     checksums of senders of purely internal mail or other mail that is known
     to not be unsolicited bulk can be listed in a whitelist to not be
     reported to the DCC server.

     When sendmail(8) is used, dccm(8) is a better DCC interface.  Dccifd(8)
     is more efficient than dccproc because it is a daemon, but that has costs
     in complexity.  See dccsight(8) for a way to use previously computed
     checksums.

   OPTIONS
     The following options are available:

     -V   displays the version of the DCC procmail(1) interface.

     -d   enables debugging output from the DCC client library.  Additional -d
          options increase the number of messages.  One causes error messages
          to be sent to STDERR as well as the system log.

     -A   adds to existing X-DCC headers (if any) of the brand of the current
          server instead of replacing existing headers.

     -Q   only queries the DCC server about the checksums of messages instead
          of reporting and then querying.  This is useful when dccproc is used
          to filter mail that has already been reported to a DCC server by
          another DCC client such as dccm(8).  No single mail message should
          be reported to a DCC server more than once per recipient.

          It is better to use MXDCC lines in the global whiteclnt file for
          your MX servers.

     -C   outputs only the X-DCC header and the checksums for the message.

     -H   outputs only the X-DCC header.

     -E   adds lines to the start of the log file turned on with -l and -c
          describing what might have been the envelope of the message.  The
          information for the inferred envelope comes from arguments including
          -a and headers in the message when -R is used.  No lines are gener-
          ated for which no information is available, such as the envelope
          recipient.

     -R   says the first Received lines have the standard
          "helo (name [address])..."  format and the address is that of the
          SMTP client that would otherwise be provided with -a.  The -a option
          should be used if the local SMTP server adds a Received line with
          some other format or does not add a Received line.  Received headers
          specifying IP addresses marked MX or MXDCC in the -w whiteclnt file
          are skipped.
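
          As an illustration of the Received format that -R expects, the
          following Python sketch pulls the bracketed address out of one such
          line.  The regular expression and the function name are assumptions
          made for this example and are not taken from dccproc, whose own
          parser may accept other variants.

```python
import re

# Matches the "helo (name [address])" shape named in the text.  The pattern
# is an illustrative guess, not dccproc's real parser.
RECEIVED_RE = re.compile(
    r"^Received:\s+from\s+(?P<helo>\S+)\s+"
    r"\((?P<name>[^\s\[\]]*)\s*\[(?P<addr>[0-9A-Fa-f.:]+)\]\)",
    re.IGNORECASE,
)


def client_address(received_line):
    """Return the bracketed address of a Received line, or None."""
    match = RECEIVED_RE.match(received_line)
    return match.group("addr") if match else None


line = ("Received: from mail.example.org (mail.example.org [192.0.2.1]) "
        "by mx.example.net; Thu, 1 Feb 2007 12:00:00 +0000")
print(client_address(line))
```

          A line for which this returns None is the kind of case in which -a
          would be needed instead of -R.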

     -h homedir
          overrides the default DCC home directory, which is often /var/lib/dcc.

     -m map
          specifies a name or path of the memory mapped parameter file instead
          of the default map in the DCC home directory.  It should be created
          with the new map operation of the cdcc(8) command.

     -w whiteclnt
          specifies an optional file containing SMTP client IP addresses and
          SMTP headers of mail that do not need X-DCC headers and whose check-
          sums should not be reported to the DCC server.  It can also contain
          checksums of spam.  If the pathname is not absolute, it is relative
          to the DCC home directory.  Thus, individual users with private
          whitelists usually specify them with absolute paths.  Common
          whitelists shared by users must be in the DCC home directory or one
          of its subdirectories and owned by the set-UID user of dccproc.  It
          is useful to include a common or system-wide whitelist in private
          lists.

          The format of the dccproc whiteclnt file is the same as the
          whitelist file required by dbclean(8) and dccm(8).  Unlike dccm, the
          dccproc whiteclnt file is optional.  When -w is not used, settings
          equivalent to these are used:
                option log-normal
                option dcc-on
                option DCC-reps-on
                option DNSBL-on
          When -w is used, the defaults mentioned in dcc(8) are used.  Those
          defaults differ and turn off DCC Reputations and DNS blacklist
          (DNSBL) checking.

          Because the contents of the whiteclnt file are used frequently, a
          companion file is automatically created and maintained.  It has the
          same pathname but with an added suffix of .dccw.  It contains a mem-
          ory mapped hash table of the main file.

          A local whitelist entry ("OK") or two or more semi-whitelistings
          ("OK2") for one of the message's checksums prevents all of the
          message's checksums from being reported to the DCC server and the
          addition of an X-DCC header line by dccproc.  Because it is run by
          or on behalf of a single user, dccproc ignores env_To entries in the
          whiteclnt file.  Users who don't want to use dccproc shouldn't.

     -T tmpdir
          changes the default directory for temporary files from the system
          default.  The system default is often /tmp.

     -a IP-address
          specifies the IP address (not the host name) of the immediately
          previous SMTP client.  It is often not available.  -a 0.0.0.0 is
          ignored.  The -a option should be used instead of -R if the local
          SMTP server adds a Received line with some other format or does not
          add a Received line.

     -f env_from
          specifies the RFC 821 envelope "Mail From" value with which the
          message arrived.  It is often not available.  If -f is not present,
          the contents of the first Return-Path: or UNIX style From_ header
          are used.  The env_from string is often but need not be bracketed
          with "<>".

     -t targets
          specifies the number of addressees of the message if other than 1.
          The string many instead of a number asserts that there were too many
          addressees and that the message is unsolicited bulk email.

     -x exitcode
          specifies the code or status with which dccproc exits if the -c
          thresholds are reached or the -w whiteclnt file blacklists the mes-
          sage, unless the message is whitelisted.

          The default value is EX_NOUSER.  EX_NOUSER is 67 on many systems.
          Use 0 to always exit successfully.

     -c type,[log-thold,]rej-thold
          sets logging and "spam" thresholds for checksum type.  Each logged
          message is placed in a separate file in the directory specified with
          -l.  The checksum types are IP, env_From, From, Message-ID,
          substitute, Received, Body, Fuz1, Fuz2, rep-total, and rep.  The
          first six, IP through Received, have no effect except when a local
          DCC server configured with -K is used.  The substitute thresholds
          apply to the first substitute heading encountered in the mail mes-
          sage.  The string ALL sets thresholds for all types, but is unlikely
          to be useful except for setting logging thresholds.  The string CMN
          specifies the commonly used checksums Body, Fuz1, and Fuz2.
          Rej-thold and log-thold must be numbers, the string NEVER, or the
          string MANY indicating millions of targets.  Counts from the DCC
          server as large as the threshold for any single type are taken as
          sufficient evidence that the message should be logged or rejected.
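
          The shape of the -c argument can be made concrete with a small
          Python sketch that splits type,[log-thold,]rej-thold into its
          parts.  The function names are hypothetical, and accepting keywords
          in either case is an assumption inferred from the mixed-case
          examples on this page.  The sketch does not check that the type is
          one of the names listed above.

```python
COMMON_TYPES = ("Body", "Fuz1", "Fuz2")   # what the text says CMN stands for


def parse_threshold(text):
    """Return an int, or the keyword NEVER or MANY in upper case."""
    word = text.strip().upper()
    if word in ("NEVER", "MANY"):
        return word
    return int(word)   # raises ValueError for anything else


def parse_c_option(value):
    """Split 'type,[log-thold,]rej-thold' into (types, log_thold, rej_thold).

    log_thold is None when the optional middle field is omitted.
    """
    fields = value.split(",")
    if len(fields) == 2:
        type_name, log_thold = fields[0], None
        rej_thold = parse_threshold(fields[1])
    elif len(fields) == 3:
        type_name = fields[0]
        log_thold = parse_threshold(fields[1])
        rej_thold = parse_threshold(fields[2])
    else:
        raise ValueError("expected type,[log-thold,]rej-thold: %r" % value)
    if type_name.upper() == "CMN":
        types = COMMON_TYPES      # CMN expands to the common checksums
    else:
        types = (type_name,)
    return types, log_thold, rej_thold


print(parse_c_option("CMN,25,50"))
print(parse_c_option("ALL,NEVER"))
```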

          Log-thold is the threshold at which messages are logged.  It can be
          handy to log messages at a lower threshold to find solicited bulk
          mail sources such as mailing lists.  Messages that reach at least
          one of their rejection thresholds or that have complicated combina-
          tions of white- and blacklisting are logged regardless of logging
          thresholds.

          Rej-thold is the threshold at which messages are considered "bulk,"
          and so should cause the X-DCC header line to contain the string
          "bulk" or "bulk rep" and dccproc to exit with the value set by -x.

          DCC reputation thresholds in the commercial version of the DCC are
          controlled by thresholds on checksum types rep and rep-total.  Mes-
          sages from an IP address that the DCC database says has sent more
          than rep-total log-thold messages are logged.  A DCC reputation is
          computed for messages received from IP addresses that have sent more
          than rep-total rej-thold messages.  The DCC reputation of an IP
          address is the percentage of its messages that have been detected as
          bulk, or having at least 10 recipients.  The defaults are equivalent
          to -c rep,never and -c rep-total,never,10.

          The checksums of locally white-listed messages are not checked with
          the DCC server and so only the number of targets of the current
          instance of a white-listed message is compared against the
          thresholds.

          The default is -c ALL,NEVER, so that nothing is discarded or logged.
          A common choice is -c CMN,25,50 to reject or discard mail with com-
          mon bodies except as overridden by the whitelist of the DCC server
          and -g and -w.
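
          A program that runs after dccproc can act on the "bulk" marker
          described above for Rej-thold.  The Python sketch below looks for
          that word in any header whose name starts with X-DCC.  The header
          name and the header value in the sample are hypothetical; this page
          states only that the X-DCC header line contains the string "bulk"
          or "bulk rep".

```python
from email import message_from_string


def is_marked_bulk(message_text):
    """Return True if an X-DCC header carries the word 'bulk'."""
    msg = message_from_string(message_text)
    for name, value in msg.items():
        # Assumption: every header added by dccproc starts with "X-DCC".
        if name.upper().startswith("X-DCC") and "bulk" in value.split():
            return True
    return False


# Hypothetical sample; the real header layout is not specified on this page.
sample = (
    "X-DCC-Example-Metrics: host.example.com 1234; bulk Body=many\n"
    "Subject: test\n"
    "\n"
    "message body\n"
)
print(is_marked_bulk(sample))
```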

     -g [not-]type
          indicates that white-listed, OK or OK2, counts from the DCC server
          for a type of checksum are to be believed.  They should be ignored
          if prefixed with not-.  Type is one of the same set of strings as
          for -c.  Only IP, env_From, and From are likely choices.  By default
          all three are honored, and hence the need for not-.

     -S hdr
          adds to the list of substitute or locally chosen headers that are
          checked with the -w whiteclnt file and sent to the DCC server.  The
          checksum of the last header of type hdr found in the message is
          checked.  As many as 6 different substitute headers can be speci-
          fied, but only the checksum of the first of the 6 will be sent to
          the DCC server.

     -i infile
          specifies an input file for the entire message instead of standard
          input.  If not absolute, the pathname is interpreted relative to the
          directory in which dccproc was started.

     -o outfile
          specifies an output file for the entire message including headers
          instead of standard output.  If not absolute, the pathname is inter-
          preted relative to the directory in which dccproc was started.

     -l logdir
          specifies a directory for copies of messages whose checksum target
          counts exceed -c thresholds.  The format of each file is affected by
          -E.

          If logdir starts with D?, log files are put into subdirectories of
          the form logdir/JJJ where JJJ is the current Julian day.  H?logdir
          puts log files into subdirectories of the form logdir/JJJ/HH where
          HH is the current hour.  M?logdir puts log files into subdirectories
          of the form logdir/JJJ/HH/MM where MM is the current minute.  See
          the FILES section below concerning the contents of the files.  See
          also the option log-subdirectory-{day,hour,minute} lines in
          whiteclnt files described in dcc(8).

          The directory is relative to the DCC home directory if it is not
          absolute.
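
          The D?, H?, and M? prefixes can be illustrated with a short Python
          sketch that computes the subdirectory for a given time.  It assumes
          that JJJ is the zero-padded day of the year and that HH and MM are
          two digits each; the function name is hypothetical and whether
          dccproc uses local time or UTC is not stated here.

```python
from datetime import datetime
import posixpath

# strftime patterns for each prefix: %j is the day of the year (001-366).
_FORMATS = {"D?": "%j", "H?": "%j/%H", "M?": "%j/%H/%M"}


def log_subdirectory(logdir, now):
    """Return the directory a log file would go to at time `now`."""
    prefix, rest = logdir[:2], logdir[2:]
    fmt = _FORMATS.get(prefix)
    if fmt is None:
        return logdir            # no prefix: use the directory as given
    return posixpath.join(rest, now.strftime(fmt))


when = datetime(2007, 2, 1, 13, 5)
for spec in ("log", "D?log", "H?log", "M?log"):
    print(spec, "->", log_subdirectory(spec, when))
```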

     -B dnsbl-option
          enables DNS blacklist checks of the SMTP client IP address, SMTP
          envelope Mail_From sender domain name, and of host names in URLs in
          the message body.  Body URL blacklisting has too many false posi-
          tives to use on abuse mailboxes.  It is less effective than
          greylisting with dccm(8) or dccifd(8) but can be useful in situa-
          tions where greylisting cannot be used.

          Dnsbl-option is either of the forms set:option or
          domain[,IPaddr[,bltype]].  Domain is a DNS blacklist domain such as
          example.com that will be searched.  IPaddr is the string "any" or
          the IP address in the DNS blacklist that indicates that the mail
          message is spam.  127.0.0.2 is assumed if IPaddr is absent.  IPv6
          addresses can be specified with the usual colon (:) notation.  Names
          can be used instead of numeric addresses.  The type of DNS blacklist
          is specified by bltype as name, IPv4, or IPv6.  Given an envelope
          sender domain name or a domain name in a URL of spam.domain.org and
          a blacklist of type name, spam.domain.org.example.com will be tried.
          Blacklist types of IPv4 and IPv6 require that the domain name in a
          URL be resolved into an IPv4 or IPv6 address.  The address is then
          written as a reversed string of decimal octets to check the DNS
          blacklist, as in 2.0.0.127.example.com.
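
          The construction of the names that are looked up can be sketched in
          Python as follows.  The function name is hypothetical, the sketch
          covers only the name and IPv4 blacklist types, and for IPv4 it
          expects an address that has already been resolved.

```python
import ipaddress


def dnsbl_query_name(target, blacklist_domain, bltype="name"):
    """Build the DNS name to look up for one blacklist check."""
    if bltype == "name":
        # Domain names are simply prefixed to the blacklist domain.
        return "%s.%s" % (target.rstrip("."), blacklist_domain)
    if bltype == "IPv4":
        # Validate, then reverse the decimal octets.
        address = ipaddress.IPv4Address(target)
        octets = str(address).split(".")
        return "%s.%s" % (".".join(reversed(octets)), blacklist_domain)
    raise ValueError("unsupported bltype in this sketch: %r" % bltype)


print(dnsbl_query_name("spam.domain.org", "example.com"))
print(dnsbl_query_name("127.0.0.2", "example.com", "IPv4"))
```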

          More than one blacklist can be specified.  They are searched in
          order.  All searching is stopped at the first positive result.

          Positive results are ignored after being logged unless an
          option DNSBL-on line appears in the global or per-user whiteclnt
          file.

          -B set:debug=X
               sets the DNS blacklist logging level.

          -B set:msg-secs=S
               limits dccproc to S seconds total for checking all DNS black-
               lists.  The default is 25.

          -B set:URL-secs=S
               limits dccproc to at most S seconds resolving and checking any
               single URL.  The default is 11.  Some spam contains dozens of
               URLs and some "spamvertised" URLs contain host names that
               need minutes to resolve.  Busy mail systems cannot afford to
               spend minutes checking each incoming mail message.  In order to
               use typical single-threaded DNS resolver libraries, dccm(8) and
               dccifd(8) use fleets of helper processes.

          -B set:no-envelope
               says that SMTP client IP addresses and sender Mail_From domain
               names should not be checked in the following blacklists.
               set:envelope restores the default for subsequently named black-
               lists.

          -B set:no-body
               says that URLs in the message body should not be checked in the
               following blacklists.  set:body restores the default for
               later blacklists.

          -B set:no-MX
               says MX servers of sender Mail_From domain names and host names
               in URLs should not be checked in the following blacklists.
               set:MX restores the default.

          -B set:no-NS
               says NS servers of sender Mail_From domain names and host names
               in URLs should not be checked in the following blacklists.
               set:NS restores the default.

     -L ltype,facility.level
          specifies how messages should be logged.  Ltype must be error or
          info to indicate which of the two types of messages is being
          controlled.  Level must be a syslog(3) level among EMERG, ALERT,
          CRIT, ERR, WARNING, NOTICE, INFO, and DEBUG.  Facility must be among
          AUTH, AUTHPRIV, CRON, DAEMON, FTP, KERN, LPR, MAIL, NEWS, USER, UUCP,
          and LOCAL0 through LOCAL7.  The default is equivalent to
                -L info,MAIL.NOTICE -L error,MAIL.ERR
          Something like this turns off the log messages:
                -L info,MAIL.DEBUG -L error,MAIL.DEBUG

     dccproc exits with 0 on success and with the -x value if the -c thresh-
     olds are reached or the -w whiteclnt file blacklists the message.  If at
     all possible, the input mail message is output to standard output or the
     -o outfile despite errors.  If possible, error messages are put into the
     system log instead of being mixed with the output mail message.  The exit
     status is zero for errors so that the mail message will not be rejected.
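
     A caller other than procmail(1) can use the exit status in the same way.
     The Python sketch below pipes one message through dccproc and reports
     whether the default -x value came back.  It assumes that dccproc is on
     the PATH and that the default exit value is 67; the helper names are
     hypothetical.  Because errors also exit with 0, a zero status does not
     prove that the DCC server was reached.

```python
import subprocess

EX_NOUSER = 67   # default -x value; the text says 67 "on many systems"


def build_command(whiteclnt=None, thresholds=None, query_only=False,
                  dccproc="dccproc"):
    """Assemble an argument list from options documented on this page."""
    argv = [dccproc]
    if query_only:
        argv.append("-Q")
    if whiteclnt:
        argv += ["-w", whiteclnt]
    for value in thresholds or []:
        argv += ["-c", value]
    return argv


def filter_message(message_bytes, **options):
    """Return (output_bytes, threshold_hit) for one message.

    stderr is discarded to keep error messages out of the mail.
    """
    result = subprocess.run(build_command(**options), input=message_bytes,
                            stdout=subprocess.PIPE,
                            stderr=subprocess.DEVNULL)
    return result.stdout, result.returncode == EX_NOUSER


print(build_command(whiteclnt="whiteclnt", thresholds=["CMN,25,50"]))
```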

     If dccproc is run more than 500 times in fewer than 5000 seconds, dccproc
     tries to start dccifd(8).  The attempt is made at most once per hour.
     Dccifd is significantly more efficient than dccproc.  With luck, mecha-
     nisms such as SpamAssassin will notice when dccifd is running and switch
     to dccifd.


FILES

     /var/lib/dcc   DCC home directory in which other files are found.
     map        memory mapped file in the DCC home directory of information
                concerning DCC servers.
     whiteclnt  contains the client whitelist in the format described in
                dcc(8).
     whiteclnt.dccw
                is a memory mapped hash table corresponding to the whiteclnt
                file.
     tmpdir     contains temporary files created and deleted as dccproc pro-
                cesses the message.
     logdir     is an optional directory specified with -l and containing
                marked mail.  Each file in the directory contains one message,
                at least one of whose checksums reached one of its -c thresh-
                olds.  The entire body of the SMTP message including its
                header is followed by the checksums for the message.


EXAMPLES

     The following procmailrc(5) rule adds an X-DCC header to passing mail:

         :0 f
         | /usr/bin/dccproc -ERw whiteclnt

     This procmailrc(5) recipe rejects mail with total counts of 10 or larger
     for the commonly used checksums:

         :0 fW
         | /usr/bin/dccproc -ERw whiteclnt -ccmn,10
         :0 e
         {
             EXITCODE=67
             :0
             /dev/null
         }


SEE ALSO

     cdcc(8), dcc(8), dbclean(8), dccd(8), dblist(8), dccifd(8), dccm(8),
     dccsight(8), mail(1), procmail(1).


HISTORY

     Implementation of dccproc was started at Rhyolite Software in 2000.  This
     describes version 1.3.48.


BUGS

     dccproc uses -c where dccm(8) uses -t.

                               February 1, 2007
