
# Archivists note:  This file has been replaced by an updated
version in this same directory, mailsort.tart.gz

Bill

Article 13050 of comp.lang.perl:
Xref: feenix.metronet.com comp.lang.perl:13050
Path: feenix.metronet.com!news.utdallas.edu!chpc.utexas.edu!cs.utexas.edu!swrinde!ihnp4.ucsd.edu!munnari.oz.au!hippo.ru.ac.za!caesar.wits.ac.za!concave!andras
From: andras@concave.cs.wits.ac.za (Andras Salamon)
Newsgroups: comp.lang.perl
Subject: [SCRIPT] mailsort
Date: 20 Apr 94 17:39:22 GMT
Organization: Computer Science, University of the Witwatersrand
Lines: 483
Message-ID: <137@concave.cs.wits.ac.za>
NNTP-Posting-Host: concave.cs.wits.ac.za
Summary: sort mbox-style mail folders by timestamp

Heeding Larry's call to post scripts, here is `mailsort' to sort mbox
mail folders by the timestamps in the `From ' message separator lines.
Mbox folders are used by most non-MH MUA's and are also a standard for
news folders.  This script is meant to supersede `sortmailbox' on the
UFL archives, although it was initially inspired by the gawk script
`mboxsort', by Roman Czyborra.

The major limitation is that it ignores time zones.  Unfortunately some
software insists on putting in a time zone when faking `From ' lines.
Pointers to a time zone aware dates package would be appreciated.

Comments are very welcome.

---cut here
#! /bin/sh
# This is a shell archive.  Remove anything before this line, then feed it
# into a shell via "sh file" or similar.  To overwrite existing files,
# type "sh file -c".
# The tool that generated this appeared in the comp.sources.unix newsgroup;
# send mail to comp-sources-unix@uunet.uu.net if you want that tool.
# Contents:  mailsort
# Wrapped by andras@concave.cs.wits.ac.za on Wed Apr 20 19:37:30 1994
PATH=/bin:/usr/bin:/usr/ucb ; export PATH
echo If this archive is complete, you will see the following message:
echo '          "shar: End of archive."'
if test -f 'mailsort' -a "${1}" != "-c" ; then 
  echo shar: Will not clobber existing file \"'mailsort'\"
else
  echo shar: Extracting \"'mailsort'\" \(12713 characters\)
  sed "s/^X//" >'mailsort' <<'END_OF_FILE'
X#!/usr/local/bin/perl
X'di ';
X'ds 00 \\"';
X'ig 00 ';
X#
X# $Id: mailsort,v 1.16 94/04/20 19:36:33 andras Exp Locker: andras $
X#
X#   THIS PROGRAM IS ITS OWN MANUAL PAGE.  INSTALL IN man & bin.
X#
X
X$ALTERNATE_TMPDIR = '/tmp'; # use this if TMPDIR is not defined
X$CP = '/bin/cp';
X$CP = 'cp' if (! -x $CP); # hope it's in the path
X
X($BCMD = $0) =~ s/.*\///;
X($REVISION) = ('$Revision: 1.16 $' =~ /[^\d\.]*([\d\.]*)/);
X$HELPSTRING = "For help, type: $BCMD -h";
X($IDENT = '@(#)mailsort: sort mbox-style mail folders by timestamp')
X    =~ s/^[^:]*: *//;
X
X$USAGE = "Usage: $BCMD [-dLrv] folder ...";
X
X########################################################################
X# process arguments
X
Xrequire('getopts.pl');
Xif (! &Getopts('dhLrv')) {
X    print STDERR "$USAGE\n$HELPSTRING\n";
X    exit 2;
X}
Xif ($opt_h) {
X    print <<EOT;
X$BCMD $REVISION: $IDENT
X$USAGE
X -d			print extra debugging information
X -L			display software license
X -r			reverse sort order
X -v			turn on verbose mode
X folder ...		mailx/Mail mail folders to sort
XUnless reversed by -r, the default sort order is increasing by timestamp.
X$BCMD can be used as a filter.  When `-' is specified as an argument,
Xstandard input is read and sorted to standard output; any other folders
Xspecified are processed as usual.
XEOT
X	exit 0;
X} elsif ($opt_L) {
X    print <<EOT;
X    Copyright 1994 Andras Salamon <andras@cs.wits.ac.za>
X    This program is free software; you can redistribute it and/or modify
X    it under the terms of the GNU General Public License as published by
X    the Free Software Foundation; either version 2 of the License, or
X    (at your option) any later version.
X
X    This program is distributed in the hope that it will be useful,
X    but WITHOUT ANY WARRANTY; without even the implied warranty of
X    MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE.  See the
X    GNU General Public License for more details.
X
X    If you do not already have a copy of the GNU General Public License,
X    you can obtain a copy by anonymous ftp from prep.ai.mit.edu
X    (file COPYING in directory /pub/gnu) or write to the Free Software
X    Foundation, Inc., 675 Mass Ave, Cambridge, MA 02139, USA.
XEOT
X	exit 0;
X}
X$VERBOSE = $opt_v;
X$DEBUG = $opt_d;
X
Xif (@ARGV < 1) {
X    if (-t STDIN) {
X	print STDERR "$USAGE\n$HELPSTRING\n";
X	exit 2;
X    } else {
X	unshift(ARGV, '-');
X    }
X}
X
X########################################################################
X# ishead
X#
X# See if the passed line buffer is a mail header.  Return true if yes.
X# Time zones and month/day names are only vaguely checked.
X
Xsub ishead {
X    local($l) = @_;
X    local($f, $d) = ('', '');
X
X    if ($l =~ /^From ((("[^"]*")|\S)*)\s*tty\s*(\S*)\s*(.*)/) {
X	($f, $d) = ($1, $5);
X    } elsif ($l =~ /^From ((("[^"]*")|\S)*)\s*(.*)/) {
X	($f, $d) = ($1, $4);
X    } else {
X	return(0);
X    }
X
X    if ($f eq '' || $d eq '') {
X	return(0);
X    }
X    # note that this rejects lines which have whitespace after the year
X    return(
X    $d =~ m#([A-Z][a-z]{2} ){2}[ \d]\d [012]\d(:[0-5]\d){2}( ([A-Za-z]{3}|[\d+-,;:/])+)? (\d{2}|\d{4})$#);
X}
X
X########################################################################
X# reportwarn
X#
X# print specified warning message; uses global $origname
X
Xsub reportwarn {
X    local($message) = @_;
X    if ($VERBOSE) {
X	print STDERR " --- Warning: $message, skipping\n";
X    } else {
X	print STDERR "Warning: $message, skipping $origname\n";
X    }
X}
X
X########################################################################
X# signal_handler
X#
X# catch interrupt signals; 1st argument is signal name
X# uses globals $exitstatus, $tmpfile and $origname
X
Xsub signal_handler {
X    local($sig) = @_;
X    if ($VERBOSE) {
X	print STDERR "\n*** Caught signal $sig, cleaning up\n";
X    } else {
X	print STDERR "Caught signal $sig processing $origname, stopping\n";
X    }
X    unlink $tmpfile;
X    exit(++$exitstatus);
X}
X
X
X########################################################################
X# main program
X
X$exitstatus = 0;
X@SIG{'INT', 'HUP', 'QUIT', 'PIPE'} = ('signal_handler') x 4;
X
X%ord = split(" ",
X"Jan 1 Feb 2 Mar 3 Apr 4 May 5 Jun 6 Jul 7 Aug 8 Sep 9 Oct 10 Nov 11 Dec 12");
X
XArgument:
Xwhile ($origname = $filename = shift) {
X    if (! open(CURRENT, $filename)) {
X	&reportwarn("cannot open file", $origname);
X	$exitstatus ++;
X	next Argument;
X    }
X    print STDERR (($filename eq '-') ? 'stdin' : "$filename") . ': reading'
X	if $VERBOSE;
X    $m_key = '0000000000000'; # the key for leading non-message text
X    undef @text; undef %found; undef $m_text;
X    $sort_this = 0; $wasblank = 1;
X    $m_count = 0;
X    while (<CURRENT>) {
X	if ($wasblank && /^From / && &ishead($_)) {
X	    # end of message processing for previous message
X	    $found{$m_key} .= "$m_count:";
X	    $previous = $m_key;
X	    push(@text, $m_text);
X	    undef $m_text;
X
X	    $m_count ++;
X	    @_ = split("[ \t]+", $_);
X	    ($m, $day, $t) = @_[3..5];
X	    $month = $ord{$m};
X	    ($hour, $min, $sec) = split(":", $t);
X	    $year = pop(@_); # last field, ignoring timezone if any
X	    $year += 1900 if ($year < 100);
X	    $m_key = sprintf("%04d%02d%02d%02d%02d%02d",
X			    $year, $month, $day, $hour, $min, $sec);
X
X	    # check if timestamp grows monotonically, ie. if already sorted
X	    $sort_this = 1 if ($m_key lt $previous);
X	}
X
X	$m_text .= $_;
X	$wasblank = ($_ eq "\n");
X    }
X    
X    # store end of last message, add a final blank line if needed
X    if (! $wasblank && $sort_this) {
X	$m_text .= "\n";
X    }
X    $found{$m_key} .= "$m_count:";
X    push(@text, $m_text);
X    print STDERR
X	($m_count
X	    ? ("\b\b\b $m_count message" . (($m_count > 1) ? 's' : ''))
X	    : ', not mbox file') if $VERBOSE;
X
X    if ($filename eq '-') {
X	$tmpfile = '';
X	open(TMPFILE, ">&STDOUT");
X    } else {
X	if (! $sort_this) {
X	    print STDERR ($m_count ? " - already sorted\n" : " - ignored\n")
X		if $VERBOSE;
X	    next Argument;
X	}
X	# open temporary file
X	$origname = $filename;
X	$tmpfile = "$filename+";
X	# for a symbolic link, read actual file and ignore link
X	if ($_ = readlink($filename)) {
X	    # try making temp file in actual directory
X	    $filename = $_;
X	    $tmpfile = "$filename+";
X	    if (! ($opened = open(TMPFILE, ">$tmpfile"))) {
X		# try making temp file in original directory
X		$tmpfile = "$origname+";
X	    }
X	}
X	$public = 0;
X	if (! $opened && ! open(TMPFILE, ">$tmpfile")) {
X	    # last chance: try making temp file in /tmp
X	    $_ = ($ENV{'TMPDIR'} || $ALTERNATE_TMPDIR);
X	    $tmpfile = "$_/$BCMD.$$";
X	    if (! open(TMPFILE, ">$tmpfile")) {
X		&reportwarn('cannot open temporary file', $origname);
X		$exitstatus ++;
X		next Argument;
X	    }
X	    $public = 1;
X	}
X
X	if (! (($dev, $mode, $uid, $gid) = (stat(CURRENT))[0,2,4,5])) {
X	    &reportwarn('cannot stat folder anymore (removed?)', $origname);
X	    $exitstatus ++;
X	    next Argument;
X	}
X	if (! (($tdev, $tmode) = (stat(TMPFILE))[0,2])) {
X	    &reportwarn("cannot stat temporary file $tmpfile", $origname);
X	    $exitstatus ++;
X	    next Argument;
X	}
X	$mode &= 07777; $tmode &= 07777; # discard device info
X	# can't rename the file if it is someone else's
X	# or if the temporary file is on a different device
X	$rename = (($> == 0) || ($> == $uid)) && ($dev == $tdev);
X	# check if this would make public a non-public file
X	if ($public && ($tmode & 044)) {
X	    # switch off public read permissions; tough if this fails
X	    chmod($tmode ^ ($tmode & 044), $tmpfile);
X	    $rename = 0;
X	} elsif ($rename) {
X	    # can't rename the file if setting the mode or owner fails
X	    $rename = chmod($mode, $tmpfile)
X		&& chown($uid, $gid, $tmpfile);
X	}
X	if ($DEBUG) {
X	    print STDERR "\n";
X	    printf STDERR "owner %d.%d permissions %o\n", $uid, $gid, $mode;
X	    print STDERR '$tmpfile="' . "$tmpfile\"\n";
X	    print STDERR "using rename()\n" if $rename;
X	}
X    }
X    # Now TMPFILE should be open for writing with appropriate permissions.
X
X    print STDERR ", sorting" if $VERBOSE;
X    # do sorting in reverse order if requested
X    if ($opt_r) {
X	@dates = sort {$b cmp $a} keys(%found);
X    } else {
X	@dates = sort keys(%found);
X    }
X
X    # print out sorted file
X    foreach $min (@dates) {
X	chop $found{$min}; # remove trailing ':'
X	# handle identical timestamps
X	foreach $message_number (split(':', $found{$min})) {
X	    if (! print TMPFILE $text[$message_number]) {
X		&reportwarn('error while writing temporary file', $origname);
X		$exitstatus ++;
X		close(TMPFILE); unlink $tmpfile;
X		next Argument;
X	    }
X	}
X    }
X
X    if (! close(TMPFILE)) {
X	&reportwarn('error while closing temporary file', $origname);
X	$exitstatus ++;
X	unlink $tmpfile;
X	next Argument;
X    } else {
X	if (($filename ne '-')
X	  && (! $rename || ! rename($tmpfile, $filename))) {
X	    if (system($CP, "$tmpfile", "$filename")) {
X		&reportwarn("cannot replace $filename", $origname);
X		die("Please check $tmpfile and $filename, stopping");
X	    }
X	    if (! unlink $tmpfile) {
X		print STDERR " --- " if $VERBOSE;
X		print STDERR "Warning: cannot remove temporary file $tmpfile\n";
X		next Argument;
X	    }
X	}
X	print STDERR " - done\n" if $VERBOSE;
X    }
X}
X
Xexit($exitstatus);
X
X# $Log:	mailsort,v $
X# Revision 1.16  94/04/20  19:36:33  andras
X# posted to comp.lang.perl
X# 
X################### BEGIN PERL/TROFF TRANSITION 
X.00 ;
X
X'di
X.nr nl 0-1
X.nr % 0
X.\\"'; __END__
X.\" ############## END PERL/TROFF TRANSITION
X.TH MAILSORT 1 "April 19, 1994"
X.SH NAME
Xmailsort \- sort mbox mail folders by date
X.SH SYNOPSIS
X.B mailsort
X[
X.BI -hLrv
X]
X[
X.IR folder \|.\|.\|.
X]
X.SH DESCRIPTION
X.I mailsort
Xsorts
X.I mbox
Xformat mail folders by the dates in the
X.I `From '
Xlines that separate mail messages in each folder.  Folders are reordered
Xin increasing date order (with the oldest message first), and any
Xleading non-mailbox items are left in place.  Files which have no mail
Xheaders, and files which are already sorted, are left untouched.  The
X.B -r
Xoption reverses the sorting order.
X.LP
XIf no arguments are specified, or if
X.B -
Xis an argument,
X.I mailsort
Xacts as a filter, reading a mail folder from standard input and writing
Xthe sorted folder on standard output, in addition to rewriting any
Xfolders passed as arguments.
X.LP
XNormally,
X.I mailsort
Xis silent.  Warnings are printed in case of problems encountered during
Xprocessing.  In verbose mode, an indication of processing is printed for
Xeach folder.
X.LP
XIf a folder needs sorting, a temporary file containing the sorted folder
Xis created;
X.I mailsort
Xwill try to create this file first in the directory where the folder
Xresides, then (if the folder is a symbolic link) in the directory
Xcontaining the symbolic link, and then in the fall-back temporary
Xdirectory.  The temporary file then replaces the original, if possible
Xby renaming, otherwise by copying the temporary file over the original
Xand deleting the temporary file.
X.LP
X.I mbox
Xformat files consist of possibly non-message material at the start of the file, and then at least one message that begins with a
X.I from
Xline.  This consists of the word `From' followed by a user name,
Xfollowed by anything, followed by a date in the format returned by the
X.IR ctime (3)
Xlibrary routine, optionally with a three-letter time zone indicator
Xbetween the time and the year.  A valid
X.I from
Xline would be of the form:
X.IP
XFrom andras@foobar.edu Mon Apr 18 12:01:45 GMT 1994
X.SH OPTIONS
X.TP
X.B -d
XDisplay additional information for debugging purposes.
X.TP
X.B -h
XDisplay a brief help message.
X.TP
X.B -L
XShow the software license.
X.TP
X.B -r
XReverse the order of sorting: the newest message in each folder will
Xbe placed first; the oldest, last.
X.TP
X.B -v
XVerbose mode.  Show the progress of the program.
X.SH ENVIRONMENT
X.TP
X.SM
X.B TMPDIR
XThe last-resort location for the temporary file, if the preferred
Xdirectories are not writable.  If not defined,
X.RI / tmp
Xis used instead.
X.SH FILES
XA temporary file for every folder which needs sorting.
X.SH SEE ALSO
XMail(1), mailx(1), mail(1), elm(1), pine(1), trn(1), nn(1), gawk(1).
X.SH BUGS
XThe time zone is ignored during sorting.  It probably should be used,
Xalthough it may not be possible to interpret non-standard timezone
Xnames.  Is there a standard, anyway?
X.LP
XA
X.I system()
Xcall to
X.I cp
Xis used to copy the temporary file across when
X.I rename()
Xis not sufficient.  This would perhaps be more elegantly done inside
X.IR mailsort ,
Xthough performance might suffer.  (And what about interrupts?).
X.SH AUTHOR
XCopyright 1994 Andras Salamon
X.IR <andras @ cs.wits.ac.za> \|.
X.LP
XThe original inspiration came from the
X.IR gawk -ish
Xscript
X.IR mboxsort ,
Xby Roman Czyborra
X.IR <czyborra @ cs.tu-berlin.de> ,
Xwho also provided feedback on an early version of
X.IR mailsort .
X.SH AVAILABILITY
XThe latest version of 
X.I mailsort
Xis available by anonymous ftp from
X.I ftp.cs.wits.ac.za
Xin the directory
X.IR pub / distrib / mailsort \|.
END_OF_FILE
  if test 12713 -ne `wc -c <'mailsort'`; then
    echo shar: \"'mailsort'\" unpacked with wrong size!
  fi
  chmod +x 'mailsort'
  # end of 'mailsort'
fi
echo shar: End of archive.
exit 0
-- 
Andr\'as Salamon                   andras@cs.wits.ac.za


