A few corpus-cleaning helper tools.
mass-find-nonspam
Find non-spam, or otherwise suspect, messages in a spam corpus. Includes
patterns that match a good subset of the newsletters typically found in
trap data.
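
As a rough illustration of the kind of check involved, the sketch below
flags messages that carry common mailing-list headers. It is not the
tool's actual pattern set, which is not shown here; the function name
looks_like_newsletter and the header choices are assumptions made for
this example only.

```python
import email

# Hypothetical heuristics for illustration; the real tool's patterns differ.
BULK_HEADERS = ("List-Unsubscribe", "List-Id")


def looks_like_newsletter(raw: str) -> bool:
    """Return True if the raw message carries typical mailing-list headers."""
    msg = email.message_from_string(raw)
    # Any standard list-management header is a strong hint of bulk mail
    # that a recipient may have subscribed to.
    if any(msg.get(h) is not None for h in BULK_HEADERS):
        return True
    # Fall back to the Precedence header, compared case-insensitively.
    precedence = (msg.get("Precedence") or "").strip().lower()
    return precedence in ("bulk", "list")
```

Messages flagged this way would still need a manual look before being
removed from the spam corpus.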
remove-tests-from-logs
Removes the named tests from a log. Name the tests on the command line;
the tool reads the log from stdin and writes the rewritten log to stdout.
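
The stdin-to-stdout filtering described above might look roughly like
the following. This is a sketch under an assumed log format (fields
separated by single spaces, with test names in a comma-separated field);
the real log layout and the tool's handling of edge cases are not
confirmed here. The names strip_tests and main are hypothetical.

```python
import sys


def strip_tests(line: str, names: set[str]) -> str:
    """Drop the named tests from any comma-separated field on the line."""
    out = []
    for field in line.rstrip("\n").split(" "):
        parts = field.split(",")
        # Only rewrite fields that actually contain one of the named tests;
        # every other field is passed through untouched.
        if any(p in names for p in parts):
            field = ",".join(p for p in parts if p not in names)
        out.append(field)
    return " ".join(out)


def main(argv: list[str]) -> None:
    """Filter stdin to stdout, removing the tests named in argv[1:]."""
    names = set(argv[1:])
    for line in sys.stdin:
        sys.stdout.write(strip_tests(line, names) + "\n")
```

Calling main(sys.argv) from a script would give the same command-line
shape as the tool: test names as arguments, log in on stdin, rewritten
log out on stdout.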
uniq-mailbox
Trim duplicate mails from a mailbox.
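
One way to trim duplicates is sketched below using Python's standard
mailbox module. It assumes an mbox-format file and treats two mails as
duplicates when their Message-ID headers match (falling back to a hash
of the whole message when the header is missing). The criterion the
actual tool uses is not stated here, and the names message_key and
dedupe_mbox are hypothetical.

```python
import hashlib
import mailbox


def message_key(msg) -> str:
    """Assumed duplicate key: the Message-ID, else a SHA-256 of the message."""
    mid = msg.get("Message-ID")
    if mid:
        return mid.strip()
    return hashlib.sha256(msg.as_bytes()).hexdigest()


def dedupe_mbox(src_path: str, dst_path: str) -> int:
    """Copy src to dst, keeping the first copy of each mail; return mails kept."""
    src = mailbox.mbox(src_path, create=False)
    dst = mailbox.mbox(dst_path)
    seen = set()
    kept = 0
    try:
        for msg in src:
            key = message_key(msg)
            if key in seen:
                continue  # a later copy of a mail already written
            seen.add(key)
            dst.add(msg)
            kept += 1
        dst.flush()
    finally:
        src.close()
        dst.close()
    return kept
```

Writing to a second file rather than editing in place keeps the original
mailbox intact, so the result can be checked before anything is replaced.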