Automatic download of extra SA rule sets
Gerry Doris
gdoris at ROGERS.COM
Tue Jan 20 01:45:36 GMT 2004
On Mon, 19 Jan 2004, Stephen Swaney wrote:
> Chris Thielen has written a VERY complete and well thought out script to
> download the most commonly used SA rules files and posted a link to his
> script on the SA mail list:
>
> http://sandgnat.com/cmos/rules_du_jour
>
> I have tested this script and it required only minor configuration changes
> to work with MailScanner. It would also be very easy to extend the script to
> get additional Rule Sets.
>
> A couple of caveats:
>
> 1. Test first with the Debug flag set.
> 2. my /etc/mail/spamassassin/local.cf was very old (and not needed). This
> kept spamassassin --lint from running without errors. I removed the file
> and all was well.
>
> 3. Saving the file from a web browser created some problems; instead, run:
>
> wget http://sandgnat.com/cmos/rules_du_jour
>
> to get the file.
>
> Steve
For what it's worth, I've made a couple of changes to the script...
- there was a small typo in one of the weeds.cf download sections: you got
weeds.cf instead of weeds_2.cf if you activated WEEDS2.
- I changed the spamassassin restart to a MailScanner reload.
- and since we're going spamassassin rule crazy, I added the latest
evilnumbers.cf rule set.
The updated script is attached.
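For reference, the safeguard the script relies on (and the reason caveat 2 above matters) is a simple lint gate: reload only if the updated rules pass spamassassin --lint, otherwise roll the .cf files back. A minimal sketch of that logic; the lint_gate name and its echoed actions are illustrative, not part of the script:

```shell
# Sketch of the script's lint gate. In the real script the "reload"
# branch runs ${SA_RESTART} and the "rollback" branch runs the
# accumulated ${UNDO_COMMAND}; lint_gate itself is a made-up name.
lint_gate() {
    lint_cmd="$1"                      # e.g. "spamassassin --lint"
    if ${lint_cmd} >/dev/null 2>&1; then
        echo "reload"                  # rules lint cleanly; safe to reload
    else
        echo "rollback"                # lint failed; restore old .cf files
    fi
}
```

Running lint_gate "spamassassin --lint" on a box with a stale local.cf is a quick way to see why the script refuses to reload after a bad update.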
--
Gerry
"The lyfe so short, the craft so long to learne" Chaucer
-------------- next part --------------
#!/bin/bash
# Version 1.04
## This file updates SpamAssassin RuleSet files from the internet.
##
## It is important that you *only* automatically update
## RuleSet files from people that you trust and that you
## *TEST* this.
##
## Note: When running this script interactively, debug mode is enabled so you can view the results.
# Usage instructions:
# 1) Choose rulesets to update (TRUSTED_RULESETS below)
# 2) Configure Local SpamAssassin settings (SA_DIR, MAIL_ADDRESS, SA_RESTART below)
# 3) Run this script periodically (manually or crontab)
# 3a) To run manually, first make it executable (chmod +x rules_du_jour) then execute (./rules_du_jour)
# 3b) To run via cron, edit your cron (crontab -e) and add a line such as this:
# 28 2 * * * /root/bin/rules_du_jour
# The crontab line above runs /root/bin/rules_du_jour at 2:28AM every day. (choose a different time, please)
# Make sure the user whose crontab you are editing has permission to write files to the SA config dir.
# Choose Rulesets from this list:
# BIGEVIL TRIPWIRE POPCORN BACKHAIR WEEDS1 WEEDS2 CHICKENPOX
# IMPORTANT: Edit this line to choose which RuleSets to update
TRUSTED_RULESETS="BIGEVIL TRIPWIRE POPCORN BACKHAIR WEEDS2 CHICKENPOX EVILNUMBERS";
#### Local SpamAssassin/system Settings ####
#### Modify these to match your system. ####
SA_DIR="/etc/mail/spamassassin"; # Change this to your SA local config
# directory, probably /etc/mail/spamassassin.
# For amavisd chrooted, this may be:
# /var/amavisd/etc/mail/spamassassin
MAIL_ADDRESS="root"; # Where do Email notifications go
SA_RESTART="/etc/rc.d/init.d/MailScanner reload"; # Command used to restart/reload SpamAssassin
# May be /etc/rc.d/init.d/spamassassin restart
# For amavisd, may be /etc/init.d/amavisd restart
# DEBUG="true"; # Uncomment this to turn debug mode on (or use -D)
#### End Local SpamAssassin Settings ####
TMPDIR="${SA_DIR}/RulesDuJour"; # Where we store old rulesets. If you delete
# this directory, RuleSets may be detected as
# out of date the next time you run rules_du_jour.
#### CF Files information ####
# These are bash Array Variables ("man bash" for more information)
declare -a CF_URLS; # Array that contains URLs of the files.
declare -a CF_FILES; # Local name of the CF file; eg: bigevil.cf
declare -a CF_NAMES; # Happy Name of CF file; eg: "Big Evil"
declare -a PARSE_NEW_VER_SCRIPTS; # Command to run on the file to retrieve new version info
declare -a CF_MUNGE_SCRIPTS; # This (optionally) modifies the file; eg: lower scores
#########################################
#### Begin Rules File Registry ####
#########################################
# If you add more RuleSets to your own registry, please contribute the settings to the www.exit0.us wiki
# http://www.exit0.us/index.php/RulesDuJourRuleSets
#### Here are settings for Tripwire. ####
TRIPWIRE=0; # Index of Tripwire data into the arrays is 0
CF_URLS[0]="http://www.merchantsoverseas.com/wwwroot/gorilla/99_FVGT_Tripwire.cf";
CF_FILES[0]="tripwire.cf";
CF_NAMES[0]="TripWire";
PARSE_NEW_VER_SCRIPTS[0]="grep -i '^[ ]*#.*version' | sort | tail -n1";
CF_MUNGE_SCRIPTS[0]="sed -e s/FVGT_TRIPWIRE_/TW_/g"; # shorten long names to workaround large mail header length
#### Here are settings for Big Evil. ####
BIGEVIL=1; # Index of Big Evil is 1
CF_URLS[1]="http://www.merchantsoverseas.com/wwwroot/gorilla/bigevil.cf";
CF_FILES[1]="bigevil.cf";
CF_NAMES[1]="Big Evil";
PARSE_NEW_VER_SCRIPTS[1]="head -n1";
#### Here are settings for Popcorn. ####
POPCORN=2; # Index of Popcorn is 2
CF_URLS[2]="http://www.emtinc.net/includes/popcorn.cf";
CF_FILES[2]="popcorn.cf";
CF_NAMES[2]="Jennifer's Popcorn";
PARSE_NEW_VER_SCRIPTS[2]="grep -i '^[ ]*#.*version[ ]*[0-9]' | sort | tail -n1";
# CF_MUNGE_SCRIPTS[2]="nothing, yet";
# TODO: Manipulate the scores.
#### Here are settings for Backhair. ####
BACKHAIR=3; # Index of Backhair is 3
CF_URLS[3]="http://www.emtinc.net/includes/backhair.cf";
CF_FILES[3]="backhair.cf";
CF_NAMES[3]="Jennifer's Backhair"; # ;-)
PARSE_NEW_VER_SCRIPTS[3]="grep -i '^[ ]*#.*version[ ]*[0-9]' | sort | tail -n1";
# CF_MUNGE_SCRIPTS[3]="nothing, yet";
# TODO: Manipulate the scores.
#### Here are settings for Weeds 1. ####
WEEDS1=4; # Index of Weeds Set 1 is 4
CF_URLS[4]="http://www.emtinc.net/includes/weeds.cf";
CF_FILES[4]="weeds.cf";
CF_NAMES[4]="Jennifer's Weeds Set (1)";
PARSE_NEW_VER_SCRIPTS[4]="grep -i '^[ ]*#.*version[ ]*[0-9]' | sort | tail -n1";
# CF_MUNGE_SCRIPTS[4]="nothing, yet";
# TODO: Manipulate the scores.
#### Here are settings for Weeds 2. ####
WEEDS2=5; # Index of Weeds Set 2 is 5
CF_URLS[5]="http://www.emtinc.net/includes/weeds_2.cf";
CF_FILES[5]="weeds_2.cf";
CF_NAMES[5]="Jennifer's Weeds Set (2)";
PARSE_NEW_VER_SCRIPTS[5]="grep -i '^[ ]*#.*version[ ]*[0-9]' | sort | tail -n1";
# CF_MUNGE_SCRIPTS[5]="nothing, yet";
# TODO: Manipulate the scores.
#### Here are settings for ChickenPox. ####
CHICKENPOX=6; # Index of ChickenPox is 6
CF_URLS[6]="http://www.emtinc.net/includes/chickenpox.cf";
CF_FILES[6]="chickenpox.cf";
CF_NAMES[6]="Jennifer's ChickenPox";
PARSE_NEW_VER_SCRIPTS[6]="grep -i '^[ ]*#.*version[ ]*[0-9]' | sort | tail -n1";
# CF_MUNGE_SCRIPTS[6]="nothing, yet";
# TODO: Manipulate the scores.
#### Here are settings for EvilNumbers. ####
EVILNUMBERS=7; # Index of EvilNumbers is 7
CF_URLS[7]="http://www.merchantsoverseas.com/wwwroot/gorilla/evilnumbers.cf";
CF_FILES[7]="evilnumbers.cf";
CF_NAMES[7]="Yackley's EvilNumbers";
PARSE_NEW_VER_SCRIPTS[7]="grep -i '^[ ]*#.*version' | sort | tail -n1";
# CF_MUNGE_SCRIPTS[7]="nothing, yet";
# TODO: Manipulate the scores.
#########################################
#### End Rules File Registry ####
#########################################
# Do not update beyond this line unless you know what you are doing.
#########################################
#### Begin rules update code ####
#########################################
# if invoked with -D, enable DEBUG here.
[ "$1" = "-D" ] && DEBUG="true";
# if running interactively, enable DEBUG here.
[ -t 0 ] && DEBUG="true";
# If we're not running interactively, add a random delay here. This should
# help reduce spikes on the servers hosting the rulesets (Thanks, Bob)
MAXDELAY=3600;
DELAY=0;
[ ! -t 0 ] && [ ${MAXDELAY} -gt 0 ] && let DELAY="${RANDOM} % ${MAXDELAY}";
[ "${DEBUG}" ] && [ ${DELAY} -gt 0 ] && echo "Probably running from cron... sleeping for a random interval (${DELAY} seconds)";
[ ${DELAY} -gt 0 ] && sleep ${DELAY};
# Save old working dir
OLDDIR=`pwd`;
# This variable is used to indicate if we should restart spamd. Currently empty (false).
RESTART_REQUIRED="";
[ "${DEBUG}" ] && [ -e ${TMPDIR} ] && echo "Temporary directory already existed: ${TMPDIR}";
[ "${DEBUG}" ] && [ ! -e ${TMPDIR} ] && echo "Temporary directory doesn't exist; creating: ${TMPDIR}";
[ ! -e ${TMPDIR} ] && mkdir ${TMPDIR};
[ "${DEBUG}" ] && echo "Changing to temporary directory: ${TMPDIR}";
cd ${TMPDIR};
for RULESET_NAME in ${TRUSTED_RULESETS} ; do
    INDEX=${!RULESET_NAME};
    CF_URL=${CF_URLS[${INDEX}]};
    CF_FILE=${CF_FILES[${INDEX}]};
    CF_NAME=${CF_NAMES[${INDEX}]};
    PARSE_NEW_VER_SCRIPT=${PARSE_NEW_VER_SCRIPTS[${INDEX}]};
    CF_MUNGE_SCRIPT=${CF_MUNGE_SCRIPTS[${INDEX}]};
    CF_BASENAME=`basename ${CF_URL}`;
    DATE=`date +"%Y%m%d-%H%M"`;
    if [ "${DEBUG}" ] ; then
        echo "";
        echo "------ ${RULESET_NAME} ------";
        echo "RULESET_NAME=${RULESET_NAME}";
        echo "INDEX=${INDEX}";
        echo "CF_URL=${CF_URL}";
        echo "CF_FILE=${CF_FILE}";
        echo "CF_NAME=${CF_NAME}";
        echo "PARSE_NEW_VER_SCRIPT=${PARSE_NEW_VER_SCRIPT}";
        echo "CF_MUNGE_SCRIPT=${CF_MUNGE_SCRIPT}";
    fi
    [ "${DEBUG}" ] && [ -f ${TMPDIR}/${CF_BASENAME} ] && echo "Old ${CF_BASENAME} already existed in ${TMPDIR}...";
    [ "${DEBUG}" ] && [ ! -f ${TMPDIR}/${CF_BASENAME} ] && [ ! -f ${SA_DIR}/${CF_FILE} ] && \
        echo "This is the first time downloading ${CF_BASENAME}...";
    [ "${DEBUG}" ] && [ ! -f ${TMPDIR}/${CF_BASENAME} ] && [ -f ${SA_DIR}/${CF_FILE} ] && \
        echo "Copying from ${SA_DIR}/${CF_FILE} to ${TMPDIR}/${CF_BASENAME}...";
    [ ! -f ${TMPDIR}/${CF_BASENAME} ] && [ -f ${SA_DIR}/${CF_FILE} ] && cp ${SA_DIR}/${CF_FILE} ${TMPDIR}/${CF_BASENAME} && touch -r ${SA_DIR}/${CF_FILE} ${TMPDIR}/${CF_BASENAME};
    [ "${DEBUG}" ] && echo "Retrieving file from ${CF_URL}...";
    wget -N ${CF_URL} > ${TMPDIR}/wget.log 2>&1;
    grep -q 'saved' ${TMPDIR}/wget.log;
    DOWNLOADED=$?;
    # Check for 4xx errors (e.g. 404 Not Found)
    grep -q 'ERROR 4[0-9][0-9]' ${TMPDIR}/wget.log;
    WAS404=$?;
    # Check for other failures (DNS doesn't resolve, etc.)
    grep -i -q 'failed: ' ${TMPDIR}/wget.log;
    FAILED=$?;
    # Unset WAS404 if the file didn't return a 4xx error.
    [ ! ${WAS404} = 0 ] && WAS404=;
    # Unset FAILED if wget succeeded.
    [ ! ${FAILED} = 0 ] && FAILED=;
    [ "${FAILED}" ] && RULES_THAT_404ED="${RULES_THAT_404ED}\n${CF_NAME} had an unknown error: `cat ${TMPDIR}/wget.log`";
    [ "${WAS404}" ] && RULES_THAT_404ED="${RULES_THAT_404ED}\n${CF_NAME} not found at ${CF_URL}";
    [ "${DEBUG}" ] && [ ${WAS404} ] && echo "Got 404 from ${CF_NAME} (${CF_URL})...";
    [ "${DEBUG}" ] && [ ! ${WAS404} ] && ([ ${DOWNLOADED} = 0 ] && echo "New version downloaded..." || echo "${CF_BASENAME} was up to date (skipped downloading of ${CF_URL})...");
    if [ ${DOWNLOADED} = 0 ] ; then
        if [ "${CF_MUNGE_SCRIPT}" ] ; then
            [ "${DEBUG}" ] && echo "Munging output using command: ${CF_MUNGE_SCRIPT}";
            sh -c "${CF_MUNGE_SCRIPT}" < ${TMPDIR}/${CF_BASENAME} > ${TMPDIR}/${CF_BASENAME}.2;
        else
            cp ${TMPDIR}/${CF_BASENAME} ${TMPDIR}/${CF_BASENAME}.2;
        fi
        # Set munged file to same timestamp as downloaded file...
        touch -r ${TMPDIR}/${CF_BASENAME} ${TMPDIR}/${CF_BASENAME}.2;
        [ -f ${SA_DIR}/${CF_FILE} ] && cmp -s ${TMPDIR}/${CF_BASENAME}.2 ${SA_DIR}/${CF_FILE} || {
            [ "${DEBUG}" ] && echo "Old version ${SA_DIR}/${CF_FILE} differs from new version ${TMPDIR}/${CF_BASENAME}.2" ;
            [ "${DEBUG}" ] && [ -f ${SA_DIR}/${CF_FILE} ] && echo "Backing up old version...";
            [ -f ${SA_DIR}/${CF_FILE} ] && mv -f ${SA_DIR}/${CF_FILE} ${TMPDIR}/${CF_FILE}.${DATE};
            # Save the command that can be used to undo this change, if rules won't --lint
            [ -f ${TMPDIR}/${CF_FILE}.${DATE} ] && UNDO_COMMAND="${UNDO_COMMAND} mv -f ${TMPDIR}/${CF_FILE}.${DATE} ${SA_DIR}/${CF_FILE};";
            [ ! -f ${TMPDIR}/${CF_FILE}.${DATE} ] && UNDO_COMMAND="${UNDO_COMMAND} rm -f ${SA_DIR}/${CF_FILE};";
            [ "${DEBUG}" ] && [ -f ${TMPDIR}/${CF_BASENAME}.2 ] && echo "Installing new version...";
            [ -f ${TMPDIR}/${CF_BASENAME}.2 ] && mv -f ${TMPDIR}/${CF_BASENAME}.2 ${SA_DIR}/${CF_FILE};
            NEWVER=`sh -c "cat ${SA_DIR}/${CF_FILE} | ${PARSE_NEW_VER_SCRIPT}"`;
            [ "${DEBUG}" ] && echo "${CF_NAME} has changed on `hostname`. The new ${CF_NAME} is ${NEWVER}";
            echo -e "${CF_NAME} has changed on `hostname`. The new ${CF_NAME} is ${NEWVER}" \
                | mail -s "RulesDuJour/`hostname`: ${CF_NAME} RuleSet has been updated" ${MAIL_ADDRESS};
            RESTART_REQUIRED="true";
        }
        [ -f ${TMPDIR}/${CF_BASENAME}.2 ] && rm -f ${TMPDIR}/${CF_BASENAME}.2;
    fi
done
[ "${DEBUG}" ] && echo "" && echo "";
[ "${RULES_THAT_404ED}" ] && echo -e "The following rules had 404 errors:${RULES_THAT_404ED}" | mail -s "RulesDuJour/`hostname`: 404 errors" ${MAIL_ADDRESS};
[ "${DEBUG}" ] && [ "${RULES_THAT_404ED}" ] && echo -e "The following rules had 404 errors:${RULES_THAT_404ED}" && echo "";
[ "${RESTART_REQUIRED}" ] && {
    sleep 1;
    [ "${DEBUG}" ] && echo "Attempting to --lint the rules.";
    spamassassin --lint > /dev/null 2>&1 ;
    LINTFAILED=$?;
    # Unset LINTFAILED if lint didn't fail.
    [ "${LINTFAILED}" = "0" ] && LINTFAILED=;
    [ "${DEBUG}" ] && [ "${LINTFAILED}" ] && echo "WARNING: spamassassin --lint failed." && echo "Rolling configuration files back, not restarting SpamAssassin." && echo "Rollback command is: ${UNDO_COMMAND}";
    [ "${LINTFAILED}" ] && RESTART_REQUIRED= && sh -c "${UNDO_COMMAND}";
    [ "${LINTFAILED}" ] && echo "spamassassin --lint failed. Rolling configuration files back, not restarting SpamAssassin. Rollback command was: ${UNDO_COMMAND}" | mail -s "RulesDuJour/`hostname`: lint failed. Updates rolled back." ${MAIL_ADDRESS};
    [ "${DEBUG}" ] && [ "${RESTART_REQUIRED}" ] && echo "Restarting SpamAssassin using: ${SA_RESTART}";
    [ "${RESTART_REQUIRED}" ] && ${SA_RESTART} > /dev/null 2>&1;
}
[ "${DEBUG}" ] && [ ! "${RESTART_REQUIRED}" ] && echo "No files updated; No restart required.";
[ "${DEBUG}" ] && echo "Changing back to old working directory: ${OLDDIR}";
cd ${OLDDIR};