A utility that will download posts from a newsgroup, `newsarchiver.py`, is attached to this document (the Wiki insists on appending `.txt`; just rename it). It will be wrapped by a program similar to the following, which should then be run frequently (every minute) from a cron job:

    #!/bin/bash
    export PATH=`/bin/showpath gnu standard`
    cd /u/cs135/newsgroup
    ./newsarchiver.py uw.cs.cs135 >/dev/null
    tmpfile=a0_done.$$
    realfile=a0_done
    grep '^From: ' `grep -l '^References:$' *`\
      | perl -lne 'print $1 if /\<(.*?)@.*uwaterloo\.ca\>/' > $tmpfile
    mv $tmpfile $realfile

The portion after the string "^References:" in the grep line must be replaced by the message-ID of the initial newsgroup post, and the course name must be replaced with the proper course on every line where it occurs.
Changes that can/should be made and tested include:
- Use `whoami` to avoid hard-coding a particular course name in several places, so this script can be copied with fewer modifications.
- Assume the script is stored in the same directory as `newsarchiver.py` (which it was anyway), so that the `cd` line can be changed to the less fragile `cd $(dirname $0)`.
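
The two changes listed above could be combined roughly as follows. This is only a sketch and has not been tested against the real setup. It assumes that the cron job runs under the course account (so that `whoami` prints something like `cs135`) and that the newsgroup is named `uw.cs.<course>`. The function names `course_name`, `script_dir` and `update_posts` are made up for this illustration, and the site-specific `PATH` line from the original wrapper is left out.

```shell
#!/bin/bash
# Sketch only: a variant of the wrapper with the two suggested changes.
# Assumptions: the script runs as the course account (whoami prints e.g.
# cs135) and the newsgroup is named uw.cs.<course>.

# Course name taken from the account running the script.
course_name() {
    whoami
}

# Directory containing this script, used instead of a hard-coded path.
script_dir() {
    dirname -- "$0"
}

# Download new posts and rebuild the list of userids that have posted.
update_posts() {
    local course tmpfile realfile
    course=$(course_name)
    cd "$(script_dir)" || return 1
    ./newsarchiver.py "uw.cs.$course" >/dev/null || return 1
    tmpfile=a0_done.$$
    realfile=a0_done
    # As in the original, the pattern after ^References: still has to be
    # replaced by the message-ID of the initial post.
    grep '^From: ' $(grep -l '^References:$' *) \
        | perl -lne 'print $1 if /\<(.*?)@.*uwaterloo\.ca\>/' > "$tmpfile"
    mv "$tmpfile" "$realfile"
}

# Do nothing unless the archiver really is next to this script.
if [ -x "$(script_dir)/newsarchiver.py" ]; then
    update_posts
fi
```

Writing to a temporary file and renaming it, as the original does, means a reader never sees a half-written `a0_done`.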
The `test.exe` program can then take an action similar to the following:

    #!/usr/bin/env bash
    readonly POSTS="/u/$course/newsgroup/a0_done"
    if [ `grep '^'$student'$' $POSTS | wc -w` -gt 0 ]; then
        echo '100' >&3
        echo "The appropriate email address was used for a newsgroup post." >&4
    else
        echo '0' >&3
        echo "An e-mail from $student's uwaterloo address was not found on the newsgroup." >&4
    fi

Please note the importance of wrapping the student userid in ^ and $. This is done to ensure that the userid is actually listed in full, and is not just a substring of another userid, which could lead to false positives.
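
The effect of the anchors can be seen in a small experiment. The following is a minimal sketch; the userids `jsmith2` and `ajsmith` and the helper `count_matches` are invented for the example and do not come from the course setup.

```shell
#!/bin/bash
# Sketch: why the userid is wrapped in ^ and $.
# The userids below are made up for illustration.
posts=$(mktemp)
printf '%s\n' jsmith2 ajsmith > "$posts"

# Print the number of lines in file $2 that match pattern $1.
count_matches() {
    grep -c -- "$1" "$2"
}

# Without anchors, jsmith matches both lines as a substring.
unanchored=$(count_matches 'jsmith' "$posts")
# With anchors, only a line that is exactly jsmith would match.
anchored=$(count_matches '^jsmith$' "$posts")

echo "unanchored: $unanchored, anchored: $anchored"
# prints "unanchored: 2, anchored: 0"
rm -f "$posts"
```

Here a student `jsmith` who never posted would be credited by the unanchored pattern, but not by the anchored one.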
To allow the `POSTS` filename to contain whitespace, should the grep in `test.exe` instead be `grep "^$student$" "$POSTS" | ...`?
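
One way to explore this question is sketched below, under the assumption that a standard `grep` with the `-q`, `-x` and `-F` options is available. Quoting both expansions keeps a path with a space in one piece, and `-x -F` is an alternative to writing the anchors by hand. The directory, file name, userids and the helper `has_posted` are hypothetical.

```shell
#!/bin/bash
# Sketch: looking up a userid in a file whose path contains a space.
# The directory, file name and userids are made up for illustration.
dir=$(mktemp -d)
POSTS="$dir/a0 done"
printf '%s\n' jsmith2 akumar > "$POSTS"

# Succeed if file $2 has a line consisting of exactly the userid $1.
# -x matches whole lines (like ^...$), -F treats the userid as a literal
# string rather than a pattern, and -q suppresses output.
has_posted() {
    grep -qxF -- "$1" "$2"
}

if has_posted akumar "$POSTS"; then found_full=yes; else found_full=no; fi
if has_posted jsmith "$POSTS"; then found_partial=yes; else found_partial=no; fi

echo "akumar: $found_full, jsmith: $found_partial"
# prints "akumar: yes, jsmith: no"
rm -rf "$dir"
```

Testing the exit status of `grep -q` also removes the need for the `wc -w` count.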
Attachment | History | Size | Date | Who | Comment |
---|---|---|---|---|---|
newsarchiver.py.txt | r1 | 5.7 K | 2009-07-23 - 14:15 | TerryVaskor | Utility that will download newsgroup articles |