Checking For Plagiarism With moss241
This page describes how to use the moss241 script written by Rob Schluntz to check assignment submissions for plagiarism. This guide is written for CS 241 ISAs, but the script and most of the instructions below should work for other courses.
Basic Usage
- Log in to the course account and change directory to /u/cs241/handin.
- Download the submissions for the assignment problem you want to check and store them in this directory. The easiest way to do this is with MarmSql. If you are downloading submissions for assignment problem "AxPy", store them in /u/cs241/handin/AxPy. Make sure the "A" and "P" are uppercase (i.e. A3P4 rather than a3p4); if this naming convention is not followed, moss241 will not be able to find submissions from previous terms to compare with. For example, to download submissions for A3P4 and store them in /u/cs241/handin/A3P4, type: marm_sql -v -d A3P4
Alternatively, you can download the submissions through the Marmoset web interface by going to the "Utilities" page for the project and clicking "Download all students' best submissions". You will then have to unzip the submissions and copy them into a folder under /u/cs241/handin.
- Run the following command (from /u/cs241/handin): ./moss-bin/moss241.sh AxPy
Where "AxPy" is the name of the folder storing the assignment submissions that you created in the previous step. Again, it is important that the "A" and "P" are uppercase.
The script will call moss-setup.sh, which checks whether there is an AxPy folder in /u/cs241/archives/*/, where * is the term number of each of the three previous terms to compare submissions with. It also checks whether base code is available. If at least one term is missing, abort and temporarily rename (or copy) the corresponding assignment problem in that term's archives/* folder to /u/cs241/archives/*/AxPy. Since problem numbers have changed between terms, you may have to check the assignment page to confirm it is the same problem. For example, if you're running Moss on A4P8 and there wasn't an A4P8 two terms ago, but there was an A4P7, you'd temporarily rename that folder to A4P8 (make sure to rename it back when Moss is done). You may also choose to continue, in which case Moss will skip the missing terms.
- When the script finishes, the results will be available in the /u/cs241/handin/AxPy folder. The results are separated by language, and if the submissions were compared with submissions from previous terms, they are also separated by the term they were compared with. To view the results, open /u/cs241/handin/AxPy/moss_summary.html, which lists the results for the current term and the available past terms by language. Note that Racket submissions are referred to as "Scheme" submissions for historical reasons.
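Before running moss241, it can help to see which previous terms actually have an archived folder for the problem. The following is a minimal sketch of such a check, not the logic used by moss-setup.sh; the function name missing_terms is hypothetical, and the term numbers in the example are only illustrative.

```shell
# Hypothetical pre-flight check (not part of moss241): print each term
# whose archive folder does not contain the given problem folder.
# Usage: missing_terms ARCHIVE_ROOT PROBLEM TERM...
missing_terms() {
    root=$1
    problem=$2
    shift 2
    for term in "$@"; do
        # A term counts as available only if its AxPy folder exists.
        if [ ! -d "$root/$term/$problem" ]; then
            echo "$term"
        fi
    done
}

# Example (term numbers are illustrative):
# missing_terms /u/cs241/archives A4P8 1171 1169 1165
```

Any term printed by this check is one where you would either rename or copy the matching problem folder, or accept that Moss skips that term.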
Advanced Usage
Running Moss On A Batch Of Assignment Problems
There is a script that downloads all submissions for a batch of assignment problems and runs Moss on all of them. From /u/cs241/handin, call /u/cs241/handin/moss-bin/MOSS_shortcut. It will ask for the assignment problems you want to run Moss on. Separate the names of the problems with spaces. When finished, press Enter.
Here is a way to run Moss on many assignment problems at once, without having to type in the problems individually.
- Once you have downloaded all the submissions for all the problems, navigate to ~/handin/moss-bin.
- Using shell globbing patterns, find an ls command that gives you the list of submission directories you want to run Moss on. For example, this command should capture all directories for Assignment 3 and Assignment 4 problems in the handin folder: ls -d ../A[34]P*
- Now use this for loop to run Moss on all the desired submission directories:
for dir in $(ls -d ../A[34]P*); do yes "n" | ./moss241.sh "$(basename "$dir")"; done
Replace the example ls command with your own command.
Explanation: We loop over all the directories returned by the ls command. We use basename to get just the "AxPy" part of the directory name, which is what moss241 expects as input. We use yes "n" to pass a stream of n's to the command because moss241 will prompt you to abort if no base code is found for an assignment problem; we want to say "n" (for "No") to all of these abort prompts.
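To preview which problems a batch run would cover before starting it, the directory names can be listed first. This is a minimal sketch that uses the glob directly instead of ls; the function name list_problems is hypothetical, and the A[34]P* pattern is simply the one from the example above.

```shell
# Hypothetical helper (not part of moss241): print the problem name
# (e.g. A3P1) of every directory matching A[34]P* under the given folder.
list_problems() {
    for dir in "$1"/A[34]P*/; do
        # Skip the unexpanded pattern when nothing matches.
        [ -d "$dir" ] || continue
        basename "$dir"
    done
}

# Dry run from ~/handin/moss-bin: print the commands instead of running them.
# list_problems .. | while read -r name; do
#     echo "yes n | ./moss241.sh $name"
# done
```

The trailing slash in the pattern restricts matches to directories, so stray files with similar names are ignored.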
Comparing With Particular Terms
The moss241 script tries to compare the submissions from the current term to submissions from previous terms to catch cases where a student gets code from a friend who took the course in a previous term. Currently, moss241 is set up to compare the current term with at most three previous terms. The terms to compare with are specified by variables called TERM1NUM, TERM2NUM and TERM3NUM, which are set by the moss-setup script in the /u/cs241/handin/moss-bin folder.
By default, moss-setup sets these variables to numbers that correspond to the previous three terms. For example, if the current term is 1175 (Spring 2017), moss-setup will pick 1171 (Winter 2017), 1169 (Fall 2016) and 1165 (Spring 2016) as the three terms to compare to. If you want different behavior, you will need to modify moss-setup to set the variables to different values.
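The default behavior described above can be reproduced with simple arithmetic on the term number. The following is a minimal sketch, not the actual code in moss-setup; the function name prev_term is hypothetical, and it assumes term numbers always end in 1 (Winter), 5 (Spring) or 9 (Fall).

```shell
# Hypothetical helper (not taken from moss-setup): print the term number
# that precedes the given one.
prev_term() {
    case $(( $1 % 10 )) in
        1) echo $(( $1 - 2 )) ;;    # Winter -> previous Fall, e.g. 1171 -> 1169
        5|9) echo $(( $1 - 4 )) ;;  # Spring -> Winter, Fall -> Spring
        *) echo "unexpected term number: $1" >&2; return 1 ;;
    esac
}

# Current term hard-coded here for illustration (Spring 2017).
TERM1NUM=$(prev_term 1175)
TERM2NUM=$(prev_term "$TERM1NUM")
TERM3NUM=$(prev_term "$TERM2NUM")
echo "$TERM1NUM $TERM2NUM $TERM3NUM"   # prints "1171 1169 1165"
```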
You can essentially make moss-setup as complicated or simple as you want; the only requirement is that it sets the three variables TERM1NUM, TERM2NUM and TERM3NUM to some term numbers. For example, you could modify it to compare with the previous three years rather than the previous three terms (e.g. if the current term is Fall 2012 it would pick Fall 2011, Fall 2010 and Fall 2009). Or you could simply hard-code the terms you want to compare with by commenting out the existing code and replacing it with lines that directly set each variable (e.g. TERM1NUM=1125).
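As an illustration of the "previous three years" variant, the same term in an earlier year differs by exactly 10 in the term number. This is a minimal sketch under that assumption; the function name years_back and the CURRENT_TERM variable are hypothetical and do not come from moss-setup.

```shell
# Hypothetical helper: print the number of the same term N years earlier.
# Usage: years_back TERM N
years_back() {
    echo $(( $1 - 10 * $2 ))
}

CURRENT_TERM=1129   # Fall 2012, hard-coded here for illustration
TERM1NUM=$(years_back "$CURRENT_TERM" 1)
TERM2NUM=$(years_back "$CURRENT_TERM" 2)
TERM3NUM=$(years_back "$CURRENT_TERM" 3)
echo "$TERM1NUM $TERM2NUM $TERM3NUM"   # prints "1119 1109 1099"
```

The output corresponds to Fall 2011, Fall 2010 and Fall 2009, matching the example in the text.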
The term numbers you use should be in the "Registrar's Office" format, i.e. 1YRT where YR is a 2-digit number representing the year and T is a single digit representing the starting month of the term. The command termcode returns the number for the current term in this format. This format is required because the directories in the archives containing the assignment submissions for previous terms are named using it; moss241 will not be able to find the submissions for a previous term if you use a different format.
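When hard-coding term numbers, it is easy to mistype one, so decoding a number back into a readable term can serve as a sanity check. This is a minimal sketch; the function name describe_term is hypothetical, and it assumes the leading 1 stands for the 2000s and that T is 1, 5 or 9 (Winter, Spring, Fall), consistent with the examples on this page.

```shell
# Hypothetical helper: turn a 1YRT term number into "Season Year".
describe_term() {
    yr=$(( ($1 / 10) % 100 ))       # the two YR digits
    case $(( $1 % 10 )) in          # the T digit (starting month)
        1) season=Winter ;;
        5) season=Spring ;;
        9) season=Fall ;;
        *) echo "unexpected term number: $1" >&2; return 1 ;;
    esac
    # Assumption: the leading 1 denotes the 2000s.
    echo "$season $(( 2000 + yr ))"
}

describe_term 1175   # prints "Spring 2017"
```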
Supplying Base Code to Moss
Often CS assignments will have some instructor-supplied base code that students are required to use in their submissions. We do not want Moss to flag this base code as plagiarism. Moss attempts to detect when base code was provided and leave it out of the matches it reports, but this does not always work perfectly and there are often false positives. To reduce the number of false positives, the base code files themselves can be passed into Moss through a command line option.
The moss241 script will look in the /u/cs241/handin/base folder for base code, and if it finds base code corresponding to the assignment you are checking it will pass the base code into Moss. The base code files must be stored in a particular format; there should be one base code file for each combination of assignment and programming language. For example, supposing assignment "AxPy" used base code and the allowed languages were C++ and Racket, you would create files called AxPy.cc and AxPy.rkt in /u/cs241/handin/base which contain the base code for the corresponding language.
In CS 241, all the base code files for assignments have been set up as of Fall 2012. However, if any of the assignments have changed significantly since that term, you may need to add new base code or modify the existing base code.
Some notes:
- For C++ base code you must use the .cc extension, not .cpp.
- The languages supported by moss241 are C, C++, Racket, Java and Scala. Any files that do not have a .c, .cc, .rkt, .java, or .scala extension are treated as ASCII text files by Moss. To supply base code that is not from one of these languages, store it in a file called AxPy (no extension).
- This system only allows one base code file per language and assignment problem, but for problems where multiple base code files for the same language are provided, concatenating them all together seems to produce decent results.
- Aside from starter code provided in assignments, you may also want to include any code provided in tutorials in the base code files. At the end of the term you should remove this term-specific code from the base code files, as the next term might use different tutorials.
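For problems with several starter files in one language, the concatenation mentioned in the notes above could be done as follows. This is a minimal sketch; the function name build_base_file and the file names in the example are hypothetical.

```shell
# Hypothetical helper (not part of moss241): concatenate several starter
# files into the single base code file for one problem and language.
# Usage: build_base_file BASE_DIR PROBLEM EXT FILE...
build_base_file() {
    base_dir=$1
    problem=$2
    ext=$3
    shift 3
    # Overwrites any existing base code file for this problem and language.
    cat "$@" > "$base_dir/$problem.$ext"
}

# Example (file names are illustrative; note .cc rather than .cpp):
# build_base_file /u/cs241/handin/base A3P4 cc scanner.cc main.cc
```

Because the target file is overwritten, keep a copy of the existing base code if you are only adding term-specific tutorial code temporarily.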
Supplying External Code To Moss
Some students have unfortunately posted their CS 241 code on sites like GitHub, allowing other students to download and copy it. We want Moss to catch these cases, so moss241 has a feature where you can supply "external" code like this to be compared with student submissions. It works by simply copying the external code into the student submission directory before actually running Moss.
For the copying to work, the external code must be set up in a specific format:
- It must be stored in the ~/handin/external folder.
- For each "set" of external code you should have a subfolder in ~/handin/external. For example, for a particular Github repository you could store the code in something like ~/handin/external/github00. This folder name must be at most 8 characters long, all alphabetic characters in the folder name must be lowercase, and it should not match any student user ID. Otherwise moss241 will have issues copying the external code, or the external code could overwrite a student submission. Adding numbers to the end of the folder name is a good way to ensure it does not match any student user ID.
- Within the subfolder you should have one folder per assignment problem that contains the corresponding code. For example, you could have folders ~/handin/external/github00/A3P1, ~/handin/external/github00/A3P2, ~/handin/external/github00/A3P3 and so on.
There should already be at least one "set" of external code in the ~/handin/external folder if you want an example to refer to.
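The folder naming rules above can be checked mechanically before adding a new set of external code. This is a minimal sketch; the function name is_valid_external_name is hypothetical, and the check does not verify that the name differs from every student user ID, which still has to be confirmed separately.

```shell
# Hypothetical check (not part of moss241): succeed only if NAME is
# non-empty, at most 8 characters long, and has no uppercase letters.
is_valid_external_name() {
    name=$1
    [ -n "$name" ] || return 1
    [ "${#name}" -le 8 ] || return 1
    case "$name" in
        *[[:upper:]]*) return 1 ;;
    esac
    return 0
}

# Example: create the per-problem layout only if the name passes.
# if is_valid_external_name github00; then
#     mkdir -p ~/handin/external/github00/A3P1
# fi
```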
The actual copying is done by a helper script called copy-external.sh. If this script is not present in the moss-bin folder, the external code feature will not work.
Use In Other Courses
The moss241 script and the helper scripts it uses are available for download below. They should work in other course accounts, but this cannot be guaranteed. Read through the source code for the scripts (they are commented) to see if anything might need to be changed, and test the script in a sandbox environment until you are sure it works.
One possible issue is that the script assumes the only file formats that occur in assignment submissions are C/C++/Racket/Java/Scala source code, or ASCII text. If your course uses some other programming language, Moss will treat the source code as ASCII text (even if it is a language that Moss has support for) unless you extend the script.
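To show where such an extension would go, here is a minimal sketch of classifying a file by extension. It is not the code used by moss241; the function name language_for_file is hypothetical, and the labels it prints are only descriptive names, not the option values that Moss itself accepts.

```shell
# Hypothetical illustration of extension-based classification.
language_for_file() {
    case "$1" in
        *.c)     echo "C" ;;
        *.cc)    echo "C++" ;;
        *.rkt)   echo "Scheme" ;;   # Racket is reported as "Scheme"
        *.java)  echo "Java" ;;
        *.scala) echo "Scala" ;;
        # A course using another language would add its own case here.
        *)       echo "ASCII" ;;    # everything else, including .cpp
    esac
}

language_for_file wlp4gen.rkt   # prints "Scheme"
```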