Checking For Plagiarism With moss241

This page describes how to use the moss241 script written by Rob Schluntz to check assignment submissions for plagiarism. This guide is written for CS 241 ISAs, but the script and most of the instructions below should work for other courses.

Basic Usage

  1. Log in to the course account and change directory to /u/cs241/handin.
  2. Download the submissions for the assignment problem you want to check and store them in this directory. The easiest way to do this is with MarmSql. If you are downloading submissions for assignment problem "AxPy", store them in /u/cs241/handin/AxPy. Make sure the "A" and "P" are uppercase (i.e. A3P4 rather than a3p4); if this naming convention is not followed, moss241 will not be able to find submissions from previous terms to compare with. For example, to download submissions for A3P4 and store them in /u/cs241/handin/A3P4, type:
    marm_sql -v -d A3P4
    Alternatively, you can download the submissions through the Marmoset web interface by going to the "Utilities" page for the project and clicking "Download all students' best submissions". You will then have to unzip the submissions and copy them into a folder under /u/cs241/handin.
  3. Run the following command (from /u/cs241/handin):
    ./moss-bin/moss241.sh AxPy
    Where "AxPy" is the name of the folder storing the assignment submissions that you created in the previous step. Again, it is important that the "A" and "P" are uppercase.

    The script will call moss-setup.sh, which checks whether there is an AxPy folder in /u/cs241/archives/*/, where * is the term number of each of the 3 previous terms to compare submissions with. It also checks whether base code is available. If at least one term is missing, abort and temporarily rename (or copy) the corresponding assignment problem folder in that term's archive to /u/cs241/archives/*/AxPy. Since problem numbers have changed between terms, you may have to check the assignment page to see if it's the same problem. For example, if you're running Moss on A4P8 and there wasn't an A4P8 two terms ago, but there was an A4P7, you'd temporarily rename that folder to A4P8 (make sure to rename it back when Moss is done). You may also choose to continue, but Moss will skip missing terms.
  4. When the script finishes, the results will be available in the /u/cs241/handin/AxPy folder. The results are separated by language, and if the submissions were compared with submissions from previous terms then they will also be separated by which term they were compared with. To view the results, open /u/cs241/handin/AxPy/moss_summary.html, which lists the results for the current term and any available past terms by language. Note that Racket submissions are referred to as "Scheme" submissions for historical reasons.
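
Before running the script, it can help to check the folder name and the archives by hand. The following is a minimal sketch of such a pre-flight check, assuming the archive layout described above (/u/cs241/archives/TERM/AxPy). The function names is_problem_name and missing_terms are hypothetical and are not part of moss241.

```shell
# Hypothetical helpers, not part of moss241.

# Succeeds only for names of the form A<digits>P<digits>, e.g. A3P4.
is_problem_name() {
    case "$1" in
        A*P*) ;;                  # uppercase "A" first, uppercase "P" somewhere after
        *) return 1 ;;
    esac
    a_part=${1#A}                 # drop the leading "A"
    a_part=${a_part%%P*}          # keep what precedes the first "P"
    p_part=${1#A*P}               # keep what follows the first "P"
    case "$a_part" in ''|*[!0-9]*) return 1 ;; esac
    case "$p_part" in ''|*[!0-9]*) return 1 ;; esac
    return 0
}

# Prints each term whose archive has no folder for the given problem.
# Usage: missing_terms ARCHIVE_ROOT PROBLEM TERM...
missing_terms() {
    root=$1; problem=$2; shift 2
    for term in "$@"; do
        [ -d "$root/$term/$problem" ] || echo "$term"
    done
}
```

For example, missing_terms /u/cs241/archives A3P4 1171 1169 1165 would list the terms for which a folder may need to be temporarily renamed or copied.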

Advanced Usage

Running Moss On A Batch Of Assignment Problems

There is a script that downloads all submissions for a batch of assignment problems and runs Moss on all of them. From /u/cs241/handin, call /u/cs241/handin/moss-bin/MOSS_shortcut. It will ask for the assignment problems you want to run Moss on. Separate the names of the problems with spaces. When finished, press Enter.

Here is a way to run Moss on many assignment problems at once, without having to type in the problems individually.

  1. Once you have downloaded all the submissions for all the problems, navigate to ~/handin/moss-bin.
  2. Using shell globbing patterns, find an ls command that gives you the list of submission directories you want to run Moss on. For example, this command should capture all directories for Assignment 3 and Assignment 4 problems in the handin folder:
    ls -d ../A[34]P*
  3. Now use this for loop to run Moss on all the desired submission directories:
    for dir in $(ls -d ../A[34]P*); do yes "n" | ./moss241.sh "$(basename "$dir")"; done
    Replace the example ls command with your own command.

    Explanation: We loop over all the directories returned by the ls command. We use basename to get just the "AxPy" part of the directory name, which is what moss241 expects as input. We use yes "n" to pass a stream of n's to the command because moss241 will prompt you to abort if no base code is found for an assignment problem; we want to say "n" (for "No") to all of these abort prompts.
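
The same loop can be wrapped in a function that lets the shell expand the glob directly and accepts the command to run as a parameter, which allows a dry run before the real one. This is only a sketch: run_batch is a hypothetical name, and passing echo as the runner simply prints the problem names that would be handed to moss241.

```shell
# Hypothetical wrapper, not part of moss241.
# Usage: run_batch HANDIN_DIR RUNNER GLOB
# RUNNER would be ./moss241.sh in real use; pass "echo" for a dry run.
run_batch() {
    handin=$1; runner=$2; pattern=$3
    # $pattern is deliberately unquoted so the shell expands the glob.
    for dir in "$handin"/$pattern; do
        [ -d "$dir" ] || continue     # skip non-directories and unmatched globs
        # Answer "n" to every abort prompt, as in the loop above.
        yes "n" | "$runner" "$(basename "$dir")"
    done
}
```

From the moss-bin folder, run_batch .. echo 'A[34]P*' would list the selected problems, and replacing echo with ./moss241.sh would run Moss on them.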

Comparing With Particular Terms

The moss241 script tries to compare the submissions from the current term to submissions from previous terms to catch cases where a student gets code from a friend who took the course in a previous term. Currently, moss241 is set up to compare the current term with at most three previous terms. The terms to compare with are specified by variables called TERM1NUM, TERM2NUM and TERM3NUM, which are set by the moss-setup script in the /u/cs241/handin/moss-bin folder.

By default, moss-setup sets these variables to numbers that correspond to the previous three terms. For example, if the current term is 1175 (Spring 2017), moss-setup will pick 1171 (Winter 2017), 1169 (Fall 2016) and 1165 (Spring 2016) as the three terms to compare to. If you want different behavior, you will need to modify moss-setup to set the variables to different values.
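
The arithmetic behind this default can be sketched as follows. This is not the actual moss-setup code; prev_term is a hypothetical helper, and it assumes four-digit term codes whose last digit is always 1, 5 or 9, as in the example above.

```shell
# Hypothetical helper: prints the term code immediately before the given one.
# Assumes the last digit of a term code is 1, 5 or 9.
prev_term() {
    case $(( $1 % 10 )) in
        1)   echo $(( $1 - 2 )) ;;   # e.g. 1171 -> 1169 (crosses a year boundary)
        5|9) echo $(( $1 - 4 )) ;;   # e.g. 1175 -> 1171, 1169 -> 1165
        *)   echo "unrecognized term code: $1" >&2; return 1 ;;
    esac
}

# Reproduce the example from the text for current term 1175.
TERM1NUM=$(prev_term 1175)
TERM2NUM=$(prev_term "$TERM1NUM")
TERM3NUM=$(prev_term "$TERM2NUM")
echo "$TERM1NUM $TERM2NUM $TERM3NUM"   # prints "1171 1169 1165"
```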

You can essentially make moss-setup as complicated or simple as you want; the only requirement is that it sets the three variables TERM1NUM, TERM2NUM and TERM3NUM to some term numbers. For example, you could modify it to compare with the previous three years rather than the previous three terms (i.e. if the current term is Fall 2012 it would pick Fall 2011, Fall 2010 and Fall 2009). Or you could simply hard-code the terms you want to compare with by commenting out the existing code and replacing it with lines that directly set each variable (e.g. TERM1NUM=1125).
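
As an illustration of the "previous three years" variant, the snippet below subtracts 10 from the term code for each year, since the year occupies the digits before the final term digit. It is a sketch under the assumption that moss-setup is a shell script that only needs to assign these three variables; the variable current is hypothetical, and in practice it would presumably come from the termcode command rather than a literal.

```shell
# Sketch only: compare with the same term in each of the previous three years.
current=1129                      # Fall 2012, the example from the text
TERM1NUM=$(( current - 10 ))      # same term, one year earlier
TERM2NUM=$(( current - 20 ))      # two years earlier
TERM3NUM=$(( current - 30 ))      # three years earlier
echo "$TERM1NUM $TERM2NUM $TERM3NUM"   # prints "1119 1109 1099"
```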

The term numbers you use should be in the "Registrar's Office" format, i.e. 1YRT where YR is a 2-digit number representing the year and T is a single digit representing the starting month of the term. The command termcode returns the number for the current term in this format. The reason for this is that the directories in the archives containing the assignment submissions for previous terms use this term number format; moss241 will not be able to find the submissions for a previous term if you use a different format.
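
To sanity-check a term number before hard-coding it, it can be decoded back into a readable name. The helper below is a hypothetical sketch; the mapping of the last digit to Winter, Spring and Fall is inferred from the examples on this page (1171, 1175 and 1169), and the year is recovered by treating the leading digits as an offset from 1900.

```shell
# Hypothetical helper: turns a term number such as 1175 into "Spring 2017".
describe_term() {
    year=$(( 1900 + $1 / 10 ))    # 1175 / 10 = 117, and 1900 + 117 = 2017
    case $(( $1 % 10 )) in
        1) season=Winter ;;       # term starting in January
        5) season=Spring ;;       # term starting in May
        9) season=Fall ;;         # term starting in September
        *) echo "unrecognized term code: $1" >&2; return 1 ;;
    esac
    echo "$season $year"
}

describe_term 1125                # prints "Spring 2012"
```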

Supplying Base Code to Moss

Often CS assignments will have some instructor-supplied base code that students are required to use in their submissions. Obviously, we do not want Moss to flag this base code as plagiarism. Moss attempts to detect when base code was provided and leave it out of the matches it reports, but this does not always work perfectly and there are often false positives. To reduce the number of false positives, the base code files themselves can be passed into Moss through a command line option.

The moss241 script will look in the /u/cs241/handin/base folder for base code, and if it finds base code corresponding to the assignment you are checking it will pass the base code into Moss. The base code files must be stored in a particular format; there should be one base code file for each combination of assignment and programming language. For example, supposing assignment "AxPy" used base code and the allowed languages were C++ and Racket, you would create files called AxPy.cc and AxPy.rkt in /u/cs241/handin/base which contain the base code for the corresponding language.

In CS 241, all the base code files for assignments have been set up as of Fall 2012. However, if any of the assignments have changed significantly since that term, you may need to add new base code or modify the existing base code.

Some notes:

  • For C++ base code you must use the .cc extension, not .cpp.
  • The languages supported by moss241 are C, C++, Racket, Java and Scala. Any files that do not have a .c, .cc, .rkt, .java, or .scala extension are treated as ASCII text files by Moss. To supply base code that is not from one of these languages, store it in a file called AxPy (no extension).
  • This system only allows one base code file per language and assignment problem, but for problems where multiple base code files for the same language are provided, concatenating them all together seems to produce decent results.
  • Aside from starter code provided in assignments, you may also want to include any code provided in tutorials in the base code files. At the end of the term you should remove this term-specific code from the base code files, as the next term might use different tutorials.
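
The concatenation approach from the notes above could be scripted roughly as follows. This is a sketch, not part of moss241: build_base is a hypothetical name, and the input file names in the usage example are placeholders.

```shell
# Hypothetical helper: concatenates several files into one base code file.
# Usage: build_base BASE_DIR PROBLEM EXT FILE...
# Pass an empty EXT ("") for base code in a language moss241 does not support,
# which produces a file called PROBLEM with no extension.
build_base() {
    base_dir=$1; problem=$2; ext=$3; shift 3
    if [ "$ext" = "cpp" ]; then
        echo "use the .cc extension for C++ base code, not .cpp" >&2
        return 1
    fi
    # ${ext:+.$ext} expands to ".EXT" only when EXT is non-empty.
    cat "$@" > "$base_dir/$problem${ext:+.$ext}"
}
```

For example, build_base /u/cs241/handin/base A3P4 cc starter1.cc starter2.cc would write /u/cs241/handin/base/A3P4.cc, where starter1.cc and starter2.cc stand in for whatever starter files the assignment provides.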

Supplying External Code To Moss

Some students have unfortunately posted their CS 241 code on sites like GitHub, allowing other students to download and copy it. We want Moss to catch these cases, so moss241 has a feature where you can supply "external" code like this to be compared with student submissions. It works by simply copying the external code into the student submission directory before actually running Moss.

For the copying to work, the external code must be set up in a specific format:

  • It must be stored in the ~/handin/external folder.
  • For each "set" of external code you should have a subfolder in ~/handin/external. For example, for a particular GitHub repository you could store the code in something like ~/handin/external/github00. This folder name must be at most 8 characters long, all alphabetic characters in the folder name must be lowercase, and it should not match any student user ID. Otherwise moss241 will have issues copying the external code, or the external code could overwrite a student submission. Adding numbers to the end of the folder name is a good way to ensure it does not match any student user ID.
  • Within the subfolder you should have one folder per assignment problem that contains the corresponding code. For example, you could have folders ~/handin/external/github00/A3P1, ~/handin/external/github00/A3P2, ~/handin/external/github00/A3P3 and so on.
There should already be at least one "set" of external code in the ~/handin/external folder if you want an example to refer to.
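
The folder naming rules above can be checked mechanically before adding a new set. The function below is a hypothetical sketch, not part of moss241. The optional second argument, a file with one student user ID per line, is an assumption made for illustration; no such file is described on this page.

```shell
# Hypothetical check for an external code folder name.
# Usage: valid_external_name NAME [USER_ID_FILE]
valid_external_name() {
    name=$1; ids_file=${2:-}
    # Must be between 1 and 8 characters long.
    [ "${#name}" -ge 1 ] && [ "${#name}" -le 8 ] || return 1
    # Must not contain uppercase letters.
    case "$name" in
        *[[:upper:]]*) return 1 ;;
    esac
    # If a list of user IDs is supplied, the name must not appear in it.
    if [ -n "$ids_file" ] && grep -qxF -- "$name" "$ids_file"; then
        return 1
    fi
    return 0
}
```

With this sketch, github00 passes, while GitHub00 (uppercase letters) and github000 (nine characters) are rejected.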

The actual copying is done by a helper script called copy-external.sh. If this script is not present in the moss-bin folder, the external code feature will not work.
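
For readers who want a picture of what this copying step involves, here is a rough sketch. It is not the contents of copy-external.sh, which may work differently; it assumes each student's submission sits in a folder named after their user ID inside the submission directory, and copy_external is a hypothetical name.

```shell
# Sketch only; the real copy-external.sh may differ.
# Usage: copy_external EXTERNAL_DIR SUBMISSION_DIR PROBLEM
copy_external() {
    external=$1; submissions=$2; problem=$3
    for set_dir in "$external"/*/; do
        src=${set_dir}$problem
        [ -d "$src" ] || continue     # this set has no code for the problem
        dest=$submissions/$(basename "$set_dir")
        if [ -e "$dest" ]; then
            # Refuse to overwrite, e.g. when the set name matches a user ID.
            echo "skipping $dest: already exists" >&2
            continue
        fi
        cp -R "$src" "$dest"
    done
}
```

Under these assumptions, copy_external ~/handin/external ~/handin/A3P1 A3P1 would place the A3P1 code from github00 in ~/handin/A3P1/github00, alongside the student submissions.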

Use In Other Courses

The moss241 script and the helper scripts it uses are available for download below. They should work in other course accounts, but this cannot be guaranteed. Read through the source code for the scripts (they are commented) to see if anything might need to be changed, and test the script in a sandbox environment until you are sure it works.

One possible issue is that the script assumes the only file formats that occur in assignment submissions are C/C++/Racket/Java/Scala source code, or ASCII text. If your course uses some other programming language, Moss will treat the source code as ASCII text (even if it is a language that Moss has support for) unless you extend the script.
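
The classification rule described above amounts to a lookup on the file extension. The sketch below illustrates the idea and shows where a new language would be added. It is not taken from the moss241 source, language_for is a hypothetical name, and the labels are the language names used on this page rather than the options actually passed to Moss.

```shell
# Hypothetical illustration of classifying a submission file by extension.
language_for() {
    case "$1" in
        *.c)     echo "C" ;;
        *.cc)    echo "C++" ;;
        *.rkt)   echo "Racket" ;;
        *.java)  echo "Java" ;;
        *.scala) echo "Scala" ;;
        # A course using another language would add a case here.
        *)       echo "ASCII" ;;      # everything else is treated as plain text
    esac
}

language_for solution.py              # prints "ASCII"
```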

Topic attachments
Attachment    Size    Date                Who
Scripts.zip   8.4 K   2020-04-06 11:06    SylvieDavies
Topic revision: r11 - 2021-12-23 - JeremyLuo
 