Grading Logistics: Ideas and Tips

I consistently teach over 200 students at a time, so I rely heavily on autograders and teaching assistants (TAs) to grade everything. I use and love Gradescope. But, like all interfaces, Gradescope only offers so many affordances for how to use it. So I thought I’d write a blog post on the logistics side of how I use Gradescope to manage my grading.

This post covers a mix of low-effort and high-effort things I do to handle the logistics of grading homeworks and exams.

[Also posted on Medium.]

The “Prof. Double Check” rubric

For those unfamiliar with Gradescope, it enables hand grading via rubric items that TAs can check off to mark student work. Rubrics help make grading more consistent, and Gradescope automatically handles live rubric updates and multiple TAs grading the same problem. In turn, Gradescope shows summary statistics on these rubric items and allows filtering by rubric item.

I take advantage of this feature by adding a rubric item at the bottom called “Prof. Double Check,” which is worth 0 points. If a TA is unsure of their grading or how to grade a submission, they mark that rubric item and add a comment for me in the Gradescope comment box. Before I publish grades, I filter on that rubric item for each problem. I go through each double-check to confirm or update the grading and the comment. Finally, I uncheck the double-check rubric item (so I don’t confuse myself later).

This process gives me a sense of the edge cases and potential rubric confusion. If I am checking while the TA is still grading, I’ll discuss the confusing parts with them and update the rubric to reduce the number of double-check requests for the rest of the submissions.

When I publish, I leave the double-check rubric item in place so that when I copy the rubric in future semesters, I don’t have to remember to add it. I don’t know what students make of it. I suspect many don’t notice it since it’s not checked off when they see their grades.

TA Training Sets

I always grade 10 or more submissions per problem to show TAs how to apply the rubric. I teach data science, so in a way, it is a literal training set for the TAs to learn how to grade. If the rubric has been battle-tested in prior semesters, I’ll sometimes have a grad TA create the training set, though whether I ask them depends on a number of factors.

I grade “across and down” to create these training sets. What I mean by that is I start with the first problem, pick a random submission (Gradescope sorts them chronologically by last submission time), and grade 10 submissions in a row for that problem. If I don’t get much variety in the grades for that problem (e.g., all were perfect scores, but the rubric has some important nuances the TAs need to be aware of), I’ll grade more until I’m satisfied. Then I “go down” one problem, start grading at the next submission I haven’t graded yet, and repeat. This process creates a “waterfall” down the problems/submissions.

If I hit the bottom of the submissions, I grade the latest submissions I haven’t graded yet so that the training set is all in the same place chronologically on the page. For the next problem, I start at the top. So, for example, if there are 8 problems and 100 submissions and I start at submission 14, I’ll grade submissions 14 to 23 for problem 1, 24–33 for problem 2, etc. And if I graded more than 10 submissions for some of the earlier problems, so that I end up with something like 93–100 for problem 8, I’ll go back and grade submissions 91 and 92 as well. That way, all 10 graded submissions are at the bottom of the page.
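The waterfall above can be sketched in a few lines of Python. This is only an illustration under simplifying assumptions: it grades exactly `per_problem` submissions for every problem (no extra grading for variety), numbers submissions from 1 in the order Gradescope lists them, and takes the starting submission as an argument instead of picking it at random. The function name and parameters are hypothetical and not part of Gradescope.

```python
def waterfall_ranges(num_problems, num_submissions, start, per_problem=10):
    """Return one inclusive (first, last) submission range per problem.

    Submissions are numbered 1..num_submissions in the listed order.
    """
    if not 1 <= start <= num_submissions:
        raise ValueError("start must be a valid submission number")
    if per_problem > num_submissions:
        raise ValueError("not enough submissions for a full training set")

    ranges = []
    first = start
    for _ in range(num_problems):
        last = first + per_problem - 1
        if last > num_submissions:
            # Hit the bottom: back up so the whole set sits at the end of the list.
            last = num_submissions
            first = last - per_problem + 1
        ranges.append((first, last))
        # Go down one problem; wrap to the top after reaching the bottom.
        first = last + 1 if last < num_submissions else 1
    return ranges


# The example from the text: 8 problems, 100 submissions, starting at submission 14.
for problem, (first, last) in enumerate(waterfall_ranges(8, 100, 14), start=1):
    print(f"Problem {problem}: submissions {first}-{last}")
```

With these inputs the first two problems get submissions 14–23 and 24–33, matching the example above.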

My goal with this waterfall from a random starting point is to see as many submissions as possible. It gives me a sense of how all the students are doing. It also sometimes turns up submissions that are suspiciously similar, though the odds of my catching that this way are low.

Grading Work Units

Not all problems take the same time to grade. Therefore, to keep the time spent grading fair, I have a spreadsheet where TAs sign up to grade, and I assign each problem a number of “units” of work. The TAs’ goal is to split the units evenly among themselves.

I decide on a problem’s work units based on how long it takes me to grade the training set. The units are very coarse and do not map to a particular number of hours, but that’s fine as long as they are internally consistent. What I mean by that is 1 work unit is half the work of 2 work units. I assign work units at a granularity of 0.5 units. And if a problem is 2 or more work units, I sometimes split it across multiple rows in the spreadsheet to indicate multiple TAs can sign up to grade it.
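As a rough sketch of the bookkeeping, the snippet below totals the units each TA has claimed and compares them against an even share. The row layout, TA labels, and function names are hypothetical stand-ins for a sign-up spreadsheet, and the even share here only counts TAs who have signed up for at least one row.

```python
from collections import defaultdict


def units_by_ta(signups):
    """Sum the claimed work units for each TA.

    signups: iterable of (ta_name, problem, units) rows, like a sign-up sheet.
    """
    totals = defaultdict(float)
    for ta_name, _problem, units in signups:
        # Units are assigned at a granularity of 0.5.
        if units <= 0 or (units * 2) % 1 != 0:
            raise ValueError(f"units must be a positive multiple of 0.5, got {units}")
        totals[ta_name] += units
    return dict(totals)


def even_share(signups):
    """Units each signed-up TA would hold under a perfectly even split."""
    totals = units_by_ta(signups)
    return sum(totals.values()) / len(totals)


# Hypothetical sheet: problem 3 is worth 2 units and is split across two rows.
sheet = [
    ("TA A", "Problem 1", 1.0),
    ("TA B", "Problem 2", 0.5),
    ("TA A", "Problem 3 (row 1)", 1.0),
    ("TA B", "Problem 3 (row 2)", 1.0),
    ("TA B", "Problem 4", 0.5),
]
print(units_by_ta(sheet))
print(even_share(sheet))
```

Because the units are coarse, a small difference between a TA’s total and the even share is expected; the point is only to spot large imbalances.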

Regrade Window

Everything that involves manual grading has a one-week regrade period. This window opens about one day after the assessment is published (the exact time the next day varies) and closes one week later at 11:59 pm.

Why it is one week is mildly arbitrary, but it feels neither too long nor too short to me. I make my students wait a day so that they calm down, digest their grade, and look carefully at the grading before they hit the regrade button. In addition, there’s usually an opportunity to consult the teaching staff during those ~24 hours. Do students always do this, so that there are no ill-advised regrade requests? No. However, I’ve noticed that it dramatically reduces the number of regrade requests of the style, “but I feel I deserve more points,” as well as requests full of anger that my poor TA grader does not need to see.
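For anyone who wants to compute the window dates instead of counting on a calendar, here is a minimal sketch. It assumes the window opens exactly 24 hours after publishing (in practice the time varies) and reads “closes one week later” as 11:59 pm on the seventh day after it opens. The function name is hypothetical, and the dates still have to be entered into Gradescope by hand.

```python
from datetime import datetime, time, timedelta


def regrade_window(published, delay=timedelta(days=1)):
    """Return (opens, closes) for a regrade window.

    Assumptions: the window opens `delay` after publishing and closes at
    11:59 pm on the seventh day after it opens.
    """
    opens = published + delay
    closes = datetime.combine(
        opens.date() + timedelta(days=7), time(23, 59), tzinfo=opens.tzinfo
    )
    return opens, closes


# Example with a hypothetical publish time.
opens, closes = regrade_window(datetime(2024, 3, 4, 10, 0))
print("Regrades open: ", opens)
print("Regrades close:", closes)
```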

Homework Grading Timeline

Here is this semester’s timeline, where homeworks are due weekly on Friday. If the due date is different, I simply shift the timeline, keeping as much of the workload as possible on business days:

Friday — Homework is due
Sunday or Monday — A grad TA or I grade the training sets, update the spreadsheet, and notify the Undergrad TAs (UTAs) that problems are ready to be claimed.
Thursday — Grad TA in charge of that homework ensures the UTAs claim all work units.
Saturday morning — Grad TA checks grading progress and sends reminders/asks the UTAs for their timeline.
Sunday-ish — Check grading progress.
Monday — I look through all Prof. Double Check submissions and update/confirm as needed, set up the regrade window, and publish the homeworks.
Regrade window — I handle all of these myself when I have time, which means I sometimes don’t get to them until after the window closes. I am considering handing this to a grad TA to free up more of my time.
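The timeline above can also be written as day offsets from the due date, which makes it easy to generate concrete dates for each homework. This is a simplified sketch: the offsets are read off the Friday-based schedule (using Monday for the “Sunday or Monday” step), a different due date just shifts every step by the same amount, and it does not try to move work onto business days. The names are hypothetical.

```python
from datetime import date, timedelta

# Day offsets from the due date, read off the Friday-based timeline.
STEPS = [
    (0, "Homework is due"),
    (3, "Training sets graded, spreadsheet updated, UTAs notified"),
    (6, "Grad TA ensures all work units are claimed"),
    (8, "Grad TA checks grading progress and sends reminders"),
    (9, "Check grading progress"),
    (10, "Review Prof. Double Check items, set up regrade window, publish"),
]


def grading_timeline(due):
    """Return (date, step) pairs for a homework due on `due`."""
    return [(due + timedelta(days=offset), step) for offset, step in STEPS]


# Example with a hypothetical Friday due date.
for day, step in grading_timeline(date(2024, 3, 1)):
    print(day.strftime("%a %b %d"), "-", step)
```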

Exam Timeline

This timeline is similar to homework, except we have an in-person grading session on the first weekend after the exam. UTAs are also much more likely to have a partner for grading a particular problem than they are for homeworks. The in-person session is especially helpful because the partners go through the training set together, I’m available to answer questions, and they grade side-by-side for a while. Also, in-person grading sessions are honestly a lot more fun. I might stretch the timeline beyond a week if there is also a homework that needs grading, and I usually ask my UTAs to prioritize the exam during these times.

Conclusion

There are many other things I could write about how I grade, but this hodgepodge list seems like the kind of thing someone would only learn through word of mouth (or invention in isolation) rather than from Gradescope itself. So I thought a blog post would be useful.

Do you have a tip or idea that helps make grading easier? Want to know more about my process? Leave a comment!

Prof. Kristin Stephens-Martinez