Removing Bias and Incentivizing Precision in Peer-grading
Authors: Anujit Chakraborty, Jatin Jindal, Swaprava Nath
JAIR 2024
| Reproducibility Variable | Result | LLM Response |
|---|---|---|
| Research Type | Experimental | Data from our classroom experiments is consistent with our theoretical assumptions and shows that PEQA outperforms the popular median mechanism, which is used in several massive open online courses (MOOCs). |
| Researcher Affiliation | Collaboration | Anujit Chakraborty EMAIL University of California Davis; Jatin Jindal EMAIL Google (India); Swaprava Nath EMAIL Indian Institute of Technology Bombay |
| Pseudocode | Yes | Algorithm 1 PEQA mechanism; Algorithm 2 PEQA |
| Open Source Code | Yes | All data and code are available via: https://www.cse.iitb.ac.in/~swaprava/papers/Codes_Peer_Grading.zip |
| Open Datasets | Yes | All data and code of this paper are available at: https://www.cse.iitb.ac.in/~swaprava/papers/Codes_Peer_Grading.zip |
| Dataset Splits | Yes | We ran two experimental sessions: one with the median scoring mechanism (27 students), another with the PEQA mechanism (42 students). ... We partitioned each quiz into three sub-quizzes ... In every round, the students were asked to peer-grade five sub-quizzes (each corresponding to one of five of her anonymous peers). ... We randomly chose two of the five sub-quizzes that Median subjects graded in each round, and treated them as probes, and the rest as non-probes. |
| Hardware Specification | No | We conducted both sessions during the weekly Prog101 labs, which take place in a large computer lab. The paper does not specify particular CPU/GPU models or other detailed hardware components used. |
| Software Dependencies | No | The paper does not mention any specific software dependencies with version numbers (e.g., library names with versions). |
| Experiment Setup | Yes | The PEQA performance score for each question you have graded that is worth x points is assigned on the scale [0, x/2]. ... The relative weight α that an instructor assigns in PEQA (see Step 7 of the compute function in Algorithm 1) to the peer-grading performance score determines what percentage of the total grade comes from the student's own exam score versus the peer-grading exercise ... In Section 6, we extend our analysis to how PEQA deals with social welfare in a world where increasing reliability is costly to the grader. ... we assume that each paper has a single question. All graders face the same reliability-cost function c while grading that paper/question. ... We paid students by the relative ranking of their total scores in the class in both sessions. Students who ranked in the first quartile of the total scores received M 650; the next three quartiles received M 450, M 250, and M 50, respectively. They also received a show-up fee of M 50, irrespective of their total score. |
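The experiment-setup row above can be sketched in code: combining each student's own exam score with an α-weighted peer-grading performance score, and paying by quartile of the resulting class ranking. This is a minimal illustrative sketch based only on the figures quoted in the table; the function names and the exact aggregation are assumptions, not the paper's released code (available at the ZIP link above).

```python
# Hypothetical sketch of a PEQA-style total score and quartile payment.
# Names and formulas are illustrative, reconstructed from the quoted
# description, not taken from the authors' released implementation.

def total_score(exam_score: float, peer_grading_score: float, alpha: float) -> float:
    """Combine the student's own exam score with the peer-grading
    performance score, weighted by the instructor-chosen alpha.
    Per the paper, each graded question worth x points contributes a
    performance score on the scale [0, x/2]."""
    return exam_score + alpha * peer_grading_score

def quartile_payment(rank: int, class_size: int) -> int:
    """Pay by quartile of the total-score ranking (rank 0 = best):
    M 650 / 450 / 250 / 50, plus a show-up fee of M 50."""
    quartile = min(3, (rank * 4) // class_size)  # 0 = first quartile
    base = [650, 450, 250, 50][quartile]
    return base + 50  # show-up fee, paid irrespective of total score
```

For example, a student ranked first in a class of 40 would receive M 700 (M 650 plus the M 50 show-up fee), while the last-ranked student would still receive M 100.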