2009
…
Consider a situation in which a group of assessors mark a collection of submissions; each assessor marks more than one submission and each submission is marked by more than one assessor. Typical scenarios include reviewing conference submissions and peer marking in a class. The problem is how to optimally assign a final mark to each submission. The mark assignment must be robust in the following sense. A small group of assessors might collude and give marks which significantly deviate from the marks given by other assessors. Another small group of assessors might give arbitrary marks, uncorrelated with the others’ assessments. Some assessors might be excessively generous while some might be extremely stringent. In each of these cases, the impact of the marks by assessors from such groups has to be appropriately discounted. Based on the work in [2], we propose a method which produces marks meeting the above requirements. The final mark assigned to each submission is a weighted averag...
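The abstract does not spell out the weighting scheme, so the following is only a generic sketch of the idea, not the method of [2]: each submission's mark is a weighted average of its assessors' marks, and assessors whose marks deviate strongly from the emerging consensus (colluding, arbitrary, or extreme markers) are iteratively down-weighted. The function name robust_marks, the deviation-based reweighting rule, and all parameters are illustrative assumptions.

import numpy as np

def robust_marks(scores, n_iter=20, eps=1e-6):
    # scores: array of shape (n_assessors, n_submissions); np.nan marks
    # assessor/submission pairs with no mark.
    marked = ~np.isnan(scores)
    filled = np.where(marked, scores, 0.0)
    weights = np.ones(scores.shape[0])
    for _ in range(n_iter):
        # Weighted average per submission over the assessors who marked it.
        w = weights[:, None] * marked
        marks = (w * filled).sum(axis=0) / np.maximum(w.sum(axis=0), eps)
        # Mean absolute deviation of each assessor from the consensus marks.
        dev = np.where(marked, np.abs(filled - marks), 0.0).sum(axis=1)
        dev /= np.maximum(marked.sum(axis=1), 1)
        # Assessors far from the consensus receive small weights.
        weights = 1.0 / (eps + dev)
        weights /= weights.sum()
    return marks, weights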
… of the 7th Australasian conference on …, 2005
Once the exclusive preserve of small graduate courses, peer assessment is being rediscovered as an effective and efficient learning tool in large undergraduate classes, a transition made possible through the use of electronic assignment submissions and web-based support software.
Proceedings of the 28th ACM Conference on User Modeling, Adaptation and Personalization, 2020
Peer grading, in which students grade each other's work, can provide an educational opportunity for students and reduce grading effort for instructors. A variety of methods have been proposed for synthesizing peer-assigned grades into accurate submission grades. However, when the assumptions behind these methods are not met, they may underperform a simple baseline of averaging the peer grades. We introduce SABTXT, which improves over previous work through two mechanisms. First, SABTXT uses a limited amount of historical instructor ground truth to model and correct for each peer's grading bias. Second, SABTXT models the thoroughness of a peer review based on its textual content, and puts more weight on the more thorough peer reviews when computing submission grades. In our experiments with over ten thousand peer reviews collected over four courses, we show that SABTXT outperforms existing approaches on our collected data, and achieves a mean squared error that is 6% lower than the strongest baseline on average.
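As a rough illustration of the two mechanisms described above (not SABTXT's actual model), the sketch below corrects each peer grade by a bias estimated from historically instructor-graded items, then weights the corrected grades by review length as a crude thoroughness proxy. The function names, the word-count proxy, and the min_words parameter are assumptions for illustration only.

def estimate_bias(peer_history):
    # peer_history: list of (peer_grade, instructor_grade) pairs for items
    # this peer graded that also received an instructor grade.
    if not peer_history:
        return 0.0
    return sum(p - i for p, i in peer_history) / len(peer_history)

def submission_grade(peer_grades, peer_biases, review_texts, min_words=20):
    # Bias-corrected grades, weighted by a word-count thoroughness proxy.
    corrected = [g - b for g, b in zip(peer_grades, peer_biases)]
    weights = [min(len(t.split()) / min_words, 1.0) for t in review_texts]
    weights = [max(w, 1e-3) for w in weights]  # avoid a zero total weight
    total = sum(weights)
    return sum(w * g for w, g in zip(weights, corrected)) / total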
IEEE Transactions on Learning Technologies, 2019
With the widespread adoption of large-scale e-learning environments such as MOOCs, peer assessment has become a popular way to measure learner ability. As the number of learners increases, peer assessment is often conducted by dividing learners into multiple groups to reduce each learner's assessment workload. In such cases, however, peer assessment accuracy depends on how the groups are formed. To resolve that difficulty, this study proposes a group formation method that maximizes peer assessment accuracy using item response theory and integer programming. Experimental results, however, demonstrate that this method alone does not achieve substantially higher accuracy than random group formation. Therefore, this study further proposes an external rater assignment method that assigns a few outside-group raters to each learner after groups are formed with the proposed method. Through simulation and real-data experiments, this study demonstrates that the proposed external rater assignment substantially improves peer assessment accuracy.
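The study's external rater assignment is driven by item response theory and integer programming; the sketch below ignores both and only illustrates the basic assignment step with a greedy load-balancing rule. The function assign_external_raters, the parameter k, and the heuristic itself are assumptions, not the paper's optimization model.

from collections import defaultdict

def assign_external_raters(groups, k=2):
    # groups: dict mapping group id -> list of learner ids.
    # Returns a dict mapping learner id -> list of k outside-group raters.
    group_of = {l: g for g, members in groups.items() for l in members}
    load = defaultdict(int)      # extra reviews already assigned to each rater
    assignment = {}
    for learner, g in group_of.items():
        # Prefer outside-group raters with the lightest current workload.
        candidates = sorted((r for r in group_of if group_of[r] != g),
                            key=lambda r: load[r])
        chosen = candidates[:k]
        for r in chosen:
            load[r] += 1
        assignment[learner] = chosen
    return assignment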
Personality and Individual Differences, 2013
2016
Consider a person who needs to assess a large amount of information. For instance, think of a teacher of a massive open online course with thousands of enrolled students, or a senior program committee member in a large conference who needs to decide the final marks of reviewed papers, or a buyer in an e-commerce scenario who needs to build up her opinion about products. When assessing a large number of objects, it is sometimes simply infeasible to evaluate them all, and very often one needs to rely on the opinions of others. In this paper, we provide a model that uses peer assessments (assessments made by others) in an online community to approximate the assessments that a particular member of the community would generate given the occasion to do so (e.g. the tutor, the SPC member or the buyer; we refer to this person as the leader). Furthermore, we provide a measure of the uncertainty of the computed assessments and a ranking of the objects that should be assessed next. The...
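The abstract leaves the model unspecified, so the following is only one plausible reading, not the paper's model: weight each peer by agreement with the leader on objects both have assessed, then use a weighted average of peer scores as the estimate and the weighted spread as a crude uncertainty measure that also suggests which objects to assess next. All names and formulas here are illustrative assumptions.

import numpy as np

def approximate_leader(peer_scores, leader_scores):
    # peer_scores:   array (n_peers, n_objects), np.nan where unassessed
    # leader_scores: array (n_objects,), np.nan where the leader is silent
    n_peers, n_objects = peer_scores.shape
    weights = np.ones(n_peers)
    for p in range(n_peers):
        shared = ~np.isnan(peer_scores[p]) & ~np.isnan(leader_scores)
        if shared.any():
            # Peers who historically agree with the leader weigh more.
            mad = np.mean(np.abs(peer_scores[p, shared] - leader_scores[shared]))
            weights[p] = 1.0 / (1.0 + mad)
    estimates = np.full(n_objects, np.nan)
    uncertainty = np.full(n_objects, np.nan)
    for o in range(n_objects):
        seen = ~np.isnan(peer_scores[:, o])
        if seen.any():
            w, s = weights[seen], peer_scores[seen, o]
            estimates[o] = np.average(s, weights=w)
            # Weighted spread of peer scores as a rough uncertainty proxy.
            uncertainty[o] = np.sqrt(np.average((s - estimates[o]) ** 2, weights=w))
    return estimates, uncertainty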
Studies in Fuzziness and Soft Computing, 2011
In this contribution we propose a method for generating OWA weighting vectors from the individual assessments on a set of alternatives, in such a way that the weights minimize the disagreement between the individual assessments and the outcome provided by the OWA operator. To measure that disagreement, we aggregate distances between individual and collective assessments using a metric and an aggregation function, focusing on the Manhattan and Chebyshev metrics and on the arithmetic mean and the maximum as aggregation functions. In this setting, we prove that the median and the mid-range are the solutions in some cases; when a general solution is not available, we provide mathematical programs for solving the problem.
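A minimal sketch of the quantity being minimized, under the assumption that the assessments form an individuals-by-alternatives matrix and that the OWA operator aggregates each alternative's column into a collective assessment; the function names are illustrative and the sketch does not reproduce the paper's mathematical programs.

import numpy as np

def owa(weights, values):
    # Ordered weighted average: weights applied to values sorted decreasingly.
    return float(np.dot(weights, np.sort(values)[::-1]))

def disagreement(weights, assessments, metric="manhattan", agg="mean"):
    # assessments: array of shape (n_individuals, n_alternatives).
    collective = np.array([owa(weights, col) for col in assessments.T])
    diffs = np.abs(assessments - collective)
    dists = diffs.sum(axis=1) if metric == "manhattan" else diffs.max(axis=1)
    return dists.mean() if agg == "mean" else dists.max()

Per the abstract, the median and the mid-range arise as solutions for some of these metric/aggregation combinations; for small examples, a brute-force search over weight vectors on the simplex can be used to check which weights minimize the disagreement.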
arXiv (Cornell University), 2016
Peer grading systems work well only if users have incentives to grade truthfully. An example of non-truthful grading, which we observed in classrooms, consists of students assigning the maximum grade to all submissions. With a naive grading scheme, such as averaging the assigned grades, all students would then receive the maximum grade. In this paper, we develop three grading schemes that provide incentives for truthful peer grading. In the first scheme, the instructor grades a fraction p of the submissions and penalizes students whose grades deviate from the instructor's. We provide lower bounds on p to ensure truthfulness, and conclude that these schemes work only for moderate class sizes, up to a few hundred students. To overcome this limitation, we propose a hierarchical extension of this supervised scheme, and we show that it can handle classes of any size with bounded (and little) instructor work, making it applicable to Massive Open Online Courses (MOOCs). Finally, we propose unsupervised incentive schemes, in which the student incentive is based on statistical properties of the grade distribution, without any grading required by the instructor. We show that the proposed unsupervised schemes provide incentives for truthful grading, at the price of possibly being unfair to individual students.
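As a toy illustration of the first (supervised) scheme only, the sketch below samples a fraction p of submissions for instructor regrading and charges each student a penalty proportional to the deviation of their peer grades from the instructor's on the sampled items. The penalty coefficient alpha, the sampling step, and the data layout are assumptions; the paper's lower bounds on p are not reproduced here.

import random

def spot_check_penalties(peer_grades, instructor_grades, p=0.2, alpha=1.0,
                         seed=None):
    # peer_grades: dict student -> dict submission -> grade given by that student
    # instructor_grades: dict submission -> instructor grade (available pool)
    rng = random.Random(seed)
    pool = list(instructor_grades)
    checked = rng.sample(pool, max(1, int(p * len(pool))))
    penalties = {}
    for student, grades in peer_grades.items():
        overlap = [s for s in checked if s in grades]
        # Penalty grows with deviation from the instructor on checked items.
        penalties[student] = alpha * sum(
            abs(grades[s] - instructor_grades[s]) for s in overlap)
    return penalties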