Accuracy & reliability

Maximising assessment validity

Hey 👋

Wassup. This week, we’re extending our assessment theory series with a quick look at accuracy and reliability…

Big idea πŸ‰

Validity refers to the extent to which the inferences we draw from an assessment are a true reflection of reality. If I weigh 70kg and my scales always show 70kg, then we might say that they are valid.

Reliability is one component of validity. It refers to the ability of a measure to produce a similar result under similar conditions. If my scales showed that I was 70kg in the bathroom but 75kg in the kitchen, then they wouldn’t be very reliable. And as a result, the inferences we could draw from them wouldn’t be very valid either.

Reliability contributes to validity. However, reliability by itself is insufficient. Our weighing scales could be consistent, but they might not be properly calibrated. Even though I weigh 70kg, they might always show me as weighing 75kg (regardless of the room I use). Validity requires accuracy as well as reliability.
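For anyone who likes to see this in numbers, here is a minimal Python sketch of the scales example. The readings are invented for illustration, and the `summarise` function is a hypothetical helper, not a standard measure from assessment theory. It treats accuracy as how far the average reading sits from the true weight, and reliability as how much the readings vary.

```python
from statistics import mean, pstdev

TRUE_WEIGHT_KG = 70.0  # the example weight used in the text


def summarise(readings, true_value=TRUE_WEIGHT_KG):
    """Return (bias, spread) for a list of scale readings.

    bias   = average reading minus the true value (accuracy: nearer 0 is better)
    spread = standard deviation of the readings (reliability: nearer 0 is better)
    """
    return mean(readings) - true_value, pstdev(readings)


# Hypothetical readings, made up to mirror the three cases in the text.
scales = {
    "valid": [70.0, 70.0, 70.0, 70.0],
    "unreliable (bathroom vs kitchen)": [70.0, 75.0, 70.0, 75.0],
    "reliable but miscalibrated": [75.0, 75.0, 75.0, 75.0],
}

for name, readings in scales.items():
    bias, spread = summarise(readings)
    print(f"{name}: bias = {bias} kg, spread = {spread} kg")
```

Only the first set of scales has both zero bias and zero spread. The miscalibrated scales have zero spread (perfectly consistent) but are still 5kg out, which is the sense in which reliability alone is not enough.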

There are various things that influence the reliability of school assessments:

  1. The questions we use across different assessments which try to measure the same thing.

  2. The conditions in which the assessments take place.

  3. The consistency of marking, between different people or even by the same person at different times.

For greatest reliability (and so validity), we want to get the same result regardless of the questions that were used, the time or place the assessment was conducted, or the person who marked it.
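Marking consistency is the easiest of the three factors above to put a number on. The sketch below is a rough illustration only: the marks are invented, and `agreement_rate` and `mean_abs_difference` are hypothetical helpers chosen for simplicity rather than the formal statistics used in reliability research. It assumes two markers have each marked the same five scripts.

```python
def agreement_rate(marks_a, marks_b):
    """Fraction of scripts on which two markers gave exactly the same mark."""
    if len(marks_a) != len(marks_b):
        raise ValueError("Both markers must mark the same set of scripts")
    matches = sum(a == b for a, b in zip(marks_a, marks_b))
    return matches / len(marks_a)


def mean_abs_difference(marks_a, marks_b):
    """Average size of the gap between the two markers' marks."""
    if len(marks_a) != len(marks_b):
        raise ValueError("Both markers must mark the same set of scripts")
    return sum(abs(a - b) for a, b in zip(marks_a, marks_b)) / len(marks_a)


# Hypothetical marks for the same five scripts, invented for illustration.
marker_1 = [12, 15, 9, 18, 14]
marker_2 = [12, 14, 9, 18, 11]

print(agreement_rate(marker_1, marker_2))
print(mean_abs_difference(marker_1, marker_2))
```

The same two functions could be used to compare one person's marks on the same scripts at two different times.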

Note 1 → Some subjects and question types lend themselves better to reliable assessment than others. For example, math(s) and multiple-choice questions tend to have more definitive answers than literature and essay questions, which increases the chances that multiple markers will award similar results.

Note 2 → There are often trade-offs between accuracy and reliability. For example, we could increase the reliability of a history assessment by using only multiple-choice questions, but in doing this we would reduce the accuracy (and so overall validity) of the inferences we could make as a result.


  • Reliability refers to the ability of a measure to produce a similar result under similar conditions.

  • Reliability is a component of validity, along with accuracy.

  • We should consider trade-offs between accuracy and reliability when seeking to maximise validity.



Peps 👊