Standard Setting: The What and Why
Standard setting is an activity in which subject matter experts evaluate data (e.g., test data, results data, examinee differences data) to set a minimum passing score or scores on an assessment. There are many volumes devoted to standard setting methods along with their costs and benefits. Whatever the method, the result is an "arbitrary but not capricious" score used to delineate performance into adjacent categories like pass/fail or meets expectations/exceeds expectations. The phrase "arbitrary but not capricious" is a common way to describe the outcome of a standard setting: a cut score recommendation. It is important to clarify that this does not mean "meaningless" or "haphazard." The "arbitrary" portion indicates that when the standard setting is repeated with different subject matter experts, the outcome (cut score) will move, but it should move noticeably in fewer than 1 in 20 repetitions. The "not capricious" portion means that the standard setting followed a standardized, repeatable, well-constructed process guided by meaningful reasoning.
Now that we have covered the "what," let's explore the "why." At the ARRT, expectations from professionals, academic experts, and the public must be made cohesive to establish a minimum level of quality that the professional must demonstrate. The ARRT's formula of excellence (Ethics + Education + Exam = Excellence™) informs the knowledge, skills, and abilities (KSAs) that candidates must demonstrate from the start of their careers. This is one aspect of validity.
Validity is often compared with a legal argument that has one directive that is supported by evidence. A test's construct and content validity, two key pieces of validity evidence, are first established in the test planning phase, where it is determined what topics a test will cover and what items might assess those KSAs. An example of something with poor content and construct validity would be measuring height among young schoolchildren to predict how well students read overall. Although, to a point, height and literacy correlate highly, so too do height and age, and age and the amount of school-based reading instruction. To measure reading performance, it may be wiser to measure things directly related to literacy, such as reading comprehension and fluency: things logically related to reading performance that are meaningful throughout a lifetime.
So, here we have navigated from standard setting to the fundamental credibility of the assessment. The nature of a validity argument is that at every step we must always work toward ensuring a high-quality assessment that delivers credible results. The validity argument starts before the first item has been written, continues through setting the standards or cut score, and concludes with evidence from examinee performance. Then it starts all over. In assessment, validity is at the heart of the efforts, and standard setting plays a pivotal role in maintaining the quality of the validity argument.