Background Information about Rubrics:
Rubrics, scoring guidelines, and criteria are all terms that refer to the guides
used to score performance assessments in a reliable, fair, and valid
manner. We have selected the term "rubric" to refer to the scoring
guidelines used on the PALM web site. When designing performance
assessments, the selection of targets, description of the assessment
tasks, and development of the rubric are all interrelated. Without
a rubric, a performance assessment task becomes an instructional
activity. Rubrics should include:
- dimensions of key behaviors
- examples of the behaviors
- scales (i.e., checklists, numerical, or qualitative)
- standards of excellence for specified performance levels
Clear dimensions of performance assessments specify the definitions of performance using the behaviors that students will actually demonstrate and that judges will rate. For example, a dimension of performance can be stated as follows: "Student's capability of using measurement tools will be demonstrated by plotting the levels of two variables on a two-dimensional graph using a graphing calculator." Do not make broad statements, such as: "Students will show an understanding of graphing calculators." Dimensions can also be clarified by the use of questions. For example, fluency in writing can be assessed as follows: "Does the student use pre-writing strategies (e.g., drawing, listing, clustering)?" "Does the student have spelling problems that block the flow of ideas?" These questions focus the teacher's or rater's attention on the dimensions of writing fluency that the student should demonstrate during the writing assessment: drawing, listing, clustering, and spelling difficulties. The dimensions of performance should be defined so that scorers, teachers, students, and other stakeholders understand them in the same way. Some performance assessment tasks are complex and may require performances on several dimensions to determine whether students have acquired the desired content and skills.
Rubrics can use different types of scales to document student performance. For example, the presence or absence of a variety of behaviors can be documented using a simple checklist. Numerical scales, such as those ranging from "1" to "10," can be used to assign ratings that differentiate among levels of performance. A third type of scale, qualitative, assigns words to various levels of performance, such as "inadequate" for the lowest levels of performance and "excellent" for the highest. A variety of descriptive terms can be used to rate the performances depending on the content and skills being assessed. For example, a qualitative scale can be used to rate the degree of organization in a student project (i.e., "well organized" to "disorganized") or levels of originality in a project (i.e., "highly creative" to "little evidence of new or original thought").
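The three scale types described above can be sketched as simple data structures. This is an illustrative sketch only; the behavior and level names are hypothetical, not taken from the PALM materials:

```python
# Sketch of the three rubric scale types: checklist, numerical, qualitative.
# All names below are illustrative, not from the PALM/PALS materials.

# 1. Checklist: records the presence or absence of behaviors.
checklist = {
    "uses pre-writing strategies": True,
    "spelling blocks flow of ideas": False,
}

# 2. Numerical scale: a rating within a fixed range, e.g. 1-10.
numerical_rating = 7
assert 1 <= numerical_rating <= 10  # rating must fall within the scale

# 3. Qualitative scale: words assigned to ordered performance levels.
qualitative_scale = ["inadequate", "adequate", "good", "excellent"]

def describe(level_index):
    """Return the qualitative label for a numeric level (0 = lowest)."""
    return qualitative_scale[level_index]

print(describe(3))  # highest level -> "excellent"
```

The ordering of the qualitative list matters: it is what lets a word scale still support comparisons between levels.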
For each level of performance specified in the rubric, specific behaviors and examples of performance should be provided. For example, in the Electrical Circuit and Switches rubric, there are four levels of performance (criteria) that range from "1" (minimal performance) to "4" (excellent performance). The student behaviors needed to achieve each of the levels are specified in the rubric. To attain Level 1 (Criterion 1), a student "provides a complete circuit." To attain Level 2, a student must "provide a complete working circuit and switch or provide a complete working circuit and modified switch or provide a complete working circuit and short circuiting switch." To attain Level 3, a student must make a "clear drawing of a modified switch (Switch or main parts must be labeled!)" To attain Level 4, a student must provide a "clear description of how a modified switch works." To attain Levels 3 and 4, the student must also show that they accomplished Levels 1 and 2. Please note that Level 2 incorporates Level 1.
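The cumulative structure of the Electrical Circuit and Switches rubric, where Levels 3 and 4 also require Levels 1 and 2, can be sketched as a small scoring function. The function name and input format are our own illustration:

```python
# Sketch of the cumulative scoring rule described for the Electrical
# Circuit and Switches rubric: a student attains the highest level for
# which that criterion AND every lower criterion are met.
# (Illustrative only; criteria are represented as booleans keyed by level.)

def score(criteria_met):
    """criteria_met: dict mapping level number (1-4) to True/False."""
    level = 0
    for lvl in (1, 2, 3, 4):
        if criteria_met.get(lvl, False):
            level = lvl   # this level is attained...
        else:
            break         # ...but higher levels require all lower ones
    return level

# A student with a complete working circuit and switch and a clear,
# labeled drawing, but no clear description of how the switch works:
print(score({1: True, 2: True, 3: True, 4: False}))  # -> 3
```

Note that a clear drawing alone (Level 3 met, Level 2 not) would still score only Level 1, mirroring the rubric's requirement that levels build on one another.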
Technically sound rubrics are:
- Continuous: The change in quality from score point to score point should be equal: the
degree of difference between a 5 and a 4 should be the same as that between a 2 and a 1.
- Parallel: Similar language should be used to describe each level of performance (e.g.,
low skill level, moderate skill level, and high skill level), as opposed to non-parallel
constructions (e.g., low skill level, understands how to perform some of the task, excellent performance).
- Coherent: The rubric must focus on the same achievement target throughout, although each
level of the rubric will specify different degrees of attainment of that target. For example, if
the purpose of the performance assessment is to measure organization in writing, then each point on
the rubric must relate to different degrees of organization, not to factual accuracy or some other dimension.
- Highly Descriptive: Highly descriptive evaluative language ("excellent," "poor") and
comparative language ("better than," "worse than") should be used to clarify each level of
performance in order to help teachers and raters recognize the salient and distinctive features of
each level. It also communicates performance expectations to students, parents, and other stakeholders.
- Valid: The rubric permits valid inferences about performance to the degree that what is
scored is what is central to performance, not what is merely easy to see or score, or based on
factors other than the achievements being measured. The proposed differences in levels of
performance should a) reflect the key components of student performance, b) describe qualitative,
not quantitative differences in performance, and c) not confuse merely correlative behaviors with
authentic indicators of achievement (e.g., clarity and quality of information presented should be
a criterion in judging speaking effectiveness, not whether the speaker used note cards while
speaking). Valid rubrics reduce the likelihood of biased judgments of students' work by focusing
raters' attention on the achievement being measured rather than on a student's gender, race, age,
appearance, ethnic heritage, or prior academic record.
- Reliable: In traditional assessments, such as multiple-choice tests, where a student selects
a response from among several options, the reliability of the score has to do primarily with the
stability of the test score from one testing occasion to another in the absence of intervening growth
or instruction. Establishing the reliability of a rubric for a performance assessment, however, is
more complex. A reliable performance assessment rubric enables:
- Several judges rating a student's performance on a specific task to assign the same score or rating to the student's performance.
- Each judge to rate the student's performance on a specific task at about the same level on several occasions in the absence of intervening growth or instruction.
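The first of these two conditions, agreement among judges, can be quantified in a simple way as the proportion of students to whom two judges assign exactly the same score. This sketch is illustrative only; operational scoring programs often use statistics such as Cohen's kappa rather than raw agreement:

```python
# Sketch: inter-rater reliability as the fraction of exact agreements
# between two judges rating the same set of student performances.
# (Illustrative only; the judges' scores below are made up.)

def exact_agreement(ratings_a, ratings_b):
    """Fraction of performances to which both judges gave the same score."""
    assert len(ratings_a) == len(ratings_b)
    matches = sum(a == b for a, b in zip(ratings_a, ratings_b))
    return matches / len(ratings_a)

judge_1 = [4, 3, 2, 4, 1]
judge_2 = [4, 3, 3, 4, 1]
print(exact_agreement(judge_1, judge_2))  # -> 0.8
```

The second condition, rating stability over occasions, can be checked the same way by comparing one judge's scores from two scoring sessions.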
Rubrics can be generic or task-specific. A generic rubric can be
used for multiple tasks, while a task-specific rubric is only appropriate
for a particular task. The scoring guidelines within generic or
task-specific rubrics may be analytical or holistic. Holistic scoring
is typically based on a four- to six-point scale indicating specified
performance levels that reflect an overall impression of student
work. In contrast, analytic scoring provides separate scores, usually
on a four- to six-point scale, for multiple dimensions of each student's
work. Analytic scoring allows for more specific and detailed feedback
than holistic scoring. On the PALS Web site you can find examples
of the following:
- Holistic, Generic rubric: "… of Its Parts"
- Holistic, Task-specific rubric: "Circuits and Switches"
- Analytic, Task-specific rubric: "… Wash" (4- to 6-point scale for each criterion)
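The holistic/analytic distinction described above can also be sketched in code. The dimension names here are hypothetical, not taken from any rubric on the PALS site:

```python
# Sketch of holistic vs. analytic scoring for one piece of student work.
# (Illustrative; dimension names and scores are made up.)

# Holistic: a single overall score on a 4- to 6-point scale.
holistic_score = 4

# Analytic: a separate score per dimension on the same kind of scale,
# which is what enables specific, detailed feedback.
analytic_scores = {"organization": 5, "mechanics": 3, "originality": 4}

def weakest_dimension(scores):
    """Return the lowest-scoring dimension -- the kind of targeted
    feedback a single holistic score cannot provide."""
    return min(scores, key=scores.get)

print(weakest_dimension(analytic_scores))  # -> "mechanics"
```

A holistic score tells the student how the work rated overall; the analytic profile tells the student where to improve.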