May 18th, 2018

Why All Evaluations Are Not Alike



Imagine that one day you come into the office and a fellow administrator asks you to review the following language from the draft of an evaluation that he or she is writing about a faculty member.

While Professor Smith has demonstrated some notable successes in teaching, it is also clear that there are certain areas in need of significant improvement. For example, the course syllabi developed by Professor Smith are exceptionally well organized and adhere closely to institutional guidelines. Moreover, students who complete the introductory sequence of courses with Professor Smith tend to score in the 90th percentile or higher on the nationally normed exam in this discipline. Peer reviews of Professor Smith’s courses are generally positive and note that current developments in the field tend to be well covered. On the other hand, student course evaluations at all levels of Professor Smith’s courses are a full standard deviation below those of the faculty member’s colleagues in both the department and college for comparable courses. Moreover, the attrition rate of students in these courses is unacceptably high. (During the most recent academic year, for example, only 55% of the students who began their coursework with Professor Smith remained enrolled in the section by the end of the course. Rates in similar courses in the department are above 95%, and the institution-wide retention rate for courses is 93%.) Most revealing, on the course evaluation question “Would you recommend this instructor to a friend?” Professor Smith averages only 15% in the “Yes, definitely” and “Probably” categories versus 65% in the “No, definitely not” and “Probably not” categories. As a result, while the quality of information conveyed in Professor Smith’s courses appears to be high or even admirable, it is clear that steps must be taken in order to improve this instructor’s rapport with the level of students who actually enroll at our institution. 
As a result, it is recommended that Professor Smith 1) be paired for the coming academic year with a faculty mentor who has been recognized for excellence in teaching a similar or related discipline, 2) attend two series of workshops sponsored by the Center for Excellence in Teaching on alternatives to the lecture and methods for promoting active, engaged learning, and 3) undergo an additional evaluation in two years as a way of monitoring the extent to which needed improvements have occurred.

Your colleague then asks you: Is this appropriate language to include in an evaluation? Now, your institutional policies may have guidelines and restrictions on the sort of language to be used in an evaluation. But if it is granted that the language above is permissible in an evaluation at your school, the correct response as to whether this language is appropriate ought to be, “I can’t tell whether this language is advisable until I first know the kind of evaluation you are intending to write.” In other words, language such as that appearing above may be perfectly appropriate, even desirable, if the goal is to assist the professor and to offer constructive criticism for improvement. That is the type of evaluation that might occur, for instance, as a status check fairly early in the faculty member’s career and sufficiently before a tenure decision is due. It might even be appropriate language for an evaluation that occurs at the end of some process in which the faculty member’s application was not being passed to the next level at this time (say, for a promotion, merit increase, teaching award, or some other distinction), although a definite opportunity exists for the faculty member to apply again in the future. Nevertheless, this language would not be appropriate at all at the end of a process in which the faculty member’s application was being recommended for consideration at the next level of the institution. In other words, not all evaluations are alike. How we construct them and what we say in them depends to a great extent on the purpose that evaluation is intended to serve.

Evaluations that occur midway through a process are largely formative in nature. That is to say, they both build on existing strengths and offer constructive criticism about areas that need improvement. The purpose of a formative evaluation is thus to provide advice. In contrast, evaluations that occur at the end of some process are largely summative: They offer decisions or judgments that they then proceed to support. The purpose of a summative evaluation is to build a case. As a way of clarifying this distinction, let’s imagine that the paragraph we examined appeared in a dean’s recommendation to the provost of a candidate who had applied for tenure. Seen from this perspective, language that may have been perfectly acceptable in an annual review suddenly seems extremely inappropriate: Is the dean seriously recommending this candidate or not? In a summative evaluation, the author’s initial conclusion that the faculty member “has demonstrated some notable successes in teaching” is entirely undone by the criticisms and suggestions for improvement that appear later on.

By tempering a recommendation and continuing to provide advice in a summative evaluation, the administrator has failed in his or her primary duty of making a clear case. In other words, once the administrator decides that the faculty member either meets or exceeds a criterion required for a summative evaluation, it is his or her duty to state the reasons why that criterion is met and not to undermine that position by sending mixed signals. Naturally, it is never necessary to lie or exaggerate in an evaluation—there’s no reason to call good teaching “stellar” or “exceptional”—but academic leaders ought to highlight what a person does successfully without paying undue attention to any weaknesses that may still exist.

In a similar way, if an administrator concludes that an individual does not meet the criteria for a specific personnel decision, an evaluation that appears to focus too much on positive accomplishments can lead to difficulties. It is only natural to want to mix good news in with the bad, but the issue is the extent to which this is done. If an evaluation concludes that performance is subpar yet appears to contain, on the whole, more evidence of success than of failure to meet established criteria, the message it sends to the reader tends to be puzzling. The faculty member may even use that evaluation as the basis for an appeal, arguing that the level of success it records in significant areas more than counterbalances the relatively minor suggestions for improvement it contains.

Certain types of evaluations are clearly either formative or summative. For instance, an annual review is almost always formative. A tenure decision is invariably summative. For other types of evaluations, however, if it is not absolutely clear, it may be important to ask: What is the primary goal of this process? Am I advising the faculty member or am I rendering a judgment? These questions are particularly important in processes such as post-tenure review, which do not serve the same role at all institutions.

Some schools approach post-tenure review as though it were a formative process: Its primary purpose is to give the faculty member advice for continual improvement. At other schools, either overtly or implicitly, post-tenure review actually results in a summative decision: Has the faculty member ceased to perform at a high enough level that a reconsideration of tenure is in the school’s best interest? Depending on the goal of the process at the institution, the type of evaluation that an administrator writes could thus be extremely different.

The real problem at most schools is that evaluation processes have been established that are never clearly one thing or the other. They try to render judgments on faculty members (for annual continuation, merit increases, promotion, tenure, and the like) while at the same time offering constructive suggestions for improvement. This inherent lack of clarity creates a situation in which administrators find themselves all but compelled to write evaluations that send mixed messages to their recipients. The best solution in such a case is to try to specify in the evaluation itself precisely what the criteria are on which the decision is being made and then, in a separate section, if possible, what constructive recommendations are being offered for the sake of future growth and later personnel decisions. Too often, however, the formats required for faculty evaluation are so prescribed that this type of distinction is impossible.

The general rule, however, should be to distinguish the formative and summative roles of evaluation to the greatest extent that you can. When giving advice, be as constructive and forward looking as possible. When rendering a decision, make the case as strongly as you can. And try not to undermine what you are doing in one of these capacities by combining it with the other.


Jeffrey L. Buller is dean of the Harriet L. Wilkes Honors College of Florida Atlantic University and senior partner in ATLAS: Academic Training, Leadership & Assessment Services. His latest book, the second edition of The Essential Academic Dean or Provost: A Comprehensive Desk Reference, is available from Jossey-Bass.

Reprinted from “Why All Evaluations Are Not Alike” in Academic Leader 24.12 (2008): 4–5. © Magna Publications. All rights reserved.