
From the “Staff and Hammer” Blog
By Dan Hult
I think it is safe to assume that those who have worked in government for a substantial length of time have witnessed at least one significant change in the method or process by which their performance is evaluated. And even those who have much shorter tenures are likely to harbor frustrations with whatever system they are evaluated under. Every new evaluation system promises to accurately and objectively measure people’s performance, and then fails to live up to that promise. At some point, that failure pushes the organization to modify that system or even try a completely new system, which merely restarts the cycle. My organization is currently in the middle of the largest such change in my career, though I have seen several smaller changes. Why can we never seem to get performance appraisal right? Why does every adjustment we make to our performance evaluation method, from minor tweak to major overhaul, ultimately fail to deliver on its promises? And most importantly, what can be done about it?
The Fundamental Flaw
The reason we can never find a performance appraisal system that works is that the entire concept of performance appraisal is fundamentally flawed. This is why W. Edwards Deming advocated its complete abolishment in the twelfth of his fourteen points of management, referring to it as a major barrier to salaried workers’ right to take pride in their work.[1] He also named it the third of his seven deadly diseases of management.[2] In discussing the latter, Deming explains his reasoning:
“Basically, what is wrong is that performance appraisal or merit rating focuses on the end product, at the end of the stream, not on leadership to help people. This is a way to avoid the problem of people. A manager becomes, in effect, manager of defects….The idea of merit rating is alluring…pay for what you get; get what you pay for; motivate people to do their best, for their own good. The effect is the exact opposite of what the words promise. Everyone propels himself forward, or tries to, for his own good on his own life preserver. The organization is the loser.” -W. Edwards Deming, Out of the Crisis, 1986, 102
He then goes on to explain how performance appraisal causes different teams within the organization to work against each other rather than working together for mutual good.[3] Peter Scholtes espouses a similar view in The Leader’s Handbook, devoting an entire chapter to the abolition of performance appraisal.[4] He observed that performance appraisal is based on the assumptions that:
- evaluation will improve performance,
- the employee has control over results,
- the employee’s individual contributions can be discerned,
- all seemingly identical processes are actually identical,
- evaluation standards are related to what is important to business and customers,
- standards are reasonable and achievable,
- each system is stable and can deliver expected results,
- the evaluation covers the entire rating period, not just what the evaluator can remember, and
- all evaluators are consistent with each other in how they rate all of their employees.
These assumptions are rarely, if ever, true.[5] The main reason given by both Deming and Scholtes is that it is impossible to determine anyone’s contribution independent of all other factors. And even if it were possible, Deming stated that roughly 94% of problems (and we can argue by extension 94% of successes) are due to the system rather than the individual.[6] His contemporary, Joseph Juran, put the number at 85% (a concept so central to his teachings that it became known as “Juran’s Rule”), though Myron Tribus noted that even some of the remaining 15% was attributable to the system as well.[7] When we combine that with the natural variation inherent to any process and system, it becomes impossible to truly evaluate people by their output.
But even if it were possible to objectively and consistently measure people’s performance enough to evaluate them against each other, this perfect system would still create the problems Deming mentioned. Scholtes notes that such a system would identify half the people as below average, which would cause a small number of people to strive for improvement but many more to become discouraged or cynical. “We all want to work together in a place without discouragement, all on the same side, proud of our work, joyful in our work. Performance appraisal assures that we will never approach such an ideal workplace”.[8] The bottom line according to Deming and Scholtes is that performance appraisal is at best unnecessary and at worst detrimental to an organization.
Pick Your Poison
Some may argue that Deming and Scholtes are only referring to pure merit rating, where people are ranked based solely on quantifiable output. If that is the case, then most organizations that use performance appraisal systems would agree that pure merit rating is ineffective. This is because it has been tried and found wanting. In the 1980s and 1990s, Jack Welch’s leadership at GE was considered the gold standard. One aspect of this was the “rank and yank” system, in which all leaders would be rated based on the profit they generated. Based on this, the top 10% would be promoted and the bottom 10% would be fired. Since GE saw unprecedented profits during this time, many tried to imitate this system. After the new millennium dawned, leaders discovered the fatal flaws in this approach. In addition to causing inevitable long-term damage to organizational culture, rank and yank did not contribute to GE’s profits in any appreciable way. Those profits instead stemmed from the general economic growth of the time and the success of GE Capital, which operated under a different culture than the rest of GE.[9] To anyone familiar with Deming’s famous Red Bead Experiment, the flaws of rank and yank are obvious, since that is essentially what the fictitious White Bead Company does.[10] Like anyone who has observed the Red Bead Experiment, many organizations have seen the ineffectiveness of merit rating and have evolved their performance appraisal systems away from it.
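For readers who have never seen the Red Bead Experiment, its logic can be roughly sketched in a few lines of Python. This is only an illustration under stated assumptions, not Deming's own procedure: the six workers, the 50-bead paddle, the 20% share of red beads, and the function names `red_bead_day` and `simulate` are all chosen here for the example. Every simulated worker follows the identical process, so any difference in defect counts comes from the system and from chance alone.

```python
import random


def red_bead_day(n_workers=6, paddle_size=50, red_fraction=0.2, rng=None):
    """Return the red-bead (defect) count for each worker on one day.

    Assumption: each bead on the paddle is red with probability
    `red_fraction`, independently of the others. No worker has any
    skill that affects the outcome.
    """
    if rng is None:
        rng = random.Random()
    return [
        sum(rng.random() < red_fraction for _ in range(paddle_size))
        for _ in range(n_workers)
    ]


def simulate(days=4, seed=0, **kwargs):
    """Run several days with one seeded generator so runs are repeatable."""
    rng = random.Random(seed)
    return [red_bead_day(rng=rng, **kwargs) for _ in range(days)]


if __name__ == "__main__":
    for day, counts in enumerate(simulate(), start=1):
        best = counts.index(min(counts))    # worker a merit system would reward
        worst = counts.index(max(counts))   # worker a merit system would punish
        print(f"Day {day}: defects={counts} best=worker {best} worst=worker {worst}")
```

Running this with different seeds will generally show the "best" and "worst" workers changing from day to day even though nothing about the workers differs, which is the point of the experiment: ranking them rewards and punishes noise.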
These alternative systems try to subjectively measure people’s performance using a more holistic approach in which raters evaluate people based on various traits. While this escapes many of the pitfalls of merit rating, it is subjective, depending largely on the writing style of the rater. This flaw is also well understood, so some organizations attempt to remedy it by using multiple raters. They compile feedback from a person’s superiors, subordinates, and peers to create a “360-degree” perspective. But this is still ultimately subjective. These systems also tend to produce inflated ratings since evaluators are very hesitant to put negative comments on an evaluation because of their possible impact on the person’s career. As a result, evaluations often make everyone look like Superman. If everyone is Superman, no one is, so this habit of inflated comments diminishes the utility of any subjective performance appraisal method in which they are found. Even if policy forbids or dissuades such comments, supervisors will continue to use them until it can be proven that people with less-than-superhuman performance reports are consistently promoted. Until that point, supervisors will not be comfortable risking the negative career implications for their employees of saying anything less than stellar on evaluations. This creates a catch-22 that is virtually impossible to remedy without getting rid of the subjectivity entirely, which is itself impossible, as has already been discussed. Therefore, organizations must choose between ineffective merit rating that will be detrimental to their culture in the long term and subjective rating systems that are nearly as ineffective and of little real value.
The Stakhanov Problem
Both approaches to performance appraisal (especially the latter) also suffer from another core problem. Since the goal of anyone under a performance appraisal system is to stand out from everyone else in a good way, they not only pursue that end to the detriment of the organization as Deming and Scholtes mentioned, but also to the detriment of themselves. This is the legacy of 1930s Soviet miner Alexei Stakhanov, whose seemingly superhuman performance served as the ultimate ideal for Soviet workers to emulate. Stakhanov and those like him seem to prove that human potential is endless. This means that with hard work and dedication, anyone can become like Stakhanov. When organizations promote that message, it feeds on our desire for self-actualization, inspiring us to put forth ever-increasing effort to stand out from everyone else to become an elite performer.[11] The competition this produces destroys teamwork and trust, obliterates any possibility of a healthy work-life balance by forcing people to constantly over-achieve, and ties people’s self-worth to their performance and perception at work. Performance appraisal methods that rate the whole person (especially 360-degree methods) then exacerbate these problems by making every aspect of a person’s life and personality not only a potential factor in drawing scrutiny from everyone else but also a factor in promotion or termination. This puts immense pressure on people that is detrimental in the long term, both for individuals and ultimately for the organization.
The truth is that human potential is not limitless.[11] People do have limits on performance, creativity, and potential. This is why the “keys to lasting success” on my leadership page include recognizing both short-term and long-term limits of people and systems, being careful to avoid exceeding them. Humans can only run so fast—and that speed differs drastically for a sprint vs. a marathon. When combined with the pervasiveness of the Stakhanovite philosophy and its self-focused quest to tap seemingly limitless human potential, performance appraisal systems—and the organizational policies of which they are part—are essentially driving people to sprint a marathon, which is clearly not sustainable.
The Solution: Leadership and Humility
So if the problems with performance evaluation methods are inescapable, what is the remedy? After all, any organization still needs ways to monitor and improve performance, determine who to promote or terminate, provide feedback, decide pay and bonuses, and meet the various other needs that performance appraisal is supposed to meet. Scholtes addresses each of these individually in The Leader’s Handbook, but it will suffice for now to say that each of these can be accomplished without performance appraisal. Instead of merit-based pay, he suggests seniority-based pay. When people object that seniority-based pay encourages “dead wood”, he responds with “Why do you hire dead wood? Or, why do you hire live wood and kill it?”.[12] His point is that the pitfalls of seniority pay can be avoided by both hiring the right people and leading them well. For promotion, he recommends a much more involved method of steadily grooming people for leadership roles and then evaluating them for specific roles as they come open.[13] Essentially, this means leaders need to be more involved in actually leading people rather than relying on systems such as performance appraisal. The type of leadership required for this is what I discuss in some detail in my leadership paper, so I will not go into detail here. In short, leadership that can replace performance appraisal must be servant leadership that is heavily involved in developing and guiding people.
Deming similarly claimed that performance appraisal can be replaced by leadership. Once leaders are actually trained in what their job entails, they can be much more selective about who they hire in the first place and then intentionally develop those people over time. These leaders need to be much more involved on a day-to-day basis with their people, not to micromanage or judge them but to be a counselor and colleague who takes care of them for their benefit and the good of the organization. This would include long interviews with them to learn about them and their career aspirations so that the leader can lead them better, using metrics to fix systems rather than people.[14] This gets at one of the biggest problems of performance appraisal observed by both Deming and Scholtes: the attempt to use performance appraisal as a substitute for leadership. There is no substitute for leadership: no system, policy, or technology can ever replace leadership.
The other essential element in replacing performance appraisal is to shift the emphasis away from self. The competition between people and teams to the detriment of the organization stems from the desire of people to exalt themselves. In a previous post, I pointed out how self-centeredness in Haman ultimately led to his downfall, whereas its lack led to Mordecai’s success. In this, the humility of Mordecai serves as an example that all people, leader and follower alike, should strive to emulate. If we are all more focused on helping others and the organization than ourselves, the entire team will perform better. With no more need to backstab and climb over others to get ahead, people can focus their energy on their work and improving the organization, replacing hostility with trust. If we’re honest, that’s the workplace we all desire. This may seem like a pipe dream, but the closer we can come to it, the better it will be for our organizations and ourselves.
In the end, whether based on metrics of output that is impossible to attribute to any individual or on subjective criteria attempting to assess the whole person, performance evaluation methods have all failed to deliver what they promised. And since even the perfect appraisal system could never accomplish its objective, we can safely say that no appraisal system ever will. Instead, people need to be led, and leadership can never truly be done through a policy or system responding to various metrics. Only when we understand this and shift away from relying on such systems in favor of actual leadership will we be able to see the benefits that performance appraisal systems promise but cannot deliver, and both our organizations and the people working in them will be better off for it.
[1] W. Edwards Deming, Out of the Crisis, Cambridge, MA: MIT Press: 1986: 77-85.
[2] W. Edwards Deming, Out of the Crisis, Cambridge, MA: MIT Press: 1986: 101-120.
[3] W. Edwards Deming, Out of the Crisis, Cambridge, MA: MIT Press: 1986: 107.
[4] Peter R. Scholtes, The Leader’s Handbook: Making Things Happen, Getting Things Done, New York: McGraw-Hill: 1998: Chapter 9, “Performance Without Appraisal”.
[5] Peter R. Scholtes, The Leader’s Handbook: Making Things Happen, Getting Things Done, New York: McGraw-Hill: 1998: 295.
[6] W. Edwards Deming, Out of the Crisis, Cambridge, MA: MIT Press: 1986: 315.
[7] Myron Tribus, “The Germ Theory of Management”, Swiss Deming Institute, 31 March 2002: 7.
[8] Peter R. Scholtes, The Leader’s Handbook: Making Things Happen, Getting Things Done, New York: McGraw-Hill: 1998: 317.
[9] Jim Collins, Good to Great: Why Some Companies Make the Leap…And Others Don’t, New York, NY: HarperCollins: 2001: 33; Simon Sinek, Leaders Eat Last: Why Some Teams Pull Together and Others Don’t, New York, NY: Portfolio: 2014: 210-211; Simon Sinek, The Infinite Game, New York, NY: Portfolio: 2019: 111.
[10] W. Edwards Deming, Out of the Crisis, Cambridge, MA: MIT Press: 1986: 110-112, 346-350.
[11] Bogdan Costea and Peter Watt, “How a Soviet Miner From the 1930s Helped Create Today’s Intense Corporate Workplace Culture”, The Conversation, 29 June 2021, link.
[12] Peter R. Scholtes, The Leader’s Handbook: Making Things Happen, Getting Things Done, New York: McGraw-Hill: 1998: 331.
[13] Peter R. Scholtes, The Leader’s Handbook: Making Things Happen, Getting Things Done, New York: McGraw-Hill: 1998: 346.
[14] W. Edwards Deming, Out of the Crisis, Cambridge, MA: MIT Press: 1986: 116-118.