This article was updated with new data and technical information on December 7, 2020.
Rating scales on employee performance reviews often get a bad rap. They are viewed as being impersonal, inaccurate, and lacking in nuance when it comes to evaluating employee performance.
While these are valid concerns, the problems with rating scales tend to stem from bad design and bad data, not the rating system itself. Industry analyst Josh Bersin echoes this sentiment in one of his recent articles:
Let me simply say that after a decade of discussion on the topic, the concepts of ratings themselves are not the issue. Organizations need to make decisions about people…and these decisions themselves are essentially evaluative by nature. The key today is to use lots of data and feedback to make these decisions; do them in a transparent and fair way; clearly communicate what is valued in the company; and give people visibility into others’ goals and projects.
– Josh Bersin
In this blog post, we explore why rating scales may be a valuable component of performance reviews and share recommendations on how to create one that works for your organization.
Why rate employee performance?
You may be wondering why we have to rate performance in the first place. The simple answer is that organizations need to understand how employees are performing so that both individuals and the organization can continue to compete and grow. As an organization, you need to be able to make data-informed decisions when it comes to promotions, compensation increases, and development opportunities. Without a universal way to gather structured data, you often open yourself up to inaccuracies and biases when making decisions. Further, employees want to understand what’s expected of them to get a raise, get promoted, or develop their skills, and that requires some form of measurement around established performance criteria.
Properly designed performance ratings help differentiate high performance from low performance, identify areas for improvement, and offer transparency in decision making. Research shows that top-performing employees perform at 400% of the level of the rest of the organization – yet without rating performance, you wouldn’t be able to systematically identify and act on these performance differences.
What makes using rating scales difficult?
One reason many people oppose rating scales is that traditional performance reviews are frequently a source of bad rating data, in the sense that they don’t measure the behaviors and business impact they should be measuring. This means you could unintentionally be promoting the wrong people and overlooking high-performing employees. Furthermore, many traditional approaches to performance ratings rely on numerically based ratings or forced rankings, which can be incredibly demotivating!
Unaddressed rater biases are also a common problem. They can inflate or deflate employee ratings, which can have serious implications for performance reviews. Two common performance evaluation biases that can be affected by rating scale design are leniency bias, which is the tendency to give high ratings to almost everyone, and centrality bias, which is the tendency to rate people somewhere in the middle of the scale. When managers are unaware of these biases and haven’t been properly trained, the result can be skewed ratings that don’t offer valuable or accurate data.
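To make these two biases more concrete, here is a minimal Python sketch of how you might screen one manager's ratings for signs of leniency or centrality. The function names, the sample ratings, and the 0.7 threshold are all illustrative assumptions rather than an established standard, and a flag is only a prompt for a calibration conversation, not proof of bias.

```python
from collections import Counter


def share_of(ratings, points):
    """Fraction of ratings that fall on any of the given scale points."""
    counts = Counter(ratings)
    return sum(counts[p] for p in points) / len(ratings)


def bias_flags(ratings, high_points, middle_points, threshold=0.7):
    """Flag a set of ratings for review when one region of the scale dominates.

    `threshold` is an illustrative cutoff chosen for this sketch.
    """
    return {
        "possible_leniency": share_of(ratings, high_points) >= threshold,
        "possible_centrality": share_of(ratings, middle_points) >= threshold,
    }


# Hypothetical ratings from two managers on a five-point scale (1 = lowest).
manager_a = [5, 5, 4, 5, 4, 5, 5, 4, 5, 5]  # almost everyone rated high
manager_b = [3, 3, 3, 2, 3, 3, 4, 3, 3, 3]  # almost everyone rated in the middle

flags_a = bias_flags(manager_a, high_points={4, 5}, middle_points={3})
flags_b = bias_flags(manager_b, high_points={4, 5}, middle_points={3})
print(flags_a)
print(flags_b)
```

Passing the "high" and "middle" points in explicitly keeps the sketch independent of any particular scale length, so it can be reused if you change the number of response options.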
How to create a rating scale that works for your organization
The good news is that you have the power to create a customized rating scale for your organization that both produces useful data and helps reduce common biases.
We have three recommendations to get you started on the right path with performance review rating scales:
Tip #1: Understand spread and validity
The most important concepts to understand when creating an employee performance rating scale are spread and validity. These are the two areas where most traditional performance ratings and reviews tend to be weak:
Spread: This is also known as variance, differentiation, or range. If you use a measuring stick to assess the performance of different people, does that measuring stick actually pick up on the nuanced differences in performance? Many old-school performance tools don’t effectively differentiate and create any meaningful spread. Further, if you notice leniency bias at play in your organization (where it seems like every employee is high performing), it may be due to managers not knowing how to meaningfully distinguish between top-performing employees. Designing a scale with multiple, well-defined response options for “above average” performance in addition to training raters and running calibration sessions are a few ways you can address this bias.
Validity: Does the question or tool measure what we say it measures and what the organization really cares about? Do the ratings actually matter and help drive better decisions based on what’s important?
For example, if you have a measure of caloric intake, does it predict, affect, or drive anything else in the real world like your weight, health, or longevity? It’s important to ask yourself whether a question and its response options actually identify and differentiate outcomes in a meaningful and relevant way.
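If you want to check spread and validity against your own data, the sketch below shows one simple way to do it in Python. It treats spread as the standard deviation of ratings and uses a Pearson correlation between ratings and an outcome measure as a rough validity signal. The sample data and the `goal_attainment` outcome are hypothetical, and a correlation is only one of several ways to assess validity.

```python
import statistics


def spread(ratings):
    """Population standard deviation of ratings; 0 means no differentiation."""
    return statistics.pstdev(ratings)


def pearson(xs, ys):
    """Pearson correlation between two equal-length numeric lists.

    Assumes both lists vary; a constant list would cause a division by zero.
    """
    mean_x, mean_y = statistics.fmean(xs), statistics.fmean(ys)
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical four-point ratings for eight employees.
ratings = [1, 2, 2, 3, 3, 4, 4, 4]
# A scale where every employee receives the same rating has no spread.
flat_ratings = [3, 3, 3, 3, 3, 3, 3, 3]
# Hypothetical outcome: share of goals achieved by the same eight employees.
goal_attainment = [0.55, 0.70, 0.65, 0.85, 0.80, 1.05, 1.00, 1.10]

print("Spread of ratings:", round(spread(ratings), 2))
print("Spread of flat ratings:", spread(flat_ratings))
print("Ratings vs. goal attainment:", round(pearson(ratings, goal_attainment), 2))
```

A spread near zero suggests the scale isn't differentiating performance, while a weak correlation with outcomes you care about suggests the ratings may not be measuring what matters.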
Tip #2: Customize your wording
It’s critical to make sure you articulate a rating scale that aligns with your unique brand and business goals. This means being flexible with how you customize the wording and design of your questions and response options, based on the criteria you’re trying to measure and the behaviors of your employees. The more specific you are in defining each of the response option descriptions (i.e., “anchors”), the better and more consistent your raters will be in using the scales. Creating clear, differentiated descriptions becomes even more important as the number of response options increases (e.g., five options versus three).
If centrality bias has been an issue for managers in past performance reviews, you may want to eliminate any “neutral” options on your scales in order to force a decision.
We encourage customization, and the Culture Amp platform allows you to reformat your questions and rating scales to address your organization’s goals and minimize the biases you may have seen in previous performance review cycles.
Tip #3: Be transparent
Finally, be prepared to train your employees on the scales you’re going to use and how they should interpret the response options. The foundation of good measurement is making sure everyone understands how the organization defines success.
Transparency helps build trust and perceptions of fairness around how employees are being measured. One of the biggest mistakes you can make is to tell employees you’re eliminating the rating system only to use it behind closed doors with the executive and management teams.
Performance review rating scale examples
The four-point rating scale
Many organizations have used the standard three-point rating scale. However, in our research looking at the distribution of performance responses, we have found that a four-point rating scale is often the best option. While a three-point scale may work for measuring certain criteria, it often lacks the nuance needed to make appropriate assessments.
Culture Amp’s VP of Product, Srinivas Krishnamurti, recommends developing a scale that has more gradations for top performance and fewer gradations for low-performing employees. “Let’s say you have three ratings: ‘not meeting,’ ‘meeting,’ and ‘exceeding.’ It’s hard to justify giving bonuses to everyone who is in the ‘exceeding’ category. So you want to make the ratings more fine-grained and maybe introduce another rating where you’re going beyond ‘exceeding’ and should, therefore, be paid more. On the other hand, if you’re underperforming, you don’t want multiple gradations of that. If you’re not cutting it, that should be clear,” says Srinivas.
As mentioned earlier, the validity of scales can be improved by adding more detail and specificity to the response options. Here’s an example of adding context to create a better-defined four-point scale:
- Needs Development: Does not consistently meet expectations that are appropriate for the position. Additional direction and support are needed. Willing or able to improve but lacks the results required for the role.
- Consistently Meets Expectations: Consistently meets expectations and sometimes exceeds expectations. Achieves a majority of core goals for the role.
- Often Exceeds Expectations: Regularly exceeds expectations. Requires little to no additional direction to achieve core goals of the role.
- Sets a New Standard: Consistently exceeds expectations and delivers to the goals of the position, or consistently delivers beyond the goals of the role. Influences others to perform better.
A great way to combat centrality bias (when your managers provide an “average” rating across the board) is to force managers to make a clear choice when it comes to rating. By creating a four-point scale, managers are no longer given the option to give average ratings across the board, but rather have to determine what meaningful differences exist between employees.
Reducing leniency bias and improving “above average” spread
If your organization faces challenges distinguishing top-performing employees, consider using a scale that helps identify greater variation among higher-performing individuals. We recommend a five-point scale with the second point as “Meeting Expectations” (or wherever the majority of employees will likely sit) to drive greater distribution among those performing above average and further differentiate higher performance. This also helps combat centrality bias, as “average” is no longer the midpoint of the rating scale, further encouraging managers to differentiate between levels of performance.
Example question: How well does this person deliver on their objectives?
Conversely, most managers aren’t comfortable using a rating more than one point below “Meeting Expectations”, so it makes sense to simplify that “lower” end of the scale. If your organization embraces more of a growth mindset, consider choosing language like “getting there” rather than “below average.”
Of course, we recommend using the above response options as a starting point and customizing them with clarifying descriptors that align with your organization’s objectives and culture.
Evaluating softer skills
For areas of evaluation focused on softer topics like interpersonal skills, you can use an observation scale that indicates how frequently the desired behavior is being demonstrated by the employee. With this type of scale, it’s critical that the rater works closely enough with the employee to have a well-informed understanding of their behavior – though that’s important for all types of performance ratings. It’s also imperative to ensure that you are asking about observed behaviors, rather than the person’s intentions. For example: Instead of ‘This person cares about diversity’ go with ‘This person treats colleagues with respect, regardless of their background.’
Rating scales have the potential to be a powerful piece of a holistic performance management system that your people trust and embrace. To learn more about how to rebrand and rebuild your performance review process to align with organizational goals, download our Performance eBook today.
Thank you to our People and Data Scientists for providing the foundational knowledge and data for this article.