How to Build a QA Scorecard for a Call Center
A QA scorecard is the foundation of every call center quality program. Get it right and you have a consistent, trusted framework for measuring and improving agent performance. Get it wrong and your scores are arbitrary numbers that agents resent and managers cannot act on.
Step 1: Define the Call Types You Will Score
A single generic scorecard for all call types is a trap. What constitutes a good customer service call looks different from a good collections call, a good retention call, or a good outbound sales call. If you force them into the same scorecard, you will end up with criteria that are irrelevant to some calls and missing criteria that matter on others.
Start by listing your call types. For most contact centers, the list is 2–4 distinct types. Build a separate scorecard for each. Share criteria where they overlap (greeting, close) but weight them independently and add type-specific criteria where needed.
Step 2: Choose Your Scoring Criteria and Weights
For each call type, define the criteria you will score and how many points each is worth. The points should sum to 100, which makes scores immediately interpretable without conversion math.
| Criterion | Points | What it measures |
|---|---|---|
| Greeting & identification | 10 | Brand-compliant opening, agent identification |
| Active listening | 20 | Paraphrasing, no interruption, issue confirmation |
| Problem resolution | 25 | Accurate resolution, correct process followed |
| Empathy & tone | 20 | Appropriate emotional response throughout |
| Compliance | 15 | Required disclosures delivered, data handled correctly |
| Professional close | 10 | Confirms resolution, thanks customer, clean ending |
These weights are a starting point for a customer service call. Adjust them for your call type: collections and financial services calls should weight compliance at 25–30 points, retention calls should weight empathy higher, and outbound sales calls need different resolution criteria entirely. Whenever you raise one weight, lower others so the total stays at 100.
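If you keep the scorecard in a spreadsheet export or a config file, a small check can catch weights that no longer add up after an adjustment. The sketch below encodes the customer service table as data and validates the total. The names `CUSTOMER_SERVICE_SCORECARD` and `validate_scorecard` are illustrative and do not come from any particular QA tool.

```python
# Illustrative sketch: the customer service scorecard from the table,
# expressed as criterion -> maximum points. All names are hypothetical.
CUSTOMER_SERVICE_SCORECARD = {
    "Greeting & identification": 10,
    "Active listening": 20,
    "Problem resolution": 25,
    "Empathy & tone": 20,
    "Compliance": 15,
    "Professional close": 10,
}


def validate_scorecard(scorecard: dict[str, int]) -> None:
    """Raise ValueError unless every weight is positive and the total is 100."""
    for criterion, points in scorecard.items():
        if points <= 0:
            raise ValueError(f"{criterion!r} must be worth more than 0 points")
    total = sum(scorecard.values())
    if total != 100:
        raise ValueError(f"Criteria sum to {total}, expected 100")


validate_scorecard(CUSTOMER_SERVICE_SCORECARD)  # passes silently
```

Running the same check on each call type's scorecard after every reweighting keeps scores comparable across versions.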
Step 3: Write Behavioral Anchors
Criteria names are not enough. "Empathy & tone: 20 points" tells an evaluator nothing about what earns 20 points versus 10 points versus zero. You need behavioral anchors — specific descriptions of what each performance level looks like.
Full credit (20/20)
Agent explicitly acknowledges customer frustration at least once. Tone remains warm throughout. Apology is proportionate and sincere. No robotic or scripted delivery.
Partial credit (10/20)
Agent maintains professional tone but does not explicitly acknowledge customer's emotional state. No empathy statement delivered. Call is functional but emotionally cold.
No credit (0/20)
Agent sounds impatient, dismissive, or talks over the customer. Multiple instances of flat or robotic tone. Customer frustration escalates over the course of the call.
Write anchors for every criterion. This step takes time, but it is the difference between a scorecard QA managers can apply consistently and one that produces 15-point scoring gaps between reviewers.
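One way to make anchors hard to skip is to store them next to the points they award, so an evaluator picks a described level instead of typing a free-form number. This is a minimal sketch under that assumption. The anchor text is condensed from the Empathy & tone example, and the structure and names (`EMPATHY_ANCHORS`, `points_for`) are hypothetical.

```python
# Illustrative sketch: behavioral anchors for one criterion, keyed by level.
# Each level maps to (points awarded, condensed anchor description).
EMPATHY_ANCHORS = {
    "full": (20, "Explicitly acknowledges frustration at least once; warm, sincere tone."),
    "partial": (10, "Professional tone, but no explicit acknowledgment of the customer's emotional state."),
    "none": (0, "Impatient, dismissive, or talks over the customer; frustration escalates."),
}


def points_for(anchors: dict[str, tuple[int, str]], level: str) -> int:
    """Return the points tied to the anchor level an evaluator selected."""
    if level not in anchors:
        raise ValueError(f"Unknown level {level!r}; expected one of {sorted(anchors)}")
    points, _description = anchors[level]
    return points


print(points_for(EMPATHY_ANCHORS, "partial"))  # 10
```

Restricting scores to anchored levels trades some granularity for consistency; if you need intermediate scores, add intermediate anchors rather than allowing unanchored numbers.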
Step 4: Set Auto-Fail Criteria
Some behaviors should result in a failed call regardless of overall score. A 92-point call where the agent shared a customer's account information with the wrong person is not a 92-point call — it is a compliance event.
Common auto-fail criteria:
- Agent confirms account details with a caller who fails verification
- Required regulatory disclosure (Mini-Miranda, opt-out) not delivered
- Agent makes a promise the company cannot honor
- Agent records incorrect information in the system
- Call is disconnected without resolution attempt or customer consent
Auto-fail items should be defined in your rubric before you start scoring. If you are using AI scoring, configure them as hard rules that override the composite score.
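The override logic can be sketched in a few lines: compute the composite score as usual, then fail the call if any auto-fail item was triggered. This is an illustration only. The `CallResult` fields, the `score_call` function, and the pass mark of 80 are assumptions made for the example, not values from this guide or from any scoring product.

```python
from dataclasses import dataclass, field


@dataclass
class CallResult:
    """Outcome of scoring one call. Field names are hypothetical."""
    composite_score: int  # sum of criterion points, 0-100
    passed: bool
    auto_fail_reasons: list[str] = field(default_factory=list)


def score_call(criterion_points: dict[str, int],
               auto_fail_flags: dict[str, bool],
               pass_mark: int = 80) -> CallResult:
    """Sum criterion points, then let any triggered auto-fail override the result.

    pass_mark=80 is an assumed threshold for illustration only.
    """
    composite = sum(criterion_points.values())
    reasons = [name for name, triggered in auto_fail_flags.items() if triggered]
    passed = composite >= pass_mark and not reasons
    return CallResult(composite, passed, reasons)


# A 92-point call where the agent confirmed details after failed verification.
result = score_call(
    {"Greeting & identification": 10, "Active listening": 18,
     "Problem resolution": 25, "Empathy & tone": 16,
     "Compliance": 15, "Professional close": 8},
    {"Confirmed account details after failed verification": True,
     "Required disclosure not delivered": False},
)
print(result.composite_score, result.passed)  # 92 False
```

Keeping the composite score alongside the failure reasons preserves the coaching detail while making the compliance event impossible to miss.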
Step 5: Calibrate Before You Launch
Before the scorecard goes live, run at least one calibration session. Select 5 calls that represent the range of call quality on your team. Have each evaluator score them independently, then compare results.
If two evaluators disagree by more than 5 points on a criterion, the rubric language for that criterion is too vague. Work through the disagreement, identify the ambiguity, and rewrite the behavioral anchor before launching.
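The 5-point rule is easy to check mechanically once each evaluator's scores for a call are in one place. The sketch below flags every criterion where the highest and lowest scores differ by more than the threshold, which is the same as saying at least two evaluators disagree by that much. The function name, the evaluator names, and the sample scores are invented for illustration.

```python
def flag_vague_criteria(scores_by_evaluator: dict[str, dict[str, int]],
                        max_gap: int = 5) -> dict[str, int]:
    """Return {criterion: gap} where evaluators differ by more than max_gap on one call."""
    # Assumes every evaluator scored the same set of criteria.
    criteria = next(iter(scores_by_evaluator.values())).keys()
    flagged = {}
    for criterion in criteria:
        values = [scores[criterion] for scores in scores_by_evaluator.values()]
        gap = max(values) - min(values)
        if gap > max_gap:
            flagged[criterion] = gap
    return flagged


# Hypothetical calibration scores for a single call.
session = {
    "evaluator_a": {"Empathy & tone": 20, "Compliance": 15},
    "evaluator_b": {"Empathy & tone": 10, "Compliance": 15},
    "evaluator_c": {"Empathy & tone": 15, "Compliance": 12},
}
print(flag_vague_criteria(session))  # {'Empathy & tone': 10}
```

Here the Empathy & tone anchor would go back for rewriting, while the 3-point spread on Compliance stays within tolerance.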
After launch, run calibration monthly. Criteria that consistently produce disagreement need refinement. Criteria where all evaluators consistently agree can have their anchors simplified.
Step 6: Communicate the Scorecard to Agents
Agents should see the scorecard before they are scored by it. Share the criteria, weights, and behavioral anchors in a team meeting. Walk through examples of calls that scored well and calls that scored poorly, and explain the reasoning for each criterion.
Agents who understand the scorecard before their first scored call are far more likely to trust the process. Agents who see a score sheet for the first time after a low score tend to reject it. The order of operations matters.
Automating the Scorecard with AI
Once your scorecard is defined, calibrated, and trusted, the natural next step is automation. AI call scoring applies your rubric to 100% of calls, using the same behavioral criteria you defined, with a consistency that human reviewers cannot maintain across hundreds of calls per week.
The scorecard does not change when you add AI; it is simply applied more thoroughly. Your QA managers shift from listening to calls and filling in rubric spreadsheets to reviewing AI-generated results, handling disputes, and focusing on the coaching conversations that require human judgment.

