
Flow Metrics vs. Velocity: Why Story Points Are Misleading Your Team

Your team's velocity is 42 points. Great — but what does that actually mean? Here's why flow metrics give you better answers and how to start using them.

Alvaro Burga

"Our velocity is 42 points this sprint."

Quick — what does that tell you? Can you ship the next feature by April? Is the team getting faster or slower? Should you hire more developers?

The honest answer: velocity tells you almost nothing useful. And yet it's one of the most commonly tracked metrics in software teams.

After 15 years of working with engineering teams, I've seen velocity cause more confusion than clarity. Here's why — and what to measure instead.

The Problem With Velocity and Story Points

Story points mean different things to different people

Ask five developers to estimate the same task in story points and you'll get five different answers. A "5-point story" on Team A is completely different from a "5-point story" on Team B. Even within the same team, estimates drift over time as people unconsciously recalibrate.

This makes velocity an unreliable basis for planning. A velocity of 42 points only means something if the estimates behind it were consistently calibrated, and they rarely are.

Velocity incentivizes the wrong behavior

When teams are measured by velocity, two things happen:

Estimate inflation. That task that was a 3 last month becomes a 5 this month. Velocity goes up. Did the team get faster? No — they just changed the numbers.

Avoiding hard work. A critical infrastructure task estimated at 13 points gets deprioritized in favor of three easy 5-point tasks. The team "delivered more" but the hard, important work didn't get done.

Velocity can't answer the questions that matter

Your CEO asks: "When will the payment integration be done?"

With velocity, here's what you'd need to do: estimate every remaining task in story points, divide by average velocity, add buffer for uncertainty, and pray the estimates are right. The result is a guess wrapped in math that looks precise but isn't.
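To make that arithmetic concrete, here is a minimal Python sketch of a velocity-based forecast. The inputs (120 remaining points, a 20% buffer) are made-up numbers for illustration, and `velocity_forecast_sprints` is a hypothetical helper, not something any tool provides.

```python
import math

def velocity_forecast_sprints(remaining_points: float,
                              average_velocity: float,
                              buffer: float = 0.2) -> int:
    """Sprints needed = remaining points / velocity, padded by a buffer."""
    raw_sprints = remaining_points / average_velocity
    # Round up: a partially used sprint still has to happen.
    return math.ceil(raw_sprints * (1 + buffer))

# Assumed inputs: 120 estimated points left, velocity of 42, 20% buffer.
sprints = velocity_forecast_sprints(120, 42)
print(sprints)
```

Every input here is itself an estimate, so the single clean number that comes out carries all of their uncertainty without showing any of it.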

Velocity doesn't show you problems

Your team's velocity was 40, 42, 38, 41 over the last four sprints. Looks stable, right? But what you can't see is that three developers are working 60-hour weeks, half the items are getting done in the last two days of the sprint, and bugs are piling up because the team is rushing to hit the numbers.

Velocity hides problems behind a single number.

What Flow Metrics Are

Flow metrics measure how work actually moves through your team's process. Instead of estimating how hard something is, you measure how long it takes and how much gets done. There are four key metrics:

1. Cycle Time — How long does work take?

Cycle time measures the elapsed time from when someone starts working on an item to when it's done and deployed. Not estimated time — actual time.

Why it matters: If your average cycle time is 8 days, and someone asks "how long will this feature take?", you have a data-backed answer: "Based on our track record, similar items take about 8 days." No estimation meeting required.

What to watch for: If cycle time is increasing over time, something is getting worse — more blockers, more complexity, more interruptions. You can investigate before it becomes a crisis.
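If you want to see how little is involved, here is a minimal sketch of the calculation. The item IDs and dates are invented sample data, and the sketch assumes you count calendar days; some teams prefer working days, which would need a different helper.

```python
from datetime import date
from statistics import mean

# Hypothetical work items: (id, date work started, date deployed).
items = [
    ("PAY-1", date(2024, 3, 1), date(2024, 3, 8)),
    ("PAY-2", date(2024, 3, 4), date(2024, 3, 13)),
    ("PAY-3", date(2024, 3, 5), date(2024, 3, 13)),
]

def cycle_time_days(started: date, finished: date) -> int:
    """Elapsed calendar days from start of work to deployment."""
    return (finished - started).days

cycle_times = [cycle_time_days(started, finished) for _, started, finished in items]
average_cycle_time = mean(cycle_times)
print(cycle_times, average_cycle_time)
```

Recomputing the average every week or sprint and comparing it to earlier values is enough to spot the upward trend described above.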

2. Throughput — How much gets done?

Throughput is simply the number of items your team completes per week or sprint. Not story points — actual items delivered.

Why it matters: If your team completes an average of 6 items per sprint, you can plan around that. When someone asks "can we fit these 20 items in the next 3 sprints?", you have an honest answer: "Probably not — we typically complete 18. Let's prioritize the most important 18."

What to watch for: A sudden drop in throughput signals a problem, such as a key person being out, the team being blocked, or too much unplanned work arriving.
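Counting throughput and turning it into a capacity figure can be sketched in a few lines. The completion dates below are sample data, grouping by ISO week is one possible choice of period, and both function names are hypothetical.

```python
from collections import Counter
from datetime import date

# Hypothetical completion dates, one per finished item.
completed = [
    date(2024, 3, 4), date(2024, 3, 6), date(2024, 3, 7),
    date(2024, 3, 12), date(2024, 3, 14),
]

def weekly_throughput(done_dates):
    """Count completed items per (ISO year, ISO week)."""
    counts = Counter()
    for d in done_dates:
        year, week, _weekday = d.isocalendar()
        counts[(year, week)] += 1
    return dict(counts)

def expected_capacity(average_items_per_sprint: float, sprints: int) -> float:
    """Items the team can expect to finish if past throughput holds."""
    return average_items_per_sprint * sprints

per_week = weekly_throughput(completed)
capacity = expected_capacity(6, 3)  # 6 items per sprint over 3 sprints
print(per_week, capacity)
```

With the numbers from the example above, 6 items per sprint over 3 sprints gives a capacity of 18, which is why 20 items probably won't fit.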

3. Work in Progress (WIP) — How much is happening at once?

WIP counts how many items are actively being worked on right now. Not planned — actually in progress.

Why it matters: High WIP is the silent killer of delivery. When a team of 5 developers has 15 items in progress, nobody is finishing anything. Reducing WIP is often the single fastest way to improve delivery speed.

The rule of thumb: WIP should be roughly equal to the number of developers, or only slightly higher. A team of 5 should have 5-7 items in progress, not 15.
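As a rough sketch, the rule of thumb can be checked against a board snapshot like this. The status names, the item IDs, and the allowance of 2 items above team size (my reading of "5-7 items for a team of 5") are all assumptions for illustration.

```python
def wip_count(board: dict[str, str]) -> int:
    """Count items whose status is exactly 'in progress'."""
    return sum(1 for status in board.values() if status == "in progress")

def wip_too_high(wip: int, developers: int, slack: int = 2) -> bool:
    """True when WIP exceeds team size plus a small allowance (assumed: 2)."""
    return wip > developers + slack

# Hypothetical board snapshot: item id -> current status.
board = {f"ITEM-{n}": "in progress" for n in range(1, 16)}
board["ITEM-16"] = "planned"  # planned work does not count as WIP
board["ITEM-17"] = "done"     # finished work does not count either

wip = wip_count(board)
print(wip, wip_too_high(wip, developers=5))
```

The check deliberately ignores planned and finished items, since WIP only counts work that is actually in progress.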

4. Work Item Age — How long has this been sitting here?

Work item age measures how long an in-progress item has been open. It's like cycle time, but for items that aren't done yet.

Why it matters: This is your early warning system. If your average cycle time is 8 days and an item has been in progress for 12 days, something is wrong. Maybe it's blocked. Maybe it's bigger than expected. Either way, you know to investigate now — not at the end of the sprint when it's too late.
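A minimal version of that early warning could look like the sketch below. It assumes you flag any in-progress item older than your average cycle time; the item IDs, dates, and function names are hypothetical, and you may prefer a different threshold.

```python
from datetime import date

def work_item_age_days(started: date, today: date) -> int:
    """Calendar days an unfinished item has been in progress."""
    return (today - started).days

def items_needing_attention(in_progress, today, average_cycle_time_days):
    """Return ids of items that are older than the average cycle time."""
    return sorted(
        item for item, started in in_progress.items()
        if work_item_age_days(started, today) > average_cycle_time_days
    )

# Hypothetical in-progress items: id -> date work started.
in_progress = {
    "PAY-7": date(2024, 3, 1),  # 12 days old on 13 March
    "PAY-9": date(2024, 3, 8),  # 5 days old on 13 March
}

flagged = items_needing_attention(in_progress, date(2024, 3, 13), 8)
print(flagged)
```

Running a check like this every day is what turns work item age from a report into an alert.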

How to Make the Switch

You don't need to abandon story points overnight. Here's a practical transition:

Week 1: Start measuring without changing anything

Most project management tools (Jira, Linear, Shortcut) can track these metrics automatically. You just need to make sure your team updates item status accurately — "in progress" when they start, "done" when it's deployed.

Start tracking:

  • How many items were completed this week (throughput)
  • How many items are currently in progress (WIP)
  • When each item was started and finished (cycle time)
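If your tool lets you export status changes, all three numbers can be derived from that one log. The sketch below assumes a hypothetical export format of (item, new status, date) tuples and the status names "in progress" and "done"; real exports from Jira, Linear, or Shortcut look different and would need mapping into this shape first.

```python
from datetime import date

# Hypothetical status-change log: (item id, new status, date of change).
events = [
    ("A", "in progress", date(2024, 3, 4)),
    ("B", "in progress", date(2024, 3, 5)),
    ("A", "done", date(2024, 3, 8)),
    ("C", "in progress", date(2024, 3, 7)),
]

def summarize(events):
    """Derive throughput, WIP, and cycle times from status changes."""
    started, finished = {}, {}
    for item, status, day in events:
        if status == "in progress":
            started.setdefault(item, day)  # keep the first start date
        elif status == "done":
            finished[item] = day
    done = [item for item in finished if item in started]
    return {
        "throughput": len(done),
        "wip": len([item for item in started if item not in finished]),
        "cycle_times": {item: (finished[item] - started[item]).days for item in done},
    }

summary = summarize(events)
print(summary)
```

This is also why accurate status updates matter: every metric here is only as good as the dates in the log.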

Week 2: Use the data in planning

Instead of playing planning poker and debating story points, look at your data:

  • "We completed 6 items last sprint. Let's commit to 6 this sprint."
  • "Our average cycle time is 8 days, so items started today should be done about 8 days from now."
  • "We have 14 items in progress right now. Let's finish some before starting new ones."

Week 3: Stop estimating (optional but recommended)

This is the controversial part. Many teams find that once they have throughput and cycle time data, estimation adds no value. If you know your team completes 6 items per sprint, you don't need to know if an item is "a 3 or a 5." You just need to know it's roughly similar in size to your typical work items.

If an item is clearly huge, break it down. If it's clearly tiny, it doesn't need a ticket. For everything in between, your throughput data tells you what you need to know.

Week 4: Review and adjust

Compare your flow metrics to what velocity was telling you:

  • Was your "velocity of 42" actually translating to predictable delivery?
  • Now that you see cycle time, are there bottlenecks you didn't know about?
  • Is your WIP too high?

Most teams that make this switch are surprised by what they find. The data reveals problems that velocity was hiding.

What "Good" Looks Like

After working with teams on this transition, here are the benchmarks I typically see:

Cycle time: Struggling = 15+ days. Healthy = 5-10 days. Excellent = 2-5 days.

Throughput consistency: Struggling = varies 50%+. Healthy = varies 20-30%. Excellent = varies less than 20%.

WIP per developer: Struggling = 3+ items. Healthy = 1-2 items. Excellent = 1 item.

Work item age alerts: Struggling = never checked. Healthy = checked weekly. Excellent = checked daily.
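"Varies 20-30%" can be measured in more than one way. The sketch below uses one possible definition, the largest deviation from the average expressed as a percentage of the average. That definition is an assumption made for illustration (standard deviation relative to the mean would be another option), and the sprint counts are sample data.

```python
from statistics import mean

def throughput_variation_pct(items_per_sprint):
    """Largest deviation from the average, as a percentage of the average."""
    average = mean(items_per_sprint)
    largest_deviation = max(abs(count - average) for count in items_per_sprint)
    return largest_deviation / average * 100

# Hypothetical throughput over four sprints for two teams.
steady_team = [6, 7, 5, 6]
erratic_team = [6, 3, 9, 6]

print(round(throughput_variation_pct(steady_team), 1))
print(round(throughput_variation_pct(erratic_team), 1))
```

Whichever definition you pick, use the same one every sprint so the trend stays comparable.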

The goal isn't to optimize these numbers to perfection. The goal is to make delivery visible and predictable. When you can see how work flows through your team, you can spot problems early, plan realistically, and give honest answers about when things will be done.

The Bottom Line

Story points and velocity feel scientific, but they're just organized guessing. Flow metrics measure what actually happens. The switch takes a few weeks, costs nothing, and gives you better answers to the only question that matters: "Can we deliver what we promised, when we promised it?"

If you're drowning in estimation meetings and still can't predict when things will ship, let's talk about how to set up flow metrics for your team.

Ready to fix your delivery?

Let's talk about your challenges in a free 30-minute call.

Book a Discovery Call