
How big should a pull request be?

Othman Shareef · June 14, 2026 · 5 min read · The Craft of Code Review

Every team eventually argues about this number. Here’s the honest version: the research points to a range, not a magic value, and the range is smaller than most teams’ habits.

Where the ~400-line figure comes from

The most-cited data point in this debate comes from a large study of peer review at Cisco, popularized by SmartBear: reviewers’ ability to find defects degrades as the change grows, with ~400 lines as the practical ceiling and ~200 as the comfortable middle. It’s old data, gathered on human-written code, but nothing about human working memory has improved since, and it matches what reviewers feel: past a few hundred lines, you stop reviewing and start skimming.

Google’s public review guidance arrives at the same place from experience rather than measurement: small changelists get reviewed faster and more thoroughly, are less likely to introduce bugs, and are easier to roll back. Their reviewer docs treat “can this be split?” as a standard first question.

Why size matters more than ever

AI assistants lowered the cost of producing lines dramatically. In a controlled GitHub experiment, developers with Copilot finished a task 55% faster than the control group. Nothing comparable happened to the cost of understanding lines. When authoring accelerates and review doesn’t, PR size is the valve where the pressure shows up. Holding the size line is how a team keeps review quality while adopting AI tooling.

When it’s fine to break the rule

Mechanical bulk: renames, formatting, codemods, lockfiles, generated artifacts. Reviewers verify legitimacy, not logic, so flag these in the description.
Migrations and schema changes that lose meaning when fragmented.
Coherence beats arithmetic: a 600-line change with one clear idea reviews better than five 120-line fragments nobody can evaluate alone.

The test isn’t the line count; it’s whether a reviewer can hold the idea of the change in their head. Size is a proxy: a good one, but a proxy.

How to split a PR that grew

Refactor first, behavior second. The classic two-PR split: a no-behavior-change refactor that makes the feature diff small, then the feature.
Slice vertically. One thin end-to-end path first (schema → API → UI for one case), then the variations.
Land the leaves. Utilities, types, and tests that stand alone can merge ahead of the trunk change that uses them.

And when the big PR ships anyway

It will; agents are prolific and deadlines are real. For the review side of that problem, we wrote a separate field guide: how to review large pull requests. And if the bottleneck is the review surface itself, that’s the problem Pyor exists to fix, with a triage-first file rail, folder-level viewed tracking, and a diff that keeps your place at any size. (Our product; free for individuals.)

Frequently asked questions

Is there an official maximum PR size?

No standards body defines one. The most-cited guidance comes from SmartBear’s study of peer review at Cisco (keep reviews under ~400 lines) and Google’s engineering practices (small changes, reviewed quickly). Treat 400 changed lines as a soft ceiling, not a law.
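
A soft ceiling is easier to hold when tooling reports it as advice rather than enforcing it. Here is a minimal sketch, assuming a hypothetical helper named size_band and using the ~200 and ~400 figures quoted above as defaults. The band labels are illustrative, not taken from any study or tool.

```python
def size_band(changed_lines: int, comfortable: int = 200, ceiling: int = 400) -> str:
    """Classify a PR by changed lines a human must read.

    The thresholds are soft defaults (assumed from the ~200/~400 figures);
    the result is advisory and is not meant to fail a build.
    """
    if changed_lines < 0:
        raise ValueError("changed_lines must be non-negative")
    if changed_lines <= comfortable:
        return "comfortable"
    if changed_lines <= ceiling:
        return "within the soft ceiling"
    return "consider splitting"


if __name__ == "__main__":
    for n in (150, 400, 600):
        print(n, size_band(n))
```

Returning a label instead of a pass/fail result keeps the decision with the reviewer, who can still accept a coherent change that lands above the ceiling.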

Do generated files and lockfiles count toward PR size?

Practically, no. Reviewers scan them for legitimacy rather than reading them. What matters is the number of lines a human must actually understand. A 2,000-line PR that is 1,800 lines of lockfile is a small PR wearing a big coat.
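
One way to approximate "lines a human must actually understand" is to filter the output of git diff --numstat before summing it. This is a sketch under stated assumptions: the function name reviewable_lines and the GENERATED_PATTERNS list are hypothetical, and a real team would tune the patterns to its own repository.

```python
from fnmatch import fnmatchcase

# Hypothetical patterns for files reviewers scan rather than read; adjust per repo.
GENERATED_PATTERNS = (
    "*package-lock.json",
    "*yarn.lock",
    "*Cargo.lock",
    "*.min.js",
)


def reviewable_lines(numstat: str, patterns=GENERATED_PATTERNS) -> int:
    """Sum added + deleted lines from `git diff --numstat` output,
    skipping binary files and paths that match a generated-file pattern."""
    total = 0
    for line in numstat.splitlines():
        parts = line.split("\t")
        if len(parts) != 3:
            continue  # not an "added<TAB>deleted<TAB>path" row
        added, deleted, path = parts
        if added == "-" or deleted == "-":
            continue  # git reports binary files with "-" counts
        if any(fnmatchcase(path, pattern) for pattern in patterns):
            continue  # generated or lockfile content
        total += int(added) + int(deleted)
    return total


if __name__ == "__main__":
    sample = (
        "1200\t600\tpackage-lock.json\n"
        "120\t80\tsrc/api.py\n"
        "-\t-\tassets/logo.png\n"
    )
    print(reviewable_lines(sample))
```

On the sample, the 1,800 lockfile lines and the binary file drop out, leaving only the 200 lines in src/api.py as the number a reviewer has to reason about.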

My AI assistant produces large changes. Should I force-split everything?

Split when there is a real seam (refactor vs. behavior change, layer by layer). Don’t manufacture artificial fragments that can’t be understood alone; a coherent 600-line change can beat five incoherent 120-line ones. When a large PR is unavoidable, triage it and review it in passes rather than splitting it artificially.
