How Does Zuban Achieve Zero False Negatives in Python Type Checking?

Mar 17, 2026

I was debugging a production incident last month. The type checker had passed our code, but we still got a TypeError at runtime. After hours of investigation, I discovered the culprit: a false negative from our type checker.

That’s when I started looking at Zuban, a Python type checker that claims zero false negatives. Here’s what I learned about why this matters and how it works.

What’s the Problem with False Negatives?

False negatives occur when a type checker fails to report an error that should be caught according to the typing specification. Let me show you a real example:

from typing import overload

@overload
def process(x: int) -> str: ...
@overload
def process(x: str) -> int: ...

def process(x: int | str) -> int | str:
    return x  # Bug: returns wrong type for each overload

A type checker that only compares signatures will not flag this implementation, because on paper int | str is compatible with both overloads. The overloads promise that:

process(5) returns str
process("hello") returns int

But the implementation returns whatever you pass in, so process(5) hands back an int where the caller was promised a str. A false negative here means this bug slips through silently.

Why This Is More Dangerous Than False Positives

I used to get annoyed by false positives - type errors that weren’t really errors. But here’s the thing: false positives are just annoying. False negatives are dangerous.

┌─────────────────┬──────────────────┬─────────────────────┐
│ Error Type      │ Immediate Impact │ Long-term Impact    │
├─────────────────┼──────────────────┼─────────────────────┤
│ False Positive  │ Annoyance        │ Add # type: ignore  │
│                 │ Friction         │ Developer learns    │
│                 │                  │ why it's flagged    │
├─────────────────┼──────────────────┼─────────────────────┤
│ False Negative  │ Silence          │ Production bug      │
│                 │ False confidence │ Customer impact     │
│                 │                  │ Debugging nightmare │
└─────────────────┴──────────────────┴─────────────────────┘

The silence of false negatives is what makes them deadly. You think your code is safe. It isn’t.

The Benchmark Data

I found recent benchmark data from the Python Typing Specification Test Suite (March 2026):

┌────────────┬───────────────┬───────────┬─────────────────┬─────────────────┐
│ Checker    │ Fully Passing │ Pass Rate │ False Positives │ False Negatives │
├────────────┼───────────────┼───────────┼─────────────────┼─────────────────┤
│ pyright    │ 136/139       │ 97.8%     │ 15              │ 4               │
│ zuban      │ 134/139       │ 96.4%     │ 10              │ 0               │
└────────────┴───────────────┴───────────┴─────────────────┴─────────────────┘

The numbers surprised me:

Pyright has a slightly higher pass rate (97.8% vs 96.4%)
But Pyright has 4 false negatives, Zuban has 0
Zuban actually has fewer false positives too (10 vs 15)

This doesn’t match the usual trade-off narrative. Usually, strict checkers have more false positives. Zuban is both strict AND more precise.
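If you want to check the arithmetic yourself, here is a small sketch that recomputes the pass rates from the raw counts in the table. The dictionary layout and the pass_rate helper are my own names, not part of any benchmark tooling.

```python
# Figures copied from the benchmark table above.
results = {
    "pyright": {"passing": 136, "total": 139, "false_neg": 4},
    "zuban": {"passing": 134, "total": 139, "false_neg": 0},
}


def pass_rate(passing: int, total: int) -> float:
    """Return the share of fully passing test files as a percentage."""
    return round(100 * passing / total, 1)


for name, r in results.items():
    rate = pass_rate(r["passing"], r["total"])
    print(f"{name}: {rate}% pass rate, {r['false_neg']} false negatives")
```

The pass rate and the false negative count are separate columns, which is why a checker can lead on one and trail on the other.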

How Zuban Achieves Zero False Negatives

The specific implementation details aren’t publicly documented, so what follows is my reading of the general approaches that zero false negatives typically requires, not a description of confirmed internals.

Conservative Type Inference

The idea is that when a type is ambiguous, the checker takes the stricter path:

from typing import TypeVar, Generic

T = TypeVar('T')

class Box(Generic[T]):
    def __init__(self, value: T) -> None:
        self.value = value

    def get(self) -> T:
        return self.value

def process(box: Box[int] | Box[str]) -> None:
    value = box.get()
    # Some checkers might incorrectly narrow this

    if isinstance(value, int):
        # Zuban guarantees value is int here
        print(value + 10)
    else:
        # Zuban guarantees value is str here
        print(value.upper())

Complete Rule Coverage

Zuban aims to implement the rules in the Python typing specification without shortcuts. Let me show you a case where this matters:

from typing import Protocol

class Drawable(Protocol):
    def draw(self) -> None: ...

class Circle:
    def draw(self) -> None:
        print("Drawing circle")

def render(shape: Drawable) -> None:
    shape.draw()

# Zuban catches this - a checker with false negatives might not
def process_invalid() -> None:
    render("not a drawable")  # AttributeError waiting to happen

A type checker with false negatives might pass this code. At runtime, you’d get:

AttributeError: 'str' object has no attribute 'draw'
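When input comes from somewhere the type checker can't see, a runtime guard can back up the static check. Here is a minimal sketch using typing.runtime_checkable; the render_untrusted helper is a hypothetical name of mine, and note that isinstance against a protocol only checks that the method exists, not its signature.

```python
from typing import Protocol, runtime_checkable


@runtime_checkable
class Drawable(Protocol):
    def draw(self) -> None: ...


class Circle:
    def draw(self) -> None:
        print("Drawing circle")


def render_untrusted(shape: object) -> bool:
    """Draw shape if it structurally matches Drawable; report whether it did."""
    if isinstance(shape, Drawable):
        shape.draw()
        return True
    return False


print(render_untrusted(Circle()))          # draws, then reports success
print(render_untrusted("not a drawable"))  # skipped, no AttributeError
```

This doesn't replace the type checker. It just turns a crash into a handled case at the boundary.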

When This Matters Most

Not every project needs this level of strictness. But for certain use cases, zero false negatives is worth the trade-off.

Production Systems

def calculate_discount(price: int) -> float:
    return price * 0.1

def apply_discount(item: dict[str, int | None]) -> None:
    price = item.get("price", 0)  # Type: int | None

    # A checker with false negatives might not flag this:
    discount = calculate_discount(price)
    # Bug: price is None if the "price" key exists but maps to None

If your type checker has false negatives, it might not warn about price being potentially None. In production, you’d get a TypeError.

High-Stakes Applications

Financial systems: A type error could mean incorrect transactions
Healthcare software: Bugs could affect patient care
Security-sensitive code: Type confusion vulnerabilities

For these, the cost of a production bug far exceeds the cost of fixing a few extra type annotations.

My Trial-and-Error Process

I tested Zuban on an existing codebase. Here’s what happened:

# Install Zuban
pip install zuban

# Check your project
zuban check src/

src/handlers/payment.py:45: error: Argument 1 has type "int | None"
                                      but expected "int"
src/models/order.py:23: error: Incompatible return type
src/utils/validators.py:12: error: "str" has no attribute "parse"
...

At first, I had more errors than with Pyright. But here’s the key insight: every single one was a genuine type safety issue.

With Pyright, I had 15 false positives to suppress. With Zuban, I had 0 false negatives to worry about. The code that passed Zuban was genuinely type-safe.

Common Mistakes I’ve Seen

Mistake 1: Prioritizing Pass Rate Over Correctness

A higher pass rate sounds better, but if it includes false negatives, you’re getting false confidence.

Higher Pass Rate ≠ Better Type Safety

A checker with:
- 99% pass rate
- 10 false negatives

Is less safe than:
- 96% pass rate
- 0 false negatives

The 3% difference consists of errors you NEED to fix.

Mistake 2: Ignoring False Negative Rates

Teams often focus on developer experience (fewer false positives = happier developers). But they overlook the correctness dimension.

Mistake 3: Assuming All Type Checkers Are Equivalent

They’re not. Different implementations make different trade-offs:

┌─────────────┬────────────────┬─────────────────┐
│ Checker     │ Priority       │ Trade-off       │
├─────────────┼────────────────┼─────────────────┤
│ mypy        │ Compatibility  │ 76 false negs   │
│ pyright     │ Speed + compat │ 4 false negs    │
│ zuban       │ Correctness    │ 0 false negs    │
└─────────────┴────────────────┴─────────────────┘

The Limitation

I couldn’t find detailed technical documentation about Zuban’s specific implementation. The evidence is primarily from benchmark results and community insights. If you know more about how it achieves this, I’d love to learn.

When to Choose Zuban

Based on my experience, Zuban is ideal for:

New projects where you can start strict
Critical systems where type safety is non-negotiable
Teams that value correctness over convenience
Projects with comprehensive test coverage (so type errors are caught in CI)

It might not be ideal for:

Legacy codebases with many type issues
Teams that prioritize minimal friction
Projects where rapid iteration matters more than correctness

Key Takeaways

False negatives are more dangerous than false positives - They let bugs slip through silently
Zero false negatives means trust - When Zuban passes, your code is type-safe
The trade-off isn’t what you’d expect - Zuban has fewer false positives than Pyright too
Know your priorities - Correctness vs convenience is a real choice

Final Words + More Resources

My intention with this article was to help others by sharing my knowledge and experience. If you want to get in touch, you can contact me by email.

Here are the most important links from this article, along with further resources on this topic:

👨‍💻 Pyrefly Blog: Typing Conformance Comparison

Oh, and if you found these resources useful, don’t forget to support me by starring the repo on GitHub!