Experiment · Jun 17, 2026

Same prompt.
Five models.
Five reports.

LangChain audit: Opus 4.8, Fable 5, Sonnet 5, Sonnet 4.6, and Haiku 4.5 — which model to use, and how to combine them?

📋 Four-phase prompt 🐍 Repo: langchain

Executive summary

Repo health grades

Same code, different calibration. The honest grade isn't the highest.

Opus 4.8 · A− · Threat model
Fable 5 · A− · Strategy + plan · New
Sonnet 5 · B+ · Primary auditor
Sonnet 4.6 · B+ · Ops & CI
Haiku 4.5 · A · Inflated grade · Caution

Verdict

There's no single winner

The reports are complementary. Sonnet 5 improves on 4.6 but doesn't repeat all of its findings.

Pipeline = Haiku + Sonnet 5 + Sonnet 4.6 + Opus + Fable + human.

New · Sonnet 5

Sonnet 5 — Primary auditor

What it adds (that 4.6 doesn't)

SSRF transport only in 2 call sites — adoption gap
graph_mermaid.py:461 — unprotected requests.get
except Exception blocks that swallow without logging
208 type:ignore · 240 noqa · zero bare except
AGENTS ≡ CLAUDE · leftover audit reports at repo root
Acknowledges limits: shallow clone, no coverage %

Doesn't repeat: lockfile CI, default load(), README gpt-5.4 → see Sonnet 4.6
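
The "swallow without logging" finding can be approximated mechanically. The sketch below is not the method any of the models used; it is a crude heuristic, assuming that a handler written as `except Exception` counts as silent when its body contains neither a `raise` nor any function call (so a logging call of any kind clears it). The function name `silent_handlers` and the sample snippet are illustrative.

```python
import ast


def silent_handlers(source: str) -> list[int]:
    """Return line numbers of `except Exception` handlers that neither re-raise nor call anything."""
    hits = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        # Only handlers spelled `except Exception` (with or without `as name`).
        if not (isinstance(node.type, ast.Name) and node.type.id == "Exception"):
            continue
        body_nodes = [n for stmt in node.body for n in ast.walk(stmt)]
        reraises = any(isinstance(n, ast.Raise) for n in body_nodes)
        calls = any(isinstance(n, ast.Call) for n in body_nodes)
        # Heuristic: no raise and no call means nothing was logged or propagated.
        if not reraises and not calls:
            hits.append(node.lineno)
    return sorted(hits)


# Illustrative input: the first handler swallows, the second one logs.
SAMPLE = '''
try:
    risky()
except Exception:
    pass

try:
    risky()
except Exception as exc:
    logger.warning("failed: %s", exc)
'''

print(silent_handlers(SAMPLE))  # [4]
```

A real pass would likely need to distinguish logging calls from other calls; this version only separates "does something" from "does nothing".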

Strengths by model

Opus — Security & rigor

Elite threat modeling

TOCTOU / DNS rebinding on SSRF — Opus only
ShellToolMiddleware with default shell host
Dual LANGCHAIN_ENV bypass, broader than the docstring
[Fact] / [Judgment] labels
Partners without pre-commit · AGENTS.md ≡ CLAUDE.md
Plan: IP pinning via _transport.py

⚠ Missed: default load(), lockfile CI, broken README
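
Why TOCTOU / DNS rebinding matters for SSRF: if a validator resolves a hostname, approves the address, and then hands the hostname to an HTTP client that resolves it again, the second lookup can return a different (internal) address. IP pinning closes that gap by resolving once and connecting to the validated address. The sketch below illustrates the idea only; it is not LangChain's `_transport.py`, and the names `is_public`, `pin_address`, and `fake_resolver` are assumptions made for this example.

```python
import ipaddress
import socket


def is_public(ip: str) -> bool:
    """True only for globally routable addresses (rejects loopback, private, link-local)."""
    return ipaddress.ip_address(ip).is_global


def pin_address(host: str, port: int = 443, resolver=socket.getaddrinfo) -> str:
    """Resolve `host` once, validate every answer, and return one IP to connect to."""
    infos = resolver(host, port, type=socket.SOCK_STREAM)
    addresses = [info[4][0] for info in infos]
    if not addresses:
        raise ValueError(f"{host} did not resolve")
    # Reject the whole lookup if any answer is non-public, so a mixed
    # public/internal response cannot slip through.
    blocked = [ip for ip in addresses if not is_public(ip)]
    if blocked:
        raise ValueError(f"{host} resolves to non-public address(es): {blocked}")
    return addresses[0]


def fake_resolver(host, port, type=0):
    # Stand-in for DNS so the example runs offline; answers with loopback.
    return [(socket.AF_INET, type, 6, "", ("127.0.0.1", port))]


try:
    pin_address("internal.example", resolver=fake_resolver)
except ValueError as err:
    print(err)
```

The caller would then open the connection to the returned IP while still sending the original hostname for TLS and the Host header; that part is omitted here.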

New · Fable 5

Fable — Strategy & plan

From findings to executable backlog

Grade A− — same honest calibration as Opus
God files: 5 files >1,800 LOC · base.py 6,574
Unsafe default load() — top risk (like S4.6)
208 type:ignore · parked BLE/ERA rules · C90 off
Vendored mustache.py 704 LOC · usage.py swallows AttributeError
M0–M3 milestones with effort, risk, and non-goals

⚠ Missed: TOCTOU, default shell, SSRF 2 sites, lockfile CI → see Opus + S5 + S4.6
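
A "god files" list like the one above is easy to reproduce and verify. The sketch below assumes LOC means physical lines in `*.py` files and uses the 1,800-line threshold from the slide; how Fable actually counted is not stated, so totals may differ. `oversized_files` is a hypothetical helper name.

```python
from pathlib import Path


def oversized_files(root: str, threshold: int = 1800) -> list[tuple[str, int]]:
    """List (relative path, physical line count) for Python files above `threshold`, largest first."""
    results = []
    for path in Path(root).rglob("*.py"):
        # Count physical lines; undecodable bytes are replaced rather than raising.
        with path.open(encoding="utf-8", errors="replace") as handle:
            line_count = sum(1 for _ in handle)
        if line_count > threshold:
            results.append((str(path.relative_to(root)), line_count))
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling `oversized_files("path/to/checkout")` on a local clone returns the candidates to compare against the report.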

Sonnet 4.6 · Ops

Sonnet 4.6 — CI & adoption

Operational auditor

Lockfile check commented out in CI — High
Default load() with allowed_objects='core'
README with invalid gpt-5.4 · no SECURITY.md
Counts 16/8/7 except Exception in hot paths
Quick wins executable in hours
Complementary pass to Sonnet 5

Sonnet evolution

Sonnet 5 vs 4.6

Same B+ grade, different focus. Use them together in a pipeline rather than picking just one.

Sonnet 5 wins on…

SSRF / transport adoption
Methodology (no guessing)
Silent exceptions
Repo hygiene (audit artifacts)
noqa / type:ignore count

Sonnet 4.6 wins on…

Commented-out lockfile CI
Unsafe default load()
Broken README gpt-5.4
Missing SECURITY.md
langchain-classic deps

Strengths by model

Haiku — Architecture & map

Fast exploration

LOC table: base.py 6,574 lines
Callback ↔ tracer cycles — unique find
Duplication in block_translators/ (~900 lines)
v2 refactor roadmap (split Runnable into 5 modules)

⚠ Factual error: claims the lockfile is validated in CI; the check is actually commented out (see Sonnet 4.6)
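
Because Haiku's output needs verification, a cycle claim such as callback ↔ tracer can be checked independently once an import graph is available. The sketch below is a plain depth-first search over a dependency dictionary; the module names in the example graph are placeholders, not the real LangChain module layout, and building the graph from source is left out.

```python
from typing import Optional


def find_cycle(graph: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one import cycle as a path (first node repeated at the end), or None."""
    WHITE, GREY, BLACK = 0, 1, 2  # unvisited, on the current path, finished
    state = {node: WHITE for node in graph}
    stack: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        state[node] = GREY
        stack.append(node)
        for dep in graph.get(node, []):
            if state.get(dep, WHITE) == GREY:
                # Back edge: `dep` is already on the current path.
                return stack[stack.index(dep):] + [dep]
            if state.get(dep, WHITE) == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        state[node] = BLACK
        return None

    for node in graph:
        if state[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


# Placeholder graph: "callbacks" and "tracers" import each other.
EXAMPLE = {
    "callbacks": ["tracers"],
    "tracers": ["callbacks"],
    "runnables": ["callbacks"],
}
print(find_cycle(EXAMPLE))  # ['callbacks', 'tracers', 'callbacks']
```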

Security

Three layers of security

Opus

TOCTOU design
Default shell host
Dual env bypass

Sonnet 5

Barely adopted transport
graph_mermaid without SSRF
IP pinning as a strength

Sonnet 4.6

Default load() core
subprocess S603
Downstream CVE

Full security coverage = Opus + Sonnet 5 + Sonnet 4.6 + Fable (defaults & debt)

Exclusive findings

Who saw what?

Finding	Opus	Fable	Sonnet 5	Sonnet 4.6	Haiku
TOCTOU / DNS rebinding	✓	—	—	—	—
Default shell host	✓	—	—	—	—
SSRF transport only 2 sites	—	—	✓	—	—
graph_mermaid without SSRF	—	—	✓	—	—
Unsafe default load()	—	✓	—	✓	—
Actionable M0–M3 plan	—	✓	—	—	—
mustache.py vendored / C90 off	—	✓	—	—	—
Commented-out lockfile CI	—	—	—	✓	✗
Callback/tracer cycles	—	—	—	—	✓
Audit reports at repo root	—	✓	✓	—	—

✓ reported · — not reported · ✗ reported incorrectly
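
The table can be read as a mapping from finding to the set of models that reported it, which makes "what do we lose if we drop a model?" a one-line query. The sketch below transcribes the ✓ cells from the table; the ✗ for Haiku is treated as not reported. `COVERAGE` and `exclusive_findings` are names chosen for this example.

```python
# Finding -> models that reported it, copied from the table's check marks.
COVERAGE = {
    "TOCTOU / DNS rebinding": {"Opus"},
    "Default shell host": {"Opus"},
    "SSRF transport only 2 sites": {"Sonnet 5"},
    "graph_mermaid without SSRF": {"Sonnet 5"},
    "Unsafe default load()": {"Fable", "Sonnet 4.6"},
    "Actionable M0–M3 plan": {"Fable"},
    "mustache.py vendored / C90 off": {"Fable"},
    "Commented-out lockfile CI": {"Sonnet 4.6"},
    "Callback/tracer cycles": {"Haiku"},
    "Audit reports at repo root": {"Fable", "Sonnet 5"},
}


def exclusive_findings(coverage: dict[str, set[str]]) -> dict[str, list[str]]:
    """Group findings reported by exactly one model under that model's name."""
    result: dict[str, list[str]] = {}
    for finding, models in coverage.items():
        if len(models) == 1:
            (model,) = models  # unpack the single element
            result.setdefault(model, []).append(finding)
    return result


for model, findings in exclusive_findings(COVERAGE).items():
    print(f"{model}: {len(findings)} exclusive")
```

On this table every model has at least one finding nobody else reported, which is the basis for the "no single winner" verdict.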

Ranking by goal

Which model for what?

Sonnet auditor: 1. Sonnet 5 · 2. Sonnet 4.6
Security adoption: 1. Sonnet 5 · 2. Opus · 3. Sonnet 4.6
CI / Ops: 1. Sonnet 4.6 · 2. Sonnet 5 · 3. Opus
Threat model: 1. Opus · 2. Sonnet 5 · 3. Haiku
Strategy / plan: 1. Fable · 2. Sonnet 5 · 3. Opus
Ready to act on: 1. Sonnet 5 + Sonnet 4.6 + Opus + Fable · 2. Haiku
Architecture: 1. Haiku · 2. Opus · 3. Sonnet 5

Recommended flow

6-step pipeline

1. Haiku · LOC and architecture map.
2. Sonnet 5 · Primary audit + SSRF adoption.
3. Sonnet 4.6 · Ops pass: lockfile, README.
4. Opus · Threat model and default shell.
5. Fable · Strategy, M0–M3 milestones, quick wins.
6. Human · Merge into a single backlog.
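
The final merge step is mostly deduplication. The sketch below is one possible way to support the human reviewer, assuming each report has already been reduced to a list of finding titles: titles are matched case-insensitively, the reporting models are recorded, and findings corroborated by more models sort first. `merge_backlog` and the input shape are assumptions, not part of the experiment's tooling.

```python
def merge_backlog(reports: dict[str, list[str]]) -> list[tuple[str, list[str]]]:
    """Merge per-model finding titles into (title, reporting models), most corroborated first."""
    titles: dict[str, str] = {}
    sources: dict[str, list[str]] = {}
    for model, findings in reports.items():
        for title in findings:
            # Normalize case and whitespace so trivially different titles match.
            key = " ".join(title.lower().split())
            titles.setdefault(key, title.strip())
            models = sources.setdefault(key, [])
            if model not in models:
                models.append(model)
    merged = [(titles[key], sources[key]) for key in titles]
    return sorted(merged, key=lambda item: (-len(item[1]), item[0].lower()))


# Small input using findings named in this deck.
REPORTS = {
    "Sonnet 4.6": ["Commented-out lockfile CI", "Unsafe default load()"],
    "Fable": ["Unsafe default load()", "Vendored mustache.py"],
    "Opus": ["TOCTOU / DNS rebinding"],
}

for title, models in merge_backlog(REPORTS):
    print(f"{title}  <- {', '.join(models)}")
```

Exact-title matching will miss findings that two models phrase differently, which is why the last step stays with a human.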

Conclusion

Tiering by model,
not a single winner

Opus → threat review (A−)

Fable → strategy and M0–M3 plan (A−)

Sonnet 5 → primary auditor (B+)

Sonnet 4.6 → complementary ops pass

Haiku → exploration (always verify)

The honest takeaway for product: choose the model by task, not by the marketing around the most expensive tier.

CtrlNode · Prompts Catalog 17-06-fable
