Experiment · Jun 17, 2026

Same prompt.
Five models.
Five reports.

LangChain audit: Opus 4.8, Fable 5, Sonnet 5, Sonnet 4.6, and Haiku 4.5 — which model to use, and how to combine them?

📋 Four-phase prompt · 🐍 Repo: langchain

Executive summary

Repo health grades

Same code, different calibration. The honest grade isn't the highest.

Opus 4.8 · A− · Threat model
Fable 5 · A− · Strategy + plan · New
Sonnet 5 · B+ · Primary auditor
Sonnet 4.6 · B+ · Ops & CI
Haiku 4.5 · A · Inflated grade · Caution

Verdict

There's no single winner

The reports are complementary. Sonnet 5 improves on 4.6 but doesn't repeat all of its findings.

Pipeline = Haiku + Sonnet 5 + Sonnet 4.6 + Opus + Fable + human.

New · Sonnet 5

Sonnet 5 — Primary auditor

What it adds (that 4.6 doesn't)

  • SSRF transport only in 2 call sites — adoption gap
  • graph_mermaid.py:461 — unprotected requests.get
  • except Exception blocks that swallow without logging
  • 208 type:ignore · 240 noqa · zero bare except
  • AGENTS ≡ CLAUDE · leftover audit reports at repo root
  • Acknowledges limits: shallow clone, no coverage %

Doesn't repeat: lockfile CI, default load(), README gpt-5.4 → see Sonnet 4.6
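
The "except Exception blocks that swallow without logging" finding can be checked mechanically. The sketch below is one possible way to do it with Python's standard `ast` module; it is not the method Sonnet 5 used, and the `SAMPLE` source, the `LOG_METHODS` set, and the function names are assumptions made for illustration.

```python
import ast

# Hypothetical source to scan; `risky` is never called, the text is only parsed.
SAMPLE = '''
import logging
logger = logging.getLogger(__name__)

def quiet():
    try:
        risky()
    except Exception:
        pass

def logged():
    try:
        risky()
    except Exception:
        logger.exception("risky failed")

def reraised():
    try:
        risky()
    except Exception:
        raise
'''

# Assumption: any attribute call with one of these names counts as logging.
LOG_METHODS = {"debug", "info", "warning", "warn", "error", "exception", "critical"}


def _handles_visibly(handler: ast.ExceptHandler) -> bool:
    """True if the handler re-raises or calls something that looks like a logger method."""
    for node in ast.walk(handler):
        if isinstance(node, ast.Raise):
            return True
        if (
            isinstance(node, ast.Call)
            and isinstance(node.func, ast.Attribute)
            and node.func.attr in LOG_METHODS
        ):
            return True
    return False


def find_silent_handlers(source: str) -> list[int]:
    """Return line numbers of `except Exception` handlers that neither log nor re-raise."""
    silent = []
    for node in ast.walk(ast.parse(source)):
        if not isinstance(node, ast.ExceptHandler):
            continue
        catches_exception = isinstance(node.type, ast.Name) and node.type.id == "Exception"
        if catches_exception and not _handles_visibly(node):
            silent.append(node.lineno)
    return sorted(silent)


print(find_silent_handlers(SAMPLE))  # only the handler inside quiet() is reported
```

The heuristic is deliberately loose: it treats any `.error(...)`-style call as logging, so it can miss handlers that call an unrelated method with the same name.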

Strengths by model

Opus — Security & rigor

Elite threat modeling

  • TOCTOU / DNS rebinding on SSRF — Opus only
  • ShellToolMiddleware with default shell host
  • Dual LANGCHAIN_ENV bypass, broader than the docstring
  • [Fact] / [Judgment] labels
  • Partners without pre-commit · AGENTS.md ≡ CLAUDE.md
  • Plan: IP pinning via _transport.py

⚠ Missed: default load(), lockfile CI, broken README
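
To make the TOCTOU / DNS rebinding point concrete: if a URL's hostname is validated with one DNS lookup and then fetched with a second lookup, the answer can change in between. IP pinning resolves once, validates every answer, and connects to that exact address. The sketch below illustrates the idea only; it is not the contents of `_transport.py`, and `pin_address`, `system_resolver`, and the injectable `resolve` parameter are hypothetical names.

```python
import ipaddress
import socket
from typing import Callable

Resolver = Callable[[str], list[str]]


def system_resolver(host: str) -> list[str]:
    """Resolve a hostname to its IP addresses using the OS resolver."""
    infos = socket.getaddrinfo(host, None, proto=socket.IPPROTO_TCP)
    return [info[4][0] for info in infos]


def is_public(ip: str) -> bool:
    """Reject loopback, private, link-local, and other non-global addresses."""
    return ipaddress.ip_address(ip).is_global


def pin_address(host: str, resolve: Resolver = system_resolver) -> str:
    """Resolve once, validate every answer, and return the single IP to connect to.

    The caller is expected to connect to the returned IP (still sending the
    original hostname for Host/SNI) instead of resolving the name again.
    """
    addresses = resolve(host)
    if not addresses:
        raise ValueError(f"{host!r} did not resolve")
    blocked = [ip for ip in addresses if not is_public(ip)]
    if blocked:
        raise ValueError(f"{host!r} resolves to non-public address(es): {blocked}")
    return addresses[0]


# A fake resolver that simulates rebinding: public answer first, loopback second.
answers = iter([["8.8.8.8"], ["127.0.0.1"]])
pinned = pin_address("example.test", lambda host: next(answers))
print(pinned)  # the validated address; a second lookup would now return 127.0.0.1
```

Because the resolver is injected, the rebinding scenario can be exercised without network access; a check-then-fetch design would have validated `8.8.8.8` and then connected to `127.0.0.1`.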

New · Fable 5

Fable — Strategy & plan

From findings to executable backlog

  • Grade A− — same honest calibration as Opus
  • God files: 5 files >1,800 LOC · base.py 6,574
  • Unsafe default load() — top risk (like S4.6)
  • 208 type:ignore · parked BLE/ERA rules · C90 off
  • Vendored mustache.py 704 LOC · usage.py swallows AttributeError
  • M0–M3 milestones with effort, risk, and non-goals

Missed: TOCTOU, default shell, SSRF 2 sites, lockfile CI → see Opus + S5 + S4.6
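
The "god files" count is easy to reproduce locally. The sketch below is a minimal version, assuming physical line counts (blank lines and comments included); the slide does not say how Fable measured LOC, so totals may differ slightly. The function name and the default threshold of 1,800 are taken from the bullet above, not from any tool.

```python
from pathlib import Path


def oversized_files(root: Path, threshold: int = 1800) -> list[tuple[str, int]]:
    """List Python files under `root` with more than `threshold` physical lines."""
    results = []
    for path in root.rglob("*.py"):
        with path.open(encoding="utf-8", errors="replace") as handle:
            loc = sum(1 for _ in handle)
        if loc > threshold:
            results.append((path.relative_to(root).as_posix(), loc))
    # Largest files first, mirroring a LOC table.
    return sorted(results, key=lambda item: item[1], reverse=True)
```

Calling `oversized_files(Path("path/to/checkout"))` on a local clone returns `(relative_path, line_count)` pairs that can be compared against the figures in the report.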

Sonnet 4.6 · Ops

Sonnet 4.6 — CI & adoption

Operational auditor

  • Lockfile check commented out in CI — High
  • Default load() with allowed_objects='core'
  • README with invalid gpt-5.4 · no SECURITY.md
  • Counts 16/8/7 except Exception in hot paths
  • Quick wins executable in hours
  • Complementary pass to Sonnet 5
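
A commented-out lockfile check is the kind of finding a small script can confirm. The sketch below scans workflow text for commented lines that mention "lock". The `SAMPLE_WORKFLOW` content is invented for illustration and is not copied from the langchain repository; the function name is also an assumption.

```python
import re

# Hypothetical workflow snippet; the real CI file may look quite different.
SAMPLE_WORKFLOW = """\
jobs:
  lint:
    steps:
      - uses: actions/checkout@v4
      # - name: Check lockfile is up to date
      #   run: uv lock --check
      - name: Run linters
        run: make lint
"""

LOCK_PATTERN = re.compile(r"lock", re.IGNORECASE)


def commented_lock_steps(workflow_text: str) -> list[tuple[int, str]]:
    """Return (line_number, text) for commented-out lines that mention a lockfile."""
    hits = []
    for number, line in enumerate(workflow_text.splitlines(), start=1):
        stripped = line.strip()
        if stripped.startswith("#") and LOCK_PATTERN.search(stripped):
            hits.append((number, stripped))
    return hits


for number, text in commented_lock_steps(SAMPLE_WORKFLOW):
    print(number, text)
```

A plain text scan is used instead of a YAML parser because parsers discard comments, and the comments are exactly what this check needs to see.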

Sonnet evolution

Sonnet 5 vs 4.6

Same B+ grade, different focus. Run both in a pipeline instead of picking one.

Sonnet 5 wins on…

  • SSRF / transport adoption
  • Methodology (no guessing)
  • Silent exceptions
  • Repo hygiene (audit artifacts)
  • noqa / type:ignore count

Sonnet 4.6 wins on…

  • Commented-out lockfile CI
  • Unsafe default load()
  • Broken README gpt-5.4
  • Missing SECURITY.md
  • langchain-classic deps

Strengths by model

Haiku — Architecture & map

Fast exploration

  • LOC table: base.py 6,574 lines
  • Callback ↔ tracer cycles — unique find
  • Duplication in block_translators/ (~900 lines)
  • v2 refactor roadmap (split Runnable into 5 modules)

⚠ Factual error: claims the lockfile is validated in CI; Sonnet 4.6 found that check commented out
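
Since Haiku's output should always be verified, a finding like the callback ↔ tracer cycle is worth re-checking against an import graph. The sketch below shows a standard depth-first search for a cycle in a module dependency graph. The graph in `SAMPLE_IMPORTS` is a made-up stand-in, not the real langchain import structure, and building the graph from source files is left out.

```python
from typing import Optional

# Hypothetical dependency graph: module name -> modules it imports.
SAMPLE_IMPORTS = {
    "callbacks": ["tracers", "utils"],
    "tracers": ["callbacks"],
    "utils": [],
}


def find_cycle(graph: dict[str, list[str]]) -> Optional[list[str]]:
    """Return one import cycle as a list of modules (first == last), or None."""
    WHITE, GRAY, BLACK = 0, 1, 2  # unvisited, on the current path, fully explored
    color = {node: WHITE for node in graph}
    stack: list[str] = []

    def visit(node: str) -> Optional[list[str]]:
        color[node] = GRAY
        stack.append(node)
        for dep in graph.get(node, []):
            state = color.get(dep, WHITE)
            if state == GRAY:
                # dep is already on the current path, so the path closes a cycle.
                return stack[stack.index(dep):] + [dep]
            if state == WHITE:
                found = visit(dep)
                if found:
                    return found
        stack.pop()
        color[node] = BLACK
        return None

    for node in list(graph):
        if color[node] == WHITE:
            found = visit(node)
            if found:
                return found
    return None


print(find_cycle(SAMPLE_IMPORTS))
```

The search stops at the first cycle it finds, which is enough to confirm or refute a single reported cycle.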

Security

Three layers of security

Opus

  • TOCTOU design
  • Default shell host
  • Dual env bypass

Sonnet 5

  • Barely adopted transport
  • graph_mermaid without SSRF
  • IP pinning as a strength

Sonnet 4.6

  • Default load() core
  • subprocess S603
  • Downstream CVE

Full security coverage = Opus + Sonnet 5 + Sonnet 4.6 + Fable (defaults & debt)

Exclusive findings

Who saw what?

Finding                          Op   Fb   S5   S4.6  Hk
TOCTOU / DNS rebinding           ✓    -    -    -     -
Default shell host               ✓    -    -    -     -
SSRF transport only 2 sites      -    -    ✓    -     -
graph_mermaid without SSRF       -    -    ✓    -     -
Unsafe default load()            -    ✓    -    ✓     -
Actionable M0–M3 plan            -    ✓    -    -     -
mustache.py vendored / C90 off   -    ✓    -    -     -
Commented-out lockfile CI        -    -    -    ✓     -
Callback/tracer cycles           -    -    -    -     ✓
Audit reports at repo root       -    -    ✓    -     -
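
A "who saw what" matrix is a set problem: map each finding to the models that reported it, then keep the findings with exactly one reporter. The sketch below assumes the attributions stated on the per-model slides earlier in this deck; the `REPORTS` dictionary is a hand transcription for illustration, not parsed from the reports themselves.

```python
from collections import defaultdict

# Assumption: attributions transcribed by hand from the per-model slides.
REPORTS = {
    "Opus": {"TOCTOU / DNS rebinding", "Default shell host"},
    "Fable": {
        "Unsafe default load()",
        "Actionable M0–M3 plan",
        "mustache.py vendored / C90 off",
    },
    "Sonnet 5": {
        "SSRF transport only 2 sites",
        "graph_mermaid without SSRF",
        "Audit reports at repo root",
    },
    "Sonnet 4.6": {"Unsafe default load()", "Commented-out lockfile CI"},
    "Haiku": {"Callback/tracer cycles"},
}


def who_saw_what(reports: dict[str, set[str]]) -> dict[str, set[str]]:
    """Invert model -> findings into finding -> models."""
    seen: dict[str, set[str]] = defaultdict(set)
    for model, findings in reports.items():
        for finding in findings:
            seen[finding].add(model)
    return dict(seen)


def exclusive_findings(reports: dict[str, set[str]]) -> dict[str, str]:
    """Keep only findings reported by exactly one model."""
    return {
        finding: next(iter(models))
        for finding, models in who_saw_what(reports).items()
        if len(models) == 1
    }


for finding, model in sorted(exclusive_findings(REPORTS).items()):
    print(f"{finding}: {model}")
```

Findings that share wording across reports must be normalized to the same key first; here the titles are already identical, which real report text rarely is.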

Ranking by goal

Which model for what?

Goal               1st               2nd          3rd
Sonnet auditor     Sonnet 5          Sonnet 4.6
Security adoption  Sonnet 5          Opus         Sonnet 4.6
CI / Ops           Sonnet 4.6        Sonnet 5     Opus
Threat model       Opus              Sonnet 5     Haiku
Strategy / plan    Fable             Sonnet 5     Opus
Ready to act on    S5+S4.6+Opus+Fb   Haiku
Architecture       Haiku             Opus         Sonnet 5

Recommended flow

6-step pipeline

1. Haiku: LOC and architecture map.
2. Sonnet 5: Primary audit + SSRF adoption.
3. Sonnet 4.6: Ops pass: lockfile, README.
4. Opus: Threat model and default shell.
5. Fable: Strategy, M0–M3 milestones, quick wins.
6. Human: Merge into a single backlog.
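
The pipeline above can be sketched as ordered stages feeding one deduplicated backlog that a human then verifies. Everything in the sketch is an assumption for illustration: the `Finding` and `run_pipeline` names are hypothetical, and each stage is a stub returning finding titles from this deck where a real run would call the model and parse its report.

```python
from dataclasses import dataclass, field
from typing import Callable


@dataclass
class Finding:
    title: str
    sources: list[str] = field(default_factory=list)  # models that reported it
    verified: bool = False  # flipped by the human reviewer in step 6


Stage = Callable[[], list[str]]

# Stub stages in pipeline order; the titles are examples taken from the slides.
STAGES: list[tuple[str, Stage]] = [
    ("Haiku", lambda: ["base.py 6,574 lines"]),
    ("Sonnet 5", lambda: ["graph_mermaid without SSRF"]),
    ("Sonnet 4.6", lambda: ["Commented-out lockfile CI", "Unsafe default load()"]),
    ("Opus", lambda: ["Default shell host"]),
    ("Fable", lambda: ["Unsafe default load()"]),
]


def run_pipeline(stages: list[tuple[str, Stage]]) -> list[Finding]:
    """Run stages in order and merge their findings into a single backlog."""
    backlog: dict[str, Finding] = {}
    for model, stage in stages:
        for title in stage():
            key = title.strip().casefold()  # naive dedup key; real titles need more care
            finding = backlog.setdefault(key, Finding(title=title))
            if model not in finding.sources:
                finding.sources.append(model)
    return list(backlog.values())


backlog = run_pipeline(STAGES)
for item in backlog:
    print(f"{item.title} <- {', '.join(item.sources)} (verified={item.verified})")
```

Every item starts unverified, which keeps the human step explicit: nothing enters the final backlog as confirmed just because two models agreed on it.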

Conclusion

Tiering by model, not a single winner

Opus → threat review (A−)

Fable → strategy and M0–M3 plan (A−)

Sonnet 5 → primary auditor (B+)

Sonnet 4.6 → complementary ops pass

Haiku → exploration (always verify)

The honest takeaway for product: choose the model by task, not by the marketing around the most expensive tier.

CtrlNode · Prompts Catalog 17-06-fable