AI agent swarms that simulate dynamics no single model can produce.
Portfolio project · May 2026
I wanted to see how AI agents could be used to simulate supply chains, and what kind of value that could produce.
I'd been seeing tools come out that use AI agent swarms for one-shot prediction: tons of agents running in parallel, weighting their outputs against each other to produce a single forecast. I was already curious about supply chains and geopolitics and risk, and I noticed nobody had applied this technique to a domain where the interaction dynamics between actors are the whole point, not just an aggregation of opinions.
So I started building. Dvora is what came out of that.
Inputs in. Coordination dynamics out. Hours, not weeks.
Overview · scenario to findings.
The way it works: Dvora maps a real supply chain into a graph of every entity that matters. Companies, governments, banks, regulators, end customers. Every entity has its own data attached to it: who they are, how they act, their filings, their relationships. A brain of memories for each agent.
Then every entity becomes its own AI agent reasoning in that entity's best interest with that entity's actual context. You give the swarm a scenario, like an export control or a sanctions package or a permit denial, and the simulation runs round by round. Every actor responds based on what they know, what they want, and what the others just did.
This is for people who think about what would actually happen when a supply chain gets hit, not as a forecast but as a play-by-play. Defense procurement analysts. Manufacturers trying to harden or expand their supply. Government policy offices working on industrial strategy. Specialty risk underwriters pricing political and supply-chain disruption coverage. Hedge funds with critical-minerals or commodity exposure. People building agent systems and running into similar problems.
What you see in the output: where coalitions form and fracture, where binding constraints emerge from somewhere unexpected, where one actor's response opens a vulnerability or closes one. The findings aren't predictions. They're the dynamics of the system itself, made visible.
Spots positions about to shift. Reads coordination dynamics.
Spots supply-side vulnerabilities. Tests procurement strategies.
Surfaces policy gaps. Stress-tests interventions.
Maps strategic dependencies. Identifies single points of failure.
Dvora is in prototype. The core engine works, but there's a long way to go. Here's what runs today.
The data layer mattered most, so that's where I started. An ingestion engine reads source documents (industry reports, regulatory filings, academic papers, government strategy documents) and produces a curated supply chain graph. Four agents handle this: extraction, modeling, mapping, and quality. Entity identity gets enriched against Wikidata, OpenSanctions, GLEIF, and SEC EDGAR. I'm not rebuilding canonical data that already exists in public sources.
Before any agent runs, a deterministic setup tier pulls live data from ten external APIs (trade flows, commodity production, macroeconomic indicators, sanctions, export controls, geopolitical risk, public filings) and runs three computational layers on top (concentration metrics, network topology, critical-minerals classification). The output is a per-scenario world state, cached and reproducible.
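To make "concentration metrics" concrete, here is a minimal sketch of one common choice, the Herfindahl-Hirschman Index (HHI). The source does not say which metric Dvora uses, so HHI is an assumption, and the function name and example volumes below are hypothetical.

```python
from typing import Mapping


def herfindahl_hirschman_index(volumes: Mapping[str, float]) -> float:
    """Return the HHI on the 0-10,000 scale for a set of supplier volumes.

    Each supplier's share is expressed in percent, squared, and summed.
    A single supplier gives 10,000; many small suppliers approach 0.
    """
    total = sum(volumes.values())
    if total <= 0:
        raise ValueError("total volume must be positive")
    return sum((100.0 * volume / total) ** 2 for volume in volumes.values())


# Hypothetical volumes for illustration only (not data from any Dvora run).
example_volumes = {"supplier_a": 70.0, "supplier_b": 20.0, "supplier_c": 10.0}
hhi = herfindahl_hirschman_index(example_volumes)  # 70^2 + 20^2 + 10^2 = 5400
```

Because the function is pure and depends only on its inputs, the same cached trade data always yields the same score, which is the property a deterministic setup tier needs.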
Then the simulation runs. Every entity in the graph gets instantiated as its own Claude Agent SDK instance with its own system prompt, model assignment, and access to a five-tool toolkit (forecasting, optimization, network analysis, propagation, web search).
The custom message bus routes everything between them with strict schema validation, write-ahead persistence, and structural visibility rules. The runner orchestrates rounds, manages rate limits, and persists state after every round, so the simulation is restartable from any checkpoint.
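A rough sketch of what per-round persistence and restart can look like. This is not Dvora's runner: the JSON file layout, the function names, and the `step` callback are all assumptions made for illustration, and rate limiting is left out.

```python
import json
import os
import tempfile
from pathlib import Path
from typing import Any, Callable, Dict, Optional

State = Dict[str, Any]


def save_checkpoint(directory: Path, round_no: int, state: State) -> Path:
    """Write one round's state atomically: temp file first, then rename."""
    directory.mkdir(parents=True, exist_ok=True)
    target = directory / f"round_{round_no:04d}.json"
    fd, tmp_name = tempfile.mkstemp(dir=directory, suffix=".tmp")
    with os.fdopen(fd, "w", encoding="utf-8") as handle:
        json.dump({"round": round_no, "state": state}, handle)
    os.replace(tmp_name, target)  # a crash never leaves a half-written file
    return target


def load_latest_checkpoint(directory: Path) -> Optional[dict]:
    """Return the highest-numbered checkpoint, or None if there is none."""
    if not directory.exists():
        return None
    files = sorted(directory.glob("round_*.json"))
    if not files:
        return None
    return json.loads(files[-1].read_text(encoding="utf-8"))


def run_rounds(
    directory: Path, total_rounds: int, step: Callable[[int, State], State]
) -> State:
    """Run rounds up to total_rounds, resuming after the last saved round."""
    latest = load_latest_checkpoint(directory)
    state: State = latest["state"] if latest else {}
    start = latest["round"] + 1 if latest else 1
    for round_no in range(start, total_rounds + 1):
        state = step(round_no, state)
        save_checkpoint(directory, round_no, state)
    return state
```

This sketch resumes from the latest checkpoint only. Restarting from an earlier round would mean loading that round's file instead, which the one-file-per-round layout allows.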
What comes out: a full transcript, per-agent reasoning traces, tool call logs, state-block history, and a deterministic run summary.
Here's the pipeline, end to end.
Documents in. Curated graph. Running simulation.
Four-agent team turns source documents into a curated supply chain graph.
Deterministic Python pulls live data and runs computational layers. Output is a cached, reproducible world state.
Every entity runs as its own Claude Agent SDK instance with a tier-routed model. Custom message bus mediates everything.
Pipeline · documents to simulation.
May 2026, rare earth magnets supply chain under MOFCOM HREE export controls.
These numbers are from the first test run. They vary by scenario and supply chain.
Three coordination chains worth pulling out:
In rounds 1 through 4, Chinese ministries (MIIT, NDRC, MOFCOM, MLR) negotiated a combined Dy+Tb mass-fraction threshold for finished magnet exports. This addresses a leakage vector that an upstream-only oxide control would miss. Each step responded to an operational problem the prior step exposed.
Demonstrates: heterogeneous agent interaction at multi-step depth, with each step responding to a specific operational problem the prior step exposed.
In round 1, Apple challenged MP Materials on whether their existing US supply commitment actually protected against heavy-REE exposure routed through Chinese processors. MP's response, JPMorgan's structured-finance proposal to bridge the gap, and DLA's stockpile inquiry formed a cross-actor chain on a real vulnerability.
Demonstrates: emergent operational specificity from agent interaction. Apple's question wasn't prompted. MP's answer addresses a specific real vulnerability (Shenghe routing). The chain of subsequent commitments forms only because each agent had its own institutional context.
Lynas, asked about US heavy-REE expansion timelines, named a specific Texas wastewater permitting situation as the binding constraint. Not capital, not technology, not ore. The kind of operational specificity a real Government Affairs lead would write.
Demonstrates: institutional specificity from per-entity context. The Seadrift permit, TCEQ jurisdiction, and EPA Region 6 are real-world specifics the agent surfaced because the entity's data substrate included them.
Here's what one agent does in a single round.
Per-entity reasoning, mediated by the message bus.
One agent, one round.
What it is: every entity in the supply chain runs as its own independent agent, built on the Claude Agent SDK. The agent has its own institutional context, its own LLM calls, its own reasoning trace, its own state.
Why this choice: a single model role-playing many entities produces narrative consistency, not interaction. Real per-entity agents reason in parallel and respond to each other's specific output. That's what produces emergent dynamics.
What it is: all inter-agent communication routes through a custom asyncio message bus. Messages get validated against a strict schema. Visibility is enforced structurally: point-to-point, broadcast, or observable (regulators with jurisdiction see it automatically).
Why this choice: information asymmetry is what makes negotiation realistic. A regulator should only see what it has jurisdiction to see. A private commercial conversation should stay private. Building this into the bus instead of agent prompts means the structure can't be bypassed.
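A minimal sketch of structural visibility on an asyncio bus, under stated assumptions. The class names, the message fields, and the jurisdiction-tag lookup are hypothetical, not Dvora's schema, and write-ahead persistence is omitted. The point it illustrates is that the bus computes the audience itself, so no agent prompt can widen it.

```python
import asyncio
from dataclasses import dataclass
from enum import Enum
from typing import Dict, Iterable, Optional, Set


class Visibility(Enum):
    POINT_TO_POINT = "point_to_point"
    BROADCAST = "broadcast"
    OBSERVABLE = "observable"


@dataclass(frozen=True)
class Message:
    sender: str
    body: str
    visibility: Visibility
    recipient: Optional[str] = None     # required unless BROADCAST
    jurisdiction: Optional[str] = None  # required for OBSERVABLE


class MessageBus:
    def __init__(self) -> None:
        self._inboxes: Dict[str, asyncio.Queue] = {}
        self._regulators: Dict[str, Set[str]] = {}  # jurisdiction -> agent ids

    def register(self, agent_id: str, regulates: Iterable[str] = ()) -> None:
        """Give an agent an inbox and record any jurisdictions it oversees."""
        self._inboxes[agent_id] = asyncio.Queue()
        for jurisdiction in regulates:
            self._regulators.setdefault(jurisdiction, set()).add(agent_id)

    def inbox(self, agent_id: str) -> asyncio.Queue:
        return self._inboxes[agent_id]

    def _audience(self, msg: Message) -> Set[str]:
        """Validate the message and decide who may see it."""
        if msg.sender not in self._inboxes:
            raise ValueError(f"unknown sender: {msg.sender}")
        if msg.visibility is Visibility.BROADCAST:
            return set(self._inboxes) - {msg.sender}
        if msg.recipient not in self._inboxes:
            raise ValueError(f"unknown recipient: {msg.recipient}")
        audience = {msg.recipient}
        if msg.visibility is Visibility.OBSERVABLE:
            if msg.jurisdiction is None:
                raise ValueError("observable messages need a jurisdiction")
            audience |= self._regulators.get(msg.jurisdiction, set())
        return audience - {msg.sender}

    async def publish(self, msg: Message) -> Set[str]:
        """Deliver the message and return the set of agents that received it."""
        audience = self._audience(msg)
        for agent_id in audience:
            await self._inboxes[agent_id].put(msg)
        return audience
```

In this sketch a point-to-point message reaches only its recipient, while an observable message also lands in the inbox of every regulator registered for its jurisdiction, without the sender having to name them.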
What it is: each entity is assigned a tier reflecting strategic importance. Tier 1 entities default to Opus, tier 2 to Sonnet, tier 3 to Haiku, with optional manual promotion.
Why this choice: reasoning depth should scale with what's being reasoned about. A government ministry's policy decision deserves more compute than a tier-3 long-tail entity's procurement check. This keeps cost controlled while preserving fidelity where it matters.
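The tier routing described above can be sketched in a few lines. The `Entity` fields and the `promoted_to` override are hypothetical names, and the model labels are placeholders for the model families named in the text, not real API model identifiers.

```python
from dataclasses import dataclass
from typing import Optional

# Placeholder labels for the model families; not actual model ID strings.
DEFAULT_MODEL_BY_TIER = {1: "opus", 2: "sonnet", 3: "haiku"}


@dataclass(frozen=True)
class Entity:
    name: str
    tier: int                          # 1 = most strategically important
    promoted_to: Optional[int] = None  # optional manual promotion


def assign_model(entity: Entity) -> str:
    """Pick a model from the entity's tier, honoring a manual promotion."""
    effective_tier = (
        entity.promoted_to if entity.promoted_to is not None else entity.tier
    )
    if effective_tier not in DEFAULT_MODEL_BY_TIER:
        raise ValueError(f"unknown tier: {effective_tier}")
    return DEFAULT_MODEL_BY_TIER[effective_tier]
```

Keeping the mapping in one table means the cost profile of a run can be changed by editing a single dictionary rather than each agent's configuration.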
What it is: entity identity (legal name, country, public identifiers, parent/subsidiary structure, classification) gets sourced from canonical public databases like Wikidata, OpenSanctions, GLEIF, and SEC EDGAR. The ingestion engine extracts only the chain-specific structure on top.
Why this choice: canonical entity data is a solved problem. Building it from scratch reproduces work the public databases have already done. The differentiation is the simulation, not the data pipeline.
Three things matter most, in order.
First, the deliverable layer. A synthesis output that turns a 9 MB transcript into a 30-page scenario report with ranked vulnerabilities, cascading effects, and evidence trails. This is what closes the gap between an interesting simulation run and something actually useful.
Second, calibration. Running the engine against historical events whose outcomes are known, to surface what kinds of dynamics the simulation captures well and where it misses.
Third, scenario engagements. Running Dvora against specific customer questions: a fund's specific position, a manufacturer's specific exposure, a policy office's specific decision. Not generic disruption runs.
I'm building this in public. If supply chains, scenario analysis, geopolitical risk, or agent-based modeling is your space, or if you're building something similar and running into the same problems, I'd like to hear from you. Open to ideas, pushback, or pointers to people I should be reading.
adinzweigbaum@gmail.com · linkedin.com/in/adin-zweigbaum-11996019b