Project dossier
Sentinel
Multi-agent market microstructure simulator for liquidity shocks and hidden order behavior.
What it solves
Overview
Sentinel models market participants inside a simulated limit order book so Ayush can study how institutional traders, market makers, and retail flow create emergent price behavior. Interview focus: be ready to explain the event-driven simulation loop, price-time matching, WebSocket update model, agent latency, market maker inventory, optional PPO or genetic-programming policies, live-shadow data replay, and why simulator state is kept in a single long-running backend process.
System design
Architecture
The system is layered around agent perception, strategy reasoning, order execution, and anomaly detection. A FastAPI backend owns the simulation loop while the Next.js dashboard renders book depth, agent positions, and liquidity alerts. The source repo also documents an event kernel that schedules agent wakeups and order arrivals, specialized agents beyond the basic three archetypes, predictor endpoints for liquidity and large-order signals, a Zustand-powered dashboard store, and deployment limits caused by in-memory simulator state and WebSocket streaming.
Architecture diagram
Visualization layer
Interactive Next.js screens show live book depth, price movement, agent inventory, and anomaly flags.
Simulation API layer
FastAPI exposes simulation controls, scenario presets, and streaming endpoints for dashboard updates.
Market engine layer
The matching engine maintains bids, asks, cancellations, trade history, and price-time priority.
Agent intelligence layer
Institutional, market maker, and retail archetypes run independent perceive-reason-execute loops.
Realtime transport layer
A WebSocket endpoint broadcasts market updates so the dashboard can render price, depth, liquidity, large-order, and agent panels without polling every metric separately.
Policy experimentation layer
Optional PPO and genetic-programming market-maker policies can be loaded locally to compare rule-based agents against trained policies without making production startup heavy.
Implementation surface
Tech stack
Simulation engine, agent models, matching logic, and anomaly detectors.
Async API surface for starting simulations and streaming market state.
Dashboard shell for scenario exploration and real-time visualization.
Type-safe rendering of order book rows, event payloads, and metrics.
Frontend store for live market state, connection status, dashboard panels, and buffered WebSocket updates.
Charting layer for price curves, liquidity gauges, and dashboard time-series panels.
Training environment interface for optional reinforcement-learning market-maker policies.
Provider-backed replay path for comparing synthetic simulator behavior with market-data shaped scenarios.
Operational flow
How it works
A scenario starts with a configured order book and a population of agents. Each tick lets agents perceive state, choose actions, submit orders, and feed the resulting market events into detection logic.
Initialize market state
The engine creates a limit order book with starting liquidity, spread settings, tick size, and participant inventory.
This gives every run a reproducible baseline before institutional or retail behavior changes the book.
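A minimal sketch of a seeded starting book, assuming a symmetric ladder around a mid price. The function name `build_initial_book` and its parameters (`mid`, `tick`, `levels`, `base_qty`) are illustrative and not taken from the Sentinel repo.

```python
import random
from collections import deque


def build_initial_book(seed, mid=100.0, tick=0.05, levels=5, base_qty=100):
    """Return (bids, asks) as dicts mapping price -> deque of (order_id, qty)."""
    rng = random.Random(seed)  # private RNG so runs never share global random state
    bids, asks = {}, {}
    order_id = 0
    for level in range(levels):
        # First level sits one tick away from mid on each side.
        offset = (level + 1) * tick
        bid_price = round(mid - offset, 2)
        ask_price = round(mid + offset, 2)
        for book, price in ((bids, bid_price), (asks, ask_price)):
            order_id += 1
            qty = base_qty + rng.randint(0, 50)  # seeded jitter in resting size
            book[price] = deque([(order_id, qty)])
    return bids, asks


bids, asks = build_initial_book(seed=42)
print("best bid:", max(bids), "best ask:", min(asks))
```

Because all randomness flows through one seeded generator, two runs with the same seed start from an identical book.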
Run agent loops
Agents read depth, recent trades, volatility, and their own inventory before choosing limit, market, or cancel actions.
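The perceive-reason-execute loop can be sketched for a rule-based market maker as follows. This is an illustrative simplification, assuming the agent only sees top-of-book and its own inventory; the class names, the `max_inventory` limit, and the `submit` callback are hypothetical.

```python
from dataclasses import dataclass


@dataclass
class MarketView:
    best_bid: float
    best_ask: float
    inventory: int  # signed position of this agent


@dataclass
class Order:
    side: str  # "buy" or "sell"
    price: float
    qty: int


class RuleBasedMarketMaker:
    def __init__(self, max_inventory=100, quote_qty=10):
        self.max_inventory = max_inventory
        self.quote_qty = quote_qty

    def perceive(self, snapshot, inventory):
        # Reduce the shared book state to what this agent is allowed to see.
        return MarketView(snapshot["best_bid"], snapshot["best_ask"], inventory)

    def reason(self, view):
        # Quote both sides at the touch, but stop adding to a position
        # once the inventory limit is reached on that side.
        orders = []
        if view.inventory < self.max_inventory:
            orders.append(Order("buy", view.best_bid, self.quote_qty))
        if view.inventory > -self.max_inventory:
            orders.append(Order("sell", view.best_ask, self.quote_qty))
        return orders

    def step(self, snapshot, inventory, submit):
        view = self.perceive(snapshot, inventory)
        orders = self.reason(view)
        for order in orders:
            submit(order)  # execute: hand the order to the engine
        return orders
```

Keeping `reason` free of side effects makes it easy to swap in a trained policy later without touching perception or execution.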
Match orders
The matching engine applies price-time priority, updates queues, emits trades, and recalculates top-of-book state.
Detect market anomalies
Liquidity shock logic watches depth collapse, while hidden-order logic watches repeated small prints that behave like a larger parent order.
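One way to score the hidden-order half of this step is to measure how much of the recent tape consists of small prints on a single side. The heuristic below is an assumption for illustration, including the `window` and `small_qty` parameters; the source only states that repeated small prints are watched.

```python
def hidden_order_score(trades, window=20, small_qty=10):
    """Share of the last `window` prints that are small and on the dominant side.

    `trades` is a sequence of (side, qty) tuples, oldest first.
    """
    recent = list(trades)[-window:]
    if not recent:
        return 0.0
    small_sides = [side for side, qty in recent if qty <= small_qty]
    if not small_sides:
        return 0.0
    buys = small_sides.count("buy")
    dominant = max(buys, len(small_sides) - buys)
    return dominant / len(recent)


# Eighteen small buys mixed with two large sells look like a sliced parent order.
tape = [("buy", 5)] * 18 + [("sell", 200)] * 2
print(hidden_order_score(tape))
```

A score like this is an inference from observable prints, so it is better reported as a warning level than as a confirmed detection.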
Stream state to the dashboard
The frontend receives book snapshots, trade events, and detection markers for interview-ready visual explanations.
Schedule wakeups and latency
The event kernel schedules agent wakeups, order arrivals, and simulated latencies so not every participant reacts at the same instant.
This matters in interviews because latency and sequencing change queue position, spread capture, and the realism of HFT or market-maker behavior.
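A minimal sketch of such a kernel, assuming a priority queue keyed by simulated time. The `EventKernel` class, the event names, and the latency values are hypothetical and chosen only to show how latency changes arrival order.

```python
import heapq
import itertools


class EventKernel:
    def __init__(self):
        self._queue = []
        self._seq = itertools.count()  # tie-breaker keeps same-time events FIFO
        self.now = 0.0

    def schedule(self, delay, name, payload=None):
        heapq.heappush(self._queue, (self.now + delay, next(self._seq), name, payload))

    def run(self, handler, until=float("inf")):
        # Pop events in time order and advance the simulated clock.
        while self._queue and self._queue[0][0] <= until:
            self.now, _, name, payload = heapq.heappop(self._queue)
            handler(self, name, payload)


arrivals = []


def handler(kernel, name, payload):
    if name == "wakeup":
        # The order reaches the book only after the agent's latency elapses.
        kernel.schedule(payload["latency"], "order_arrival", payload["agent"])
    elif name == "order_arrival":
        arrivals.append(payload)


kernel = EventKernel()
kernel.schedule(1.0, "wakeup", {"agent": "retail", "latency": 0.005})
kernel.schedule(1.0, "wakeup", {"agent": "hft", "latency": 0.001})
kernel.run(handler)
print(arrivals)  # ['hft', 'retail']
```

Both agents wake at the same instant, yet the lower-latency agent's order arrives first and therefore wins queue position.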
Publish WebSocket packets
After each meaningful state update, the backend broadcasts market state, prediction scores, and agent metrics to the dashboard over a live socket.
The REST endpoints are useful for snapshots and controls, but the socket is the right path for high-frequency dashboard updates.
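As an illustration of what one broadcast could carry, the sketch below bundles market state, prediction scores, and agent metrics into a single JSON packet. The `MarketUpdatePacket` class and all field names are assumptions, not the actual Sentinel wire format.

```python
import json
from dataclasses import dataclass, asdict


@dataclass
class MarketUpdatePacket:
    tick: int
    best_bid: float
    best_ask: float
    bid_depth: list      # [[price, qty], ...], best level first
    ask_depth: list
    predictions: dict    # e.g. {"liquidity_shock": 0.12, "large_order": 0.64}
    agent_metrics: dict  # agent id -> {"position": ..., "pnl": ...}

    def to_json(self):
        payload = asdict(self)
        # Derived fields are computed once on the server so every panel agrees.
        payload["mid_price"] = round((self.best_bid + self.best_ask) / 2, 4)
        payload["spread"] = round(self.best_ask - self.best_bid, 4)
        return json.dumps(payload)


packet = MarketUpdatePacket(
    tick=17,
    best_bid=101.20,
    best_ask=101.30,
    bid_depth=[[101.20, 300]],
    ask_depth=[[101.30, 250]],
    predictions={"large_order": 0.64},
    agent_metrics={"mm-1": {"position": -20, "pnl": 12.5}},
)
print(packet.to_json())
```

Sending one combined packet per update lets the dashboard refresh all panels from the same simulator state.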
Optionally compare trained policies
Local runs can load PPO or GP market-maker policies through environment flags, then compare their behavior against deterministic rule agents.
PPO is disabled by default for deployment speed, which is an important tradeoff between demo reliability and ML sophistication.
Concept depth
Key concepts
A limit order book is the core exchange data structure. Bids are ranked from highest to lowest price, asks from lowest to highest, and orders at the same price are filled by arrival time.
In Sentinel: the book is the shared environment where every autonomous trader competes for queue position and liquidity.
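The ranking rule can be shown with a small sort. The order tuples and ids below are illustrative values only.

```python
# Each order: (order_id, side, price, arrival_seq). Values are illustrative.
orders = [
    ("b1", "buy", 100.00, 1),
    ("b2", "buy", 100.05, 2),
    ("b3", "buy", 100.05, 3),
    ("a1", "sell", 100.15, 4),
    ("a2", "sell", 100.10, 5),
    ("a3", "sell", 100.10, 6),
]


def ranked(side):
    # Bids sort by descending price, asks by ascending price;
    # arrival sequence breaks ties at the same price.
    sign = -1 if side == "buy" else 1
    same_side = [o for o in orders if o[1] == side]
    return sorted(same_side, key=lambda o: (sign * o[2], o[3]))


bid_queue = [o[0] for o in ranked("buy")]   # ['b2', 'b3', 'b1']
ask_queue = [o[0] for o in ranked("sell")]  # ['a2', 'a3', 'a1']
print(bid_queue, ask_queue)
```

Order `b2` is ahead of `b3` only because it arrived earlier at the same price, which is why latency matters to every agent.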
Implementation evidence
Code highlights
Price-time matching loop
A compact version of the engine logic that fills marketable orders by best price and queue age.
The queue always checks the opposite side because buys match asks and sells match bids.
Remaining limit quantity rests on the book; remaining market quantity does not.
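A compact sketch of such a loop, assuming per-price FIFO queues stored in plain dicts. The `OrderBook` class and its `submit` signature are hypothetical; a production engine would likely keep price levels in a sorted structure instead of calling `min` and `max` on each pass.

```python
from collections import deque


class OrderBook:
    def __init__(self):
        self.bids = {}  # price -> deque of [order_id, remaining_qty]
        self.asks = {}

    def submit(self, order_id, side, qty, price=None):
        """Match an incoming order; price=None means a market order.

        Returns a list of trades as (maker_id, taker_id, price, qty).
        """
        opposite = self.asks if side == "buy" else self.bids
        trades = []
        while qty > 0 and opposite:
            best = min(opposite) if side == "buy" else max(opposite)
            if price is not None:
                crosses = best <= price if side == "buy" else best >= price
                if not crosses:
                    break  # limit price no longer reaches the best level
            queue = opposite[best]
            maker = queue[0]  # oldest order at the best price fills first
            fill = min(qty, maker[1])
            trades.append((maker[0], order_id, best, fill))
            maker[1] -= fill
            qty -= fill
            if maker[1] == 0:
                queue.popleft()
            if not queue:
                del opposite[best]
        if qty > 0 and price is not None:
            # Unfilled limit quantity rests on its own side of the book.
            own = self.bids if side == "buy" else self.asks
            own.setdefault(price, deque()).append([order_id, qty])
        return trades


book = OrderBook()
book.submit("a1", "sell", 5, 100.0)
book.submit("a2", "sell", 5, 100.0)
book.submit("a3", "sell", 5, 99.9)
print(book.submit("b1", "buy", 8, 100.0))
```

The buy for 8 takes all of `a3` at 99.9 first, then 3 from `a1` at 100.0, because `a1` arrived before `a2` at that price.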
Liquidity shock signal
Depth collapse is treated as a change-rate problem rather than a raw depth threshold.
The ratio makes the detector work across different scenario sizes.
A threshold near 0.38 catches abrupt depletion without flagging normal queue churn.
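A sketch of a ratio-based signal, assuming the ratio is the fraction of depth lost relative to a rolling average of recent depth. That definition and the `window` size are assumptions; only the 0.38 threshold comes from the notes above.

```python
from collections import deque


class LiquidityShockDetector:
    def __init__(self, window=10, threshold=0.38):
        self.history = deque(maxlen=window)  # recent total depth readings
        self.threshold = threshold

    def update(self, depth):
        """Record a depth reading and return (drop_ratio, is_shock)."""
        drop = 0.0
        if self.history:
            baseline = sum(self.history) / len(self.history)
            if baseline > 0:
                # Fraction of the recent baseline that has disappeared.
                drop = max(0.0, (baseline - depth) / baseline)
        self.history.append(depth)
        return drop, drop >= self.threshold


detector = LiquidityShockDetector()
for depth in [1000, 1000, 1000, 1000, 1000, 950, 500]:
    drop, shock = detector.update(depth)
print(round(drop, 3), shock)
```

Because the signal is a ratio, a book with depth 10 falling to 5 scores the same as a book with depth 1000 falling to 500.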
WebSocket update coalescing
Dashboard updates should be buffered so the UI renders the newest state at a controlled cadence instead of re-rendering for every simulator packet.
Only the latest packet is rendered, which protects the browser during dense simulation bursts.
This is a UI performance decision, not a backend correctness shortcut.
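The latest-wins idea can be sketched independently of the UI framework. The example below uses Python for consistency with the other sketches in this dossier; in the dashboard the same logic would live in the frontend store and be flushed by a timer. The `LatestPacketBuffer` class is hypothetical.

```python
class LatestPacketBuffer:
    """Keep only the newest packet between render flushes."""

    def __init__(self):
        self._pending = None
        self.dropped = 0  # packets overwritten before they were rendered

    def push(self, packet):
        if self._pending is not None:
            self.dropped += 1
        self._pending = packet

    def flush(self, render):
        """Render the newest packet, if any. Returns True when something rendered."""
        if self._pending is None:
            return False
        packet, self._pending = self._pending, None
        render(packet)
        return True


rendered = []
buffer = LatestPacketBuffer()
for tick in range(100):  # a dense burst arriving between two flushes
    buffer.push({"tick": tick})
buffer.flush(rendered.append)
print(rendered, buffer.dropped)  # [{'tick': 99}] 99
```

One render call replaces a hundred, and the simulator itself still processed every tick.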
Policy loading guard
The runtime can keep ML policy support optional so production startup does not depend on large training dependencies.
Feature flags keep local research paths separate from the default deployment path.
The simulator can fall back to deterministic behavior when model artifacts are unavailable.
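A sketch of such a guard, assuming the `RL_POLICY_ENABLED` flag named elsewhere in this dossier. The `loader` callable, the accepted truthy strings, and the `RuleBasedPolicy` fallback class are hypothetical.

```python
import os


class RuleBasedPolicy:
    """Deterministic fallback used when no trained policy is available."""
    name = "rule_based"


def load_market_maker_policy(env=None, loader=None):
    """Return a trained policy only if the flag is on and loading succeeds."""
    env = os.environ if env is None else env
    flag = env.get("RL_POLICY_ENABLED", "false").strip().lower()
    enabled = flag in {"1", "true", "yes"}
    if not enabled or loader is None:
        return RuleBasedPolicy()
    try:
        # The loader is expected to import heavy ML dependencies lazily,
        # so a default deployment never pays that startup cost.
        return loader()
    except (ImportError, FileNotFoundError):
        return RuleBasedPolicy()


policy = load_market_maker_policy(env={})
print(policy.name)  # rule_based
```

Passing the loader in as a callable keeps the heavy import out of module scope, which is what keeps default startup light.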
Contracts
API design
Base URL: http://localhost:8000
/simulation/start
Creates a scenario run with agent mix, seed, tick count, and starting liquidity.
Request: { "seed": 42, "ticks": 1200, "institutionalAgents": 3, "retailAgents": 40 }
Response: { "runId": "sentinel-42", "status": "running" }
/simulation/{runId}/snapshot
Returns current best bid/ask, depth by level, agent inventory, and alert state.
Response: { "midPrice": 101.25, "spread": 0.05, "alerts": ["liquidity_shock"] }
/simulation/{runId}/stream
Streams book snapshots and trade events for the dashboard replay.
/api/simulation/mode
Switches between synthetic simulation and live-shadow replay behavior without changing the dashboard contract.
Request: { "mode": "LIVE_SHADOW", "provider": "upstox" }
Response: { "mode": "LIVE_SHADOW", "status": "ready" }
/api/prediction/large-order
Returns the large-order or iceberg-style detection signal computed from recent order-flow statistics.
Response: { "warningLevel": "medium", "score": 0.64, "patterns": ["twap_like_child_orders"] }
/api/agents/metrics
Returns per-agent inventory, P&L, order counts, and strategy health metrics for the dashboard.
/api/live-shadow/upstox/ltp
Fetches live last-traded-price data for selected Upstox instruments used by live-shadow scenarios.
State model
Database design
Data relationship diagram
run_state
Ephemeral simulation state keyed by run id, seed, tick count, and scenario configuration.
order_book
Bid and ask queues grouped by price level and ordered by arrival sequence.
trade_log
Execution events used by replay, metrics, and hidden-order detection.
agent_metrics
Per-agent metrics derived during a run: position, cash, P&L, active orders, fills, and risk counters.
prediction_snapshot
Computed liquidity and large-order signals attached to a simulator tick for replay and debugging.
live_shadow_candle
Provider-derived market data normalized into simulator replay inputs when live-shadow mode is used.
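As an example of how an agent_metrics record could be derived during a run, the sketch below updates position and cash on each fill and marks P&L to the current mid price. The `AgentMetrics` class and its field names are assumptions covering only a subset of the metrics listed above.

```python
from dataclasses import dataclass


@dataclass
class AgentMetrics:
    agent_id: str
    position: int = 0
    cash: float = 0.0
    fills: int = 0

    def apply_fill(self, side, price, qty):
        # Buys add to position and spend cash; sells do the opposite.
        signed = qty if side == "buy" else -qty
        self.position += signed
        self.cash -= signed * price
        self.fills += 1

    def pnl(self, mid_price):
        # Mark-to-market: realized cash plus open position valued at mid.
        return self.cash + self.position * mid_price


metrics = AgentMetrics("mm-1")
metrics.apply_fill("buy", 100.0, 10)
metrics.apply_fill("sell", 101.0, 4)
print(metrics.position, metrics.pnl(101.0))  # 6 10.0
```

Deriving these values from the trade log instead of storing them separately keeps the in-memory state small and replayable.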
Architecture decisions
Trade-offs
Backend framework
FastAPI over Django
FastAPI is lighter for a simulation API and gives async WebSocket support without pulling in Django's full ORM/admin stack.
Market realism model
Multiple agent archetypes over a single aggregate price model
The purpose is to explain emergent microstructure behavior, so independent actors are more useful than one averaged process.
State persistence
In-memory run state over relational persistence for every tick
Simulations generate dense event streams. Keeping hot state in memory makes experimentation faster; durable storage can be added for saved runs.
Realtime channel
WebSocket feed over REST polling
The dashboard needs fast-changing market state, prediction flags, and agent metrics. A socket avoids repeated HTTP overhead and keeps the UI connected to a long-running simulation.
Training dependency loading
Feature-flagged PPO/GP policies over always loading ML artifacts
Policy experiments are valuable locally, but production demos should start quickly and still work when heavy RL dependencies or model files are missing.
Simulator hosting
Long-running backend process over serverless API functions
The simulation loop and WebSocket stream require process memory and durable connection state, which fit App Service or containers better than short-lived serverless handlers.
Lessons learned
Challenges and solutions
Problem: A naive order book can become slow when every tick scans all orders.
Solution: Separate queues by price level so the engine only inspects the best executable levels.
Lesson: Market simulators need data structures that match exchange rules, not generic arrays of orders.
Problem: Hidden orders are not directly visible because only child executions reach the tape.
Solution: Detect repeated small fills and replenishment patterns instead of searching for a single large order.
Lesson: A detector should measure observable behavior and clearly state what remains inferred.
Problem: A browser cannot render every simulation tick when the backend emits high-frequency packets.
Solution: Buffer the latest WebSocket update in the frontend store and flush on a controlled interval.
Lesson: Realtime visualization is a sampling problem as much as a transport problem.
Problem: RL policy support can make deployment fragile if the runtime assumes model files are always present.
Solution: Guard policy loading with RL_POLICY_ENABLED and provide deterministic strategy fallbacks.
Lesson: Research features should degrade cleanly when they are not essential to the product path.
Problem: Live-shadow provider data has different identifiers, authentication, and freshness behavior.
Solution: Normalize provider inputs behind dedicated Groww and Upstox fetch/replay endpoints before they reach the simulator view.
Lesson: Provider integration should be isolated from the market engine contract.
Runbook
Requirements and future work
Requirements
- Python 3.x with FastAPI for the simulation API.
- Node.js and npm for the Next.js dashboard.
- Scenario presets that define agent mix, seed, and liquidity settings.
- Browser support for live dashboard rendering and streaming updates.
- Backend environment supports WebSockets; a serverless-only backend is not enough for the live simulator.
- RL_POLICY_ENABLED defaults to false; enable it only when local PPO or GP artifacts and dependencies are installed.
- NEXT_PUBLIC_WS_URL must point at the backend WebSocket endpoint for dashboard streaming.
- Groww or Upstox credentials are optional and only required for live-shadow replay paths.
Future improvements
- Persist selected simulation runs for replay and comparison.
- Add ABIDES-style event scheduling for more realistic time handling.
- Support custom agent strategy plug-ins from the dashboard.
- Externalize simulator state so multiple backend replicas can coordinate saved runs and active sessions.
- Add replay compression so long simulations can be stored without persisting every raw packet.
- Expose an agent plug-in interface with validation around inventory limits, latency, and allowed order types.
- Add a side-by-side policy lab comparing rule-based, PPO, and GP market-maker behavior on the same seed.
Active recall
Interview Q&A
Why is a multi-agent simulator a better fit than a single price model for Sentinel?
How does price-time priority affect agent strategy?
What would you change if simulations needed to run for millions of ticks?
How does the event kernel make Sentinel more realistic than a fixed tick loop?
Why is WebSocket the right transport for the dashboard?
What should be included in a market update packet?
How would you validate a hidden-order detector?
Why keep simulator state in memory for the first version?
How do PPO and genetic-programming policies differ in this project?
What dashboard signals would you use to explain a liquidity shock?
What would fail if Sentinel were deployed only as serverless functions?