Virtual Trading Plus

Experiment settings

Trade settings (N = 6)

Fundamentalists N(F) 0

Trend followers N(T) 0

Prior Bias Prior Noise

First T2 · R4-⅔ First T4 · R4-⅓

Risk preferences

Risk-loving · 33%

Risk-neutral · 34%

Risk-averse · 33%

AI endpoint Plan II · LLM + utility forms

Required for Plan II. At every period boundary, each Utility agent calls the LLM for a direct trading action (BUY_NOW, SELL_NOW, BID_1, BID_3, ASK_1, ASK_3, HOLD). The structured prompt includes market rules, decision-making principles, role-specific guidance, and the explicit closed-form utility function ($U_L$, $U_N$, $U_A$). Fields are read fresh from the DOM on every run (no localStorage); if the API key is empty, the Start button refuses to launch.

Provider

API key

Endpoint (optional)

Model

AI endpoint Plan III · LLM + risk label only

Required for Plan III. At every period boundary, each Utility agent calls the LLM for a direct trading action. The structured prompt supplies market rules, decision-making principles, and only a natural-language risk-preference label (risk-loving, risk-neutral, or risk-averse); no closed-form utility function is provided. Fields are read fresh from the DOM on every run (no localStorage); if the API key is empty, the Start button refuses to launch.

Provider

API key

Endpoint (optional)

Model

Paper constants Dufwenberg, Lindqvist & Moore (2005), §I. Design

N = 6: Subjects per session; “At each session, six subjects participated in a sequence of four consecutive markets for an experimental asset.” §I, p. 1733
rounds / session = 4: Consecutive markets played by the same subjects; “A session involved four consecutive markets. In the following, we shall talk in terms of four different rounds. Note the distinction between rounds and periods; a round (being a market) consists of ten periods.” §I, p. 1733
$T = 10$: Asset life, in periods (per round); “An asset's life span is ten periods.” §I, p. 1732
dividend ∈ {0, 20}¢: Per-period draw, equiprobable; “In each period, it pays a dividend of 0 or 20 U.S. cents, with equal probability.” §I, p. 1732
$\mathbb{E}[d] = 10$¢: Expected dividend per period; “The expected dividend in each period is 10 cents (= ½ × 0 cents + ½ × 20 cents).” §I, footnote 5
$\mathrm{FV}_t = \mathbb{E}[d] \cdot (T - t + 1)$: Fundamental value, by backward induction; “With k periods remaining, the fundamental value is k × 10 cents.” §I, p. 1732
endowment A 200¢, 6 shares: One of two discrete starting bundles; “Before a market opened, half of the traders started with 200 cents and six assets, while each of the other traders started with 600 cents and two assets.” §I, p. 1733
endowment B 600¢, 2 shares: The other discrete starting bundle; both bundles have an identical buy-and-hold value of 800¢ under the risk-neutral fundamental $\mathrm{FV}_1 = 100$¢ (200 + 6 × 100 = 600 + 2 × 100 = 800); the two types differ only in inventory/cash mix, so the split is distributional, not value-creating. §I, p. 1733
round-4 replacement R4-⅔ or R4-⅓: Two-treatment design; “In the fourth round, depending on treatment, two or four experienced subjects who had participated in the first three rounds were randomly selected, removed, and replaced by the same number of inexperienced subjects.” The paper labels these conditions by the fraction of experienced subjects remaining in round 4: R4-⅔ (four veterans + two fresh, shorthand T2) and R4-⅓ (two veterans + four fresh, shorthand T4); the R4-⅔ / R4-⅓ notation appears in the hypothesis row of Table 2. §I, p. 1733; Table 2, p. 1735
sessions = 10: Five per treatment (R4-⅔ and R4-⅓); the multi-session batch runner in the DLM panel reproduces DLM's 10-session design by sequencing 5 × T2 (R4-⅔) then 5 × T4 (R4-⅓) through the simulator in one click; each session uses a fresh engine seed and a fresh two-type endowment draw. §I, Table 1
payoff Σ final cash + 500¢: Session payoff per subject; “Subjects were privately paid, in cash, the amount of their final cash holdings from each round. They were also paid a show-up fee of $5.” All four rounds count; shares held at the end of a round are worth nothing (the asset’s life span has ended). §I, p. 1735
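The fundamental-value staircase and the parity of the two endowment bundles follow directly from the constants above. The sketch below works through that arithmetic; the function names `fundamentalValue` and `buyAndHoldValue` are illustrative and are not taken from the simulator's source.

```javascript
// Constants quoted from DLM (2005) above; helper names are hypothetical.
const T = 10;               // asset life in periods
const DIVIDENDS = [0, 20];  // equiprobable per-period dividend, in cents

// Expected dividend: mean of the two equiprobable outcomes (10 cents).
const expectedDividend =
  DIVIDENDS.reduce((a, b) => a + b, 0) / DIVIDENDS.length;

// FV_t = E[d] * (T - t + 1): (T - t + 1) dividend draws remain at the start of period t.
function fundamentalValue(t) {
  if (!Number.isInteger(t) || t < 1 || t > T) {
    throw new RangeError(`period must be an integer in 1..${T}`);
  }
  return expectedDividend * (T - t + 1);
}

// Risk-neutral buy-and-hold value of a starting bundle at period 1.
function buyAndHoldValue(cash, shares) {
  return cash + shares * fundamentalValue(1);
}

const staircase = Array.from({ length: T }, (_, i) => fundamentalValue(i + 1));
console.log(staircase);                // 100, 90, ..., 10
console.log(buyAndHoldValue(200, 6));  // 800
console.log(buyAndHoldValue(600, 2));  // 800
```

Both bundles evaluate to the same figure, which is why the endowment split changes the cash/inventory mix but not expected wealth.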

Hidden Constants

ticks / period = 18: Agent decision rounds inside one period; DLM 2005 runs a continuous 2-minute z-Tree double auction per period; this simulator discretizes that window into 18 decision rounds (≈ one agent turn every 6.7 real-time seconds) so the engine loop can step deterministically. 18 is dense enough to reproduce the bubble-crash pattern while keeping the replay buffer compact. (engine.js, period-boundary trigger)
naive prior weight = 0.60: Belief blend for naive Utility agents; Weight on the agent's own prior when blending incoming peer messages: $V_i^{\text{post}} = 0.60 \cdot V_i^{\text{prior}} + 0.40 \cdot \bar{m}$, i.e. $w = 0.60$ in the Plan I formula (see Architecture Figure 3). Not specified by DLM 2005, which studies human subjects and has no belief-update model. Chosen so naive agents move noticeably toward peers without collapsing onto them. (agents.js, UTILITY_DEFAULTS.naivePriorWeight)
skeptical prior weight = 0.90: Belief blend for skeptical Utility agents; Same convex combination as the naive weight but $w = 0.90$: $V_i^{\text{post}} = 0.90 \cdot V_i^{\text{prior}} + 0.10 \cdot \bar{m}$, so a skeptical agent hears messages but is barely moved by them. Not in DLM 2005; introduced so the strategy cube contains a "listen but don't trust" archetype. (agents.js, UTILITY_DEFAULTS.skepticalPriorWeight)
adaptive weight cap = 0.50: Max one-period belief shift toward peers; Upper bound on the fraction of belief an adaptive agent can shift toward the trust-weighted message mean $\bar{m}$ in a single period: even with fully-trusted senders, $w \geq 0.50$ so $V_i^{\text{post}}$ is at most 50% $\bar{m}$ + 50% $V_i^{\text{prior}}$. Not in DLM 2005; guards against runaway over-update from a single high-trust period. (agents.js, UTILITY_DEFAULTS.adaptiveWeightCap)
valuation noise = ±3%: Per-tick uniform noise on the Utility-agent prior; Before any bias or message update, each Utility agent draws $\text{prior}_t = \mathrm{FV}_t \cdot (1 + b_i + \varepsilon)$, $\varepsilon \sim \mathcal{U}[-n,\, n]$, $n = 0.03$ (see Architecture Figure 2). Not in DLM 2005; added so trade decisions do not degenerate to lockstep when every biased/unbiased agent starts each tick from the identical prior. (agents.js, UTILITY_DEFAULTS.valuationNoise)
trust λ = 0.30: EMA learning rate for the pairwise trust update; Pairwise trust is updated as $\tau_{r \to s} \leftarrow (1 - \lambda)\,\tau_{r \to s} + \lambda \cdot \text{closeness}$, where closeness $= \max(0,\, 1 - |\hat{v}_s - \mathrm{VWAP}_t| / \mathrm{VWAP}_t)$. $\lambda = 0.30$ weights each new observation at 30%. Not in DLM 2005, which has no messaging layer; chosen for a balance between responsiveness and stability. (engine.js, TrustTracker period close-out)
passive fill probability = 0.30: $p_{\text{fill}}$ heuristic for scoring non-crossing quotes; Expected-utility score for a passive quote is $\mathrm{EU}(\alpha) = p_{\text{fill}} \cdot U(w_1) + (1 - p_{\text{fill}}) \cdot U(w_0)$ with $p_{\text{fill}} = 0.30$ (see Architecture Figure 2). A full model would estimate $p_{\text{fill}}$ from order-book state; this is a deliberate constant placeholder and is not proposed by DLM 2005. (agents.js, UtilityAgent scoring loop)
bias magnitude = 15%: Persistent over/under-valuation of biased Utility agents; Applied as $b_i = \delta_i \cdot \beta$ with $\beta = 0.15$; sign set by the per-slot bias direction $\delta_i \in \{-1, 0, +1\}$ (see Architecture Figure 2). Drives the biased U-agent slots in the default strategy cube (U2, U4, U5). Not in DLM 2005; chosen large enough to perturb the market without dominating the risk-preference split. (agents.js, UTILITY_DEFAULTS.biasAmount)
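The bias and noise constants combine in the prior draw $\text{prior}_t = \mathrm{FV}_t \cdot (1 + b_i + \varepsilon)$. The following is a minimal sketch of that draw, not the code in agents.js: `drawPrior` is a hypothetical name, and mulberry32 is only an assumed stand-in for the simulator's seeded PRNG, whose algorithm this page does not state.

```javascript
// Assumption: mulberry32 stands in for the simulator's (unspecified) seeded PRNG.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6D2B79F5) | 0;
    let t = Math.imul(a ^ (a >>> 15), a | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296; // uniform on [0, 1)
  };
}

// prior_t = FV_t * (1 + b_i + eps), with b_i = delta_i * beta and eps ~ U[-n, n].
function drawPrior(fv, delta, rng, beta = 0.15, n = 0.03) {
  if (![-1, 0, 1].includes(delta)) {
    throw new RangeError("delta must be -1, 0 or +1");
  }
  const bias = delta * beta;
  const eps = (2 * rng() - 1) * n;           // rescale [0, 1) to [-n, n)
  return Math.max(0, fv * (1 + bias + eps)); // clamp at zero
}

const rng = mulberry32(42);
const optimistPrior = drawPrior(100, +1, rng); // lies between 112 and 118
const unbiasedPrior = drawPrior(100, 0, rng);  // lies between 97 and 103
console.log(optimistPrior.toFixed(2), unbiasedPrior.toFixed(2));
```

Re-creating the generator with the same seed reproduces the same sequence of priors, which is what makes a run replayable.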

Session: — / 10

Round: 1 / 4

Period: 1 / 10

Tick: 0

Price: —

Fundamental: —

Mispricing: —

Volume · period: 0

Agents Pre-run draft · editable before the simulation starts

Note

Cash: experimental-currency balance $c_i$ held by agent i at tick t, used to finance bids and grown by realized sales plus end-of-period dividend receipts. The pre-run editable value is the initial endowment; in Dufwenberg, Lindqvist & Moore (2005) subjects were seeded with either 200¢ or 600¢, while this simulator draws each slot uniformly from [800, 1200]¢.
Shares: holding $q_i$ of the finite-life asset at tick t (the pre-run editable value is the initial endowment). Each held share pays a random dividend drawn from {0, 20}¢ at the end of every trading period (DLM 2005), so the theoretical risk-neutral fundamental value at the start of period t is $\mathrm{FV}_t = \mathbb{E}[d] \cdot (T - t + 1)$. DLM endowment classes held 6 or 2 shares; this simulator draws from {2, 3, 4}.
Wealth: mark-to-fundamental total wealth, defined as $c_i + q_i \cdot \mathrm{FV}_t$, or for Utility agents as $c_i + q_i \cdot \hat{V}_i$ (Lopez-Lira 2025). The Normalized Agent Utility plot shows utility at this wealth divided by the agent's initial utility.
P&L: running change in total wealth relative to the initial endowment, reported in experimental cents. Positive values render in green, losses in red. Aggregated across all agents, P&L equals the cumulative dividends paid, so the market is zero-sum up to the dividend stream, as in the Smith–Suchanek–Williams design replicated by DLM.
Subj V: the Utility agent's private subjective valuation per share, i.e. the posterior $V_i^{\text{post}}$ from the active plan (Architecture Figure 3), updated each tick from $\text{prior}_t = \mathrm{FV}_t \cdot (1 + b_i + \varepsilon)$ via the Plan I/II/III belief-revision protocol. Corresponds to the valuation field in Lopez-Lira's (2025) TradeDecisionSchema.
Report: the valuation $\hat{v}_m$ the Utility agent broadcasts to peers in its messages. Under communication strategy $\sigma_m = D$ (deceptive), $\hat{v}_m \neq V_m$ via the distortion multiplier $\phi_m$ (see Architecture Figure 3, Plan I card); the lie-gap magnitude drives the trust EMA update $\tau_{r \to s} \leftarrow (1 - \lambda)\,\tau_{r \to s} + \lambda \cdot \text{closeness}$ and the mean-lie-magnitude statistic in the Experiment Metrics table.
Last action: the most recent decision taken by agent i at tick t, displayed as a coloured tag on the card. In Plan I the agent selects $\alpha^\star_{i,t} = \arg\max_\alpha \mathrm{EU}(\alpha)$ over $\alpha_{i,t} \in \{\text{hold},\, \text{buy@}A_t,\, \text{sell@}B_t,\, \text{bid},\, \text{ask}\}$ scored under the risk-typed utility functional (see Architecture Figure 2).
Subtitle: for classic agents, the strategy class (Fundamentalist, Trend follower, Random ZI, Experienced) together with its set membership. For Utility agents, the risk preference and its functional: Risk-loving (convex, upside-seeking), Risk-neutral (linear expected value), and Risk-averse (concave, downside-sensitive).
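The Plan I action choice described under "Last action" can be sketched as an argmax over five candidate actions scored with $\mathrm{EU}(\alpha) = p_{\text{fill}} \cdot U(w_1) + (1 - p_{\text{fill}}) \cdot U(w_0)$. This is an illustration under stated assumptions rather than the simulator's scoring loop: the object shapes, the name `chooseAction`, the one-share order size, and the feasibility checks are all hypothetical, and the passive quote prices are taken as given inputs.

```javascript
// Risk-typed utilities on wealth normalized by initial wealth (x = w / initialWealth).
const UTILITY = {
  loving: (x) => x * x,         // U_L, convex
  neutral: (x) => x,            // U_N, linear
  averse: (x) => Math.sqrt(x),  // U_A, concave
};

// Hypothetical shapes: agent = {cash, shares, valuation, risk, initialWealth},
// quotes = {bestAsk, bestBid, bidPrice, askPrice}; prices in cents, one share per order.
function chooseAction(agent, quotes, pFillPassive = 0.30) {
  const { cash, shares, valuation, risk, initialWealth } = agent;
  const U = (w) => UTILITY[risk](Math.max(0, w) / initialWealth);
  const wealthNoTrade = cash + shares * valuation;
  const scored = [{ action: "hold", eu: U(wealthNoTrade) }];

  // side = +1 buys one share, side = -1 sells one share.
  const consider = (action, side, price, pFill) => {
    if (price == null) return;             // no such quote available
    if (side > 0 && cash < price) return;  // cannot afford the purchase
    if (side < 0 && shares < 1) return;    // nothing to sell
    const wealthIfFilled = cash - side * price + (shares + side) * valuation;
    const eu = pFill * U(wealthIfFilled) + (1 - pFill) * U(wealthNoTrade);
    scored.push({ action, eu });
  };

  consider("buy@A", +1, quotes.bestAsk, 1);            // crossing: fills for sure
  consider("sell@B", -1, quotes.bestBid, 1);           // crossing: fills for sure
  consider("bid", +1, quotes.bidPrice, pFillPassive);  // passive quote
  consider("ask", -1, quotes.askPrice, pFillPassive);  // passive quote

  // argmax; exact ties keep the earlier entry, so "hold" wins a tie.
  return scored.reduce((best, c) => (c.eu > best.eu ? c : best));
}

const exampleAgent = { cash: 500, shares: 3, valuation: 100, risk: "neutral", initialWealth: 800 };
const exampleQuotes = { bestAsk: 90, bestBid: 85, bidPrice: 80, askPrice: 110 };
const best = chooseAction(exampleAgent, exampleQuotes);
console.log(best.action, best.eu.toFixed(4)); // buy@A 1.0125
```

In the example the agent values a share at 100¢, so buying at the 90¢ ask with certainty beats the 30% chance of buying at 80¢.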

Trade & Dividend Feed

Figure 1

Transaction Price Trajectory versus Risk-Neutral Fundamental Value

risk-neutral fundamental value at the start of period t

Tick-level transaction prices (accent line, one dot per executed trade) plotted against the deterministic step function $\mathrm{FV}_t$ (amber dashes). Alternating vertical bands delimit the ten trading periods. In the Dufwenberg, Lindqvist & Moore (2005) design a rational market should track the step line exactly; persistent excursions above it are the bubble, and the crash toward $\mathrm{FV}_T = \mathbb{E}[d]$ in the final period is the collapse.

Note

$p_t$: observed transaction price at tick t
$\mathrm{FV}_t$: theoretical fundamental value at the start of period t
$\mathbb{E}[d]$: expected per-period dividend (drawn uniformly from {0, 20}¢)
$T$: terminal period of the finite-life asset

Order Book

BIDS

Price	Qty	Agent

ASKS

Price	Qty	Agent

Figure 2

Mispricing Magnitude and Price-to-Fundamental Ratio

$|p_t - \mathrm{FV}_t|$ and $\rho_t$ · absolute and relative mispricing

Absolute departure of the observed price from the theoretical fundamental value, filled as a red area. In Lopez-Lira (2025) the same information is expressed as the price-to-fundamental ratio $\rho_t = p_t / \mathrm{FV}_t$: values above one mark an overvaluation regime, values below one mark an undervaluation regime, and $\rho_t \approx 1$ is consistent with rational pricing. The Experiment Metrics panel reports the normalized-deviation and amplitude statistics derived from this series.

Note

$|p_t - \mathrm{FV}_t|$: absolute mispricing at tick t
$\rho_t = p_t / \mathrm{FV}_t$: price-to-fundamental ratio (Lopez-Lira 2025)
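Both series in this figure come from one price and one fundamental value per tick. A small sketch, with the hypothetical name `mispricing` and an assumed tolerance for calling a price "near rational" (the page does not state one):

```javascript
// Absolute mispricing |p - FV| and price-to-fundamental ratio p / FV for one tick.
// `tolerance` is an illustrative assumption used only for the regime label.
function mispricing(price, fv, tolerance = 0.05) {
  if (!(fv > 0)) throw new RangeError("fundamental value must be positive");
  const ratio = price / fv;
  let regime = "near rational";
  if (ratio > 1 + tolerance) regime = "overvalued";
  else if (ratio < 1 - tolerance) regime = "undervalued";
  return { absolute: Math.abs(price - fv), ratio, regime };
}

const sample = mispricing(120, 100);
console.log(sample.absolute, sample.ratio, sample.regime); // 20 1.2 overvalued
```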

Figure 3

Trade Volume per Period

shares transacted in period t

Sum of share quantities exchanged within each trading period. High and persistent bars indicate active speculation; the classic Smith–Suchanek–Williams bubble is typically associated with a volume peak in the inflation phase followed by a cliff as the asset approaches expiry.

Note

$\sum_{j \in \mathcal{T}_t} q_j$: total share volume traded in period t
$q_j$: order quantity of a single executed trade

Figure 4

Transaction Density over Price × Period

two-dimensional trade histogram

Two-dimensional histogram of share quantity binned by transaction price (vertical axis) and trading period (horizontal axis). Warm cells concentrate the market's liquidity. Comparing the heat cloud against the downward-sloping fundamental staircase reveals whether the market is trading near rational value or persistently above it.

Note

Cell value: cumulative share volume in the (price, period) bin
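The binning behind such a heatmap can be sketched as follows. The fixed 10¢ bin width, the trade record shape `{period, price, qty}`, and the name `tradeDensity` are assumptions made for illustration; the page does not say how the simulator chooses its bins.

```javascript
// Accumulate traded quantity into (period, price-bin) cells.
// Key format "period|binFloor" is an arbitrary choice for this sketch.
function tradeDensity(trades, binWidth = 10) {
  const cells = new Map();
  for (const { period, price, qty } of trades) {
    const binFloor = Math.floor(price / binWidth) * binWidth; // lower edge of the bin
    const key = `${period}|${binFloor}`;
    cells.set(key, (cells.get(key) ?? 0) + qty);
  }
  return cells;
}

const density = tradeDensity([
  { period: 1, price: 104, qty: 2 },
  { period: 1, price: 109, qty: 1 },
  { period: 1, price: 95, qty: 1 },
  { period: 2, price: 104, qty: 3 },
]);
console.log(density.get("1|100"), density.get("1|90"), density.get("2|100")); // 3 1 3
```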

Figure 5

Agent Action Timeline

per-tick agent decision

Bid Ask Hold Executed

One row per agent, one mark per decision. Column colour encodes the action type and a small accent dot below the mark records whether the submitted order was filled on the same tick. Contiguous green runs identify accumulators; contiguous red runs identify distributors; holds are the market's waiting population.

Note

$\alpha_{i,t}$: action taken by agent i at tick t

Figure 6

Subjective Valuation: True versus Reported

$|\hat{v}_i - V_i|$ · lie gap = private belief versus broadcast claim

Solid lines trace each Utility agent's private belief $V_i$ over time. Filled dots mark broadcast messages carrying a reported valuation $\hat{v}_i$; deceptive reports are ringed red and connected to the sender's true belief by a dotted segment, so the vertical distance between ring and line is the lie gap. The amber step line is the fundamental value for reference.

Note

$V_i$: agent i's private (true) subjective valuation at tick t
$\hat{v}_i$: valuation reported in a broadcast message
$|\hat{v}_i - V_i|$: lie gap for deceptive messages
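A reported valuation and its lie gap can be computed from the private belief and a distortion multiplier, following $\hat{v}_m = \max(0,\, V_m \cdot \phi_m)$ from the glossary. In this sketch the multiplier is passed in directly; the function names and the relative threshold used to flag a message as deceptive are hypothetical, since the page does not give the logger's cutoff.

```javascript
// Reported valuation: private belief scaled by the distortion multiplier, floored at zero.
function reportValuation(trueValuation, phi) {
  return Math.max(0, trueValuation * phi);
}

// Lie gap: absolute distance between the broadcast claim and the private belief.
function lieGap(trueValuation, reported) {
  return Math.abs(reported - trueValuation);
}

// Assumption: a message counts as deceptive when the gap exceeds 5% of the belief.
function isDeceptive(trueValuation, reported, relativeThreshold = 0.05) {
  return lieGap(trueValuation, reported) > relativeThreshold * trueValuation;
}

const belief = 100;
const inflated = reportValuation(belief, 1.25); // kappa-plus multiplier from the glossary
console.log(inflated, lieGap(belief, inflated), isDeceptive(belief, inflated)); // 125 25 true
```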

Figure 7

Normalized Agent Utility over Time

risk-adjusted wealth, normalized to initial endowment

Per-agent expected utility evaluated at the running wealth $w_t = c_i + q_i \cdot \mathrm{FV}_t$, divided by the agent's own initial utility so every trajectory starts at 1.0. Lines above the dashed baseline indicate positive risk-adjusted PnL; lines below indicate loss. The risk preference attached to each agent (convex, linear, concave) determines how aggressively a given wealth change is penalised or rewarded.

Note

$U$: risk-typed utility: $U_L(w) = (w/w_0)^2$, $U_N(w) = w/w_0$, $U_A(w) = \sqrt{w/w_0}$
$w_t$: mark-to-fundamental wealth at tick t

Figure 8

Asset Ownership over Time

shares held; total supply conserved

Stacked area of each agent's inventory across ticks. Because the double auction conserves shares, the total height is always the aggregate endowment $Q = \sum_i q_i$. Widening bands identify agents who are accumulating, shrinking bands identify distributors, and any dramatic redistribution in the last few periods is typically the experienced trader liquidating before the asset expires worthless.

Note

$q_i$: shares held by agent i at tick t
$Q$: total shares outstanding (conserved across time)

Figure 9

Broadcast Message Log

per-tick public broadcast to all other agents

Buy signal Sell signal Hold signal Deceptive

One dot per broadcast message, placed on the sender's row at the tick the message was sent. Dot colour encodes the signal (buy/sell/hold) and a red ring flags messages whose reported valuation diverges sufficiently from the sender's private belief to be classified as deceptive by the logger. Reading a column shows the instantaneous rumour mill; reading a row shows each agent's rhetorical stance over time.

Note

Dot: broadcast from agent i to the population at tick t

Figure 10

Pairwise Trust Matrix

exponential-moving-average update

Heatmap of receiver-to-sender trust values in [0, 1]. The diagonal is masked. Each off-diagonal cell records how well sender s's recent valuation claims aligned with the period's volume-weighted average price, as seen by receiver r. Warm rows identify agents who tend to trust broadly; warm columns identify agents whose claims the population finds credible.

Note

$\tau_{r \to s}$: trust held by receiver r in sender s
$\lambda$: trust learning rate (exponential-moving-average weight)
closeness: 1 − |claim − VWAP| / VWAP, clipped to [0, 1]
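The trust update is an exponential moving average of a closeness score against the period VWAP. The sketch below follows the formulas given on this page; the function names and the choice to leave trust unchanged when a period has no trades are assumptions, not documented behaviour of the TrustTracker.

```javascript
// Volume-weighted average price of a period's trades; null when nothing traded.
function vwap(trades) {
  let notional = 0;
  let volume = 0;
  for (const { price, qty } of trades) {
    notional += price * qty;
    volume += qty;
  }
  return volume > 0 ? notional / volume : null;
}

// closeness = max(0, 1 - |claim - VWAP| / VWAP); never exceeds 1 because |.| >= 0.
function closeness(claim, vwapValue) {
  return Math.max(0, 1 - Math.abs(claim - vwapValue) / vwapValue);
}

// tau <- (1 - lambda) * tau + lambda * closeness.
function updateTrust(trust, claim, vwapValue, lambda = 0.30) {
  if (vwapValue == null || !(vwapValue > 0)) return trust; // assumption: skip the update
  return (1 - lambda) * trust + lambda * closeness(claim, vwapValue);
}

const periodVwap = vwap([{ price: 100, qty: 2 }, { price: 110, qty: 2 }]); // 105
const nextTrust = updateTrust(0.5, 126, periodVwap); // closeness 0.8
console.log(periodVwap, nextTrust.toFixed(2)); // 105 0.59
```

Starting from the initial trust of 0.5, a claim 20% away from VWAP nudges trust up to about 0.59, while a claim more than 100% away scores zero closeness and pulls trust down by the factor $1 - \lambda$.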

Table 1

Market-Quality Statistics (Current Session)

Quantitative summary in the notation of Dufwenberg, Lindqvist & Moore (2005) and Lopez-Lira (2025). Haessel R² measures fit of the per-period mean price to fundamental value; the two normalized deviations capture total and average mispricing per share outstanding; amplitude is the peak-to-trough excursion of the mean-price residual normalized by the initial fundamental; turnover is the total shares traded divided by shares outstanding. The lower group reports allocative efficiency, aggregate welfare, and the deception statistics unique to the Utility population.
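The four price-based statistics in this table can be computed from a trade list using the formulas in the glossary. The following is a sketch under assumptions, not the simulator's metrics code: the name `marketQuality`, the trade record shape, and the decision to skip periods with no trades are illustrative choices.

```javascript
// trades: [{period, price, qty}]; fvOf(t): fundamental value at period t;
// sharesOutstanding: Q. Periods without trades are skipped (assumption).
function marketQuality(trades, fvOf, sharesOutstanding) {
  if (trades.length === 0) throw new RangeError("no trades to summarize");
  const pricesByPeriod = new Map();
  let weightedDeviation = 0;
  let volume = 0;
  for (const { period, price, qty } of trades) {
    if (!pricesByPeriod.has(period)) pricesByPeriod.set(period, []);
    pricesByPeriod.get(period).push(price);
    weightedDeviation += Math.abs(price - fvOf(period)) * qty;
    volume += qty;
  }
  const periods = [...pricesByPeriod.keys()].sort((a, b) => a - b);
  // Unweighted mean trade price per period.
  const meanPrice = periods.map((t) => {
    const ps = pricesByPeriod.get(t);
    return ps.reduce((a, b) => a + b, 0) / ps.length;
  });
  const residual = meanPrice.map((p, i) => p - fvOf(periods[i]));
  const grandMean = meanPrice.reduce((a, b) => a + b, 0) / meanPrice.length;
  const ssRes = residual.reduce((a, r) => a + r * r, 0);
  const ssTot = meanPrice.reduce((a, p) => a + (p - grandMean) ** 2, 0);
  return {
    haesselR2: ssTot > 0 ? 1 - ssRes / ssTot : NaN,
    normalizedDeviation: weightedDeviation / sharesOutstanding,
    amplitude: (Math.max(...residual) - Math.min(...residual)) / fvOf(1),
    turnover: volume / sharesOutstanding,
  };
}

const fv = (t) => 10 * (10 - t + 1);
const stats = marketQuality(
  [
    { period: 1, price: 110, qty: 2 },
    { period: 2, price: 90, qty: 1 },
    { period: 3, price: 70, qty: 1 },
  ],
  fv,
  24
);
console.log(stats);
```

With residuals of +10, 0 and −10 against fundamentals of 100, 90 and 80, the example gives a Haessel R² of 0.75, a normalized deviation of 1.25, an amplitude of 0.2, and a turnover of one sixth.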

Table 2

10-Session Batch Results

Per-round market-quality metrics across the 10-session DLM batch (5 × first treatment + 5 × second treatment). Each row is labelled Rr_Ss (Round r of Session s). dev = mean absolute deviation |P − FV| in ¢; turn = shares traded / shares outstanding; vol = total shares exchanged; payoff = aggregate agent cash at round end.
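The row labels and treatment ordering of the batch can be enumerated as below. This is a sketch of the schedule as described on this page (first five sessions on the selected treatment, last five on the other, replacement only in round 4); `batchSchedule` and the row shape are hypothetical names.

```javascript
// Number of subjects replaced in round 4 under each treatment.
const REPLACED_IN_ROUND_4 = { T2: 2, T4: 4 };

// Build one row per (session, round) with its Rr_Ss label.
function batchSchedule(firstTreatment = "T2", sessions = 10, rounds = 4) {
  if (!Object.prototype.hasOwnProperty.call(REPLACED_IN_ROUND_4, firstTreatment)) {
    throw new RangeError("treatment must be T2 or T4");
  }
  const other = firstTreatment === "T2" ? "T4" : "T2";
  const rows = [];
  for (let s = 1; s <= sessions; s++) {
    const treatment = s <= sessions / 2 ? firstTreatment : other;
    for (let r = 1; r <= rounds; r++) {
      rows.push({
        label: `R${r}_S${s}`,
        session: s,
        round: r,
        treatment,
        replaced: r === rounds ? REPLACED_IN_ROUND_4[treatment] : 0,
      });
    }
  }
  return rows;
}

const schedule = batchSchedule("T2");
console.log(schedule.length);                              // 40
console.log(schedule.find((row) => row.label === "R3_S7")); // treatment T4, replaced 0
```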

Replay & Trace Inspector

Live — tick 0

Term	Expansion	Meaning
Plan I	Algorithmic posterior	Deterministic baseline — $V_i^{\text{post}} = w\cdot V_i^{\text{prior}} + (1-w)\cdot\bar{m}$ with $w = 0.6 + 0.1\,\min(3, k_i)$.
Plan II	LLM posterior · utility forms	One chat completion per Utility agent per period. LLM returns a discrete action from {BUY_NOW, SELL_NOW, BID_1, BID_3, ASK_1, ASK_3, HOLD}. Prompt includes the closed-form $U_L / U_N / U_A$ expressions for the agent's risk type.
Plan III	LLM posterior · risk label only	Same wiring as Plan II but the prompt only names the risk-preference category; no functional form is supplied. Same seven-action output set.
DLM	Dufwenberg, Lindqvist & Moore (2005)	Source paper for the shared market substrate: $T$, $\mathbb{E}[d]$, $\mathrm{FV}_t$, and the four-round session loop.
U	Utility	EU-maximising agent — the sole agent class ($N = 6$). Per-period belief update is what Plans I, II, and III compare.
FV	Fundamental value	$\mathrm{FV}_t = \mathbb{E}[d] \cdot (T - t + 1)$ — risk-neutral value at the start of period $t$.
EU	Expected utility	$\mathrm{EU}(\alpha) = p_{\text{fill}} \cdot U(w_1) + (1-p_{\text{fill}})\cdot U(w_0)$ — the Utility agent's scoring functional over $\alpha \in \{\text{hold},\, \text{buy@}A_t,\, \text{sell@}B_t,\, \text{bid},\, \text{ask}\}$.
VWAP	Volume-weighted average price	Per-period average trade price weighted by quantity; baseline for the trust EMA update.
ND	Normalized deviation	Total absolute mispricing: $\mathrm{ND} = \sum_j |p_j - \mathrm{FV}_{t(j)}| \cdot q_j \,/\, Q$, where $j$ indexes trades, $q_j$ is trade quantity, and $Q$ is total shares outstanding.
R²	Haessel R²	Coefficient of determination of mean price against fundamental value.
TO	Turnover	Total shares traded divided by total shares outstanding — reports speculative intensity.
AE	Allocative efficiency	Realized aggregate valuation divided by the theoretical maximum: $\mathrm{AE} = \sum_i \hat{V}_i q_i \,/\, (\hat{V}_{\max} \cdot Q)$, where $\hat{V}_i = V_i^{\text{post}}$.
Session	10-session DLM batch	One click of Start runs 10 sessions (5 × first treatment + 5 × second treatment). Each session is a complete $R = 4$ round game; data is collected per round with labels $\texttt{R\{r\}\_S\{s\}}$.
Rr_Ss	Round–session label	Identifies Round $r$ of Session $s$ in the batch results table. Example: R3_S7 = round 3 of session 7.
T2 / T4	DLM treatment sizes	T2 (R4-⅔): 2 agents replaced in R4, 4 veterans remain. T4 (R4-⅓): 4 replaced, 2 veterans remain. First 5 sessions use the selected treatment, last 5 use the other.
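Allocative efficiency, as defined in the AE row, compares the valuation-weighted holdings with the value obtained if the highest-valuation agent held every share. A minimal sketch with the hypothetical name `allocativeEfficiency`; each agent record is assumed to carry its subjective valuation and current inventory.

```javascript
// AE = sum_i(V_i * q_i) / (V_max * Q), with Q = sum_i q_i.
function allocativeEfficiency(agents) {
  const totalShares = agents.reduce((a, x) => a + x.shares, 0);
  const maxValuation = Math.max(...agents.map((x) => x.valuation));
  if (!(totalShares > 0) || !(maxValuation > 0)) return NaN; // undefined in degenerate cases
  const realized = agents.reduce((a, x) => a + x.valuation * x.shares, 0);
  return realized / (maxValuation * totalShares);
}

const efficiency = allocativeEfficiency([
  { valuation: 100, shares: 6 },
  { valuation: 120, shares: 2 },
  { valuation: 80, shares: 4 },
]);
console.log(efficiency.toFixed(4)); // 0.8056
```

The value reaches 1.0 only when all shares sit with agents whose valuation equals the maximum.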

Symbol	Definition	Where it appears
$\mathrm{FV}_t$	Fundamental value at the start of period $t$. $\mathrm{FV}_t = \mathbb{E}[d]\cdot(T - t + 1)$, with $\mathbb{E}[d] = \tfrac{1}{2}(0) + \tfrac{1}{2}(20) = 10$¢ and $T = 10$. Yields a staircase from $\mathrm{FV}_1 = 100$¢ to $\mathrm{FV}_{10} = 10$¢, resetting at every round boundary.	Shared substrate — drives every agent's prior (Figures 1–4)
$\text{prior}_t$	Agent $i$'s pre-update valuation at period $t$. $\text{prior}_t = \mathrm{FV}_t \cdot (1 + b_i + \varepsilon)$, clamped to $\geq 0$. Identical across all three plans.	Prior Formation stage (Figures 1–2)
$b_i = \delta_i \cdot \beta$	Persistent per-agent valuation bias. $\delta_i \in \{-1, 0, +1\}$ is the bias direction drawn at birth (pessimistic, unbiased, optimistic) and $\beta = 0.15$ is the bias magnitude. Enters the prior as $\text{prior}_t = \mathrm{FV}_t \cdot (1 + b_i + \varepsilon)$.	Prior formation (Figure 2)
$\varepsilon \sim \mathcal{U}[-n, n]$	Per-tick i.i.d. valuation noise, $n = 0.03$. Drawn fresh each decision via the seeded PRNG.	Prior formation — jitter term in $\text{prior}_t$ (Figure 2)
$k_i$	Agent $i$'s experience counter. Starts at $0$; incremented by $1$ at every round boundary. Controls the Plan I blend weight: $w = 0.6 + 0.1\,\min(3, k_i)$, so $w \in \{0.6, 0.7, 0.8, 0.9\}$ for $k_i = 0, 1, 2, \geq 3$.	Plan I posterior weight; DLM experience channel (Figure 3)
$\hat{v}_m$	Claimed valuation reported by peer agent $m$. Computed as $\hat{v}_m = \max(0,\, V_m \cdot \phi_m)$ where $\phi_m$ is a distortion multiplier determined by $m$'s communication strategy $\sigma_m \in \{H, B, D\}$ (see Figure 3). The peer-message mean is $\bar{m} = \tfrac{1}{|M|}\sum_{m \in M} \hat{v}_m$ where $M$ is the set of non-self messages received this period.	Plan I posterior: blended with prior via weight $w$ (Figure 3)
$\sigma_m \in \{H, B, D\}$	Communication strategy of agent $m$: $H$ = truthful (small uniform jitter), $B$ = biased (fixed-sign tilt), $D$ = strategic (inventory-dependent over/understatement). Assigned at birth and persistent across rounds.	Distortion multiplier $\phi_m$ in $\hat{v}_m$ (Figure 3)
$\phi_m$	Communication distortion multiplier. $\phi_m = 1 + \mathcal{U}[-h, h]$ if $\sigma_m = H$; $\phi_m = 1 + \delta_m \gamma$ if $\sigma_m = B$; $\phi_m = \kappa^+$ or $\kappa^-$ if $\sigma_m = D$ (depending on $q_m$ vs $q_m^0$). Parameters: $h = 0.02$, $\gamma = 0.10$, $\kappa^+ = 1.25$, $\kappa^- = 0.75$.	$\hat{v}_m = \max(0,\, V_m \cdot \phi_m)$ (Figure 3)
$V_i^{\text{post}}$	Agent $i$'s period-end valuation — the output of the active plan. Plan I (algorithmic): $V_i^{\text{post}} = w \cdot V_i^{\text{prior}} + (1 - w) \cdot \bar{m}$, or $V_i^{\text{prior}}$ if no messages. Plans II/III: set directly by the LLM's chosen action.	Becomes next period's prior in all three plans (Figure 3)
$U_L, U_N, U_A$	Risk-typed utility families. $U_L(w) = (w/w_0)^2$ (risk-loving, strictly convex); $U_N(w) = w/w_0$ (risk-neutral, linear); $U_A(w) = \sqrt{w/w_0}$ (risk-averse, strictly concave). All normalized by initial wealth $w_0$.	EU scoring; formulas appear explicitly in Plan II prompts (Figures 2–3)
$w_0, w_1$	Wealth states for EU evaluation. $w_0 = c_i + q_i \cdot \hat{V}_i$ (wealth if no trade); $w_1 = (c_i \pm p_{\text{order}}) + (q_i \pm 1) \cdot \hat{V}_i$ (wealth if the order fills at price $p_{\text{order}}$), where $c_i$ is cash, $q_i$ is inventory, and $\hat{V}_i \equiv V_i^{\text{post}}$ is the agent's subjective valuation.	EU scoring — $\mathrm{EU}(\alpha) = p_{\text{fill}} \cdot U(w_1) + (1 - p_{\text{fill}}) \cdot U(w_0)$ (Figure 2)
$p_{\text{fill}} = 0.30$	Assumed fill probability for a non-crossing (passive) quote. Used in the EU functional: $\mathrm{EU}(\alpha) = p_{\text{fill}} \cdot U(w_1) + (1 - p_{\text{fill}}) \cdot U(w_0)$. For crossing actions (buy@$A_t$, sell@$B_t$), $p_{\text{fill}} = 1$ (deterministic); for passive actions (bid, ask), $p_{\text{fill}} = 0.30$ (tunable).	EU scoring — $\alpha^\star_{i,t}$ action evaluation (Figure 2)
$\alpha^\star_{i,t}$	Optimal action for agent $i$ at tick $t$. $\alpha^\star_{i,t} = \arg\max_\alpha \mathrm{EU}(\alpha)$ over the five-element set $\alpha \in \{\text{hold},\, \text{buy@}A_t,\, \text{sell@}B_t,\, \text{bid},\, \text{ask}\}$, where $A_t$ is the current best ask and $B_t$ is the current best bid. buy@$A_t$ crosses the book by lifting the resting ask (deterministic fill, $p_{\text{fill}} = 1$); sell@$B_t$ crosses the book by hitting the resting bid (deterministic fill, $p_{\text{fill}} = 1$); bid and ask post passive quotes ($p_{\text{fill}} = 0.30$). Plans II/III use a seven-element LLM action set: $\{\text{BUY\_NOW, SELL\_NOW, BID\_1, BID\_3, ASK\_1, ASK\_3, HOLD}\}$.	Action selection: output of EU maximization (Figures 2–3)
$\tau_{r \to s}$	Trust of receiver $r$ in sender $s$. Updated by exponential moving average: $\tau_{r \to s} \leftarrow (1 - \lambda)\,\tau_{r \to s} + \lambda \cdot \text{closeness}_{r,s}$, where $\lambda = 0.30$ is the EMA learning rate and $\text{closeness} = \max\!\bigl(0,\, 1 - |\hat{v}_s - \text{VWAP}_t|\,/\,\text{VWAP}_t\bigr)$. Initialized at $0.5$; self-trust fixed at $1.0$.	Messaging diagnostic; context for Plan II/III prompts (Figure 3)
$\pi_i^{\text{II}}, \pi_i^{\text{III}}$	Structured LLM prompts for Plans II and III. $\pi^{\text{II}}$ includes market rules, agent state, and the explicit utility formula $U_L/U_N/U_A$ for the agent's risk type. $\pi^{\text{III}}$ omits the formula and supplies only the risk-preference label.	LLM posterior — input to $\alpha^\star_{i,t} \leftarrow \text{LLM}(\pi_i)$ (Figure 3)
$Q$	Total shares outstanding, $Q = \sum_i q_i$, conserved under double-auction trades (shares transfer, never created or destroyed).	Normalized deviation $\mathrm{ND}$, turnover $\mathrm{TO}$ (Figure 4)
$\bar{p}_t$	Mean trade price in global period $t$. $\bar{p}_t = \sum_{j \in \mathcal{T}_t} p_j \,/\, |\mathcal{T}_t|$ where $\mathcal{T}_t$ is the set of trades in period $t$. Used as the basis for Haessel $R^2$ and amplitude.	Market-quality diagnostics (Figure 4, Table 1)
$R^2_{\text{Haessel}}$	Haessel (1978) coefficient of determination. $R^2 = 1 - \sum_t (\bar{p}_t - \mathrm{FV}_t)^2 \,/\, \sum_t (\bar{p}_t - \overline{\bar{p}})^2$. Measures how closely per-period mean prices fit the fundamental staircase; can be negative if mispricing exceeds sample variance.	Market-quality diagnostics (Figure 4, Table 1)
$\mathrm{ND}$	Normalized absolute price deviation. $\mathrm{ND} = \sum_j |p_j - \mathrm{FV}_{t(j)}| \cdot q_j \,/\, Q$, summing over all trades $j$ weighted by quantity, divided by total shares outstanding.	Market-quality diagnostics (Figure 4, Table 1)
$A$	Price amplitude. $A = \bigl(\max_t (\bar{p}_t - \mathrm{FV}_t) - \min_t (\bar{p}_t - \mathrm{FV}_t)\bigr) \,/\, \mathrm{FV}_1$. Peak-to-trough excursion of the mean-price residual, normalized by the initial fundamental.	Market-quality diagnostics (Figure 4, Table 1)
$\mathrm{TO}$	Turnover. $\mathrm{TO} = \sum_j q_j \,/\, Q$ — total shares traded (summing quantity $q_j$ over all trades $j$) divided by total shares outstanding $Q$. A value of $1.0$ means every share changed hands once.	Market-quality diagnostics (Figure 4, Table 1)
$\rho_t$	Price-to-fundamental ratio. $\rho_t = p_t \,/\, \mathrm{FV}_t$ (Lopez-Lira 2025), where $p_t$ is the most recent trade price at tick $t$. Values $> 1$ indicate overpricing; persistent $\rho_t \gg 1$ signals a bubble.	Market-quality diagnostics (Table 1)
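The Plan I update defined by the $k_i$ and $V_i^{\text{post}}$ rows is a convex blend whose weight on the prior grows with experience. A short sketch of that rule follows; `planIWeight` and `planIPosterior` are hypothetical names, and the messages are assumed to be plain numeric claims from other agents.

```javascript
// w = 0.6 + 0.1 * min(3, k): 0.6, 0.7, 0.8, 0.9 for k = 0, 1, 2, >= 3.
function planIWeight(experience) {
  return 0.6 + 0.1 * Math.min(3, experience);
}

// V_post = w * V_prior + (1 - w) * mean(messages); the prior is kept when no messages arrive.
function planIPosterior(prior, messages, experience) {
  if (messages.length === 0) return prior;
  const meanClaim = messages.reduce((a, b) => a + b, 0) / messages.length;
  const w = planIWeight(experience);
  return w * prior + (1 - w) * meanClaim;
}

const claims = [120, 80, 130]; // mean 110
console.log(planIPosterior(100, claims, 1).toFixed(2)); // 103.00
console.log(planIPosterior(100, claims, 5).toFixed(2)); // 101.00
console.log(planIPosterior(100, [], 0));                // 100
```

The same peer messages move a one-round agent by 3¢ but a veteran by only 1¢, which is the experience channel the three plans are compared on.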

Tag	Citation	Role in this simulator
DLM 2005	Dufwenberg, Lindqvist & Moore, Bubbles and Experience: An Experiment, AER 95(5), 1731–1737	Market substrate — asset life, dividend shape, $\mathrm{FV}_t$, session loop
LL 2025	Lopez-Lira, AI-Agent Expected-Utility Market Makers (working paper)	Utility agent, EU scoring, risk functionals, trust EMA
SSW 1988	Smith, Suchanek & Williams, Bubbles, Crashes and Endogenous Expectations in Experimental Spot Asset Markets, Econometrica 56(5)	Canonical experimental-bubble design; the asset-life and dividend structure that DLM 2005 inherits

Replay & Trace Inspector

Decisions recorded at this tick

System Design

Fundamental Value

Prior Elicitation

Expected-Utility Scoring

Plan I — Algorithmic Posterior

Pairwise Trust Dynamics

Plan II — LLM Posterior with Utility Forms

Plan III — LLM Posterior with Risk Label

Mispricing Measures

Volume and Efficiency Measures

System Prompt · $\pi^{\mathrm{II}}_{\text{sys}}$

User Prompt · $\pi^{\mathrm{II}}_{\text{usr}}(\text{agent}_2)$

How experience changes the prompt

Glossary & Reference

Abbreviations & indices

Mathematical notation

Figures

Transaction Price Trajectory vs Fundamental Value

Mispricing Magnitude

Trade Volume per Period

Transaction Density Heatmap

Agent Action Timeline

Subjective Valuation · Per agent

Pairwise Trust Matrix

Market-Quality Statistics (Table 1)

10-Session Batch Results (Table 2)

Source papers

AI-Agent Prior Elicitation in Experimental Asset Markets

Motivation

Literature & positioning

Experimental asset markets

LLMs as economic agents

Gap

Research questions

Key idea · three-plan factorial

Plan I · Algorithm

Plan II · LLM + Forms

Plan III · LLM + Label

Market substrate · DLM (2005)

Round boundary protocol

Session payoff & batch structure

Agent design · $N = 6$ Utility agents

U · Utility agent (sole agent class)

Risk composition

Strategy cube

Endogenous experience

Expected-utility framework

$U_L$ · Risk-loving

$U_N$ · Risk-neutral

$U_A$ · Risk-averse

Shared prior & endogenous experience

Plan I · algorithmic belief update

Novice · $k=0$

Intermediate · $k \in \{1,2\}$

Veteran · $k \geq 3$

Social learning · trust EMA & strategic deception

Plan II · LLM update with explicit utility forms

Prompt context $\pi_i^{\text{II}}$

Execution semantics

Plan III · LLM update with risk label only

Identification argument

Experimental design · treatments & parameters

Factorial structure