With Applications to Portfolio Optimization
You manage a portfolio of \(N\) assets.
Let \(\mathbf{r}_t = (r_{1,t}, \dots, r_{N,t})\) denote the vector of asset returns at time \(t\).
Using \(T\) periods of historical data, we observe: \[ \mathbf{Y} = \{\mathbf{r}_t\}_{t=1}^T \in \mathbb{R}^{T \times N} \]
From these data we infer the inputs used for portfolio construction: \[ \boldsymbol{\mu} \;=\; \mathbb{E}[\mathbf{r}_t], \qquad \boldsymbol{\Sigma} \;=\; \mathrm{Cov}(\mathbf{r}_t). \]
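As a minimal sketch (with synthetic returns standing in for real data \(\mathbf{Y}\)), the sample versions of these inputs can be computed as:

```python
import numpy as np

rng = np.random.default_rng(0)
T, N = 250, 4                              # T periods, N assets (illustrative)
Y = rng.normal(0.001, 0.02, size=(T, N))   # synthetic return matrix Y

mu_hat = Y.mean(axis=0)                    # sample estimate of E[r_t]
Sigma_hat = np.cov(Y, rowvar=False)        # sample estimate of Cov(r_t)
```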
Core question: How uncertain are these inferred inputs?
Standard portfolio optimization solves:
\[ \max_{\mathbf{w}} \quad \mathbf{w}^\top \widehat{\boldsymbol{\mu}} - \frac{\gamma}{2}\, \mathbf{w}^\top \widehat{\boldsymbol{\Sigma}}\,\mathbf{w} \]
(Equation from [1])
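For the unconstrained problem, the maximizer has the closed form \(\mathbf{w}^* = \frac{1}{\gamma}\,\widehat{\boldsymbol{\Sigma}}^{-1}\widehat{\boldsymbol{\mu}}\). A sketch with made-up point estimates:

```python
import numpy as np

# Hypothetical point estimates for N = 3 assets (illustrative values only).
mu_hat = np.array([0.05, 0.07, 0.04])
Sigma_hat = np.array([[0.10, 0.02, 0.01],
                      [0.02, 0.12, 0.03],
                      [0.01, 0.03, 0.08]])
gamma = 5.0  # risk-aversion coefficient

# Unconstrained maximizer of w' mu - (gamma/2) w' Sigma w.
w_star = np.linalg.solve(Sigma_hat, mu_hat) / gamma
```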
Issue: parameter uncertainty is ignored; the point estimates \(\widehat{\boldsymbol{\mu}}\) and \(\widehat{\boldsymbol{\Sigma}}\) are treated as the true values.
\[ \mathbf{r}_t \sim p(\mathbf{r}_t \mid \boldsymbol{\mu}, \boldsymbol{\Sigma}), \qquad (\boldsymbol{\mu}, \boldsymbol{\Sigma}) \sim p(\boldsymbol{\mu}, \boldsymbol{\Sigma}) \]
\[ \boldsymbol{\Sigma} = \mathbf{D}\mathbf{C}\mathbf{D}, \qquad \mathbf{D} = \mathrm{diag}(\sigma_1, \dots, \sigma_N) \]
Inference state: \(x = (\boldsymbol{\mu}, \mathbf{D}, \mathbf{C})\)
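The decomposition \(\boldsymbol{\Sigma} = \mathbf{D}\mathbf{C}\mathbf{D}\) reconstructs the covariance from volatilities and correlations; a quick numerical check with illustrative values:

```python
import numpy as np

# Illustrative volatilities and correlation matrix for N = 3 assets.
sigma = np.array([0.10, 0.20, 0.15])
C = np.array([[1.0, 0.3, 0.1],
              [0.3, 1.0, 0.2],
              [0.1, 0.2, 1.0]])

D = np.diag(sigma)
Sigma = D @ C @ D   # covariance: Sigma[i, j] = sigma_i * C[i, j] * sigma_j
```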
We seek the posterior distribution:
\[ \mathbb{P}(\boldsymbol{\mu}, \mathbf{D}, \mathbf{C} \mid \mathbf{Y}) \]
By Bayes’ theorem:
\[ \mathbb{P}(\boldsymbol{\mu}, \mathbf{D}, \mathbf{C} \mid \mathbf{Y}) = \frac{\mathbb{P}(\mathbf{Y} \mid \boldsymbol{\mu}, \mathbf{D}, \mathbf{C}) \mathbb{P}(\boldsymbol{\mu}, \mathbf{D}, \mathbf{C})}{\mathbb{P}(\mathbf{Y})} \]
The marginal likelihood requires a high-dimensional integral:
\[ \mathbb{P}(\mathbf{Y}) = \int \mathbb{P}(\mathbf{Y} \mid \boldsymbol{\mu}, \mathbf{D}, \mathbf{C}) \mathbb{P}(\boldsymbol{\mu}, \mathbf{D}, \mathbf{C}) \ d\boldsymbol{\mu} \ d\mathbf{D} \ d\mathbf{C} \]
No closed-form solution
Dimension grows quickly with \(N\)
MCMC is a class of algorithms that generate samples from a posterior distribution.
Define the unnormalized target density:
\[ \gamma(x) \equiv \mathbb{P}(\mathbf{Y} \mid x) \mathbb{P}(x), \quad x = (\boldsymbol{\mu}, \mathbf{D}, \mathbf{C}) \]
The normalizing constant is:
\[ Z = \mathbb{P}(\mathbf{Y}) = \int \gamma(x) \ dx \]
Although \(Z\) is unknown, we can compute relative posterior densities:
\[ \frac{\mathbb{P}(x' \mid \mathbf{Y})}{\mathbb{P}(x \mid \mathbf{Y})} = \frac{\gamma(x')}{\gamma(x)} \]
This ratio tells us how much more (or less) posterior density the model assigns to \(x'\) than to \(x\).
(See [3])
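A minimal Metropolis–Hastings sketch exploits exactly this: the acceptance step only needs the ratio \(\gamma(x')/\gamma(x)\), so \(Z\) is never computed. (Toy one-dimensional standard-normal target here, not the portfolio posterior.)

```python
import numpy as np

# Toy unnormalized target: gamma(x) ∝ exp(-x^2 / 2) (standard normal).
# Only log gamma is needed; the normalizing constant Z cancels in the ratio.
def log_gamma(x):
    return -0.5 * x * x

rng = np.random.default_rng(1)
x, samples = 0.0, []
for _ in range(5000):
    x_prop = x + rng.normal(0.0, 1.0)  # symmetric random-walk proposal
    # Accept with probability min(1, gamma(x') / gamma(x)).
    if np.log(rng.uniform()) < log_gamma(x_prop) - log_gamma(x):
        x = x_prop
    samples.append(x)
```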
MCMC algorithms generate a Markov chain with states: \[ x^{(1)}, x^{(2)}, \dots, x^{(K)} \]
which (once the chain has converged) are treated as samples from the target distribution \[ \mathbb{P}(x \mid \mathbf{Y}) \propto \gamma(x). \]
All MCMC algorithms construct a Markov chain whose stationary distribution is the target posterior.
Examples: Metropolis–Hastings, Gibbs, Hamiltonian Monte Carlo
High-dimensional parameter space: even a small portfolio with \(N=5\) assets has \(5\) mean parameters, \(5\) volatilities, and \(\binom{5}{2} = 10\) correlations, i.e. \(20\) parameters in total.
Complex posterior geometry
Result: Single chains make local moves, can get stuck, and yield unreliable uncertainty estimates. (See [7]; [3])
In practice, we run multiple chains in parallel:
\[ \{\, x_1^{(1:K)}, \dots, x_M^{(1:K)} \,\} \]
Parallelism provides more posterior samples per unit of wall-clock time and enables between-chain convergence diagnostics.
However, chains remain independent.
Parallel chains help, but independence limits exploration of complex posteriors.
Interacting MCMC allows chains to share information during sampling.
Examples include parallel tempering and ensemble-based samplers.
Benefits: improved exploration of multimodal posteriors and faster mixing.
Interacting MCMC requires coordination during execution.
However, this coordination can introduce nondeterminism: results may depend on thread scheduling and execution order.
Portfolio and risk applications require reproducibility: two runs with the same data and seed should not produce different inferences.
Question: How can we run interacting, parallel MCMC algorithms such that results are exactly reproducible across runs, machines, and execution orders?
Stateful RNGs depend on the order of calls: in parallel MCMC, execution order varies across runs, leading to different random draws and different results.
A stateful RNG produces a deterministic sequence given a seed.
Example sequence:
\[ \text{seed} \to s_0 \to (u_0, s_1) \to (u_1, s_2) \to \dots \]
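A toy illustration of the hazard (hypothetical two-chain interleaving): when two chains share one stateful RNG, the value each chain receives depends on which chain draws first.

```python
import random

# Two "chains" A and B share one stateful RNG. The draw a chain receives
# depends on the interleaving order, not just the seed.
def interleave(order, seed=42):
    rng = random.Random(seed)
    draws = {}
    for chain in order:
        draws[chain] = rng.random()
    return draws

d1 = interleave(["A", "B"])
d2 = interleave(["B", "A"])
# Same seed, different execution order: chain A receives a different value.
```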
Generate randomness as a pure function of algorithmic context:
\[ u = f_{\text{RNG}}(\text{seed}, \text{chain ID}, \text{iteration}, \text{variable index}) \]
Random numbers depend only on their role in the algorithm, not on execution history.
Result: Bitwise-identical results across machines, thread counts, and execution orders.
Goal: Generate random numbers as a deterministic, one-to-one function of structured inputs.
Each draw is keyed by:
\[ (\text{seed}, \text{chain ID}, \text{iteration}, \text{variable index}) \]
Each random draw is uniquely determined by its algorithmic role. (See [13])
Example
| Chain ID | Iteration | Variable Index | Key |
|---|---|---|---|
| 1 | 10 | 5 | (seed, 1, 10, 5) |
| 2 | 10 | 5 | (seed, 2, 10, 5) |
| 1 | 11 | 6 | (seed, 1, 11, 6) |
| 2 | 11 | 6 | (seed, 2, 11, 6) |
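One way to realize such keyed draws is with a counter-based generator. The sketch below uses NumPy's Philox bit generator; the particular packing of the key tuple into Philox's key and counter words is an illustrative choice, not a prescribed scheme.

```python
import numpy as np

def keyed_uniform(seed, chain_id, iteration, var_index):
    """Uniform draw determined solely by (seed, chain, iteration, index)."""
    # Philox is counter-based: its output depends only on the key and
    # counter words, never on any call history.
    bitgen = np.random.Philox(key=[seed, chain_id],
                              counter=[iteration, var_index, 0, 0])
    return np.random.Generator(bitgen).uniform()
```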
Integer Mixing Permutations
\[ u = f_{\text{RNG}}(h) = \text{mix}(h) \]
Examples: SplitMix (See [14])
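For reference, a Python transcription of the SplitMix64 mixing step (constants from the public-domain reference implementation, with the golden-ratio increment folded in); each stage is invertible modulo \(2^{64}\), so the whole map is a bijection on 64-bit integers:

```python
MASK = (1 << 64) - 1  # work modulo 2^64

def splitmix64(z):
    """SplitMix64 mixing: a bijective permutation of 64-bit integers."""
    z = (z + 0x9E3779B97F4A7C15) & MASK
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4BEB9) & MASK
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK
    return (z ^ (z >> 31)) & MASK
```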
Key: for chain \(m\), iteration \(t\), index \(j\) \[ k_{m,t,j} = (\text{seed}, m, t, j) \]
Encode to 64-bit integer: \(z_{m,t,j} = \text{encode}(k_{m,t,j})\)
Stateless RNG (integer mixing): \[ u_{m,t,j} = f_{\text{RNG}}(z_{m,t,j}) = \text{mix}_{64}(z_{m,t,j}), \qquad u_t^{(m)} = (u_{m,t,1}, \dots, u_{m,t,N_t^{(m)}}) \]
MCMC update: \[ x_{t+1}^{(m)} = \Phi_m\!\big(x_t^{(m)},\, X_{-m}^{(t)},\, u_t^{(m)}\big) \]
where \(\Phi_m\) is the deterministic update map for chain \(m\), \(X_{-m}^{(t)}\) denotes the iteration-\(t\) states of the other chains, and \(u_t^{(m)}\) is the vector of stateless random draws for chain \(m\) at iteration \(t\).
(See [13] for details.)
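Putting the pieces together, a toy demonstration (hypothetical update rule, not a real sampler): because every draw is keyed by its algorithmic role and communication reads a fixed snapshot \(X^{(t)}\), permuting the chain update order leaves every state bitwise unchanged.

```python
MASK = (1 << 64) - 1

def mix64(z):
    # SplitMix64-style mixing (stateless): key in, pseudorandom word out.
    z = (z + 0x9E3779B97F4A7C15) & MASK
    z = ((z ^ (z >> 30)) * 0xBF58476D1CE4BEB9) & MASK
    z = ((z ^ (z >> 27)) * 0x94D049BB133111EB) & MASK
    return z ^ (z >> 31)

def u(seed, m, t, j):
    # Keyed uniform; nested mixing is an illustrative encoding of (seed, m, t, j).
    return mix64(mix64(mix64(mix64(seed) ^ m) ^ t) ^ j) / 2**64

def run(order, seed=7, K=50):
    x = {m: 0.0 for m in [1, 2, 3]}          # three toy chains
    for t in range(K):
        snapshot = dict(x)                   # X^(t): read-only, fixed per sweep
        for m in order:                      # update order varies; results must not
            other = sum(snapshot[k] for k in sorted(snapshot) if k != m) / 2
            x[m] = 0.5 * x[m] + 0.1 * other + (u(seed, m, t, 0) - 0.5)
    return x

a = run([1, 2, 3])
b = run([3, 2, 1])
# a == b: bitwise-identical states regardless of update order.
```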
1. Explicit randomness required (no hidden RNG)
2. Deterministic communication only (no randomness in control flow)