On the stochastic differential equation implied by f(σ,Wₜ)

To elucidate the causal relationships between economic processes, it is vital to describe how their patterns evolve over time. For example, recent developments in interest rates highlight the potential correlations among different rates. The onset of global trade tensions, initiated by former President Trump’s policies, has prompted notable adjustments, including rate cuts decided by institutions such as the European Central Bank (ECB) (see https://www.lesechos.fr/finance-marches/marches-financiers/la-bce-choisit-de-baisser-ses-taux-face-a-lincertitude-economique-2160609). Additionally, the Federal Reserve (FED) is facing pressure to lower its rates in response to these external influences (https://www.marketwatch.com/story/trump-is-furious-that-fed-wont-cut-interest-rates-like-ecb-heres-why-powell-wont-budge-162dfdaa).


A straightforward approach to modeling the evolution of interest rates is through stochastic processes, such as the Ornstein-Uhlenbeck process. Although potential negative rates present a challenge, our focus will be on further exploring multivariate scenarios. Should it be imperative to avoid negative rates, the Cox-Ingersoll-Ross (CIR) model presents a viable alternative. For a thorough examination of interest rate modeling, refer to the comprehensive work of Damiano Brigo and Fabio Mercurio.
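
As a quick numerical illustration (a minimal Python sketch, not part of the derivation; the values of \alpha , \sigma , the initial rate and the horizon are arbitrary placeholders), a scalar Ornstein-Uhlenbeck rate can be simulated with an Euler-Maruyama scheme:

    import numpy as np

    # Minimal Euler-Maruyama sketch of a scalar Ornstein-Uhlenbeck rate,
    # dX_t = -alpha * X_t dt + sigma dW_t  (illustrative parameters only).
    rng = np.random.default_rng(0)

    alpha, sigma = 0.8, 0.02    # mean-reversion speed and volatility
    x0, T, n = 0.05, 5.0, 1000  # initial rate, horizon, number of steps
    dt = T / n

    x = np.empty(n + 1)
    x[0] = x0
    for k in range(n):
        dW = rng.normal(0.0, np.sqrt(dt))               # Brownian increment
        x[k + 1] = x[k] - alpha * x[k] * dt + sigma * dW

    print("terminal rate:", x[-1])
    print("stationary std (theory):", sigma / np.sqrt(2 * alpha))
    print("sample std over path tail:", x[n // 2:].std())

The stationary standard deviation \sigma/\sqrt{2\alpha}  gives a quick sanity check on the simulated path.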

In the following, we are interested in the stochastic differential equation of the form

{\displaystyle \begin{aligned}        {\rm d}X_t = -\alpha\,X_t\,{\rm d}t + \dots     \end{aligned} }

where the second term shall be generalized. But what generalization?
We thus introduce the vector stochastic process X_t  of dimension n  (n  interest rates), where \alpha  is an n\times n  matrix describing the trend of the vector process.
In other posts on this blog, the following equation was proposed.

{\displaystyle \begin{aligned}        {\rm d}X_t = -\alpha\,X_t\,{\rm d}t + f(\sigma W_t),     \end{aligned} }     (1)

where f  is a function, assumed to be at least continuous on \mathbb{R}  (or, more generally, Borel-measurable). In addition, \sigma  is assumed to be some positive number. Finally, W_t  is a vector of standard Wiener processes. The function f  introduces non-linearities and further dependencies.
First, we note that this equation gives

{\displaystyle \begin{aligned}        f(\sigma W_t) = {\rm d}X_t + \alpha\,X_t\,{\rm d}t.     \end{aligned} }

This means that f  is a sum of differential forms (i.e. "{\rm d}\dots " terms) that define how it is to be integrated. Since {\rm d}X_t  and {\rm d}t  are the only forms we consider in this equation, we heuristically write:

{\displaystyle \begin{aligned}        f(\sigma W_t) = A_1(X_t,t)\,{\rm d}t + A_2(X_t,t)\,{\rm d}X_t + A_3(X_t,t)\,{\rm d}X_t\,{\rm d}X_t + A_4(X_t,t)\,{\rm d}t\,{\rm d}X_t,     \end{aligned} }

where the A_i 's are Borel-measurable functions of X_t  and t , and we neglect terms of the form ({\rm d}X_t)^k  with k>2  and ({\rm d}t)^k  with k>1 . Considering now the fact that f  depends only on W_t , the cross term of the form {\rm d}t\,{\rm d}X_t  can be set to zero. Thus we have:

{\displaystyle \begin{aligned} f(\sigma W_t) = A_1(X_t,t)\,{\rm d}t + A_2(X_t,t)\,{\rm d}X_t + A_3(X_t,t)\,{\rm d}X_t\,{\rm d}X_t.     \end{aligned} }

Using again the fact that f  only depends on W_t , together with Itô's rule ({\rm d}W_t)^2 = {\rm d}t , we should have

{\displaystyle \begin{aligned} f(\sigma W_t) = \tilde{A}_1(W_t,t)\,{\rm d}t + \tilde{A}_2(W_t,t)\,{\rm d}W_t + \tilde{A}_3(W_t,t)\,{\rm d}W_t\,{\rm d}W_t = \left(\tilde{A}_1(W_t,t)+\tilde{A}_3(W_t,t)\right)\,{\rm d}t + \tilde{A}_2(W_t,t)\,{\rm d}W_t,     \end{aligned} }

where the \tilde{A}_i 's are other Borel-measurable functions, depending only on W_t  (and t ). Substituting back into Eq. (1), we then have:

{\displaystyle \begin{aligned} {\rm d}X_t = (-\alpha\,X_t+F(W_t,t))\,{\rm d}t + G(W_t,t)\,{\rm d}W_t.     \end{aligned} }

This (vector) equation turns out to be the most general possible stochastic differential equation related to the function f  introduced in Eq. (1). Note here that F(W_t,t)  is a vector of dimension n  and G(W_t,t)  is an n\times n  diffusion matrix, whose product G\,G^{\rm T}  plays the role of the instantaneous covariance of the noise driving the vector X_t . In fact, X_t  defined by this equation is an Itô process.
If the processes only have dependencies in their stochastic terms, we set F  to be a vector depending only on time t , i.e. F(W_t,t)\equiv F(t) , so that the final equation of interest is given by:

{\displaystyle \begin{aligned} {\rm d}X_t = (-\alpha\,X_t+F(t))\,{\rm d}t + G(W_t,t)\,{\rm d}W_t.     \end{aligned} }
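
A minimal Euler-Maruyama sketch of this equation is given below; the particular choices of \alpha , F  and G  (and the dimension n=2 ) are illustrative placeholders, not values taken from the post.

    import numpy as np

    # Euler-Maruyama sketch of dX_t = (-alpha X_t + F(t)) dt + G(W_t, t) dW_t
    # for two rates; alpha, F and G below are illustrative placeholders.
    rng = np.random.default_rng(1)

    n_dim, T, n_steps = 2, 2.0, 2000
    dt = T / n_steps

    alpha = np.array([[1.0, 0.3],
                      [0.0, 0.7]])                      # trend matrix
    F = lambda t: np.array([0.02, 0.01]) * np.cos(t)    # deterministic drift
    G = lambda w, t: 0.01 * np.diag(1.0 + np.tanh(w))   # noise-dependent diffusion

    X = np.zeros(n_dim)        # X_0 = 0
    W = np.zeros(n_dim)        # driving Wiener vector
    for k in range(n_steps):
        t = k * dt
        dW = rng.normal(0.0, np.sqrt(dt), size=n_dim)
        X = X + (-alpha @ X + F(t)) * dt + G(W, t) @ dW
        W = W + dW

    print("X_T =", X)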

We integrate this equation by setting:

{\displaystyle \begin{aligned} Y_t = {\rm exp}\,\left(\alpha t\right)\, X_t.     \end{aligned} }

Itô's lemma gives:

{\displaystyle \begin{aligned} {\rm d}Y_t = \alpha\, {\rm exp}\,\left(\alpha t\right)\, X_t\,{\rm d}t + {\rm exp}\,\left(\alpha t\right)\, \,{\rm d}X_t = {\rm exp}\,\left(\alpha t\right)\,F(t)\,{\rm d}t + {\rm exp}\,\left(\alpha t\right)\,G(W_t,t)\,{\rm d}W_t.     \end{aligned} }

Therefore, integration of this process finally leads to:

{\displaystyle \begin{aligned} X_t = {\rm exp}(-\alpha\,t)\,X_0 + \int_0^t {\rm exp}(-\alpha\,(t-s))\,F(s)\,{\rm d}s + \int_0^t {\rm exp}(-\alpha\,(t-s))\,G(W_s,s)\,{\rm d}W_s.     \end{aligned} }
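
As a sanity check on the deterministic part of this closed form, the mean {\rm exp}(-\alpha t)\,X_0 + \int_0^t {\rm exp}(-\alpha(t-s))\,F(s)\,{\rm d}s  can be evaluated with matrix exponentials and a trapezoidal rule; the sketch below uses the same placeholder \alpha  and F  as above, together with an arbitrary X_0 .

    import numpy as np
    from scipy.linalg import expm

    # Deterministic part of the closed-form solution,
    #   exp(-alpha t) X_0 + int_0^t exp(-alpha (t - s)) F(s) ds,
    # via matrix exponentials and trapezoidal quadrature (placeholder inputs).
    alpha = np.array([[1.0, 0.3],
                      [0.0, 0.7]])
    F = lambda s: np.array([0.02, 0.01]) * np.cos(s)
    X0 = np.array([0.05, 0.03])
    t, n = 2.0, 400

    s_grid = np.linspace(0.0, t, n + 1)
    weights = np.full(n + 1, t / n)
    weights[0] *= 0.5
    weights[-1] *= 0.5                                   # trapezoid weights
    integral = sum(w * (expm(-alpha * (t - s)) @ F(s))
                   for w, s in zip(weights, s_grid))
    mean_Xt = expm(-alpha * t) @ X0 + integral

    print("E[X_t] =", mean_Xt)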

Now, we note that the only random term is the third one, which has zero expected value. In particular, when G  does not depend on W_s , i.e. G(W_s,s)\equiv G(s) , the Itô isometry gives

{\displaystyle \begin{aligned} X_t \sim \mathcal{N}\left({\rm exp}(-\alpha\,t)\,X_0 + \int_0^t {\rm exp}(-\alpha\,(t-s))\,F(s)\,{\rm d}s,\,\, \int_0^t {\rm exp}(-\alpha\,(t-s))\,G(s)\,G(s)^{\rm T}\,{\rm exp}(-\alpha^{\rm T}\,(t-s))\,{\rm d}s\right).     \end{aligned} }

In words, X_t  follows a normal vector process with covariance \displaystyle \int_0^t {\rm exp}(-\alpha\,(t-s))\,G(s)\,G(s)^{\rm T}\,{\rm exp}(-\alpha^{\rm T}\,(t-s))\,{\rm d}s ; when G  does depend on W_s , the mean is unchanged but the process is in general no longer Gaussian. It will be interesting to see under which circumstances the matrix \alpha  and the vector F  lead to a non-explosive process.
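
The covariance formula can be checked numerically for a constant matrix G ; the sketch below (placeholder \alpha , G  and t ) evaluates the Itô-isometry integral with a trapezoidal rule and verifies the scalar special case G^2(1-{\rm e}^{-2\alpha t})/(2\alpha) .

    import numpy as np
    from scipy.linalg import expm

    # Covariance of X_t for a constant matrix G, via the Ito isometry:
    #   Cov(X_t) = int_0^t exp(-alpha (t-s)) G G^T exp(-alpha^T (t-s)) ds
    # (alpha, G and t are illustrative placeholders).
    alpha = np.array([[1.0, 0.3],
                      [0.0, 0.7]])
    G = np.array([[0.02, 0.0],
                  [0.005, 0.01]])
    t, n = 2.0, 400

    s_grid = np.linspace(0.0, t, n + 1)
    w = np.full(n + 1, t / n)
    w[0] *= 0.5
    w[-1] *= 0.5                                          # trapezoid weights
    cov = sum(wi * expm(-alpha * (t - s)) @ G @ G.T @ expm(-alpha.T * (t - s))
              for wi, s in zip(w, s_grid))
    print("Cov(X_t) =\n", cov)

    # Scalar check: for n = 1 the integral reduces to G^2 (1 - e^{-2 a t}) / (2 a).
    a, g = 0.8, 0.02
    scalar = sum(wi * np.exp(-2 * a * (t - s)) * g**2 for wi, s in zip(w, s_grid))
    print(scalar, "vs", g**2 * (1 - np.exp(-2 * a * t)) / (2 * a))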

Model Assumptions

The modeling decision to employ f(\sigma W) rather than a time-varying correlation matrix reflects a deliberate trade-off between expressive power and analytical tractability. The function f is used to capture nonlinear heteroskedastic behavior driven by interactions between multiple stochastic systems. More specifically:

  • Nonlinearity: The transformation via f permits the introduction of local, nonlinear distortion effects that are challenging to capture using purely linear correlation structures.
  • Parsimony: A full time-evolving correlation matrix introduces a significant number of parameters, which can lead to identifiability issues, particularly when empirical data is limited.
  • Interpretability: The function f offers a modular and interpretable way to model external influence on endogenous noise, aligning with methods used in stochastic volatility modeling.

The choice here is intentional and consistent with the goal of modeling systems where volatility is driven by nonlinear interaction rather than simply nonstationary correlation.

The differential form introduced as Equation (7) in the paper is given by:

{\displaystyle df(\mathbf{X}) = \sum_{k = 1}^{n} \frac{\partial f(\mathbf{X})}{\partial x_k} \, dx_k. \nonumber }

This expression implies that f is differentiable and locally homogeneous of degree 1, satisfying:

{\displaystyle f(\lambda \mathbf{x}) = \lambda f(\mathbf{x}), \quad \text{for any } \lambda \in \mathbb{R}, \mathbf{x} \in \mathbb{R}^n. \nonumber }

This is not an assumption of global homogeneity, but rather a local property that ensures consistency under scalar transformation. The rationale behind this is twofold:

  • Stability Under Scaling: Systems influenced by proportional shocks should exhibit consistent variance scaling properties under time evolution.
  • Differentiability: The form of df ensures that perturbations to each dimension of X yield tractable expressions in the stochastic differential system.

This framework is particularly useful for modeling multiplicative noise processes or systems with volatility clustering.
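
A small numerical illustration of the degree-1 homogeneity and of Euler's relation x\cdot\nabla f(x) = f(x)  is sketched below, using a hypothetical nonlinear (but degree-1 homogeneous) function f  that is not taken from the post.

    import numpy as np

    # Check f(lambda x) = lambda f(x) and Euler's relation x . grad f(x) = f(x)
    # for a hypothetical nonlinear, degree-1 homogeneous function (away from 0).
    def f(x):
        x1, x2 = x
        return (x1**3 + x2**3) / (x1**2 + x2**2)

    def grad(fn, x, h=1e-6):
        # central finite differences
        g = np.zeros_like(x)
        for k in range(len(x)):
            e = np.zeros_like(x)
            e[k] = h
            g[k] = (fn(x + e) - fn(x - e)) / (2 * h)
        return g

    x = np.array([0.7, -1.3])
    for lam in (-2.0, 0.5, 3.0):
        print(lam, f(lam * x), lam * f(x))   # the two values should coincide
    print("Euler relation:", x @ grad(f, x), "vs", f(x))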

Equation (6) as originally written,

{\displaystyle dx^{i,j}_t = \alpha x^{i,j}_t dt + f(\sigma W^{\{i,j\}}_t), \nonumber }

is shorthand for a more general formulation in which the driving noise W^{\{i,j\}}_t is a linear combination of two Wiener processes:

{\displaystyle \tilde{W}_t = \lambda_1 W^i_t + \lambda_2 W^j_t, \quad \lambda_1, \lambda_2 \in \mathbb{R}. \nonumber }

The resulting system becomes:

{\displaystyle dx_t = \alpha x_t dt + f(\sigma \tilde{W}_t). \nonumber }

This construction acknowledges that real-world systems are rarely closed and often subject to external influences that do not respect strict orthogonality. The function f absorbs these dependencies into a nonlinear transformation of noise.
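
The sketch below (with arbitrary weights \lambda_1, \lambda_2  and a hypothetical f , here {\rm tanh} ) illustrates that, for independent Wiener processes, the combination \tilde{W}_t  is itself a Brownian motion with variance (\lambda_1^2+\lambda_2^2)\,t  before it is passed through f .

    import numpy as np

    # Combine two independent Wiener processes into tilde_W = l1 W^i + l2 W^j,
    # check Var[tilde_W_T] ~ (l1^2 + l2^2) T, then apply a hypothetical f.
    rng = np.random.default_rng(2)

    l1, l2 = 0.6, 1.1                      # arbitrary weights
    T, n_steps, n_paths = 1.0, 500, 20000
    dt = T / n_steps

    dWi = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    dWj = rng.normal(0.0, np.sqrt(dt), size=(n_paths, n_steps))
    Wi, Wj = dWi.sum(axis=1), dWj.sum(axis=1)      # terminal values W^i_T, W^j_T
    tilde_W = l1 * Wi + l2 * Wj

    print("Var[tilde_W_T]:", tilde_W.var(), "theory:", (l1**2 + l2**2) * T)

    sigma = 0.3
    f = lambda w: np.tanh(w)               # hypothetical nonlinear f
    print("E[f(sigma tilde_W_T)]:", f(sigma * tilde_W).mean())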

Given that f \neq 0, the resulting process is no longer a Lévy process in the strict sense. The introduction of f breaks the stationary-increment and/or independent-increment properties, depending on its form. This departure is intentional, as the goal is to model a more physically realistic, heteroskedastic process in which the variance is no longer constant and memory effects may emerge. The nonlinear properties of f(W(\boldsymbol{z})) described in Chapter II, which can be thought of as a type of memory function, arise from its intrinsic dependence on past values of W_t. This means that for some integer-time stochastic process \{Z_n; n \geqslant 1\}, our model may satisfy one of two conditions:

A submartingale condition:

{\displaystyle \mathbb{E}[|Z_n|] < \infty, \quad \mathbb{E}[Z_n | Z_{n-1}, Z_{n-2}, \dots, Z_1] \geq Z_{n-1}, \quad n \geq 1, \nonumber }

or a supermartingale condition:

{\displaystyle \mathbb{E}[|Z_n|] < \infty, \quad \mathbb{E}[Z_n | Z_{n-1}, Z_{n-2}, \dots, Z_1] \leq Z_{n-1}, \quad n \geq 1. \nonumber }
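
As a toy illustration (with a hypothetical nonnegative f , here f(w)=w^2 , not taken from the post), a discretised update Z_n = Z_{n-1} + f(\sigma\,\Delta W_n)  satisfies the submartingale condition, since \mathbb{E}[Z_n \mid Z_{n-1},\dots,Z_1] = Z_{n-1} + \mathbb{E}[f(\sigma\,\Delta W_n)] \geq Z_{n-1} . The sketch below estimates that expected increment by Monte Carlo.

    import numpy as np

    # Toy check of the submartingale condition for Z_n = Z_{n-1} + f(sigma dW_n)
    # with a hypothetical nonnegative f (here f(w) = w^2), so that
    # E[Z_n | Z_{n-1}, ..., Z_1] = Z_{n-1} + E[f(sigma dW)] >= Z_{n-1}.
    rng = np.random.default_rng(3)

    sigma, dt, n_samples = 0.5, 0.01, 100000
    f = lambda w: w**2                      # nonnegative increments

    dW = rng.normal(0.0, np.sqrt(dt), size=n_samples)
    increment = f(sigma * dW).mean()        # Monte Carlo estimate of E[f(sigma dW)]

    print("E[f(sigma dW)] ~", increment, "  (theory:", sigma**2 * dt, ")")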