Dynamic Causal Modelling

Dynamic causal modelling (DCM)¹ tries to model the interactions between (cortical) neuronal populations, based on neuroimaging time series. In particular, DCM is generative, deterministic and nonlinear. It is distinct from conventional approaches in the assumptions made:

structural equation modelling assumes that responses are driven by endogenous factors
autoregression models assume that responses are driven by intrinsic noise
DCM assums that responses are driven by “designed changes in inputs”¹

In essence, DCM is a collection of biophysical models, each estimating the neuronal responses in one brain area. For BOLD fMRI in particular, these would be hemodynamic models. To start, we first take a look at the region-wise hemodynamic models.

1. Region-wise hemodynamic models²

Here, regions refer to units that the neuroimaging time series are based on, e.g., voxels. Each model estimates a region’s BOLD response given its input stimuli. This is done using Bayesian estimation which computes the posterior distribution of the model parameters given the data.

We start with a hemodynamic model that considers a single input stimulus³. This model inclues two components, one based on the Balloon model and one that models the changes in regional cerebral blood flow (rCBF).

1.1. The Balloon model

The Balloon model is a hemodynamic model that estimates the BOLD signal ($y$) based on blood flow⁴^,⁵:

\[y(t) = V_0 [7E_0 (1 - q) + 2 (1 - \frac{q}{v}) + (2E_0 - 0.2) (1 - v)] \\\]

where $V_0=0.02$ is the resting blood volume fraction, $v$ is the normalised venous volume, $q$ is the normalised total deoxyhemoglobin voxel content, and $E_0$ is the resting net oxygen extraction fraction by the capillary bed. Here, the variables are $v$ and $q$, which are characterised by rates of change:

\[\begin{aligned} \tau_0 \dot{v} & = f_{in} - f_{out}(v) = f_{in} - v^{\frac{1}{\alpha}} \\ \tau_0 \dot{q} & = f_{in} \frac{E(f_{in}, E_0)}{E_0} - f_{out}(v) \frac{q}{v} \approx f_{in} \frac{1 - (1 - E_0)^{\frac{1}{f_in}}}{E_0} - v^{\frac{1}{\alpha}} (\frac{q}{v}) \end{aligned}\]

where $f_{in}$ and $f_{out}$ are the inflow and outflow, $\tau_0$ is the mean transit time, $\alpha$ is a parameter from the windkessel model representing the stiffness. In the end, there are three unknown parameters ($E_0$, $\tau_0$, and $\alpha$) and one variable ($f_{in}$) to estimate.

1.2. rCBF component

This component models the linear dependency of blood flow on synaptic activity. This model estimates the inflow $f_{in}$ for the Balloon component based on the input stimulus $u(t)$:

\[\begin{aligned} \dot{f_{in}} & = s \\ \dot{s} & = \epsilon u(t) - \frac{s}{\tau_s} - \frac{f_{in} - 1}{\tau_f} \end{aligned}\]

where $s$ is some flow inducing signal, $\epsilon$ is an unknown parameter representing the efficacy of neuronal activity’s effects, $\tau_s$ is an unknown time-constant for signal decay, and $\tau_f$ is an unknown time-constant for autoregulatory feedback from blood flow.

1.3. Multi-input model

Now we can extend the model to a multiple input system. This is simply done by adding multiple $u(t)$ terms:

\[\dot{s} = \epsilon_1 u(t)_1 + ... + \epsilon_n u(t)_n - \frac{s}{\tau_s} - \frac{f_{in} - 1}{\tau_f}\]

The model has the state-space representation:

\[\begin{aligned} \dot{X}(t) & = f(X, u(t)) \\ \dot{x}_1 & = f_1(X, u(t)) = \epsilon_1 u(t)_1 + ... + \epsilon_n u(t)_n - \kappa_s x_1- \kappa_f (x_2 - 1) \\ \dot{x}_2 & = f_2(X, u(t)) = x_1 \\ \dot{x}_3 & = f_3(X, u(t)) = \frac{1}{\tau_0} (x_2 - f_{out}(x_3, \alpha)) \\ \dot{x}_4 & = f_4(X, u(t)) = \frac{1}{\tau_0} (x_2 \frac{E(x_2, E_0)}{E_0} - f_{out}(x_3, \alpha) \frac{x_4}{x_3}) \\ y(t) & = \lambda(X(t)) = 0.02 [7E_0 (1 - x_4) + 2 (1 - \frac{x_4}{x_3}) + (2E_0 - 0.2) (1 - x_3)] \end{aligned}\]

1.4. Priors

Informative priors on the parameters (or the unavailability of them) are often the roadblocks of Bayesian estimation. However, since we are using biophysical parameters in this case, their prior distribution can be estimated from data. While the model is nonlinear, its Volterra formuation is linear with respect to the unknown parameters:

\[\begin{aligned} y(t) & = \kappa_0 + \sum^{\infty}_{i=1} \int^{\infty}_0 ... \int^{\infty}_0 \kappa_i (\sigma_1, ..., \sigma_i) u(t - \sigma_1) ... u(t - \sigma_i) d\sigma_1 ... d\sigma_i \\ \kappa_i(\sigma_1, ..., \sigma_i) & = \frac{\partial^i y(t)}{\partial u(t - \sigma_1) ... u(t - \sigma_i)} \end{aligned}\]

Assuming Gaussian distributions, the prior densities can be described using the estimated sample mean and covariance²,³:

Parameter	Description	Sample mean $\eta_{\theta}$	Sample variance $C_{\theta}$
$\kappa_s = \frac{1}{\tau_s}$ (or $\kappa$)	rate of signal decay	0.65/s	0.015
$\kappa_f = \frac{1}{\tau_f}$ (or $\gamma$)	rate of flow-dependent elimination	0.41/s	0.002
$\tau_0$	hemodynamic mean transit time	0.98s	0.0568
$\alpha$	Grubb’s stiffness exponent	0.32	0.0015
$E_0$ (or $\rho$)	resting oxygen extraction fraction	0.34	0.0024

2. DCM

The interaction between brain regions can be arbitrarily modelled as:

\[\dot{z} = F(z, u, \theta)\]

where $z$ is the neuronal states of the brain regions, $u$ is the input stimulus, $\theta$ is the parameters. To reparametrise this formulation in terms of effective connectivity ($\frac{\partial \dot{z}}{\partial z}$), we use the bilinear approximation of it:

\[\begin{aligned} \dot{z} & \approx A z + \sum u_j B^j z + C u \\ A & = \frac{\partial F}{\partial z} = \frac{\partial \dot{z}}{\partial z} \\ B & = \frac{\partial^2 F}{\partial z \partial u_j} = \frac{\partial}{\partial u_j} \frac{\partial \dot{z}}{\partial z} \\ C & = \frac{\partial F}{\partial u} \end{aligned}\]

The three matrices $A$, $B^j$ and $C$ represents the effective connectivity (in absence of input), the input-induced change in connectivity, and the influences of inputs on neuronal activity respectively¹. This is a generic formulation of DCM, which can be combined with the hemodyncamic models for a model specific to BOLD fMRI. The combined model has the state-space representation:

\[\begin{aligned} \dot{x} & = f(x, u, \theta) \\ y & = \lambda(x) \\ \end{aligned}\]

Integrating the state equations (or using the Volterra kernels) will again give us the similar expansion, but for the predicted response $h$¹:

\[\begin{aligned} h_i(u, \theta) & = \sum_k \int^t_0 ... \int^t_0 \kappa^k_i (\sigma_1, ..., \sigma_k) u(t - \sigma_1), ..., u(t - \sigma_k) d\sigma_1, ..., d\sigma_k \\ \kappa^k_i (\sigma_1, ..., \sigma_k) & = \frac{\partial^k y_i}{\partial u(t - \sigma_1) ... u(t - \sigma_k)} \end{aligned}\]

Adding an error ($\epsilon$) term, we can approximate $y$ with a working etimate of the conditional mean²:

\[\begin{aligned} y & = h(u, \theta) + \epsilon \\ & \approx h(u, \eta_{\theta | y}) + \frac{\partial h(u, \eta_{\theta | y})}{\partial \theta} (\theta - \eta_{\theta | y}) + \epsilon \\ & = h(u, \eta_{\theta | y}) + J(\theta - \eta_{\theta | y}) + \epsilon \end{aligned}\]

Under Gaussian assumptions, we can continue with Bayesian estimation:

\[\begin{aligned} \text{Let } r & = y - h(u, \eta_{\theta || y}) \text{, then} \\ p(y | \theta) & \propto e^{-\frac{1}{2} [r - J (\theta - \eta_{\theta | y})]^T C^{-1}_\epsilon [r - J (\theta - \eta_{\theta | y})]} \\ p(\theta) & \propto e^{-\frac{1}{2} (\theta - \eta_\theta)^T C^{-1}_\theta (\theta - \eta_\theta)} \\ p(\theta | y) & \propto p(y | \theta) p(\theta) = e^{-\frac{1}{2} (\theta - \eta^{i+1}_{\theta | y})^T C^{-1}_{\theta | y} (\theta - \eta^{i+1}_{\theta | y})} \\ \text{where} \quad C^{-1}_{\theta | y} & = (J^T C^{-1}_\epsilon J + C^{-1}_\theta)^{-1} \\ \eta^{i+1}_{\theta | y} & = \eta^{i}_{\theta | y} + C_{\theta | y} [J^T C^{-1}_\epsilon r + C^{-1}_\theta (\eta_\theta - \eta^{i}_{\theta | y})] \end{aligned}\]

The parameters can then be estimated with an expectation maximization (EM) scheme.

3. Regression DCM

Regression DCM (rDCM) adds four modification to the original DCM framework⁶:

State and observation equations are translated to frequency domain using Fourier transform
Hemodynamic model is replaced with a linear hemodynamic response function (HRF)
Mean field approximation across regions
Conjugate priors on neuronal parameters and noise precision are specified

3.1. Four transforms and HRF

To start, the nonlinear term in the model is removed, reducing the formulation to:

\[\dot{x} = Ax + Cu\]

Then, the equation can be translated to the frequency domain (Fourier transform, denoted by the hat symbol):

\[\hat{\dot{x}} = i \omega \hat{x}= A \hat{x} + C \hat{u}\]

To add the hemodynamic model, we want to ensure the linearity of the equation for translating to the frequency domain. The simple approach adopted here is to represent responses ($y$) as the convolution between the neuronal states and a fixed HRF ($h$):

\[\begin{aligned} i \omega (\widehat{h \otimes x}) & = A(\widehat{h \otimes x}) + C \hat{h} \hat{u} \\ i \omega \hat{y_B} & = A \hat{y_B} + C \hat{h} \hat{u} \\ \end{aligned}\]

where $y_B$ is the response without noise. Here, the neuronal activity and response formulated as a continuous variables. However, our actual data (fMRI measurements) are discrete. This leads to the use of the discrete Fourier transform (DFT) for discretisation:

\[\begin{aligned} i \omega & := i m \Delta \omega = i m \frac{2 \pi}{NT} \approx \frac{1}{T} (e^{2 \pi i \frac{m}{N}} - 1) \\ \text{with} \quad y_i & = y_{B,i} + \epsilon_i \quad \text{where} \quad \epsilon_i \sim \mathcal{N} (0, \sigma^2_i I_{N \times N}) \\ \text{then} \quad \frac{\hat{y}}{T} (e^{2 \pi i \frac{m}{N}} - 1) & = A \hat{y} + C \hat{h} \hat{u} + [\frac{1}{T} (e^{2 \pi i \frac{m}{N}} - 1) - A] \hat{\epsilon} \end{aligned}\]

where $N$ is the number of data points, $T$ is the time interval between data points (i.e., repetition time for fMRI), and $m = [0, 1, …, N-1]$. is a vector of frequency indices.

3.2. Mean field approximation

Another issue arises as the error term makes it difficult to derive an analytical expression for the likelihood function. To solve this, mean field approximation is used, assuming no dependencies between parameters affecting different regions. Effectively, the noise distribution becomes:

\[\epsilon \sim \mathcal{N} (0, \tau^{-1}_i I_{N \times N})\]

where $\tau$ parametrise the noise precision.

3.3. Likelihood function

These modifications cause the likelihood functions to become:

\[\begin{aligned} p(Y | \theta, \tau, X) & =\prod^R_{r=1} \mathcal{N} (Y_r; X \theta_r, \tau^{-1}_r I_{N \times N}) \\ \text{where} \quad Y_r & = (e^{2 \pi i \frac{m}{N}} - 1) \frac{\hat{y}_r}{T} \\ X & = [\hat{y}_1, ..., \hat{y}_R, \hat{h} \hat{u}_1, ..., \hat{h} \hat{u}_K] \\ \theta_r & = [a_{r,1}, ..., a_{r, R}, c_{r, 1}, ..., c_{r, K}] \end{aligned}\]

Specifically, $Y_r$ is the Fourier transformation of the temporal derivative of $y_r$, which is the BOLD signal in brain region $r$. The design matrix $X$ involves the variables $y$ and $u$ (input stimulus) through discrete Fourier transform. The parameters $\theta_r$ incluses connections $a_{i, j}$ between regions $r$ and $1, …, R$, and driving input parameters $c_{i, j}$ targeting region $r$. The parameter $\tau_r$ represents the noise precision for region $r$.

Additionally, rDCM can be applied to resting-state fMRI, by setting all $c_r$ to zero in the absence of input stimulus $u_k$⁷. This formulation comes at a loss of the separation between state and observation levels, and is similar to a multivariate autoregressive model in the frequency domain.

3.4. Conjugate priors

For the noise precision $\tau$, Gamma prior is used as Gamma priors serves as conjugate prios on precision for Gaussian likelihood. For the connectivity parameters, the same zero-mean Gaussian shrinkage priors from classical DCM are used. The prior distributions are thus:

\[\begin{aligned} p(\tau_i) & = \Gamma(\tau_i; \alpha_0, \beta_0) = \frac{\beta^{\alpha_0}_0}{\Gamma(a_0)} \tau^{\alpha_0 - 1}_i e^{-\beta_0 \tau_i} \\ p(\theta_i) & = \mathcal{N} (\theta_i; \mu^i_0, \Sigma^i_0) = (2 \pi)^{-\frac{D_i}{2}} |\Sigma^i_0|^{-\frac{1}{2}} e^{-\frac{1}{2}(\theta_i - \mu^i_0)^T {\Sigma^{i}_0}^{-1} (\theta_i - \mu^i_0)} \\ \end{aligned}\]

where $\alpha_0=2$ and $\beta_0=1$ are the shape and rate parameters of Gamma distribution matching the first two moments of the standard log-normal prior from DCM10 (from SPM8), and $\mu_0$ and $\Sigma_0$ are the mean and covariance of Gaussian distribution from DCM10.

3.5. Inference

The inference procedure is done using a variational Bayes approach under the Laplace approximation (VBL):

\[\begin{aligned} \text{since} \quad p(\theta, \tau | Y, X) & \propto \prod^R_{i=1} p(Y_i | X, \theta_i, \tau_i) \prod^R_{i=1} [p(\theta_i) p(\tau_i)] \\ \ln q(\theta | Y, X) & = \langle \ln p(\theta, \tau, Y | X) \rangle_{q(\tau)} + const \\ & = \langle \ln \mathcal{N} (Y; X \theta, \tau^{-1} I_{N \times N}) + \ln \mathcal{N} (\theta; \mu_0, \Sigma_0) \rangle_{q(\tau)} + const \\ & = -\frac{\langle \tau \rangle_{q(\tau)}}{2} (Y - X\theta)^T (Y -X\theta) -\frac{1}{2} (\theta - \mu_0)^T {\Sigma_0}^{-1} (\theta - \mu_0) + const \\ & = -\frac{\alpha_{\tau|y}}{2\beta_{\tau|y}}\theta^T X^T X\theta + \frac{\alpha_{\tau|y}}{\beta_{\tau|y}} \theta^T X^T Y - \frac{1}{2} \theta^T \Sigma^{-1}_0 \theta + \theta^T \Sigma^{-1}_0 \mu_0 + const \\ & = \theta^T [-\frac{1}{2} (\frac{\alpha_{\tau|y}}{\beta_{\tau|y}} X^T X + \Sigma^{-1}_0) \theta + \frac{\alpha_{\tau|y}}{\beta_{\tau|y}} X^T Y + \Sigma^{-1}_0 \mu_0] + const \\ \ln q(\tau | Y, X) & = \langle \ln p(\theta, \tau, Y | X) \rangle_{q(\theta)} + const \\ & = \langle \ln \mathcal{N} (Y; X \theta, \tau^{-1} I_{N \times N}) + \ln \Gamma (\tau; \alpha_0 \beta_0) \rangle_{q(\theta)} + const \\ & = \frac{N}{2} \ln - \frac{\tau}{2} \langle (Y - X\theta)^T (Y -X\theta) \rangle_{q_(\theta)} + (\alpha_0 - 1) \ln \tau - \beta_0 \tau + const \\ & = \frac{N}{2} \ln - \frac{\tau}{2} [(Y - X\mu_{\theta | y})^T (Y - X\mu_{\theta | y}) + trace(X^T X \Sigma_{\theta | y})] + (\alpha_0 - 1) \ln \tau - \beta_0 \tau + const \\ \end{aligned}\]

Finally, the iterative sheme has the update functions, with a convergence criteria of $\Delta \tau < 10^{-10}$:

\[\begin{aligned} \Sigma_{\theta | y} & = (\frac{\alpha_{\tau|y}}{\beta_{\tau|y}} X^T X + \Sigma^{-1}_0)^{-1} \\ \mu_{\theta | y} & = \Sigma_{\theta | y} (\frac{\alpha_{\tau|y}}{\beta_{\tau|y}} X^T Y + \Sigma^{-1}_0 \mu_0) \\ \alpha_{\tau | y} & = \alpha_0 + \frac{N}{2} \\ \beta_{\tau | y} & = \beta_0 + \frac{1}{2} [(Y - X\mu_{\theta | y})^T (Y - X\mu_{\theta | y}) + trace(X^T X \Sigma_{\theta | y})] \end{aligned}\]

Friston KJ, Harrison L, Penny W. 2003. Dynamic causal modelling. NeuroImage. 19: 1273-1302. DOI: 10.1016/S1053-8119(03)00202-7. ↩ ↩² ↩³ ↩⁴
Friston KJ. 2002. Bayesian estimation of dynamical systems: an application to fMRI. NeuroImage. 16: 513-530. DOI: 10.1006/nimg.2001.1044. ↩ ↩² ↩³
Friston KJ, Mechelli A, Turner R, Price CJ. 2000. Nonlinear responses in fMRI: the Balloon model, Volterra kernels, and other hemodynamics. NeuroImage. 12: 466-477. DOI: 10.1006/nimg.2000.0630. ↩ ↩²
Buxton RB, Frank LR. 1997. A model for the coupling between cerebral blood flow and oxygen metabolism during neural stimulation. J. Cereb. Blood Flow Metab. 17: 64-72. ↩
Buxton RB, Wong EC, Frank LR. 1998. Dynamics of blood flow and oxygenation changes during brain activation: The Balloon model. MRM. 39: 855-864. ↩
Frässle S, Lomakina El, Razi A, Friston KJ, et al. 2017. Regression DCM for fMRI. NeuroImage. 155: 406-421. DOI: 10.1016/j.neuroimage.2017.02.090 ↩
Frässle S, Harrison SJ, Heinzle J, Clementz BA, et al. 2021. Regression dynamic causal modeling for resting-state fMRI. Hum Brain Mapp. 42: 2159-2180. DOI: 10.1002/hbm.25357 ↩

1. Region-wise hemodynamic models2