Deep Learning for Regularization Prediction in Diffeomorphic Image Registration

Jian Wang¹, Miaomiao Zhang¹
1: University of Virginia, USA
Publication date: 2022/02/13
https://doi.org/10.59275/j.melba.2021-77df

Abstract

This paper presents a predictive model for estimating the regularization parameters of diffeomorphic image registration. We introduce a novel framework that automatically determines the parameters controlling the smoothness of diffeomorphic transformations. Our method significantly reduces the effort of parameter tuning, which is time-consuming and labor-intensive. To achieve this goal, we develop a predictive model based on deep convolutional neural networks (CNN) that learns the mapping between pairwise images and the regularization parameter of image registration. In contrast to previous methods that estimate such parameters in a high-dimensional image space, our model is built in an efficient bandlimited space with much lower dimensions. We demonstrate the effectiveness of our model on both 2D synthetic data and 3D real brain images. Experimental results show that our model not only predicts appropriate regularization parameters for image registration, but also improves the network training in terms of time and memory efficiency.

Keywords

diffeomorphic image registration · predictive registration regularization · deep learning

Bibtex @article{melba:2021:017:wang, title = "Deep Learning for Regularization Prediction in Diffeomorphic Image Registration", author = "Wang, Jian and Zhang, Miaomiao", journal = "Machine Learning for Biomedical Imaging", volume = "1", issue = "February 2022 issue", year = "2021", pages = "1--20", issn = "2766-905X", doi = "https://doi.org/10.59275/j.melba.2021-77df", url = "https://melba-journal.org/2021:017" }



1 Introduction

Diffeomorphic image registration is a fundamental tool for various medical image analysis tasks, as it provides a smooth and invertible mapping (also known as a diffeomorphism) between pairwise images. Examples include atlas-based image segmentation (Ashburner and Friston, 2005; Gao et al., 2016), anatomical shape analysis based on geometric changes (Vaillant et al., 2004; Zhang et al., 2016; Hong et al., 2017), and motion correction in spatial-temporal image sequences (De Craene et al., 2012; Liao et al., 2016; Xing et al., 2019). The nice properties of diffeomorphisms keep the topological structures of objects in images intact. Artifacts (i.e., tearing, folding, or crossing) that generate biologically meaningless images can be effectively avoided, especially when large deformations occur. The problem of diffeomorphic image registration is typically formulated as an optimization over transformation fields, such as a free-form deformation using B-splines (Rueckert et al., 2006), a LogDemons algorithm based on stationary velocity fields (SVF) (Arsigny et al., 2006), and the large deformation diffeomorphic metric mapping (LDDMM) method utilizing time-varying velocity fields (Beg et al., 2005).

To ensure the smoothness of transformation fields, a regularization term defined on the tangent space of diffeomorphisms (called velocity fields) is often introduced in registration models. Such a regularity term with properly chosen model parameters is critical to registration performance because the parameters greatly affect the estimated transformations. Neither overly large nor overly small regularization achieves satisfactory registration results (as shown in Fig. 1). Approaches to handling the regularity parameter mainly include (i) directly optimizing a Bayesian model, or treating the parameter as a latent variable to be integrated out via an Expectation Maximization (EM) algorithm (Allassonnière et al., 2007; Allassonnière and Kuhn, 2008; Zhang et al., 2013; Wang and Zhang, 2021), (ii) exhaustive search in the parameter space (Jaillet et al., 2005; Valsecchi et al., 2013; Ruppert et al., 2017), and (iii) parameter continuation methods (Haber et al., 2000; Haber and Modersitzki, 2006; Mang and Biros, 2015; Mang et al., 2019). Direct optimization approaches define a posterior of transformation fields that includes an image matching term as a likelihood and a regularization term as a prior supporting the smoothness of transformations (Zöllei et al., 2007; Allassonnière and Kuhn, 2008; Toews and Wells, 2009). Estimating the regularization parameters of these models by direct optimization is not straightforward due to the complex structure of the posterior distribution. Simpson et al. infer the level of regularization in a small-deformation registration model by mean-field variational Bayes (VB) inference (Jordan et al., 1999), which allows tractable approximation of full Bayesian inference in a hierarchical probabilistic model (Simpson et al., 2012, 2015). However, these algorithms depend heavily on initialization, and are prone to getting stuck in local minima of the high-dimensional and non-linear functions in the transformation space.
A stochastic approximation expectation maximization (SAEM) algorithm (Allassonnière and Kuhn, 2008) was developed to marginalize over the posterior distribution of unknown parameters using a Markov Chain Monte Carlo (MCMC) sampling method. Later, Zhang et al. estimated the model parameters of regularization via a Monte Carlo Expectation Maximization (MCEM) algorithm for the unbiased atlas building problem (Zhang et al., 2013). A recent Hierarchical Bayesian registration model (Wang and Zhang, 2021) further characterizes the regularization parameters as latent variables generated from a Gamma distribution, and integrates them out by an MCEM method.

Figure 1: Left to right: examples of transformation fields overlaid with deformed images for under-regularized and over-regularized registration models. A small regularization introduces crossing artifacts in the transformations, while a large regularization prevents sufficient deformation between images.

Despite the achievements of the aforementioned methods, estimating the regularization parameter in the high-dimensional and nonlinear space of 3D MRIs (i.e., of dimension typically $128^3$ or higher) inevitably leads to expensive computational cost through iterative optimization. To address this issue, we present a deep learning approach to quickly predict registration parameters. While there exist learning-based registration models for transformations (Krebs et al., 2019; Balakrishnan et al., 2018; Biffi et al., 2020), we are particularly interested in learning the relationship between pairwise images and the optimal regularization of transformations via regression. In order to produce “ground truth” regularization parameters, we first introduce a low-dimensional Bayesian model of image registration to estimate the best regularity from the data itself. Following a recent work (Wang et al., 2019), we construct a posterior distribution of diffeomorphic transformations entirely in a bandlimited space with much lower dimensions. This greatly reduces the computational cost of data generation in training. The theoretical tools developed in this work are generic to various deformable registration models, e.g., stationary velocity fields that remain constant over time (Arsigny et al., 2006). The model recommends optimal regularization parameters for registration in real time and has great potential in clinical applications (i.e., image-guided navigation systems for brain shift compensation during surgery (Luo et al., 2018)). To summarize, our main contributions are threefold:

  • To the best of our knowledge, we are the first to present a predictive regularization estimation method for diffeomorphic image registration through deep learning.

  • We develop a low-dimensional Bayesian framework in a bandlimited Fourier space to speed up the training data generation.

  • Our model significantly speeds up the parameter estimation, while maintaining comparable registration results.

The paper is organized as follows. In Sec. 2, we lay out the basics of image registration optimization in the LDDMM framework. In Sec. 3.1, we first develop a low-dimensional posterior distribution that is parametrized by bandlimited velocity fields. We then estimate the regularization parameter by maximizing the posterior. In Sec. 3.2, we design a deep convolutional neural network that takes an image pair as input and adaptively predicts the optimal smoothness level for registration. In Sec. 4, we validate our model on both 2D synthetic data and 3D brain MRI scans.

2 Background: Fast LDDMM With Geodesic Shooting

In this section, we first briefly review a fast image registration algorithm, FLASH, in the setting of LDDMM with geodesic shooting (Zhang and Fletcher, 2015, 2019). The core idea of FLASH is to effectively reparameterize diffeomorphic transformations in their tangent space (also known as velocity fields), where the signals are smooth and do not develop high frequencies in the Fourier space. This allows all computations of the original LDDMM with geodesic shooting (Vialard et al., 2012; Younes et al., 2009) to be carried out in the bandlimited space of velocity fields with much lower dimensions. As a result, the FLASH algorithm significantly speeds up diffeomorphic image registration with little to no loss of accuracy.

2.1 Geodesic Shooting in Fourier Spaces

Given a time-dependent velocity field $\tilde{v}_{t} \in \tilde{V}$, the diffeomorphism $\tilde{\phi}^{-1}_{t} \in \widetilde{\mathrm{Diff}}(\Omega)$ in the finite-dimensional Fourier domain can be computed as

$$\frac{d\tilde{\phi}^{-1}_{t}}{dt} = -\tilde{\mathcal{D}}\tilde{\phi}^{-1}_{t} \ast \tilde{v}_{t}, \tag{1}$$

where $\tilde{\mathcal{D}}\tilde{\phi}^{-1}_{t}$ is a tensor product $\tilde{\mathcal{D}} \otimes \tilde{\phi}^{-1}_{t}$, representing the Fourier frequencies of a Jacobian matrix $\tilde{\mathcal{D}}$. The $\ast$ is a circular convolution with zero padding to avoid aliasing. We truncate the output of the convolution in each dimension to a suitable finite set to prevent the domain from growing to infinity.
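The zero-pad, convolve, truncate pattern described above can be sketched in a few lines of numpy. This is a minimal 1D illustration of the idea, not the FLASH implementation; the function and argument names are ours.

```python
import numpy as np

def truncated_conv(a, b, full_size):
    """Circular convolution of two bandlimited 1D signals.

    Both inputs are zero-padded to `full_size` (to avoid aliasing),
    convolved circularly via the FFT, and the result is truncated
    back to the original bandlimited length, mirroring the truncated
    convolution used in Eq. (1).
    """
    n = len(a)
    a_pad = np.zeros(full_size, dtype=complex)
    b_pad = np.zeros(full_size, dtype=complex)
    a_pad[:n], b_pad[:n] = a, b
    # circular convolution = pointwise product in the dual domain
    conv = np.fft.ifft(np.fft.fft(a_pad) * np.fft.fft(b_pad))
    return conv[:n]  # truncate back to the bandlimited space
```

With sufficient padding, the circular convolution of the padded signals agrees with the linear convolution on the retained frequencies, which is what keeps the bandlimited representation closed under this operation.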

The geodesic shooting algorithm states that the transformation can be uniquely determined by integrating a partial differential equation forward in time from a given initial velocity $v_0$. In contrast to the original LDDMM, which optimizes over a collection of time-dependent velocity fields, geodesic shooting estimates only an optimal initial velocity $v_0$. In this work, we adopt an efficient variant of geodesic shooting defined in Fourier space. We first discretize the Jacobian and divergence operators using finite-difference approximations (specifically, the central difference scheme) and then compute their Fourier coefficients. Detailed derivations can be found in the appendix of FLASH (Zhang and Fletcher, 2019). The Fourier representation of the geodesic constraint that satisfies the Euler-Poincaré differential (EPDiff) equation is (Zhang and Fletcher, 2015)

$$\frac{\partial \tilde{v}_{t}}{\partial t} = -\mathrm{ad}^{\dagger}_{\tilde{v}_{t}}\tilde{v}_{t} = -\tilde{\mathcal{K}}\left[(\tilde{\mathcal{D}}\tilde{v}_{t})^{T} \star \tilde{\mathcal{L}}(\alpha)\tilde{v}_{t} + \tilde{\nabla} \cdot \left(\tilde{\mathcal{L}}(\alpha)\tilde{v}_{t} \otimes \tilde{v}_{t}\right)\right], \tag{2}$$

where $\tilde{\mathcal{L}}$ is a symmetric, positive-definite differential operator that is a function of the parameter $\alpha$ (details are in Sec. 3.1). Here $\tilde{\mathcal{K}}$ is the inverse operator of $\tilde{\mathcal{L}}$, and $\star$ is the truncated matrix-vector field auto-correlation (the output signal remains bandlimited after the auto-correlation operates on the zero-padded input signal, followed by truncating it back to the bandlimited space). The $\mathrm{ad}^{\dagger}$ is an adjoint operator to the negative Jacobi–Lie bracket of vector fields, $\mathrm{ad}_{\tilde{v}}\tilde{w} = -[\tilde{v}, \tilde{w}] = \tilde{\mathcal{D}}\tilde{v} \ast \tilde{w} - \tilde{\mathcal{D}}\tilde{w} \ast \tilde{v}$. The operator $\tilde{\nabla}\cdot$ is the discrete divergence of a vector field (computed by summing the Fourier coefficients of the different directions in $\tilde{\mathcal{D}}$).

2.2 FLASH: Fast Image Registration

Consider a source image $S$ and a target image $T$ as square-integrable functions defined on a torus domain $\Omega = \mathbb{R}^{d}/\mathbb{Z}^{d}$ ($S(x), T(x): \Omega \rightarrow \mathbb{R}$). The problem of diffeomorphic image registration is to find the geodesic (shortest path) of diffeomorphic transformations $\phi_{t} \in \mathrm{Diff}(\Omega): \Omega \rightarrow \Omega$, $t \in [0, 1]$, such that the deformed image $S \circ \phi^{-1}_{1}$ at time point $t = 1$ is similar to $T$.

The objective function can be formulated as a dissimilarity term plus a regularization that enforces the smoothness of transformations

$$E(\tilde{v}_{0}) = \frac{\gamma}{2}\,\mathrm{Dist}(S \circ \phi_{1}^{-1}, T) + \frac{1}{2}\left(\tilde{\mathcal{L}}(\alpha)\tilde{v}_{0}, \tilde{v}_{0}\right), \quad \text{s.t. Eq. (1) \& Eq. (2)}. \tag{3}$$

The $\mathrm{Dist}(\cdot,\cdot)$ is a distance function that measures the dissimilarity between images. Commonly used distance metrics include the sum-of-squared differences ($L_2$ norm) of image intensities (Beg et al., 2005), normalized cross-correlation (NCC) (Avants et al., 2008), and mutual information (MI) (Wells III et al., 1996). The $\phi_{1}^{-1}$ denotes the inverse of the deformation that warps the source image $S$ in the spatial domain at $t = 1$. Here $\gamma$ is a weight parameter balancing the distance function and the regularity term. The distance term stays in the full spatial domain, while the regularity term is computed in the bandlimited space.

3 Our Model: Deep Learning for Regularity Estimation in Image Registration

In this section, we present a supervised learning model based on CNN to predict the regularity of image registration for a given image pair. Analogous to  (Yang et al., 2017), we run optimization-based image registration to obtain training data. We introduce a low-dimensional Bayesian model of image registration to produce appropriate regularization parameters for training.

3.1 Low-dimensional Bayesian Model of Registration

In contrast to previous approaches, our proposed model is parameterized in a bandlimited velocity space $\tilde{V}$, with a parameter $\alpha$ enforcing the smoothness of transformations.

Assuming an independent and identically distributed (i.i.d.) Gaussian noise on image intensities, we obtain the likelihood

$$p(T \,|\, S \circ \phi_{1}^{-1}, \sigma^{2}) = \frac{1}{(\sqrt{2\pi}\sigma^{2})^{M}} \exp\left(-\frac{1}{2\sigma^{2}}\|S \circ \phi_{1}^{-1} - T\|_{2}^{2}\right), \tag{4}$$

where $\sigma^{2}$ is the noise variance and $M$ is the number of image voxels. The deformation $\phi_{1}^{-1}$ corresponds to $\tilde{\phi}_{1}^{-1}$ in Fourier space via the Fourier transform $\mathcal{F}(\phi_{1}^{-1}) = \tilde{\phi}_{1}^{-1}$, or its inverse $\phi_{1}^{-1} = \mathcal{F}^{-1}(\tilde{\phi}_{1}^{-1})$. The likelihood is defined by the residual error between the target image and the deformed source at time point $t = 1$. This distribution is defined after the transformation field has been observed through geodesic shooting; hence it does not depend on the regularization parameter.

Analogous to (Wang et al., 2018), we define a prior on the initial velocity field $\tilde{v}_{0}$ as a complex multivariate Gaussian distribution, i.e.,

$$p(\tilde{v}_{0} \,|\, \alpha) = \frac{1}{(2\pi)^{\frac{M}{2}}\,|\tilde{\mathcal{L}}^{-1}(\alpha)|^{\frac{1}{2}}} \exp\left(-\frac{1}{2}\left(\tilde{\mathcal{L}}(\alpha)\tilde{v}_{0}, \tilde{v}_{0}\right)\right), \tag{5}$$

where $|\cdot|$ is the matrix determinant. The Fourier coefficients of $\tilde{\mathcal{L}}$ are $\tilde{\mathcal{L}} = \left(\alpha\tilde{\mathcal{A}} + 1\right)^{3}$, with $\tilde{\mathcal{A}}(\xi_{1}, \ldots, \xi_{d}) = -2\sum_{j=1}^{d}\left(\cos(2\pi\xi_{j}) - 1\right)$. Here $\tilde{\mathcal{A}}$ denotes a negative discrete Fourier Laplacian operator with a $d$-dimensional frequency vector $(\xi_{1}, \ldots, \xi_{d})$, where $d$ is the dimension of the bandlimited Fourier space.
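The operator $\tilde{\mathcal{L}}$ above is diagonal in the Fourier domain, so it can be built directly from the frequency grid. The following numpy sketch constructs the coefficients of $\tilde{\mathcal{A}}$ and $\tilde{\mathcal{L}} = (\alpha\tilde{\mathcal{A}} + 1)^3$; function names are illustrative and not taken from any released code.

```python
import numpy as np

def laplacian_coeffs(shape):
    """Fourier coefficients of the negative discrete Laplacian:
    A(xi_1, ..., xi_d) = -2 * sum_j (cos(2*pi*xi_j) - 1)."""
    axes = [np.fft.fftfreq(n) for n in shape]  # normalized frequencies xi_j
    grids = np.meshgrid(*axes, indexing="ij")
    return -2.0 * sum(np.cos(2.0 * np.pi * g) - 1.0 for g in grids)

def L_operator(alpha, shape):
    """Diagonal coefficients of L = (alpha * A + 1)^3 and of its
    inverse K, applied by pointwise multiplication in Fourier space."""
    A = laplacian_coeffs(shape)
    L = (alpha * A + 1.0) ** 3
    K = 1.0 / L
    return L, K
```

Since $\tilde{\mathcal{A}} \geq 0$ (each cosine term is at most 1), every diagonal entry of $\tilde{\mathcal{L}}$ is at least 1, which makes the operator symmetric positive-definite as required.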

Combining the likelihood in Eq. (4) and the prior in Eq. (5), we obtain the negative log posterior distribution on the deformation parameterized by $\tilde{v}_{0}$ as

$$-\ln p(\tilde{v}_{0} \,|\, S, T, \sigma^{2}, \alpha) = \frac{1}{2}\left(\tilde{\mathcal{L}}\tilde{v}_{0}, \tilde{v}_{0}\right) + \frac{\|S \circ \phi_{1}^{-1} - T\|_{2}^{2}}{2\sigma^{2}} - \frac{1}{2}\ln|\tilde{\mathcal{L}}| + 2M\ln\sigma + M\ln(2\pi). \tag{6}$$
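Because $\tilde{\mathcal{L}}$ is diagonal in the Fourier domain, the terms of Eq. (6) reduce to elementwise operations. The sketch below evaluates the negative log posterior given precomputed ingredients; in the actual model the residual comes from integrating the geodesic, so the names and the separation of inputs here are our illustrative assumptions.

```python
import numpy as np

def neg_log_posterior(v0, L_diag, residual, sigma):
    """Evaluate Eq. (6) for a diagonal L in the Fourier domain.

    v0       : bandlimited initial velocity coefficients (complex array)
    L_diag   : diagonal Fourier coefficients of L(alpha), all >= 1
    residual : deformed source minus target in the spatial domain (M voxels)
    """
    M = residual.size
    prior = 0.5 * np.real(np.vdot(v0, L_diag * v0))          # (1/2)(L v0, v0)
    data = np.sum(residual ** 2) / (2.0 * sigma ** 2)        # matching term
    log_det = -0.5 * np.sum(np.log(L_diag))                  # -(1/2) ln|L|
    const = 2.0 * M * np.log(sigma) + M * np.log(2.0 * np.pi)
    return prior + data + log_det + const
```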

Next, we optimize Eq. (6) over the regularization parameter $\alpha$ and the registration parameter $\tilde{v}_{0}$ by maximum a posteriori (MAP) estimation using a gradient descent algorithm. Other optimization schemes, such as BFGS (Polzin et al., 2016) or the Gauss-Newton method (Ashburner and Friston, 2011), can also be applied.

Gradient of $\alpha$. To simplify the notation, we first define $f(\tilde{v}_{0}) \triangleq -\ln p(\tilde{v}_{0} \,|\, S, T, \sigma^{2}, \alpha)$. Since the discrete Laplacian operator $\tilde{\mathcal{L}}$ is a diagonal matrix in Fourier space, its determinant can be computed as $\prod_{j=1}^{d}(\alpha\tilde{\mathcal{A}}_{j} + 1)^{3}$. Therefore, the log determinant of the $\tilde{\mathcal{L}}$ operator is

$$\ln|\tilde{\mathcal{L}}| = 3\sum_{j=1}^{d}\ln(\alpha\tilde{\mathcal{A}}_{j} + 1).$$

We then derive the gradient term $\nabla_{\alpha}f(\tilde{v}_{0})$ as

$$\nabla_{\alpha}f(\tilde{v}_{0}) = -\frac{3}{2}\left[\sum_{j=1}^{d}\frac{\tilde{\mathcal{A}}_{j}}{\alpha\tilde{\mathcal{A}}_{j} + 1} - \left\langle(\alpha\tilde{\mathcal{A}} + 1)^{5}\tilde{\mathcal{A}}\tilde{v}_{0}, \tilde{v}_{0}\right\rangle\right]. \tag{7}$$
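The log-determinant part of this derivation is easy to sanity-check numerically: the first bracketed term in Eq. (7) is (up to the $-3/2$ factor) the $\alpha$-derivative of $\ln|\tilde{\mathcal{L}}|$. The snippet below verifies this with a finite difference on a few sample coefficients (the values of `A` and `alpha` are arbitrary test inputs, not model settings).

```python
import numpy as np

A = np.array([0.0, 0.5, 1.2, 3.0])   # sample Fourier Laplacian coefficients
alpha = 0.7

# ln|L| for L = (alpha*A + 1)^3, diagonal in Fourier space
log_det = 3.0 * np.sum(np.log(alpha * A + 1.0))

# its derivative w.r.t. alpha: 3 * sum_j A_j / (alpha*A_j + 1),
# matching the first term inside the brackets of Eq. (7)
d_log_det = 3.0 * np.sum(A / (alpha * A + 1.0))

# forward finite-difference check of the derivative
eps = 1e-6
fd = (3.0 * np.sum(np.log((alpha + eps) * A + 1.0)) - log_det) / eps
```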

Gradient of $\tilde{v}_{0}$. We compute the gradient with respect to $\tilde{v}_{0}$ using a forward-backward sweep approach developed in (Zhang et al., 2017). The steps for obtaining the gradient $\nabla_{\tilde{v}_{0}}f(\tilde{v}_{0})$ are as follows:

  (i) Forward integrate the geodesic shooting equation Eq. (2) to compute $\tilde{v}_{1}$;

  (ii) Compute the gradient of the energy function Eq. (3) with respect to $\tilde{v}_{1}$ at $t = 1$,

$$\nabla_{\tilde{v}_{1}}f(\tilde{v}_{0}) = \tilde{\mathcal{L}}^{-1}(\alpha)\left(\frac{1}{\sigma^{2}}(S \circ \phi_{1}^{-1} - T) \cdot \nabla(S \circ \phi_{1}^{-1})\right); \tag{8}$$

  (iii) Bring the gradient $\nabla_{\tilde{v}_{1}}f(\tilde{v}_{0})$ back to $t = 0$ by integrating the adjoint Jacobi field equations backward in time (Zhang et al., 2017),

$$\frac{d\hat{v}}{dt} = -\mathrm{ad}^{\dagger}_{\tilde{v}}\hat{h}, \qquad \frac{d\hat{h}}{dt} = -\hat{v} - \mathrm{ad}_{\tilde{v}}\hat{h} + \mathrm{ad}^{\dagger}_{\hat{h}}\tilde{v}, \tag{9}$$

where $\hat{v} \in V$ and $\hat{h}$ are introduced adjoint variables with initial conditions $\hat{h} = 0$, $\hat{v} = \nabla_{\tilde{v}_{1}}f(\tilde{v}_{0})$ at $t = 1$.

A summary of the optimization is given in Alg. 1. It is worth mentioning that our low-dimensional Bayesian framework, developed in the bandlimited space, speeds up the computation of generating training regularity parameters by approximately ten times compared with high-dimensional frameworks.

Input: source image $S$ and target image $T$, step sizes $\epsilon$ and $\tau$, iterations $r$, algorithm stop criterion rate $u$, counter $q$, minimal convergence iteration $q_{min}$
Output: optimal smoothness level $\alpha^{opt}$, and registration solution $\tilde{v}_{0}$
/* Low-dimensional MAP estimation for $\alpha$ */
for $i = 1$ to $r$ do
      1. Compute the gradient $\nabla_{\alpha}f(\tilde{v}_{0})$ by Eq. (7);
      /* Update $\alpha$ while its gradient is greater than zero */
      2. if $\lVert\nabla_{\alpha}f(\tilde{v}_{0})\rVert^{2} > 1e{-}6$ then
            $\alpha^{opt} \leftarrow \alpha^{opt} - \tau\nabla_{\alpha}f(\tilde{v}_{0})$;
      else
            break;
      3. Forward integrate the geodesic evolution equation Eq. (2) with initial velocity $\tilde{v}_{0}$;
      4. Compute the gradient $\nabla_{\tilde{v}_{1}}f(\tilde{v}_{0})$ by Eq. (8), then backward integrate the adjoint equations Eq. (9) to generate $\nabla_{\tilde{v}_{0}}f(\tilde{v}_{0})$ at $t = 0$;
      /* Update $\tilde{v}_{0}$ while its gradient is greater than zero */
      5. if $\lVert\nabla_{\tilde{v}_{0}}f(\tilde{v}_{0})\rVert^{2} > 1e{-}6$ then
            $\tilde{v}_{0} \leftarrow \tilde{v}_{0} - \epsilon\nabla_{\tilde{v}_{0}}f(\tilde{v}_{0})$;
      else
            break;
      /* Compute algorithm stop rate */
      6. Compute the total energy Eq. (6) for the $i$-th iteration as $\mathrm{Obj}_{i}$;
      7. if $(\mathrm{Obj}_{i} - \mathrm{Obj}_{i-1})/\mathrm{Obj}_{i} < u$ then
            $q = q + 1$;
      else
            continue;
      /* Convergence check */
      8. if $q \geq q_{min}$ then
            break;
      else
            continue;
end
Algorithm 1: MAP of the low-dimensional Bayesian registration model for training data generation.
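The alternating structure of Algorithm 1 can be sketched in Python. This is a schematic on placeholder gradient and objective callables rather than the actual geodesic shooting pipeline; all names, step sizes, and the toy objective in the usage note are illustrative assumptions.

```python
import numpy as np

def map_estimate(grad_alpha, grad_v, alpha0, v0, tau=0.1, eps=0.1,
                 r=500, u=1e-6, q_min=30, objective=None):
    """Schematic of Algorithm 1: alternate gradient steps on alpha and
    v0, counting iterations whose relative energy change falls below
    `u`, and stopping once the counter reaches `q_min`."""
    alpha, v = alpha0, v0
    q, prev_obj = 0, None
    for _ in range(r):
        g_a = grad_alpha(alpha, v)
        if g_a ** 2 > 1e-6:                      # steps 1-2: update alpha
            alpha -= tau * g_a
        g_v = grad_v(alpha, v)
        if np.sum(g_v ** 2) > 1e-6:              # steps 3-5: update v0
            v = v - eps * g_v
        obj = objective(alpha, v)                # step 6: total energy
        if prev_obj is not None and (obj - prev_obj) / obj < u:
            q += 1                               # step 7: stall counter
        if q >= q_min:                           # step 8: convergence check
            break
        prev_obj = obj
    return alpha, v
```

On a toy quadratic objective, e.g. $f(\alpha, v) = (\alpha - 2)^2 + \|v - 1\|^2$ with its analytic gradients, the loop drives both variables toward their minimizers before the stall counter terminates it.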

3.2 Network Architecture

Now we are ready to introduce our network architecture, using the estimated $\alpha$ and the given image pairs as input training data. Fig. 2 shows an overview flowchart of our proposed learning model. With the optimal registration regularization parameter $\alpha^{opt}$ obtained for an image pair (as described in Sec. 3.1), a two-stream CNN-based regression network takes the source and target images as training data to produce a predicted regularization parameter. We optimize the network $\mathcal{H}(S, T; W)$ with the following objective function,

$$E_{\text{loss}} = \text{Err}\left[\mathcal{H}(S, T; W), \alpha^{opt}\right] + \text{Reg}(W),$$

where $\text{Reg}(W)$ denotes the regularization on the convolution kernel weights $W$, and $\text{Err}[\cdot, \cdot]$ denotes the data fitting term between the ground truth and the network output. In our model, we use the $L_2$ norm for both terms. Other network architectures, e.g., 3D Residual Networks (3D-ResNet) (Hara et al., 2017) and Very Deep Convolutional Networks (3D-VGGNet) (Simonyan and Zisserman, 2014; Yang et al., 2018), can easily be applied as well.
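With $L_2$ norms for both terms, the loss above amounts to a mean squared error on the predicted $\alpha$ plus a weight-decay penalty. A minimal numpy sketch (function and parameter names are ours, and the weight-decay value is only a placeholder):

```python
import numpy as np

def training_loss(pred_alpha, alpha_opt, weights, weight_decay=1e-4):
    """E_loss with L2 norms: mean squared error between predicted and
    ground-truth regularization parameters, plus an L2 penalty on the
    convolution kernel weights W."""
    err = np.mean((np.asarray(pred_alpha) - np.asarray(alpha_opt)) ** 2)
    reg = weight_decay * sum(np.sum(w ** 2) for w in weights)
    return err + reg
```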

Figure 2: Illustration of our proposed network. From left to right: training data comprising pairwise images, a CNN-based neural network, and a loss computed between the network output and the optimal parameter $\alpha^{opt}$, which is estimated from Alg. 1.

In our network, we input the 3D source and target images into separate channels, each consisting of four convolutional blocks. Each 3D convolutional block is composed of a 3D convolutional layer, a batch normalization (BN) layer with activation functions (PReLU or ReLU), and a 3D max-pooling layer. Specifically, we apply $5 \times 5 \times 5$ convolutional kernels and $2 \times 2 \times 2$ max-pooling layers to encode a batch (of size $B$) of source and target images ($128^3$) into feature maps ($16^3$). After extracting the deep features from the source and target channels, we combine them into a fusion channel, which includes three convolutional blocks, a fully connected layer, and an average pooling layer to produce the network output.
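The $128^3 \to 16^3$ reduction can be tracked with the standard output-size formulas for convolution and pooling. The sketch below assumes "same"-padded $5^3$ convolutions (padding 2, stride 1), so that each $2^3$ max-pool halves the grid; three such downsamplings reproduce the reported reduction (whether every one of the four blocks pools is not stated in the text, so the block count here is an assumption).

```python
def conv_out(n, k=5, p=2, s=1):
    """Spatial size after a convolution: floor((n + 2p - k) / s) + 1."""
    return (n + 2 * p - k) // s + 1

def pool_out(n, k=2, s=2):
    """Spatial size after max pooling with kernel k and stride s."""
    return (n - k) // s + 1

def encoder_size(n=128, n_pooled_blocks=3):
    """Size after repeated conv(5^3, pad 2) + pool(2^3) blocks."""
    for _ in range(n_pooled_blocks):
        n = pool_out(conv_out(n))
    return n
```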

4 Experiment

To demonstrate the effectiveness of the proposed low-dimensional Bayesian registration model, we validate it through three sets of experiments. For 2D synthetic data registration, we deform a binary source image with velocity fields sampled from the prior distribution in Eq. (5) with known regularization parameters to simulate target images. We show three convergence graphs of our MAP estimation and compare them with the ground truth parameters.

Similarly, we synthesize $900$ image pairs using regularization parameters at different scales, i.e., $\alpha = \{0.1, 1.0, 10.0\}$, to test the performance of our predictive model. We then use the predicted parameter $\alpha$ to run the registration model and show the error maps between the target and deformed images.

For 3D brain MRI registration, we show results for both MAP and our network prediction. We first show the numerical difference between the MAP estimation and our prediction (i.e., the predicted regularization parameter), and then report the mean error of deformed images between both methods across all datasets. We visualize the transformation grids and report the values of the regularization parameters for both methods. To further investigate the accuracy of the parameters generated by our model, we perform registration-based segmentation and examine the resulting segmentation accuracy over nine brain structures, including the cortex, putamen, cerebellum, caudate, gyrus, brain stem, precuneus, cuneus, and hippocampus. We evaluate a volume-overlap similarity measurement, also known as the Sørensen-Dice coefficient (Dice, 1945), between the propagated segmentation and the manual segmentation. The statistics of the Dice evaluation over $150$ registration pairs are reported.
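For reference, the Sørensen-Dice coefficient used in this evaluation is $2|A \cap B| / (|A| + |B|)$ for two binary label masks. A minimal numpy implementation (ours, for illustration):

```python
import numpy as np

def dice(a, b):
    """Sorensen-Dice overlap between two binary masks:
    2 * |A intersect B| / (|A| + |B|). Assumes at least one mask
    is non-empty (otherwise the denominator is zero)."""
    a = np.asarray(a, dtype=bool)
    b = np.asarray(b, dtype=bool)
    return 2.0 * np.logical_and(a, b).sum() / (a.sum() + b.sum())
```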

Finally, we compare the computational efficiency, in terms of both time and memory consumption, of the proposed method with a baseline model that performs Bayesian estimation of the regularization parameter in the full spatial domain (Zhang et al., 2013).

To generate the training data of initial velocities, we run the proposed low-dimensional Bayesian registration algorithm until convergence. We use the Euler integrator in geodesic shooting and set the number of integration steps to $10$. We set the algorithm stop rate $u$ to $1e{-}6$ and the minimal convergence iteration $q_{min}$ to $30$. We use an optimal truncated dimension of $16$ for $\tilde{v}_{0}$ and $\sigma = 0.03$, following (Zhang et al., 2017). For the network setting, we initialize the convolution kernel weights using the He normal initializer (He et al., 2015) and use the Adam optimizer with a learning rate of $5e{-}4$ until convergence. We set the batch size to $16$ and the weight decay to $1.0e{-}4$. The maximum number of epochs for both 2D and 3D network training is $1000$.

4.1 Data

We run experiments on both a 2D synthetic dataset and real 3D brain MRI scans.

2D synthetic data. We generate synthetic bull's-eye images of size $100 \times 100$ (as shown in Fig. 3). We manipulate the width $a$ and height $b$ of two ellipses by using the equation $\frac{(x - 50)^{2}}{a^{2}} + \frac{(y - 50)^{2}}{b^{2}} = 1$.
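One plausible construction of such a bull's-eye image is the region between an outer and an inner ellipse, both centered at $(50, 50)$ per the equation above. The numpy sketch below is illustrative; the specific semi-axis values are placeholders, not the ones used in the paper.

```python
import numpy as np

def bullseye(a_outer=40, b_outer=30, a_inner=20, b_inner=15, size=100):
    """Binary bull's-eye: pixels between an inner and an outer ellipse
    centered at (50, 50), following (x-50)^2/a^2 + (y-50)^2/b^2 = 1.
    Semi-axis defaults are illustrative."""
    y, x = np.mgrid[:size, :size]

    def inside(a, b):
        return (x - 50) ** 2 / a ** 2 + (y - 50) ** 2 / b ** 2 <= 1.0

    return (inside(a_outer, b_outer) & ~inside(a_inner, b_inner)).astype(float)
```

Varying the semi-axes $a$ and $b$ of the two ellipses then yields source/target pairs with controlled shape differences.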

3D brain MRIs. We include $1500$ public T1-weighted brain MRI scans from the Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset (Jack Jr et al., 2008), the Open Access Series of Imaging Studies (OASIS) (Fotenos et al., 2005), and the LONI Probabilistic Brain Atlas Individual Subject Data (LPBA40) (Shattuck et al., 2008), among which $260$ subjects have manually delineated segmentation labels. All 3D images were carefully pre-processed to $128 \times 128 \times 128$ with $1.25\,\text{mm}^3$ isotropic voxels, and underwent skull stripping, intensity normalization, bias field correction, and pre-alignment with affine transformations.

For both the 2D and 3D datasets, we split the images into $70\%$ for training, $15\%$ for validation, and $15\%$ for testing, such that no subjects are shared across the training, validation, and testing stages. We tune the hyperparameters of the models and conduct preliminary experiments on the validation dataset. The testing set is only used for computing the final results.

4.2 Results

Fig. 3 displays the registration results of our MAP estimation, including the estimated regularity parameters, on 2D synthetic images. The middle panel of Fig. 3 reports the convergence of the $\alpha$ estimation vs. the ground truth. It indicates that our low-dimensional Bayesian model provides trustworthy regularization parameters, fairly close to the ground truth, for network training. The bottom panel of Fig. 3 shows the convergence graph of the total energy for our MAP approach.

Figure 3: Top panel: source image, velocity fields generated from the prior distribution and transformation fields (with known regularization parameters $\alpha = 3, 6, 11$), and target images produced by deforming the source images; Middle panel: convergence graphs of $\alpha$ estimated by MAP for training data; Bottom panel: convergence graphs of the total energy in Eq. (6).

Fig. 4 further investigates the consistency of our network prediction. The left panel shows estimates of the regularization parameter at multiple scales, i.e., $\alpha=0.1, 1.0, 10.0$, over 900 2D synthetic image pairs each. The right panel shows the mean error of image differences between the target images and the source images deformed by transformations with predicted $\alpha$. While there are small variations in the estimated regularization parameters, the registration results are very close (with an averaged error on the order of $10^{-5}$).
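The reported mean error is, in essence, an average intensity difference between the deformed source and the target. A minimal sketch, assuming mean absolute difference as the error measure (the paper does not specify the exact norm):

```python
import numpy as np

def mean_registration_error(deformed, target):
    """Mean absolute intensity difference between the deformed
    source image and the target image, averaged over all pixels."""
    return float(np.mean(np.abs(np.asarray(deformed) - np.asarray(target))))

# Toy example: two 2x2 images differing by 0.01 in a single pixel.
a = np.array([[0.0, 0.5], [0.5, 1.0]])
b = np.array([[0.0, 0.5], [0.5, 0.99]])
err = mean_registration_error(a, b)  # 0.01 / 4 pixels = 0.0025
```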

Refer to caption
Figure 4: Left: our prediction of regularization parameters over 900 image pairs synthesized with different ground truth parameters $\alpha=0.1, 1.0, 10.0$; Right: error maps of image differences between deformed and target images.

Fig. 5 shows examples of 2D pairwise image registration with regularization estimated by MAP and by our predictive deep learning model. We obtain regularization parameters $\alpha=11.34$ (MAP) vs. $\alpha=13.20$ (network prediction), and $\alpha=5.44$ (MAP) vs. $\alpha=6.70$ (network prediction). The error maps of the deformed images indicate that both estimations produce fairly close registration results.

Refer to caption
Figure 5: Left to right: two examples of 2D synthetic source, target, and deformed images by MAP and by our predictive deep learning method, with error maps of deformed image intensity differences.

Fig. 6 displays deformed 3D brain images overlaid with transformation grids for both methods. The parameter estimated by our model produces a comparable registration result. From our observation, a critical pattern between the optimal $\alpha^{opt}$ and its associated image pair is that the value of $\alpha$ is relatively smaller when large deformations occur. This is because the image matching term (encoded in the likelihood) requires a higher weight to deform the source image.

Refer to caption
Figure 6: Left to right: source, target, and deformed images overlaid with the transformation grids generated by both methods. The optimal and predicted values of $\alpha$ for the two showcases are $(8.90, 9.20)$ and $(3.44, 2.60)$.

Fig. 7 investigates the consistency of our network prediction over three different datasets of 3D brain MRI. The left panel shows the absolute numerical differences between predicted regularization parameters and MAP estimations. The right panel shows the voxel-wise mean error between images deformed by transformations with the predicted $\alpha$ and images deformed by MAP. While slight numerical differences in the estimated regularization parameters exist, the 3D deformed images are fairly close (with an averaged voxel-wise error on the order of $10^{-6}$).

Refer to caption
Figure 7: Left: statistics of the difference between our prediction and MAP estimation over 300 image pairs from three datasets, ADNI, LPBA40, and OASIS; Right: error maps of image differences between images deformed by our prediction and by MAP estimation.

Fig. 8 visualizes three views of the deformed brain images (overlaid with propagated segmentation labels) registered by MAP and by our prediction. Our method produces a registration solution that is very close to the one estimated by MAP. The propagated segmentation label aligns well with the target delineation for each anatomical structure. While we show different 2D slice views of the brains, all computations are carried out fully in 3D.

Refer to caption
Figure 8: Axial, coronal, and sagittal views of 3D segmentation labels with nine anatomical structures overlaid with source, target, and deformed images by our low-dimensional MAP ($\alpha=8.91$) and our network prediction ($\alpha=6.80$).

Fig. 9 reports the volume overlap of nine anatomical structures for both methods, including Cor (cortex), Puta (putamen), Cere (cerebellum), Caud (caudate), gyrus, Stem (brain stem), Precun (precuneus), Cun (cuneus), and Hippo (hippocampus). Our method produces Dice scores comparable to the MAP estimations. This indicates that registration using our estimated regularization achieves comparable performance with little to no loss of accuracy.
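The Dice score (Dice, 1945) used in Fig. 9 measures volume overlap per anatomical structure. A minimal sketch; the label values and toy segmentation maps are hypothetical:

```python
import numpy as np

def dice_score(seg_a, seg_b, label):
    """Dice overlap 2|A ∩ B| / (|A| + |B|) for one anatomical label,
    where A and B are the voxel sets carrying that label."""
    a = np.asarray(seg_a) == label
    b = np.asarray(seg_b) == label
    denom = a.sum() + b.sum()
    if denom == 0:
        return 1.0  # label absent in both maps: treat as perfect agreement
    return 2.0 * np.logical_and(a, b).sum() / denom

# Toy 1D example: label 1 occupies 4 voxels in each map, 3 of which overlap.
gt   = np.array([0, 1, 1, 1, 1, 0])
pred = np.array([1, 1, 1, 1, 0, 0])
score = dice_score(gt, pred, label=1)  # 2*3 / (4+4) = 0.75
```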

Refer to caption
Figure 9: Dice scores of propagated manual segmentations for both methods for 150 registration pairs. Evaluations are performed over nine anatomical brain structures, Cor (cortex), Puta (putamen), Cere (cerebellum), Caud (caudate), gyrus, Stem (brain stem), Precun (precuneus), Cun (cuneus) and Hippo (hippocampus).

Table 1 quantitatively reports the averaged time and memory consumption of MAP estimation in the full spatial image domain vs. our method. The proposed predictive model provides appropriate regularization parameters approximately 1000 times faster than the conventional optimization-based registration method, with a much lower memory footprint.

Table 1: Time and memory consumption of MAP estimation of regularization in full-dimensional image space vs. our proposed low-dimensional Bayesian model, as well as network prediction.
Methods         Full-spatial MAP    Low-dimensional MAP    Network Prediction
Runtime (sec)   1901                257                    2.16
Memory (MB)     450                 119                    34.4

5 Conclusion

In this paper, we proposed a deep learning-based approach to model the relationship between the regularization of image registration and the input image data. We first developed a low-dimensional Bayesian model that defines image registration entirely in a bandlimited space. We then learned the mapping between regularization parameters and spatial images through a CNN-based network. To the best of our knowledge, we are the first to predict the optimal regularization parameter of diffeomorphic image registration with deep learning. In contrast to existing methods, our model substantially improves the efficiency and robustness of image registration. Our work has great potential in a variety of clinical applications, e.g., real-time image-guided navigation systems for neurosurgery. Potential future work includes: i) extending the current model to handle adversarial examples, i.e., image outliers with significant differences; and ii) developing unsupervised learning of registration parameter estimation to eliminate the need for training data generation (ground truth labels).


Acknowledgments

This work was supported by a startup funding at the University of Virginia.


Ethical Standards

The work follows appropriate ethical standards in conducting research and writing the manuscript, following all applicable laws and regulations regarding treatment of animals or human subjects.

References

  • Allassonnière and Kuhn (2008) Stéphanie Allassonnière and Estelle Kuhn. Stochastic algorithm for parameter estimation for dense deformable template mixture model. arXiv preprint arXiv:0802.1521, 2008.
  • Allassonnière et al. (2007) Stéphanie Allassonnière, Yali Amit, and Alain Trouvé. Towards a coherent statistical framework for dense deformable template estimation. Journal of the Royal Statistical Society: Series B (Statistical Methodology), 69(1):3–29, 2007.
  • Arsigny et al. (2006) Vincent Arsigny, Olivier Commowick, Xavier Pennec, and Nicholas Ayache. A log-euclidean framework for statistics on diffeomorphisms. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 924–931. Springer, 2006.
  • Ashburner and Friston (2005) John Ashburner and Karl Friston. Unified segmentation. Neuroimage, 26(3):839–851, 2005.
  • Ashburner and Friston (2011) John Ashburner and Karl J Friston. Diffeomorphic registration using geodesic shooting and gauss–newton optimisation. NeuroImage, 55(3):954–967, 2011.
  • Avants et al. (2008) Brian B Avants, Charles L Epstein, Murray Grossman, and James C Gee. Symmetric diffeomorphic image registration with cross-correlation: evaluating automated labeling of elderly and neurodegenerative brain. Medical image analysis, 12(1):26–41, 2008.
  • Balakrishnan et al. (2018) Guha Balakrishnan, Amy Zhao, Mert R Sabuncu, John Guttag, and Adrian V Dalca. An unsupervised learning model for deformable medical image registration. In Proceedings of the IEEE conference on computer vision and pattern recognition, pages 9252–9260, 2018.
  • Beg et al. (2005) MIRZA Faisal Beg, Michael I Miller, Alain Trouvé, and Laurent Younes. Computing large deformation metric mappings via geodesic flows of diffeomorphisms. International journal of computer vision, 61(2):139–157, 2005.
  • Biffi et al. (2020) Carlo Biffi, Juan J Cerrolaza, Giacomo Tarroni, Wenjia Bai, Antonio De Marvao, Ozan Oktay, Christian Ledig, Loic Le Folgoc, Konstantinos Kamnitsas, Georgia Doumou, et al. Explainable anatomical shape analysis through deep hierarchical generative models. IEEE transactions on medical imaging, 39(6):2088–2099, 2020.
  • De Craene et al. (2012) Mathieu De Craene, Gemma Piella, Oscar Camara, Nicolas Duchateau, Etelvino Silva, Adelina Doltra, Jan D’hooge, Josep Brugada, Marta Sitges, and Alejandro F Frangi. Temporal diffeomorphic free-form deformation: Application to motion and strain estimation from 3d echocardiography. Medical image analysis, 16(2):427–450, 2012.
  • Dice (1945) Lee R Dice. Measures of the amount of ecologic association between species. Ecology, 26(3):297–302, 1945.
  • Fotenos et al. (2005) Anthony F Fotenos, AZ Snyder, LE Girton, JC Morris, and RL Buckner. Normative estimates of cross-sectional and longitudinal brain volume decline in aging and ad. Neurology, 64(6):1032–1039, 2005.
  • Gao et al. (2016) Yang Gao, Miaomiao Zhang, Karen Grewen, P Thomas Fletcher, and Guido Gerig. Image registration and segmentation in longitudinal mri using temporal appearance modeling. In 2016 IEEE 13th International Symposium on Biomedical Imaging (ISBI), pages 629–632. IEEE, 2016.
  • Haber and Modersitzki (2006) Eldad Haber and Jan Modersitzki. A multilevel method for image registration. SIAM journal on scientific computing, 27(5):1594–1607, 2006.
  • Haber et al. (2000) Eldad Haber, Uri M Ascher, and Doug Oldenburg. On optimization techniques for solving nonlinear inverse problems. Inverse problems, 16(5):1263, 2000.
  • Hara et al. (2017) Kensho Hara, Hirokatsu Kataoka, and Yutaka Satoh. Learning spatio-temporal features with 3d residual networks for action recognition. In Proceedings of the IEEE International Conference on Computer Vision Workshops, pages 3154–3160, 2017.
  • He et al. (2015) Kaiming He, Xiangyu Zhang, Shaoqing Ren, and Jian Sun. Delving deep into rectifiers: Surpassing human-level performance on imagenet classification. In Proceedings of the IEEE international conference on computer vision, pages 1026–1034, 2015.
  • Hong et al. (2017) Yi Hong, Polina Golland, and Miaomiao Zhang. Fast geodesic regression for population-based image analysis. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 317–325. Springer, 2017.
  • Jack Jr et al. (2008) Clifford R Jack Jr, Matt A Bernstein, Nick C Fox, Paul Thompson, Gene Alexander, Danielle Harvey, Bret Borowski, Paula J Britson, Jennifer L. Whitwell, Chadwick Ward, et al. The alzheimer’s disease neuroimaging initiative (adni): Mri methods. Journal of Magnetic Resonance Imaging: An Official Journal of the International Society for Magnetic Resonance in Medicine, 27(4):685–691, 2008.
  • Jaillet et al. (2005) Léonard Jaillet, Anna Yershova, Steven M La Valle, and Thierry Siméon. Adaptive tuning of the sampling domain for dynamic-domain rrts. In 2005 IEEE/RSJ International Conference on Intelligent Robots and Systems, pages 2851–2856. IEEE, 2005.
  • Jordan et al. (1999) Michael I Jordan, Zoubin Ghahramani, Tommi S Jaakkola, and Lawrence K Saul. An introduction to variational methods for graphical models. Machine learning, 37(2):183–233, 1999.
  • Krebs et al. (2019) Julian Krebs, Hervé Delingette, Boris Mailhé, Nicholas Ayache, and Tommaso Mansi. Learning a probabilistic model for diffeomorphic registration. IEEE transactions on medical imaging, 38(9):2165–2176, 2019.
  • Liao et al. (2016) Ruizhi Liao, Esra A Turk, Miaomiao Zhang, Jie Luo, P Ellen Grant, Elfar Adalsteinsson, and Polina Golland. Temporal registration in in-utero volumetric mri time series. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 54–62. Springer, 2016.
  • Luo et al. (2018) Jie Luo, Matt Toews, Ines Machado, Sarah Frisken, Miaomiao Zhang, Frank Preiswerk, Alireza Sedghi, Hongyi Ding, Steve Pieper, Polina Golland, et al. A feature-driven active framework for ultrasound-based brain shift compensation. arXiv preprint arXiv:1803.07682, 2018.
  • Mang and Biros (2015) Andreas Mang and George Biros. An inexact newton–krylov algorithm for constrained diffeomorphic image registration. SIAM journal on imaging sciences, 8(2):1030–1069, 2015.
  • Mang et al. (2019) Andreas Mang, Amir Gholami, Christos Davatzikos, and George Biros. Claire: A distributed-memory solver for constrained large deformation diffeomorphic image registration. SIAM Journal on Scientific Computing, 41(5):C548–C584, 2019.
  • Polzin et al. (2016) Thomas Polzin, Marc Niethammer, Mattias P Heinrich, Heinz Handels, and Jan Modersitzki. Memory efficient lddmm for lung ct. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 28–36. Springer, 2016.
  • Rueckert et al. (2006) Daniel Rueckert, Paul Aljabar, Rolf A Heckemann, Joseph V Hajnal, and Alexander Hammers. Diffeomorphic registration using b-splines. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 702–709. Springer, 2006.
  • Ruppert et al. (2017) Guilherme CS Ruppert, Giovani Chiachia, Felipe PG Bergo, Fernanda O Favretto, Clarissa L Yasuda, Anderson Rocha, and Alexandre X Falcão. Medical image registration based on watershed transform from greyscale marker and multi-scale parameter search. Computer Methods in Biomechanics and Biomedical Engineering: Imaging & Visualization, 5(2):138–156, 2017.
  • Shattuck et al. (2008) David W Shattuck, Mubeena Mirza, Vitria Adisetiyo, Cornelius Hojatkashani, Georges Salamon, Katherine L Narr, Russell A Poldrack, Robert M Bilder, and Arthur W Toga. Construction of a 3d probabilistic atlas of human cortical structures. Neuroimage, 39(3):1064–1080, 2008.
  • Simonyan and Zisserman (2014) Karen Simonyan and Andrew Zisserman. Very deep convolutional networks for large-scale image recognition. arXiv preprint arXiv:1409.1556, 2014.
  • Simpson et al. (2012) Ivor JA Simpson, Julia A Schnabel, Adrian R Groves, Jesper LR Andersson, and Mark W Woolrich. Probabilistic inference of regularisation in non-rigid registration. NeuroImage, 59(3):2438–2451, 2012.
  • Simpson et al. (2015) Ivor JA Simpson, Manuel Jorge Cardoso, Marc Modat, David M Cash, Mark W Woolrich, Jesper LR Andersson, Julia A Schnabel, Sébastien Ourselin, Alzheimer’s Disease Neuroimaging Initiative, et al. Probabilistic non-linear registration with spatially adaptive regularisation. Medical image analysis, 26(1):203–216, 2015.
  • Toews and Wells (2009) Matthew Toews and William M Wells. Bayesian registration via local image regions: information, selection and marginalization. In International Conference on Information Processing in Medical Imaging, pages 435–446. Springer, 2009.
  • Vaillant et al. (2004) Marc Vaillant, Michael I Miller, Laurent Younes, and Alain Trouvé. Statistics on diffeomorphisms via tangent space representations. NeuroImage, 23:S161–S169, 2004.
  • Valsecchi et al. (2013) Andrea Valsecchi, Jérémie Dubois-Lacoste, Thomas Stützle, Sergio Damas, José Santamaria, and Linda Marrakchi-Kacem. Evolutionary medical image registration using automatic parameter tuning. In 2013 IEEE Congress on Evolutionary Computation, pages 1326–1333. IEEE, 2013.
  • Vialard et al. (2012) François-Xavier Vialard, Laurent Risser, Daniel Rueckert, and Colin J Cotter. Diffeomorphic 3d image registration via geodesic shooting using an efficient adjoint calculation. International Journal of Computer Vision, 97(2):229–241, 2012.
  • Wang and Zhang (2021) Jian Wang and Miaomiao Zhang. Bayesian atlas building with hierarchical priors for subject-specific regularization. arXiv e-prints, pages arXiv–2107, 2021.
  • Wang et al. (2018) Jian Wang, William M Wells, Polina Golland, and Miaomiao Zhang. Efficient laplace approximation for bayesian registration uncertainty quantification. In International Conference on Medical Image Computing and Computer-Assisted Intervention, pages 880–888. Springer, 2018.
  • Wang et al. (2019) Jian Wang, William M Wells III, Polina Golland, and Miaomiao Zhang. Registration uncertainty quantification via low-dimensional characterization of geometric deformations. Magnetic Resonance Imaging, 64:122–131, 2019.
  • Wells III et al. (1996) William M Wells III, Paul Viola, Hideki Atsumi, Shin Nakajima, and Ron Kikinis. Multi-modal volume registration by maximization of mutual information. Medical image analysis, 1(1):35–51, 1996.
  • Xing et al. (2019) Jiarui Xing, Ulugbek Kamilov, Wenjie Wu, Yong Wang, and Miaomiao Zhang. Plug-and-play priors for reconstruction-based placental image registration. In Smart Ultrasound Imaging and Perinatal, Preterm and Paediatric Image Analysis, pages 133–142. Springer, 2019.
  • Yang et al. (2018) Chengliang Yang, Anand Rangarajan, and Sanjay Ranka. Visual explanations from deep 3d convolutional neural networks for alzheimer’s disease classification. In AMIA Annual Symposium Proceedings, volume 2018, page 1571. American Medical Informatics Association, 2018.
  • Yang et al. (2017) Xiao Yang, Roland Kwitt, Martin Styner, and Marc Niethammer. Quicksilver: Fast predictive image registration–a deep learning approach. NeuroImage, 158:378–396, 2017.
  • Younes et al. (2009) Laurent Younes, Felipe Arrate, and Michael I Miller. Evolutions equations in computational anatomy. NeuroImage, 45(1):S40–S50, 2009.
  • Zhang and Fletcher (2015) Miaomiao Zhang and P Thomas Fletcher. Finite-dimensional lie algebras for fast diffeomorphic image registration. In International Conference on Information Processing in Medical Imaging, pages 249–260. Springer, 2015.
  • Zhang and Fletcher (2019) Miaomiao Zhang and P Thomas Fletcher. Fast diffeomorphic image registration via fourier-approximated lie algebras. International Journal of Computer Vision, 127(1):61–73, 2019.
  • Zhang et al. (2013) Miaomiao Zhang, Nikhil Singh, and P Thomas Fletcher. Bayesian estimation of regularization and atlas building in diffeomorphic image registration. In International conference on information processing in medical imaging, pages 37–48. Springer, 2013.
  • Zhang et al. (2016) Miaomiao Zhang, William M Wells, and Polina Golland. Low-dimensional statistics of anatomical variability via compact representation of image deformations. In International conference on medical image computing and computer-assisted intervention, pages 166–173. Springer, 2016.
  • Zhang et al. (2017) Miaomiao Zhang, Ruizhi Liao, Adrian V Dalca, Esra A Turk, Jie Luo, P Ellen Grant, and Polina Golland. Frequency diffeomorphisms for efficient image registration. In International conference on information processing in medical imaging, pages 559–570. Springer, 2017.
  • Zöllei et al. (2007) Lilla Zöllei, Mark Jenkinson, Samson Timoner, and William Wells. A marginalized map approach and em optimization for pair-wise registration. In Biennial International Conference on Information Processing in Medical Imaging, pages 662–674. Springer, 2007.