Choice of training label matters: how to best use deep learning for quantitative MRI parameter estimation

Sean C. Epstein1,2, Timothy J. P. Bray3, Margaret Hall-Craggs3, Hui Zhang1,2
1: Department of Computer Science, University College London, 2: Centre for Medical Image Computing, University College London, 3: Centre for Medical Imaging, University College London
Publication date: 2024/01/23
https://doi.org/10.59275/j.melba.2024-geb5

Abstract

Deep learning (DL) is gaining popularity as a parameter estimation method for quantitative MRI. A range of competing implementations have been proposed, relying on either supervised or self-supervised learning. Self-supervised approaches, sometimes referred to as unsupervised, have been loosely based on auto-encoders, whereas supervised methods have, to date, been trained on groundtruth labels. These two learning paradigms have been shown to have distinct strengths. Notably, self-supervised approaches offer lower-bias parameter estimates than their supervised alternatives. This result is counterintuitive – incorporating prior knowledge with supervised labels should, in theory, lead to improved accuracy. In this work, we show that this apparent limitation of supervised approaches stems from the naïve choice of groundtruth training labels. By training on labels which are deliberately not groundtruth, we show that the low-bias parameter estimation previously associated with self-supervised methods can be replicated – and improved on – within a supervised learning framework. This approach sets the stage for a single, unifying, deep learning parameter estimation framework, based on supervised learning, where trade-offs between bias and variance are made by careful adjustment of training label.

Keywords

quantitative mri · diffusion mri · deep learning




1 Introduction

Magnetic resonance imaging (MRI) is widely regarded as the premier clinical imaging modality, in large part due to the unparalleled range of contrast mechanisms available to it. Conventional MRI exploits this contrast in a purely qualitative manner: images provide only relative information, such that voxel intensities are only meaningful in the context of their neighbours. In contrast, quantitative MRI (qMRI) provides quantitative images, where voxel intensities are directly, and meaningfully, related to underlying tissue properties. Compared to conventional MRI, this approach promises increased reproducibility, interpretability, and tissue insight, at the cost of time-intensive image acquisition and post-processing (Cercignani et al., 2018).

One of the biggest time and resource bottlenecks in post-processing is parameter estimation, whereby a signal model is fit to the intensity variation across multiple MR images acquired at different experimental settings. Each voxel requires its own independent model fit: solving for the signal model parameters that best describe that voxel's data. The computational cost of this curve-fitting process scales with both voxel number and model complexity, and has become substantial for modern qMRI experiments.

Accelerating curve fittings with deep learning (DL) was first proposed more than 30 years ago (Bishop and Roach, 1992), but has only recently gained popularity within the qMRI community (Golkov et al., 2016; Bertleff et al., 2017; Liu et al., 2020; Barbieri et al., 2020; Palombo et al., 2020). Just like traditional methods, DL relies on model fitting, but the model being fit is a fundamentally different one. Instead of fitting a qMRI signal model to a single voxel of interest (i.e. curve fitting), DL methods fit (“train”) a deep neural network (DNN) model to an ensemble of training voxels. This model maps a single voxel’s signal to its corresponding qMRI parameters; the unknowns in its fitting are network weights, rather than qMRI parameters. Once this DNN model has been fit to (“trained on”) the training data, parameter estimation is reduced to simply applying it to new unseen data, one voxel at a time. This approach offers two broad advantages over traditional fitting: (1) computational cost is amortised: despite being more computationally expensive than one-voxel signal model fitting, DL training only needs to be performed once, for any number of voxels; once trained, networks provide near-instantaneous parameter estimates on new data, and (2) computational cost is front-loaded: model training can be performed away from the clinic, before patient data is acquired.

To date, most DL qMRI fitting methods have been implemented within a supervised learning framework (Golkov et al., 2016; Bertleff et al., 2017; Yoon et al., 2018; Liu et al., 2020; Palombo et al., 2020; Aliotta et al., 2021; Yu et al., 2021; Gyori et al., 2022). This approach trains DNNs to predict groundtruth qMRI model parameters from noisy qMRI signals. When compared to conventional fitting, this approach has been found to produce high bias, low variance parameter estimates (Grussu et al., 2021; Gyori et al., 2022).

An alternative class of DL methods has also been proposed, sometimes referred to as unsupervised learning (Barbieri et al., 2020; Kaandorp et al., 2021), but more accurately described as self-supervised (Murphy, 2022). In this framework, training labels are not explicitly provided, but are instead extracted by the network from its training input. This label generation is designed such that the network learns to predict signal model parameters corresponding to noise-free signals that most closely approximate noisy inputs. This self-supervised approach has been found to produce similar results to conventional non-DL fitting, i.e. lower bias and higher variance than its groundtruth-labelled supervised alternative (Barbieri et al., 2020; Grussu et al., 2021).

From an information theoretic standpoint, the comparison between supervised and self-supervised performance raises an obvious unanswered question. How can it be that supervised methods, which provide strictly more information during training than their self-supervised counterparts, produce more biased parameter estimates?

In this work we answer this question by showing that this apparent limitation of supervised approaches stems purely from the selection of groundtruth training labels. By using training labels which are deliberately not groundtruth, pre-computed via independent maximum likelihood estimation, we show that the low-bias parameter estimation previously associated with self-supervised methods can be replicated – and improved on – within a supervised learning framework.

This approach sets the stage for a single, unifying, deep learning parameter estimation framework, based on supervised learning, where trade-offs between bias and variance can be made, on an application-specific basis, by careful adjustment of training label.

The rest of the paper is organized as follows: Section 2 describes existing DL parameter estimation approaches, our proposed method, and how they relate to each other; Section 3 describes the evaluation of our method and its comparison to the state of the art; Section 4 contains our findings; and Section 5 summarizes the contribution and discusses future work.

2 Theory

Quantitative MRI extracts biomarkers $y$ from MR data $x$, producing quantitative spatial maps. We here describe existing voxelwise approaches to this problem (conventional fitting and DL alternatives) as well as our proposed novel method.

2.1 Conventional iterative fitting

This method, which relies on maximum likelihood estimation (MLE), extracts biomarkers by performing a voxelwise model fit every time new data is acquired. An appropriate signal model $M$ is required, parameterised by $n_y$ parameters of interest; for each combination of $y$, the probability of observing the acquired data $x$ is known as the likelihood $L$ of those parameters:

$$L(x, z \mid y, \epsilon) = \prod_{i=1}^{n_z} P(x_i, z_i \mid y, \epsilon) \tag{1}$$

for $n_z$ acquisitions from sampling scheme $z$ and noise model $\epsilon$. The model parameters $\hat{y}$ which maximise the likelihood $L$ are assumed to best represent the tissue contained within the voxel of interest:

$$\hat{y} = \operatorname*{arg\,max}_{y} L(x, z \mid y, \epsilon) \tag{2}$$

Under a Gaussian noise model, this likelihood maximisation reduces to the commonly-used non-linear least squares (NLLS):

$$\hat{y} = \operatorname*{arg\,min}_{y} \sum_{i=1}^{n_z} \lVert M(z_i \mid y) - x_i \rVert^2 \tag{3}$$

under the assumption of signal model $M$ associated with groundtruth biomarkers $y_{gt}$, sampling scheme $z$, and noise $\epsilon$:

$$x = M(z \mid y_{gt}) + \epsilon \tag{4}$$

Each of these optimisations has $n_y$ unknowns, which are solved for independently across different voxels; the computational cost scales linearly with the number of voxels $n_v$.
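For illustration, the sketch below shows a minimal per-voxel NLLS fit (Equation 3) using scipy; the mono-exponential signal model, bounds, and starting point are illustrative assumptions rather than details drawn from this work.

```python
import numpy as np
from scipy.optimize import minimize

def nlls_fit(signal_model, x, z, y0, bounds):
    """Single-voxel NLLS fit (Equation 3): minimise the sum of squared
    differences between the measured signal x and the model prediction.
    Equivalent to MLE under a Gaussian noise model."""
    sse = lambda y: np.sum((signal_model(z, y) - x) ** 2)
    return minimize(sse, y0, bounds=bounds, method="L-BFGS-B").x

# Illustrative mono-exponential model with y = [S0, D]
model = lambda z, y: y[0] * np.exp(-z * y[1])

z = np.array([0.0, 200.0, 400.0, 800.0])                 # sampling scheme
x = model(z, [1.0, 1.5e-3]) + 0.02 * np.random.randn(4)  # one noisy voxel
y_hat = nlls_fit(model, x, z, y0=[1.0, 1e-3], bounds=[(0, 2), (0, 5e-3)])
```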

Developments in qMRI acquisition and analysis have led to increased (i) image spatial resolution (i.e. greater $n_v$) and (ii) model complexity (i.e. greater $n_y$), such that conventional MLE fitting has become increasingly computationally expensive.

2.2 Existing deep learning methods

Deep learning approaches address this by reframing $n_v$ independent problems into a single global model fit: learning the function $\mathcal{F}$ that maps any $x$ to its corresponding $y_{gt}$:

$$y_{gt} = \mathcal{F}(x) \tag{5}$$

Deep neural networks aim to approximate this function by composing a large but finite number of building-block functions, parametrised by $n_p$ network parameters $p$ ("weights"):

$$\hat{y} = \hat{\mathcal{F}}(x \mid \hat{p}) \tag{6}$$

In this context, model fitting ("training") is performed over network parameters $p$ and involves maximising $\hat{\mathcal{F}}$'s mean performance over a large set of training examples; the trained network is defined by the best-fit parameters $\hat{p}$. This fitting problem, whilst more computationally expensive to solve than any individual single-voxel ($n_v = 1$) MLE, is only tackled once; once $\hat{\mathcal{F}}$ is learnt, it can be applied at negligible cost to new, unseen, data. This promise of rapid, zero-cost parameter estimation has led to the development of two broad classes of DL-based parameter estimation methods.

SupervisedGT methods approximate $\mathcal{F}$ by minimising the difference between a large number of noise-free training labels (groundtruth parameter values) and corresponding network outputs (noise-free parameter estimates); training loss is calculated in the parameter space $Y$:

$$\text{Supervised}_{GT}\text{ training loss} = \sum_{i=1}^{n_{train}} \lVert W \cdot (\hat{y}_i - y_{gt,i}) \rVert^2 \tag{7}$$

where $n_{train}$ is the number of training samples and $W$ is a tunable weight matrix which accounts for magnitude differences in signal model parameters. $W$ is generally a diagonal matrix, with each diagonal element $W_{ii}$ corresponding to the relative weighting of qMRI parameter $y_i$; setting $W$ to the identity matrix weights all parameters equally in the training loss.
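As a concrete illustration, the following PyTorch sketch implements the weighted parameter-space loss of Equation 7; the batch size and parameter count are illustrative assumptions, and `w` stores the diagonal of $W$.

```python
import torch

def supervised_gt_loss(y_pred, y_gt, w):
    """Equation 7: weighted squared error between network outputs and
    groundtruth labels, computed in parameter space Y."""
    return torch.sum((w * (y_pred - y_gt)) ** 2)

n_y = 4
w = torch.ones(n_y)             # W = identity: equal parameter weighting
y_pred = torch.rand(128, n_y)   # network outputs for one mini-batch
y_gt = torch.rand(128, n_y)     # groundtruth training labels
loss = supervised_gt_loss(y_pred, y_gt, w)
```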

These methods produce higher bias, lower variance parameter estimation than conventional MLE fitting (Grussu et al., 2021; Gyori et al., 2022) and, by adjusting $W$, can be tailored to selectively boost estimation performance on a subset of the parameter space $Y$.

In contrast, Self-supervised methods compute training loss within the signal space $X$, by minimising the difference between network inputs (noisy signals) and a filtered representation of network outputs (noise-free signal estimates):

$$\text{Self-supervised training loss} = \sum_{i=1}^{n_{train}} \lVert M(z \mid \hat{y}_i) - x_i \rVert^2 \tag{8}$$

These methods, which perform similarly to conventional MLE fitting, produce lower bias, higher variance parameter estimation than SupervisedGT (Grussu et al., 2021; Barbieri et al., 2020). Unlike SupervisedGT, the relative loss weighting of different signal model parameters is dictated by sampling scheme $z$.
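The sketch below illustrates Equation 8 in PyTorch: the signal model $M$ sits between network output and loss, and must therefore be written in a vectorised, differentiable form. The mono-exponential model used here is an illustrative stand-in, not the model studied in this work.

```python
import torch

def self_supervised_loss(signal_model, y_pred, x, z):
    """Equation 8: loss computed in signal space X; gradients must flow
    through the signal model M during backpropagation."""
    x_pred = signal_model(z, y_pred)    # noise-free signal estimates
    return torch.sum((x_pred - x) ** 2)

# Illustrative differentiable model: y[:, 0] = S0, y[:, 1] = D
model = lambda z, y: y[:, 0:1] * torch.exp(-z[None, :] * y[:, 1:2])

z = torch.tensor([0.0, 200.0, 400.0, 800.0])     # sampling scheme
y_pred = torch.rand(128, 2, requires_grad=True)  # network outputs
x = torch.rand(128, 4)                           # noisy input signals
self_supervised_loss(model, y_pred, x, z).backward()
```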

Under Gaussian noise conditions, single-voxel Self-supervised loss (i.e. minimising the sum of squared differences between a noisy signal and its noise-free signal estimate) is indistinguishable from the corresponding objective function in conventional fitting.

In contrast, under the Rician noise conditions encountered in MRI acquisition (Gudbjartsson and Patz, 1995), Self-supervised training loss no longer matches conventional fitting. Indeed, the sum of squared errors between noisy signals and noise-free estimates is not an accurate difference metric in the presence of Rician noise.

To summarise: existing supervised DL techniques are associated with high estimation bias, low variance, and end-user flexibility; in contrast, self-supervised methods have lower bias and higher variance, but are limited by the fact that their loss is calculated in the signal space $X$.

2.3 Proposed deep learning method

In light of this, we propose SupervisedMLE, a novel parameter estimation method which combines the advantages of SupervisedGT and Self-supervised methods. This method is contrasted with existing techniques in Fig 1.

This method mimics Self-supervised's low-bias performance by learning a regularised form of conventional MLE, but does so in the parameter space $Y$, within a supervised learning framework. This addresses the limitations of Self-supervised: Rician noise modelling is incorporated, and parameter loss weighting is not limited by sampling scheme $z$.

Our method learns $\hat{\mathcal{F}}$ by training on noisy signals paired with conventional MLE labels. These labels act as proxies for the groundtruth parameters we wish to estimate:

$$\text{Supervised}_{MLE}\text{ training loss} = \sum_{i=1}^{n_{train}} \lVert W \cdot (\hat{y}_i - y_{MLE,i}) \rVert^2 \tag{9}$$

where $y_{MLE,i}$ is the maximum likelihood estimate associated with the $i^{th}$ training sample.
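A minimal sketch of how such labels might be pre-computed is given below, assuming Rician noise with known scale $\sigma$; the numerical-stability trick and optimiser choice are illustrative assumptions (the fitting procedure actually used is described in §3.3).

```python
import numpy as np
from scipy.optimize import minimize
from scipy.special import i0e

def rician_nll(y, x, z, sigma, signal_model):
    """Negative Rician log-likelihood of parameters y given a noisy
    magnitude signal x; uses the exponentially-scaled Bessel function
    i0e for numerical stability, since log I0(t) = log(i0e(t)) + t."""
    nu = signal_model(z, y)                      # noise-free prediction
    t = x * nu / sigma**2
    log_lik = (np.log(x / sigma**2) - (x**2 + nu**2) / (2 * sigma**2)
               + np.log(i0e(t)) + t)
    return -np.sum(log_lik)

def mle_label(x, z, sigma, signal_model, y0, bounds):
    """Pre-compute one y_MLE training label for Equation 9."""
    fit = minimize(rician_nll, y0, args=(x, z, sigma, signal_model),
                   bounds=bounds, method="L-BFGS-B")
    return fit.x
```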

Our method offers one final advantage over Self-supervised approaches. In addition to the parameter estimation improvements relating to noise model correction and parameter loss weighting, it naturally interfaces with SupervisedGT. In so doing, it presents the opportunity to combine low-bias and low-variance methods into a single, tunable hybrid approach, by a simple weighted sum of each method’s loss function:

$$\text{Hybrid training loss} = \alpha \cdot \text{Supervised}_{MLE}\text{ loss} + (1 - \alpha) \cdot \text{Supervised}_{GT}\text{ loss} \tag{10}$$
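In code, this hybrid objective is a simple convex combination of the two supervised losses; the sketch below assumes, as for our synthetic training data, that both groundtruth and MLE labels are available for every training sample.

```python
import torch

def hybrid_loss(y_pred, y_gt, y_mle, w, alpha):
    """Equation 10: alpha = 1 recovers pure SupervisedMLE (low bias);
    alpha = 0 recovers pure SupervisedGT (low variance)."""
    loss_mle = torch.sum((w * (y_pred - y_mle)) ** 2)
    loss_gt = torch.sum((w * (y_pred - y_gt)) ** 2)
    return alpha * loss_mle + (1 - alpha) * loss_gt
```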
Figure 1: Comparison between our proposed method (SupervisedMLE) and existing supervised and self-supervised approaches.

3 Experimental evaluation

Three classes of network were investigated and compared: SupervisedGT, Self-supervised, and SupervisedMLE, as described in Fig 1. Additionally, to control for differences in loss function weighting between supervised and self-supervised methods, Self-supervised was converted into supervised form by training SupervisedMLE on Gaussian-model-based MLE labels. All models are summarised in Table 1.

Table 1: Summary of evaluated parameter estimation networks. $Y$ denotes parameter space; $X$ denotes signal space.

Network name              Loss space   Label         Label noise model
SupervisedGT              $Y$          Groundtruth   N/A
Self-supervised           $X$          N/A           N/A
SupervisedMLE, Rician     $Y$          MLE           Rician
SupervisedMLE, Gaussian   $Y$          MLE           Gaussian

All networks were trained and tested on the same datasets; differences in performance can be attributed solely to differences in loss function formulation and training label selection.

3.1 Signal model

The intravoxel incoherent motion (IVIM) model (Le Bihan et al., 1986) was investigated as an exemplar 4-parameter non-linear qMRI model which poses a non-trivial model fitting problem and is well-represented in the DL qMRI literature (Bertleff et al., 2017; Barbieri et al., 2020; Kaandorp et al., 2021; Mastropietro et al., 2022; Rozowski et al., 2022):

$$S(b \mid S_0, f, D_{slow}, D_{fast}) = S_0 \left( f e^{-b(D_{fast} + D_{slow})} + (1 - f) e^{-b D_{slow}} \right) \tag{11}$$

where $S$ corresponds to the signal model $M$, $b$ corresponds to the sampling scheme $z$, and $[S_0, f, D_{slow}, D_{fast}]$ corresponds to the parameter vector $y$. In physical terms, IVIM is a two-compartment diffusion model, wherein signal decay arises from both molecular self-diffusion (described by $D_{slow}$) and perfusion-induced 'pseudo-diffusion' (described by $D_{fast}$). In Equation 11, $S_0$ is an intensity normalisation factor and $f$ denotes the signal fraction corresponding to the perfusing compartment.
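A direct PyTorch transcription of Equation 11 might read as follows; it is written in the batched, differentiable form that would be required if the model were embedded in a Self-supervised loss, with tensor shapes as illustrative assumptions.

```python
import torch

def ivim_signal(b, y):
    """Equation 11: IVIM signal for a batch of parameter vectors
    y = [S0, f, D_slow, D_fast] (shape [n_batch, 4]), sampled at
    b-values b (shape [n_b], in s/mm^2; diffusivities in mm^2/s)."""
    s0, f, d_slow, d_fast = y[:, 0:1], y[:, 1:2], y[:, 2:3], y[:, 3:4]
    b = b[None, :]                               # broadcast over the batch
    return s0 * (f * torch.exp(-b * (d_fast + d_slow))
                 + (1 - f) * torch.exp(-b * d_slow))
```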

3.2 Network architecture

Network architecture was harmonised across all network variants, and represents a common choice in the existing qMRI literature (Barbieri et al., 2020): 3 fully connected hidden layers, each with a number of nodes matching the number of signal samples $z$ (i.e. b-values), and an output layer with a number of nodes matching the number of model parameters. Wider (150 nodes per layer) and deeper (10 hidden layers) networks were investigated and found to have equivalent performance, during both training and testing, at the cost of increased training time. All networks were implemented in PyTorch 1.9.0 with exponential linear unit (ELU) activation functions (Clevert et al., 2015); ELU performance is similar to ReLU, but is more robust to poor network weight initialisation.
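In PyTorch, this architecture might be expressed as below; this is a sketch consistent with the description above rather than an excerpt from our released code.

```python
import torch.nn as nn

n_b, n_y = 10, 4   # 10 b-values (see §3.3), 4 IVIM parameters

# 3 fully connected hidden layers of width n_b with ELU activations,
# followed by a linear output layer of width n_y.
net = nn.Sequential(
    nn.Linear(n_b, n_b), nn.ELU(),
    nn.Linear(n_b, n_b), nn.ELU(),
    nn.Linear(n_b, n_b), nn.ELU(),
    nn.Linear(n_b, n_y),
)
```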

3.3 Training data

Training datasets were generated at SNR $= [15, 30]$ to investigate parameter estimation performance at both high and low noise levels. At each SNR, 100,000 noise-free signals were generated from uniform IVIM parameter distributions ($S_0 \in [0.8, 1.2]$, $f \in [0.1, 0.5]$, $D_{slow} \in [0.4, 3.0] \times 10^{-3}\,mm^2/s$, $D_{fast} \in [10, 150] \times 10^{-3}\,mm^2/s$, representing realistic tissue values), sampling them with a real-world acquisition protocol (Zhao et al., 2015) ($b = [0, 10, 20, 30, 50, 80, 100, 200, 400, 800]\,s/mm^2$), and adding Rician noise. Training data generative parameters were drawn from uniform, rather than in-vivo, parameter distributions to minimise bias in network parameter estimation (Gyori et al., 2022). Data were split 80/20 between training and validation. MLE labels were calculated using a bound-constrained non-linear fitting algorithm, implemented with scipy.optimize.minimize, using either Rician log-likelihood (for SupervisedMLE, Rician) or sum of squared errors (for SupervisedMLE, Gaussian) as the fitting objective function. This algorithm was initialised with groundtruth values (i.e. generative $y$) to improve fitting robustness and avoid local minima. Training/validation samples associated with 'poor' MLE labels (defined as lying on the boundary of the bound-constrained estimation space) were held out during training and ignored during validation.
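The sketch below illustrates this generation pipeline (uniform parameter draws, noise-free IVIM signals, Rician noise). It assumes SNR is defined relative to a b = 0 signal of magnitude ≈ 1, so that $\sigma = 1/\mathrm{SNR}$; this is a plausible reading rather than a detail stated above.

```python
import numpy as np

rng = np.random.default_rng(0)
n_train = 100_000
b = np.array([0, 10, 20, 30, 50, 80, 100, 200, 400, 800], dtype=float)

# Uniform generative parameter draws: [S0, f, D_slow, D_fast]
y = np.stack([rng.uniform(0.8, 1.2, n_train),
              rng.uniform(0.1, 0.5, n_train),
              rng.uniform(0.4e-3, 3.0e-3, n_train),
              rng.uniform(10e-3, 150e-3, n_train)], axis=1)

# Noise-free IVIM signals (Equation 11), vectorised over voxels
s0, f, ds, df = (y[:, i:i + 1] for i in range(4))
s = s0 * (f * np.exp(-b * (df + ds)) + (1 - f) * np.exp(-b * ds))

# Rician noise: magnitude of the signal plus complex Gaussian noise
sigma = 1.0 / 15                                 # SNR = 15
noise = sigma * (rng.standard_normal(s.shape)
                 + 1j * rng.standard_normal(s.shape))
x = np.abs(s + noise)
```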

3.4 Network training

Network training was performed using an Adam optimiser (learning rate = 0.001, betas = (0.9, 0.999), weight decay = 0) as follows: SupervisedGT (at SNR 30) was trained 16 times on the same data, each time initialising with different network weights, to improve robustness to local minima during training. From this set of trained networks, a single SupervisedGT network was selected on the basis of validation loss. The trained weights of this selected network were subsequently used to initialise all other networks; in this way, any differences in network performance could be solely attributed to differences in training label selection and training loss formulation. In the case of supervised loss formulations, the inter-parameter weight vector $W$ was chosen as the inverse of each parameter's mean value over the training set, to obtain equal loss weighting across all four IVIM parameters.
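A compact sketch of this restart-and-select scheme is given below, using toy stand-in data, full-batch updates, and an illustrative epoch count in place of the actual training configuration.

```python
import copy
import torch
import torch.nn as nn

def make_net(n_b=10, n_y=4):
    return nn.Sequential(nn.Linear(n_b, n_b), nn.ELU(),
                         nn.Linear(n_b, n_b), nn.ELU(),
                         nn.Linear(n_b, n_b), nn.ELU(),
                         nn.Linear(n_b, n_y))

# Toy stand-ins for the synthetic training/validation split of §3.3
x_train, y_train = torch.rand(800, 10), torch.rand(800, 4)
x_val, y_val = torch.rand(200, 10), torch.rand(200, 4)
w = torch.ones(4)
loss_fn = lambda y_hat, y: torch.sum((w * (y_hat - y)) ** 2)

best_net, best_val = None, float("inf")
for seed in range(16):                        # 16 random restarts
    torch.manual_seed(seed)
    net = make_net()
    opt = torch.optim.Adam(net.parameters(), lr=0.001,
                           betas=(0.9, 0.999), weight_decay=0)
    for _ in range(100):                      # illustrative epoch count
        opt.zero_grad()
        loss_fn(net(x_train), y_train).backward()
        opt.step()
    with torch.no_grad():
        val_loss = loss_fn(net(x_val), y_val).item()
    if val_loss < best_val:                   # select on validation loss
        best_net, best_val = copy.deepcopy(net), val_loss
```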

3.5 Testing data

Networks were tested on both synthetic and real qMRI data. The synthetic approach offers (i) known parameter groundtruths to assess estimation against, (ii) arbitrarily large datasets, and (iii) tunable data distributions, but is based on possibly simplified qMRI signals. This approach was used to assess parameter estimation performance in a controlled, rigorous manner; real data was subsequently used to validate the trends observed in silico.

Synthetic data was generated with sampling, parameter distributions, and noise levels matching those used in network training. The IVIM parameter space in which the networks were trained was uniformly sub-divided 10 times in each dimension, to analyse estimation performance as a function of parameter value. At each point in the parameter space, 500 corresponding noisy signals were generated and used to test network performance, accounting for variation under noise repetition.

Real data was acquired from the pelvis of a healthy volunteer, who gave informed consent, on a wide-bore 3.0T clinical system (Ingenia, Philips, Amsterdam, Netherlands): 5 slices, 224 x 224 matrix, voxel size = 1.56 x 1.56 x 7mm, TE = 76ms, TR = 516ms, scan time = 44s for the 10 b-values listed in §3.3. For the purposes of assessing parameter estimation methods, we obtained gold standard voxelwise IVIM parameter estimates from a supersampled dataset (16-fold repetition of the above acquisition, within a single scanning session, generating 160 b-values, total scan time = 11m44s). Conventional MLE was performed on this supersampled data to produce best-guess “groundtruth” parameters. During testing, the supersampled dataset was split into 16 distinct 10 b-value acquisitions, each corresponding to a single realistic clinical acquisition. All images were visually confirmed to be free from motion artefacts. The mismatch in parameter distributions between this in-vivo data (highly non-uniform) and the previously-described synthetic data (uniform by construction) limited the scope for validating our in-silico results. To address this, a final synthetic testing dataset was generated from the in-vivo MLE-derived “groundtruth” parameters, and was used for direct comparison between real and simulated data.

3.6 Evaluation metrics

Parameter estimation performance was evaluated using 3 key metrics: (i) mean bias with respect to groundtruth, (ii) mean standard deviation under noise repetition, and (iii) root mean squared error (RMSE) with respect to groundtruth. RMSE is the most commonly used metric to evaluate estimation performance (Barbieri et al., 2020; Bertleff et al., 2017), but is limited in its ability to disentangle accuracy and precision; to this end, mean bias and standard deviation were used as more specific measures of network performance.

It is important to note that all methods were assessed with respect to groundtruth qMRI parameters, even those trained on MLE labels. For these methods, the training and validation loss (MLE-based) differed from the reported testing loss (groundtruth-based).

4 Results & discussion

This section summarises our main findings and discusses the advantages offered by the parameter estimation method we propose.

4.1 Comparison of parameter estimation methods

The relative performance of all previously-discussed parameter estimation methods is summarised in Figures 2 and 3. These figures show the bias, variance (represented by its square root: standard deviation), and RMSE of parameter estimates with respect to groundtruth values, reported for each model parameter as a function of its value over the synthetic test dataset; each plotted point represents an average over 500 noise instantiations and a marginalisation over all non-visualised parameters. Marginalisation was required for visualisation of a 4-dimensional parameter space, but was confirmed to be representative of the entire, non-marginalised space, as discussed in §4.5.

Figure 2: Parameter estimation performance at low SNR (15) as a function of groundtruth parameter $Y$. Performance summarised by bias & RMSE with respect to groundtruth and standard deviation with respect to noise repetition. Conventional MLE fitting is provided as a non-DNN reference standard. For the sake of visualisation, each plotted point represents marginalisation over all non-specified $Y$ dimensions.
Figure 3: Parameter estimation performance, visualised as in Figure 2, but for high SNR (30) data.

In keeping with previously reported results, we show a bias/variance trade-off between different parameter estimation methods. Conventional MLE fitting is provided as a reference (plotted in black). Approaches which, on a theoretical level, approximate conventional MLE (Self-supervised and SupervisedMLE, plotted in red), are generally associated with low bias, high variance, and high RMSE, whereas groundtruth-labelled supervised methods (plotted in blue) exhibit lower variance and RMSE at the cost of increased bias.

Increases in bias, if consistent across parameter space $Y$, do not necessarily reduce sensitivity to differences in underlying tissue properties. However, we show that SupervisedGT is associated with bias that varies significantly as a function of groundtruth parameter values. This results in a reduction in information content, visualised as the gradient of the bias plots (top row) in Fig 2. The more negative the gradient, the more parameter estimates are concentrated in the centre of the parameter estimation space $\hat{Y}$, and the lower the ability of the method to distinguish differences in tissue properties. This information loss can be seen in Fig 4, which compares SupervisedGT to conventional MLE fitting, and shows the compression in $\hat{Y}$ over the groundtruth parameter space $Y$.

Figure 4: Comparison between SupervisedGT and reference conventional MLE fitting, expressed in terms of estimation bias and information compression at low SNR (15). Arrows represent the mean mapping from $Y$ to $\hat{Y}$, averaged over noise, as a function of parameter space $Y$. For the sake of visualisation, each plotted point represents marginalisation over all non-specified $Y$ dimensions.

4.2 Validation against clinical data

The above trends, found in simulation, were also observed in real-world data. Fig 5 shows the bias, variance, and RMSE of parameter estimates with respect to “groundtruth” values (obtained from the supersampled dataset described in §3.5). The x axes of these plots correspond to these reference values. To aid visualisation, 10 uniform bins were constructed along each parameter dimension, into which clinical voxels were assigned based on their “groundtruth” parameter values. Fig 5 plots the mean bias, standard deviation, and RMSE associated with each bin as a function of the bin’s central value, together with the distribution of voxels across the 10 bins.

By calculating the variance of the 16 $b = 0$ images, the SNR of this clinical dataset was found to be ∼15; Fig 2 is therefore the relevant point of comparison. It can be readily seen that the trends observed in simulated data, described in §4.1, are replicated for $f < 0.40$, $D_{slow} < 1.5$, and the entire range of $D_{fast}$, namely the regions of parameter space which are well-represented in the real-world data. Fig 6 confirms that divergence outside of these ranges is due to under-representation in the in vivo test data; the apparent divergences can be replicated in-silico by matching real-world parameter distributions.

Fig 7 contains exemplar parameter maps from the clinical test data, and shows the real-world implications of the trends summarised in Figures 2 and 5: SupervisedGT’s low-variance, low-RMSE parameter estimation results in artificially smooth IVIM maps biased towards mean parameter values.

Figure 5: In vivo parameter estimation performance of networks trained on low SNR (15) synthetic data, as a function of supersampling-derived reference parameter values. The first three rows summarise performance by showing bias & RMSE with respect to reference value and standard deviation with respect to noise repetition, marginalised over all non-specified $Y$ dimensions. The bottom row shows the distribution of reference parameter values across the parameter range being visualised.
Figure 6: Parameter estimation performance of networks trained on low SNR (15) synthetic data, tested on a synthetic dataset matching the distribution of in vivo reference parameter values. The first three rows summarise performance by showing bias & RMSE with respect to groundtruth value and standard deviation with respect to noise repetition, marginalised over all non-specified $Y$ dimensions. The bottom row shows the distribution of groundtruth parameter values across the parameter range, which matches the in vivo dataset by construction.
Figure 7: Parameter estimation performance of networks on real-world test data, visualised as spatial maps. Groundtruth maps are taken as the maximum likelihood parameter estimates associated with the complete 160 b-value dataset, whereas network predictions are obtained from a single 10 b-value subsample.

4.3 Advantages offered by our method

Our proposed method occupies the low-bias side of the bias-variance trade-off discussed in §4.1, and offers four broad advantages over the competing method in this space (Self-supervised): (i) flexibility in choosing inter-parameter loss weighting $W$, (ii) incorporation of non-Gaussian (e.g. Rician) noise models, (iii) compatibility with complex, non-differentiable signal models $M$, and (iv) the ability to interface with low-variance methods, to produce a hybrid approach tunable to the needs of the task at hand. These advantages are analysed in turn.

4.3.1 Choice of inter-parameter loss weighting $W$

By computing loss in parameter space $Y$, our method has total flexibility in adjusting the relative contribution of different $y$ to the training loss function. In contrast, since Self-supervised calculates training loss in $X$, the relative weighting depends on the acquisition protocol $z$. Fig 8 compares our method - weighted so as to not discriminate between different model parameters - with variants designed to overweight single parameters by a factor of $10^6$. The potential advantages offered by this selective weighting are seen in the estimation of $D_{fast}$, where this approach leads to a small increase in both precision and accuracy. This parameter-specific weighting is not accessible within a Self-supervised framework.

In light of the differences arising from inter-parameter loss weighting, for subsequent analysis we use SupervisedMLE, Gaussian as a proxy for Self-supervised; both methods encode the same regularised MLE fitting, but differ in their inter-parameter weighting.

Figure 8: Comparison between SupervisedMLE, Rician, as described above, and variants which differ in their inter-parameter loss weighting $W$, at low SNR (15). Each column compares SupervisedMLE, Rician to a different network variant, uniquely trained to overweight the single relevant signal model parameter. For the sake of visualisation, each plotted point represents marginalisation over all non-specified $Y$ dimensions.

4.3.2 Incorporation of Rician noise modelling

By pre-computing MLE labels using conventional parameter estimation methods, we are able to incorporate accurate Rician noise modelling. Comparison between SupervisedMLE, Rician and SupervisedMLE, Gaussian shows the effect of the choice of noise model; these differences are most pronounced at low SNR (Fig 2) and high $D_{slow}$, when the Gaussian approximation of Rician noise is known to break down. In this regime, our method gives less biased, more informative $D_{slow}$ estimates, replicating conventional MLE performance at a fraction of the computational cost. At high $D_{slow}$, our method has a flatter, more information-rich $D_{slow}$ bias curve than all other DL methods. This information loss is further visualised in Fig 9, which shows the compression in $D_{slow}$ estimates $\hat{Y}$ over the groundtruth parameter space $Y$. As expected, this compression is most apparent at high values of $D_{slow}$, when the signal is more likely to approach the Rician noise floor.

Figure 9: Comparison of the information content captured by SupervisedMLE methods, as a function of the noise model used in computing MLE labels, at low SNR (15). Arrows represent the mean mapping from $Y$ to $\hat{Y}$, averaged over noise, as a function of parameter space $Y$. For the sake of visualisation, each plotted point represents marginalisation over all non-specified $Y$ dimensions.

4.3.3 Compatibility with complex signal models

An additional advantage of computing training loss in parameter space $Y$ is that the resulting DNNs are signal model agnostic: network training does not require explicit calculation of $M$. This approach is advantageous when working with complex signal models, as made clear by comparison with Self-supervised methods. In contrast with our proposed approach, Self-supervised methods embed $M$ between network output and training loss (see Fig 1); this poses two practical limitations.

The first relates to efficient implementation of mini-batch loss, which requires a vectorised representation (and calculation) of predicted signals. This may pose a non-trivial challenge in the case of complex signal models. The second limitation relates to how training loss is minimised: network parameters $p$ are updated by computing partial derivatives of the training loss. This process requires the loss to be expressed in a differentiable form; embedding $M$ in the loss formulation limits Self-supervised methods to signal models that can be expressed in an explicitly differentiable form.

Our method sidesteps both limitations by not requiring explicit calculation of M𝑀M during training, and is therefore compatible with a wider range of complex qMRI signal models.

4.3.4 Tunable network approach

As discussed above, we show a clear bias/variance trade-off between different parameter estimation methods. The optimal choice of method depends on the task at hand (Epstein et al., 2021), and may not lie at either extreme of this trade-off. Therefore, it would be advantageous to be able to combine low-bias and low-variance methods into a single, hybrid approach, with performance tunable by the relative contribution of each constituent method. Our proposed method, which interfaces naturally with SupervisedGT, offers exactly that. An example of this approach is shown in Fig 10: training loss has been weighted equally ($\alpha = 0.5$) between groundtruth and MLE labels, and, as expected, the resulting network performance lies in the middle ground between these two extremes.

Figure 10: Proof of concept of a hybrid parameter estimation method, formed by training a supervised network with an equally-weighted sum of SupervisedMLE, Rician and SupervisedGT loss functions ($\alpha = 0.5$), at low SNR (15). For the sake of visualisation, each plotted point represents marginalisation over all non-specified $Y$ dimensions.

4.3.5 Comparison with conventional fitting

Comparison between our proposed method (SupervisedMLE, Rician) and conventional fitting (MLE, Rician) highlights additional advantages offered by our approach. Firstly, Figs 2 and 3 demonstrate qualitatively similar performance between these methods across the entire parameter space. The fact that our method, which offers near-instantaneous parameter estimation, produces similar parameter estimates to well-understood conventional MLE methods justifies its adoption in and of itself. However, our method not only mimics but indeed in many cases outperforms (lower bias, variance, and RMSE) the very same method used to compute those labels. This result not only motivates its use, but also confirms that DL methods are able to exploit information shared between training samples beyond what would be possible by considering each sample in isolation.

4.4 A note on RMSE

We note that RMSE is a poor summary measure of network performance. RMSE is heavily skewed by outliers, and thus favours methods which give parameter estimates consistently close to mean parameter values. Such estimates, as in the case of $D_{fast}$, may contain very little information (Fig 4) despite being associated with low RMSE. Accordingly, we strongly recommend that RMSE be discontinued as a single summary metric for parameter estimation performance: it must always be accompanied by bias, variance, and ideally an analysis of information content.

RMSE’s limitations as a performance metric during testing may also call into question its suitability as a loss metric during training. This work, much like the rest of the DL qMRI literature, employs a training loss (MSE, described in Sections 2.2 and 2.3) which is monotonically related to RMSE. Whilst outside the scope of this work, implementing a non-RMSE-derived training loss (such as mean absolute error) may be worthy of future investigation.

4.5 Justification of parameter marginalisation

The above analysis has been largely based on Figs 2 and 3, which show parameter estimation performance marginalised over 3 dimensions of $Y$. This choice, made to aid visualisation, was validated against higher dimensional representations of the same data.

Fig 11 compares SupervisedMLE, Rician and SupervisedGT performance across the entire qMRI parameter space. It can be seen that the trends observed in Fig 2 are replicated here; we draw attention to two such examples. Firstly, Fig 2 suggests SupervisedGT produces lower $f$ standard deviation than SupervisedMLE, Rician; Fig 11 confirms this to be the case across all test data. In contrast, Fig 2 suggests that SupervisedGT produces higher $D_{slow}$ bias at low $D_{slow}$ and lower bias at high $D_{slow}$; Fig 11 confirms a spread of bias differences across the test data: some favouring one method, and others the other. This effect is explored in Fig 12, which compares $D_{slow}$ estimation performance as a function of $f$ and $D_{fast}$ at two specific (non-marginalised) groundtruth $D_{slow}$ values (0.69, 2.71). As expected from the marginalised representation in Fig 2, at low $D_{slow}$ SupervisedGT produces higher bias across the entire $f$-$D_{fast}$ parameter space, whereas at high $D_{slow}$ the opposite is true.

Despite this, it is important to note the limitations of marginalisation. Fig 12 also shows that the relative performance of SupervisedMLE, Rician and SupervisedGT varies across all parameter-space dimensions. Consider $D_{slow} = 0.69$, where Fig 2 shows similar marginalised RMSE for these methods. In fact, by visualising this difference as a function of $f$ and $D_{fast}$, we reveal two distinct regions: high $f$/low $D_{fast}$ (where SupervisedMLE, Rician produces lower RMSE), and elsewhere (where it produces higher RMSE). This highlights (i) the potential pitfalls of producing summary results by marginalising across entire parameter spaces and (ii) the need to choose parameter-estimation methods appropriate for the specific parameter combinations relevant to the tissues being investigated (Epstein et al., 2021).

Figure 11: Non-marginalised comparison of parameter estimation performance between SupervisedMLE, Rician and SupervisedGT at low SNR (15). Colour intensity represents density of distribution across all $Y$ and all noise repetitions.
Figure 12: Differences in performance (bias, standard deviation, RMSE) between SupervisedMLE, Rician and SupervisedGT for two groundtruth values of $D_{slow}$ at low SNR (15). The outermost columns (left and right) correspond to $D_{slow} = 0.69$ and $D_{slow} = 2.71$ respectively, and show mean performance under noise repetition, without marginalisation. The central column reproduces the corresponding marginalised representation from Fig 2.

4.6 Non-voxelwise approaches

This work has focused on voxelwise DL parameter estimation methods: networks which map one signal curve to its corresponding parameter estimate. There are, however, alternatives: convolutional neural network methods which map spatially related clusters (“patches”) of qMRI signals to corresponding clusters of parameter estimates (Fang et al., 2017; Ulas et al., 2019; Li et al., 2022). Our MLE training label approach could be incorporated into such methods, and we leave it to future work to investigate the effect this would have on parameter estimation performance.

5 Conclusions

In this work we draw inspiration from state-of-the-art supervised and self-supervised qMRI parameter estimation methods to propose a novel DNN approach which combines their respective strengths. In keeping with previous work, we demonstrate the presence of a bias/variance trade-off between existing methods; supervised training produces low variance under noise, whereas self-supervised leads to low bias with respect to groundtruth.

The increased bias of supervised DNNs is counter-intuitive - when labels are available, these methods have access to more information than, and should therefore outperform, their non-labelled alternatives. In light of this, we infer that the high bias associated with these supervised methods stems from the nature of the additional information they receive: groundtruth training labels. By substituting these labels with independently-computed maximum likelihood estimates, we show that the low-bias performance previously limited to self-supervised approaches can be achieved within a supervised learning framework.

This framework forms the basis of a novel low-bias supervised learning approach to qMRI parameter estimation: training on conventionally-derived maximum likelihood parameter estimates. This method offers four clear advantages over competing non-supervised low-bias DNN approaches: (i) flexibility in choosing inter-parameter loss weighting, which enables network performance to be boosted for qMRI parameters of interest; (ii) incorporation of Rician noise modelling, which improves parameter estimation at low SNR; (iii) separation between signal model and training loss, which enables the estimation of non-differentiable qMRI signal models; and, crucially, (iv) the ability to interface with existing supervised low-variance approaches, to produce a tunable hybrid parameter estimation method.

This final point underpins the key contribution of this work: unifying low-bias and low-variance parameter estimation under a single supervised learning umbrella. When faced with a parameter estimation problem, we no longer need to choose between extremes of the bias/variance trade-off; we can now tune DNN parameter estimation performance to the specific needs of the task at hand. This sets the stage for future work, where this tuning constant is optimised as part of a computational, task-driven, experimental design framework (Epstein et al., 2021).


Acknowledgments

SCE is supported by the EPSRC-funded UCL Centre for Doctoral Training in Medical Imaging (EP/L016478/1). TJPB is supported by an NIHR Clinical Lectureship (CL-2019-18-001) and, together with MHC, is supported by the National Institute for Health Research (NIHR) Biomedical Research Centre (BRC). This work was undertaken at UCLH/UCL, which receives funding from the UK Department of Health's NIHR BRC funding scheme.


Ethical Standards

The work follows appropriate ethical standards in conducting research and writing the manuscript, following all applicable laws and regulations regarding treatment of animals or human subjects.


Conflicts of Interest

The authors confirm they have no conflict of interest to disclose.


Data availability

Data and code are available at https://github.com/seancepstein/training_labels.

References

  • Aliotta et al. (2021) E. Aliotta, H. Nourzadeh, and S. H. Patel. Extracting diffusion tensor fractional anisotropy and mean diffusivity from 3-direction DWI scans using deep learning. Magnetic Resonance in Medicine, 85(2):845–854, 2021.
  • Barbieri et al. (2020) S. Barbieri, O. J. Gurney-Champion, R. Klaassen, and H. C. Thoeny. Deep learning how to fit an intravoxel incoherent motion model to diffusion-weighted MRI. Magnetic Resonance in Medicine, 83(1):312–321, 2020.
  • Bertleff et al. (2017) Marco Bertleff, Sebastian Domsch, Sebastian Weingärtner, Jascha Zapp, Kieran O’Brien, Markus Barth, and Lothar R. Schad. Diffusion parameter mapping with the combined intravoxel incoherent motion and kurtosis model using artificial neural networks at 3 T. NMR in biomedicine, 30(12), dec 2017. ISSN 1099-1492. . URL https://pubmed.ncbi.nlm.nih.gov/28960549/.
  • Bishop and Roach (1992) C. M. Bishop and C. M. Roach. Fast curve fitting using neural networks. Review of Scientific Instruments, 63(10):4450–4456, oct 1992. ISSN 0034-6748. . URL http://aip.scitation.org/doi/10.1063/1.1143696.
  • Cercignani et al. (2018) Mara Cercignani, Nicholas G. Dowell, and Paul S. Tofts. Quantitative MRI of the Brain. CRC Press, jan 2018. ISBN 9781315363578. . URL https://www.taylorfrancis.com/books/9781315363578.
  • Clevert et al. (2015) Djork-Arné Clevert, Thomas Unterthiner, and Sepp Hochreiter. Fast and Accurate Deep Network Learning by Exponential Linear Units (ELUs). 4th International Conference on Learning Representations, ICLR 2016 - Conference Track Proceedings, nov 2015. URL https://arxiv.org/abs/1511.07289v5.
  • Epstein et al. (2021) Sean C. Epstein, Timothy J.P. Bray, Margaret A. Hall-Craggs, and Hui Zhang. Task-driven assessment of experimental designs in diffusion MRI: A computational framework. PLOS ONE, 16(10):e0258442, oct 2021. ISSN 1932-6203. . URL https://journals.plos.org/plosone/article?id=10.1371/journal.pone.0258442.
  • Fang et al. (2017) Zhenghan Fang, Yong Chen, Weili Lin, and Dinggang Shen. Quantification of relaxation times in MR Fingerprinting using deep learning. Proceedings of the International Society for Magnetic Resonance in Medicine Scientific Meeting and Exhibition, 25, apr 2017. ISSN 1524-6965. URL https://www.ncbi.nlm.nih.gov/pmc/articles/PMC5909960/.
  • Golkov et al. (2016) Vladimir Golkov, Alexey Dosovitskiy, Jonathan I. Sperl, Marion I. Menzel, Michael Czisch, Philipp Sämann, Thomas Brox, and Daniel Cremers. q-Space Deep Learning: Twelve-Fold Shorter and Model-Free Diffusion MRI Scans. IEEE transactions on medical imaging, 35(5):1344–1351, may 2016. ISSN 1558-254X. . URL https://pubmed.ncbi.nlm.nih.gov/27071165/.
  • Grussu et al. (2021) F. Grussu, M. Battiston, M. Palombo, T. Schneider, C. A. M. G. Wheeler-Kingshott, and D. C. Alexander. Deep Learning Model Fitting for Diffusion-Relaxometry: A Comparative Study. Mathematics and Visualization, pages 159–172, 2021.
  • Gudbjartsson and Patz (1995) Hákon Gudbjartsson and Samuel Patz. The Rician distribution of noisy MRI data. Magnetic Resonance in Medicine, 34(6):910–914, 1995. URL https://onlinelibrary.wiley.com/doi/abs/10.1002/mrm.1910340618.
  • Gyori et al. (2022) N. G. Gyori, M. Palombo, C. A. Clark, H. Zhang, and D. C. Alexander. Training data distribution significantly impacts the estimation of tissue microstructure with machine learning. Magnetic Resonance in Medicine, 87(2):932–947, 2022.
  • Murphy (2022) Kevin P. Murphy. Probabilistic Machine Learning: An Introduction. MIT Press, 2022. Available from probml.ai.
  • Le Bihan et al. (1986) D. Le Bihan, E. Breton, D. Lallemand, P. Grenier, E. Cabanis, and M. Laval-Jeantet. MR imaging of intravoxel incoherent motions: application to diffusion and perfusion in neurologic disorders. Radiology, 161(2):401–407, 1986.
  • Li et al. (2022) Simin Li, Jian Wu, Lingceng Ma, Shuhui Cai, and Congbo Cai. A simultaneous multi-slice T2 mapping framework based on overlapping-echo detachment planar imaging and deep learning reconstruction. Magnetic Resonance in Medicine, 87(5):2239–2253, may 2022. ISSN 0740-3194. URL https://onlinelibrary.wiley.com/doi/10.1002/mrm.29128.
  • Liu et al. (2020) Hanwen Liu, Qing San Xiang, Roger Tam, Adam V. Dvorak, Alex L. MacKay, Shannon H. Kolind, Anthony Traboulsee, Irene M. Vavasour, David K.B. Li, John K. Kramer, and Cornelia Laule. Myelin water imaging data analysis in less than one minute. NeuroImage, 210:116551, apr 2020. ISSN 1053-8119. .
  • Mastropietro et al. (2022) A. Mastropietro, D. Procissi, E. Scalco, G. Rizzo, and N. Bertolino. A supervised deep neural network approach with standardized targets for enhanced accuracy of IVIM parameter estimation from multi-SNR images. NMR in Biomedicine, 35:10, 2022.
  • Kaandorp et al. (2021) M. P. T. Kaandorp, S. Barbieri, R. Klaassen, H. W. M. van Laarhoven, H. Crezee, P. T. While, et al. Improved unsupervised physics-informed deep learning for intravoxel incoherent motion modeling and evaluation in pancreatic cancer patients. Magnetic Resonance in Medicine, 86(4):2250–2265, 2021.
  • Palombo et al. (2020) M. Palombo, A. Ianus, M. Guerreri, D. Nunes, D. C. Alexander, N. Shemesh, et al. SANDI: A compartment-based model for non-invasive apparent soma and neurite imaging by diffusion MRI. NeuroImage, 215:116835, 2020.
  • Rozowski et al. (2022) M. Rozowski, J. Palumbo, J. Bisen, C. Bi, M. Bouhrara, W. Czaja, et al. Input layer regularization for magnetic resonance relaxometry biexponential parameter estimation. Magnetic Resonance in Chemistry, 2022. .
  • Ulas et al. (2019) Cagdas Ulas, Dhritiman Das, Michael J. Thrippleton, Maria del C. Valdés Hernández, Paul A. Armitage, Stephen D. Makin, Joanna M. Wardlaw, and Bjoern H. Menze. Convolutional Neural Networks for Direct Inference of Pharmacokinetic Parameters: Application to Stroke Dynamic Contrast-Enhanced MRI. Frontiers in Neurology, 9(JAN):1147, jan 2019. ISSN 1664-2295. . URL https://www.frontiersin.org/article/10.3389/fneur.2018.01147/full.
  • Yoon et al. (2018) J. Yoon, E. Gong, I. Chatnuntawech, B. Bilgic, J. Lee, W. Jung, et al. Quantitative susceptibility mapping using deep neural network: QSMnet. NeuroImage, 179:199–206, 2018. .
  • Yu et al. (2021) T. Yu, E. J. Canales-Rodríguez, M. Pizzolato, G. F. Piredda, T. Hilbert, E. Fischi-Gomez, et al. Model-informed machine learning for multi-component T2 relaxometry. Medical Image Analysis, 69:101940, 2021.
  • Zhao et al. (2015) Ying-hua Zhao, Shao-lin Li, Zai-yi Liu, Xin Chen, Xiang-cheng Zhao, Shao-yong Hu, Zhen-hua Liu, Ying-jie Mei, Queenie Chan, and Chang-hong Liang. Detection of Active Sacroiliitis with Ankylosing Spondylitis through Intravoxel Incoherent Motion Diffusion-Weighted MR Imaging. European Radiology, 25(9):2754–2763, sep 2015. ISSN 0938-7994.