---
title: Uncertainty quantification for inverse problems with a weak wave-equation constraint
author: |
  Zhilong Fang^1^\*, Curt Da Silva^1^, Rachel Kuske^2^, and Felix J. Herrmann^1^\
  ^1^Seismic Laboratory for Imaging and Modeling (SLIM), University of British Columbia\
  ^2^Department of Mathematics, University of British Columbia\
bibliography:
  - zfang.bib
---

## Abstract

In this work, we present a new posterior distribution to quantify uncertainties in solutions of wave-equation based inverse problems. By introducing an auxiliary variable for the wavefields, we weaken the strict wave-equation constraint used by conventional Bayesian approaches. With this weak constraint, the new posterior distribution is bi-Gaussian with respect to both the model parameters and the wavefields, and can be sampled directly with the Gibbs sampling method.

\vspace{-0.1cm}

## Introduction

\vspace{-0.2cm}

In wave-equation based inverse problems, the goal is to infer the unknown model parameters from the observed data using the wave equation as a constraint. Conventionally, the wave equation is treated as a strict constraint in a Bayesian inverse problem. After eliminating this constraint, the problem involves the following forward modeling operator, which maps the model to the predicted data:
```math #forward
f(\mvec) = \Pmat\A(\mvec)^{-1}\q,
```
where the vectors ``\mvec \in \mathbb{R}^{\ngrid}`` and ``\q \in \mathbb{C}^{\ngrid}`` represent the discretized ``\ngrid``-dimensional unknown model parameters and the known source term, respectively. The matrix ``\A \in \mathbb{C}^{\ngrid \times \ngrid}`` represents the discretized wave-equation operator and the operator ``\Pmat \in \mathbb{R}^{\nrcv \times \ngrid}`` projects the solution of the wave equation ``\uvec = \A(\mvec)^{-1}\q`` onto the ``\nrcv`` receivers. In the Bayesian framework, the solution of an inverse problem given observed data ``\dvec`` is a posterior probability density function (PDF) ``\rhopost(\mvec|\dvec)`` expressed as [@Tarantola1982Bayesian]:
```math #PostD
\rhopost(\mvec|\dvec) \propto \rholike(\dvec|\mvec)\rhoprior(\mvec),
```
where the likelihood PDF ``\rholike(\dvec|\mvec)`` describes the probability of observing the data ``\dvec`` given model parameters ``\mvec``, and the prior PDF ``\rhoprior(\mvec)`` describes one's prior knowledge of the model parameters. Under the assumption that the noise in the data is Gaussian with zero mean and covariance matrix ``\SIGMAN``, and that the prior distribution is also Gaussian with mean model ``\mp`` and covariance matrix ``\SIGMAP``, the posterior PDF can be written as:
```math #PostD2
\rhopost(\mvec|\dvec) \propto \exp(-\frac{1}{2}(\|f(\mvec)-\dvec\|^{2}_{\SIGMAN^{-1}} + \|\mvec-\mp\|^{2}_{\SIGMAP^{-1}})).
```
Due to the nonlinear map ``f(\mvec)`` and the high dimensionality of the model parameters (``\ngrid \geq 10^5``), applying Markov chain Monte Carlo (McMC) methods to sample the posterior PDF (#PostD2) faces the challenge of constructing a proposal PDF that approximates the target density well at an acceptable computational cost [@MartinMcMC2012].
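For concreteness, the following is a minimal numerical sketch of the forward map (#forward) and the negative log of the posterior PDF (#PostD2), assuming a 1D Helmholtz discretization ``\A(\mvec) = \omega^{2}\,\mathrm{diag}(\mvec) + \mathbf{L}`` with a second-order finite-difference Laplacian ``\mathbf{L}``; the function names and the discretization are illustrative assumptions, not the implementation used in this work.

```python
# Sketch of f(m) = P A(m)^{-1} q and -log rho(m | d), assuming a 1D
# Helmholtz operator A(m) = omega^2 diag(m) + L (an assumption for
# illustration; names are hypothetical).
import numpy as np

def helmholtz(m, omega, dz):
    """Assemble A(m) = omega^2 diag(m) + Laplacian (1D, Dirichlet ends)."""
    n = m.size
    L = (np.diag(-2.0 * np.ones(n)) + np.diag(np.ones(n - 1), 1)
         + np.diag(np.ones(n - 1), -1)) / dz**2
    return omega**2 * np.diag(m) + L

def forward(m, P, q, omega, dz):
    """f(m) = P A(m)^{-1} q: solve the wave equation, restrict to receivers."""
    u = np.linalg.solve(helmholtz(m, omega, dz), q)
    return P @ u

def neg_log_posterior(m, d, P, q, omega, dz, Cn_inv, Cp_inv, m_prior):
    """-log rho(m | d) up to a constant, for Gaussian noise and prior."""
    r = forward(m, P, q, omega, dz) - d        # data residual
    dm = m - m_prior                           # deviation from the prior mean
    return 0.5 * (r.conj() @ (Cn_inv @ r)).real + 0.5 * dm @ (Cp_inv @ dm)
```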
\vspace{-0.1cm}

## Posterior PDF with weak constraint

\vspace{-0.2cm}

As the exact constraint ``\A(\mvec)\uvec = \q`` makes the corresponding posterior PDF difficult to study, we weaken the constraint and arrive at a more generic posterior PDF with an auxiliary variable -- the wavefields ``\uvec`` -- as follows:
```math #PostDP
\rhopost(\mvec,\uvec|\dvec) &\propto \rho_{1}(\dvec|\uvec)\rho_{2}(\uvec|\mvec)\rhoprior(\mvec), \text{ with } \\
\rho_{1}(\dvec|\uvec) &\propto \exp(-\frac{1}{2}\|\Pmat\uvec-\dvec\|^{2}_{\SIGMAN^{-1}}), \text{ and} \\
\rho_{2}(\uvec|\mvec) &\propto \exp(-\frac{\lambda^{2}}{2}\|\A(\mvec)\uvec-\q\|^2).
```
Here the penalty parameter ``\lambda`` controls the trade-off between the wave-equation and data-fitting terms. As ``\lambda`` grows, the wavefields are more tightly constrained by the wave equation. It is readily observed that the posterior PDF (#PostD) is a special case of the posterior PDF (#PostDP) with ``\rho_{2}(\uvec|\mvec) = 1`` and ``\A(\mvec)\uvec = \q``. The new posterior PDF (#PostDP) has two important properties. First, the conditional PDF ``\rhopost(\uvec|\mvec,\dvec)`` of ``\uvec`` given ``\mvec`` is Gaussian. Second, if the matrix ``\A(\mvec)`` depends linearly on ``\mvec``, the conditional PDF ``\rhopost(\mvec|\uvec,\dvec)`` of ``\mvec`` given ``\uvec`` is also Gaussian. Therefore, the posterior PDF (#PostDP) is bi-Gaussian with respect to ``\uvec`` and ``\mvec``.

\vspace{-0.1cm}

## Gibbs sampling

\vspace{-0.2cm}

The bi-Gaussian property of the posterior PDF (#PostDP) suggests a straightforward application of the Gibbs sampling method: we alternately draw samples ``\uvec`` and ``\mvec`` from the conditional PDFs ``\rhopost(\uvec|\mvec,\dvec)`` and ``\rhopost(\mvec|\uvec,\dvec)``. At the ``k^{th}`` iteration of the Gibbs sampling, starting from the point ``\mvec_{k}``, the conditional PDF is ``\rhopost(\uvec|\mvec_{k},\dvec) = \mathcal{N}(\overline{\uvec},\Hmat_{u}^{-1})`` with the Hessian matrix ``\Hmat_{u}`` and the mean wavefields ``\overline{\uvec}`` given by:
```math #OptU
\Hmat_{u} &= \Pmat^{\top}\SIGMAN^{-1}\Pmat + \lambda^2\A(\mvec_{k})^{\top}\A(\mvec_{k}), \\
\overline{\uvec} &= \Hmat_{u}^{-1}(\lambda^{2}\A(\mvec_{k})^{\top}\q+\Pmat^{\top}\SIGMAN^{-1}\dvec).
```
To draw a sample ``\uvec_{k+1} \sim \mathcal{N}(\overline{\uvec},\Hmat_{u}^{-1})``, we first compute the Cholesky factorization ``\Hmat_{u} = \Lmat_{u}^{\top}\Lmat_{u}``, where ``\Lmat_{u}`` is an upper triangular matrix. We then apply ``\Lmat_{u}`` to compute ``\overline{\uvec}`` and ``\uvec_{k+1}`` by:
```math #Compuk1
\overline{\uvec} &= \Lmat_{u}^{-1}\Lmat_{u}^{-\top}(\lambda^{2}\A(\mvec_{k})^{\top}\q+\Pmat^{\top}\SIGMAN^{-1}\dvec), \\
\uvec_{k+1} &= \overline{\uvec} + \Lmat_{u}^{-1}\rvec_{u}, \quad \rvec_{u} \sim \mathcal{N}(0,\mathbf{I}).
```
With ``\uvec_{k+1}``, the conditional PDF is ``\rhopost(\mvec|\uvec_{k+1},\dvec)=\mathcal{N}(\overline{\mvec},\Hmat_{m}^{-1})``. The mean is ``\overline{\mvec} = \mvec_{k}-\Hmat_{m}^{-1}\g_{m}``, with the gradient and Hessian expressed as:
```math #GandH
\g_{m} &= \lambda^{2}\Gmat^{\top}(\A(\mvec_{k})\uvec_{k+1} - \q) + \SIGMAP^{-1}(\mvec_{k}-\mp), \\
\Hmat_{m} &= \lambda^{2}\Gmat^{\top}\Gmat + \SIGMAP^{-1},
```
where ``\Gmat = \frac{\partial \A(\mvec)\uvec_{k+1}}{\partial{\mvec}}`` is the sparse Jacobian matrix. The new sample ``\mvec_{k+1}`` is computed by:
```math #Sampm
\mvec_{k+1} = \overline{\mvec} + \Lmat_{m}^{-1}\rvec_{m}, \quad \rvec_{m} \sim \mathcal{N}(0,\mathbf{I}),
```
where ``\Lmat_{m}`` is the upper triangular Cholesky factor of ``\Hmat_{m} = \Lmat_{m}^{\top}\Lmat_{m}``.
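The following is a minimal sketch of one Gibbs iteration, the ``\uvec``-update (#OptU)--(#Compuk1) followed by the ``\mvec``-update (#GandH)--(#Sampm), assuming real arithmetic and the linear parameterization ``\A(\mvec) = \omega^{2}\,\mathrm{diag}(\mvec) + \mathbf{L}``, for which ``\Gmat = \omega^{2}\,\mathrm{diag}(\uvec_{k+1})``; the names and dense factorizations are illustrative, not the authors' implementation.

```python
# One Gibbs iteration for the weak-constraint posterior (illustrative sketch).
import numpy as np
from scipy.linalg import cholesky, solve_triangular

def gibbs_step(m_k, d, P, q, A_of_m, omega, lam, Cn_inv, Cp_inv, m_prior, rng):
    """Draw u_{k+1} | m_k, d, then m_{k+1} | u_{k+1}, d.

    A_of_m(m) assembles A(m) = omega^2 diag(m) + L (see the sketch above),
    so the Jacobian of A(m) u with respect to m is G = omega^2 diag(u).
    """
    A = A_of_m(m_k)
    # --- u-step (eqs. #OptU, #Compuk1): u_{k+1} ~ N(u_bar, H_u^{-1}) ---
    H_u = P.T @ Cn_inv @ P + lam**2 * A.T @ A
    L_u = cholesky(H_u)                      # upper triangular: H_u = L_u.T @ L_u
    rhs = lam**2 * A.T @ q + P.T @ Cn_inv @ d
    u_bar = solve_triangular(L_u, solve_triangular(L_u, rhs, trans='T'))
    u = u_bar + solve_triangular(L_u, rng.standard_normal(u_bar.size))
    # --- m-step (eqs. #GandH, #Sampm): m_{k+1} ~ N(m_bar, H_m^{-1}) ---
    G = omega**2 * np.diag(u)                # Jacobian of A(m) u in m
    g_m = lam**2 * G.T @ (A @ u - q) + Cp_inv @ (m_k - m_prior)
    H_m = lam**2 * G.T @ G + Cp_inv
    L_m = cholesky(H_m)                      # H_m = L_m.T @ L_m
    m_bar = m_k - solve_triangular(L_m, solve_triangular(L_m, g_m, trans='T'))
    m = m_bar + solve_triangular(L_m, rng.standard_normal(m_bar.size))
    return m, u
```

The dense factorizations above are for readability only; in practice one would exploit the sparsity of ``\A`` and ``\Gmat`` with sparse Cholesky factorizations of ``\Hmat_{u}`` and ``\Hmat_{m}``.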
\vspace{-0.1cm}

## Numerical experiment

\vspace{-0.2cm}

We apply the proposed method to an inverse problem constrained by the Helmholtz equation. We use single-source, single-frequency data to invert a 1D gridded squared-slowness profile ``m(z) = 1/(v_0 + \alpha z)^2`` with ``v_0 \, = \, 2000 \, \mathrm{m/s}`` and ``\alpha \, = \, 0.75``. We use a frequency of ``1 \, \mathrm{Hz}``, a grid spacing of ``50 \, \mathrm{m}``, a maximum offset of ``10000 \, \mathrm{m}``, and a maximum depth of ``5000 \, \mathrm{m}``. Both the source and the receivers are located at the surface of the model. Gaussian noise with covariance matrix ``\SIGMAN \, = \, \mathbf{I}`` is added to the data. The mean model of the prior distribution is selected to have ``v_0 \, = \, 2000 \, \mathrm{m/s}`` and ``\alpha \, = \, 0.65``, and the prior covariance matrix is set to ``\SIGMAP = 2\times10^{-8}\mathbf{I}``. We use the Gibbs sampling method to generate ``10^6`` samples with ``\lambda \, = \, 10^5``; a sketch of this configuration is given after the acknowledgements. We compare the true model, the prior mean model, and the posterior mean model in Figure [#fig:Model], and the prior and posterior standard deviations (STD) in Figure [#fig:STD]. The ``90\%`` confidence interval of the posterior distribution is also shown in Figure [#fig:Model] (gray background). Compared to the prior distribution, the posterior mean model has a smaller error in the shallow part but a larger error in the deep part. Meanwhile, the posterior STD decreases more from the prior STD in the shallow part than in the deep part. Both facts imply that the data have a larger influence on the squared slowness in the shallow part than in the deep part.

### Figure: ModelSTD{#fig:ModelSTD}
![Model comparison](./Fig/ModelBar){#fig:Model width=50%}
![STD comparison](./Fig/STD){#fig:STD width=same}
(a) Comparison of the true model (solid) and the mean models of the prior (dotted) and posterior (dashed) PDFs; (b) comparison of the STDs of the prior (dotted) and posterior (dashed) PDFs.

\vspace{-0.1cm}

## Conclusion

\vspace{-0.2cm}

By weakening the wave-equation constraint in a controlled manner, we arrive at a novel formulation of the posterior PDF for wave-equation based inversion. This new posterior PDF is bi-Gaussian with respect to the model parameters and the wavefields. A numerical experiment demonstrates that this posterior PDF can be sampled efficiently with the Gibbs sampling method.

\vspace{-0.1cm}

## Acknowledgements

\vspace{-0.2cm}

This research was carried out as part of the SINBAD project with the support of the member organizations of the SINBAD Consortium.
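For reference, the following is a minimal sketch of the experiment configuration described above. The grid values, model profiles, covariances, ``\lambda``, and sample count are taken from the text; the receiver count and discretization conventions are assumptions. The Helmholtz assembly and sampling loop follow the sketches in the previous sections.

```python
# Configuration sketch for the 1D Helmholtz experiment (values from the text;
# n_rcv and the discretization conventions are assumptions).
import numpy as np

dz = 50.0                                  # grid spacing [m]
z = np.arange(0.0, 5000.0 + dz, dz)        # depth axis down to 5000 m

def sq_slowness(v0, alpha):
    """Squared slowness m(z) = 1 / (v0 + alpha z)^2 in s^2/m^2."""
    return 1.0 / (v0 + alpha * z) ** 2

m_true = sq_slowness(2000.0, 0.75)         # true model
m_prior = sq_slowness(2000.0, 0.65)        # prior mean model
Cp_inv = (1.0 / 2e-8) * np.eye(z.size)     # inverse of prior covariance 2e-8 I
n_rcv = 201                                # assumed: receivers every 50 m over 10 km
Cn_inv = np.eye(n_rcv)                     # inverse of noise covariance I
lam = 1e5                                  # penalty parameter lambda
n_samples = 10**6                          # number of Gibbs samples to draw
```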
```math_def
\newcommand{\Pmat}{\mathbf{P}}
\newcommand{\A}{\mathbf{A}}
\newcommand{\q}{\mathbf{q}}
\newcommand{\uvec}{\mathbf{u}}
\newcommand{\post}{\text{post}}
\newcommand{\rhopost}{\rho}
\newcommand{\prior}{\text{prior}}
\newcommand{\noise}{\text{noise}}
\newcommand{\mis}{\text{mis}}
\newcommand{\like}{\text{like}}
\newcommand{\rholike}{\rho}
\newcommand{\rholikepen}{{\rho}_{\text{like}}^{\text{pen}}}
\newcommand{\overrholikepen}{\overline{\rho}_{\text{like}}^{\text{pen}}}
\newcommand{\rhoprior}{\rho}
\newcommand{\vvec}{\mathbf{v}}
\newcommand{\obs}{\text{obs}}
\newcommand{\dvec}{\mathbf{d}}
\newcommand{\xvec}{\mathbf{x}}
\newcommand{\Hmat}{\mathbf{H}}
\newcommand{\Cmat}{\mathbf{C}}
\newcommand{\g}{\mathbf{g}}
\newcommand{\y}{\mathbf{y}}
\newcommand{\dobs}{\mathbf{d}_{\text{obs}}}
\newcommand{\F}{\mathbf{F}}
\newcommand{\SIGMAN}{{\Gamma}_{\text{n}}}
\newcommand{\SIGMANij}{_{i,j}{\Gamma}_{\text{noise}}}
\newcommand{\SIGMAPDE}{{\Gamma}_{\text{pde}}}
\newcommand{\SIGMAP}{{\Gamma}_{\text{p}}}
\newcommand{\SIGMAU}{{\Gamma}_{\uvec}}
\newcommand{\SIGMAS}{{\Gamma}_{\text{src}}}
\newcommand{\SIGMAR}{{\Gamma}_{\text{rcv}}}
\newcommand{\mvec}{\mathbf{m}}
\newcommand{\Gmat}{\mathbf{G}}
\newcommand{\Wmat}{\mathbf{W}}
\newcommand{\Imat}{\mathbf{I}}
\newcommand{\Lmat}{\mathbf{L}}
\newcommand{\Llike}{\mathbf{L}_{\like}}
\newcommand{\Lprior}{\mathbf{L}_{\text{prior}}}
\newcommand{\Lpost}{\mathbf{L}_{\text{post}}}
\newcommand{\Smat}{\mathbf{S}}
\newcommand{\Rmat}{\mathbf{R}}
\newcommand{\Mmat}{\mathbf{M}}
\newcommand{\Jmat}{\mathbf{J}}
\newcommand{\rvec}{\mathbf{r}}
\newcommand{\nsrc}{n_{\text{src}}}
\newcommand{\nrcv}{n_{\text{rcv}}}
\newcommand{\nfreq}{n_{\text{freq}}}
\newcommand{\ngrid}{n_{\text{grid}}}
\newcommand{\ndata}{n_{\text{data}}}
\newcommand{\nsmp}{n_{\text{smp}}}
\newcommand{\mMAP}{\mathbf{m}_{\text{map}}}
\newcommand{\Lmis}{\Lmat_{\text{mis}}}
\newcommand{\nd}{n_{\text{data}}}
\newcommand{\Hp}{\Hmat_{\text{prior}}}
\newcommand{\fred}{f_\text{red}}
\newcommand{\fpen}{f_\text{pen}}
\newcommand{\Hlike}{\Hmat_{\text{like}}}
\newcommand{\Hprior}{\Hmat_{\text{prior}}}
\newcommand{\Hpost}{\Hmat_{\text{post}}}
\renewcommand{\mp}{\tilde{\mvec}}
\renewcommand{\SIGMAPDE}{\Gamma_{\text{PDE}}}
\newcommand{\epssrc}{\epsilon_{\text{src}}}
\newcommand{\epsrcv}{\epsilon_{\text{rcv}}}
```