Velocity Model Building with Jacobian-Informed Neural Operators

Jeongjin (Jayjay) Park, Huseyin Tuna Erdinc, Felix Herrmann

Released to public domain under Creative Commons license type BY (https://creativecommons.org/licenses/by/4.0)
Copyright (c) 2024, Felix J. Herrmann (Georgia Tech)

Motivation: Why Neural Operators for LS Inversion?¹


Forward Map

Classical workflows (MVA, FWI)

  • repeated PDE solves, adjoint-state gradients
  • becomes very expensive when exploring multiple background models
  • \(\mathbf{m}\): ground-truth velocity model
  • \(\mathbf{m}_0\): background velocity model
  • \(\delta \mathbf{m}_{RTM}\): RTM at the background model
  • \(\mathcal{G}(\mathbf{m}, \mathbf{m}_0) = \delta \mathbf{m}_{RTM}\): RTM (Migration) operator
  1. Ma, Xiao, and Tariq Alkhalifah. “Velocity model building from seismic images using a Convolutional Neural Operator.” arXiv preprint arXiv:2509.20238 (2025).

Motivation: Neural Operator as an amortized neural surrogate


LS Inversion with \(\mathcal{G}_{nn}\)

Definition 1. (Neural operator for RTM imaging)

A learned operator \[\mathcal{G}_{nn}(\mathbf{m},\mathbf{m}_0) \approx \mathcal{G}(\mathbf{m}, \mathbf{m}_0)\]

that predicts the RTM image for any background model \(\mathbf{m}_0\)

Effect

  • Near-zero cost per forward RTM prediction
  • Fast gradients through \(\mathcal{G}_{nn}\) via AD
  • Together, these enable fast, scalable inversion
  • Our training goal: amortization across background models

Least-Squares Inverse Problem: Setup

Schematic plot for LS

Equation 1. (Inversion objective)

\[\mathcal{L}(\mathbf{m},\mathbf{m}_0) = \|\mathcal{G}_{nn}(\mathbf{m}, \mathbf{m}_0) - \delta \mathbf{m}_{RTM}\|^2_2\]

Equation 2. (Inversion result as a minimizer)

\[\hat{\mathbf{m}} = \arg\min_\mathbf{m} \mathcal{L}(\mathbf{m}, \mathbf{m}_0)\]

Equation 3. (Gradient Descent / Optimization update)

\[\mathbf{m}^{k+1} = \mathbf{m}^k - \eta \nabla_\mathbf{m} \mathcal{L}(\mathbf{m}^k)\]
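As a concrete illustration of Equations 1 to 3, below is a minimal PyTorch sketch of gradient-descent inversion through a trained surrogate. The names (`g_nn`, `d_rtm`, `invert`) are ours, not part of the method; `g_nn` is assumed to be any differentiable module mapping \((\mathbf{m}, \mathbf{m}_0)\) to a predicted RTM image.

```python
import torch

def invert(g_nn, d_rtm, m0, m_init, eta=1e-3, n_iters=150):
    """Gradient-descent inversion through the surrogate (Equations 1-3).

    g_nn   : trained neural operator, maps (m, m0) -> predicted RTM image
    d_rtm  : observed RTM image (torch.Tensor)
    m0     : fixed background model (torch.Tensor)
    m_init : starting guess for the velocity model
    """
    m = m_init.clone().requires_grad_(True)        # model iterate m^k
    opt = torch.optim.SGD([m], lr=eta)             # plain gradient descent, Eq. 3
    for _ in range(n_iters):
        opt.zero_grad()
        loss = ((g_nn(m, m0) - d_rtm) ** 2).sum()  # LS objective, Eq. 1
        loss.backward()                            # gradient via AD through G_nn
        opt.step()                                 # m^{k+1} = m^k - eta * grad
    return m.detach()                              # minimizer estimate, Eq. 2
```

Because the PDE solver never appears in this loop, each iteration costs one surrogate forward and backward pass rather than repeated wave-equation solves.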

Neural Operator Training

  • Training \(\mathcal{G}_{nn}\) as an “amortized” RTM operator

Input function space and Training \(\mathcal{G}_{\text{nn}}\)

To learn a two-argument RTM operator
\[ \mathcal{G}_{\text{nn}} : \mathcal{X} \times \mathcal{B} \rightarrow \mathcal{Y}, \] we require training data that samples the input function space.


Probability model for training pairs

In practice, drawing training instances \((\mathbf{m}, \mathbf{m}_0) \sim \mu\) means

  • \(\mathbf{m} \in \mathcal{X}\): samples variability in true geologic models
  • \(\mathbf{m}_0 \in \mathcal{B}\): samples variability in background (kinematic) models

This matches operator-learning theory, which guarantees generalization across the input function space when the expected risk below is minimized.

Definition 4. (Amortized RTM Operator)

\[ \min_{\mathcal{G}_{\text{nn}}} \; \mathbb{E}_{(\mathbf{m},\mathbf{m}_0)\sim\mu} \Big[ \|\mathcal{G}_{\text{nn}}(\mathbf{m},\mathbf{m}_0) - \delta \mathbf{m}_{\mathrm{RTM}} \| \Big] \]
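In practice, this expectation is estimated over minibatches. A minimal sketch, assuming a data loader that yields \((\mathbf{m}, \mathbf{m}_0, \delta\mathbf{m}_{\mathrm{RTM}})\) triplets drawn from \(\mu\); the MSE stands in for the norm in Definition 4, consistent with the "MSE-FNO" named later in this deck:

```python
import torch

def train_amortized(g_nn, loader, n_epochs=50, lr=1e-3):
    """Minimize the empirical counterpart of the expected risk (Definition 4).

    loader yields minibatches (m, m0, d_rtm) with (m, m0) drawn from mu
    and d_rtm = G(m, m0) the corresponding RTM image.
    """
    opt = torch.optim.Adam(g_nn.parameters(), lr=lr)
    for _ in range(n_epochs):
        for m, m0, d_rtm in loader:
            opt.zero_grad()
            pred = g_nn(m, m0)                      # amortized forward prediction
            loss = torch.mean((pred - d_rtm) ** 2)  # MSE estimate of the risk
            loss.backward()
            opt.step()
    return g_nn
```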


Example list of \(\mathbf{m}_0\) for a given \(\mathbf{m}\)

\(\mathcal{G}_{\text{nn}}\): an Amortized RTM Operator

Trained on \((\mathbf{m},\mathbf{m}_0)\sim\mu\), the neural operator learns


  • how the pair \((\mathbf{m}, \mathbf{m}_0)\) determines the RTM image
  • not to rely on a single fixed background model
  • how traveltimes change with low-wavenumber variations in \(\mathbf{m}_0\)

This makes \(\mathcal{G}_{\text{nn}}\) an amortized RTM operator, meaning it generalizes across the space of backgrounds \(\mathcal{B}\).

Result: Strong generalization to unseen background models during inversion!

Dataset Creation

  • To fulfill our training objective, we need to carefully design the dataset

Creating Dataset for Amortized Neural Operator

Definition 5. (Dataset for Amortized Neural Operator)

\[ \mathcal{D} = \big\{ \big(\mathbf{m}^{(i)},\, \mathbf{m}_{0}^{(i,s)},\, \delta \mathbf{m}_{\mathrm{RTM}}^{(i,s)}\big) \big\}_{i=1,\dots,N;\; s=1,\dots,10} \]

  • \(i\): index for ground-truth velocity model
  • \(s\): index for background model
  • \(\mathbf{m}^{(i)}\): ground-truth velocity model
  • \(\mathbf{m}_{0}^{(i,s)}\): background model samples
  • \(\delta \mathbf{m}_{\mathrm{RTM}}^{(i,s)} = \mathcal{G}(\mathbf{m}^{(i)}, \mathbf{m}_{0}^{(i,s)})\): RTM image at that background


For each true model \(\mathbf{m}^{(i)}(x,z)\), we construct multiple background models by smoothing the slowness field in the depth and time domains.
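The assembly of \(\mathcal{D}\) is a double loop over true models and random backgrounds. A sketch, where `rtm` (the migration operator \(\mathcal{G}\)) and `make_background` (one random application of the smoothing building blocks defined below) are assumed helpers, not actual names from this work:

```python
import numpy as np

def build_dataset(true_models, rtm, make_background, n_backgrounds=10, seed=0):
    """Assemble D = {(m_i, m0_(i,s), dm_RTM_(i,s))} from Definition 5.

    rtm(m, m0)              : the RTM/migration operator G
    make_background(m, rng) : one randomly smoothed background for m
    """
    rng = np.random.default_rng(seed)
    dataset = []
    for m in true_models:                        # index i over true models
        for _ in range(n_backgrounds):           # index s = 1, ..., 10
            m0 = make_background(m, rng)         # smoothed-slowness background
            dataset.append((m, m0, rtm(m, m0)))  # RTM image at that background
    return dataset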

Example list of \(\mathbf{m}_0\) for a given \(\mathbf{m}\)

Algorithm for Dataset Creation



Inducing Variability in the \(\mathbf{m}_0\)

  • Two-way traveltime variability (induced by smoother/faster backgrounds)
  • Two building blocks:
    • T: smoothing in time
    • D: smoothing in depth
  • We randomly compose T and D multiple times
  • As a result, RTM events are shifted, stretched, and focused differently



Variability in the \(\mathbf{m}_0\) in a vertical trace

Building Block 1: Smoothing in Depth


Equation 4. (Depth-domain smoothing)

\[ s_0(x,z)= \left(S_{\sigma_x,\sigma_z} * s\right)(x,z) \]


  • \(\mathbf{m}(x,z)\): true velocity model in km/s
  • \(\mathbf{s}(x,z) = 1 / \mathbf{m}(x,z)\): slowness
  • \(S_{\sigma_x,\sigma_z}\): Gaussian smoothing operator with kernel widths \(\sigma_x\) and \(\sigma_z\)
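Equation 4 amounts to one Gaussian filter applied to the slowness. A minimal sketch with SciPy, assuming \(\mathbf{m}\) is stored as an array indexed (x, z); the function name is ours:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_depth(m, sigma_x, sigma_z):
    """Building block D: Gaussian smoothing of slowness in depth (Equation 4).

    m : true velocity model in km/s, array of shape (nx, nz)
    """
    s = 1.0 / m                                        # slowness s = 1/m
    s0 = gaussian_filter(s, sigma=(sigma_x, sigma_z))  # s0 = S_{sx,sz} * s
    return 1.0 / s0                                    # background model m0
```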

Smoothing in time requires more steps: we first need to convert from the depth to the time coordinate.

Building Block 2-1: Mapping between Depth and Time

  • The slowness model, \(\mathbf{s}(x, z)\), is defined on a depth grid.
  • Using traveltime samples \((z_j, t_j)\), we obtain by linear interpolation (LinearInterpolation):
    1. \(t(z): z \mapsto t\)
    2. \(z(t): t \mapsto z\)


Equation 5. (Two-way travel time)

Given depth samples \(z_j = jh\) \[t_j(x)=2\sum_{k=1}^{j} \frac{h}{1000}\, s(x,z_k)\]

This defines discrete pairs \((z_j, t_j)\) that can be interpolated.



  • \(h\): depth grid spacing in meters
  • \(j\): index of depth
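Equation 5 is a cumulative sum along each trace, and the two maps follow by linear interpolation. A sketch for one lateral position, using NumPy's `np.interp` as a stand-in for LinearInterpolation; the function name is ours:

```python
import numpy as np

def depth_time_maps(s_trace, h):
    """Discrete pairs (z_j, t_j) from Equation 5 for one lateral position x.

    s_trace : slowness s(x, z_j) in s/km along depth, shape (nz,)
    h       : depth grid spacing in meters
    Returns linear-interpolation maps t(z) and z(t).
    """
    z = h * np.arange(1, len(s_trace) + 1)     # z_j = j h, in meters
    t = 2.0 * np.cumsum(h / 1000.0 * s_trace)  # two-way traveltime, Eq. 5
    t_of_z = lambda zq: np.interp(zq, z, t)    # t(z): z -> t
    z_of_t = lambda tq: np.interp(tq, t, z)    # z(t): t -> z
    return t_of_z, z_of_t
```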

Building Block 2-2: Smoothing in Time


Equation 6. (Depth-to-time conversion)

\[\text{Using} \: t(z): z \mapsto t, \: \text{obtain} \: \mathbf{s}^{\text{time}}(x,t_n).\]

Equation 7. (Time-domain smoothing)

\[ s_0^\text{time}(x,t)= \left(S_{\sigma_x,\sigma_t} * s^\text{time}\right)(x,t) \]

Equation 8. (Time-to-depth conversion)

\[\text{Using} \: z(t): t \mapsto z, \: \text{obtain the smoothed slowness} \: \mathbf{s}_0(x,z_j).\]
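Putting Equations 5 to 8 together gives the full round trip: resample each slowness trace to a regular two-way-time grid, smooth there, and map back to depth. A sketch under the assumption that the time-to-depth step samples the smoothed traces at the original traveltimes \(t_j(x)\); the function name and default parameters are ours:

```python
import numpy as np
from scipy.ndimage import gaussian_filter

def smooth_time(s, h, nt=1000, sigma_x=2.0, sigma_t=10.0):
    """Building block T: smooth slowness in the time domain (Equations 6-8).

    s : slowness s(x, z) in s/km, shape (nx, nz); h : depth spacing in meters.
    Returns the smoothed slowness back on the depth grid.
    """
    nx = s.shape[0]
    t_all = 2.0 * np.cumsum(h / 1000.0 * s, axis=1)  # t_j(x), Eq. 5, per trace
    t_grid = np.linspace(0.0, t_all.max(), nt)       # regular two-way-time grid

    # Eq. 6: depth -> time resampling, trace by trace
    s_time = np.stack([np.interp(t_grid, t_all[i], s[i]) for i in range(nx)])

    # Eq. 7: Gaussian smoothing in (x, t)
    s_time = gaussian_filter(s_time, sigma=(sigma_x, sigma_t))

    # Eq. 8: time -> depth, sampling smoothed traces at t_j = t(z_j)
    return np.stack([np.interp(t_all[i], t_grid, s_time[i]) for i in range(nx)])
```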

Background models: diverse variations in travel time

Variability in the \(\mathbf{m}_0\)

Background models: \(\{\mathbf{m}_0^{(i,s)}\}_{s=1}^{10}\) for a given \(\mathbf{m}^{(i)}\)

Velocity model and Migrated models

RTM variation: \(\delta \mathbf{m}_{\mathrm{RTM}}^{(i,s)}\)

RTM variations

Numerical Experiment (Preliminary Result)

Can we replace \(\mathcal{G}\) with \(\mathcal{G}_{nn}\) in the least-squares inversion?

Fourier Neural Operator (FNO)¹

Schematic Plot for architecture of Fourier Neural Operator
  • Convolution operator (see the code sketch below)
    • Efficient: convolution in space becomes multiplication in the frequency domain
    • Learns weights, \(R\), in the frequency domain
  • Additional linear transformation of the input
    • Keeps track of positional and boundary information
  1. Li, Z., Kovachki, N. B., Azizzadenesheli, K., Liu, B., Bhattacharya, K., Stuart, A. M., and Anandkumar, A. "Fourier neural operator for parametric partial differential equations." In 9th International Conference on Learning Representations, ICLR 2021, Virtual Event, Austria, May 3-7, 2021. OpenReview.net, 2021.
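A minimal sketch of one such Fourier layer in PyTorch, simplified relative to the reference implementation of Li et al. (it keeps only the lowest `modes1` frequencies along the first axis, whereas the original also keeps the corresponding negative frequencies); the class and attribute names are ours:

```python
import torch

class SpectralConv2d(torch.nn.Module):
    """One FNO layer: learned multiplication R on low Fourier modes,
    plus a pointwise linear path for positional/boundary information."""

    def __init__(self, channels, modes1, modes2):
        super().__init__()
        self.modes1, self.modes2 = modes1, modes2
        scale = 1.0 / channels**2
        self.R = torch.nn.Parameter(
            scale * torch.randn(channels, channels, modes1, modes2,
                                dtype=torch.cfloat))
        self.w = torch.nn.Conv2d(channels, channels, 1)  # local linear transform

    def forward(self, x):                # x: (batch, channels, nx, nz)
        x_ft = torch.fft.rfft2(x)        # convolution -> multiplication
        out_ft = torch.zeros_like(x_ft)
        out_ft[:, :, :self.modes1, :self.modes2] = torch.einsum(
            "bixy,ioxy->boxy", x_ft[:, :, :self.modes1, :self.modes2], self.R)
        x_spec = torch.fft.irfft2(out_ft, s=x.shape[-2:])
        return torch.relu(x_spec + self.w(x))
```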

Testing Accuracy in Forward Prediction

Sample ground-truth velocity and background velocity (\(\mathbf{m}_0\): TTD \(\sigma_x = 122.8\))

Forward prediction result

Forward Prediction Trace Plot

RTM Vertical Trace

Inversion Setup

Unseen background model
  • We test a very simple case: the background is obtained by smoothing the ground-truth model

Ground-truth model
  • Iterations: 150
  • Sample: unseen background model

Inversion Setup

Loss objective decay plot
  • To better understand how the optimization is working, we evaluate how the model iterate evolves
    • at 1st iteration
    • at 80th iteration

Inversion: RTM Prediction

RTM prediction at iteration 1

Residual at iteration 1

RTM prediction at iteration 80

Residual at iteration 80

Inversion: RTM Trace Comparison

Trace of Model iterate at the 1st iteration

Trace of model iterate at the 80th iteration

Inversion: Model iterate and its gradient

Model after 1st iteration

Gradient evaluated at 1st model iterate

Model after 80th iteration

Gradient evaluated at 80th iteration

Inversion: Recovered Model


Ground-truth Model


Model after 150th iteration

Next step: Fisher-Informed Neural Operator (FINO)

When MSE-FNO fails in inversion

  • When we create \(\mathbf{m}_0\) by smoothing the ground truth \(\mathbf{m}\), the inversion works well
  • When the smoothing happens in slowness, \(\mathbf{s}\), the relationship between the ground truth and the background model becomes highly nonlinear, and learning the gradient correctly becomes challenging
  • The FINO framework can overcome this limitation by explicitly teaching the derivative information that matters for inversion, as sketched below
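The exact FINO formulation is left as future work in this deck; one plausible realization of "teaching the derivative" is to add a Jacobian-vector-product matching term to the MSE loss. A heavily hedged sketch, where `fino_loss`, `dm0` (a probing perturbation), `d_jvp` (the true operator's directional derivative, assumed precomputed alongside the dataset), and `lam` are all our own hypothetical names, not the authors' method:

```python
import torch
from torch.func import jvp  # forward-mode AD, PyTorch >= 2.0

def fino_loss(g_nn, m, m0, d_rtm, dm0, d_jvp, lam=1.0):
    """MSE plus a derivative-matching penalty, sketching the FINO idea.

    dm0   : a probing perturbation of the background model
    d_jvp : directional derivative of the true operator G along dm0,
            assumed precomputed when building the dataset
    """
    f = lambda b: g_nn(m, b)                   # forward map as a function of m0
    pred, pred_jvp = jvp(f, (m0,), (dm0,))     # value and JVP through G_nn
    mse = torch.mean((pred - d_rtm) ** 2)      # standard amortized MSE term
    jac = torch.mean((pred_jvp - d_jvp) ** 2)  # match the derivative used in inversion
    return mse + lam * jac
```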

Acknowledgement

This research was carried out with the support of the Georgia Research Alliance and partners of the ML4Seismic Center.