Enhancing robustness of Digital Shadow for CO2 Storage Monitoring with Augmented Rock Physics Modeling
\[ \def\textsc#1{\dosc#1\csod} \def\dosc#1#2\csod{{\rm #1{\small #2}}} \]
Introduction
In order to achieve climate targets, the IPCC emphasizes the importance of technologies capable of removing gigatonnes of CO2 each year, with Geological Carbon Storage (GCS) being a central component of this strategy (Panel on Climate Change) 2018; Ringrose 2020). GCS involves capturing CO2 and injecting it into deep geological formations, where it can be safely stored over long periods. To ensure the safety of GCS, accurate monitoring of subsurface CO2 behavior is essential, as it guarantees that CO2 plumes remain contained and do not leak into surrounding formations (Ringrose 2023). Time-lapse seismic imaging has proven vital for tracking the migration of CO2 plumes, yet it often fails to fully capture the complexities of multi-phase subsurface flow.
Digital Shadows (DS), which utilize machine learning-driven data assimilation techniques such as nonlinear Bayesian filtering and generative AI (Spantini, Baptista, and Marzouk 2022; Gahlot et al. 2024), offer a more detailed and reliable approach for monitoring CO2 storage (Herrmann 2023; Gahlot et al. 2023, 2024). By incorporating uncertainty in reservoir properties like permeability, this framework improves the accuracy of CO2 migration forecasts, including plume pressure and saturation, and thereby reduces risks for GCS projects. However, data assimilation depends on assumptions regarding reservoir properties, rock physics models linking reservoir state to seismic properties, and initial conditions. If these assumptions are inaccurate, predictions may become unreliable, which in turn can jeopardize the safety of GCS operations. One way to mitigate this risk is by augmenting the forecast ensemble used to train the neural networks responsible for the data assimilation process—mapping prior forecast samples to the posterior. In this presentation, we demonstrate that augmenting the forecast ensemble by incorporating various rock physics models mitigates the negative effects of using inaccurate models (e.g., uniform versus patchy saturation models). Additionally, we find that ensemble augmentation can enhance predictive accuracy in certain scenarios.
Methodology
Building on the formulation in Gahlot et al. (2024) for CO2-plume dynamics and observations, we introduce an uncertainty-aware Digital Shadow framework, described by the following equation:
\[ \begin{aligned} \mathbf{x}_k & = \mathcal{M}_k\bigl(\mathbf{x}_{k-1}, \boldsymbol{\kappa}_k; t_{k-1}, t_k\bigr)\\ & = \mathcal{M}_k\bigl(\mathbf{x}_{k-1}, \boldsymbol{\kappa}_k\bigr), \ \boldsymbol{\kappa}_k \sim p(\boldsymbol{\kappa}) \quad \text{for}\quad k=1, \dots, K. \end{aligned} \tag{1}\]
Here, \(\mathbf{x}_k\) represents the time-varying spatial distribution of CO2 saturation and pressure perturbations measured at time steps \(k = 1, \dots, K\), while \(\mathcal{M}_k\) is the multi-phase fluid-flow simulation operator that evolves the state from time \(t_{k-1}\) to \(t_k\) for each time step. The symbol \(\boldsymbol{\kappa}\) denotes the permeability distribution, which is highly heterogeneous (Ringrose 2020, 2023). Since exact permeability values are unknown, we assume a statistical distribution \(p(\boldsymbol{\kappa})\) and sample from it during each timestep. The state of CO2 saturation and reservoir pressure perturbations, \(\mathbf{x}_{k-1}\), is then propagated forward to \(\mathbf{x}_k\).
While reservoir simulations provide accurate modeling of CO2 plume dynamics, relying solely on these simulations is impractical due to the inherent stochasticity of permeability and the uncertainty in rock physics models, which remain poorly constrained. To improve the accuracy of CO2 plume forecasts, monitoring is key. Time-lapse seismic data (Lumley 2010) offer essential insights into plume behavior, and we integrate these data into our framework to enhance the characterization of the CO2 plume. The seismic data are modeled as:
\[ \mathbf{y}_k = \mathcal{H}_k(\mathbf{x}_k;\mathcal{R}_k) + \boldsymbol{\epsilon}_k, \quad \boldsymbol{\epsilon}_k \sim p(\boldsymbol{\epsilon}), \quad \mathcal{R}_k \sim p(\mathcal{R}) \quad \text{for}\quad k=1, \dots, K \tag{2}\]
where \(\mathcal{H}_k\) is the observation operator, \(\mathbf{y}_k\) is the seismic data observed at timestep \(k\), and \(\boldsymbol{\epsilon}_k\) represents colored Gaussian noise added to the seismic shot records before reverse-time migration. The symbol \(\mathcal{R}_k\) denotes the rock physics model, randomly drawn from the family of Brie Saturation models (Avseth, Mukerji, and Mavko 2010), with the exponent \(e\) uniformly sampled from the range \(e \sim \mathcal{U}(1,10)\). To combine fluid-flow simulations with time-lapse seismic data, we use a nonlinear Bayesian filtering approach, incorporating neural posterior density estimation to approximate the posterior distribution of the CO2 plume state conditioned on seismic observations. This approach merges sequential Bayesian inference with amortized neural posterior density estimation (Papamakarios et al. 2021). To estimate the posterior distribution of the CO2 plume from seismic data, we employ Conditional Normalizing Flows (CNFs) (Gahlot et al. 2024). In this simulation-driven framework, we generate simulation pairs consisting of CO2 plume dynamics and corresponding seismic observations (cf. equations 1 and 2) and train the CNFs on these pairs. To account for different rock physics models, we create multiple seismic images for each CO2 plume sample, resulting in an augmented dataset.
Synthetic Case Study
To evaluate our approach, we use a synthetic 2D Earth model derived from the Compass model (E. Jones et al. 2012), representative of the North Sea region. We focus on a subset of this model, containing subsurface features suitable for CO2 injection, which is discretized into a grid of \(512 \times 256\) with a grid spacing of \(6.25 \ \mathrm{m}\). To initialize the probabilistic digital shadow, we require an ensemble of potential CO2 plume scenarios. Given the uncertainty in the permeability distribution of CO2 storage sites, we first establish a probabilistic baseline for permeability, which we use to initialize the plume. The permeability distribution is obtained through probabilistic full-waveform inversion (Yin et al. 2024), assuming that a baseline seismic survey was conducted prior to CO2 injection. This permeability distribution is then converted into permeability samples using the empirical relationship from (Gahlot et al. 2024).
Multi-Phase Flow Simulations
Flow simulations are carried out using the open-source tool JutulDarcy JutulDarcy.jl (Møyner, Bruer, and Yin 2023). Initially, the reservoir is filled with brine, and supercritical CO2 is injected at a rate of \(0.0500 \ \mathrm{m^3/s}\) over a period of 1920 days. The injection depth is approximately \(1200 \ \mathrm{m}\). The simulation is run over four time-lapse intervals, \(t_k\), producing predicted CO2 saturation for each of the \(N=128\) ensemble members at each timestep.
Augmented Seismic Simulations
The outputs from the 128 flow simulations are converted into changes in subsurface acoustic properties using 10 different exponents of the Brie Saturation model (Avseth, Mukerji, and Mavko 2010), thus augmenting the data tenfold by generating different acoustic changes corresponding to each flow simulation. Seismic surveys are conducted with 8 receivers and 200 sources, using a dominant frequency of 15 Hz and a recording duration of 1.8 seconds. 28 dB SNR colored Gaussian noise is added to the shot records. Nonlinear wave simulations and imaging are performed using the open-source package JUDI.jl (Witte et al. 2019; Louboutin et al. 2023).
CNF Training
The \(1280\) ensemble members are used as training pairs, consisting of forecasted CO2 plumes and corresponding seismic observations. We train a Conditional Normalizing Flow (CNF) using the open-source package InvertibleNetworks.jl (Orozco et al. 2024). The network weights \(\boldsymbol{\phi}\) are optimized by minimizing the following objective over 120 epochs using the \(\textsc{ADAM}\) optimizer (Kingma and Ba 2014):
\[ \widehat{\boldsymbol{\phi}} = \mathop{\mathrm{argmin}\,}\limits_{\boldsymbol{\phi}} \frac{1}{M}\sum_{m=1}^M \Biggl(\frac{\Big\|f_{\boldsymbol{\phi}}(\mathbf{x}^{(m)};\mathbf{y}^{(m)})\Big\|_2^2}{2} - \log\Bigl |\det\Bigl(\mathbf{J}^{(m)}_{f_{\boldsymbol{\phi}}}\Bigr)\Bigr |\Biggr). \tag{3}\]
where \(\mathbf{J}\) is the Jacobian of the network \(f_{\theta}\) with respect to its input, and \(M\) is the number of training samples. For further details, we refer to (Gahlot et al. 2024).
Results
The performance of our enhanced Digital Shadow framework, incorporating rock physics, is shown in figure 1. The top row in each figure presents, from left to right, the ground truth (GT) CO2 plume, the conditional mean of the posterior samples, and a sample from the posterior. The bottom row displays, from left to right, the seismic observation, the error between the conditional mean and the ground truth, and the uncertainty. As shown in figure 1 (a), the non-augmented DS performs well when the seismic observation matches the correct rock physics model, with the conditional mean closely approximating the GT. However, performance degrades when the seismic observation is based on a different, unknown rock physics model, as seen in figure 1 (b), where the conditional mean deviates from the GT, leading to increased error and uncertainty. Augmenting the ensemble improves generalization when seismic observations are based on an unknown rock physics model, as demonstrated in figure 1 (c). In this case, the conditional mean is closer to the GT, and both error and uncertainty are reduced compared to the non-augmented approach.
Conclusions
This study demonstrates that uncertainties in permeability and rock physics models can significantly impact the accuracy of CO2 plume predictions. By augmenting the forecast ensemble, DS is able to account for unknown rock physics models (at least within the family of Brie saturation models where the exponent \(e\) is not specified) and fluid-flow properties. The enriched dataset improves the fidelity of CO2 plume forecasts, thereby enhancing the reliability of GCS monitoring.
Acknowledgement
This research was carried out with the support of Georgia Research Alliance, partners of the ML4Seismic Center and in part by the US National Science Foundation grant OAC 2203821. The overall readability is enhanced using ChatGPT 4.