ML4Seismic Partners Meeting - 2021 - Program
2021 ML4Seismic Partners Meeting
Monday November 22
Tuesday November 23
Abstracts
Introduction to the Meeting
Abstract. During this presentation, an overview of the 2021 ML4Seismic Program will be given, including the organization of the meeting, the setup of ML4Seismic, and the Informal Sessions in the afternoon.
Human in the Loop: Seismic Interpretation through Active Learning
Abstract. Deep learning can extract rich data representations if provided sufficient quantities of labeled training data. For many tasks, however, annotating data has significant costs in terms of time and money, owing to the high standards of subject matter expertise required, for example in medical and geophysical image interpretation tasks. Active learning can identify the most informative training examples for the interpreter to annotate, leading to higher labeling efficiency.
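As a hedged illustration of the selection step (not necessarily the criterion used in this talk), the sketch below ranks a pool of unlabeled examples by predictive entropy and queries the most uncertain ones; all names and values are ours.

```julia
# Minimal sketch of uncertainty-based sample selection for active learning.
# `probs` holds a model's softmax outputs for a pool of unlabeled seismic
# patches; everything here is illustrative, not the talk's exact method.

# Predictive entropy: higher entropy means the model is less certain.
entropy(p) = -sum(pi -> pi > 0 ? pi * log(pi) : 0.0, p)

# Toy pool: class-probability vectors for five unlabeled examples.
probs = [[0.90, 0.05, 0.05],
         [0.40, 0.35, 0.25],
         [0.98, 0.01, 0.01],
         [0.34, 0.33, 0.33],
         [0.60, 0.30, 0.10]]

# Rank the pool by entropy and pick the top-k examples to send
# to the human interpreter for annotation.
k = 2
scores = map(entropy, probs)
query_idx = partialsortperm(scores, 1:k, rev=true)
println("Query examples for labeling: ", query_idx)
```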
The Value of Learning Dynamics in Seismic Interpretation
Abstract. In seismic interpretation, deep models are frequently met with skepticism. The reason for this is semantically incorrect predictions that occur when the model is exposed to samples it was not trained on. For instance, a model exposed to out-of-distribution sections may predict low-depth facies in high-depth regions and vice versa. To alleviate this problem, we utilize learning dynamics to explain this process and engineer methods that increase the robustness of seismic models.
Redwood – Towards clusterless supercomputing in the cloud
Abstract. We present Redwood, a Julia framework for clusterless supercomputing in the cloud. Redwood provides a set of distributed programming macros that enable users to remotely execute Julia functions in parallel through cloud services for batch and serverless computing. We present the architecture and design of Redwood, as well as its application to existing Julia packages for machine learning and inverse problems.
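Since the abstract describes Redwood's macro-based design rather than its API, the sketch below mimics the execution pattern locally with Distributed.jl; the `@batchexec` macro named in the comment is invented for illustration and is not Redwood's documented interface.

```julia
# Hypothetical sketch in the spirit of Redwood's macro-based design.
# Conceptually, a distributed macro ships a closure and its dependencies to
# a cloud batch/serverless service and gathers the results, e.g.
#
#   images = @batchexec pmap(rtm_shot, 1:nshots)   # invented name, not API
#
# Locally, the same execution pattern can be mimicked with Distributed.jl:
using Distributed
addprocs(2)                          # stand-in for remote cloud workers

@everywhere function rtm_shot(i)
    # ... per-shot imaging work would go here ...
    return i^2                       # placeholder result
end

images = pmap(rtm_shot, 1:4)         # parallel map over shots
println(images)
```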
ML4Seismic Open Source Software environment
Mathias Louboutin, SLIM
Abstract. Software is at the core of research and development in inverse problems. At SLIM, we have experience developing scalable and performant software, such as our legacy parallel MATLAB framework. With ML4Seismic, we are dedicated to building on this experience to develop HPC open-source software (OSS) for the scientific community in collaboration with our partners. In this talk, we will describe our OSS Julia and Python environment, our high-level abstraction principles, and the range of solutions we offer for seismic processing and inversion and for machine learning. We will emphasize our aim to provide scalable software that can be easily applied to industrial problems.
Uncertainty quantification in imaging and automatic horizon tracking – a Bayesian deep-prior based approach
Abstract. In inverse problems, uncertainty quantification (UQ) deals with a probabilistic description of the solution nonuniqueness and data noise sensitivity. Setting seismic imaging into a Bayesian framework allows for a principled way of studying uncertainty by solving for the model posterior distribution. Imaging, however, typically constitutes only the first stage of a sequential workflow, and UQ becomes even more relevant when applied to subsequent tasks that are highly sensitive to the inversion outcome. In this paper, we focus on how UQ trickles down to horizon tracking for the determination of stratigraphic models and investigate its sensitivity with respect to the imaging result. As such, the main contribution of this work is a data-guided approach to horizon tracking uncertainty analysis. This work is fundamentally based on a special reparameterization of reflectivity, known as “deep prior”. Feasible models are restricted to the output of a convolutional neural network with a fixed input, while weights and biases are Gaussian random variables. Given a deep prior model, the network parameters are sampled from the posterior distribution via a Markov chain Monte Carlo method, from which the conditional mean and point-wise standard deviation of the inferred reflectivities are approximated. For each sample of the posterior distribution, a reflectivity is generated, and the horizons are tracked automatically. In this way, uncertainty on model parameters naturally translates to horizon tracking. As part of the validation for the proposed approach, we verified that the widest estimated confidence intervals for horizon tracking coincide with geologically complex regions, such as faults.
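For reference, the deep-prior construction described above can be summarized as follows (our notation, with a Gaussian likelihood of noise level \(\sigma\) assumed for illustration):

\[
x = g(w; z_0), \qquad w \sim \mathrm{N}(0, I), \qquad
p(w \mid d) \propto \exp\!\left(-\frac{1}{2\sigma^2}\,\big\| d - F\, g(w; z_0) \big\|_2^2 \;-\; \frac{1}{2}\|w\|_2^2\right),
\]

where \(F\) is the linearized forward modeling operator, \(d\) the observed data, and \(z_0\) the fixed network input. Posterior samples \(w_i\) drawn via MCMC yield reflectivity samples \(x_i = g(w_i; z_0)\), from which the conditional mean and point-wise standard deviation are approximated and on which horizons are tracked.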
Multifidelity conditional normalizing flows for physics-guided Bayesian inference
Abstract. We introduce a scalable Bayesian inference approach that combines techniques from deep learning with a physics-based variational inference formulation. Bayesian inference for ill-posed inverse problems is challenged by the high dimensionality of the unknown, the computational expense of the forward operator, and the choice of a prior distribution that accurately encodes prior knowledge on the unknown. To handle this situation and to assess uncertainty, we propose to approximate the posterior distribution using a pretrained conditional normalizing flow, which is trained on existing low- and high-fidelity estimations of the unknown. To further improve the accuracy of this approximation, we use transfer learning and fine-tune this normalizing flow by minimizing the Kullback-Leibler divergence between the predicted and the desired high-fidelity posterior density. This amounts to minimizing a physics-based variational inference objective with respect to the network weights, which we believe might scale better than Bayesian inference with Markov chain Monte Carlo sampling methods. We apply the proposed Bayesian inference approach to seismic imaging where we use quasi-real data obtained from the Parihaka dataset.
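The fine-tuning stage sketched above minimizes a physics-based variational objective; under a Gaussian likelihood assumption (our notation), it takes the form

\[
\min_{\theta}\; \mathbb{E}_{z \sim \mathrm{N}(0,I)}
\left[ \frac{1}{2\sigma^2}\, \big\| d - F\big(f_\theta(z)\big) \big\|_2^2
\;-\; \log p\big(f_\theta(z)\big)
\;-\; \log \big| \det \nabla_z f_\theta(z) \big| \right],
\]

which equals the Kullback-Leibler divergence between the flow-induced density and the posterior up to an additive constant. Here \(f_\theta\) is the pretrained (conditional) normalizing flow, \(F\) the forward operator, and \(d\) the observed data; only the weights \(\theta\) are updated during fine-tuning.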
Underspecification in Seismic Interpretation
Abstract. Building on our previous discussion, we introduce the concept of underspecification to the world of seismic interpretation. We will show why underspecification is a serious problem and why it matters in seismic settings. Further, we show how underspecification can be detected with learning dynamics and how deployment properties can be analyzed without access to annotations. Our method exposes regions where the model is insecure and requires additional support from human interpreters.
Variational inference for artifact removal of adjoint solutions in photoacoustic inverse problems
Abstract. Photoacoustic imaging is a medical imaging modality that combines light and ultrasound waves to image internal structures of biological tissue. The inverse problem is to reconstruct the tissue regions initially excited by light from propagated ultrasound data recorded at receivers outside the tissue. Due to noisy, limited-view, and sparsely sampled receiver data, the problem is highly ill-posed and traditional time-reversal adjoint solutions contain artifacts. This necessitates uncertainty quantification to communicate to practitioners which areas of the image can be trusted. We propose a framework that leverages a machine learning based method (conditional normalizing flows) to learn the full posterior distribution of viable solutions given the time-reversal adjoint solution. We show that areas of calculated uncertainty correlate with structures that are known to be difficult to image. In addition, we also propose a maximum a posteriori (MAP) based solution, which solves the variational least-squares problem while using the trained conditional normalizing flow as a prior distribution.
Contrastive explanations and robustness for recognition in data
Abstract. Several deep learning techniques have been proposed to recognize subsurface structures like salt domes, horizons, faults, and chaotic regions. All these techniques rely on accurate and precise seismic measurements and inversion methods and assume the presence of seismic volumes that are distortion-free. However, in practice, seismic volumes suffer from multiple concurrent distortions – both in measurements and inversion processes. In this talk, we present 1) a seismic-aware distortion dataset called Distorted-LANDMASS, 2) a generalized contrast-based robustness technique that alleviates the effect of distortions, and 3) an explainability framework that leverages this contrast and answers questions of the form “Why fault, rather than salt dome?”. This talk builds on the successful application of the proposed technology in the natural-image setting.
Making Black-boxes Transparent through Explainable and Interpretable Machine Learning
Abstract. Deep neural networks can learn powerful representations on raw data to generalize well on unseen test data. In many real-world applications, their deployment is limited by their black-box structure and the difficulty of interpreting them to produce explanations for their decisions. Over the years, researchers have come up with various methods to explain and interpret a neural network’s decisions. We apply some of these methods to geophysical datasets and show how explainability can augment geophysical workflows to generate increased trust in the model’s decisions.
Improved seismic survey design by maximizing the spectral gap with global optimization
Yijun Zhang, SLIM
Abstract. Random subsampling is increasingly being used in the acquisition of seismic data to shorten the acquisition time and to reduce costs. However, the design of optimal acquisition geometries is still an ongoing area of research. Matrix completion (MC) is a computationally efficient method to reconstruct fully sampled wavefields from sparsely sampled seismic data. In MC theory, the spectral gap (SG), which is a measure of the connectedness of the graph in expander graph theory, has been used to predict, and to some degree quantify, the quality of wavefield reconstruction, given a specific subsampling scheme (acquisition mask). Building on these insights, we propose an optimization scheme, based on simulated annealing, that finds subsampling masks with large SGs that improve the quality of wavefield reconstruction with MC. The experimental results show that the proposed method successfully increases the SG of the subsampling mask starting from randomly initialized masks. Increasing the SG leads to improved connectivity between the sources and receivers and therefore to improved wavefield reconstruction. Numerical experiments confirm a direct relationship between increased SG and improved reconstruction quality. This confirms the value SG analysis brings to the design of seismic surveys without the need to carry out expensive wavefield reconstructions to optimize the acquisition design.
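A minimal sketch of the annealing loop is given below, assuming the spectral gap is measured from the first two singular values of the mask matrix; the gap definition and all parameters are illustrative assumptions, not the exact setup of the talk.

```julia
# Simulated annealing over acquisition masks: swap one sampled/unsampled
# location per iteration and accept moves that (mostly) increase the gap.
using LinearAlgebra, Random

spectral_gap(M) = (s = svdvals(M); 1 - s[2] / s[1])   # illustrative definition

function anneal_mask(n, m, nkeep; iters=5_000, T0=1.0, cooling=0.999)
    rng = MersenneTwister(42)
    M = zeros(n, m)
    M[randperm(rng, n*m)[1:nkeep]] .= 1.0             # random initial mask
    gap, T = spectral_gap(M), T0
    for _ in 1:iters
        on  = rand(rng, findall(vec(M) .== 1))        # propose a move:
        off = rand(rng, findall(vec(M) .== 0))        # swap an on/off pair
        M[on], M[off] = 0.0, 1.0
        g = spectral_gap(M)
        if g >= gap || rand(rng) < exp((g - gap) / T)
            gap = g                                   # accept the move
        else
            M[on], M[off] = 1.0, 0.0                  # revert the swap
        end
        T *= cooling                                  # cool the temperature
    end
    return M, gap
end

mask, gap = anneal_mask(50, 50, 500)
println("final spectral gap: ", gap)
```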
A dual formulation of wavefield reconstruction inversion for large-scale seismic inversion
Gabrio Rizzuti, Utrecht University
Abstract. Many of the seismic inversion techniques currently proposed that focus on robustness with respect to the background model choice are not apt to large-scale 3D applications, and the methods that are computationally feasible for industrial problems, such as full waveform inversion, are notoriously limited by convergence stagnation and require adequate starting models. We propose a novel solution that is both scalable and less sensitive to starting models or inaccurate parameters (such as anisotropy) that are typically kept fixed during inversion. It is based on a dual reformulation of the classical wavefield reconstruction inversion, whose empirical robustness with respect to these issues is well documented in the literature. While the classical version is not suited to 3D, as it leverages expensive frequency-domain solvers for the wave equation, our proposal allows the deployment of state-of-the-art time-domain finite-difference methods and is thus potentially ready for industrial-scale problems.
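For context, the classical penalty form of wavefield reconstruction inversion that the dual formulation reworks reads (a sketch following the literature):

\[
\min_{m,\,u}\; \frac{1}{2}\,\| P u - d \|_2^2 \;+\; \frac{\lambda^2}{2}\, \| A(m)\, u - q \|_2^2,
\]

where \(m\) denotes the medium parameters, \(u\) the reconstructed wavefield, \(A(m)\) the discretized wave equation, \(P\) the restriction to the receivers, \(q\) the source, and \(\lambda\) the penalty parameter. Classically, \(u\) is eliminated through a frequency-domain least-squares solve, which is what hampers 3D applications; the dual formulation avoids this and admits time-domain finite-difference propagators.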
Improved seismic monitoring of CO2 sequestration with the weighted joint recovery model
Abstract. Time-lapse seismic monitoring of CO2 sequestration is challenging because the time-lapse signature of CO2 plumes is weak in amplitude and often contaminated by imaging artifacts due to coarsely sampled, noisy, and non-replicated surveys. In this talk, we present a sparsity-promoting least-squares imaging method where the baseline and the current and past monitor surveys are inverted jointly. We demonstrate that the sensitivity of seismic monitoring can be improved by inverting for the common component—i.e., the component shared by all vintages—and the innovations with respect to this common component. Combining this joint approach with weighted \(\ell_{1,2}\)-norm minimization leads to a monitoring scheme capable of detecting irregular CO2-plume growth in a realistic geological setting.
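For reference, a sketch of the joint recovery model following the literature: writing each vintage as a shared common component \(z_0\) plus an innovation \(z_j\), two surveys observed through their own subsampled operators \(A_j\) lead to

\[
\begin{bmatrix} b_1 \\ b_2 \end{bmatrix}
=
\begin{bmatrix} A_1/\gamma & A_1 & 0 \\ A_2/\gamma & 0 & A_2 \end{bmatrix}
\begin{bmatrix} z_0 \\ z_1 \\ z_2 \end{bmatrix},
\qquad
\min_{z}\; \| z \|_1 \;\;\text{subject to}\;\; A z = b,
\]

where the weight \(\gamma\) balances how strongly the observations inform the common component. The weighted \(\ell_{1,2}\)-norm minimization discussed in this talk replaces the plain \(\ell_1\) objective.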
Randomized linear algebra for inversion
Abstract. Inverse problems in exploration geophysics and machine learning rely heavily on linear algebra and the manipulation of large matrices. To tackle the growing cost of storing these matrices, randomized algorithms have been developed that extract information from them via randomized sketching. Inspired by previous work on extended image volumes, we will first show in this talk how the seismic imaging condition can be expressed in a randomized linear algebra framework, leading to drastic memory savings. In the second part, we will extend this idea to convolutional neural networks to reduce the memory cost of training by orders of magnitude. We will demonstrate the practicality of these methods on representative examples.
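The core primitive here is randomized (Hutchinson-style) trace estimation; a minimal, self-contained sketch follows, with sizes and probe counts chosen purely for illustration.

```julia
# Hutchinson's estimator: tr(A) is recovered in expectation from a few
# matrix-vector products with random Rademacher probe vectors.
using LinearAlgebra, Random

"""Estimate tr(A) from m matrix-vector products with Rademacher probes."""
function trace_est(A, m)
    n = size(A, 1)
    acc = 0.0
    for _ in 1:m
        z = rand((-1.0, 1.0), n)   # Rademacher probe (+/-1 entries)
        acc += dot(z, A * z)       # z' A z has expectation tr(A)
    end
    return acc / m
end

A = randn(200, 200); A = A' * A    # any square matrix works; SPD here
println("estimated: ", trace_est(A, 64), "  exact: ", tr(A))
```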
Distributed Fourier Neural Operators
Abstract. Fourier Neural Operators (FNOs) are a class of neural operators that use weightings on the Fourier transform of their inputs to approximate infinite-dimensional mappings between function spaces. They are particularly useful in approximating the solutions of smooth parametric (e.g., by permeability) partial differential equations (PDEs), and once trained are capable of producing output nearly identical to that of a traditional numerical solver roughly three orders of magnitude faster, making them very useful in a wide variety of engineering applications that require repeated PDE solves (e.g., during uncertainty quantification). Until now, FNOs have been limited to small problems, as their memory-intensive design makes them difficult to scale in a traditional machine learning setting on a single computer or GPU. In this work, a decomposition scheme is described and implemented in PyTorch using the DistDL distributed deep learning framework, which is capable of scaling FNOs to arbitrary dimension and input size running on many nodes in a distributed-memory system. This parallel implementation allows FNOs to be trained and run on problems of practical scale on both CPU and GPU clusters.
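For intuition, the sketch below implements the spectral-convolution core of a single FNO layer in 1D on one node; the distributed version partitions these FFTs and weights across workers, which is not shown. It is written in Julia for consistency with the rest of this program, whereas the talk's implementation is in PyTorch with DistDL.

```julia
# Spectral convolution, the core of an FNO layer: transform to Fourier
# space, weight a truncated set of low-frequency modes with learned complex
# coefficients, and transform back.
using FFTW, Random

function spectral_conv(x::Vector{Float64}, W::Vector{ComplexF64})
    k = length(W)                   # number of retained low-frequency modes
    X = rfft(x)                     # forward real FFT
    Y = zeros(ComplexF64, length(X))
    Y[1:k] .= W .* X[1:k]           # weight retained modes, drop the rest
    return irfft(Y, length(x))      # back to physical space
end

x = randn(MersenneTwister(0), 128)                 # a 1-D input channel
W = randn(MersenneTwister(1), ComplexF64, 16)      # "learned" mode weights
y = spectral_conv(x, W)
println(length(y))                                 # same length as input
```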
Discussion
This time slot is designated for informal discussion and feedback. It is important for us to get input on the application and possible uses of our work, and we would therefore greatly value your feedback. We hope that this format will be conducive to lively discussions.
Informal sessions
Our research team will be available during this time for informal breakout sessions¹ organized along the following themes:
Breakout 1. Active Machine Learning, Explainability, and Uncertainty
Abstract. In this breakout session, we will have informal discussions on the frontier of machine learning and its deployment in practice. In particular, we will continue the theme of the sessions that tackle issues such as labels, trust, uncertainty, and explainability. We will discuss practices that explore learned manifolds to reduce the amount of annotation effort needed for subsurface interpretation and characterization applications using geophysical datasets. Further, we will focus on several methodologies that explain a model’s behavior with the goal of improving its generalizability. A careful study of the decision boundaries constructed during training will be discussed. Such discussions will lead to our recent works that investigate trust and uncertainty in neural network models and their applicability in practice. Also, our primary work on explainability through contrastivity will be discussed and explored for applications in subsurface imaging and analysis. This breakout session will be an interactive exercise where practitioners from participating companies share their feedback and views on topics such as contrastivity, explainability, active learning, decision boundaries, robustness, trust, uncertainty, and generalizability.
Breakout 2. Scalable Software in the Cloud
Abstract. In this breakout session, participants will get hands-on experience with our open-source software (OSS) available on GitHub. After demoing how to install our software, the attendees will be able to connect to Jupyter notebooks, which we will set up on the Azure Cloud². Our Jupyter notebook hub can be reached at
When accessing our Jupyter notebook hub for the first time, you will need to provide your username (this is the email address you used to register for the meeting) and a password of your choice.
Topics discussed include:
Devito—Just-in-time compiler technology. This OSS package is designed to generate highly optimized C code on the fly from symbolic expressions of the wave equation. The resulting code is fast and fully exploits parallelism through multi-threading (OpenMP) and domain decomposition (MPI). It also offers support for different CPUs and GPUs;
JUDI.jl—Scalable open-source inversion framework. This OSS package allows the setup of scalable seismic inversions utilizing Julia’s data parallelism while building on Devito (a minimal usage sketch follows this list). Since JUDI.jl exposes the “linear algebra” of inversion problems, it integrates well with the latest techniques from convex and stochastic optimization (see, e.g., SetIntersectionProjection.jl and SlimOptim.jl);
JUDI4Cloud.jl—JUDI.jl parallelism with Azure Batch. Building on our experience introducing abstractions and serverless implementations of seismic inversion in the Cloud, Microsoft Research developed an additional abstraction layer that allows our codes to run seamlessly and Cloud-natively using Azure Batch. This OSS package allows running code in parallel without the need to set up a conventional cluster, lowering costs while leveraging the existing Cloud software infrastructure;
InvertibleNetworks.jl—A Julia framework for invertible neural networks. This OSS package implements invertible neural networks and normalizing flows using memory-efficient backpropagation. InvertibleNetworks.jl manually implements gradients to take advantage of the invertibility of building blocks, which allows for scaling to large-scale problem sizes. We present the architecture and features of the library and demonstrate its application to a variety of problems ranging from loop unrolling to uncertainty quantification;
Seamless integration w/ Machine Learning via Automatic Differentiation (AD) with ChainRules.jl. This OSS package is designed to integrate imaging codes in JUDI.jl and JUDI4Cloud.jl with implementations of deep neural networks in Flux.jl and our InvertibleNetworks.jl. Compared to TensorFlow/PyTorch, this approach scales well since Zygote.jl’s AD and the abstractions of ChainRules.jl are leveraged;
XConv—Convolutional layers at Scale via Randomized Trace Estimation. To address the computational challenges of machine learning for seismic, low-memory implementations of convolutional layers were developed as part of the OSS package XConv using techniques from randomized linear algebra. These techniques, in conjunction with our experience building the Devito compiler, are leading to major improvements in memory use and computational speed;
TimeProbeSeismic.jl—Memory-efficient seismic inversion via randomized trace estimation. This OSS package implements memory-efficient FWI and RTM using randomized trace estimation in Devito.
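To give a flavor of the hands-on material, below is a minimal JUDI.jl-style modeling sketch patterned after the package’s documented examples; grid sizes, geometry, and wavelet parameters are illustrative only.

```julia
# Minimal JUDI.jl modeling sketch (values illustrative; see the JUDI.jl
# documentation for complete, tested examples).
using JUDI

# Two-layer velocity model; JUDI works with squared slowness m = 1/v^2.
n = (120, 100); d = (10f0, 10f0); o = (0f0, 0f0)
v = 1.5f0 .* ones(Float32, n); v[:, 51:end] .= 3.0f0
model = Model(n, d, o, (1f0 ./ v).^2)

# Receiver and source geometries (times in ms, frequencies in kHz).
timeD = 1000f0; dtD = 2f0
xrec = range(0f0, stop=(n[1]-1)*d[1], length=60)
recGeometry = Geometry(xrec, 0f0 .* xrec, 20f0 .+ 0f0 .* xrec;
                       dt=dtD, t=timeD, nsrc=1)
srcGeometry = Geometry(convertToCell([600f0]), convertToCell([0f0]),
                       convertToCell([20f0]); dt=dtD, t=timeD)
q = judiVector(srcGeometry, ricker_wavelet(timeD, dtD, 0.01f0))  # 10 Hz

# The "linear algebra" of inversion: projection and modeling operators
# compose into a forward map F and its Jacobian J (for RTM/LSRTM).
Pr = judiProjection(recGeometry)
Ps = judiProjection(srcGeometry)
F  = Pr * judiModeling(model) * adjoint(Ps)
d_obs = F * q            # forward modeling of one shot record
J = judiJacobian(F, q)   # linearized Born modeling; J' migrates residuals
```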
In addition to these “general purpose software packages”, an OSS repository is being developed in support of seismic monitoring of Carbon Capture and Sequestration (Seis4CCS.jl). Aside from general setup and recovery with the joint recovery model, this activity involves the development of fast parallel solvers for two-phase flow with Fourier Neural Operators in the OSS package dfno. To help mitigate the risks of CCS, tools have been developed for uncertainty quantification in FastApproximateInference.jl.
Breakout 3. Seismic Imaging & CCS Monitoring w/ Uncertainty Quantification
Abstract. In this breakout session, the latest developments and the road ahead will be discussed for seismic imaging, inversion, and monitoring technology. During this informal session, topics will range from our latest work on wavefield reconstruction inversion via a dual formulation in the time domain to seismic monitoring of CCS and uncertainty quantification with Markov chain Monte Carlo and variational Bayesian inference with normalizing flows. During the discussions, we will also pay ample attention to the next steps we envision for seismic monitoring of CCS, which include formulations where we plan to answer questions such as “How often and when should we collect additional monitoring data?” and “What is the seismic detectability of CO2 plumes, what is the impact of acquisition density, and is there a need to replicate surveys?”.
If you have not already done so, please follow this link to indicate your preference on this Google form.↩
Please make sure you have access to IP addresses within the Azure Cloud domain.↩