---
title: AVA classification as an unsupervised machine-learning problem
author: |
    Ben B. Bougher and Felix J. Herrmann \
    Seismic Laboratory for Imaging and Modeling (SLIM), University of British Columbia
bibliography:
    - SEG.bib
---

## Abstract:
Much of AVA analysis relies on characterizing background trends and anomalies in pre-stack seismic data. Analysts reduce a seismic section into a small number of these trends and anomalies, suggesting that a low-dimensional structure can be inferred from the data. We describe AVA-attribute characterization as an unsupervised-learning problem, where AVA classes are learned directly from the data without any prior assumptions on physics and geological settings. The method is demonstrated on the Marmousi II elastic model, where a gas reservoir was successfully delineated from a background trend in a depth-migrated image.

\vspace*{-0.5cm}

## Introduction
\vspace*{-0.25cm}

In the most general sense, unsupervised learning is a subfield of machine learning that tries to infer hidden structure within an unlabeled dataset. Unsupervised methods are particularly useful when the inferred structure is lower dimensional than the original data. For example, given a list of ``n`` patients in a hospital and their corresponding symptoms ``s``, it is unlikely that each patient-symptom combination is unique. A set of common diseases ``d`` can be inferred from the data, where ``d \ll n,s``. Popular unsupervised learning and data mining methods such as principal component analysis (PCA) and K-Means clustering rely on exploiting low-dimensional structure inherent in the data [@ding_k-means_2004].

Interestingly, interpreted images and geological maps produced by geoscience workflows are of substantially lower dimension than the original field data. The structure of the major sedimentary layers of the Earth is relatively simple, as rocks with similar physical properties are formed along relatively continuous interfaces and facies in the subsurface.
For this reason, we can use a combination of physical models, local geological knowledge, and experience to reduce large seismic and well-log datasets into low-dimensional models of the Earth. Abstractly, we are inferring a low-dimensional Earth model from high-dimensional geophysical data. In this respect, reservoir characterization can be posed as an unsupervised machine-learning problem.

In conventional AVA interpretation, two-term AVA attributes are extracted from seismic angle gathers using the Shuey approximation [@shuey_simplification_1985] as a physical reflectivity model. Multivariate analysis of these attributes leads to an estimate of a background trend of shale-sand reflections and of anomalous outliers that can be considered potential hydrocarbon indicators [@castagna_relationships_1985; @castagna_framework_1998]. Although this has proven to be an effective workflow, its efficacy depends on calibrated seismic data processing that preserves reflection amplitudes throughout migration. In theory, amplitude-preserving migration is feasible [@sava_amplitude-preserved_2001; @zhang_amplitude-preserving_2014; @gajewski_amplitude_2002]; however, there are always large uncertainties and variations in measured AVA responses.

Recent work by @hami-eddine_anomaly_2012 applied neural networks to classify AVA anomalies, while @hagen_application_1982, @saleh_avo_2000, and @scheevel_principal_2001 used principal component analysis (PCA) to characterize pre-stack seismic data. We follow a similar philosophy and demonstrate that conventional AVA characterization can be reformulated as an unsupervised learning problem. In the vernacular of machine learning, the problem generalizes as dimensionality reduction followed by clustering.

## Theory & Method

Starting with angle-domain common-image gathers, we desire a segmented output image where each pixel is classified according to the local AVA response.
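
The workflow framed above (dimensionality reduction followed by clustering, producing a per-pixel class image) can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the implementation used in this work: it assumes PCA for the reduction and K-means for the clustering, the two popular methods named in the introduction, and it runs on synthetic two-term gathers rather than migrated data. The function names (`synthetic_gathers`, `pca_reduce`, `kmeans`, `classify_gathers`), the amplitude values, and the farthest-point initialization are hypothetical choices made for the example.

```python
import numpy as np


def synthetic_gathers(nz=20, nx=30, n_angles=15, seed=0):
    """Toy angle gathers: a background response plus one anomalous patch.

    Each pixel follows an assumed two-term form A + B*sin^2(theta);
    the amplitude values are illustrative only.
    """
    rng = np.random.default_rng(seed)
    theta = np.deg2rad(np.linspace(0.0, 30.0, n_angles))
    A = np.full((nz, nx), 0.10)
    B = np.full((nz, nx), -0.05)
    truth = np.zeros((nz, nx), dtype=int)
    A[8:12, 10:20], B[8:12, 10:20], truth[8:12, 10:20] = -0.10, -0.30, 1
    gathers = A[..., None] + B[..., None] * np.sin(theta) ** 2
    gathers += 0.005 * rng.standard_normal(gathers.shape)  # additive noise
    return gathers, truth


def pca_reduce(X, m):
    """Project the rows of X (n x d) onto the top-m principal components."""
    Xc = X - X.mean(axis=0)  # center each angle column
    # Rows of Vt are principal directions, ordered by singular value.
    _, _, Vt = np.linalg.svd(Xc, full_matrices=False)
    return Xc @ Vt[:m].T  # reduced features, n x m


def kmeans(Z, k, n_iter=50):
    """Lloyd's algorithm with a deterministic farthest-point initialization."""
    centers = [Z[0]]
    for _ in range(1, k):
        # Next center: the sample farthest from all centers chosen so far.
        d2 = np.min([((Z - c) ** 2).sum(axis=1) for c in centers], axis=0)
        centers.append(Z[d2.argmax()])
    centers = np.array(centers)
    for _ in range(n_iter):
        # Squared distance from every sample to every center, shape (n, k).
        d2 = ((Z[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
        labels = d2.argmin(axis=1)
        for j in range(k):
            if np.any(labels == j):  # leave empty clusters unchanged
                centers[j] = Z[labels == j].mean(axis=0)
    return labels


def classify_gathers(gathers, m=2, k=2):
    """Segment an (nz, nx, d) gather volume into k classes per pixel."""
    nz, nx, d = gathers.shape
    X = gathers.reshape(nz * nx, d)  # one row per image point, one column per angle
    X_hat = pca_reduce(X, m)         # reduce d angles to m features
    return kmeans(X_hat, k).reshape(nz, nx)


gathers, truth = synthetic_gathers()
segmented = classify_gathers(gathers, m=2, k=2)
print(segmented.shape)
```

The cluster indices carry no physical meaning on their own; deciding which class corresponds to the background trend and which to an anomaly is left to interpretation.
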
We define the angle gathers as feature vectors ``x_i \in \mathbb{R}^d,\, i \in [1,\ldots,n]``, where ``n`` is the number of samples in the image and ``d`` is the number of angles in the gather. The feature vectors are shaped into a matrix ``X \in \mathbb{R}^{n \times d}``, where each row corresponds to a point in the image and each column corresponds to an angle. Generalizing the data as a feature matrix allows us to work in an unsupervised learning framework (Figure [#fig:unsupervised]).

#### Figure: {#fig:unsupervised}
![](figures/unsupervised_learning.png){width=90%}
:AVA characterization as unsupervised learning.

We assume that the columns of ``X`` are not independent, as the angle response of a reflection is often modeled by simple equations with as few as two parameters (e.g., the two-term Shuey equation). Assuming the existence of a lower-dimensional representation, we can use dimensionality reduction techniques to reduce the number of columns in ``X`` into a new feature matrix ``\hat{X} \in \mathbb{R}^{n\times m}, m<