We address the problem of obtaining a reliable seismic image without prior knowledge of the source wavelet, especially from data that contain strong surface-related multiples. Conventional reverse-time migration requires prior knowledge of the source wavelet, which is either technically or computationally challenging to accurately determine; inaccurate estimates of the source wavelet can result in seriously degraded reverse-time migrated images, and therefore wrong geological interpretations. To solve this problem, we present a “wavelet-free” imaging procedure that simultaneously inverts for the source wavelet and the seismic image, by tightly integrating source estimation into a fast least-squares imaging framework, namely compressive imaging, given a reasonably accurate background velocity model. However, this joint inversion problem is difficult to solve as it is plagued with local minima and the ambiguity with respect to amplitude scalings, because of the multiplicative, and therefore nonlinear, appearance of the source wavelet in the otherwise linear formalism. We have found a way to solve this nonlinear joint-inversion problem using a technique called variable projection, and a way to overcome the scaling ambiguity by including surface-related multiples in our imaging procedure following recent developments in surface-related multiple prediction by sparse inversion. As a result, we obtain without prior knowledge of the source wavelet high-resolution seismic images, comparable in quality to images obtained assuming the true source wavelet is known. By leveraging the computationally efficient compressive-imaging methodology, these results are obtained at affordable computational costs compared with conventional processing work flows that include surface-related multiple removal and reverse-time migration.
Conventional reverse-time migration (RTM, Baysal et al., 1983) requires the knowledge of the source wavelet as prior information, which is used during the forward propagation of the source wavefield. Unfortunately, for field seismic data, this knowledge is either technically (e.g., relying on statistical assumptions, Ulrych et al., 1995) or computationally (e.g., as used in full-waveform inversion, Virieux and Operto, 2009) challenging to accurately determine during the pre-processing procedure. Inaccurate estimates of the source wavelet will introduce errors to the forward propagated source wavefield, and later to the seismic image during the cross-correlation of the source and receiver wavefields. For example, wrong estimates in the phase of the source wavelet can result in misplaced structures in the seismic image, and therefore lead to wrong geological interpretations.
To eliminate the dependence of seismic imaging on the knowledge of the source wavelet, we are motivated to incorporate source estimation into seismic imaging with minimal computational overhead. We achieve this objective via the joint inversion of the seismic image and the source wavelet, by extending the least-squares-migration formalism (Nemeth et al., 1999). Because the forward modelling of the seismic data is linear with respect to the source wavelet, this type of joint inversion problem is known as the separate least-squares problem (Golub and Pereyra, 1973, 2003; Kaufman, 1975; Aleksandr Y. Aravkin and van Leeuwen, 2012). To reduce the excessive simulation cost of conventional least-squares migrations, we leverage curvelet-domain sparsity of the seismic image (Herrmann et al., 2008a), and tightly integrate source estimation into a computationally efficient least-squares imaging formalism (Herrmann and Li, 2012), also known as the compressive imaging method. As a result, we obtain high-resolution least-squares migrated images with simulation costs (in terms of wave-equation solves, not the overall computing time) that roughly equal conventional reverse-time migration, whereas the computational overhead introduced by source estimation is negligible compared to the wave-equation solves.
Although a separable least-squares problem can usually be effectively solved using variable projection (Golub and Pereyra, 2003), applying variable projection to the proposed joint inversion procedure is complicated by the following two issues. First, the compressive imaging formalism adopts a sparsity-promoting objective function, which is different from typical separable least-squares problems (Golub and Pereyra, 2003). We follow van den Berg and Friedlander (2008) and A. Y. Aravkin et al. (2013), and derive an alternative problem formulation that turns the sparsity-promoting objective into a sparse constraint. Second, there is an ambiguity in the amplitude scaling between the model perturbations and the estimated source wavelet, which cannot be resolved using variable projection alone, as in any blind-deconvolution type of problem (Stockham Jr et al., 1975). Inspired by recent developments in Primary Estimation by Sparse Inversion (EPSI, G. J. A. van Groenestijn and Verschuur, 2009; Lin and Herrmann, 2013), we propose to resolve the ambiguity by incorporating surface-related multiples in the inversion. In this way we leverage the self-consistency between the primary events and their corresponding surface-related multiples, and obtain a properly scaled source wavelet that leads to the separation of primaries and the surface-related multiples. As any other imaging algorithm, our proposed method relies on the knowledge of a reasonably accurate background model.
The variable projection method has found its applications in a variety of geophysical problems in recent years, for example, in characterization of P-S wave conversion (Fomel et al., 2003), in velocity model building (T. van Leeuwen and Mulder, 2009; Peters et al., 2014; Peters and Herrmann, 2014), in impedance reconstruction (Métivier et al., 2011; Métivier, 2011), and notably in source estimation during full-waveform inversion (Aleksandr Y. Aravkin et al., 2012; Rickett, 2013; M. Li et al., 2013). In all these applications, an objective function that measures the data misfit is used for the variable projection technique to be applicable. The misfit is usually measured using the \(\ell_2\)-norm (i.e., a least-squares formulation), but other differentiable misfit functions can also be used (Aleksandr Y. Aravkin et al., 2012). In this paper, however, we minimize the \(\ell_1\)-norm of the solution vector to promote sparsity, with the least-squares data-fitting term acting as a constraint. To the authors’ knowledge, this work is the first instance of applying the variable projection technique to an optimization problem with a sparsity-promoting objective.
Regarding the use of multiples in source estimation, successful data-space applications have been demonstrated in the literature of Surface-Related Multiple Elimination (SRME, Verschuur et al., 1992), and EPSI (G. J. A. van Groenestijn and Verschuur, 2009; Lin and Herrmann, 2013; Esser et al., 2015). Based on the same physical principles as described by the SRME relation (Berkhout, 1993; Guitton, 2002; Muijs et al., 2007; N. D. Whitmore et al., 2010; Verschuur, 2011; Liu et al., 2011; Wong et al., 2014; Zhang and Schuster, 2014; Tu and Herrmann, 2015a; Davydenko et al., 2015), we incorporate surface-related multiples in sparsity-promoting seismic imaging including source estimation, and demonstrate why incorporation of the surface-related multiples mitigates the scaling ambiguity that plagues source estimation in seismic imaging. By optimizing in the model space rather than in the data space, we gain the following benefits: (i) we have access to the physical locations of the entire subsurface model, and can therefore overcome some limitations that SRME/EPSI imposes on data acquisition, such as (approximate) source/receiver co-location; (ii) the unknown model parameters are of much smaller dimensions compared to seismic data, which enables us to subsample the source wavefields to relax the dense data-acquisition requirement or the computation cost (Tu and Herrmann, 2015a). Aside from mitigating the scaling ambiguity, multiples also provide extra illumination coverage that complements primaries, especially along the cross-line direction in 3D seismic surveys (Long et al., 2013; S. Lu et al., 2014). However, properly imaging the surface-related multiples that contain many different orders of surface reflections calls for an inversion approach to suppress spurious artifacts from the multiples (Verschuur, 2011; Wong et al., 2014; Zhang and Schuster, 2014; Tu and Herrmann, 2015a), which can be computationally challenging. Tu and Herrmann (2015a) extended the compressive imaging approach by Herrmann and Li (2012) to account for surface-related multiples in a computationally efficient way, assuming prior knowledge of a properly scaled source wavelet. Verschuur (2011) also showed that simultaneous imaging of the primary and multiple wavefields requires the exact knowledge of the source wavelet. Davydenko et al. (2015) proposed a two-step approach to first image the multiples and then the primaries by source estimation. In this work, we are motivated to obviate the need for the exact knowledge of source wavelet, and jointly image the primary and multiple wavefields, by integrating source estimation into the method by Tu and Herrmann (2015a) using variable projection (Golub and Pereyra, 2003; Aleksandr Y. Aravkin and van Leeuwen, 2012).
The paper is organized as follows. First, we formulate the joint sparsity-promoting source- and image-estimation problem, followed by a technical discussion on how to solve \(\ell_1\)-norm minimization problems with source estimation via variable projection. Next, we explain why including surface-related multiples into the formulation helps mitigate the scaling ambiguity during imaging and source estimation. We conclude by demonstrating the advantages of the proposed method over conventional RTM, using carefully elaborated synthetic examples.
As we propose the joint inversion approach in the compressive-imaging framework, we will first briefly review the compressive-imaging formalism and formulate the source estimation problem. Afterwards, we will explain how to solve the problem using variable projection.
Given the knowledge of the source wavelet, compressive imaging aims to obtain least-squares migrated images in a computationally efficient manner, by solving the following Basis Pursuit De-Noise (BPDN) problem (Herrmann and Li, 2012): \[ \begin{equation} \begin{aligned} \mathrm{BPDN}(\vector{w},\sigma): \quad & \min_{\vector{x}} \|\vector{x}\|_1 \\ & \mathrm{subject\ to} \quad \sum_{i\in\Omega} \sum_{j\in\Sigma} \|\undvec{d}_{i,j}-\nabla\op{F}[\vector{m}_{0},{w}_i\undvec{s}_{j}]\op{C}^*\vector{x}\|_{\mathrm{2}}^{\mathrm{2}}\leq\sigma^2. \end{aligned} \label{eq:bpw} \end{equation} \] In this formulation, vector \(\vector{w}\) represents the Fourier spectrum of the source wavelet, and the tolerance parameter \(\sigma\) allows for data mismatch due to noise in the data and modelling errors. Vector \(\vector{x}\) is the curvelet coefficients of the image \(\delta\vector{m}\), i.e., \(\delta\vector{m}=\op{C}^*\vector{x}\) with \(\op{C}\) the curvelet synthesis operator (Candès and Donoho, 2004), and \(\delta\vector{m}\) itself being the perturbations over a given gridded background velocity model \(\vector{m}_0\) parameterized by the square of the slowness. In the data-fitting constraint of the above optimization program, \(i\) indexes the discretized frequencies, and \(j\) indexes the different source experiments. Instead of using all \(n_f\) discretized frequencies and \(n_s\) source experiments, Herrmann and Li (2012) proposed to subsample the monochromatic source experiments to reduce the simulation cost in forward modelling: notations \(\Omega\) and \(\Sigma\) denote the randomly selected \(n_f^{\prime} \ll n_f\) frequencies and \(n_s^{\prime} \ll n_s\) source experiments. By subsampling, the number of wave-equation solves in each iteration is reduced by a factor of \((n_f*n_s)/(n_f^{\prime}*n_s^{\prime})\). While a brief explanation of drawing the simultaneous source experiments is to multiply a tall (i.e., more rows than columns: the number of rows is the number of sequential sources and the number of columns is the number of simultaneous sources) source encoding matrix on the right-hand-side of the source matrix, we refer to Herrmann and Li (2012) and Tu and Herrmann (2015a) on the details of the subsampling procedure. The underlined variables \(\undvec{d}_{i,j}\) and \({w}_i\undvec{s}_{j}\) represent the \(j\)th column of the subsampled primary-only receiver wavefields (i.e., simultaneous data) and point-source wavefields (i.e., simultaneous sources) respectively. We assume that all sources share the same source signature \(\vector{w}\), while it is possible to extend our work to allow for source estimation for each shot separately, by leveraging the fact that the unknown source wavelets still have a much smaller dimension compared to the model parameters and the seismic data. The operator \(\nabla\op{F}\) represents the linearized Born scattering operator. Note that the source weight \(w_i\) is separable from the source vector \(\undvec{s}_j\), i.e., \(\nabla\op{F}[\vector{m}_{0},{w}_i\undvec{s}_{j}]\delta\vector{m}={w}_i\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\delta\vector{m}\). Throughout this paper, we form the operator \(\nabla\op{F}\) based on the two-way wave equation, therefore the application of \(\nabla\op{F}^*\) to data residuals produces reverse-time migrated images (Lailly, 1983). The above formulation means that among all possible solutions, the optimization program looks for the sparsest curvelet coefficients that, after curvelet synthesis and linearized modelling with the simultaneous sources, predict the simultaneous observed data up to a noise threshold specified by \(\sigma\). With the above compressive-imaging formalism in place, we now continue to formulate the source estimation problem.
Now we identify the source wavelet \(w_i, i\in \Omega\) (collected in the vector \(\vector{w}\)) as an additional unknown in the compressive imaging formulation, i.e., we have \[ \begin{equation} \begin{aligned} \mathrm{BPDN}(\sigma): \quad & \min_{\vector{x},\vector{w}} \|\vector{x}\|_1 \\ & \mathrm{subject\ to} \quad \sum_{i\in\Omega} \sum_{j\in\Sigma} \|\undvec{d}_{i,j}-{w}_i\nabla\op{F}[\vector{m}_{0}, \undvec{s}_{j}]\op{C}^*\vector{x}\|_{\mathrm{2}}^{\mathrm{2}} \leq \sigma^2. \end{aligned} \label{eq:bp} \end{equation} \] While conceptually attractive, \(\mathrm{BPDN}(\sigma)\) does not lend itself to develop tractable algorithms. Following early work by van den Berg and Friedlander (2008), we interchange the objective and constraint to arrive at an extension—remember the formulation now includes the source as an unknown, which appears nonlinearly in the \(\mathrm{BPDN}(\sigma)\) program, of the Least Absolute Shrinkage and Selection Operator (LASSO, Tibshirani (1996)) formulation: \[ \begin{equation} \begin{aligned} \mathrm{LASSO}(\tau): \quad & \min_{\vector{x},\vector{w}} f(\vector{x},\vector{w})\doteq\sum_{i\in\Omega} \sum_{j\in\Sigma} \|\undvec{d}_{i,j}-{w}_i\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x}\|_{\mathrm{2}}^{\mathrm{2}} \\ & \mathrm{subject\ to} \quad \|\vector{x}\|_1\leq\tau. \end{aligned} \label{eq:lso} \end{equation} \] As shown by A. Y. Aravkin et al. (2013), the set of minimizers of problem \(\mathrm{BPDN}(\sigma)\) and \(\mathrm{LASSO}(\tau)\) coincide, given the condition that each minimizer of problem \(\mathrm{BPDN}(\sigma)\) satisfies its constraint to equality (i.e., \(f(\vector{x},\vector{w}) = \sigma^2\)). Not coincidently, when this condition is met, each minimizer also satisfies the constraint of the \(\mathrm{LASSO}(\tau)\) problem to equality, meaning that we can look for such a \(\tau\) that the infimum of \(f(\vector{x},\vector{w})\) with \(|\vector{x}|_1\leq\tau\) has a value that equals \(\sigma^2\). Because the objective function of problem \(\mathrm{LASSO}(\tau)\) adopts the canonical form of a separable least-squares problem (Golub and Pereyra, 2003), we can now solve the problem using variable projection combined with projection onto a convex set. In the next few sections, we will first discuss how we solve problem \(\mathrm{LASSO}(\tau)\) with a given \(\tau\), and then discuss how we compute the right \(\tau\) so that problem \(\mathrm{LASSO}(\tau)\) converges to the same solution as problem \(\mathrm{BPDN}(\sigma)\).
To incorporate variable projections for the source wavelet into \(\ell_1\)-norm sparsity promoting imaging, let us first consider iterations that involve soft thresholding that undergirds this type of \(\ell_1\)-norm optimization. Model parameters at the \(k^{th}\) iteration are in this case projected onto the \(\ell_1\)-norm ball \(\|\vector{x}\|_1\leq \tau\), and the involved projected-gradient step is given by \[ \begin{equation} \vector{x}^{k+1}=\mathrm{P}_\mathcal{X}\left[\vector{x}^k+\lambda\nabla_{\vector{x}} f(\vector{x},\vector{w})\bigg|_{\vector{x}=\vector{x}^k,\vector{w}=\vector{w}^k}\right]. \label{eq:pg} \end{equation} \] In this expression, \(\lambda\) is a line-search parameter, \(\nabla_{\vector{x}} f(\vector{x},\vector{w})\bigg|_{\vector{x}=\vector{x}^k,\vector{w}=\vector{w}^k}\) is the gradient of the objective function in Equation \(\eqref{eq:lso}\) with respect to \(\op{x}\) at the \(k^{\mathrm{th}}\) iteration, and \(\mathrm{P}_\mathcal{X}\) denotes the projection onto the feasible set \(\mathcal{X}=\{\vector{x}:\|\vector{x}\|_1\leq\tau\}\).
Contrary to conventional \(\ell_1\)-norm promoting imaging where \(\vector{w}\) is assumed to be known, we need to evaluate this gradient \(\nabla_{\vector{x}} f(\vector{x},\vector{w})\bigg|_{\vector{x}=\vector{x}^k,\vector{w}=\vector{w}^k}\) with variable projection at each iteration for unknown \(\vector{w}^k\)’s. During each iteration, we accomplish this by solving the following optimization problem for each frequency: \[ \begin{equation} \min_{{w}_i}\quad\sum_{j\in\Sigma}\|\undvec{d}_{i,j}-{w}_i\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x}\|_{\mathrm{2}}^{\mathrm{2}}, \label{eq:vp} \end{equation} \] which permits the following closed-form solution for the source wavelet (Pratt, 1999): \[ \begin{equation} \widetilde{w}_i (\vector{x}) = \frac{\sum_{j\in\Sigma}\langle\undvec{d}_{i,j},\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x}\rangle}{\sum_{j\in\Sigma}\langle\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x},\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x}\rangle}. \label{eq:src} \end{equation} \] Here, the angular brackets \(\langle\cdot,\cdot\rangle\) denote the inner product of two vectors, and \(\widetilde{w}_i (\vector{x}), i\in\Omega\) denote complex-valued estimates of the source wavelet at different angular frequencies. The above solution enables us to eliminate the unknown source wavelet term in problem \(\mathrm{LASSO}(\tau)\), rendering the problem to solely and nonlinearly depend on \(\vector{x}\): \[ \begin{equation} \begin{aligned} & \min_{\vector{x}} \overline{f}(\vector{x})\doteq\sum_{i\in\Omega} \sum_{j\in\Sigma} \|\undvec{d}_{i,j}-\widetilde{w}_i(\vector{x})\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x}\|_{\mathrm{2}}^{\mathrm{2}} \\ & \mathrm{subject\ to} \quad \|\vector{x}\|_1\leq\tau. \end{aligned} \label{eq:llso} \end{equation} \] At first inspection, the above problem becomes more complicated than \(\mathrm{LASSO}(\tau)\) as we need to evaluate the derivative of \(\widetilde{w}_i(\vector{x}), i\in\Omega\) with respect to \(\vector{x}\). However, Aleksandr Y. Aravkin and van Leeuwen (2012) (Corollary 2.3) have proved that a stationary point of the above problem remains a stationary point of problem \(\mathrm{LASSO}(\tau)\). Furthermore, we have (Aleksandr Y. Aravkin and van Leeuwen, 2012, Theorem 2.1) \[ \begin{equation} \nabla_{\vector{x}}\overline{f}(\vector{x})=\nabla_{\vector{x}}f(\vector{x},\widetilde{\vector{w}}(\vector{x})). \label{eq:grad} \end{equation} \] The above equation means that in each iteration, after vector \(\vector{x}\) is modified according to Equation \(\eqref{eq:pg}\), we can compute \(\nabla_{\vector{x}}\overline{f}(\vector{x}) \) by evaluating \(\nabla_{\vector{x}}f(\vector{x},\vector{w})\) of problem \(\mathrm{LASSO}(\tau)\) at \(\vector{w}=\widetilde{\vector{w}}(\vector{x})\) via Equation \(\eqref{eq:src}\).
While there is no guarantee the variable projection approach will yield the globally optimal solution as the problem remains non-convex, numerical experiments show that in most cases it enables the optimization program to converge within less iterations (Golub and Pereyra, 2003). Rickett (2013) also showed its superior performance over the simultaneous descent method in source estimation for full-waveform inversion applications. Furthermore, the projections themselves (cf. Equation \(\eqref{eq:src}\)) are computationally affordable as they do not involve additional expensive operations such as wave-equation solves.
In this section, we describe our strategy to select the right \(\tau\) so we actually solve problem \(\mathrm{BPDN}(\sigma)\) by solving problem \(\mathrm{LASSO}(\tau)\). We follow the methodology proposed by van den Berg and Friedlander (2008) in \(\mathrm{SPG}\ell_1\), where a series of \(\tau\)’s are found by solving a root-finding problem on the Pareto trade-off curve. With the unknown source wavelet, we find the \(\tau\)’s by solving the value function \(\nu(\tau)\doteq\infim f(\vector{x},\vector{w})|_{|\vector{x}|_1\leq\tau}=\sigma^2\), which yields the following \(\ell_1\)-norm constraint for the \((l+1)^{th}\) LASSO subproblem using the Newton’s method with an initial guess of \(\tau^0=0\): \[ \begin{equation} \tau^{l+1} = \tau^{l}-\frac{\nu(\tau^l)-\sigma^2}{\nu^\prime(\tau^l)}. \label{eq:nu} \end{equation} \] Using this approach, we arrive at the solution of \(\mathrm{BPDN}(\sigma)\) by solving a series of \(\mathrm{LASSO}(\tau)\) subproblems for gradually increasing \(\tau\)’s. While \(\nu(\tau^k)\) in the above equation can be evaluated straightforwardly with the previous \(\tau\) estimate, the evaluation of \(\nu^\prime(\tau^k)\) is more difficult, because the nonlinearity introduced by the unknown source wavelet violates the linearity assumption on the forward model, which is needed to compute this value (A. Y. Aravkin et al., 2013). However, by treating \(\vector{w}\) as fixed, we approximate \(\nu^\prime(\tau^k)\) by (van den Berg and Friedlander, 2008; A. Y. Aravkin et al., 2013): \[ \begin{equation} \nu^\prime(\tau^k)\approx-\|\sum_{i\in\Omega}\sum_{j\in\Sigma}\op{C}\nabla\op{F}^*[\vector{m}_0,{w}_i\undvec{s}_j]\, (\undvec{d}_{i,j}-\nabla\op{F}[\vector{m}_0,{w}_i\undvec{s}_j]\op{C}^*\vector{x})\|_{\infty}. \label{eq:nuprime} \end{equation} \] With numerical examples, we will show that this approximation works reasonably well.
Except for the extra variable projection step, solving \(\mathrm{LASSO}(\tau)\) is similar to compressive imaging (Herrmann and Li, 2012), during which computational costs can be reduced by working with randomized subsets of data. As shown by Herrmann and Li (2012) and Tu and Herrmann (2015a), solutions of \(\mathrm{BPDN}(\vector{w};\sigma)\) for a given source wavelet can be accelerated significantly by selecting new independent randomized subsets of frequencies and simultaneous sources, after each corresponding LASSO subproblem is solved. We found empirically that working with these new independent subsets leads to faster decay of the model error and to improved robustness with respect to possible linearization errors (Tu et al., 2013)—i.e., \(\undvec{d}_{i,j}-\op{F}[\vector{m}_0,{w}_i\undvec{s}_j]\approx\nabla\op{F}[\vector{m}_0,{w}_i\undvec{s}_j]\delta\vector{m}\). Both findings can be explained because independent subsets tend to break correlations between errors in the solution and the randomly selected subsets of data (Herrmann, 2012).
While these empirical findings lead to a computationally feasible and robust compressive imaging scheme, we need to justify that working with different randomized subsets does not interact adversely with the proposed source estimation approach. Aside from additional empirical evidence from various examples included below, which suggests that there is no measurable adverse effect, we argue that (i) numerical evaluation of the value function, \(\nu^\prime(\tau)\), and its derivative \(\nu^\prime(\tau)\) remain similar as long as we keep the number of frequencies and simultaneous sources the same (evidenced by Herrmann and Li, 2012, Figure 2(c)); and (ii) we only draw new subsets after each \(\mathrm{LASSO}(\tau^{l})\) is solved. The latter argument assumes that estimates of the source at the \(l^{th}\) subproblem have converged, which is reasonable to make because the number of knowns in Equation \(\eqref{eq:src}\) for each frequency far exceeds the dimensionality of the single unknown complex-value for the source.
Despite the fact that there are indications that we arrived at a practical algorithm to estimate source wavelet as well as high-resolution least-squares images, fundamental issues related to the ambiguity with respect to amplitude scalings remain, which are intrinsic to any blind-deconvolution type of problems where both the source wavelet and the seismic image are unknown (Stockham Jr et al., 1975). However, compared to source estimation in data space using the convolution model (Ulrych et al., 1995), source estimation in the image space does not suffer from ambiguities with respect to global phase shifts, which can be explained by the multiplicity of data and move-out characteristics that are specific to the background-velocity model being used. As stated in the introduction, we assume this background model to be given and to be relatively accurate.
Unfortunately, the same observation does not apply to the ambiguity with respect to amplitude scalings, which correspond mathematically to an invariance of our objective functions with respect to amplitude scalings by \(\alpha\in\mathbb{R}^+\)—i.e., we have \[ \begin{equation} f(\vector{x},\vector{w})=f(\frac{1}{\alpha}\vector{x},\alpha\vector{w}). \label{eq:amab} \end{equation} \] This invariance results in amplitude ambiguity in the source estimation, which originates from the fact that the forward model in \(\mathrm{LASSO}(\tau)\), i.e., \({w}_i\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x}\), is bi-linear (i.e., when one variable is fixed, the forward model is linear w.r.t. the second variable) w.r.t. the source wavelet \(\vector{w}\) and the solution vector \(\vector{x}\).
Unfortunately, including sparsity constraints via the \(\ell_1\)-norm or even via the non-convex \(\ell_1/\ell_2\)-norm (Esser et al., 2015), are inadequate to resolve the above amplitude ambiguity. We therefore alternatively resort to the physics of the free surface. By incorporating the free surface, our problem becomes nonlinear because the forward model now includes a “feedback loop” (Verschuur et al., 1992), which generates surface-related multiples from primaries scaled by the (unknown) source wavelet. Extending the physics in this way allows us to mitigate the scaling ambiguity by using observed surface-related multiples to estimate the source wavelet as proposed during EPSI (G. J. A. van Groenestijn and Verschuur, 2009; Lin and Herrmann, 2013). Following recent work by Tu and Herrmann (2015a), we propose to do this by incorporating predictions of surface-related multiples into our formulation for compressive imaging.
If we ignore internal multiples, the total upgoing wavefield, including primaries and surface-related multiples, can be modelled by including the observed upgoing wavefields \(\vector{u}_{i,j}\) at the water surface as areal sources in the linearized Born scattering operator (Tu and Herrmann, 2015a), yielding \[ \begin{equation} \vector{u}_{i,j} \approx \nabla\op{F}[\vector{m}_0, w_i\vector{s}_j-\vector{u}_{i,j}]. \label{eq:lfmm} \end{equation} \] As the water surface has a reflection coefficient of approximately \(-1\), we place the downgoing spatially impulsive source wavefields \(w_i\vector{s}_j\) by “areal” sources \(w_i\vector{s}_j-\vector{u}_{i,j}\), i.e., we include the downgoing receiver wavefields at the water surface \(-\vector{u}_{i,j}\) into the source wavefields. We obtain \(\vector{u}_{i,j}\) after careful pre-processing to the observed seismic data, such as applying source-receiver reciprocity (to fill in missing traces), up-down decompositions (for receiver-side deghosting), and extrapolation of the upgoing wavefield from the receiver level to the free surface (see e.g., Verschuur et al., 1992). Note that source ghosts should be kept intact in the data (Verschuur et al., 1992). As a result, we arrive at a formulation that remains conducive to sparsity-promoting imaging with source estimation via variable projection. Because prediction for the multiples are carried out by the wave-equation solver, this forward model is also computationally viable compared to processing work flows that involve separate multiple-prediction/separation and RTM-imaging procedures.
To arrive at our final formulation, we first replace our objective by \[ \begin{equation} f(\vector{x},\vector{w})=\sum_{i\in\Omega} \sum_{j\in\Sigma}\|\undvec{u}_{i,j}-\nabla\op{F}[\vector{m}_{0},{w}_i\undvec{s}_{j}-\undvec{u}_{i,j}]\op{C}^*\vector{x}\|_{\mathrm{2}}^{\mathrm{2}}, \label{eq:fmul} \end{equation} \] which we obtain by substituting the primary wavefield with the total upgoing wavefield and the spatially impulsive sources with areal sources containing the total downgoing wavefield at the surface. Given this new definition for the objective, we proceed by solving the \(\mathrm{LASSO}(\tau)\) subproblems with variable projections. For this end, we solve for all (simultaneous) sources and for each frequency the following problem: \[ \begin{equation*} \min_{{w}_i}\quad\sum_{j\in\Sigma}\|\undvec{u}_{i,j}-\nabla\op{F}[\vector{m}_{0},-\undvec{u}_{i,j}]\op{C}^*\vector{x}-{w}_i\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x}\|_{\mathrm{2}}^{\mathrm{2}},\quad i\in\Omega, \end{equation*} \] which in turn permits the following analytic solution: \[ \begin{equation} \tilde{{w}}_i (\vector{x}) = \frac{\sum_{j\in\Sigma}\langle\undvec{u}_{i,j}-\nabla\op{F}[\vector{m}_{0},-\undvec{u}_{i,j}]\op{C}^*\vector{x},\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x}\rangle}{\sum_{j\in\Sigma}\langle\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x},\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x}\rangle}. \label{eq:srcmul} \end{equation} \] To arrive at the modified solution for the source wavelet above, we substitute the primary-only wavefield by an estimate for the primaries given by the total total upgoing wavefield minus the current prediction for the surface-related multiples—i.e., \(\undvec{d}_{i,j}\mapsto \undvec{u}_{i,j}-\nabla\op{F}[\vector{m}_{0},-\undvec{u}_{i,j}]\op{C}^*\vector{x}\). The multiple predictions do not require information on the source wavelet; rather, they are carried out by solving wave-equations with the areal sources given by \(-\undvec{u}_{i,j}\). With these substitutions, we are able to estimate the source wavelet by variable projection at additional costs of a single linearized forward modelling operation to predict the multiples.
In return of the increased computational costs, including surface-related multiples during imaging with source estimation has several key advantages, namely (i) including the multiples mitigates the ambiguity in the source estimation, which leads to a scaling of the source wavelet (and the image) that leads to the prediction and separation of the primary and multiple wavefields, which makes it possible to map multiple energy onto the true image; (ii) in seismic imaging, multiples provide extra illumination coverage from primaries (as each receiver acts as a virtual source), especially along the cross-line direction in 3D seismic data (Long et al., 2013; Tu and Herrmann, 2015b).
With the building blocks of our sparsity-promoting imaging with source estimation and multiples in place, we summarize our proposed imaging scheme in Algorithm (1).
As discussed above, our proposed inversion approach requires the knowledge of a reasonably accurate background velocity model \(\vector{m}_0\). As discussed in Tu et al. (2013), we choose a zero \(\sigma\) to prevent the optimization algorithm from being terminated prematurely (Line 3). The initial guess of the source wavelet is simply an impulse at zero time, which corresponds to a flat Fourier spectrum with unit amplitude and zero phase (Line 6). Both the solution vector \(\vector{x}\) and the sparsity level \(\tau\) are simply initialized as zeros.
In the main loop we solve a series of \(\mathrm{LASSO}(\tau^l)\) subproblems. For each subproblem, we draw new independent subsets of randomized frequencies and simultaneous sources (Line 8). We update the sparsity parameter \(\tau^l\) using Newton’s method (Line 9), and solve for the model parameters collected in vector \(\vector{x}\) (Line 10) as well as the source wavelet collected in vector \(\vector{w}\) (Line 13) by variable projection. Note that we do not impose any assumption on the phase of the source wavelet (e.g., minimum phase assumption as used in Robinson, 1967) during source estimation. We terminate the optimization program when the pre-specified maximal number of iterations \(k_{\mathrm{max}}\) is reached.
1. Input: 2. total upgoing wavefield \(\vector{u}\), background velocity model \(\vector{m}_0\), 3. tolerance \(\sigma=0\), iteration limit \(k_{\mathrm{max}}\) 4. Initialization: 5. iteration index \(k\leftarrow 0\), LASSO subproblem index \(l\leftarrow 0\), \(\tau^0\leftarrow 0\), \(\vector{x}^0\leftarrow\mathbf{0}\) 6. \(w_i=1\) for all \(i\in 1,\cdots,n_f\) 7. while \(k<k_{\mathrm{max}}\) do 8. \(\Omega_l\), \(\Sigma_l\), \(\undvec{u}_{i,j}\), \(\undvec{s}_j\leftarrow\) new independent draw 9. \(\tau^l\leftarrow\) determined from \(\tau^{l-1}\) and \(\sigma\) using Newton’s method10. \(\vector{x}^l\leftarrow\begin{cases}\mathop{\mathrm{argmin}}_{\vector{x}}\sum_{i\in\Omega_l, j\in\Sigma_l}\|\undvec{u}_{i,j}-\nabla\monoop{F}[\vector{m}_0,w_i(\vector{x})\undvec{s}_{j}-\undvec{u}_{i,j}]\op{C}^*\vector{x}\|_2^2 \\ \mathrm{subject\ to\ } \|\vector{x}\|_1\leq \tau^l \end{cases}\) 11. //warm start with \(\vector{x}^{l-1}\), solved in \(k^l\) iterations12. //in each iteration, update the source by 13. \({w}_i(\vector{x}) = \frac{\sum_{j\in\Sigma}\langle\undvec{u}_{i,j}-\nabla\op{F}[\vector{m}_{0},-\undvec{u}_{i,j}]\op{C}^*\vector{x},\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x}\rangle}{\sum_{j\in\Sigma}\langle\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x},\nabla\op{F}[\vector{m}_{0},\undvec{s}_{j}]\op{C}^*\vector{x}\rangle}\)14. \(k\leftarrow k+k^l, l\leftarrow l+1\)15. end while16. Output: model perturbations \(\delta\vector{x}=\op{C}^*\vector{x}\)
To validate the efficacy of the proposed imaging scheme with source estimation, we conduct a series of synthetic examples designed to illustrate key aspects of our algorithm including robustness to noise or modelling errors, resolving scaling ambiguity with multiples, and sensitivity to the initial guess of the source wavelet.
Most conventional seismic data processing work flows use the primary wavefield as the input for migration, i.e., after a de-multiple procedure to the observed data, for example using SRME (Verschuur et al., 1992). We therefore first apply our method to primary-only data to demonstrate its advantages over RTM when used in conventional data work flows.
We use a 2D slice of the SEG/EAGE salt model. We pad 10 grid points at the top of the model so that the water layer becomes slightly thicker, to better retain the water velocity near the surface when we smooth the true model to get the background model. In this way we can model and therefore remove the direct waves more accurately. The true and the background velocity models are shown in Figure 1. The model is 3.9 km deep and 15.7 km long, with a grid spacing of 24.38 m (80 ft). We use a fixed-spread configuration, and put 323 co-located sources and receivers with a 48.77 m lateral grid spacing (i.e., every other grid point) at a depth of 24.38 m (i.e., the second vertical grid point). We use a Ricker wavelet with a peak frequency of 5 Hz, and the peak of the wavelet is at 0.25 s. We record for 8 seconds, which in the frequency domain yields 96 discretized frequencies up to 12 Hz (roughly the highest frequency that can be used for a 80-ft grid spacing to avoid numerical dispersion).
To rule out the influence of any imperfection in the data, we first test our method with an idealized setup, i.e., we synthesize and invert the primary-only data using the same linearized modelling engine. To obtain a baseline image, we run a conventional RTM by assuming that we know the true source wavelet (Figure 2a). To remove the low-wavenumber artifacts in the RTM image, we apply a high-pass filtering along the depth dimension (Mulder and Plessix, 2003). Using the true source wavelet, we also obtain a least-squares migrated image using the compressive-imaging approach (Herrmann and Li, 2012), shown in Figure 2b. To demonstrate the importance of using an accurate source wavelet, we obtain the second least-squares migrated image using compressive imaging with a wrong source wavelet, shown in Figure 2c. The wrong source wavelet has a time advance of 0.1 s (i.e., a phase shift) compared with the true source wavelet. We obtain the third least-squares migrated image using the proposed method, shown in Figure 2d.
We can see that compared with conventional RTM (Figure 2a), the proposed approach produces an image of higher spatial resolution given the true source wavelet (Figure 2b). The higher spatial resolution is achieved because the least-squares migration, through many iterations, gradually removes the effect of the Gauss-Newton Hessian on the seismic image (we refer to the “basic equations” section of Nemeth et al., 2000 for more details). However, the image quality can be significantly compromised when the wavelet is wrong. In this case the shifted phase of the wavelet not only produces an image contaminated with subsampling-related noises, but also leads to a shift of the subsurface structures in depth (Figure 2c), which can result in erroneous interpretations. With the proposed source-estimation approach, we obtain a faithful image (Figure 2d), which is comparable to the image obtained using the true source wavelet (Figure 2b).
Remarks on the experiment: In terms of computational cost, we use 16 randomly selected frequencies, 16 simultaneous sources, and 60 iterations for all inversion results; as a result, the four images in Figure 2 involve roughly the same number of wave-equation solves. Note that source estimation by evaluating Equation \(\eqref{eq:src}\) does not involve extra wave-equation solves other than the ones needed for linearized forward modelling required by the imaging step. For all inversion results in this and later sections, after the inversion is finished, we apply a curvelet thresholding to remove residual incoherent noises in the image (Herrmann et al., 2008b). The threshold is chosen in such a way that the thresholded noise does not contain noticeable coherent energy. As our solution vector \(\vector{x}\) is already in the curvelet domain, extra computation incurred in this thresholding step is minimal.
As a quality-control (QC) procedure, we also plot in Figure 3a and 3b the spectra of the estimated source wavelet. For an intuitive interpretation, we also obtain the waveform of the wavelet and show it in Figure 3c. Because reliable source estimates output by the algorithm only exist at frequencies that are sampled in the last \(\mathrm{LASSO}(\tau)\) subproblem, we obtain the waveform of the source wavelet by first computing the spectrum of the source wavelet at all frequencies using Equation \(\eqref{eq:src}\) with the inverted seismic image, and then applying the inverse Fourier transform to the spectrum (this procedure is also applied to all later examples). Because of the scaling ambiguity (of the image as well as the source wavelet) we discussed above, we normalize the amplitude (i.e., the \(\ell_{\inf}\) norm) of both the true and the estimated source wavelets to compare them. The source estimates are highly accurate except for the absolute amplitude scaling.
Compared with data synthesized by linearized modelling, field seismic data are contaminated with all sorts of incoherent and coherent noises. While most incoherent noises can be removed in data pre-processing, coherent noises, such as internal multiples, are difficult to identify and remove and can therefore degrade the quality of the final seismic image. In least-squares migrations, as the objective is to match the predicted linearized data and the observed data, other sources of errors also include linearization errors (i.e., as the name suggests, the errors incurred during linearization of the wave equation) and errors by approximating earth physics using, for example, the acoustic wave-equation. To mimic some of these noises and errors, which we will generally refer to as “modelling errors” hereafter, we choose to use iWave, a 2D time-domain finite-difference acoustic modelling engine (Terentyev et al., 2014), to simulate the observed data, and use our in-house frequency-domain finite-difference acoustic modelling engine to invert the data.
To test the robustness of our method to these “modelling errors” as defined above, we compare the following examples. We again get the baseline image using conventional RTM with the true wavelet (Figure 4a). Next we obtain two least-squares migrated images: one with the true source wavelet, shown in Figure 4b; the second one with source estimation, shown in Figure 4c.
We observe that, in the presence of these modelling errors in the data, (i) the proposed source-estimation method still yields an image that is comparable to the one obtained with the true source wavelet; (ii) both images are still of higher spatial resolution than the conventional RTM image. The modelling errors (especially internal multiples in this case because of the high velocity contrast around the salt boundary) do result in some imaging artifacts beneath the salt structure, but they do not particularly cause problems for the source-estimation approach. The above observations lead to the conclusion that the proposed source-estimation approach is relatively robust with respect to coherent modelling errors in the primary-only data.
Remarks on the experiment: In generating the data with iWave, we use an absorbing boundary condition at the surface to avoid generating surface-related multiples. The observed data are the difference between the data modelled with the true model and the background model, and therefore contain internal multiples. We use the same inversion parameters as the above set of examples, and as a result, the computational costs of the three images in Figure 4 remain roughly the same.
In Figure 5 we again compare the true wavelet and the estimated wavelet as a QC procedure. We can see that in this case the accuracy of the estimated source wavelet degrades gracefully compared with the idealized case, although there exists some difference, mainly in the phase (see Figure 5b and 5c), between the two wavelets. To understand the cause of this difference, we compare in Figure 5d traces that contain strong top-salt reflections from two data sets: one synthesized using our linearized frequency-domain modelling and another one using iWave. We can see that the slight phase shift can also be observed from the data, most likely due to linearization errors especially by smoothing the salt structure and the difference between iWave and our in-house modelling engine. As the wavelet is estimated to fit the observed data, it is understandable that it contains such a phase error.
As demonstrated by the two sets of examples above, our proposed source estimation method, when applied to conventional primary-only data, produces seismic images that are comparable to images obtained using true source wavelets. The method is also relatively robust to modelling errors such as linearization errors and difference between different modelling engines. However, as we discussed above, the scaling ambiguity cannot be solved by imaging with primaries alone because of the blind-deconvolutional nature of the problem. Next we will demonstrate how we resolve the ambiguity using surface-related multiples.
To validate our proposal to incorporate surface-related multiples in the imaging procedure to resolve the amplitude ambiguity, we apply the proposed method to data that contain surface-related multiples. As above, we study the performance of the proposed approach both with an ideal data set and with a more realistic data set.
We use a sedimentary part of the Sigsbee 2B model (The SMAART JV, 2014), which by design has a strong ocean-bottom reflector, to carry out examples using surface-related multiples. We reduce the water depth of the original model, to allow for more orders of multiples to be recorded. The true and the background velocity models are shown in Figure 6. The model is 3.8 km deep and 6 km long, with 7.62 m (25 ft) grid spacing. We again use a fixed-spread configuration, and put 261 co-located sources and receivers with 22.86 m lateral grid spacing (i.e., every three grid point) at a depth of 15.24 meters (i.e., the third vertical grid point). We use a zero-phase Ricker wavelet with a peak frequency of 15 Hz, and record for 8.184 seconds (i.e., 1024 time samples with 8 ms sampling interval). This setup yields 311 discretized frequencies up to 38 Hz.
We synthesize the ideal data set including surface-related multiples using linearized modelling following Equation \(\eqref{eq:lfmm}\). Note that in this case, the multiples are not generated by including an explicit free surface in the model, but by injecting the downgoing receiver wavefield as areal sources (which equates to SRME-type multiple prediction, Tu and Herrmann, 2015a). To show the removal of amplitude ambiguity, we compare the imaging results obtained with source estimation, including both the image and estimates of the source wavelet during the last \(\mathrm{LASSO}(\tau)\) problem, against the true model and source wavelet. The inverted model perturbations are shown in Figure 7a. To show that it is indeed of the true amplitude and phase, we add it back to the background model to estimate the true model. The result is shown in Figure 7b, and is comparable to the true velocity model shown in Figure 6a. We compare the true and estimated source wavelets in Figure 8, without applying any normalization. This comparison shows that in this idealized case, the proposed imaging-with-multiple approach can retrieve the original amplitude (and phase) of both the model perturbations and the source wavelet with high accuracy.
Remarks on the experiment: In the inversion, we use 31 randomly selected frequencies (out of all 311 frequencies), 26 simultaneous sources (out of all 261 sequential sources), and 50 iterations. As a result, the number of wave-equation solves is roughly 1.5\(\times\) of one RTM with all the data (as discussed above, source estimation with multiples introduces a 50% increase in simulation cost).
As above, we model a more realistic data set that contains multiples using iWave with a free-surface boundary condition (Terentyev et al., 2014). The observed data is the difference between the data modelled with the true model and the background model. Note that the “modelling errors” in this data also arise from: (i) approximating the free-surface related source ghosts using a pair of mirrored sources without the free-surface, i.e., we place an extra virtual source with the opposite sign of the true source at a position that is symmetric with the true source w.r.t. the surface level; (ii) approximating the surface-related multiples in the observed data using the SRME-type multiple prediction (although the prediction is carried out implicitly using wave-equation solvers, Tu and Herrmann, 2015a). We discuss these two types of errors in more detail in the discussion section. Because the modelling engines used in data simulation and inversion have different scalings, we can no longer recover the original amplitudes of the image and the wavelet. However, the resulting scaling of the wavelet has a new meaning: it is the scaling that leads to the separation (in a minimum \(\ell\)-2 norm sense) of primaries and surface-related multiples based on the SRME relation. We compare the images obtained with the true source wavelet and with our source estimation in Figure 9. The inversion parameters and therefore the computational costs remain the same as the above set of examples. The two images are comparable, both containing minimal coherent artifacts from surface-related multiples (e.g., periodic copies of the ocean bottom at a deeper section of the image, see Figure 4c and 4e in Tu and Herrmann, 2015a).
We compare the true and the estimated source wavelets in Figure 10. Because we do not recover the original amplitude, we normalize their amplitudes to compare. We observe that compared to the idealized case, the estimated source wavelet has a significant bias towards the low frequencies. Our numerical examples indicate that this bias is related to the extremely strong and geologically unrealistic thin (one grid-distance) high-velocity layer located at the ocean bottom. This layer was included in this Sigsbee 2B benchmark model to generate strong surface-related multiples in the water column (The SMAART JV, 2014). While inclusion of this layer, makes sense to generate multiples it evidently causes issues with wave-equation based RTM, which is based on a linearization that assumes small perturbations while finite-difference solvers can generally not be expected to yield accurate results for such a thin and high-velocity layer. When we include this layer into the background model, we see in Figure 11 that part of the bias is removed because there is no longer a severe linearization error associated with this layer. However, the results are still not perfect since the afore-mentioned forward modelling errors remain. The corresponding source estimates included in Figure 11a is significantly improved and reasonably accurate at the high end of the available spectrum as expected. For completeness, we also include the estimate for the image in Figure 11b, which now no longer includes the unnatural velocity high at the ocean bottom.
To obtain a more quantitative understanding of the proposed method, we study and compare the convergence behaviors of the recovery algorithm with the true wavelet and with source estimation. We use the normalized cross-correlation (NCC) to measure how well the inverted results approximate true model perturbations. For two vectors \(\vector{v}_1\) and \(\vector{v}_2\) (in our case, we vectorize the true and inverted model perturbations first), the NCC is the inner product of two vectors divided by the product of the \(\ell_2\) norms of the two vectors (i.e., normalized inner product). Geometrically the NCC is the cosine of the angle between two vectors and is not sensitive to the amplitude of either vector. The smaller the angle, the closer the two vectors are, and the closer the NCC is to one. We plot the NCC between the model iterates (we save a model iterate whenever a LASSO subproblem is finished) and the true model perturbation.
We plot the NCC curves in Figure 12 for both the imaging-with-primary and the imaging-with-multiple results in the presence of modelling errors (see definition in the previous text). In both cases, we can see that the convergence behaviors of using the true source wavelet and source estimation are comparable, demonstrating the efficacy of the proposed source-estimation method.
We use an impulsive source as the initial guess of the source wavelet in all examples shown in this paper and get reasonable recoveries of the seismic images as well as the source wavelets. In this section, we would like to study the performance of the proposed method when the initial guess of the source wavelet contains notable phase errors. Robustness of the proposed method to these phase errors is desirable. We use the same experiment setup as the imaging-with-multiples examples. We show the two erroneous initial guesses of the source wavelet in Figure 13a, and show their corresponding solution paths in terms of model errors in Figure 13b. We can see despite that both initial guesses of the source wavelet can lead to local minima, the solution paths and the final model errors are comparable to the case where we use an impulsive initial guess, demonstrating the robustness of the proposed method with respect to errors in the the initial guess of the wavelet.
As the optimization approach used to solve the reduced formulation by variable-projection, i.e., Equation \(\eqref{eq:llso}\), resembles the optimization method that alternates between two variables (e.g., used in EPSI, G. J. A. van Groenestijn and Verschuur, 2009, Lin and Herrmann (2013)), one may wonder about the difference between the two terms. To the authors’ knowledge, the term “variable projection” says more about the problem formulation, i.e., to eliminate one variable that the forward model linearly depends on (in our case the unknown source wavelet) using the pseudo inverse (i.e., the least-squares solution), which leads to a formulation that is in the form of a projection onto the orthogonal complement of the column space of the forward model [Golub and Pereyra (2003); aravkin2012IPNuisance]; the term “alternating” optimization says more about the algorithmic procedure used to solve a nonlinear least-squares formulation.
Actually, solving the reduced formulation by variable projection inevitably involves the alternation of the two variables. Golub and Pereyra (2003) stated that “the variable projection algorithm consists of first minimizing for the nonlinear parameters (which also applies to the linear case — in our case it is the unknown model parameters \(\vector{x}\)) and then using the optimal value obtained for the nonlinear parameters to solve for the linear parameters”. The key of solving the reduced formulation is that, the pseudo-inverse (i.e., the least-squares solution) instead of an alternating gradient descent should be computed for the linear variable each time the nonlinear parameters are updated.
From the above discussion, we can see that unless the true source wavelet is known, obtaining the true-amplitude seismic image is out of the question when primary-only data are used due to the scaling ambiguity. Incorporating surface-related multiples in the inversion formulation indicates that we can resolve the ambiguity, i.e., we get both the source estimates and the image with definite amplitudes. However, these definite amplitudes are the “true” amplitudes only if the linearized modelling engine can accurately reproduce the input wavefield given the true source wavefield. In practice, many factors hinder this requirement from being satisfied, for example, scaling of the modelling engine, scaling of the SRME-type multiple-prediction to match the true amplitudes of multiples, and all sorts of coherent “modelling errors” in the data. Nevertheless, the presented method provides a viable approach for the least-squares type of seismic imaging algorithm with an unknown source wavelet to be one step closer towards the retrieval of true amplitudes.
Data pre-processing, as we discussed in the paragraph following Equation \(\eqref{eq:lfmm}\), is of utmost importance for imaging with surface-related multiples based on the SRME relation. Because we formulate our imaging-with-multiple problem using a data-fitting objective, correct modelling of primaries and surface-related multiples in the presence of a free surface is the foundation for successful inversions.
The SRME relation requires that we only remove the receiver ghosts by separating the up- and down-going wavefields (receiver ghosts are the downgoing wavefield), but retain the source ghosts in the data. However, as the Green’s function in the SRME relation should be surface-free, we can only approximate the source ghosts using a “mirror” source, i.e., as explained above, we place an extra virtual source with the opposite sign of the true source (because the free-surface has a reflectivity close to -1) at a position that is symmetric to the true source w.r.t. the surface level (again, no actual surface exists as the Green’s function is surface-free). Our synthetic example shows that the accuracy of this approximation is fairly high for zero-offset traces but decreases with the receiver offset. Figure 14 shows comparisons of two primary data sets with source ghosts modelled with a free-surface and with the mirror source (the finite-difference scheme, except for the boundary condition at the surface, remains the same for these two datasets). Two zero-offset traces and two far-offset (about 2.3 km receiver offset) traces are shown. The error at farther offsets may be one cause for the errors in the estimated source wavelet shown in Figure 10.
Our synthetic example also shows that multiple-prediction using the SRME relation (implicitly embedded in Equation \(\ref{eq:fmul}\)) can be fairly accurate. We again compare two multiple data sets, one modelled with a free-surface (predicted primaries using the mirror-source approach, as shown above, are subtracted from the modelled total upgoing data to obtain a rough estimate of the multiples) and another one modelled using the SRME relation. Note that as primaries, these multiples are upgoing wavefields with source ghosts. Two zero-offset traces and two far-offset traces, after proper scaling (this scalar is one reason that source estimation for imaging-with-multiple is necessary), are shown in Figure 15. For field data, lots of factors can affect successful applications of the SRME relation: missing near-offset data, acquisition aperture for multiple-prediction, cable-feathering, out-of-plane propagation effect, etc (G. J. A. van Groenestijn and Verschuur, 2009; Dragoset and Jeričevi' c, 1998). As a result, inaccurate multiple prediction can be a major source of errors for imaging with multiples.
The proposed method aims to obtain a high-resolution seismic image via sparsity-constrained least-squares migration, without the prior knowledge of the source wavelet. As mentioned previously, because reliable source estimates only exist at frequencies sampled in the last \(\mathrm{LASSO}(\tau)\) subproblem, this method does not replace source estimation methods during pre-processing, which may be needed for certain data processing applications.
When a source wavelet estimated during pre-processing is accurate enough, one can use it to image primary-only data using the compressive imaging method proposed by Herrmann and Li (2012). As shown in Figure 2b and 4b, this approach produces high-fidelity image in this case. However, to obtain an accurate estimation of a source wavelet in pre-processing, e.g., as used in full waveform inversion (Pratt, 1999; Virieux and Operto, 2009), can be computationally expensive by solving extra wave equations. By comparison, the proposed method has the advantage that the extra computation for the source estimates by Equation \(\eqref{eq:src}\) is minimal compared to wave-equation solves, while the convergence of the model is still comparable to the case where we use the true source wavelet, as demonstrated in Figure 12.
In the joint imaging of primaries and surface-related multiples, the proposed method leads to a 50% computational overhead compared with the case where a properly scaled source wavelet is given — “properly scaled” in the sense that this wavelet leads to the separation of the primary and multiple wavefields. However, as in SRME or EPSI, this scaling of the source wavelet cannot be figured out in advance, but should be estimated as part of the inversion (Verschuur, 2011).
We have proposed a fast least-squares imaging procedure that, unlike conventional reverse-time migration, doesn’t require prior knowledge of the source wavelet which is difficult to accurately determine. We formulated this wavelet-free imaging procedure as a nonlinear sparse-inversion problem, by tightly integrating source estimation into the compressive imaging framework, and proposed to solve the problem using variable projection. Because of the blind-deconvolutional nature of this problem, applying variable projection alone could not address the scaling ambiguity in estimating the source wavelet and the seismic image. We therefore continued by resolving the issue through the incorporation of surface-related multiples in the inversion process (although at the expense of an increased computational cost). We subsequently provided a series of examples demonstrating that our method can produce highly accurate estimates of the source wavelet and high-resolution seismic images that are comparable to the ones obtained with the true source wavelet. We also provided examples where we overcome the scaling ambiguity by incorporating surface-related multiples, yielding definite amplitudes for the estimates of the source wavelet as well as for the seismic image. In all these examples, we controlled the simulation cost of the inversion to be more or less comparable to a single reverse-time migration with all the data, without compromising the quality of the images.
We would like to thank the authors of \(\mathrm{SPG}\ell_1\), \(\mathrm{Curvlab}\), and \(\mathrm{iWave}\). We also thank the authors of the SEG/EAGE salt model and the Sigsbee 2B model.
This work was in part financially supported by the Natural Sciences and Engineering Research Council of Canada via the Collaborative Research and Development Grant DNOISEII (375142-08). This research was carried out as part of the SINBAD II project which is supported by the following organizations: BG Group, BGP, CGG, Chevron, ConocoPhillips, DownUnder GeoSolutions, Hess, Petrobras, PGS, Schlumberger, Sub Salt Solutions and Woodside. Ning Tu was also partially financially supported by the Chinese Scholarship Council.
Aravkin, A. Y., and van Leeuwen, T., 2012, Estimating nuisance parameters in inverse problems: Inverse Problems, 28, 115016. doi:10.1088/0266-5611/28/11/115016
Aravkin, A. Y., Burke, J., and Friedlander, M., 2013, Variational properties of value functions: SIAM Journal on Optimization, 23, 1689–1717. doi:10.1137/120899157
Aravkin, A. Y., Leeuwen, T. van, Calandra, H., and Herrmann, F. J., 2012, Source estimation for frequency-domain FWI with robust penalties: 74th eAGE conference and exhibition 2012. Retrieved from https://www.slim.eos.ubc.ca/Publications/Public/Conferences/EAGE/2012/aravkin2012EAGErobust/aravkin2012EAGErobust.pdf
Baysal, E., Kosloff, D., and Sherwood, J., 1983, Reverse time migration: GEOPHYSICS, 48, 1514–1524. doi:10.1190/1.1441434
Berkhout, A. J., 1993, Migration of multiple reflections: SEG technical program expanded abstracts. SEG.
Candès, E. J., and Donoho, D. L., 2004, New tight frames of curvelets and optimal representations of objects with piecewise c2 singularities: Communications on Pure and Applied Mathematics, 57, 219–266. doi:10.1002/cpa.10116
Davydenko, M., Verschuur, D., and Groenestijn, G. van, 2015, Full wavefield migration applied to field data: In 77th eAGE conference and exhibition 2015 (pp. We–N101–10).
Dragoset, W. H., and Jeričevi' c, 1998, Some remarks on surface multiple attenuation: Geophysics, 63, 772–789.
Esser, E., Lin, T. T., Wang, R., and Herrmann, F. J., 2015, A lifted \(\ell_1/\ell_2\) constraint for sparse blind deconvolution: 77th eAGE conference and exhibition 2015.
Fomel, S., Backus, M., DeAngelo, M., Murray, P., and Hardage, B. A., 2003, Multicomponent seismic data registration for subsurface characterization in the shallow gulf of mexico: Offshore technology conference.
Golub, G., and Pereyra, V., 1973, The differentiation of pseudo-inverses and nonlinear least squares problems whose variables separate: SIAM Journal on Numerical Analysis, 10, 413–432. doi:10.1137/0710036
Golub, G., and Pereyra, V., 2003, Separable nonlinear least squares: The variable projection method and its applications: Inverse Problems, 19, R1–R26.
Groenestijn, G. J. A. van, and Verschuur, D. J., 2009, Estimating primaries by sparse inversion and application to near-offset data reconstruction: Geophysics, 74, A23–A28. Retrieved from http://link.aip.org/link/?GPY/74/A23/1
Guitton, A., 2002, Shot-profile migration of multiple reflections: SEG technical program expanded abstracts.
Herrmann, F. J., 2012, Pass on the message: Recent insights in large-scale sparse recovery: 74th eAGE conference and exhibition 2012. Retrieved from https://www.slim.eos.ubc.ca/Publications/Public/Conferences/EAGE/2012/herrmann2012EAGEpmr/herrmann2012EAGEpmr.pdf
Herrmann, F. J., and Li, X., 2012, Efficient least-squares imaging with sparsity promotion and compressive sensing: Geophysical Prospecting, 60, 696–712. doi:10.1111/j.1365-2478.2011.01041.x
Herrmann, F. J., Moghaddam, P. P., and Stolk, C., 2008a, Sparsity- and continuity-promoting seismic image recovery with curvelet frames: Applied and Computational Harmonic Analysis, 24, 150–173. doi:10.1016/j.acha.2007.06.007
Herrmann, F. J., Wang, D., Hennenfent, G., and Moghaddam, P. P., 2008b, Curvelet-based seismic data processing: A multiscale and nonlinear approach: Geophysics, 73, A1–A5. doi:10.1190/1.2799517
Kaufman, L., 1975, A variable projection method for solving separable nonlinear least squares problems: BIT Numerical Mathematics, 15, 49–57.
Lailly, P., 1983, The seismic inverse problem as a sequence of before stack migrations: Conference on inverse scattering: Theory and application. SIAM.
Leeuwen, T. van, and Mulder, W. A., 2009, A variable projection method for waveform inversion: 71st eAGE conference and exhibition.
Li, M., Rickett, J., and Abubakar, A., 2013, Application of the variable projection scheme for frequency-domain full-waveform inversion: Geophysics, 78, R249–R257.
Lin, T. T., and Herrmann, F. J., 2013, Robust estimation of primaries by sparse inversion via one-norm minimization: Geophysics, 78, R133–R150. doi:10.1190/geo2012-0097.1
Liu, Y., Chang, X., Jin, D., He, R., Sun, H., and Zheng, Y., 2011, Reverse time migration of multiples for subsalt imaging: Geophysics, 76, WB209–WB216.
Long, A. S., Lu, S., Whitmore, D., LeGleut, H., Jones, R., Chemingui, N., and Farouki, M., 2013, Mitigation of the 3D cross-line acquisition footprint using separated wavefield imaging of dual-sensor streamer seismic: 75th eAGE conference and exhibition 2013. EAGE.
Lu, S., Whitmore, D., Valenciano, A., and Chemingui, N., 2014, Enhanced subsurface illumination from separated wavefield imaging: First Break, 32, 87–92.
Métivier, L., 2011, Interlocked optimization and fast gradient algorithm for a seismic inverse problem: Journal of Computational Physics, 230, 7502–7518. doi:http://dx.doi.org/10.1016/j.jcp.2011.06.019
Métivier, L., Lailly, P., Delprat-Jannaud, F., and Halpern, L., 2011, A 2D nonlinear inversion of well-seismic data: Inverse Problems, 27, 055005. Retrieved from http://stacks.iop.org/0266-5611/27/i=5/a=055005
Muijs, R., Robertsson, J. O. A., and Holliger, K., 2007, Prestack depth migration of primary and surface-related multiple reflections: Part i—Imaging: Geophysics, 72, S59–S69.
Mulder, W. A., and Plessix, R. E., 2003, One-way and two-way wave-equation migration: SEG technical program expanded abstracts 2003. doi:10.1190/1.1818081
Nemeth, T., Sun, H., and Schuster, G. T., 2000, Separation of signal and coherent noise by migration filtering: Geophysics, 65, 574–583.
Nemeth, T., Wu, C., and Schuster, G. T., 1999, Least-squares migration of incomplete reflection data: Geophysics, 64, 208–221.
Peters, B., and Herrmann, F. J., 2014, A sparse reduced hessian approximation for multi-parameter wavefield reconstruction inversion: SEG technical program expanded abstracts 2014. doi:10.1190/segam2014-1667.1
Peters, B., Herrmann, F. J., and Leeuwen, T. van, 2014, Wave-equation based inversion with the penalty method - adjoint-state versus wavefield-reconstruction inversion: 76th eAGE conference and exhibition 2014.
Pratt, R. G., 1999, Seismic waveform inversion in the frequency domain, part 1: Theory and verification in a physical scale model: Geophysics, 64, 888–901.
Rickett, J., 2013, The variable projection method for waveform inversion with an unknown source function: Geophysical Prospecting, 61, 874–881.
Robinson, E. A., 1967, PREDICTIVE dECOMPOSITION oF tIME sERIES wITH aPPLICATION tO sEISMIC eXPLORATION: GEOPHYSICS, 32, 418–484. doi:10.1190/1.1439873
Stockham Jr, T. G., Cannon, T. M., and Ingebretsen, R. B., 1975, Blind deconvolution through digital signal processing: Proceedings of the IEEE, 63, 678–692.
Terentyev, I. S., Vdovina, T., Symes, W. W., Wang, X., and Sun., D., 2014, iWave: A framework for wave simulation:. http://www.trip.caam.rice.edu/software/iwave/doc/html/index.html.
The SMAART JV, 2014, Sigsbee2B FS & NFS 2D synthetic datasets:. http://www.delphi.tudelft.nl/SMAART/sigsbee2b.htm.
Tibshirani, R., 1996, Regression shrinkage and selection via the lasso: Journal of the Royal Statistical Society Series B Methodological, 58, 267–288. Retrieved from http://www.jstor.org/stable/2346178
Tu, N., and Herrmann, F. J., 2015a, Fast imaging with surface-related multiples by sparse inversion: Geophysical Journal International, 201, 304–317. doi:10.1093/gji/ggv020
Tu, N., and Herrmann, F. J., 2015b, Fast least-squares imaging with surface-related multiples: Application to a North-Sea data set: The Leading Edge, 34, 788–794. doi:10.1190/tle34070788.1
Tu, N., Li, X., and Herrmann, F. J., 2013, Controlling linearization errors in \(\ell_1\) regularized inversion by rerandomization: SEG technical program expanded abstracts. doi:10.1190/segam2013-1302.1
Ulrych, T. J., Velis, D. R., and Sacchi, M. D., 1995, Wavelet estimation revisited: The Leading Edge, 14, 1139–1143. doi:10.1190/1.1437089
van den Berg, E., and Friedlander, M. P., 2008, Probing the Pareto frontier for basis pursuit solutions: SIAM Journal on Scientific Computing, 31, 890–912. doi:10.1137/080714488
Verschuur, D. J., 2011, Seismic migration of blended shot records with surface-related multiple scattering: Geophysics, 76, A7–A13.
Verschuur, D. J., Berkhout, A. J., and Wapenaar, C. P. A., 1992, Adaptive surface-related multiple elimination: Geophysics, 57, 1166–1177. Retrieved from http://link.aip.org/link/?GPY/57/1166/1
Virieux, J., and Operto, S., 2009, An overview of full-waveform inversion in exploration geophysics: GEOPHYSICS, 74, WCC1–WCC26. doi:10.1190/1.3238367
Whitmore, N. D., Valenciano, A. A., and Sollner, W., 2010, Imaging of primaries and multiples using a dual-sensor towed streamer: SEG technical program expanded abstracts. Retrieved from http://link.aip.org/link/?SGA/29/3187/1
Wong, M., Biondi, B., and Ronen, S., 2014, Imaging with multiples using least-squares reverse time migration: The Leading Edge, 33, 970–976. doi:10.1190/tle33090970.1
Zhang, D., and Schuster, G. T., 2014, Least-squares reverse time migration of multiples: GEOPHYSICS, 79, S11–S21. doi:10.1190/geo2013-0156.1