Julia seismic modeling and inversion toolbox

Philipp Witte (pwitte@eos.ubc.ca)

Overview

This Julia toolbox provides a framework for solving large-scale forward and adjoint wave equations using Devito, a domain specific language compiler that auto-generates highly optimized and parallel stencil codes. The Julia framework is based on abstract linear operators and vectors, in order to formulate a seismic experiment as a chain of matrix-matrix and matrix-vector multiplications. For example, we can express a seismic experiment with four source locations as the following linear algebra operation: $\begin{equation*} \begin{aligned} \begin{pmatrix} \mathbf{d}_1 \\ \mathbf{d}_2 \\ \mathbf{d}_3 \\ \mathbf{d}_4 \end{pmatrix} = \begin{pmatrix} \mathcal{P}_{r_1} & & & \\ & \mathcal{P}_{r_2} & & \\ & & \mathcal{P}_{r_3} & \\ & & & \mathcal{P}_{r_4} \end{pmatrix} \cdot \begin{pmatrix} \mathbf{F}_1 & & & \\ & \mathbf{F}_2 & & \\ & & \mathbf{F}_3 & \\ & & & \mathbf{F}_4 \end{pmatrix} \cdot \begin{pmatrix} \mathcal{P}_{s_1}^\top & & & \\ & \mathcal{P}_{s_2}^\top & & \\ & & \mathcal{P}_{s_3}^\top & \\ & & & \mathcal{P}_{s_4}^\top \end{pmatrix} \cdot \begin{pmatrix} \mathbf{q}_1 \\ \mathbf{q}_2 \\ \mathbf{q}_3 \\ \mathbf{q}_4 \end{pmatrix}, \end{aligned} \end{equation*}$ where $\mathcal{P}_{r_i}$ is a projection operator that restricts the wavefields of the $i^{th}$ experiment modelled in $\mathbf{F}_i$ to the receiver locations. $\mathcal{P}_{s_i}^\top$ is the injection operator of the $i^{th}$ experiment that injects the source $\mathbf{q}_i$ into $\mathbf{F}_i$ . In the most general case, the receiver and source locations as well as sampling intervals and recorded number of time steps can be different for all four sources. In terms of code, this modeling operation is expressed in the following way

d = Pr*F*Ps'*q  # forward modeling

where Pr and Ps are projection operators that contain all the necessary geometry and recording information for the four different experiments. The vectors d and q are seismic data containers that look like vectors, but internally contain the seismic data in its original dimensions as well as its meta data (i.e. coordinates and sampling intervals). The same operators can then be used to solve an adjoint wave equation, in which the four shot records are backproagated and restricted to the source locations:

qad = Ps*F'*Pr'*d   # adjoint modeling

This manual gives an overview over the main components of the Julia modeling framework and how to set up linear operators for forward and adjoint wave equations and linearized wave equations.

Structures

Geometry structure

Information about the seismic data is grouped together in the Geometry structure. The structure contains the x,y and z-coordinates in $$m$$ , the sampling interval of the traces in $$ms$$ and the number of samples. Each of these attributes are stored as cell arrays (Array{Any,1}), where each cell of the array corresponds to an individual seismic experiment.

type Geometry
    xloc::Array{Any,1}
    yloc::Array{Any,1}
    zloc::Array{Any,1}
    dt::Array{Any,1}
    nt::Array{Any,1}
end

A Geometry structure contains either the geometry of one or multiple sources or the geometry of one or multiple shot records. Each attribute of the Geometry structure is a cell array and each cell entry contains the information for the corresponding source or shot record number. For example, the Geometry structure for a 2D seismic data set with two shot records is set up in the following way:

# Empty cell arrays of length 2, since we define the geometry for two shot records
xrec = Array{Any}(2)
yrec = Array{Any}(2)
zrec = Array{Any}(2)
dt_rec = Array{Any}(2)
nt_rec = Array{Any}(2)

# Receiver coordinates
xrec[1] = linspace(100,500,50)  # 50 receivers from 100 to 500 m for the first experiment
xrec[2] = linspace(500,1000,60) # 60 receivers from 500 to 1000 m for the second experiment
yrec[1] = 0
yrec[2] = 0
zrec[1] = linspace(20,20,50)    # receivers at 20 m depth
zrec[2] = linspace(30,30,60)    # receivers at 30 m depth

# Receiver sampling interval and number of time steps
dt_rec[1] = 4   # sample data at 4 ms
dt_rec[2] = 2   # sample data at 2 ms
nt_rec[1] = 1000    # record 1000 samples
nt_rec[2] = 800     # record 800 samples

# Set up receiver geometry
recGeometry = Geometry(xrec,yrec,zrec,dt_rec,nt_rec)

Setting up the geometry for the seismic sources happens exactly the same, because our modeling workflow does not distinguish between source and receiver geometries, since both are defined through the same structure. If we use the Geometry structure to define the source geometry, each cell entry of the x,y and z-locations contains either a single coordinate (injection of a single source wavelet) or an array of coordinates (inject multiple sources simultaneously). For example, the source geometry for a 2D seismic survey with two experiments is set up in the following way:

# Empty cell arrays
xsrc = Array{Any}(2)
ysrc = Array{Any}(2)
zsrc = Array{Any}(2)
dt_src = Array{Any}(2)
nt_src = Array{Any}(2)

# Source coordinates
xsrc[1] = [200,400] # inject two sources in the first experiment at 200 and 400 m
xsrc[2] = 500   # inject one source in the second experiment at 500 m 
ysrc[1] = 0
ysrc[2] = 0
zsrc[1] = [5,10]    # sources at depth of 5 and 10 m
zsrc[2] = 15    # source at depth of 15 m

# Source sampling interval and number of time steps
dt_src[1] = 3   # sampling interval of first source
dt_src[2] = 1   # sampling interval of second source
nt_src[1] = 1200    # length of first source
nt_src[2] = 800 # length of second source

# Set up receiver geometry
srcGeometry = Geometry(xsrc,ysrc,zsrc,dt_src,nt_src)

This way of handling source and receiver geometries is very flexible, because it is possible to assign different receiver positions for each seismic experiment. This allows for example to set up marine seismic surveys with streamers, where the receiver locations vary for each shot. However, in other cases (e.g. land acquisition or ocean bottom nodes acquisition) the receiver location is the same for each experiment. In order to avoid having to set up cell arrays that all contain the exact same receiver geometry, there is a different Geometry constructor, that takes one set of dt, nt, x,y and z-coordinates plus the number of experiments as an additional argument. The constructor then sets up the cell arrays of length numExperiments itself and copies the five input parameters into each cell entry.

# Attributes can be arrays, linspaces or cell arrays of length 1
numExperiments = 4
xrec = linspace(100,1000,200)
yrec = 0.
zrec = linspace(50,60,200)
dt = 2.
nt = 1000

# Set up receiver geometry for 4 experiments with constant receiver location
recGeometry = Geometry(xrec,yrec,zrec,dt,nt,numExperiments)

Model structure

Information about the subsurface models is collected in the Model structure. The most basic Model structure has four attributes:

type Model
    n::Tuple
    d::Tuple    
    o::Tuple    
    m::Array
   (rho::Array)
end

where n is a 2 or 3 element tuple (nx,nz) or (nx,ny,nz) with the number of grid points, d is a tuple with the grid spacing in $$m$$ and o is a tuple with the coordinate system origin (also in $$m$$ ). The field m is a 2 or 3D array of the squared slowness in $$s^2/km^2$$ . Furthermore there is an optional field rho for density ( $$g/cm^3$$ ).

For example, the Model structure for a 3D constant velocity model is set up in the following way:

n = (120,150,100)   # no. of gridpoints in x,y,z direction
d = (10,10,10)
o = (0,0,0)

# Constant velocity model with 1.5 km/s
v = ones(n)*1.5
m = (1./v).^2
rho = ones(n)

# Set up model structure with slowness squared only
model = Model(n,d,o,m)

# Set up model structure with slowness squared and density
model = Model(n,d,o,m,rho)

Info structure

The Info structure contains information which is required to set up the JOLI modeling operators. Some information like number of grid points n, number of experiments nsrc or number of computational time steps nt is required by several different operators (i.e. modeling, projection and Jacobian operators) and is therefore collected in a central structure:

type Info
    n::Integer
    nsrc::Integer
    nt::Array{Any,1}
end

The field nt is a cell array that contains the number of computational time steps for each experiment and defines the dimensions of the linear operators. However, the actual number of computational time steps are internally determined from the number of samples of the right-hand-side. The following function returns a cell arrays with the actual number of computational time steps:

nt_comp = get_computational_nt(srcGeometry,recGeometry,model)

The info structure can then be set up in the following way:

N = prod(model.n)   # total number of grid points
nsrc = 4    # number of experiments
info = Info(N,nsrc,nt_comp)

Data container

Seismic data in the Julia modeling framework is collected in a seismic data container called joData. This structure is an abstract JOLI vector, which can be used like a regular vector, but internally contains the seismic data in its original dimensions, as well as its metadata (i.e. the geometry). Both recorded shot records as well as source functions are stored as data containers.

Julia data container

The Julia seismic data container has the following attributes:

type joData # JOLI vector
    name::string
    m::Integer
    n::Integer  
    nsrc::Integer   
    geometry::Geometry
    data::Array{Any,1}
end

The field m is the length of the data container, n=1 is the number of columns, nsrc is the number of experiments, geometry is either the source or receiver geometry as a Geometry structure and data is the source or receiver data, stored as a cell array. Each cell entry of data contains the shot record or source function in its original dimensions. The length m of the data container is the total number of data points in data, i.e. the data container looks like a vectorized version of the data.

A data container can be generated from a Geometry structure and a cell array containing the data for each experiment:

# create data container for source and receiver data
q = joData(srcGeometry,wavelets)
d = joData(recGeometry,data)

Data container operations

So far, there is a limited number of arithmetic operators that are allowed for data containers of type joData:

# Set up data containers
d1 = joData(recGeometry1,data1)
d2 = joData(recGeometry2,data2)

# Addition and subtraction
d2 = d1 + d2    # geometries must be the same
d3 = d1 - d2

# Vertical stacking
d4 = [d1;d2]    # geometries can be different

# Norms
norm(d1,2)  # l2 norm
norm(d1,1)  # l1 norm

Adding or subtracting two data containers requires the geometries in both containers to match and an error is thrown otherwise. Vertical stacking generates a new data container with the length of length(d1) + length(d2). For stacking, the geometries and the data dimensions of the two inputs can be different.

Set up linear operators

Linear operators for seismic modeling are JOLI operators, i.e. they are matrix-free linear operators that can be applied to vectors, transposed or combined with (certain) other operators. The operators are defined as immutable types, which means that the attributes of the operators cannot be changed after their initial construction, which prevents accidental modification.

joProjection

The projection operator for source and receiver restriction/injection is called joProjection and has the following attributes:

immutable joProjection  # JOLI operator
    name::string
    m::Integer
    n::Integer  
    info::Info
    geometry::Geometry
end

The constructor for the projection operator requires only the Info structure as well as the source or receiver Geometry structure:

# Setup projection operators
Pr = joProjection(info,recGeometry) # receiver projection
Ps = joProjection(info,srcGeometry) # source projection

The forward projection operator can be applied to the modeling operator from the next section, in order to restrict the wavefields to the source/receiver locations. The adjoint projection operator can be applied to seismic data containers joData to inject the source/receiver data into the subsurface. An error is thrown if the geometry in the projection operator does not match the geometry in the data container.

joModeling

The basic modeling operator is called joModeling and only contains the Info as well as the Model structure:

immutable joModeling # JOLI operator
    name::string
    m::Integer
    n::Integer  
    info::Info
    model::Model
end

Since the operator has no information about the source and receiver geometries, it can only be used together with a projection operator on either side:

# Setup modeling operator
F = joModeling(info,model)
q = joData(srcGeometry,wavelet)
d = Pr*F*Ps'*q  # forward modeling
qad = Ps*F'*Pr'*d   # adjoint modeling

The forward modeling operator returns the data d as a Julia data container. The adjoint modeling operator can then be applied to d directly in order to backpropagate the data and restrict it to the source position.

We can also combine the operators Pr, F and Ps' into a single combined operator F2, which can then be applied to q directly:

F2 = Pr*F*Ps'
d = F2*q
qad = F2'*d

Combining the operators into a single modeling operator under the hood generates an operator of a different type, called joPDEfull:

immutable joPDEfull # JOLI operator
    name::string
    m::Integer
    n::Integer  
    info::Info
    model::Model
    srcGeometry::Geometry
    recGeometry::Geometry
end

Since joPDEfull is used without the source or receiver projection operators, the source and receiver geometries are stored in joPDEfull itself. Instead of forming this operator by chaining Pr,F and Ps' together, the operator can also be constructed directly by passing the source and receiver geometry structures as additional arguments to joModeling():

# Setup full modeling operator
F2 = joModeling(info,model,srcGeometry,recGeometry)
d = F2*q    # forward modeling
qad = F2'*d # adjoint modeling

joJacobian

The Jacobian is a JOLI operator with the same fields as joPDEfull with an additional field that contains the source (as a data container):

immutable joJacobian # JOLI operator
    name::string
    m::Integer
    n::Integer  
    info::Info
    model::Model
    srcGeometry::Geometry
    recGeometry::Geometry
    source::joData
end

The Jacobian can also constructed from the full modeling operator and the seismic source directly. The forward operator can be applied to a vector dm of type Array and the adjoint operator can be applied to seismic data of type joData:

J = joJacobian(Pr*F*Ps',q)  # or J = joJacobian(F2,q)
dD = J*dm   # returns data container
rtm = J'*dD