---
title: "JUDIAgent: a scientific coding agent for JUDI workflows in wave-equation imaging"
author:
  - name: Haoyun Li
  - name: Abhinav Prakash Gahlot
  - name: Felix J. Herrmann
abstract: |
  We present JUDIAgent, a scientific coding assistant for JUDI.jl, an open-source Julia framework for wave-equation-based seismic modeling, imaging, and inversion. In seismic experimentation, the practical difficulty is not only writing executable code, but assembling a complete seismic workflow with the required physical model, acquisition geometry, modeling or migration operator, and saved outputs needed for inspection.

  JUDIAgent addresses this problem by retrieving JUDI examples, generating Julia code, running the code in the target environment, and checking whether the generated script contains the requested workflow pieces and outputs, such as acquisition geometry and saved figures. We demonstrate the system with validated 2D forward-modeling and reverse-time migration (RTM) examples. These case studies show that, for JUDI-based seismic scripts, domain-aware validation can turn a lightweight user request into a more complete experimental workflow by detecting missing workflow steps that are not captured by runtime correctness alone and by making the resulting scripts easier to inspect and revise.
bibliography: abstract.bib
crossref:
  fig-prefix: Figure
  eq-prefix: Equation
format:
  html:
    page-layout: full
    sidebar: false
    lightbox: true
    crossrefs-hover: true
  pdf:
    template: IMAGEAbstractTemplate.latex
    csl: apa.csl
    cite-method: natbib
    keep-tex: true
    pdf-engine: pdflatex
execute:
  echo: false
---

# Introduction

Running a seismic experiment usually requires more than calling a single modeling or imaging operator. The user must still construct a workflow that specifies the physical model, acquisition geometry, source wavelet, operator sequence, and saved outputs needed for inspection and reruns. In this paper, those workflows are constructed in JUDI.jl, a Julia environment for wave-equation-based seismic modeling, imaging, and inversion [@witte2019judi; @herrmann2019judi], with Devito providing the underlying finite-difference wave-equation solves and stencil generation [@louboutin2019devito; @luporini2020devito].

JUDIAgent targets that workflow-construction step. Given a natural-language request, the system retrieves relevant JUDI examples and documentation, writes Julia code, runs it, and checks whether the generated script contains the workflow pieces and outputs requested in the prompt. The paper describes a domain-specific coding assistant that lowers the barrier to running seismic experiments by making JUDI-based modeling and imaging workflows easier to generate, validate, and inspect.

::: {.content-visible when-format="html"}
More broadly, recent coding-agent work has shown that retrieval, tool use, and iterative repair can improve performance on software tasks [@lewis2020rag; @yao2023react; @yang2024sweagent]. We use those ideas in a geophysical setting. Here retrieval-augmented generation means that code is not produced from the prompt alone: the system first looks up relevant JUDI examples and documentation, passes that material into generation, and then uses failures from execution or workflow review to guide the next repair step. <a href="#fig-iterative">Figure 1</a> summarizes this loop.
:::

::: {.content-visible when-format="pdf"}
More broadly, recent coding-agent work has shown that retrieval, tool use, and iterative repair can improve performance on software tasks [@lewis2020rag; @yao2023react; @yang2024sweagent]. We use those ideas in a geophysical setting. Here retrieval-augmented generation means that code is not produced from the prompt alone: the system first looks up relevant JUDI examples and documentation, passes that material into generation, and then uses failures from execution or workflow review to guide the next repair step. \hyperlink{fig-iterative-anchor}{Figure~1} summarizes this loop.
:::
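
::: {.content-visible when-format="html"}
The loop described above can be sketched in a few lines of Julia. This is a minimal illustration under stated assumptions, not the JUDIAgent implementation: the function `repair_loop`, its keyword arguments `generate`, `run_script`, and `review`, and the `max_rounds` limit are hypothetical names introduced here. The three callbacks stand in for language-model generation, script execution, and task-specific review, and the toy callbacks at the end only imitate a first draft that forgets to save a figure.

```julia
# Hypothetical sketch of a generate-run-review-repair loop.
# None of these names come from JUDIAgent or JUDI.jl.
function repair_loop(prompt, context; generate, run_script, review, max_rounds=3)
    feedback = String[]   # reasons returned to the generator on the next round
    code = ""
    for attempt in 1:max_rounds
        code = generate(prompt, context, feedback)
        ok, errlog = run_script(code)
        if !ok
            # Runtime layer failed: hand the error log back to generation.
            feedback = ["runtime error: " * errlog]
            continue
        end
        missing_items = review(code)
        # Both layers passed: the script runs and the checklist is complete.
        isempty(missing_items) && return (code=code, rounds=attempt, passed=true)
        feedback = ["missing: " * item for item in missing_items]
    end
    return (code=code, rounds=max_rounds, passed=false)
end

# Toy callbacks: the first draft omits the figure, the repaired draft saves it.
toy_generate(prompt, context, feedback) =
    isempty(feedback) ? "d_obs = F * q" : "d_obs = F * q; savefig(\"shot.png\")"
toy_run(code) = (true, "")
toy_review(code) = occursin("savefig", code) ? String[] : ["saved figure"]

result = repair_loop("forward modeling with one figure", ["retrieved example"];
                     generate=toy_generate, run_script=toy_run, review=toy_review)
```

In this toy run the first script executes but fails review, so the loop feeds the missing item back and accepts the second draft. Keeping the callbacks as arguments separates the control flow from any particular model, executor, or checklist.
:::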

The implementation was developed for JUDI-based seismic workflows, but it also benefited from the broader idea that agent tooling can be paired with scientific simulators. The open-source JutulGPT project provided a reference point for agent-oriented scientific software design in another physics domain [@jutulgpt2026], while recent work on scientific workflows, benchmark design, and simulator-centered decision pipelines helped frame the evaluation [@li2024digitaltwin; @li2025seisflowbench; @stodden2014best]. The paper focuses on forward-modeling and imaging tasks in JUDI. Its main contribution is a workflow-oriented coding assistant that combines code generation with runtime checking and with task-specific review, where the validation checklist changes with the seismic task named in the prompt.

# Benchmark specification

The system is easiest to understand through one benchmark task and the rules that define what the generated script must contain. JUDIAgent is paired with a benchmark catalog that covers multiple seismic tasks, including forward modeling and RTM, together with task-specific workspace rules that supply defaults a user would otherwise need to state manually. One RTM prompt can therefore stay short:

```{=latex}
\begin{quote}\small
\texttt{Write a basic RTM example using JUDI.jl and save one RTM figure and one migrated image.}
\end{quote}
```

::: {.content-visible when-format="html"}
```text
Write a basic RTM example using JUDI.jl and save one RTM figure and one migrated image.
```
:::

The short prompt is possible because the task-specific rules and prompt engineering were prepared in advance. For RTM, the benchmark rules require the generated script to distinguish the model used to generate the synthetic data from the migration-velocity model, construct Jacobian-adjoint imaging, save the requested migrated image, and follow plotting conventions taken from retrieved JUDI examples. The same rules also specify gray colormaps, the plotted spatial extent of the migrated image, symmetric clipping chosen from image magnitude, and muting used to suppress shallow and early-time acquisition artifacts. This design reduces how much domain knowledge the user must spell out while still keeping the generated experiment aligned with expert expectations.
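
::: {.content-visible when-format="html"}
The clipping and muting conventions named above can be illustrated with a short Julia sketch. The helper names `clip_value` and `mute_shallow`, the 98th-percentile default, and the synthetic test image are assumptions made for illustration and are not taken from the benchmark rules. The sketch also assumes that depth runs along the second array dimension.

```julia
using Statistics

# Symmetric clip value chosen from the image magnitude: a high percentile of |image|.
# The 0.98 default is an assumed value, not one from the benchmark rules.
clip_value(image::AbstractMatrix; pct::Real=0.98) = quantile(vec(abs.(image)), pct)

# Zero the first `mute_rows` depth samples of an image stored as (nx, nz).
function mute_shallow(image::AbstractMatrix, mute_rows::Integer)
    muted = copy(image)                       # leave the raw image untouched
    muted[:, 1:mute_rows] .= zero(eltype(image))
    return muted
end

# Synthetic stand-in for a migrated image (50 lateral by 40 depth samples).
image = Float32[sin(0.3f0 * ix) * iz for ix in 1:50, iz in 1:40]
muted = mute_shallow(image, 5)
a = clip_value(muted)
# A plotting call would then use -a and a as the color limits with a gray colormap.
```

Choosing the clip from a percentile of the magnitude rather than from the maximum keeps a few strong samples from washing out the rest of the display.
:::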

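JUDIAgent retrieves JUDI examples and then returns Julia code such as the following abridged excerpt, which omits the definitions of the grid parameters, model arrays, time axes, and source `q`: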

```{=latex}
\begin{quote}\small\ttfamily
using JUDI, PythonPlot\\
model\_true = Model(n, d, o, m\_true)\\
model\_mig = Model(n, d, o, m\_mig)\\
xsrc = 150f0 .+ 300f0 .* (0:4); \; xrec = 0f0 .+ 12f0 .* (0:99)\\
src\_geometry = Geometry(xsrc, ysrc, zsrc; dt=dt\_src, t=t\_src); \; rec\_geometry = Geometry(xrec, yrec, zrec; dt=dt, t=tn, nsrc=nsrc)\\
F\_true = judiModeling(model\_true, src\_geometry, rec\_geometry); \; d\_obs = F\_true * q\\
F\_mig = judiModeling(model\_mig, src\_geometry, rec\_geometry); \; d\_residual = d\_obs - F\_mig * q\\
J = judiJacobian(F\_mig, q)\\
rtm\_image\_raw = reshape(adjoint(J) * d\_residual, n)\\
rtm\_image\_muted = copy(rtm\_image\_raw)\\
rtm\_image\_muted[:, 1:mute\_rows] .= 0f0
\end{quote}
```

::: {.content-visible when-format="html"}
```julia
using JUDI, PythonPlot
model_true = Model(n, d, o, m_true)
model_mig = Model(n, d, o, m_mig)
xsrc = 150f0 .+ 300f0 .* (0:4); xrec = 0f0 .+ 12f0 .* (0:99)
src_geometry = Geometry(xsrc, ysrc, zsrc; dt=dt_src, t=t_src)
rec_geometry = Geometry(xrec, yrec, zrec; dt=dt, t=tn, nsrc=nsrc)
F_true = judiModeling(model_true, src_geometry, rec_geometry); d_obs = F_true * q
F_mig = judiModeling(model_mig, src_geometry, rec_geometry); d_residual = d_obs - F_mig * q
J = judiJacobian(F_mig, q)
rtm_image_raw = reshape(adjoint(J) * d_residual, n)
rtm_image_muted = copy(rtm_image_raw)
rtm_image_muted[:, 1:mute_rows] .= 0f0
```
:::

This example shows how the user-facing prompt stays lightweight while the benchmark rules carry much of the task specification. The RTM imaging workflow and its saved outputs are constrained before generation starts, so the produced script can be checked against task requirements rather than against runtime success alone.
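
In operator notation, the imaging step of the excerpt can be written compactly. The symbols below are introduced here for illustration and mirror the variable names in the code: $\mathbf{F}(\mathbf{m})$ is the forward modeling operator for model $\mathbf{m}$, $\mathbf{q}$ is the source, and $\mathbf{J}$ is the Jacobian of the map $\mathbf{m} \mapsto \mathbf{F}(\mathbf{m})\,\mathbf{q}$ evaluated at the migration model.

$$
\mathbf{d}_{\mathrm{obs}} = \mathbf{F}(\mathbf{m}_{\mathrm{true}})\,\mathbf{q}, \qquad
\delta\mathbf{d} = \mathbf{d}_{\mathrm{obs}} - \mathbf{F}(\mathbf{m}_{\mathrm{mig}})\,\mathbf{q}, \qquad
\mathbf{I}_{\mathrm{RTM}} = \mathbf{J}(\mathbf{m}_{\mathrm{mig}}, \mathbf{q})^{\top}\,\delta\mathbf{d}
$$

Here $\delta\mathbf{d}$ corresponds to `d_residual` and $\mathbf{I}_{\mathrm{RTM}}$ to `rtm_image_raw` before muting.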

# Validation

::: {.content-visible when-format="html"}
Validation in JUDIAgent has two layers. The first establishes whether the generated Julia script runs in the target environment. The second establishes whether the requested seismic workflow was produced. We refer to that second layer as task-specific review because its checklist changes with the seismic task named in the prompt. For forward modeling, the review checks for a model, acquisition geometry, a source wavelet, a forward operator, saved synthetic data, and saved figures. For reverse-time migration (RTM), it additionally checks for a migration-velocity model, synthetic observed data, Jacobian-adjoint imaging, and a saved migrated image [@baysal1983reverse; @virieux2009overview]. <a href="#fig-iterative">Figure 1</a> summarizes this loop from prompt to generated code, execution, review, and repair.
:::

::: {.content-visible when-format="pdf"}
Validation in JUDIAgent has two layers. The first establishes whether the generated Julia script runs in the target environment. The second establishes whether the requested seismic workflow was produced. We refer to that second layer as task-specific review because its checklist changes with the seismic task named in the prompt. For forward modeling, the review checks for a model, acquisition geometry, a source wavelet, a forward operator, saved synthetic data, and saved figures. For reverse-time migration (RTM), it additionally checks for a migration-velocity model, synthetic observed data, Jacobian-adjoint imaging, and a saved migrated image [@baysal1983reverse; @virieux2009overview]. \hyperlink{fig-iterative-anchor}{Figure~1} summarizes this loop from prompt to generated code, execution, review, and repair.
:::

The difference between these two layers matters because passing a runtime check is not the same as producing a complete seismic workflow. A script may execute but fail to save the requested data, omit the migration model, or never produce the image named in the prompt. In those cases JUDIAgent returns a concrete reason for failure and asks the model to repair the script rather than treating runtime success alone as enough.
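
::: {.content-visible when-format="html"}
A task-specific review of this kind can be sketched as a checklist that changes with the task and returns concrete reasons for failure. The sketch below is an assumption-laden illustration rather than the JUDIAgent implementation: the constants `FORWARD_CHECKS` and `RTM_CHECKS`, the function `review`, and the regular expressions are hypothetical, and matching constructor names in the script text is only one possible way to detect a workflow piece.

```julia
# Hypothetical checklists: item names follow the description above,
# while the regular expressions are illustrative assumptions.
const FORWARD_CHECKS = [
    "model"                => r"\bModel\(",
    "acquisition geometry" => r"\bGeometry\(",
    "source wavelet"       => r"\bricker_wavelet\(",
    "forward operator"     => r"\bjudiModeling\(",
]
const RTM_CHECKS = vcat(FORWARD_CHECKS, [
    "Jacobian-adjoint imaging" => r"\bjudiJacobian\(",
])

# Return a list of concrete failure reasons; an empty list means the review passed.
function review(script::AbstractString, task::Symbol, outputs::Vector{String})
    checks = task === :rtm ? RTM_CHECKS : FORWARD_CHECKS
    reasons = String[]
    for (name, pattern) in checks
        occursin(pattern, script) || push!(reasons, "script lacks $name")
    end
    for path in outputs
        isfile(path) || push!(reasons, "missing output file: $path")
    end
    return reasons
end

# A script that would run but is incomplete for an RTM request.
script = """
model = Model(n, d, o, m)
F = judiModeling(model, Geometry(xsrc, ysrc, zsrc), Geometry(xrec, yrec, zrec))
"""
reasons = review(script, :rtm, ["no_such_rtm_image_for_sketch.png"])
```

The returned strings are the kind of feedback that can be passed back to generation, so the repair step is told what is missing instead of only that the review failed.
:::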

```{=latex}
\begin{figure*}[t]
\centering
\hypertarget{fig-iterative-anchor}{}
\includegraphics[width=0.94\textwidth]{figs/paper_iterative_workflow.png}
\caption{Iterative JUDIAgent validation loop. A user request is matched with retrieved JUDI examples and documentation, translated into Julia code, executed in the target environment, and then checked for task-required workflow pieces and outputs before either passing or returning repair feedback.}
\label{fig-iterative}
\end{figure*}
```

::: {.content-visible when-format="html"}
<div id="fig-iterative">

![Figure 1. Iterative JUDIAgent validation loop. A user request is matched with retrieved JUDI examples and documentation, translated into Julia code, executed in the target environment, and then checked for task-required workflow pieces and outputs before either passing or returning repair feedback.](figs/paper_iterative_workflow.png)

</div>
:::

::: {.content-visible when-format="html"}
The current codebase also supports a command-line interface that exposes the prompt, generated code, validation messages, and saved outputs during a session, as shown in <a href="#fig-cli">Figure 2</a>.
:::

::: {.content-visible when-format="pdf"}
The current codebase also supports a command-line interface that exposes the prompt, generated code, validation messages, and saved outputs during a session, as shown in \hyperlink{fig-cli-anchor}{Figure~2}.
:::

```{=latex}
\begin{figure*}[t]
\centering
\hypertarget{fig-cli-anchor}{}
\includegraphics[width=0.72\textwidth]{figs/paper_cli_view.png}
\caption{JUDIAgent Console view showing the user prompt, generated Julia code, validation feedback, and saved outputs exposed during one coding session. The interface illustrates how the system presents its main functions to the user rather than only returning a final script.}
\label{fig-cli}
\end{figure*}
```

::: {.content-visible when-format="html"}
<div id="fig-cli">

![Figure 2. JUDIAgent Console view showing the user prompt, generated Julia code, validation feedback, and saved outputs exposed during one coding session. The interface illustrates how the system presents its main functions to the user rather than only returning a final script.](figs/paper_cli_view.png)

</div>
:::

# Benchmark results

We evaluate JUDIAgent as a scientific coding system rather than as a new numerical imaging method. The question is whether it can generate executable and inspectable JUDI workflows for two representative tasks: 2D forward modeling and RTM. This setup also serves a reproducibility goal in computational science because the system should produce scripts and outputs that another user can inspect and rerun [@stodden2014best; @wilson2017good; @li2025seisflowbench].

The two benchmark tasks stress different parts of the workflow. Forward modeling tests whether the agent can assemble a script that defines the model, geometry, synthetic data, and requested outputs. RTM is stricter because, in this synthetic benchmark, it requires the agent to keep track of the model used to generate the synthetic data, a separate migration-velocity model, Jacobian-adjoint imaging, and image export.

::: {.content-visible when-format="html"}
The forward-modeling case asks for a two-layer acoustic model, five sources, one hundred surface receivers, synthetic data, and two saved figures. The resulting shot gather shows a clear direct arrival and a weaker later reflection that is consistent with the layered model. The setup panel documents the physical model that the prompt requested. <a href="#fig-forward-case">Figure 3</a> shows both outputs side by side.
:::

::: {.content-visible when-format="pdf"}
The forward-modeling case asks for a two-layer acoustic model, five sources, one hundred surface receivers, synthetic data, and two saved figures. The resulting shot gather shows a clear direct arrival and a weaker later reflection that is consistent with the layered model. The setup panel documents the physical model that the prompt requested. \hyperlink{fig-forward-case-anchor}{Figure~3} shows both outputs side by side.
:::

::: {.content-visible when-format="html"}
The RTM case asks for a migration workflow with two subsurface models: one model used to produce the synthetic data and a smoother migration-velocity model used to form the image. The generated script also has to form residual data against the migration background, apply Jacobian-adjoint imaging, suppress shallow and early-time acquisition artifacts with muting, and save the migrated image to disk. Unlike the single-model forward setup in <a href="#fig-forward-case">Figure 3</a>, <a href="#fig-rtm-case">Figure 4</a> shows the two-model RTM setup together with the saved image from this benchmark.
:::

::: {.content-visible when-format="pdf"}
The RTM case asks for a migration workflow with two subsurface models: one model used to produce the synthetic data and a smoother migration-velocity model used to form the image. The generated script also has to form residual data against the migration background, apply Jacobian-adjoint imaging, suppress shallow and early-time acquisition artifacts with muting, and save the migrated image to disk. Unlike the single-model forward setup in \hyperlink{fig-forward-case-anchor}{Figure~3}, \hyperlink{fig-rtm-case-anchor}{Figure~4} shows the two-model RTM setup together with the saved image from this benchmark.
:::

```{=latex}
\begin{figure*}[t]
\centering
\hypertarget{fig-forward-case-anchor}{}
\begin{minipage}[t]{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{figs/basic_2d_forward_model_paper.png}
\caption*{(a) Single-model two-layer setup used in the validated forward-modeling benchmark.}
\end{minipage}\hfill
\begin{minipage}[t]{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{figs/basic_2d_forward_shot_paper.png}
\caption*{(b) Central-shot gather produced by the generated JUDI workflow.}
\end{minipage}
\caption{Validated forward-modeling benchmark. The generated workflow produces both the requested two-layer setup panel and a central-shot gather whose direct arrival and later reflection are consistent with the layered model and acquisition geometry.}
\label{fig-forward-case}
\end{figure*}
```

::: {.content-visible when-format="html"}
<div id="fig-forward-case">

| ![(a) Single-model two-layer setup used in the validated forward-modeling benchmark.](figs/basic_2d_forward_model_paper.png) | ![(b) Central-shot gather produced by the generated JUDI workflow.](figs/basic_2d_forward_shot_paper.png) |
|---|---|

Figure 3. Validated forward-modeling benchmark. The generated workflow produces both the requested two-layer setup panel and a central-shot gather whose direct arrival and later reflection are consistent with the layered model and acquisition geometry.

</div>
:::

```{=latex}
\begin{figure*}[t]
\centering
\hypertarget{fig-rtm-case-anchor}{}
\begin{minipage}[t]{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{figs/rtm_basic_setup_paper.png}
\caption*{(a) RTM setup with the model used to generate the synthetic data, a migration-velocity model, and the benchmark acquisition geometry.}
\end{minipage}\hfill
\begin{minipage}[t]{0.49\textwidth}
\centering
\includegraphics[width=\textwidth]{figs/rtm_basic_image_paper.png}
\caption*{(b) RTM image formed from residual data after background subtraction, muting, and display clipping.}
\end{minipage}
\caption{RTM case study generated by JUDIAgent. The left panel shows the two-model benchmark setup; the right panel shows the saved RTM image after subtracting the migration-background prediction from the observed data, applying muting to suppress shallow and early-time acquisition artifacts, and clipping the display by a robust percentile rule. The resulting image recovers the main horizontal reflector and demonstrates that the requested imaging workflow was generated and executed successfully.}
\label{fig-rtm-case}
\end{figure*}
```

::: {.content-visible when-format="html"}
<div id="fig-rtm-case">

| ![(a) RTM setup with the model used to generate the synthetic data, a migration-velocity model, and the benchmark acquisition geometry.](figs/rtm_basic_setup_paper.png) | ![(b) RTM image formed from residual data after background subtraction, muting, and display clipping.](figs/rtm_basic_image_paper.png) |
|---|---|

Figure 4. RTM case study generated by JUDIAgent. The left panel shows the two-model benchmark setup; the right panel shows the saved RTM image after subtracting the migration-background prediction from the observed data, applying muting to suppress shallow and early-time acquisition artifacts, and clipping the display by a robust percentile rule. The resulting image recovers the main horizontal reflector and demonstrates that the requested imaging workflow was generated and executed successfully.

</div>
:::

Taken together, these cases show what the system supports today: JUDIAgent can generate executable JUDI workflows that save figures and data products a user can inspect once the script has run. That is a practical step toward lowering the effort required to run and inspect seismic workflows in JUDI.

# Conclusion

JUDIAgent addresses a practical problem in seismology: translating natural-language requests into JUDI workflows that not only run, but also produce the model setup, operator sequence, saved figures, and saved data products that the user asked for. By combining retrieval from JUDI examples with runtime checking and task-specific review, the system can reject incomplete scripts and repair them with concrete feedback such as missing data files, missing figures, or missing migration steps. The contribution is a lower-barrier interface for generating, validating, and inspecting seismic experiments rather than a new imaging method.

::: {.content-visible when-format="html"}
The current forward-modeling and RTM examples are modest, and we do not claim state-of-the-art imaging performance. An important next step is to extend workflow-completeness review with task-aware diagnostics. For migration tasks, these may include image-residual or illumination summaries that can be evaluated in future benchmark design. The codebase is maintained at <a href="https://github.com/haoyunl2/JUDIAgent">https://github.com/haoyunl2/JUDIAgent</a>.
:::

::: {.content-visible when-format="pdf"}
The current forward-modeling and RTM examples are modest, and we do not claim state-of-the-art imaging performance. An important next step is to extend workflow-completeness review with task-aware diagnostics. For migration tasks, these may include image-residual or illumination summaries that can be evaluated in future benchmark design. The codebase is maintained at \url{https://github.com/haoyunl2/JUDIAgent}.
:::

# Acknowledgment

This work was supported by the Georgia Tech SLIM Group. We thank the developers of JUDI.jl, Devito, LangChain, LangGraph, and JutulGPT for releasing open-source tools that informed this work. During preparation of this manuscript, the authors used generative AI tools in text revision and figure iteration. All content and figures were subsequently reviewed and edited by the authors.

```{=latex}
\clearpage
```

# References {.unnumbered}

::: {#refs}
:::