A&A, Volume 690, A387 (October 2024)
Section: Numerical methods and codes
DOI: https://doi.org/10.1051/0004-6361/202451107
Published online: 23 October 2024


1 Introduction

Interferometric imaging is a versatile technique in astronomy that allows us to achieve enormous sensitivity and resolution by combining multiple telescopes. The effective resolution is roughly equivalent to that of a single telescope with a diameter equal to the largest distance between individual stations of the interferometer, which can be thousands of kilometers. For the upcoming Square Kilometer Array (Labate et al. 2022; Swart et al. 2022), the total collecting area might eventually approach one square kilometer, resulting in superior sensitivity compared to any single-dish telescope. Overcoming the resolution and sensitivity limitations of single telescopes comes at the cost of making it more difficult to retrieve images from the observational data.

Mathematically, the recovery of the sky images from interferometric data can be formulated as an inverse problem. The forward relation that computes the corresponding measurement data from a given sky image is known as the radio interferometric measurement equation (Smirnov 2011). As discussed in detail in Sec. 2, the measured data points are essentially noisy and undersampled Fourier modes of the sky image. This makes it impossible to obtain the sky image via direct inversion of the measurement equation, rendering radio interferometric imaging an ill-posed inverse problem.

Solving the inverse problem of radio interferometric imaging requires sophisticated algorithms that impose additional constraints on the sky brightness and regularize possible solutions to it. Historically, CLEAN (Högbom 1974; Schwab & Cotton 1983) has been by far the most widely used imaging algorithm because it is computationally efficient, simple, and easy to use. Over the last decades, CLEAN-based imaging algorithms have been significantly improved, especially for diffuse emission imaging, spectral imaging, and wide-field imaging (Bhatnagar & Cornwell 2004; Cornwell et al. 2008; Rau & Cornwell 2011; Offringa et al. 2014; Offringa & Smirnov 2017). However, CLEAN-based algorithms have several drawbacks, such as limited image fidelity, suboptimal resolution of recovered images, and lack of uncertainty quantification (see, e.g., Arras et al. 2021a for a detailed discussion). To address these limitations, many other imaging algorithms have been developed.

A large class of new imaging algorithms builds on applying compressed sensing techniques to astronomical data (Wiaux et al. 2009). Several incarnations of such algorithms have shown significant improvements in terms of image fidelity and resolution over CLEAN-based reconstructions. Recent examples are Dabbech et al. (2018), Abdulaziz et al. (2019), and Dabbech et al. (2021), utilizing sparsity-based regularizers in combination with convex optimization. Birdi et al. (2018) and Birdi et al. (2020) extended these approaches to full polarization imaging. In Repetti et al. (2019) and Liaudat et al. (2024), some form of uncertainty quantification was added. Dabbech et al. (2022) and Thouvenin et al. (2023) parallelized the image regularizer to distribute it over multiple CPUs. Terris et al. (2023a) and Dabbech et al. (2022) presented a plug-and-play algorithm combining convex optimization with a deep-learning regularizer. Terris et al. (2023b) further explored neural network regularizers and studied the influence of the training dataset on the regularizer.

Bayesian imaging algorithms are another important class of imaging algorithms that address the uncertainty quantification problem. Early Bayesian imaging approaches such as Cornwell & Evans (1985) built on the maximum entropy principle. Other Bayesian imaging techniques (e.g., Sutton & Wandelt 2006; Sutter et al. 2014; Cai et al. 2018; Tiede 2022) rely on posterior sampling techniques. The Bayesian imaging framework resolve1, originally proposed by Junklewitz et al. (2016), builds on variational inference instead of sampling-based techniques for posterior approximation, reducing the computational costs. resolve has already been successfully applied to VLA, EHT, VLBA, GRAVITY, and ALMA data, providing high-quality radio maps with superior resolution compared to CLEAN, as well as an uncertainty map. Recent examples are Arras et al. (2019), who combined Bayesian imaging with calibration. In Arras et al. (2021a) resolve is compared with CLEAN, while in Roth et al. (2023), direction-dependent calibration is added and compared with the sparsity-based imaging results of Dabbech et al. (2021). In GRAVITY Collaboration (2022) resolve is applied to the optical interferometer GRAVITY, and in Arras et al. (2022) resolve is applied to EHT data.

The improvements in imaging algorithms mentioned above come at the cost of increased computational complexity. For Very Long Baseline Interferometry (VLBI) observations, data sizes are typically small and the higher computational cost of advanced imaging methods is unproblematic. However, for large arrays such as MeerKAT (Jonas & MeerKAT Team 2016), this limits the applicability of many of these algorithms. For this reason, only very few advanced imaging algorithms have to date been successfully applied to datasets the size of a typical MeerKAT observation. Dabbech et al. (2022) present an application of two convex optimization imaging algorithms to the MeerKAT ESO 137-006 observation, with one algorithm using a sparsity-based regularizer (uSARA) and the other a neural network regularizer (AIRI). To handle the enormous computational complexity, Dabbech et al. (2022) massively parallelize their imaging algorithm and distribute it across a high-performance computing system, which requires hundreds to thousands of CPU hours to converge. In Thouvenin et al. (2023), this approach of parallelizing the regularizer is extended to the spectral domain.

To the best of our knowledge, to date no Bayesian radio interferometric imaging algorithm has been applied to a dataset of similar size. The goal of the resolve framework has always been to enable Bayesian imaging for a wide range of radio telescopes, not just VLBI observations. However, for instruments such as MeerKAT, imaging with resolve becomes prohibitively computationally expensive. In this paper we present the algorithm fast-resolve, which significantly reduces the computational complexity of classical resolve and enables Bayesian image reconstruction for large datasets. Greiner et al. (2016) previously named an imaging method fastRESOLVE. The goal of the fastRESOLVE algorithm of Greiner et al. (2016) is the same as that of the new incarnation of fast-resolve presented in this paper. However, while the new and the old variants of fast-resolve share some ideas, there are significant differences. In Sec. 3.4 we highlight the common concepts between the old and the new fast-resolve. We note that the framework behind both fast-resolve and resolve has evolved significantly since the time of Greiner et al. (2016).

Machine learning-based imaging algorithms are a third and relatively new class of imaging algorithms. Once the underlying machine learning models are trained, the computational cost for imaging is often lower than for other algorithms. Nevertheless, assessing image fidelity is challenging in such a framework, given the lack of interpretability of machine learning models. Recent examples of such algorithms are Aghabiglou et al. (2024); Dabbech et al. (2024); Connor et al. (2022); Schmidt et al. (2022).

The remainder of the paper is organized as follows. In Sec. 2 we discuss the radio interferometric measurement equation and the imaging inverse problem in detail. In Sec. 3 we briefly review the existing resolve framework in its current form and outline the CLEAN algorithm. Building on some of the computational shortcuts of the CLEAN algorithm, we derive fast-resolve, and in Sec. 4 we show several applications of it. In particular, in Sec. 4.1 we compare imaging results of fast-resolve with the classical resolve framework on VLA data (Perley et al. 2011) to validate the fidelity of the resulting sky images, and in Sec. 4.2 we present a fast-resolve reconstruction of the ESO 137-006 MeerKAT observation to demonstrate the computational speedup.

2 The inverse problem

The radio interferometric measurement equation, for example as derived in Smirnov (2011), relates the radio sky brightness to the datapoints, often called visibilities. More specifically, via the measurement equation, model visibilities $\tilde{V}$ can be computed for an assumed sky brightness $I$ and antenna sensitivity $G$. Under the assumption of scalar antenna gains $G$, the model visibility $\tilde{V}_{pqt}$ of antennas $p$ and $q$ at time $t$ is given as
$$\tilde{V}_{pqt}=\int C\left(\boldsymbol{l}, w_{pqt}\right) I(\boldsymbol{l})\, G_p(t, \boldsymbol{l})\, G_q^*(t, \boldsymbol{l})\, e^{-2 \pi i\left(\boldsymbol{k}_{pqt} \cdot \boldsymbol{l}\right)} \mathrm{d} \boldsymbol{l}, \tag{1}$$

where

  • $\boldsymbol{l} = (l, m)$ are the sky coordinates,

  • $t$ is the time coordinate,

  • $\boldsymbol{k}_{pqt} = (u_{pqt}, v_{pqt})$ are the baseline uv-coordinates in units of the imaging wavelength,

  • $C(\boldsymbol{l}, w_{pqt}) = \exp\left(-2 \pi i w_{pqt}\left(\sqrt{1-\boldsymbol{l}^2}-1\right)\right) / \sqrt{1-\boldsymbol{l}^2}$ is the w- or non-coplanar baselines effect,

  • $I(\boldsymbol{l})$ is the sky brightness, and

  • $G_p(t, \boldsymbol{l})$ is the gain of antenna $p$, depending on time and potentially also direction.

The model visibilities are therefore Fourier-like components of the sky brightness $I$ modulated by the antenna gains $G_p$ and $G_q$ as well as the w-effect $C$. The visibilities actually recorded in the measurement are related to the model visibilities according to
$$V_{pqt}=\tilde{V}_{pqt}+n=R(I)+n, \tag{2}$$

with n representing some unknown noise in the measurement, and R the mapping from sky brightness to visibilities defined in Eq. (1). With Eqs. (1) and (2), it is straightforward to compute simulated visibilities for an assumed sky brightness I, antenna gain G, and noise statistics. Inverting this relation is not possible without additional assumptions since first, the measured visibilities V are corrupted by the measurement noise n, and second, Eq. (1) is generally not invertible since in practical applications not all Fourier components are measured.
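To make the forward relation concrete, the following minimal numpy sketch evaluates a simplified version of Eqs. (1) and (2) for a small pixelized sky, assuming unit antenna gains and neglecting the w-term; all array sizes, coordinates, and the noise level are toy values, and production codes such as resolve use an optimized gridder instead of this direct, slow pixel sum.

```python
import numpy as np

# Toy forward model: Eqs. (1) and (2) with G_p = G_q = 1 and C = 1 (no w-term).
rng = np.random.default_rng(42)

npix, fov = 64, 0.01                          # pixels per axis, field of view [rad]
l = (np.arange(npix) - npix / 2) * (fov / npix)
ll, mm = np.meshgrid(l, l, indexing="ij")     # sky coordinates l = (l, m)

sky = np.zeros((npix, npix))
sky[20, 30] = 1.0                             # a point source
sky += np.exp(-((ll - 2e-3) ** 2 + mm ** 2) / (2 * 1e-3 ** 2))  # diffuse blob

nvis = 500
uv = rng.uniform(-3e3, 3e3, size=(nvis, 2))   # baselines (u, v) in wavelengths

# Model visibilities: direct sum over pixels of I(l) exp(-2 pi i (u l + v m))
phase = -2j * np.pi * (uv[:, :1] * ll.ravel() + uv[:, 1:] * mm.ravel())
model_vis = np.exp(phase) @ sky.ravel()

# Measured visibilities: model visibilities plus complex Gaussian noise (Eq. (2))
sigma = 0.1
noise = sigma * (rng.standard_normal(nvis) + 1j * rng.standard_normal(nvis))
vis = model_vis + noise
```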

The non-uniqueness of solutions to Eq. (2) means that additional regularization needs to be imposed to discriminate between all possible sky images compatible with the data. This additional information is either an explicit prior or regularization term or is implicitly encoded in the structure of the imaging algorithm.

3 Methods

In this section we derive the fast-resolve algorithm (Sec. 3.3). Since fast-resolve builds on the already existing Bayesian imaging framework resolve, we start with a brief review of the classic resolve imaging method (Sec. 3.1). As some of the computational speedups of fast-resolve are inspired by the CLEAN imaging algorithm, we also outline the basic concepts behind CLEAN (Sec. 3.2).

3.1 resolve

The resolve imaging algorithm addresses the imaging inverse problem from a probabilistic perspective. Thus, instead of reconstructing a single estimate of the sky brightness, it infers the posterior probability distribution $P(I\mid V)$ of possible sky images given the measured visibilities. Bayes’ theorem,
$$P(I \mid V)=\frac{P(V \mid I)\, P(I)}{P(V)}, \tag{3}$$

expresses the posterior probability P(I|V) in terms of the likelihood P(V|I), the prior P(I), and the evidence P(V). resolve provides models for the likelihood P(V|I) and the prior P(I). The posterior distribution can be inferred for a given prior and likelihood model building on the functionality of the Bayesian inference package NIFTy2 (Selig et al. 2013; Steininger et al. 2019; Edenhofer et al. 2024a). In the next three subsections we briefly outline the likelihood and prior of resolve as well as the variational inference algorithms of NIFTy.

3.1.1 resolve prior

The resolve framework provides predefined prior models for the two types of radio emission, point sources and extended diffuse emission. Both priors encode that the brightness must be positive since there is no negative flux. Furthermore, both priors are very flexible and allow for brightness variations over several orders of magnitude.

For the point source prior, pixels are independently modeled with an inverse gamma prior for their intensity. The inverse gamma distribution is strictly positive and has a wide tail, allowing extremely bright sources. In the example in Sec. 4.1, such a prior is applied for the two bright point sources in the core of Cygnus A. While the brightness of the point sources is reconstructed from the data, the locations currently need to be manually set. This limits the applicability of the current point source prior, as discussed in the application to MeerKAT ESO 137-006 data (Sec. 4.2).

In addition to positivity and possible variations over several orders of magnitude, the diffuse emission prior also encodes correlations of the brightness of nearby pixels, which is essentially the defining property of diffuse emission. The correlation of nearby pixels in the diffuse emission prior is modeled by Gaussian processes. The result of the Gaussian process is exponentiated to ensure positivity. Detailed explanations of the resolve prior models can be found in Arras et al. (2021a). An important aspect of all the prior models in resolve is that they are fast and scalable to very large numbers of pixels. For example, in Arras et al. (2021a) a 4096 × 2048 pixel Gaussian process-based diffuse emission prior is used to image Cygnus A at various frequencies.
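As an illustration of these prior building blocks, the following numpy sketch draws one sample from an exponentiated Gaussian process with a fixed power-law power spectrum and adds two inverse-gamma distributed point sources at hand-picked pixels; the spectral slope, offset, and inverse-gamma hyperparameters are ad hoc choices for illustration, whereas the actual resolve prior also infers the power spectrum from the data.

```python
import numpy as np

# One prior sample: exponentiated Gaussian process (diffuse emission) plus
# inverse-gamma point sources at manually chosen pixel positions.
rng = np.random.default_rng(0)
npix = 256

# Power-law power spectrum on the Fourier grid (zero mode excluded)
kx = np.fft.fftfreq(npix)[:, None]
ky = np.fft.fftfreq(npix)[None, :]
k = np.hypot(kx, ky)
amplitude = np.zeros((npix, npix))
amplitude[k > 0] = k[k > 0] ** (-3.0 / 2)       # sqrt of a k^-3 power spectrum

# Gaussian random field with this spectrum, drawn via FFT of white noise
white = rng.standard_normal((npix, npix))
gp = np.fft.ifft2(amplitude * np.fft.fft2(white)).real
gp /= gp.std()                                  # unit fluctuation level

diffuse = np.exp(-2.0 + gp)                     # strictly positive diffuse brightness

# Point sources: independent inverse-gamma draws (q / Gamma(alpha) samples)
alpha, q = 1.5, 1e-4                            # illustrative hyperparameters
sky = diffuse.copy()
for pos in [(128, 128), (140, 120)]:
    sky[pos] += q / rng.gamma(alpha)
```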

3.1.2 resolve likelihood

The likelihood is evaluated in resolve using Eq. (1). More specifically, using Eq. (1), we can write the likelihood as $P(V \mid I)=P(V \mid R(I))=P(V \mid \tilde{V})$. The noise statistics in Eq. (2) then determines the likelihood. In resolve, Gaussian noise statistics are assumed. For numerical reasons, resolve works with the negative logarithm of the likelihood, named the likelihood Hamiltonian, instead of the likelihood itself. With the Gaussian noise assumption, the likelihood Hamiltonian is given by
$$H(V \mid I)=\frac{1}{2}\,(V-R(I))^{\dagger} N^{-1}\,(V-R(I))+\frac{1}{2} \ln |2 \pi N|, \tag{4}$$

with $\dagger$ denoting the complex conjugate transpose and $\frac{1}{2} \ln |2 \pi N|$ coming from the normalization of the Gaussian, which can be ignored in many applications. Different possibilities are implemented for the noise covariance $N$. In the simplest case, the weights of the visibilities are used as the inverse noise covariance. Alternatively, the noise covariance can also be estimated during the image reconstruction.
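As a minimal illustration, the Gaussian likelihood Hamiltonian of Eq. (4) (without the constant log-determinant term) can be written in a few lines of numpy, assuming the visibility weights are used as the inverse noise variance and the model visibilities R(I) are already available as an array:

```python
import numpy as np

def likelihood_hamiltonian(vis, model_vis, weights):
    """0.5 * (V - R(I))^dagger N^{-1} (V - R(I)) for diagonal N = diag(1/weights)."""
    residual = vis - model_vis
    return 0.5 * np.sum(weights * np.abs(residual) ** 2)
```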

The computationally important aspect of the likelihood in the classic resolve framework is that for every update step where the likelihood is evaluated, also R(I) and thus Eq. (1) need to be computed. This can be computationally expensive, especially for datasets with many visibilities, as we discuss in detail later.

3.1.3 resolve posterior inference

resolve is built on the probabilistic programming package NIFTy, which provides variational inference methods (Knollmüller & Enßlin 2019; Frank et al. 2021) to approximate the posterior distribution for a given prior and likelihood function. The advantage of variational inference techniques over sampling techniques such as Markov chain Monte Carlo (MCMC) or Hamiltonian Monte Carlo (HMC) is that they scale better with the number of parameters, which is the number of pixels in the imaging context. For example, the variational inference method of NIFTy has recently been used in a 3D reconstruction with 607 million voxels (Edenhofer et al. 2024b). While the variational inference algorithms scale very well with the number of parameters, they still need to evaluate the likelihood very often. This means variational inference is fast as long as the evaluation of the likelihood and prior is fast.

As discussed in Sect. 3.1.2, evaluating the likelihood in resolve boils down to evaluating the radio interferometric measurement equation (Eq. (1)). To do so resolve relies on the parallelizable wgridder (Arras et al. 2021b) implemented in the ducc3 library. Nevertheless, evaluating the measurement equation becomes computationally intensive for datasets with many visibilities. For example, for the MeerKAT dataset considered in Sec. 4.2, evaluating Eq. (1) on eight threads takes about 45 seconds. This becomes prohibitive for algorithms that require many thousands of likelihood evaluations.

3.2 CLEAN

In this section we briefly outline some of the concepts behind the CLEAN algorithm (Högbom 1974; Schwab & Cotton 1983) as the computational speedups of fast-resolve over classic resolve are partly inspired by CLEAN. We refrain from delving into the details behind the numerous improvements that have been made to the CLEAN algorithm over the last decades (see Rau et al. (2009) for example) and focus only on the aspects that are relevant to speeding up resolve. We refer to Arras et al. (2021a) for a detailed comparison between CLEAN and resolve.

CLEAN is an iterative optimization algorithm minimizing the weighted square residuals between the measured visibilities and the model visibilities computed from the sky brightness model. Expressed as a formula, the objective function minimized by CLEAN is identical to Eq. (4), the likelihood Hamiltonian of resolve. Minimizing Eq. (4) with respect to the sky brightness $I$ is equivalent to solving
$$R^{\dagger} N^{-1} R\, I=R^{\dagger} N^{-1} V, \tag{5}$$

with $R^{\dagger}$ being the adjoint operation of $R$, thus mapping from visibilities to the sky brightness. Neglecting wide-field effects originating from non-coplanar baselines, the operation $R^{\dagger} N^{-1} R$ is equivalent to a convolution with the effective point spread function (PSF) of the interferometer, $I^{\mathrm{PSF}}$. $R^{\dagger} N^{-1} V$ is the back projection of the noise-weighted data into the sky domain, called the dirty image $I^{\mathrm{D}}$. This can be expressed with the formula
$$I^{\mathrm{PSF}} * I \approx R^{\dagger} N^{-1} V=I^{\mathrm{D}}, \tag{6}$$

with $I^{\mathrm{PSF}}$ being the PSF of the interferometer and $*$ denoting convolution. Thus, the dirty image $I^{\mathrm{D}} = R^{\dagger} N^{-1} V$ is approximately the true sky brightness $I$ convolved with the PSF $I^{\mathrm{PSF}}$ of the interferometer. Radio interferometric imaging is therefore nearly equivalent to deconvolving the dirty image.
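The equivalence in Eq. (6) can be checked numerically; the following 1D toy example (arbitrary baselines and weights, no w-term) applies $R^\dagger N^{-1} R$ once as an explicit matrix product and once as a linear convolution with the PSF, and the two agree up to floating-point error:

```python
import numpy as np

# 1D toy check of Eq. (6): without the w-term, R^dagger N^-1 R acts as a PSF convolution.
rng = np.random.default_rng(1)
npix, nvis = 64, 200
x = np.arange(npix)                              # pixel coordinates
u = rng.uniform(-0.4, 0.4, size=nvis)            # toy baselines in cycles per pixel
w = rng.uniform(0.5, 2.0, size=nvis)             # visibility weights = 1 / sigma^2

R = np.exp(-2j * np.pi * np.outer(u, x))         # dense response matrix (nvis x npix)
sky = np.zeros(npix)
sky[10], sky[40] = 1.0, 0.5

# Left-hand side of Eq. (6): apply R^dagger N^-1 R explicitly
blurred_direct = (R.conj().T @ (w * (R @ sky))).real

# PSF over all pixel displacements d = -(npix-1) ... (npix-1)
d = np.arange(-(npix - 1), npix)
psf = np.sum(w[:, None] * np.cos(2 * np.pi * np.outer(u, d)), axis=0)

# Right-hand side of Eq. (6): linear convolution of the sky with the PSF
blurred_conv = np.convolve(sky, psf, mode="valid")

print(np.allclose(blurred_direct, blurred_conv))  # True in this coplanar toy case
```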

As discussed in Sec. 2, no unique solution to the imaging inverse problem exists. The absence of a unique solution manifests itself in $R^{\dagger} N^{-1} R$ and $I^{\mathrm{PSF}}*$ not being invertible operations. While resolve provides additional regularization via an explicit prior, in CLEAN the regularization of the sky images is implicitly encoded in the structure of the algorithm. More specifically, CLEAN starts with an empty sky model as an initial estimate $I_0^m$ for the true sky brightness and iteratively adds components, in the simplest form point sources, to this estimate until a stopping criterion is met. This introduces the implicit prior that the sky is sparsely represented by a finite number of CLEAN components.

In the following we briefly summarize the algorithmic structure of CLEAN as relevant for fast-resolve. The iterative procedure of adding components to the current model image $I^m$ is split into major and minor cycles. In the major cycles, a current residual image $I^{\mathrm{RES}}=R^{\dagger} N^{-1}\left(V-R I_i^m\right)$ is computed, with $I_i^m$ being the current model image. In the minor cycle, additional components are added to the model, and their PSFs are removed from the residual image. In the subsequent major cycle, the residual image is recomputed for the updated model image, and in the following minor cycle, more components are added to the model. This scheme is iterated until a global stopping criterion is met.

From a computational perspective, the important aspect of the major-minor scheme is that $R$ and $R^{\dagger}$ only need to be evaluated in major cycles. In the minor cycles, the PSF of the added CLEAN components is subtracted from the residual image, but for this no evaluations of $R$ and $R^{\dagger}$ are needed since the PSF can be precomputed once. Most of the computations of the CLEAN algorithm are performed in image space, and only the major cycles go back to visibility space. In contrast, resolve needs to map from image to data space by applying $R$ for every likelihood evaluation.
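The following schematic, 1D Högbom-style sketch illustrates the major/minor split described above; `response` and `adjoint` are placeholders for $R$ and $R^\dagger$ (only the major cycle calls them), the PSF is assumed to be peak-normalized and defined on a grid covering all pixel displacements, and the gain, iteration counts, and thresholds are arbitrary illustrative values rather than the settings of any production CLEAN implementation.

```python
import numpy as np

def minor_cycle(residual, psf, gain=0.1, n_iter=200, threshold=1e-3):
    """Subtract shifted, scaled copies of the PSF from the residual image."""
    npix = residual.size
    center = npix - 1                 # psf[center + d] holds the PSF at displacement d
    update = np.zeros(npix)
    residual = residual.copy()
    for _ in range(n_iter):
        peak = int(np.argmax(np.abs(residual)))
        if np.abs(residual[peak]) < threshold:
            break
        flux = gain * residual[peak]
        update[peak] += flux
        residual -= flux * psf[center - peak : center - peak + npix]
    return update, residual

def clean(vis, weights, response, adjoint, psf, npix, n_major=5, **minor_kwargs):
    """Major cycles recompute the exact residual image; minor cycles only use the PSF."""
    model = np.zeros(npix)
    for _ in range(n_major):
        # major cycle: exact residual image R^dagger N^-1 (V - R I_m)
        residual = adjoint(weights * (vis - response(model))).real
        update, _ = minor_cycle(residual, psf, **minor_kwargs)
        model += update
    return model
```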

3.3 fast-resolve

As outlined in the previous section, evaluating the radio interferometric instrument response (Eq. (1)) contributes a substantial fraction to the overall runtime of a resolve image reconstruction. Therefore, reducing the number of necessary evaluations of Eq. (1) has the potential for significant speedups. The basic idea of fast-resolve is to perform most of the computations in image space, similar to CLEAN, and to only evaluate the radio interferometric instrument response once in a while.

3.3.1 fast-resolve measurement equation

The radio interferometric measurement equation in the form of Eq. (2) leads to the likelihood Hamiltonian Eq. (4) of the classic resolve framework, which involves an evaluation of the interferometric instrument response $R$. To get a likelihood Hamiltonian that does not include evaluating $R$, we have to transform the measurement equation such that all involved quantities live in image space, not data space. To project all involved quantities to image space, we apply $R^{\dagger} N^{-1}$ from the left to Eq. (2), and get
$$R^{\dagger} N^{-1} V=R^{\dagger} N^{-1} R I+R^{\dagger} N^{-1} n \tag{7}$$
$$I^{\mathrm{D}}=R^{\prime} I+n^{\prime}, \tag{8}$$

with $R^{\prime} = R^{\dagger} N^{-1} R$ and $n^{\prime} = R^{\dagger} N^{-1} n$. The quantities of the new measurement equation are $I^{\mathrm{D}}$, $I$, and $n^{\prime}$, and are all defined in image space and not in data space. The statistics of the new noise $n^{\prime}$ remains Gaussian as $R^{\dagger} N^{-1}$ is a linear transformation. The corresponding likelihood Hamiltonian is given by
$$H\left(I^{\mathrm{D}} \mid I\right)=\frac{1}{2}\left(I^{\mathrm{D}}-R^{\prime} I\right)^{\dagger} N^{\prime-1}\left(I^{\mathrm{D}}-R^{\prime} I\right)+\frac{1}{2} \ln \left|2 \pi N^{\prime}\right|. \tag{9}$$

The covariance of the transformed noise is given by
$$N^{\prime}=\left\langle n^{\prime} n^{\prime \dagger}\right\rangle=R^{\dagger} N^{-1}\left\langle n n^{\dagger}\right\rangle N^{-1} R=R^{\dagger} N^{-1} R. \tag{10}$$

This new noise covariance $N^{\prime}$ is not invertible as $R^{\dagger} N^{-1} R$ is not invertible, which is problematic as the likelihood Hamiltonian contains the inverse noise covariance $N^{\prime-1}$. To mitigate the problem of a singular noise covariance, we modify the measurement equation once more by adding uncorrelated Gaussian noise with a small amplitude. This leads to the noise covariance
$$N^{\prime}=R^{\dagger} N^{-1} R+\epsilon \mathbb{I}, \tag{11}$$

with 𝕀 being the unit matrix and ϵ a small number, which is the variance of the additional noise. With this artificially introduced additional noise, the full noise covariance N′ is invertible, and the new likelihood Hamiltonian Eq. (9) becomes well defined.
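The structure of this image-space likelihood can be made explicit with a small dense toy example (no FFT approximations yet); the response matrix, weights, and the value of ε below are arbitrary illustrative choices:

```python
import numpy as np

# Dense toy version of Eqs. (8)-(11): data, signal, and noise all live in image space.
rng = np.random.default_rng(2)
npix, nvis = 32, 100
x = np.arange(npix)
R = np.exp(-2j * np.pi * np.outer(rng.uniform(-0.4, 0.4, nvis), x))
w = rng.uniform(0.5, 2.0, nvis)                  # 1/sigma^2 per visibility
eps = 1e-3 * w.sum() / npix                      # small regularizing noise variance

sky_true = rng.exponential(size=npix)
noise = (rng.standard_normal(nvis) + 1j * rng.standard_normal(nvis)) / np.sqrt(2 * w)
vis = R @ sky_true + noise

dirty = (R.conj().T @ (w * vis)).real            # I^D = R^dagger N^-1 V
Rprime = ((R.conj().T * w) @ R).real             # R' = R^dagger N^-1 R (real part)
Nprime = Rprime + eps * np.eye(npix)             # Eq. (11): invertible by construction

def image_space_hamiltonian(sky):
    """0.5 (I^D - R' I)^dagger N'^-1 (I^D - R' I), dropping the log-determinant."""
    r = dirty - Rprime @ sky
    return 0.5 * r @ np.linalg.solve(Nprime, r)

print(image_space_hamiltonian(sky_true), image_space_hamiltonian(np.zeros(npix)))
```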

3.3.2 fast-resolve response R

How large the speedup of using the new likelihood Hamiltonian Eq. (9) is compared to the old Hamiltonian Eq. (4) depends on how much faster the new Hamiltonian can be evaluated. The idea of fast-resolve is to approximate $R^{\prime} = R^{\dagger} N^{-1} R$ with a PSF convolution, $R^{\prime} = R^{\dagger} N^{-1} R \approx I^{\mathrm{PSF}}*$, as is done in a similar way in the CLEAN algorithm. Approximating $R^{\prime} = R^{\dagger} N^{-1} R$ by a convolution with the PSF is only exact for coplanar arrays. Corrections for the inaccuracy of the approximation are applied in the major cycles in analogy to the CLEAN algorithm, as we describe in Sect. 3.3.4.

The convolution with the PSF IPSF* can be efficiently applied via a fast Fourier transform (FFT) convolution. For datasets with many visibilities, this FFT-based convolution with the PSF IPSF has the potential to be significantly faster than an evaluation of the interferometer response R (Eq. (1)). In Sec. 4 we compare the computational speedup of the new likelihood for different datasets.

As the sky brightness $I$ and the PSF $I^{\mathrm{PSF}}$ are non-periodic, some padding of the sky is needed for an FFT-based convolution. More specifically, to exactly evaluate $I^{\mathrm{PSF}} * I$, the PSF with which we convolve needs to be twice as big as the field of view we want to image since some emission in the sky $I$ could be at the edge of the field we are imaging. The sky image needs to be padded with zeros to the same size as the PSF. As a formula, this can be noted as
$$R^{\prime} I=R^{\dagger} N^{-1} R I \tag{12}$$
$$\approx I^{\mathrm{PSF}} * I \tag{13}$$
$$=P^{\dagger}\, \mathrm{FFT}^{-1}\left[\mathrm{FFT}\left[I^{\mathrm{PSF}}\right] \cdot \mathrm{FFT}[P I]\right], \tag{14}$$

with $P$ denoting the padding operation and $P^{\dagger}$ the slicing out of the region that is not padded. By neglecting PSF sidelobes and reducing its size, the necessary amount of zero padding can be reduced. This, however, reduces the accuracy of the approximation $R^{\prime} \approx I^{\mathrm{PSF}}*$, which might make it necessary to perform more major cycles in the image reconstruction.
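A minimal numpy sketch of this zero-padded FFT convolution (Eqs. (12)-(14)) could look as follows, assuming the PSF is given on a grid twice the size of the imaged field with its peak at the central pixel:

```python
import numpy as np

def psf_convolve(image, psf_double):
    """Approximate R' I = R^dagger N^-1 R I by the padded FFT convolution I^PSF * I.

    `image` has shape (ny, nx); `psf_double` has shape (2*ny, 2*nx) with the
    zero-displacement peak at index (ny, nx).
    """
    ny, nx = image.shape
    padded = np.zeros(psf_double.shape)
    padded[:ny, :nx] = image                             # P: zero padding
    kernel = np.fft.fft2(np.fft.ifftshift(psf_double))   # PSF with peak moved to (0, 0)
    conv = np.fft.ifft2(kernel * np.fft.fft2(padded)).real
    return conv[:ny, :nx]                                # P^dagger: slice out the imaged field
```

The factor-two grid guarantees that the circular FFT convolution reproduces the linear convolution on the imaged field, since no displacement between two pixels of the field can wrap around.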

3.3.3 fast-resolve noise model

To evaluate the likelihood Hamiltonian of fast-resolve, we need to apply both $R^{\prime} = R^{\dagger} N^{-1} R$ and $N^{\prime-1} = (R^{\dagger} N^{-1} R + \epsilon \mathbb{I})^{-1}$. Similar to $R^{\prime}$, we need to approximate $N^{\prime-1}$ such that we can apply it without having to evaluate $R$ and $R^{\dagger}$ every time. The basic idea of the approximation of $N^{\prime-1}$ is the same as for $R^{\prime}$, thus replacing the exact operation with an FFT convolution. Expressed as a formula, we compute the application of $N^{\prime}$ to an input $x$ via
$$N^{\prime}(x) \approx \mathrm{FFT}^{-1}[K \cdot \mathrm{FFT}(x)], \tag{15}$$

with an appropriate kernel $K$. The approximate noise covariance still needs to fulfill the mathematical properties of a covariance because, for our Bayesian model, the likelihood is a probability and not just an arbitrary cost function. Therefore, the approximated noise covariance matrix needs to be Hermitian and positive definite. Specifically, for the convolution approximation, this implies that all entries of the convolution kernel $K$ must be real and strictly positive. To fulfill these constraints, we parameterize $K$ as
$$K_{\xi}=\exp \xi+\epsilon, \tag{16}$$

with $\epsilon$ as defined in Eq. (11) and $\xi$ being an implicitly defined real-valued vector. We set this vector $\xi$ such that it minimizes the square residual between the true and the approximate noise covariance when applied to a test image with a point source in the center. Thus, $\xi$ is set to
$$\bar{\xi}=\operatorname{argmin}_{\xi}\left(N^{\prime}\left(I_\delta\right)-\mathrm{FFT}^{-1}\left[(\exp \xi+\epsilon) \cdot \mathrm{FFT}\left(I_\delta\right)\right]\right)^2, \tag{17}$$

with $I_\delta$ having a point source, or delta peak, in the center of the field of view, and being zero otherwise. In other words, we approximate the noise covariance $N^{\prime}$ with an appropriate kernel $K$ that yields a proper covariance by construction, is easy to invert, and minimizes the squared distance to the effect that $N^{\prime}$ has on a point source.

To evaluate the likelihood (Eq. (9)) we need to apply $N^{\prime-1}$. We do this by convolving with the inverse kernel:
$$N^{\prime-1}(x) \approx \mathrm{FFT}^{-1}\left[K_{\bar{\xi}}^{-1} \cdot \mathrm{FFT}(x)\right]. \tag{18}$$

This involves a second approximation as we here implicitly assume periodic boundary conditions. As long as the main lobe of the PSF is much smaller than the field of view, this assumption does not create large errors.
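One simple way to obtain such a kernel, sketched below, applies the exact N′ to a central point source once and reads off a per-mode kernel estimate in Fourier space, projecting it onto positive values to satisfy Eq. (16); this closed-form shortcut is an illustrative stand-in for the residual minimization in Eq. (17), and apply_Nprime is a placeholder for the exact action of N′ (e.g., computed with the gridder).

```python
import numpy as np

def fit_noise_kernel(apply_Nprime, shape, eps):
    """Estimate the convolution kernel K approximating N' (cf. Eqs. (15)-(17))."""
    delta = np.zeros(shape)
    delta[shape[0] // 2, shape[1] // 2] = 1.0      # point source in the field center
    target = np.fft.fft2(apply_Nprime(delta))      # FFT of N' applied to the delta
    ramp = np.fft.fft2(delta)                      # FFT of the delta (unit-modulus phase ramp)
    kernel = np.real(target / ramp)                # per-mode estimate of K
    return np.maximum(kernel, eps)                 # enforce K = exp(xi) + eps > 0

def apply_Nprime_inverse(x, kernel):
    """Eq. (18): convolve with the inverse kernel, assuming periodic boundaries."""
    return np.fft.ifft2(np.fft.fft2(x) / kernel).real
```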

3.3.4 fast-resolve inference scheme

The previous sections derived the approximate likelihood for fast-resolve inspired by the CLEAN algorithm. In the same spirit, we adapt the major-minor cycle scheme of CLEAN to utilize it with Bayesian inference for a probabilistic sky brightness reconstruction. In the minor cycles, we optimize the current estimate of the posterior distribution P(I|V) for the sky brightness I using the above approximations for a fast likelihood evaluation. In the major cycles, similarly to CLEAN, we apply the exact response operations to correct for the approximation error.

The algorithm starts by initializing the dirty image $I^{\mathrm{D}} = R^{\dagger} N^{-1} V$ from the visibilities $V$ using the exact response function $R$. The dirty image is the input data $d_0 = I^{\mathrm{D}}$ for the first minor cycle. The first minor cycle computes an initial estimate of the posterior distribution $P_0$ of the sky brightness using the measurement equation
$$d_0=I^{\mathrm{PSF}} * I+n^{\prime}, \tag{19}$$

with the approximations discussed in Sects. 3.3.2 and 3.3.3. We discuss the exact algorithm used to estimate the posterior distribution in Sect. 3.3.5. From this initial estimate of the posterior distribution $P_0$, we compute the posterior mean of the sky brightness $I_0=\langle I\rangle_{P_0}$. The posterior mean $I_0$ is the output of the first minor cycle and the input to the first major cycle. The first major cycle computes the residual $d_1$ between the dirty image $I^{\mathrm{D}}$ and the posterior mean of the first minor cycle passed through the response $R^{\dagger} N^{-1} R$ of the fast-resolve measurement equation:
$$d_1=I^{\mathrm{D}}-R^{\dagger} N^{-1} R\, I_0. \tag{20}$$

In contrast to the minor cycle, here $R^{\dagger} N^{-1} R$ is computed exactly and is not approximated with $I^{\mathrm{PSF}}*$. The residual $d_1$ is the output of the major cycle and the input to the second minor cycle. In the second minor cycle, the posterior $P_1$ of $I$ for the measurement equation
$$d_1=I^{\mathrm{PSF}} *\left(I-I_0\right)+n^{\prime} \tag{21}$$
is approximated. This approximation can be done efficiently by starting with the posterior estimate of the previous minor cycle and refining this estimate. The output of the second minor cycle is the updated posterior mean $I_1=\langle I\rangle_{P_1}$, which is the input to the next major cycle computing the new residual to the dirty image using the exact response:
$$d_2=I^{\mathrm{D}}-R^{\dagger} N^{-1} R\, I_1. \tag{22}$$

Algorithm 1: fast-resolve inference scheme.

The new residual data $d_2$ is the input to the next minor cycle. This scheme is iterated until convergence, that is, until no significant structures are left in the residual $d_n$ and the posterior estimates $I_n$ no longer change. In Algorithm 1 the full inference scheme is summarized as a pseudocode algorithm.
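The following schematic Python sketch mirrors the major/minor control flow summarized in Algorithm 1; every callable is a placeholder (apply_R_adjoint_Ninv for $R^\dagger N^{-1}$, apply_Rprime_exact for the exact $R^\dagger N^{-1} R$, psf_convolve and apply_Nprime_inverse for the FFT approximations of the previous sections, and update_posterior for one MGVI/geoVI refinement), so this is an illustration of the structure rather than the actual implementation.

```python
def fast_resolve(vis, apply_R_adjoint_Ninv, apply_Rprime_exact,
                 psf_convolve, apply_Nprime_inverse, update_posterior,
                 n_major=10):
    """Schematic fast-resolve major/minor cycle loop (cf. Eqs. (19)-(22))."""
    dirty = apply_R_adjoint_Ninv(vis)       # I^D, computed once with the exact response
    data = dirty                            # d_0 = I^D for the first minor cycle
    posterior, mean = None, 0.0             # no posterior estimate yet, reference image is 0
    for _ in range(n_major):
        # minor cycle: refine the posterior for d_i = I^PSF * (I - I_{i-1}) + n'
        posterior = update_posterior(posterior, data, mean,
                                     psf_convolve, apply_Nprime_inverse)
        mean = posterior.mean               # I_i = <I>_{P_i}
        # major cycle: exact residual d_{i+1} = I^D - R^dagger N^-1 R I_i
        data = dirty - apply_Rprime_exact(mean)
    return posterior
```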

The overall structure of the fast-resolve major-minor scheme is very similar to the major-minor scheme of CLEAN. The main difference is that in the minor cycles fast-resolve updates a probability distribution over possible sky images instead of a single model image, as is the case for CLEAN. The algorithm for inferring the posterior distribution is outlined in Sect. 3.3.5.

3.3.5 fast-resolve minor cycles

In the minor cycles, we optimize the approximation of the posterior distribution P(I|V) using the scalable variational inference algorithms (Knollmüller & Enßlin 2019; Frank et al. 2021) of the NIFTy package already employed in the classic resolve algorithm. These variational inference algorithms account for correlations between the parameters of the model. While the Metric Gaussian Variational Inference (MGVI) method of Knollmüller & Enßlin (2019) relies on a Gaussian approximation of the posterior distribution, the geometric Variational Inference (geoVI) algorithm of Frank et al. (2021) can also capture non-Gaussian posterior distributions. Although fast-resolve relies for the minor cycles on the same inference algorithms as resolve, updating the posterior distribution is much faster in fast-resolve since the approximate likelihood described above is used instead of evaluating the exact measurement equation.

Furthermore, the NIFTy package has also undergone a major rewrite, switching from a NumPy-based (Harris et al. 2020) implementation to a backend based on JAX4 (Bradbury et al. 2018) that allows for GPU-accelerated computing. While resolve was based on the old NumPy-based NIFTy, fast-resolve makes use of the JAX accelerated NIFTy version, also named NIFTy.re (Edenhofer et al. 2024a). In particular, the FFT convolutions in fast-resolve have the potential for significant GPU acceleration. In Sec. 4 we compare the runtime of resolve and fast-resolve using the CPU and the GPU backend.

In the future, we also plan to port the classic resolve framework to JAX. Porting resolve is more involved than porting fast-resolve because it requires binding a high-performance implementation of the radio interferometric measurement operator to JAX. As a preparatory work, we developed the JAXbind5 package (Roth et al. 2024), which allows custom functions to be called in JAX. However, since the wgridder from ducc6 is used to evaluate the radio interferometric measurement equation, the JAX version of resolve will also be limited to run on the CPU.

3.4 Previous fastRESOLVE of Greiner et al. (2016)

The algorithm named fastRESOLVE introduced by Greiner et al. (2016) was an earlier attempt to speed up resolve. Similar to the fast-resolve presented in this work, Greiner et al. (2016) used approximations of the likelihood in order to avoid applications of R in every step. More specifically, Greiner et al. (2016) derived a maximum a posteriori estimator of the sky brightness using the gridded weights of the visibilities as a noise covariance. Since fastRESOLVE had some limitations, for example it was not able to account for the w-term and could not provide uncertainty estimates, this approach was not followed up in subsequent developments of resolve in Arras et al. (2019, 2021a) or Roth et al. (2023).

4 Applications

For verification, we demonstrate the proposed fast-resolve method on different datasets. First, we reconstruct the Cygnus A VLA observation at four different frequency bands, and second, we image ESO 137-006 using a MeerKAT observation. The Cygnus A data are suitable for validating the accuracy, as the fast-resolve images can be compared to previous results from Arras et al. (2021a) and Roth et al. (2023) obtained on the same dataset. The Cygnus A reconstruction has a relatively small field of view, and wide-field effects should be negligible. The MeerKAT observation is significantly larger than the VLA observation, and imaging it with resolve is computationally out of reach, which demonstrates the computational advantages of fast-resolve over resolve. Furthermore, the field of view of the ESO 137-006 observation is also significantly larger, such that the major cycles of fast-resolve correcting for wide-field effects become more important.

4.1 Application to VLA Cygnus A data

4.1.1 Data and algorithm setup

For imaging Cygnus A, we used the exact same data already imaged in Arras et al. (2021a) with resolve and CLEAN. The data contain a single frequency channel in each band, at 2052 MHz (S-band), 4811 MHz (C-band), 8427 MHz (X-band), and 13 360 MHz (Ku-band). In addition, in Roth et al. (2023) an uncalibrated version of the S-band data was jointly calibrated and imaged with resolve. In all observing bands, all four VLA configurations were used. The computationally relevant details of the observations, such as the number of visibilities and the grid size of the reconstructed images, are listed in Table 1. For details on the calibration of the data, we refer to Sebokolodi et al. (2020).

As a prior model, we used the combination of the diffuse emission prior and the point source prior described in Sect. 3.1.1. This prior model is identical to the prior model already used in the existing resolve framework. Specifically, in Arras et al. (2021a) and Roth et al. (2023), this prior model was already used for Cygnus A reconstructions. For completeness, we summarize the main aspects of the prior model here. We refer to Arras et al. (2021a) for details on the prior model. We modeled the diffuse emission with an exponentiated Gaussian process spanning the entire field of view. For the two point sources in the nucleus of Cygnus A, we inserted two separate point source models at the locations of these sources. For the brightness of these sources we used an inverse gamma prior. The exact hyper-parameters for the Gaussian process and the inverse gamma prior are listed in Appendix A. The number of pixels used to model the diffuse emission for the different frequencies are listed in Table 1.

In Arras et al. (2021a), the MGVI algorithm was used to infer the posterior of the sky brightness map. For direct comparability between the results of this work and the previous resolve maps, we also use MGVI in the minor cycles (Sect. 3.3.5) for the posterior approximation. For the imaging of the S-band data in Roth et al. (2023), the geoVI algorithm was used. However, we do not expect significant differences in the resulting sky maps.

To summarize, Arras et al. (2021a) and Roth et al. (2023) used the same prior model setup as we used for imaging Cygnus A with fast-resolve. Furthermore, the posterior inference algorithm is expected to produce similar results for a given likelihood. Therefore, the previous resolve results of Arras et al. (2021a) and Roth et al. (2023) are ideally suited to validate the fast-resolve likelihood approximation in a direct comparison on a dataset with negligible wide-field effects. Furthermore, we compare with the multi-scale CLEAN reconstruction of the S-band data from Arras et al. (2021a). For a more detailed comparison with CLEAN, we refer to Arras et al. (2021a), where resolve was extensively compared to single-and multi-scale CLEAN for all frequency bands.

Table 1. VLA Cygnus A observations.

4.1.2 Comparison with previous resolve results

In the following we compare the fast-resolve reconstructions with previous results to validate the accuracy of fast-resolve. Figure 1 displays the resolve reconstructions of Arras et al. (2021a) and Roth et al. (2023) as well as the S-band multi-scale CLEAN reconstruction in comparison with the fast-resolve reconstructions of this work. The overall quality of the fast-resolve maps is on par with the resolve results. The multi-scale CLEAN reconstruction has a lower resolution in the bright regions of the lobes than the resolve-based reconstructions. All bright emission features are consistently reconstructed by all algorithms in all frequency bands. The fast-resolve and CLEAN maps have a higher dynamic range than the results of Arras et al. (2021a). However, this is not due to a conceptual problem of the resolve algorithm, but rather because the reconstructions of Arras et al. (2021a) were not fully converged due to the very high computational cost of resolve. resolve reconstructs the brightest emission of the field before modeling fainter features. Thus, faint features are missing in the radio map when a reconstruction is stopped before convergence. For fast-resolve, where the overall runtime of the algorithm is much shorter, it is easier to ensure that the reconstruction is fully converged. In Sect. 4.1.3 the convergence of resolve and fast-resolve is analyzed in detail. The resolve reconstruction of Roth et al. (2023) has a similar dynamic range to the fast-resolve reconstructions.

In the comparison between CLEAN and resolve in Arras et al. (2021a), resolve produced maps with significantly higher resolution than CLEAN in regions with high surface brightness. A zoomed-in image of such a region, the eastern hotspot of Cygnus A, is shown in Fig. 2. In this region, the results are consistent, and fast-resolve achieves the same resolution as the resolve-based reconstructions7. Furthermore, fast-resolve also shows minimal imaging artifacts around the hotspot, as does resolve. The multi-scale CLEAN map from Arras et al. (2021a) is added again for comparison. As already discussed in Arras et al. (2021a), the CLEAN reconstruction has a significantly lower resolution than the resolve reconstruction. These super-resolution capabilities were validated by comparing the morphological features of the resolve reconstructions with higher frequency observations under the assumption of spectral smoothness.

Figure 3 shows a zoomed-in image of the nucleus and jet of Cygnus A. In the magnified image of the nucleus, the pixels modeling the two point sources are clearly visible. The fast-resolve S-band reconstruction and the resolve reconstruction of Roth et al. (2023) show consistent results depicting the nucleus and jet. In the S-band reconstruction of Arras et al. (2021a), the jet is also visible, although due to the smaller dynamic range of the map it is less pronounced. In the higher frequency bands, the shape of the core of Cygnus A is consistent between the reconstructions of resolve and fast-resolve. As the old resolve reconstructions are not fully converged, their dynamic range is lower, and the fainter emission of the jet is barely visible at higher frequencies. At the highest frequency of 13 360 MHz, the jet is also barely visible in the fast-resolve reconstruction. The CLEAN image of the jet is consistent with the fast-resolve reconstruction and the resolve reconstruction from Roth et al. (2023).

Both resolve and fast-resolve provide posterior samples of the sky brightness distribution. From these samples, the posterior mean and also other summary statistics can be computed. As an example, we show in Fig. 4 the pixelwise relative uncertainty of the S-band data reconstruction of fast-resolve in comparison with the corresponding uncertainty map of Arras et al. (2021a). The estimated uncertainty of resolve tends to be slightly higher than the uncertainty estimate of fast-resolve. This difference might come from the fact that the resolve reconstruction of Arras et al. (2021a) was not fully converged since the uncertainty estimate of resolve usually becomes smaller when the algorithm converges. Nevertheless, we cannot exclude that this is related to the approximations of fast-resolve.

Fig. 1

Comparison of fast-resolve Cygnus A reconstructions with resolve and CLEAN reconstructions. The top left panel shows the multi-scale CLEAN map of the S-band data from Arras et al. (2021a). The left column below the CLEAN map shows fast-resolve reconstructions at four different frequencies indicated in the top left corner of each panel. The right column shows resolve reconstructions of Roth et al. (2023) and Arras et al. (2021a) using the same data. The dynamic range of the fast-resolve reconstructions is higher than in some of the previous resolve reconstructions because the old resolve reconstructions were not fully converged due to the high computational cost.

Fig. 2

Zoomed-in image of the eastern hot spot of the Cygnus A reconstructions from Fig. 1. The resolution of the fast-resolve maps in the left column is on par with the resolve maps in the right column. The resolution of the CLEAN map is significantly lower than in the other maps.

Fig. 3

Zoomed-in image of the nucleus and jet of the Cygnus A reconstructions from Fig. 1. In all resolve and fast-resolve reconstructions, the two point sources in the nucleus are modeled with two pixels that are uncorrelated with the brightness of the neighboring pixels. Since the resolve reconstructions of Arras et al. (2021a) are not fully converged due to the high computational cost, their dynamic range is lower, and the jet of Cygnus A is not visible. The multi-scale CLEAN map from Arras et al. (2021a) is consistent with the fast-resolve result and the resolve reconstruction from Roth et al. (2023). At the highest frequency the jet is hardly visible also with fast-resolve, with background artifacts having a similar brightness.

Fig. 4

Pixel-wise relative uncertainty map of the fast-resolve S-band reconstruction compared to the resolve uncertainty map of Arras et al. (2021a). The relative uncertainty is lower for both reconstructions in regions of high surface brightness. In the hotspot, the relative uncertainty is around $10^{-3}$–$10^{-2}$ and increases toward the lower surface brightness regions. Outside of Cygnus A, the relative uncertainty fluctuates around 1, indicating that the reconstructed intensity is statistically consistent with zero surface brightness. The estimated uncertainty of resolve tends to be slightly higher than the uncertainty estimate of fast-resolve. This might be due to the resolve reconstruction of Arras et al. (2021a) not being fully converged.

4.1.3 Computational time of fast-resolve

In the previous subsection, we compared the results of fast-resolve with resolve, confirming that fast-resolve can deliver the same high-quality radio maps as resolve. In this subsection we analyze the computational speed of fast-resolve and compare it to the existing resolve framework. To have a direct comparison between the runtimes, we imaged the S-band data with exactly the same hyperparameters for the prior model with fast-resolve and resolve and saved snapshots of the reconstructions at several points during the optimization. While resolve can only be executed on a CPU, fast-resolve can also run on a GPU.

Figure B.1 depicts the direct comparison between resolve and fast-resolve with both algorithms being executed on the same CPU. Specifically, the algorithms were run on an Intel Xeon W-1270P CPU with 3.80 GHz clock speed and eight cores. For the resolve reconstruction, the MGVI algorithm inferring the posterior distribution was parallelized with four MPI tasks, each using two cores for evaluating the radio interferometric measurement equation (Eq. (2)), thus utilizing eight cores in total. The fast-resolve reconstruction ran in a single Python process without manual threading, relying on JAX to parallelize the computations over the cores. Thus, in addition to the algorithmic difference, differences in the parallelization strategy and programming language might impact the reconstruction runtime. Snapshots of the reconstructions after 10, 60, 300, 600, and 1440 min are displayed. After 10 min, the hotspot and parts of the lobes are visible in the reconstructions of both algorithms. Nevertheless, the core and the jet are missing, and both reconstructions have significant background artifacts. After 60 min runtime, fainter parts of the lobes become visible, and the fast-resolve map shows significantly more details in the lower surface brightness regions. After 300 min the fast-resolve map also depicts the very low surface brightness regions with high resolution, while in the resolve map, the outflow is still partially missing, and the lower surface brightness regions of the lobes are reconstructed with low resolution. After 600 min the fast-resolve map shows only slight changes compared to the 300 min snapshot. At this point, we assume the fast-resolve algorithm to be fully converged. In total we performed 20 fast-resolve major cycles. In contrast, after 600 min the resolve reconstruction is not yet fully converged. The final snapshot of the resolve reconstruction was taken after 1440 min. At this stage the resolve reconstruction is also nearly converged. Only in the very low surface brightness regions of the lobes is the resolution still lower than in the fast-resolve reconstruction.

While Fig. B.1 analyzes the convergence of fast-resolve on the CPU, fast-resolve is mainly designed for GPUs. The performance of fast-resolve on the S-band data using GPUs is shown in Fig. B.2. Specifically, Fig. B.2 displays snapshots of the fast-resolve reconstruction after 1, 5, 10, 22, and 46 min using an NVIDIA A100 high-performance computing GPU compared with a reconstruction on a consumer-level GPU, an NVIDIA GeForce RTX 3090. When comparing the results, one should consider that the A100 GPU is on the order of ten times more expensive than the RTX 3090. On both GPUs we executed exactly the same fast-resolve reconstruction that we ran on a CPU in Fig. B.1. Similar to the CPU run, the high surface brightness regions are reconstructed first before the method picks up the low surface brightness flux. On both GPUs, the algorithm is much faster than on the CPU. While on the CPU, the reconstruction was finished after 600 min, the same number of major and minor cycles were finished on the RTX 3090 GPU in 46 min and on the A100 GPU in 22 min. Thus, the fast-resolve reconstruction was 13 times faster on the RTX 3090 GPU and 27 times faster on the A100 GPU than the CPU reconstruction. In comparison, the resolve reconstruction on the CPU was not fully converged even after 1416 min.

To quantify the convergence rate of the reconstructions, we computed the mean square residual between the logarithmic brightness of the final fast-resolve iteration on the NVIDIA A100 GPU and earlier iterations of resolve and fast-resolve reconstructions. We reran the fast-resolve reconstructions with different random seeds to be independent of the random seed used for the NVIDIA A100 reconstruction with respect to which the residuals are computed. Figure 5 shows the mean square residuals for resolve and the three fast-resolve reconstructions as a function of wall time. The fast-resolve reconstructions converge within the displayed time, and their curves of the mean square residual flatten. The reconstructions’ final mean square residuals are not exactly the same since the final maps are not numerically identical because of the different hardware, JAX, and CUDA versions. The resolve reconstruction does not converge in the displayed time interval, and the residual error keeps falling until the end after 1416 min (1 day). Using the residual of the logarithmic brightness is an arbitrary metric for quantifying convergence speed. Nevertheless, it can roughly quantify the speedups of fast-resolve. After 1416 min, the mean square residual of the resolve reconstruction is around $10^{-3}$. In comparison, the fast-resolve reconstructions reach the same mean squared residual after around 10 min, 20 min, and 200 min for runs on the NVIDIA A100 GPU, the NVIDIA RTX 3090 GPU, and the Intel Xeon CPU, respectively. Thus, the indicative speedup of fast-resolve over resolve on the Cygnus A S-band data is a factor of 144 for the A100 GPU, a factor of 72 for the RTX 3090 GPU, and a factor of 7.2 for the CPU run. The timings reported above do not include the time needed to precompute the convolution kernels for the response and noise of fast-resolve. Nevertheless, these times are small compared to the time needed for imaging. For the S-band data, for example, the computation of the kernels takes only 0.6 min, which is much shorter than the imaging runtime, even on the GPU.

As indicated in Table 1, the Cygnus A data has only a single frequency channel. For datasets with more baselines and frequency channels, such as the MeerKAT datasets considered in the next section (see Table 2), the algorithmic advantage of fast-resolve of not having to compute the radio response in each evaluation of the likelihood is much greater. Thus, for such datasets, the speedup of fast-resolve will be significantly larger than for the Cygnus A single channel imaging. For the MeerKAT dataset discussed in Sect. 4.2, the imaging with resolve is computationally out of scope, while with fast-resolve the images can still be reconstructed with moderate computational costs.

Fig. 5

Mean square residual of the log brightness of resolve and fast-resolve reconstruction as a function of time for the S-band data. The residual is computed to the final iteration of fast-resolve on the NVIDIA A100 GPU depicted in Fig. B.2, which we believe to be converged. The fast-resolve reconstructions are recomputed with a different random seed. The mean square residual of the fast-resolve reconstructions falls until their curves flatten when they converge. Due to the different hardware, JAX, and CUDA versions, the final results are numerically not exactly identical. The resolve reconstruction (blue curve) does not fully converge in the displayed time range.

4.2 Application to MeerKAT data

In this section we present an application of fast-resolve to an L-band (856–1712 MHz) MeerKAT (Jonas & MeerKAT Team 2016) observation of the radio galaxy ESO 137-006. Originally, this observation was presented in Ramatsoku et al. (2020). The observation utilized all 64 MeerKAT antennas and the 4k mode of the SKARAB correlator. The total on-target observation time is 14 h in full polarization with 4096 channels. For the VLA Cygnus A observations above, only a single frequency channel was used for each band. Here, we use two sub-bands, each with about 200 channels (after averaging) and a bandwidth of approximately 200 MHz. Consequently, the MeerKAT datasets are more than 400 times larger than the VLA datasets of the previous section, making an evaluation of the radio interferometer response (Eq. (1)) significantly more expensive. A resolve reconstruction, where the exact response needs to be computed for each evaluation of the likelihood (Eq. (4)), is computationally unfeasible for such a dataset.

Ramatsoku et al. (2020) detected previously unknown collimated synchrotron threads linking the lobes of the radio galaxy ESO 137-006 in this observation. Since then, this dataset has also been used by Dabbech et al. (2022) to demonstrate convex optimization-based imaging algorithms. Dabbech et al. (2022) also published the results as FITS files, allowing us to use them as a reference to validate the fast-resolve algorithm for a MeerKAT-sized reconstruction.

Technical details relating to the initial flagging and transfer calibration of the ESO 137-006 data using the CARACAL pipeline (Józsa et al. 2020) are given in Ramatsoku et al. (2020). As in Ramatsoku et al. (2020), the data is averaged from 4096 to 1024 frequency channels and split into two sub-bands spanning 961–1145 MHz and 1295–1503 MHz that are relatively free from radio frequency interference, known as the LO and HI bands, respectively. These sub-bands are then phase self-calibrated using WSClean multi-scale CLEAN (Offringa & Smirnov 2017) for imaging and CubiCal (Kenyon et al. 2018) for calibration. We then image the data independently in the LO and HI bands with fast-resolve. The computationally relevant details of the calibrated data in the two bands are listed in Table 2.

4.2.1 fast-resolve prior model

As for the Cygnus A observation, we built our prior model around the exponentiated Gaussian process model described in Sect. 4.1.1 and in detail in Arras et al. (2021a). Nevertheless, for the MeerKAT observation, in addition to the main source, ESO 137-006, there is also a second galaxy, ESO 137-007, in the field of view. Due to the nature of the generative prior models in resolve, the full prior model can easily be composed of multiple components. We model each of these sources with a separate exponentiated Gaussian process to decouple the prior models. Furthermore, due to the high sensitivity of MeerKAT, many compact background sources are detected. For the Cygnus A reconstruction, we placed two point source models at the locations of the two point sources in the core of Cygnus A. However, for the MeerKAT observation, the number of background sources is far too large to manually place point source models at their locations. Therefore, we used a third Gaussian process model to represent all the background sources outside of the models for ESO 137-006 and ESO 137-007. A sketch of the layout of the three models is shown in Fig. 6. The exact parameters for all Gaussian process models are listed in Appendix C.
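A toy sketch of this composition could look as follows; the helper that draws a log-normal Gaussian-process field, the box positions and sizes, and all hyperparameters are illustrative placeholders and not the values listed in Appendix C:

```python
import numpy as np

def sample_lognormal_gp(shape, slope, offset, rng):
    """Draw one exponentiated Gaussian-process field with a power-law spectrum."""
    kx = np.fft.fftfreq(shape[0])[:, None]
    ky = np.fft.fftfreq(shape[1])[None, :]
    k = np.hypot(kx, ky)
    amplitude = np.zeros(shape)
    amplitude[k > 0] = k[k > 0] ** (slope / 2)
    field = np.fft.ifft2(amplitude * np.fft.fft2(rng.standard_normal(shape))).real
    return np.exp(offset + field / field.std())

rng = np.random.default_rng(3)
npix = 1024
# Background component covering the full field (compact sources and everything else)
sky = sample_lognormal_gp((npix, npix), slope=-3.5, offset=-6.0, rng=rng)

# Separate, decoupled components for ESO 137-006 and ESO 137-007 in their own boxes
boxes = [((600, 450), 256, -3.0), ((350, 500), 128, -4.0)]  # (corner, size, offset)
for (y0, x0), size, offset in boxes:
    sky[y0:y0 + size, x0:x0 + size] = sample_lognormal_gp((size, size), -3.0, offset, rng)
```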

Table 2. MeerKAT ESO 137-006 observations.

Fig. 6

Reconstruction of the ESO 137-006 MeerKAT observation in the LO sub-band (961–1145 MHz) with fast-resolve. The radio galaxies ESO 137-006 and ESO 137-007, as well as the background sources, are modeled with independent Gaussian processes. White boxes indicate the spatial extent of each Gaussian process. Zooms on the reconstructions of ESO 137-006 and ESO 137-007 for both sub-bands are shown in Figs. 7, B.3, B.4, and B.5.

4.2.2 Imaging of Dabbech et al. (2022)

In Dabbech et al. (2022), ESO 137-006 is imaged with WSClean (Offringa et al. 2014; Offringa & Smirnov 2017) and two convex optimization algorithms in the same 961–1145 MHz and 1295–1503 MHz bands, using the same data as we do. The two convex optimization algorithms of Dabbech et al. (2022) employ different regularizers. One, uSARA, promotes sparsity in a wavelet basis. The other, AIRI, originally presented in Terris et al. (2023a), is based on a neural network denoiser. The resulting images for both regularizers are compared with results from WSClean, showing significant improvements in image quality. The computational costs of all three imaging algorithms are reported in Dabbech et al. (2022): imaging with the sparsity-promoting regularizer uSARA took about 1500–3000 CPU hours per band to converge, while the neural-network-based denoiser AIRI required around 900–1600 CPU hours plus roughly 5 GPU hours.

4.2.3 Imaging results

Figure 6 depicts the results of the fast-resolve reconstruction for the LO band data. For the LO and HI band reconstructions, 25 and 28 fast-resolve major cycles were executed, respectively. Zoomed-in images of the two radio galaxies are shown in Figs. 7 and B.3 for the LO band and in Figs. B.4 and B.5 for the HI band; Figs. 7 and B.4 display the radio galaxy ESO 137-006, while Figs. B.3 and B.5 show the radio galaxy ESO 137-007 north of it. These figures also include the convex optimization reconstructions of Dabbech et al. (2022) for comparison and validation. The imaging results of fast-resolve and Dabbech et al. (2022) are consistent, although the fast-resolve maps show stronger background artifacts than the convex optimization maps. In Ramatsoku et al. (2020), collimated synchrotron threads between the two lobes of the galaxy ESO 137-006 were detected. In addition to these known threads, the fast-resolve map shows additional threads north of the core of the galaxy. These additional threads are probably imaging artifacts. Similar features, at lower intensity, are also found in the maps of Dabbech et al. (2022) and are mostly considered artifacts there.

We believe these features are imaging artifacts originating from suboptimal calibration. Due to their very flexible prior models, both fast-resolve and resolve are very sensitive to the data; in the case of suboptimal calibration, this leads to imaging artifacts. At present, self-calibration with fast-resolve is not yet possible. The integration of a self-calibration routine into fast-resolve is planned for the future and could mitigate such artifacts. Appendix D presents an application of fast-resolve to synthetic data with the same uv-coverage as the MeerKAT LO band data; since the synthetic data contain no calibration inaccuracies, the reconstructions do not show similar artifacts. Another possible cause of the artifacts is the large width of the imaged bands, as indicated in Table 2.

In addition to self-calibration, a dedicated prior model for compact sources could improve the fast-resolve reconstructions. At present, point sources can be modeled either by manually placing inverse-gamma distributed pixels at their locations (see Sect. 3.1.1) or by including them in the Gaussian process model. For the VLA Cygnus A reconstruction, the point sources in the nucleus were modeled by manually placing inverse-gamma distributed pixels at their locations; thus, imaging artifacts around the two point sources could be avoided, even though they are very bright. Due to the high number of background sources, this is impractical for the ESO 137-006 reconstruction, and compact sources are therefore also modeled using an exponentiated Gaussian process prior. The Gaussian process-based prior is very sensitive to diffuse emission, but less sensitive to point sources. For this reason, many fainter point sources are not picked up by the model but are left in the fast-resolve residual data, which is shown in Appendix E. A dedicated prior model that is also applicable to observations with a very high number of background sources could improve their reconstruction. Ideally, such a point source model should also be able to reconstruct the exact positions of the sources and not rely on error-prone manual placement, as in the current point source prior. However, the development of an improved point source model is left for future work.
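As an illustration of the manual point-source prior mentioned above, the sketch below inserts inverse-gamma distributed pixel fluxes at given positions on top of a diffuse component. The function name, positions, and hyperparameters are hypothetical and only meant to show the structure of such a model.

```python
# Sketch of manually placed point-source pixels: independent inverse-gamma
# priors at known positions, added on top of the diffuse component.
import jax
import jax.numpy as jnp

def add_point_sources(key, diffuse_sky, positions, alpha=1.5, scale=1.0):
    """Insert inverse-gamma distributed pixel fluxes at the given (row, col) positions."""
    # If g ~ Gamma(alpha, rate=1), then scale / g ~ InverseGamma(alpha, scale).
    g = jax.random.gamma(key, alpha, shape=(len(positions),))
    fluxes = scale / g
    rows = jnp.array([p[0] for p in positions])
    cols = jnp.array([p[1] for p in positions])
    return diffuse_sky.at[rows, cols].add(fluxes)
```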

Furthermore, a close examination of the results reveals minor artifacts at the edges of the prior models of the sources ESO 137-006 and ESO 137-007. These artifacts could be avoided by letting the source models overlap with the background model and interpolating between them.
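One possible realization of such an interpolation is sketched below: the source patch is blended into the background with a smooth cosine window over the overlap region. This is an assumption about a potential future fix, not part of the current fast-resolve model.

```python
# Sketch of a tapered blend between a source-box model and the background model
# (hypothetical; window shape and width are illustrative choices).
import jax.numpy as jnp

def cosine_taper(n, width):
    """1D window rising smoothly from 0 to 1 over `width` pixels at both ends."""
    x = jnp.arange(n)
    rise = 0.5 * (1 - jnp.cos(jnp.pi * jnp.clip(x / width, 0, 1)))
    fall = 0.5 * (1 - jnp.cos(jnp.pi * jnp.clip((n - 1 - x) / width, 0, 1)))
    return rise * fall

def blend_patch(background, patch, top, left, width=16):
    """Insert `patch` into `background` with a smooth transition in the overlap."""
    wy = cosine_taper(patch.shape[0], width)[:, None]
    wx = cosine_taper(patch.shape[1], width)[None, :]
    w = wy * wx
    region = (slice(top, top + patch.shape[0]), slice(left, left + patch.shape[1]))
    blended = (1 - w) * background[region] + w * patch
    return background.at[region].set(blended)
```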

Fig. 7

Radio galaxy ESO 137-006 in the phase center of the observation. The top panel shows the fast-resolve reconstruction. For comparison, the middle and bottom panels display the reconstructions from Dabbech et al. (2022) with the AIRI and uSARA regularizers. All reconstructions, but especially fast-resolve, are affected by the suboptimal calibration.

4.2.4 Computational costs

Despite the high number of visibilities and frequency channels, the computational costs of fast-resolve remain moderate since fast-resolve only needs to evaluate the full radio interferometric measurement equation (Eq. (1)) in the major cycles. As outlined above, 25 and 28 major cycles were executed for the LO and HI bands. The fast-resolve reconstructions of both frequency bands were done on an NVIDIA A100 GPU. Before the actual fast-resolve imaging, the convolution kernels for the response and the noise were constructed on a CPU; for both frequency bands, this took less than two hours on a single core of an Intel Xeon CPU. For the imaging itself, 24 hours on an NVIDIA A100 GPU and eight cores of an Intel Xeon CPU were needed. Since most fast-resolve operations, except for the major update steps, run on the GPU, the CPU was mostly idle during imaging. A limitation of the current implementation of the fast-resolve algorithm is its high GPU memory consumption, which prevents applications with significantly larger grid sizes than those presented in this paper. We believe these constraints can be softened with future improvements to the implementation.
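The computational shortcut can be summarized as follows: once the convolution kernel is precomputed, applying the approximate normal operator (the adjoint response applied to the noise-weighted response) reduces to two FFTs per call. The sketch below shows this idea with hypothetical function names; normalization, padding, and the exact operator ordering in fast-resolve differ.

```python
# Sketch of the minor-cycle shortcut: the gridded normal operator is
# approximated by a convolution with a precomputed kernel, so evaluating the
# transformed (deconvolution) likelihood costs two FFTs instead of a pass
# over all visibilities. Kernel construction (done once on the CPU) not shown.
import jax.numpy as jnp
from jax import jit

@jit
def apply_kernel(sky, kernel_ft):
    """Approximate normal operator: convolve the sky with the kernel (given in Fourier space)."""
    return jnp.fft.ifft2(jnp.fft.fft2(sky) * kernel_ft).real

@jit
def deconvolution_negative_log_likelihood(sky, dirty_image, kernel_ft):
    """Image-domain Gaussian likelihood up to a sky-independent constant:
    0.5 * <sky, K sky> - <sky, dirty_image>."""
    quadratic = 0.5 * jnp.vdot(sky, apply_kernel(sky, kernel_ft)).real
    linear = jnp.vdot(sky, dirty_image).real
    return quadratic - linear
```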

A direct one-to-one comparison of the computational cost with the reconstructions of Dabbech et al. (2022) is not possible, as those reconstructions were executed on different hardware and the algorithms are implemented in different programming languages. Nevertheless, the computational costs of the Dabbech et al. (2022) reconstructions tend to be significantly higher, ranging between 900 and 3000 CPU hours per band and algorithm. For comparison, Dabbech et al. (2022) also presented a WSClean reconstruction, for which they report total computational costs of 132 and 236 CPU hours for the two imaging bands. Although CPU and GPU hours are not directly comparable, and the hardware and programming languages differ, this underlines the computational performance of fast-resolve and its applicability to imaging setups with massive datasets.

5 Conclusion

In this paper we introduced the fast Bayesian imaging algorithm fast-resolve, which combines the accuracy of the Bayesian imaging framework resolve with the computational shortcuts of the CLEAN algorithm, significantly broadening the applicability of Bayesian radio interferometric imaging. fast-resolve transforms the likelihood of the Bayesian radio interferometric imaging problem into the likelihood of a deconvolution problem, which is much faster to evaluate. Using the major-minor cycle scheme of CLEAN, fast-resolve corrects for inaccuracies of the transformed likelihood and accounts for the w-effect. The accuracy of fast-resolve was validated on Cygnus A VLA data by comparison with previous resolve and multi-scale CLEAN reconstructions. The comparison showed that fast-resolve achieves the same resolution as resolve; likewise, the imaging artifacts are comparable and at a very low level.

Furthermore, the computational speed of fast-resolve was analyzed and compared to resolve, showing significant speedups for the VLA Cygnus A data. As fast-resolve is implemented in JAX, it can also be executed on a GPU, accelerating the reconstruction by more than an order of magnitude compared to the CPU. For the single-channel Cygnus A VLA dataset, fast-resolve is at least 140 times faster than resolve when executed on a GPU. For datasets with more frequency channels, the computational advantages of fast-resolve can be even larger.

Additionally, we presented a Bayesian image reconstruction of the radio galaxies ESO 137-006 and ESO 137-007 from MeerKAT data with fast-resolve and compared the results with those of Dabbech et al. (2022) for validation. The MeerKAT dataset is significantly larger than the VLA datasets, but the computational costs of fast-resolve remain moderate due to the major-minor cycle scheme. A reconstruction of these sources with the classic resolve algorithm using the same amount of data would be computationally out of reach. To the best of our knowledge, no other Bayesian radio interferometric imaging algorithm has been successfully applied to a dataset of similar size before.

Data availability

The raw data of the Cygnus A observation is publicly available in the NRAO Data Archive under project ID 14B-336, at https://data.nrao.edu/portal/

The raw data for the ESO137-006 observation is publicly available via the SARAO archive (project ID SCI-20190418-SM-01), at https://archive.sarao.ac.za

The fast-resolve reconstruction results are archived on Zenodo, at https://doi.org/10.5281/zenodo.13754292

The implementation of the fast-resolve algorithm will be integrated into the resolve package at https://gitlab.mpcdf.mpg.de/ift/resolve

Acknowledgements

The MeerKAT telescope is operated by the South African Radio Astronomy Observatory, which is a facility of the National Research Foundation, an agency of the Department of Science and Innovation. J.R. acknowledges financial support from the German Federal Ministry of Education and Research (BMBF) under grant 05A23WO1 (Verbundprojekt D-MeerKAT III). P.F. acknowledges funding through the German Federal Ministry of Education and Research for the project “ErUM-IFT: Informationsfeldtheorie für Experimente an Großforschungsanlagen” (Förderkennzeichen: 05D23EO1). O.M.S.’s research is supported by the South African Research Chairs Initiative of the Department of Science and Technology and National Research Foundation (grant No. 81737).

Appendix A Prior parameters for Cygnus A imaging

In Table A.1 we list the hyperparameters of the Gaussian process model for the diffuse emission in the Cygnus A reconstructions. The exact definition of these parameters is explained in detail in Arras et al. (2021a).

Table A.1

Prior parameters for Cygnus A reconstructions.

Appendix B Additional figures

Figures B.1 and B.2 show the convergence of resolve and fast-resolve on the Cygnus A S-band data for CPU and GPU reconstructions, as discussed in the main text. For completeness, similar plots for the C-, X-, and Ku-band data are provided on Zenodo.

Fig. B.1

fast-resolve and resolve reconstructions of the 2052 MHz Cygnus A data at different stages of the reconstruction. The left column shows fast-resolve, the right column resolve. The rows display snapshots of the reconstruction after different amounts of wall time, indicated in the lower left of each panel. The fast-resolve reconstruction was performed on two cores of an Intel Xeon CPU; the resolve reconstruction used eight cores of the same CPU. The fast-resolve reconstructions in the last two rows are identical snapshots, since the fast-resolve reconstruction is considered converged after 600 min. The resolve reconstruction is not fully converged even after 1416 min.

Fig. B.2

fast-resolve reconstructions of the 2052 MHz Cygnus A data. The left column displays snapshots of a fast-resolve reconstruction on an NVIDIA A100 GPU; the right column shows snapshots of a fast-resolve reconstruction on an NVIDIA RTX 3090 GPU. Both reconstructions ran the same number of major and minor iterations. Since the reconstruction on the A100 GPU was finished after 22 min, the bottom left panel is identical to the row above. On the A100 GPU, fast-resolve is approximately twice as fast as on the RTX 3090 GPU, and 25 times faster than on the CPU shown in Fig. B.1.

Figure B.3 shows the reconstruction of the second radio galaxy, ESO 137-007, from the LO band data. Figures B.4 and B.5 display the reconstructions of the sources ESO 137-006 and ESO 137-007 from the HI band data. All figures include the results of the Dabbech et al. (2022) reconstructions for comparison.

Fig. B.3

Radio galaxy ESO 137-007 north of the phase center of the observation. The top panel shows the fast-resolve reconstruction. For comparison, the lower two panels display the reconstructions from Dabbech et al. (2022) with the AIRI and uSARA regularizers. All reconstructions, but especially fast-resolve, are affected by the suboptimal calibration.

Fig. B.4

As in Fig. 7, but at 1399 MHz.

Fig. B.5

As in Fig. B.3, but at 1399 MHz.

Appendix C Prior parameters for ESO 137 imaging

In Table C.1 we list the hyperparameters of the Gaussian process model for the diffuse emission in the ESO 137 reconstructions. The exact definition of these parameters is explained in detail in Arras et al. (2021a).

Table C.1

Prior parameters for ESO 137 reconstructions.

Appendix D Application to synthetic data

For further validation, we applied fast-resolve to synthetic data for which the underlying ground truth sky brightness distribution is known. The synthetic data mimic the setup of the MeerKAT ESO 137-006 observation in the LO band. As in the MeerKAT example, a field of view of 2 × 2 deg is chosen, and the uvw-coverage and frequency channels of the MeerKAT ESO 137-006 LO band data (see Table 2) are used. A prior sample was drawn from the diffuse emission sky brightness model to generate the ground truth. The prior sample was multiplied by a taper function that decays exponentially to zero at the edges of the field of view to ensure that no bright emission is present at the edges. The resulting sky brightness distribution used as ground truth is shown in Fig. D.1. Model visibilities were computed from this ground truth, and Gaussian noise was added to the model visibilities using the weights of the real ESO 137-006 observation as the inverse noise covariance. The resulting dataset was reconstructed with fast-resolve.
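A minimal sketch of this synthetic-data setup is given below. The taper shape, the response function (passed in as a placeholder for the degridder), and the noise convention (per-component standard deviation one over the square root of the weight) are assumptions for illustration, not the exact recipe used in the paper.

```python
# Sketch of the synthetic-data generation: taper a prior sky sample, simulate
# visibilities, and add Gaussian noise consistent with the visibility weights.
import jax
import jax.numpy as jnp

def make_synthetic_data(key, prior_sample, response_fn, weights):
    """Return the tapered ground-truth sky and noisy synthetic visibilities."""
    ny, nx = prior_sample.shape
    y = jnp.abs(jnp.linspace(-1.0, 1.0, ny))[:, None]
    x = jnp.abs(jnp.linspace(-1.0, 1.0, nx))[None, :]
    taper = jnp.exp(-((y / 0.8) ** 6 + (x / 0.8) ** 6))   # steep decay toward the edges
    sky_truth = prior_sample * taper
    model_vis = response_fn(sky_truth)                     # placeholder for the degridder call
    sigma = 1.0 / jnp.sqrt(weights)                        # noise std from the weights
    k_re, k_im = jax.random.split(key)
    noise = sigma * (jax.random.normal(k_re, model_vis.shape)
                     + 1j * jax.random.normal(k_im, model_vis.shape))
    return sky_truth, model_vis + noise
```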

Fig. D.1

fast-resolve reconstruction of synthetic data. The left panel shows the fast-resolve reconstruction result. The right panel displays the underlying ground truth sky brightness. The field of view (2×2 deg) and the uvw-coverage are the same as in the MeerKAT LO band observation (see Table 2).

The reconstruction results are displayed in Fig. D.1. All bright emission features are accurately reconstructed. Fainter features are also successfully reconstructed, although at lower resolution, while some very faint features could not be recovered. Overall, no significant artifacts are visible in the reconstruction; in particular, there are no artifacts similar to the spurious synchrotron threads or the ripples around bright sources seen in the real MeerKAT data reconstruction. As discussed in the main text, we assume that suboptimal calibration creates these artifacts in the MeerKAT data reconstruction. The absence of similar artifacts in the synthetic data reconstruction supports this assumption, as no calibration errors were added to the synthetic data.

Appendix E Residual data

In Fig. E.1, the fast-resolve residual data of the last major cycle of the MeerKAT ESO 137-006 LO band reconstruction are depicted; the residual data of the HI band reconstruction look similar. The left panel shows the residual for the full field of view. On large scales, the edges of the box containing the prior model for the source ESO 137-006 (see Fig. 6) are visible; otherwise, only very little structure of the primary sources remains in the residual.

Fig. E.1

fast-resolve residual data of the ESO 137-006 reconstruction. The left panel shows the fast-resolve residual data of the last minor cycle in arbitrary units. On large scales, no significant structure of the source is left in the residual, except for some small structures at the corners of the box containing the source ESO 137-006. The right panel shows a zoom into a representative region marked by the box on the left. The zoomed image shows many point sources that are not fitted by the fast-resolve reconstruction but are left in the residual data.

As discussed in the main text, only the diffuse emission prior was used for the background sources in the MeerKAT observations, so it also had to model the flux of the point sources. The diffuse emission prior is not tailored to point sources, which reduces its sensitivity to them. For this reason, many point sources are not captured by the reconstruction but are left in the residual. The right panel of Fig. E.1 shows a zoom into a representative region of the residual data image. As expected, many faint point sources that were not (fully) captured by the reconstruction are visible in the zoomed image. For observations like the ESO 137-006 MeerKAT data, a dedicated point source model for resolve and fast-resolve, capable of handling many sources, would be beneficial; its development is left for future work.
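For reference, a residual-data image like the one in Fig. E.1 can be thought of as the weighted back-projection of the residual visibilities. The sketch below shows this with placeholder response and adjoint functions standing in for the (de)gridder calls of the major cycle.

```python
# Sketch of forming a residual image: subtract the model visibilities of the
# current reconstruction from the data and grid the weighted residuals back
# onto the sky (response_fn / adjoint_fn are placeholders for gridder calls).
import jax.numpy as jnp

def residual_image(data_vis, weights, sky_model, response_fn, adjoint_fn):
    resid_vis = data_vis - response_fn(sky_model)   # residual visibilities
    return adjoint_fn(weights * resid_vis)          # residual dirty image
```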

References

1. Abdulaziz, A., Dabbech, A., & Wiaux, Y. 2019, MNRAS, 489, 1230
2. Aghabiglou, A., Chu, C. S., Dabbech, A., & Wiaux, Y. 2024, ApJS, 273, 3
3. Arras, P., Frank, P., Leike, R., Westermann, R., & Enßlin, T. A. 2019, A&A, 627, A134
4. Arras, P., Bester, H. L., Perley, R. A., et al. 2021a, A&A, 646, A84
5. Arras, P., Martin, R., Westermann, R., & Enßlin, T. A. 2021b, A&A, 646, A58
6. Arras, P., Frank, P., Haim, P., et al. 2022, Nat. Astron., 6, 259
7. Bhatnagar, S., & Cornwell, T. J. 2004, A&A, 426, 747
8. Birdi, J., Repetti, A., & Wiaux, Y. 2018, MNRAS, 478, 4442
9. Birdi, J., Repetti, A., & Wiaux, Y. 2020, MNRAS, 492, 3509
10. Bradbury, J., Frostig, R., Hawkins, P., et al. 2018, JAX: composable transformations of Python+NumPy programs
11. Cai, X., Pereyra, M., & McEwen, J. D. 2018, MNRAS, 480, 4154
12. Connor, L., Bouman, K. L., Ravi, V., & Hallinan, G. 2022, MNRAS, 514, 2614
13. Cornwell, T. J., & Evans, K. F. 1985, A&A, 143, 77
14. Cornwell, T. J., Golap, K., & Bhatnagar, S. 2008, IEEE J. Sel. Top. Signal Process., 2, 647
15. Dabbech, A., Onose, A., Abdulaziz, A., et al. 2018, MNRAS, 476, 2853
16. Dabbech, A., Repetti, A., Perley, R. A., Smirnov, O. M., & Wiaux, Y. 2021, MNRAS, 506, 4855
17. Dabbech, A., Terris, M., Jackson, A., et al. 2022, ApJ, 939, L4
18. Dabbech, A., Aghabiglou, A., Chu, C. S., & Wiaux, Y. 2024, ApJ, 966, L34
19. Edenhofer, G., Frank, P., Roth, J., et al. 2024a, J. Open Source Softw., 9, 6593
20. Edenhofer, G., Zucker, C., Frank, P., et al. 2024b, A&A, 685, A82
21. Frank, P., Leike, R., & Enßlin, T. A. 2021, Entropy, 23, 693
22. GRAVITY Collaboration (Abuter, R., et al.) 2022, A&A, 657, A82
23. Greiner, M., Vacca, V., Junklewitz, H., & Enßlin, T. A. 2016, arXiv e-prints [arXiv:1605.04317]
24. Harris, C. R., Millman, K. J., van der Walt, S. J., et al. 2020, Nature, 585, 357
25. Högbom, J. A. 1974, A&AS, 15, 417
26. Jonas, J., & MeerKAT Team. 2016, in MeerKAT Science: On the Pathway to the SKA, 1
27. Józsa, G. I. G., White, S. V., Thorat, K., et al. 2020, ASP Conf. Ser., 527, 635
28. Junklewitz, H., Bell, M. R., Selig, M., & Enßlin, T. A. 2016, A&A, 586, A76
29. Kenyon, J. S., Smirnov, O. M., Grobler, T. L., & Perkins, S. J. 2018, MNRAS, 478, 2399
30. Knollmüller, J., & Enßlin, T. A. 2019, arXiv e-prints [arXiv:1901.11033]
31. Labate, M. G., Waterson, M., Alachkar, B., et al. 2022, J. Astron. Telesc. Instrum. Syst., 8, 011024
32. Liaudat, T. I., Mars, M., Price, M. A., et al. 2024, RAS Tech. Instrum., 3, 505
33. Offringa, A. R., & Smirnov, O. 2017, MNRAS, 471, 301
34. Offringa, A. R., McKinley, B., Hurley-Walker, N., et al. 2014, MNRAS, 444, 606
35. Perley, R. A., Chandler, C. J., Butler, B. J., & Wrobel, J. M. 2011, ApJ, 739, L1
36. Ramatsoku, M., Murgia, M., Vacca, V., et al. 2020, A&A, 636, L1
37. Rau, U., & Cornwell, T. J. 2011, A&A, 532, A71
38. Rau, U., Bhatnagar, S., Voronkov, M. A., & Cornwell, T. J. 2009, IEEE Proc., 97, 1472
39. Repetti, A., Pereyra, M., & Wiaux, Y. 2019, SIAM J. Imaging Sci., 12, 87
40. Roth, J., Arras, P., Reinecke, M., et al. 2023, A&A, 678, A177
41. Roth, J., Reinecke, M., & Edenhofer, G. 2024, J. Open Source Softw., 9, 6532
42. Schmidt, K., Geyer, F., Fröse, S., et al. 2022, A&A, 664, A134
43. Schwab, F. R., & Cotton, W. D. 1983, AJ, 88, 688
44. Sebokolodi, M. L. L., Perley, R., Eilek, J., et al. 2020, ApJ, 903, 36
45. Selig, M., Bell, M. R., Junklewitz, H., et al. 2013, A&A, 554, A26
46. Smirnov, O. M. 2011, A&A, 527, A106
47. Steininger, T., Dixit, J., Frank, P., et al. 2019, Ann. Phys., 531, 1800290
48. Sutter, P. M., Wandelt, B. D., McEwen, J. D., et al. 2014, MNRAS, 438, 768
49. Sutton, E. C., & Wandelt, B. D. 2006, ApJS, 162, 401
50. Swart, G. P., Dewdney, P. E., & Cremonini, A. 2022, J. Astron. Telesc. Instrum. Syst., 8, 011021
51. Terris, M., Dabbech, A., Tang, C., & Wiaux, Y. 2023a, MNRAS, 518, 604
52. Terris, M., Tang, C., Jackson, A., & Wiaux, Y. 2023b, arXiv e-prints [arXiv:2312.07137]
53. Thouvenin, P.-A., Abdulaziz, A., Dabbech, A., Repetti, A., & Wiaux, Y. 2023, MNRAS, 521, 1
54. Tiede, P. 2022, J. Open Source Softw., 7, 4457
55. Wiaux, Y., Jacques, L., Puy, G., Scaife, A. M. M., & Vandergheynst, P. 2009, MNRAS, 395, 1733

7. The pixelizations of the multi-scale CLEAN, the previous resolve, and the fast-resolve reconstructions differ as they originate from different works.

