Forecasting the solar cycle using variational data assimilation: Validation on cycles 22 to 25

L. Jouve; C. P. Hung; A. S. Brun; S. Hazra; A. Fournier; O. Talagrand; B. Perri; A. Strugarek

doi:10.1051/0004-6361/202453431

Home

All issues

Volume 699 (June 2025)

A&A, 699 (2025) A30

Full HTML

Open Access

Issue		A&A Volume 699, June 2025


Article Number		A30
Number of page(s)		17
Section		The Sun and the Heliosphere
DOI		https://doi.org/10.1051/0004-6361/202453431
Published online		26 June 2025

A&A, 699, A30 (2025)

Forecasting the solar cycle using variational data assimilation: Validation on cycles 22 to 25

L. Jouve¹^⋆, C. P. Hung²^,3, A. S. Brun², S. Hazra², A. Fournier³, O. Talagrand⁴, B. Perri² and A. Strugarek²

¹ Univ. Toulouse, CNRS, CNES, Institut de Recherche en Astrophysique et Planétologie, Toulouse, France
² Département d’Astrophysique/AIM CEA/IRFU, CNRS/INSU, Univ. Paris-Saclay & Univ. de Paris Cité, 91191 Gif-sur-Yvette, France
³ Université Paris Cité, Institut de physique du globe de Paris, CNRS, F-75005 Paris, France
⁴ Laboratoire de météorologie dynamique, UMR 8539, Ecole Normale Supérieure, Paris Cedex 05, France

^⋆ Corresponding author: laurene.jouve@irap.omp.eu

Received: 13 December 2024
Accepted: 15 April 2025

Abstract

Context. Forecasting future solar activity has become crucial in our world today, where intense eruptive phenomena, mostly occurring during solar maximum, are likely to exert a highly detrimental effect on satellites and telecommunications. However, forecasting such events is a very difficult task owing to the highly turbulent flows existing in the solar interior.

Aims. We present a 4D variational assimilation technique (4D-Var) applied for the first time to real solar data, consisting of the time series of the sunspot number and the line-of-sight surface magnetic field from 1975 to 2024. We tested our method against observations of past cycles 22, 23, 24, as well as the ongoing cycle 25. For the latter, we offer an estimate of the imminent maximum value and timing, along with a first forecast of the next solar minimum.

Methods. We used a variational data assimilation technique applied to a solar mean-field Babcock-Leighton (BL) flux-transport dynamo model. This translates to a minimization of an objective function with respect to the control vector, defined here as a set of coefficients representing the meridional flow and the initial magnetic field. Ensemble predictions were produced to obtain uncertainties on the timing and value of the maximum of cycle n + 1, when the data on cycle n were assimilated. In particular, we have studied the influence of the phase during which the data were assimilated into the model and that of the weighting of various terms in the objective function.

Results. Our method was validated on cycles 22, 23, and 24 with very satisfactory results. We found a particularly good convergence of our predictions (both in terms of accuracy and precision) when the assimilation window encompassed more and more of the rising phase of cycle n + 1. For cycle 25, predictions varied, again depending on the extent of the assimilation window; however, they were seen to start converging past 2022 to a solar maximum reached between mid-2024 up to the beginning of 2025, with a sunspot number value of 143.1 ± 15.0. Relatively close values of the maximum have been found in both hemispheres within a time lag of a few months. We also offer a forecast for the next minimum to occur around late 2029 (with significant error bars).

Conclusions. The data assimilation technique presented here combines a physics-based model and real solar observations, offering promising results for future solar activity forecasting.

Key words: Sun: activity / Sun: magnetic fields / sunspots

© The Authors 2025

Open Access article, published by EDP Sciences, under the terms of the Creative Commons Attribution License (https://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

This article is published in open access under the Subscribe to Open model. Subscribe to A&A to support open access publication.

1. Introduction

The multitude of manifestations of solar activity have important consequences for the Sun’s environment, including our planet and the thousands of active satellites orbiting around it. Examples of such manifestations are a changing solar flux irradiating the Earth, solar energetic particles, coronal mass ejections (CMEs) and flares, magnetic fields, and solar wind variabilities on various timescales. In the digitally connected world we live in today, solar activity and the potential consequences on satellites and telecommunications infrastructures have become crucial to understand and predict. No matter whether the focus is on space weather (events on short timescale; e.g., CMEs) or space climate (longer term evolution of the Sun’s cyclic activity), such studies always come back to the question of how the solar magnetic fields governing the Sun’s activity and heliosphere are produced.

The Sun is believed to operate a nonlinear turbulent dynamo, producing magnetic fields with a huge range of temporal and spatial scales, making it extremely difficult to model and predict in all its details. However, if we are interested in the forecasting of the characteristics of the 11 year sunspot cycle, we may focus on the large-scale features mostly. This corresponds for instance to the mainly dipolar field exhibited at solar minimum that peaks at the solar poles, as well as the sunspots appearing periodically at the solar surface, which are thought to be the product of toroidal fields emerging from the solar interior.

The predictions of the sunspot number (SSN) evolution has been an active field of research for several decades. Various approaches have been adopted with methods which can be classified into three main groups: precursor methods, extrapolation methods, and physics-based methods (see Petrovay 2020 for details on those methods and results). It is still unclear if some of these methods are superior to others, but the relatively recently developed physics based methods (relying on dynamo and/or flux-transport models; e.g., Dikpati & Gilman 2006; Dikpati & Anderson 2012; Dikpati et al. 2014; Choudhuri et al. 2007; Bhowmik & Nandy 2018; Labonville et al. 2019) have the advantage of including physical processes that are thought to play a key role in the solar interior and to enable the assimilation of observed data (see Nandy 2021). In this work, we chose to use one of those physics-based models in which data is assimilated through the 4D-Var technique, which is routinely used today to predict weather on Earth (Klinker et al. 2000).

Following previous works (Hung et al. 2015, 2017), we applied a 4D variational data assimilation (DA) technique to a mean-field flux-transport Babcock-Leighton (BL) model, which has been used for decades to reproduce the evolution of the large-scale features of the solar cycle (Charbonneau 2010). In these previous works, the BL model produced synthetic observations on which various levels of Gaussian noise were added. The 4D-Var assimilation technique, which necessitates the development of an adjoint model (Talagrand & Courtier 1987; Fournier et al. 2010), allowed our physical model to recover, to a very good level of accuracy, the meridional flow and initial magnetic conditions used to produce the synthetic data.

In this work, we move one step forward by applying our technique to real solar data, consisting of the time series of SSNs and surface line-of-sight magnetic fields provided by different observatories. Another addition is the possibility to predict not only the amplitude and timing of the next maximum, but also to draw some insights into the asymmetry between the northern and southern hemispheres, which sometimes reach their peak SSN at very different times (see e.g., Waldmeier 1971; Berdyugina & Usoskin 2003; Temmer et al. 2006; Norton & Gallagher 2010; Finley & Brun 2023 for a focus on north-south asymmetries in the solar cycle). The goal of our work is thus to minimize an objective function measuring the misfit between outputs of our model and associated real solar observations. To minimize the objective function, a control vector is adjusted by the assimilation step of our method. Similarly to Hung et al. (2017), the control vector consists of coefficients defining the meridional circulation and initial conditions of the magnetic field. Our method, referred to in the paper as the Solar Predict tool, was applied to the three previous solar cycles (i.e., cycles 22, 23, and 24) as well as to the ongoing cycle 25, whose peak amplitude is (as of this writing) close to being reached. We show a very satisfactory agreement for all the previous three cycles, and then give an estimate of both the current sunspot maximum and the timing of the next minimum of solar activity.

The paper is organized as follows. The methodology (i.e., the model and data) is first presented in Sect. 2. Section 3 describes the tests performed to validate our method on the three previous cycles. The main results for the forecasting of cycle 25 are then presented in Sect. 4 and we discuss our conclusions in Sect. 5.

2. Methodology

2.1. Dynamo model

Similarly to Hung et al. (2017), we based our 11 year solar cycle forecasts on a mean field BL flux transport dynamo model. Here, we provide the main elements of this 2.5D magneto-hydrodynamical (MHD) model. First, we expand the assumed axisymmetric magnetic field as:

$\begin{matrix} B = B_{ϕ} e_{ϕ} + \nabla \times (A_{ϕ} e_{ϕ}) . \end{matrix}$ $\begin{aligned} \mathbf B = B_\phi \mathbf e _{\phi } + \mathbf \nabla \times (A_\phi \mathbf e _{\phi }). \end{aligned}$

We then solve for the following mean field dynamo equations for the poloidal potential, A_ϕ, and the toroidal magnetic field, B_ϕ,

$\begin{matrix} \partial_{t} A_{ϕ} & = \frac{η}{η_{t}} (\nabla^{2} - \frac{1}{ϖ^{2}}) A_{ϕ} - R e \frac{v_{p}}{ϖ} \cdot \nabla (ϖ A_{ϕ}) + C_{s} S (r, θ, B_{ϕ}), \end{matrix}$ $\begin{aligned} \partial _t A_{\phi }&= \frac{\eta }{\eta _t} \left(\nabla ^2 - \frac{1}{\varpi ^2}\right) A_{\phi }-Re \frac{\mathbf{v }_p}{\varpi }\cdot \nabla (\varpi A_{\phi }) + C_sS(r, \theta , B_{\phi }), \end{aligned}$ (1)

$\begin{matrix} \partial_{t} B_{ϕ} & = \frac{η}{η_{t}} (\nabla^{2} - \frac{1}{ϖ^{2}}) B_{ϕ} + \frac{1}{ϖ} \frac{\partial (ϖ B_{ϕ})}{\partial r} \frac{\partial (η / η_{t})}{\partial r} \\ - R e ϖ v_{p} \cdot \nabla (\frac{B_{ϕ}}{ϖ}) - R e B_{ϕ} \nabla \cdot v_{p} \\ + C_{Ω} ϖ [\nabla \times (A_{ϕ} e_{ϕ})] \cdot \nabla Ω, \end{matrix}$ $\begin{aligned} \partial _t B_{\phi }&= \frac{\eta }{\eta _t} \left(\nabla ^2 - \frac{1}{\varpi ^2}\right) B_{\phi } +\frac{1}{\varpi } \frac{\partial (\varpi B_{\phi })}{\partial r} \frac{\partial (\eta /\eta _t)}{\partial r}\nonumber \\&\qquad - Re\, \varpi \mathbf v _p\cdot \nabla \left(\frac{B_{\phi }}{\varpi }\right) - Re\, B_{\phi } \nabla \cdot \mathbf v _p \\&\qquad + C_{\Omega } \varpi \left[\nabla \times (A_{\phi } \mathbf e _{\phi })\right] \cdot \nabla \Omega ,\nonumber \end{aligned}$ (2)

where ϖ = rsinθ, v_p is the meridional circulation, Ω = Ω e_z is the angular velocity, η is the magnetic diffusivity, and S is the source of the poloidal field at the solar surface (noting that we did not include an α-effect).

The length is normalized with the solar radius, R_⊙, while the time is normalized using the diffusive timescale, R_⊙²/η_t, where η_t is the envelope turbulent diffusivity. We then introduce three dimensionless parameters, namely, the Reynolds number based on the meridional flow speed, Re = v_o R_⊙/η_t, the strength of the BL source, C_s = s_o R_⊙/η_t, and the strength of the Ω-effect, C_Ω = Ω_o R_⊙²/η_t. For the reference model used in this work, the following values are adopted for these dimensionless parameters: Re = 100, C_s = 35 and C_Ω = 9.27 × 10³, translating into dimensioned values of the equatorial rotation rate of Ω_o = 2π × 456 nHz, a maximal surface flow speed of v_o = 21.5 m s⁻¹, an amplitude of the BL source of s_o = 7.54 m s⁻¹, and a turbulent diffusivity at the surface of η_t = 1.5 × 10¹² cm² s⁻¹.

Here, we use the same model ingredients as done in Hung et al. (2017), namely a two-step profile in the radial direction for the resistivity profile:

$\begin{matrix} \frac{η}{η_{t}} = \frac{η_{c}}{η_{t}} & + \frac{η_{m} - η_{t}}{2 η_{t}} [1 + tanh (\frac{r - r_{bcz}}{d_{1}})] \\ + \frac{η_{t} - η_{m}}{2 η_{t}} [1 + tanh (\frac{r - r_{2}}{d_{1}})], \end{matrix}$ $\begin{aligned} \frac{\eta }{\eta _t}=\frac{\eta _c}{\eta _t}&+\frac{\eta _m-\eta _t}{2\eta _t}\left[1 +\tanh \left(\frac{r-r_{bcz}}{d_1}\right)\right]\\&+\frac{\eta _t-\eta _m}{2\eta _t}\left[1+\tanh \left(\frac{r-r_2}{d_1}\right)\right],\nonumber \end{aligned}$ (3)

where η_c = 10¹² cm² s⁻¹, η_m = 10¹¹ cm² s⁻¹, η_t = 1.5 × 10¹² cm² s⁻¹, r_bcz = 0.72 (representing the base of the convection zone in our model), r₂ = 0.95, d₁ = 0.016. Using this two-step resistivity profile with a high diffusion at the surface helps lower the ratio of radial magnetic field at the pole to that near the equator (Hotta & Yokoyama 2010). A high value of the diffusivity at the bottom helps to produce a more realistic profile for the magnetic cycle; in particular, the tendency of strong cycles to rise faster than weak ones, the so-called Waldmeier-effect (Karak & Choudhuri 2011). However, we do not anticipate any strong impact coming from this particular choice of high diffusivity on the results of the assimilation procedure.

The expression of the nonlocal BL source term is expressed as:

$\begin{matrix} S (r, θ, B_{ϕ}) = & \frac{1}{2} [1 + tanh (\frac{r - r_{2}}{d_{2}})] [1 - tanh (\frac{r - 1}{d_{2}})] \\ \times cos θ sin θ {[1 + {(\frac{B_{ϕ} (r_{bcz}, θ, t)}{B_{0}})}^{2}]}^{- 1} \\ \times B_{ϕ} (r_{bcz}, θ, t), \end{matrix}$ $\begin{aligned} S(r,\theta ,B_\phi ) =&\frac{1}{2}\left[1+\tanh \left(\frac{r-r_2}{d_2}\right)\right] \left[1-\tanh \left(\frac{r-1}{d_2}\right)\right]\\&\times \cos \theta \sin \theta \left[1+\left(\frac{B_\phi (r_{bcz},\theta ,t)}{B_0}\right)^2\right]^{-1}\nonumber \\&\times B_\phi (r_{bcz},\theta ,t),\nonumber \end{aligned}$ (4)

where B_ϕ(r_bcz, θ, t) represents the toroidal field at the base of the convection zone, the thickness d₂ = 0.008 and the radii r₂ and r_bcz are defined above. We here choose a low value of the quenching parameter B₀ = 9 G in order to have surface radial field values directly comparable to the observations (see Fig. 2). Indeed, a more conventional value of B₀ = 10⁴ G always produces unrealistically strong values of the surface radial field at the poles, as reported in a number of previous studies (see for example the review of Charbonneau 2010).

We turn to the meridional circulation profile. We express the flow in the convection zone as the curl of the stream function ψ:

$\begin{matrix} v_{p} (r, θ, t) = \nabla \times (ψ (r, θ, t) e_{ϕ}) . \end{matrix}$ $\begin{aligned} \mathbf v _p(r, \theta ,t) = \nabla \times (\psi (r,\theta ,t) \mathbf e _{\phi }). \end{aligned}$ (5)

We specify the explicit expression of ψ(r, θ, t) in this case in terms of its expansion on a chosen set of basis functions, namely Legendre polynomials in latitude and sine functions in radius:

$\begin{matrix} \begin{matrix} ψ (r, θ, t) = - \frac{2}{π} {(\frac{r - r_{mc}}{1 - r_{mc}})}^{2.5} (1 - r_{mc}) \\ \times {\begin{matrix} \sum_{k = 1}^{m} \sum_{l = 1}^{n} d_{k, l} (t) sin [\frac{k π (r - r_{mc})}{1 - r_{mc}}] P_{l}^{1} (- cos θ) & if r_{mc} \leq r \leq 1 \\ 0 & if 0.6 \leq r < r_{mc}, \end{matrix} \end{matrix} \end{matrix}$ $\begin{aligned} \begin{split}&\psi (r,\theta ,t) = -\frac{2}{\pi } \left(\frac{r-r_{mc}}{1-r_{mc}}\right)^{2.5}(1-r_{mc}) \\&\times {\left\{ \begin{array}{ll} \sum \limits _{k = 1}^{m} \sum \limits _{l = 1}^{n} d_{k,l}(t) \sin \left[\frac{k\pi (r-r_{mc})}{1-r_{mc}}\right] P_l^1 (-\cos \theta )&\mathrm{if}\; r_{mc} \le r \le 1 \\ 0&\mathrm{if}\; 0.6 \le r < r_{mc},\end{array}\right.} \end{split} \end{aligned}$ (6)

where P_l¹ is the associated Legendre polynomial of degree l and order 1. The meridional flow is allowed to penetrate to a radius of r_mc = 0.65 (i.e., slightly below the base of the convection zone). This profile was used to produce a 22-year magnetic cycle dynamo model with a surface flow ∼20 ms⁻¹ (Yeates et al. 2008), which is consistent with helioseismic observations (Ulrich 2010; Basu & Antia 2010; Komm et al. 2015). We note here that the expansion coefficients d_k, l(t) are obtained via our data assimilation procedure explained in the next section and are modulated over time so that the meridional flow is time-dependent.

Finally, the numerical domain is (r, θ)∈[0.6, 1]×[0, π]. The grid size is n_r × n_th = 129 × 129, while the time step is 10⁻⁶, equivalent to 0.112 day. The toroidal field is B_ϕ = 0 at the boundary of the domain and for A_ϕ, we imposed the pure radial field approximation at the surface, namely, ∂_r(rA_ϕ) = 0 at r = 1, and then A_ϕ = 0 at all the other boundaries. We note that we used radial boundary conditions for simplicity in this work, but the impact of more realistic conditions (as a match to an external potential field) would be worth considering in the future.

2.2. Solar data and preprocessing

As explained in the next section, our data assimilation procedure uses two types of solar data sets: (a) the sunspot number time series and (b) the line of sight surface magnetic field (B_los) from monthly generated synoptic maps. More precisely, for the SSN, we use the new (v2.0) time series provided by the SIDC SILSO data base. Since, as explained later, the Solar Predict tool can assimilate hemispherical data, we use in practice the hemispherical data base from SILSO. The data base runs from 1992 until today. To have the ability to cover the same time period than B_los, in our previous study, we extended in the past the SILSO hemispherical data series with the UCLE station operated at the Royal Observatory of Brussels. This allowed us to obtain a hemispherical SSN time series from 1975 until present (see Fig. 1). We show in the appendix how we cross-calibrated the two time series around the 1992 period. This data is used in the objective function 𝒥_W (described later in this paper) through the variables W_N^o and W_S^o. We used the 13-month smoothed SSN time series for the assimilation when minimizing the objective function 𝒥_W. We also used the monthly average time series for plotting purposes, but its higher monthly temporal variation is too irregular to be used efficiently in the data assimilation procedure. We note that using the 13-month smoothed SSN data series is standard in the community. Indeed, this SSN series is the reference used to officially determine the amplitudes of the minima and maxima and the timing of each solar cycle. We further note that this results in a six-month time lag when assimilating the data for forecasting cycle 25, as this cycle is still ongoing. At the time of writing (Fall 2024), this means that we are assimilating the SSN data up to April 2024.

Fig. 1.

SSN time series used in the assimilation procedure. We assimilate the hemispherical data and can choose to either forecast the total or hemispherical solar activity. Shown in dark blue is the total 13-month smoothed SSN times series and in light blue shades the northern and southern sub time series (north + south = total). We superimpose in orange the monthly data, which is six months ahead by construction.

For the line-of-sight magnetic field, we made use of various sources of synoptic maps generated for each Carrington rotation (CR). In the present paper we use magnetic field data covering the period from early 1975 to April 2024 to be compatible with the 13-month smoothed time series, namely, from CR1625 to CR2282 even though we have data series reaching CR2288 (i.e., as late in 2024 as the one-month mean SSN). Therefore, depending on the time period considered, the data for B_los come from different sources (instruments). When available, we use in priority the GONG synoptic maps from CR2059 to CR2288 (Harvey et al. 1996). For the oldest time period we use mostly Wilcox observatory synoptic maps (CR1625−CR2007, CR2015−2017, CR2026, CR2035, and CR2040−2042) (Scherrer et al. 1977; Hoeksema 1991). Finally, in the case of data gaps we also made use of SOLIS synoptic maps (CR2008−2014, CR2018−2025, CR2027−2034, CR2036−2039, CR2043−CR2058, and CR2189) (Bertello & Marble 2015). From these synoptic maps, we were then able to homogenize the data set by projecting high resolution maps on the same latitudinal resolution, $N_{θ}^{o}$ $N_{\theta}^{o}$ , as the Wilcox synoptic maps (e.g., 89 latitudinal mesh points). We then averaged in longitude and created a butterfly diagram as a function of latitude and time of B_los (see Fig. 2, top panel). In practice, we further used the temporal and spatial filters of the butterfly diagram, as the underlying dynamo model cannot capture all the small-scale field variations. We performed several combinations and settled on two main filtering choices. We fixed the maximum degree, ℓ_max, used in the assimilated butterfly diagram to 10. For the temporal filter, we chose between either one or five-year filtering time windows. Results for the forecasting of the next solar cycle can differ depending on the filtered observation files used during the data assimilation part. However, in most cases considered in this study, the forecast maximum amplitude and timing agree reasonably well (e.g., within the statistical error bars). In Fig. 2, we illustrate the original data used and its one-year and five-year temporal filtered versions, assuming ℓ_max = 10 as the latitudinal spatial filter. We see that the main polarity inversions and equatorward and poleward dynamo branches are retained, as are the polar flux accumulation and timing of reversal (within small acceptable variations).

Fig. 2.

Magnetic butterfly diagram depicts the azimuthally averaged magnetic synoptic maps (coming from various sources) for each Carrington rotation from CR1625 until CR2288 and stacked in time. Top row: Unfiltered data and one-year (middle row) and five-year (bottom row) filtered versions. The later two can be used as line-of-sight magnetic field observations in our assimilation pipeline.

2.3. Data assimilation strategy

The assimilation setting adopted in this work is similar to the one used in Hung et al. (2017), except that real solar observations are used here where only synthetic data were assimilated in our previous studies. We only recap the key features here and we refer to Hung et al. (2017) for the detailed description of the method.

2.3.1. Data assimilation setting

The idea of the present work is to assimilate data over a certain period of time, often covering a full cycle period and thus a typical total time of about 11 years, to constrain the control vector that will be used to produce a forecast for the future magnetic field trajectory. Our control vector consists of a part related to the meridional flow and another part related to the initial magnetic field. As we did not want to reconstruct the flow and field pointwise because the dimension of the control vector would then be too large, we expanded our meridional flow and initial magnetic field on well-chosen basis functions. The coefficients of those expansions constitute our control vector. The flow will thus be determined by the coefficients d_k, l as defined in Sect. 2.1 and the magnetic field will be represented by its expansion on the eigenvectors of the covariance matrix of the dynamo model, as explained in Appendix B of Hung et al. (2017).

To assimilate data into our dynamo model, we use a variational assimilation technique, which consists of adjusting a control vector that minimizes a well-defined objective (or cost) function measuring the misfit between outputs of the model and observations. The minimization is performed using a quasi-Newton technique, which requires the computation of the function to be minimized, along with its gradient with respect to each component of the control vector. The choice of the objective function is quite crucial in this procedure and details of each term involved in our function, 𝒥, are given below, as well as details about the control vector composed of the meridional flow and the initial magnetic field. As we intend for our meridional flow to vary over time to produce a best-fitting trajectory for the magnetic field over the whole assimilation window, we decided to use the variational approach on shorter windows (typically of one year) over which our meridional flow is steady. We consequently used a variational technique within each one-year window over the ten-year time span of the data assimilation phase, with each one-yr window being the initial guess of the next one in a sequential approach. Figure 3 shows a schematic of our assimilation strategy.

Fig. 3.

Schematic of our data assimilation procedure.

We can summarize our procedure as follows: for the first year of the total assimilation window, the initial guess comes from a dynamo model based on a unicellular flow with a magnetic cycle of 22 years (i.e., the only nonzero coefficient in the MC expansion is d_1, 2 = 1/3). Then, for the subsequent assimilation windows, the initial guess will be the forecasted magnetic field and velocity at the end of the previous assimilation. Within each one-year window, we estimate the coefficients of the stream function and initial condition, which minimizes the objective function defined below. Consequently, we obtain a meridional flow described as a piecewise constant function in time.

After this assimilation procedure, our previously described BL flux transport dynamo model is run to produce the forecasted trajectory for the magnetic field. It uses the meridional flow and initial magnetic conditions resulting from the assimilation of the last window.

2.3.2. Objective function and regularization

As stated above, the idea of the variational method is, within each window, to minimize a well defined objective function which measures the misfit between the observed quantities and the corresponding outputs from the model. The objective function we define in this work is the sum of two terms measuring a misfit to the observations. The first term is the misfit on the line of sight magnetic field:

$\begin{matrix} J_{los} = \sum_{i = 1}^{N_{t}^{o}} \sum_{j = 1}^{N_{θ}^{o}} \frac{{[B_{los} (θ_{j}, t_{i}) - B_{los}^{o} (θ_{j}, t_{i})]}^{2}}{σ_{B_{los}}^{2} (θ_{j})} \cdot \end{matrix}$ $\begin{aligned} \mathcal{J} _{\rm los}=\sum _{i = 1}^{N_t^o} \sum _{j = 1}^{N_\theta ^o} \frac{\left[B_{\rm los}(\theta _j,t_i)-B_{\rm los}^o(\theta _j,t_i)\right]^2}{\sigma ^2_{B_{\rm los}}(\theta _j)}\cdot \end{aligned}$ (7)

Here B_los is the surface radial field predicted by our mean field dynamo model and B_los^o is the observation extracted from the butterfly diagram. The weight, σ_{B_los(θ)}, is simply 1/sin(θ). This additional factor enables us to put more weight on the activity belt at mid-latitudes, as this is where most sunspots appear.

The second term is related to the misfit in the SSN in the northern and southern hemispheres,

$\begin{matrix} J_{W} = \sum_{i = 1}^{N_{t}^{o}} \frac{{[W_{N} (t_{i}) - W_{N}^{o} (t_{i})]}^{2}}{σ_{W_{N}}^{2}} + \frac{{[W_{S} (t_{i}) - W_{S}^{o} (t_{i})]}^{2}}{σ_{W_{S}}^{2}}, \end{matrix}$ $\begin{aligned} \mathcal{J} _{W}=\sum _{i = 1}^{N_t^o} \frac{\left[W_{N}(t_i)-W_{N}^o(t_i)\right]^2}{\sigma ^2_{W_N}}+ \frac{\left[W_{S}(t_i)-W_{S}^o(t_i)\right]^2}{\sigma ^2_{W_S}}, \end{aligned}$ (8)

where σ_{W_N} and σ_{W_S} represent the uncertainties on respectively the northern and southern SSN measurements. As stated above, the SSN in the northern and southern hemispheres have been observed and monitored in various solar stations since the mid-20th century and SILSO provides hemispherical SSN series since 1992 (noting that we have extended it back in time until 1975 using the UCLE station). However, in our dynamo model, we do not produce spots at the surface of our domain. We thus have to use a proxy for the SSN, to be compared to the observables. As in Hung et al. (2017), this proxy is taken to be the toroidal field at the base of the convection zone, integrated over a small radial extent and averaged in latitude. We thus define W_N(t) and W_S(t) our northern and southern hemisphere sunspot proxies as:

$\begin{matrix} W_{N} (t) & = f a c \times \int_{θ = 0}^{θ = π / 2} \int_{r = 0.7}^{r = 0.71} B_{ϕ}^{2} (r, θ, t) r^{2} sin θ d r d θ, \end{matrix}$ $\begin{aligned} W_N(t)&= fac \times \int _{\theta = 0}^{\theta =\pi /2} \int _{r = 0.7}^{r = 0.71} B^2_{\phi }(r,\theta ,t) r^2 \sin \theta \mathrm{d}r \mathrm{d}\theta , \end{aligned}$ (9)

$\begin{matrix} W_{S} (t) & = f a c \times \int_{θ = π / 2}^{θ = π} \int_{r = 0.7}^{r = 0.71} B_{ϕ}^{2} (r, θ, t) r^{2} sin θ d r d θ, \end{matrix}$ $\begin{aligned} W_S(t)&= fac \times \int _{\theta =\pi /2}^{\theta =\pi } \int _{r = 0.7}^{r = 0.71} B^2_{\phi }(r,\theta ,t) r^2 \sin \theta \mathrm{d}r \mathrm{d}\theta , \end{aligned}$ (10)

where fac is a scaling factor so that our proxies are directly comparable to the observed SSN values. The value of fac = 150 is used here.

In this work, we also imposed some constraints and regularization on our control vector: the meridional flow and initial magnetic conditions. As described above, we adopted a decomposition of the stream function on Legendre polynomials in latitude and on sine functions in radius. We can thus control the meridional flow by imposing some constraints on the coefficients of this expansion. In particular, we chose to:

impose the condition that the only nonzero coefficients in the meridional flow expansion are d_1, 2 (representing the 1-cell per hemisphere flow) and d_1, 3 (representing a flow with three cells in latitude, to account for some degree of north-south asymmetry in the flow).
impose the condition that the initial magnetic field of one assimilation window stays close to the value of the magnetic field at the end of the previous window. This helps prevent discontinuities in the magnetic field between successive windows and to propagate information from the previous assimilation to the next one. The first additional term in the objective function is then expressed as:

$\begin{matrix} J_{1} = \sum_{i = 0}^{Nmag} {(c (i) - c_{old} (i))}^{2}, \end{matrix}$ $\begin{aligned} \mathcal{J} _1 = \sum _{i = 0}^{Nmag} \left(c(i)-c_{\rm old}(i)\right)^2, \end{aligned}$ (11)

where Nmag = 10 is the number of eigenfunctions used to expand our magnetic initial conditions, c(i) is the coefficient on the ith eigenfunction in the current one-year window and c_old(i) is the coefficient on the ith eigenfunction in the previous one-year window (see Hung et al. 2017, for further details about our expansion procedure).
impose the condition that d_1, 3 should be less than a certain percentage p of d_1, 2. We adopt the value of p = 10% in this work. This adds the following term:

$\begin{matrix} J_{2} = \sum_{k = 1}^{m} \sum_{l = 1}^{n} {(max (0, | d_{k, l} | - p \times | d_{1, 2} |))}^{2} for d_{k, l} \neq d_{1, 2} . \end{matrix}$ $\begin{aligned} \mathcal{J} _2 = \sum _{k = 1}^{m} \sum _{l = 1}^n \left(\max (0, \vert d_{k,l}\vert - p \times \,\vert d_{1,2}\vert )\right)^2 \,\,\, \mathrm{for} \,\,\, d_{k,l} \,\,\, \ne d_{1,2}. \end{aligned}$ (12)
impose the condition that the deviation from the value of d_1, 2⁰ = 1/3 for d_1, 2 (producing a period of around 22 years) is less than a certain amount p₂. We adopt values between p₂ = 1% and 10% in this work. This term is expressed as:

$\begin{matrix} J_{3} = {(max (0, | d_{1, 2} - d_{1, 2}^{0} | - p_{2} \times d_{1, 2}^{0}))}^{2} . \end{matrix}$ $\begin{aligned} \mathcal{J} _3 = \left(\max (0, \vert d_{1,2}-d_{1,2}^0 \vert - p_2 \times d_{1,2}^0)\right)^2. \end{aligned}$ (13)

The assimilation procedure thus consists of the minimization of the full objective function, 𝒥, composed of the sum of all the terms listed above in Eqs. (7), (8), (11), (12), and (13), weighted by the different coefficients A_los, A_W, A₁, A₂, and A₃ as follows:

$\begin{matrix} J = A_{W} J_{W} + A_{los} J_{los} + A_{1} J_{1} + A_{2} J_{2} + A_{3} J_{3} . \end{matrix}$ $\begin{aligned} \mathcal{J} =A_W \mathcal{J} _W+ A_{\rm los}\mathcal{J} _{\rm los} + A_1 \mathcal{J} _1+ A_2 \mathcal{J} _2+ A_3 \mathcal{J} _3. \end{aligned}$ (14)

We note that we chose to fix A_los = 1, so that the relative importance of the SSN observations compared to the line-of-sight radial field is solely determined by the value of A_W. An investigation into the best choices for the regularization coefficients A₂ and A₃ compared to the value of the misfit coefficients A_W and A_los was conducted on basis of the prediction of cycle 24 and the results are presented in Sect. 3.2. The last coefficient A₁ was chosen in order to maintain a satisfying degree of continuity between two successive one-year windows and to impose some memory between the ten one-year windows in the assimilation procedure, ensuring that additional information is gained by assimilating each successive window. It is fixed to a value of A₁ = 10² for all cases discussed below. We note here that coefficients A₁, A₂, and A₃ are (for the moment) not part of our control vector as we wanted to focus on the physical quantities (i.e., the meridional flow and magnetic field initial conditions). Assimilating those extra three parameters in our pipeline has been postponed to later studies.

Our assimilation procedure finally consists of adjusting a control vector constructed by the two-meridional flow coefficients, d_1, 2 and d_1, 3, and the ten coefficients representing the projection of the magnetic field on the eigenvectors of the covariance matrix. Our assimilated data consist of 89 latitudinal points times 12 months for B_los and 1 value for each hemisphere times 12 months for the SSN smoothed on a monthly basis. We thus have a largely overdetermined problem, in which removing part of the data would thus have a limited impact on our assimilation results.

2.3.3. Ensemble members

Here, we describe how we produced an ensemble of predictions for the next solar cycle, when the data assimilation procedure described above is over.

Creating an ensemble of forecast trajectories is a common procedure as it allows us to test the sensitivity to initial conditions of the prediction. It is well known since the study of Lorenz (1963) that small perturbations in a dynamical system can lead to large variations on a long timescale. We have shown in Hung et al. (2017) that our data assimilation pipeline, when let free to evolve beyond its final assimilated observations data points, takes between 10 to 15 years to reach the same level of errors/mismatch in a controlled experiment as a free dynamo run. Hence the temporal horizon limit for our forecast is of the order of one solar cycle. Beyond that 10 or 15 year horizon limit, the uncertainty has grown too large to be able to consider the prediction as meaningful.

To probe this uncertainty and its growth, it is convenient to generate an ensemble of trajectories by perturbing the initial magnetic field conditions and the meridional circulation used to generate the forecast trajectory. To do so, we introduce a random perturbation to the central values obtained at the end of the assimilation procedure. The time at which we introduce this random perturbation is called the extrapolation time τ_exyr. Before that point, the dynamo model is being controlled via our data assimilation procedure to follow the solar activity data and beyond that point, it is let free to go with the initial conditions and meridional flow obtained during the assimilation and a fixed set of parameters. We can choose the number of ensemble members N_ens. We usually use values between 30 and 60 ensemble members for each value of A_W, which allows us to sample reasonably well the most probable forecast trajectories for a given set of parameters and initial magnetic conditions. We further use a set of 12 A_W values A_W = [0.2, 0.5, 0.8, 0.9, 1, 2, 3, 6, 11, 21, 51, 101], from loosely tracking the SSN versus tightly following it during the assimilation phase. Overall, if we choose 40 ensemble members per A_W values, we get a final set of 480 distinct trajectories (members) that constitutes our global ensemble. In Fig. 4 we show each of the 480 trajectories in the most general hemispherical version of the pipeline for illustration purposes. Please note that these trajectories are obtained by perturbing both the meridional flow and the magnetic field, explaining the sharp but continuous change of the 480 trajectories with respect to the mean seen in the figure. The level of perturbations has been chosen to provide a representative sampling of the observed historical variations of the 11 year solar cycle.

Fig. 4.

Trajectory of the 480 ensemble members. Example of hemispherical data assimilation starting in mid 2013 and ending in τ_exyr = mid 2023. The 12 × 40 = 480 extrapolations (forecast ensemble trajectories) computed by the solar dynamo model are starting beyond that time. We clearly see that further we look in the future larger the spread is both in amplitude and timing of the next maximum of cycle 26. Forecast of the next minimum around 2030 is within our temporal horizon limit at the time of the writing of the paper. In order not to overcomplicate the figure, we voluntarily omit to plot the 2 × 480 trajectories of each hemisphere.

2.3.4. Total versus hemispherical

As stated before, our data assimilation pipeline can assimilate either the total (global) SSN time series or its hemispherical counterpart. The only difference is the objective function where we either consider the two hemispheres separately or together. The objective function shown in Eq. (8) distinguishes the two hemispheres. When they are assimilated together, function 𝒥_W becomes:

$\begin{matrix} J_{Wtot} = \sum_{i = 1}^{N_{t}^{o}} \frac{{[W_{T} (t_{i}) - W_{T}^{o} (t_{i})]}^{2}}{σ_{W_{T}}^{2}}, s \end{matrix}$ $\begin{aligned} \mathcal{J} _{Wtot}=\sum _{i = 1}^{N_t^o} \frac{\left[W_{T}(t_i) -W_{T}^o(t_i)\right]^2}{\sigma ^2_{W_T}},s \end{aligned}$ (15)

with σ_{W_T} the uncertainty on each SSN measurement, W_T^o(t_i) = W_N^o(t_i)+W_S^o(t_i) the total observed SSN and W_T(t_i) = W_N(t_i)+W_S(t_i) the proxy for the total SSN defined as

$\begin{matrix} W_{T} (t) = \int_{θ = 0}^{θ = π} \int_{r = 0.7}^{r = 0.71} B_{ϕ}^{2} (r, θ, t) r^{2} sin θ d r d θ . \end{matrix}$ $\begin{aligned} W_T(t) = \int _{\theta = 0}^{\theta =\pi } \int _{r = 0.7}^{r = 0.71} B_{\phi }^2(r,\theta ,t) r^2 \sin \theta \mathrm{d}r \mathrm{d}\theta . \end{aligned}$ (16)

3. Validation of the forecasting strategy on cycles 22, 23 and 24

This first result section is dedicated to addressing the quality of our assimilation pipeline on the three previous cycles; namely: 22, 23, and 24, for which the true SSN evolution is known. In the next section, we focus on the ongoing cycle 25, for which an estimate of the maximum as well as the next minimum is given.

3.1. Forecasting of the past three cycles

3.1.1. Overall prediction as a function of the cycle phase

In this section, we wish to assess the quality of our data assimilation and forecasting procedures on past cycles for which the evolution of the sunspot number in both hemispheres are known. Our main goal here was to test under which conditions our technique would produce accurate and precise predictions for the timing and maximum of cycle n + 1, when the data were assimilated on cycle n at different phases of the cycle. We noticed that an important parameter was the weight A_W put on the observations of the SSN and thus decided to vary this parameter to produce an ensemble from which predictions are estimated.

For the assimilation step, we used six different values for A_W, namely 1, 2, 6, 11, 21, and 51, except for cycle 22, where we added smaller A_W values of 0.2, 0.5, 0.8, 0.9. The total assimilation window is chosen to be ten years in length, divided into ten one-year windows. As stated before, we test the effect of assimilating different phases of the cycle. We thus have five different cases, with the total assimilation window ending before the solar minimum, then around the solar minimum, and then at one year, two years, and three years after the minimum. For example, in order to forecast cycle 24, we use windows covering cycle 23 in the intervals [1997, 2007], [1998, 2008], [1999, 2009], [2000, 2010], and [2001, 2011].

After the assimilation procedure, we get a best-fit model from which we calculate the predicted trajectories. To do so, the meridional flow coefficients obtained after the data assimilation for each model are perturbed using a Gaussian noise with a typical amplitude of 6 × 10⁻⁵ to produce an ensemble of 40 members per value of A_W (cf. Fig. 4). A beam of predictions is then produced for each cycle phase, either for each A_W or for all the values of A_W together.

Results for cycle 22 are illustrated in this section in Fig. 5, and shown in the appendix for cycles 23 and 24 in Figs. C.1 and C.2. In addition to these figures, more quantitative estimates of the statistics of the distributions of the time at maximum t_max and of the sunspot number at this maximum SSN_max are given in Table C.1. Those more quantitative results are further illustrated in Fig. 6.

Fig. 5.

Prediction for the sunspot number of cycle 22 as a function of time, at different assimilated cycle phases. For the sake of clarity, the legend (same for all plots) is indicated only on the 3rd panel.

We start by focusing on Figs. 5, C.1, and C.2. Those figures show that for all cycles, the assimilation phase (until the vertical dashed line) performs well at following the hemispherical and total SSN data. Since we apply a variational assimilation on separate one-year windows, we see some discontinuities between consecutive windows due to the fact that continuity between windows is constrained on the initial magnetic conditions (by the term weighted by A₁ in the objective function; see Eq. 11) but not on the meridional flow coefficients. The forecasting part is of course more interesting to focus on. The green curve indicates the total predicted SSN while the purple (resp. blue) indicates the predicted SSN in the northern (resp. southern) hemisphere. The true values are shown in red, pink and yellow respectively. In all the predicted curves, the 1-σ dispersion is shown in shaded areas around the mean curves. The date at which the assimilation phase ends is here clearly an important parameter in the success of the prediction. As expected, forecasts initiated in the rising phase usually perform better at following the evolution (even after the maximum) than forecasts starting slightly before the previous minimum. This is particularly clear when we compare the last panels of each figure, where the correct evolution of the total and hemispherical data is usually captured up to four or five years after the maximum (i.e., close to the next minimum) and the first panels where the predictions most of the time fail right away at following the true curves. In between, the forecasted curves get closer and closer to the real values, with smaller and smaller error bars. This is true mostly for cycles 23 and 24. Cycle 22 seems to present more difficulties, in particular the error bars on the second and third panels are large, owing to the fact that 2 different populations of predictions (weak and strong cycles) appear for different values of the weighting parameter of the SSN values A_W. We come back to cycle 22 in Sect. 3.1.3.

We note that the plots show the results of the predictions using all the values of A_W together. However, we noticed in our procedure that small values of A_W perform better for a prediction made from the previous cycle minimum, while larger values of A_W perform better when the predicted cycle has already started its rising phase. This can be understood by the fact that A_W represents the weight given to the SSN (represented by the integral of B_ϕ at the tachocline in our model) compared to the weight given to the surface radial field. At solar minimum, there will of course be only a few sunspots, while the strength of the radial field will be the main progenitor for the sunspot-producing toroidal field of the next cycle (Bhowmik et al. 2023).

We see that our procedure enables us to obtain some asymmetries between the northern and southern hemispheres. This is seen in the blue and purple curves of Figs. 5, C.1, and C.2, which are peaking at different values and at different times. If we focus on the maximum SSN values, we see that the level of asymmetry produced in our model is compatible with what is seen in the observations. Indeed, the relative difference between the northern and southern SSN typically reaches 10 to 30% in the observations as well as in the predicted curves. However, what our pipeline has more difficulty to reproduce is the surge-like behaviour of the yellow and pink curves of Figs. C.1 and C.2, which produce very distinct peaks in the northern and southern hemispheres. This surge-like behaviour is indeed not designed to be reproduced in our current pipeline where the forecast trajectories coming from our dynamo model are close to sinusoidal and thus much smoother. If we now focus on the timing of the cycle, we see that the model does not produce asymmetries as pronounced as in the real Sun. This is especially true for cycle 24, where the 13-month smoothed data show a 2.6 years delay between both maxima. In comparison, our model predicts only a maximum delay of less than one year. This could be due to the fact that we constrained in our procedure the only coefficient producing an asymmetry with respect to the equator, namely d_1, 3, to be smaller than 10% of d_1, 2. Relaxing this constraint and adding more coefficients to the meridional flow would probably help to produce less similar behaviours for the northern and southern hemispheres and possibly higher delays between the maxima.

3.1.2. Convergence of predictions

Next, we turn to Fig. 6, where we show the prediction of only the maximum total SSN and the time of this maximum for the different cycle phases shown in the previous figures. To produce these plots, we calculated the statistics (mean, median, first and third quartiles) of the distribution of the points of coordinates (t_max, SSN_max) for all members of the ensemble. We note that the values of the means in this figure will be different from the maximum values of the mean curves shown in Figs. 5, C.1, and C.2 and thus should not be directly compared. Moreover, for these more quantitative estimates, some selection of cases has been applied. To be considered in our statistics, a predicted maximum had to happen prior to seven years after the end of the assimilation phase; it would also have to be contained within the range of [60, 350], which would be consistent with minimum and maximum values ever observed in the past solar cycles. The extent of the horizontal and vertical lines are determined by the first and third quartiles of the distributions. The mean values are found at the intersection between these two lines. The cyan diamonds show the true values of the maximum SSN. We usually plot two values since the cycles usually present a double peak due to the delay between the hemispheres. This is quite obvious when looking at the red curves of Figs. 5, C.1, and C.2.

Fig. 6.

Summary of the predicted time and sunspot number at maximum for cycles 22 (upper panel), 23 (mid panel) and 24 (lower panel) predicted from different phases of the previous cycle. The error bars are taken between the first and third quartiles of the distributions of t_max and SSN_max given in Table C.1. The second quartiles (i.e., median values) are indicated as dots and mean values as stars located at the intersection of the error bars. The cyan dots indicate the true values of the maximum with error bars on the sunspot number. For all cycles, two cyan dots are present, representing the two peaks of similar amplitude in the different cycles, visible on the 13 month smoothed sunspot number (red curves of Figs. 5, C.1, and C.2).

Figure 6 clearly shows the convergence of our predictions, while the assimilated cycle phase gets closer to the maximum. In all cases, the error bars indeed decrease and the predicted values get closer to the true values. We note that our predictions do not show a double-peak shape since the hemispheres remain quite in phase with each other. We thus estimate that the prediction could be considered as successful when a value between the two true peaks is reached. We also note from Fig. 6 that the errors made on the timing of the predicted cycle is in general much less (in percentage) than the error made on the sunspot maximum. For cycle 23, the maximum error is around one year and reduced to about six months for cycle 24. This could be due to the fact that our background dynamo model is calibrated to produce an 11 year cycle while no a priori is introduced for the sunspot number.

3.1.3. Difficulties with cycle 22?

The previous section showed that the predictions of cycle 22 have given significantly less good results than for the two other cycles. Indeed, as seen in Fig. 5, the predictions starting from the previous minimum (in 1985 and 1986) systematically give a very weak amplitude of the next maximum compared to the true value. When the assimilated period extends past the minimum phase (i.e., ends in 1987 and 1988), the error bars on the predicted future (both the maximum SSN and timing) are quite large. For example, for the prediction starting in 1987, the difference between the third and first quartiles is about 100% of the median while the same value reduces to 70% for the same cycle phase for cycle 23 and to 30% for cycle 24. The error bars are relatively small as can be seen in the bottom panel of Fig. 6. The reason for the larger dispersion in the predictions for cycle 22 is unclear but possible explanations may be given. First, it is quite obvious that this cycle was stronger than the two others and the prediction from the previous minimum may be more difficult if too much weight is put on the high A_W values. This is why we included for this cycle additional values of A_W < 1, such that more weight is put on the radial line-of-sight magnetic field compared to the SSN. We indeed find that low values of A_W are generally shown to be better at capturing the high maximum of cycle 22, as compared to higher values.

A second possibility for the peculiarity of this cycle is the long-lasting minimum which can be seen in Fig. 5, especially for the hemispherical data. We indeed observe that the curves remain very flat around the same value of the SSN of around 10 to 15 in each hemisphere for almost 2 years between 1985 and 1987. Especially for an assimilation phase finishing within this period (first and second panel of Fig. 5), the model seems to have difficulties capturing the particularly fast rise of the next cycle. This lingering minimum is quite obvious on the 13 month smoothed SSN data but less clear if the five-year smoothed curves are used. Incorporating the five-year smoothed data in our assimilation and forecast procedures, we indeed find quite different results. The results are illustrated in Fig. 7 where the predictions are shown starting from year 1986 using the 13-month smoothed (top panel) and five-year smoothed (bottom panel) SSN time series. All other parameters are kept identical between the two cases. It is clear from this figure that the sharper minimum of the five-year smoothed curve allows the model to better capture the rising phase of the cycle. The rising phase being better reproduced, the prediction for the next maximum is thus closer to the true value.

Fig. 7.

Predictions for cycle 22 starting from year 1986, using the 13 month smoothed SSN data (top) and the five-yr smoothed time series (bottom).

This test shows that it may be worth considering various filtering windows for the SSN as well as for the butterfly diagram in an ensemble prediction to minimize the adverse effect of some peculiar features, like the long-lasting minimum appearing mostly in the 13 month smoothed data here to impact and bias the forecast.

3.2. Influence of parameters

In this subsection we justify our choice of the weighting parameters A_i’s of the background terms involved in the objective function 𝒥. We focus, in particular, on the values of A₂ and A₃ which respectively appear in front of the term imposing that the coefficients of the meridional circulation other than d_1, 2 are not higher than 10% of d_1, 2 and in front of the term ensuring that d_1, 2 stays close to its nominal value of 1/3.

In order to test the effect of these two coefficients, we fix the other coefficients to the following values: A_W = 11, A_los = 1 and A₁ = 10² while varying A₂ and A₃ from 10³ to 10⁶. We assess the quality of the prediction by measuring the distance between the predicted maximum (both its amplitude SSN_max and its timing t_max) and the true maximum available for cycles 23 and 24.

The results are presented in Fig. 8 for the prediction of cycle 24. It is rather clear from this figure that some cases perform poorly for the prediction of the maximum. For example, the blue and red colors corresponding to values of A₂ = 10³ and 10⁴ usually give large misfits, independently of the value of A₃. Similarly, plus and stars symbols, corresponding to values of A₃ = 10³ and 10⁴ usually fail at forecasting the max. We conclude from these parametric surveys that a value of A₂ more than 10⁴ and A₃ more than 10⁵ are necessary to ensure convergence towards a better forecast. We indeed observe in the figure that beyond those values, the predicted maximum does not change very much and the quality of the convergence thus stalls. In the results presented in this work, we fixed A₂ = 10⁵ and A₃ = 10⁶, corresponding to the green crosses in the plots, which produce satisfying results for cycle 24. Equivalent tests were conducted on cycle 23 (not shown here) with similar results.

Fig. 8.

Predictions of the value and timing of the maximum sunspot number for cycle 24 for different values of A₂ and A₃. The included cases correspond to a 10-year assimilation window extending until 2011 for the prediction of cycle 24. The various symbols correspond to different values of A₂ (colors) and A₃ (markers). The cyan symbol is the true maximum.

4. Cycle 25 forecasting

We now present how our forecasting tool performs for the current cycle 25. At the time of writing of the paper we are close to the peak of cycle 25. NOAA/NASA consensus in 2019, which the current pipeline was one of the forecasting tools taken into account in drawing the final consensus, was announcing a mean maximum SSN value of 113 ± 18 that would be reached between 2023 and 2026. A recent update in October 2023 only based on the Space Weather Prediction Center (https://www.swpc.noaa.gov/) tools gives a higher SSN value 153 ± 8 and a more precise maximum occurring on September 2024 ±5 months.

A beta version of our pipeline was giving a low value of 92 ± 10 for the cycle maximum and a timing in mid-2024 (Brun et al. 2020). In the next subsection we present the released version of our 11 year solar cycle forecasting tool with the objective function described in Sect. 2.3.2 that provides a better control of systematics and forecasted value of the previous cycles 22 to 24 (see Sect. 3). We did a rerun of the December 2018 conditions and compared it to various forecasting done since, namely October 2023 as in the latest update cited above and April 2024 for the latest available data.

4.1. No cast prediction

Before discussing what our tool predicts for solar cycle 25, we would like to discuss the “no cast” solution of cycle 25. The “no cast” solution consists of using historical data as predictors. The simplest one is to take the previous cycle 24 or 23 (to respect an odd/even alternation) and to use it to forecast cycle 25. A slightly more elaborate one is to use the mean of either the first 24 cycles or say the last 5.

As can be seen in Fig. 9, this procedure of using previous cycles to predict the next does not provide a very accurate estimate of cycle 25. The two averages are much higher than cycle 25, meaning that this cycle is moderate. Likewise, cycle 24 is not providing a good estimate as it is clearly too weak and by opposition cycle 23 is too strong. This clearly demonstrates how difficult it is to forecast precisely the next solar cycle (Bushby & Tobias 2007; Petrovay 2020). Historical data are useful as they provide a global probable envelope of what the next solar cycle could be, but it is not accurate enough. Hence, more sophisticated tools such as Solar Predict are needed if we wish to go beyond historical data, see for instance the reviews of Hathaway (2015), Petrovay (2020) for a list of the various methodologies developed in the community.

Fig. 9.

No cast of cycle 25, where we compared cycle 25 to various historical data: the previous cycle 24 (dashed line), cycle 23 (the odd numbered previous cycle in dotted line) and two averages of the past record, a mean of the previous 24 cycles (cyan dash-dotted), and of the last 5 (pink dash-dotted). As we can see, none of these no cast or historical data provide an accurate prediction of the ongoing cycle 25.

Finally one could also consider the timing of the current cycle with respect to the Gleissberg cycle. We recall that the Gleissberg cycle is a possible 90 to 100-yr modulation of the 11 year cycle envelope. For cycle 25 this would mean considering the amplitude of cycles 14 to 16 (i.e., about 100 years ago) for instance. However, that does not work so straightforwardly as cycles 14 to 16 do not represent a good match to cycle 25.

4.2. Prediction of the total SSN

We will now turn to our prediction of the current cycle 25. The cycle is already well advanced and such a forecasting must be considered as a validation step as we are likely experiencing in 2024 the peak of solar activity. In Figs. 10 and 11 we show the typical output of our Solar Predict forecasting tool. As explained before, we are assimilating the SSN and the butterfly diagram data into our solar dynamo model. By the means of a minimisation procedure we adapt the meridional circulation coefficients to match the assimilated data over a 10-year long period. We clearly see that the DA algorithm is doing very well for the SSN evolution (Fig. 10), as the model trajectory (in blue dash triple dot line style) closely follows the 13 month smoothed SSN time series (in red). We note that the small disagreement at the very beginning is solely due to the time it takes in the first variational window to adjust the guess solar dynamo model to the solar data.

Fig. 10.

Solar cycle 25 forecast as of April 2024, for the sunspot number evolution. The next maximum as well as next minimum values are shown for the total SSN.

Fig. 11.

Solar cycle 25 forecasts as of December 2023, for the butterfly diagram. The observed radial field is shown in the top panel and the results of the assimilation phase (from 2014 to 2024) and forecast (from 2024) are illustrated on the bottom panel.

The agreement is less clear during the assimilated phase of the butterfly diagram (Fig. 11). In particular, the assimilation captures only the largest-scale features of the observed pattern and the polar field gets much more confined close to the poles in our procedure than in the observations. This can be explained by the weight we impose in our objective function on some key observables. Indeed, we put more weight in the assimilation on the SSN than on the B_los by imposing a large number of A_W in our ensemble to be larger than 1. Secondly, by using a σ_{B_los}(θ) to be equal to 1/sin(θ), we also put more weight on the low latitudes; thus, do not expect as good a match at high latitudes. We see however here that the behaviour at low latitudes is correctly reproduced, with the positive and negative polarities appearing in the correct latitudinal regions around 2022 after the long period of solar minimum. We note that various functions have been implemented for σ_{B_los}(θ) to better take into account the polar field but since our main goal was the SSN evolution, we decided to keep 1/sin(θ) as this case gives the best fit to the SSN during the assimilation phase.

We now turn to the forecast itself, after the DA procedure has converged over the 10 years before time τ_exyr (indicated by the vertical dotted lines in both figures). On Fig. 11, if we focus on the forecast, we see a smooth evolution in time of both polarities and quite symmetric with respect to the equator. As will be discussed in the next section, we indeed do not produce large asymmetries between the two hemispheres in our current forecast of cycle 25. The smooth evolution is due to our dynamo model which produces, within the range of parameters considered here, a nearly sinusoidal shape for the magnetic fields as a function of time and large-scale features in space. It is of course more interesting here to focus on the forecast in the SSN shown in Fig. 10. The 3-yr SSN forecast trajectory (shown in green) is here extrapolated from the latest adjusted control vector at τ_exyr. We indicate in the figure the forecasted maximum and its timing. Error bars represent the 1-σ (standard deviation) statistical uncertainty coming from the 480 ensemble members. Because we are near maximum, the errors are here relatively small, representing about 2.4 months in timing and about ±15 in sunspot number. We will see in Sect. 4.5 that these errors can be larger if the forecast is made starting from an earlier date in the cycle phase. Our algorithm is currently forecasting on average a maximum this Fall 2024 and a moderate solar cycle amplitude of 143.1 ± 15.0.

4.3. Prediction of the hemispherical SSN

Because we are assimilating the SSN time series into a dynamo model that covers the full latitudinal range from a co-latitude of 0 up to π, we do not have to assume any symmetry with respect to the equator. We can hence assimilate the hemispherical SSN time series rather than the total, hence taking into account potential phase delay between the two solar hemispheres. Indeed it is well known that the Sun’s quadrupolar mode leads to asynchronous hemispheres by perturbing the dipolar mode that would otherwise impose the two hemispheres to be synchronous (DeRosa et al. 2012; Finley & Brun 2023). For instance, cycles 23 and 24 had very distinct reversals of the northern and southern poles, with a time lag for the southern hemisphere of about 17 and 27 months respectively, quite significant when compared to the solar cycle length. Hence it is quite important to be able to capture this underlying dynamics between the solar dipole and quadrupole to improve the forecast.

Furthermore, since we let the algorithm pick in each hemisphere what meridional circulation matches the best the hemispherical SSN time series, we can produce asynchronous northern and southern forecasts. From historical records we know that the poles reversals cannot be delayed from one another by more than a couple of years. This is due to the fact that the solar quadrupolar magnetic field amplitude is on average of the order of 20 to 25% of the solar dipole, except during the solar maximum, where the quadrupolar mode ℓ = 2, m = 0 dominates (DeRosa et al. 2012; Finley & Brun 2023). We note that it is usually better to refer to equatorially symmetric modes by opposition to anti-symmetric ones as during solar global reversals the equatorial dipole ℓ = 1, m = 1 is also involved. Indeed, in full 3D spherical geometry it is the sum ℓ + m that determines the parity of the magnetic mode, symmetric modes corresponding to an even sum (McFadden et al. 1991; DeRosa et al. 2012). Since our dynamo model is 2.5D, we only consider axisymmetric modes such that m = 0 and the parity of the degree ℓ directly provides the equatorial symmetry.

In Fig. 12 we display the hemispherical forecast of cycle 25. First we note that the DA algorithm is still doing very well at assimilating data in both hemispheres and adapting the control vector in each of them. This is possible thanks to the inclusion of the coefficient d_1, 3 that breaks the equatorial symmetry.

Fig. 12.

Same as Fig. 10, but for the hemispherical version of the solar 25 activity forecast.

We note that in April 2024, both hemispheres were forecasted to be slightly out of phase, with similar activity levels. During summer 2024 the southern hemisphere started to desynchronize more than anticipated, with a sudden activity surge, so that a time lag of half a year is becoming more likely. To illustrate the level of asymmetry captured during the assimilation phase, we show in Fig. 13 an example of the reconstructed flow streamfunction at 4 different windows during the assimilation phase. We note that the streamfunction of the last window (window 10) is the one used to produce the forecasted trajectory. Since the streamfunction does not depart by more than about 20% of the 1-cell basic flow, we plotted the relative difference between our reconstructed streamfunction and this basic flow. This figure shows that we indeed get departures from a symmetric flow with respect to the equator, limited by the constraint we impose on the coefficient d_1, 3 to be less than 10% of d_1, 2. Examining Fig. 14 also reveals some asymmetry in the MC streamfunction and in the magnetic field distribution. In this figure, the flow streamfunction and the initial magnetic toroidal field (colors) and poloidal field (contours) obtained by assimilating the last window (window 10) are shown in the meridian plane. As discussed before, this level of asymmetry in the flow and field is responsible for the different trajectories obtained for both hemispheres.

Fig. 13.

Difference in the meridional flow streamfunction between the result of the assimilation in one particular window and the basic one-cell meridional flow. This is chosen from the assimilation phase of a case with A_W = 21, representative of the mean (green curve shown in Fig. 12). We see that the flow never departs by more than around 20% from the 1-cell MC.

Fig. 14.

Meridional flow streamfunction and magnetic initial conditions (toroidal field in colors and poloidal field lines in grey contours) obtained at the end of the last assimilation window (corresponding to window 10 of Fig. 13). This meridional circulation and initial magnetic conditions are then used to produce a forecast for cycle 25.

4.4. Prediction of the next minimum

Since we are likely near or have reached the maximum of cycle 25, it is important to forecast the next phase of the current cycle. To this end, we present here our forecast for the next minimum of solar activity. Indeed, we have shown that our temporal horizon window is of the order of 10 years so we are well within this horizon for the next minimum. We know that cycle 24 minimum was on December 2019, so simply adding 11 years to this date will provide a no cast prediction for the minimum of cycle 25 around December 2030 and early 2031. As shown in Fig. 10, we find that the minimum should be around fall 2029. An estimate of the first and third quartiles and of the median and mean calculated on the 480 members of the ensemble give the distribution shown in Fig. B.1. From this distribution of predictions, we estimate our most probable next minimum to occur between early 2029 to fall 2030 (i.e., $2029 . 6_{- 0.5}^{+ 1}$ $2029.6_{-0.5}^{+1}$ ). We note that the error bars are not symmetric with respect to the mean value, making a minimum in 2030 most probable. This implies that cycle 25 would not be a long cycle, contrary to cycle 23 that lasted 12.4 years. We are currently forecasting that it will more likely last between 9.5 and 10.5 years; namely, closer to the length of cycles 21 and 22. We recall that historical records show that the solar cycle duration varies between 9 and 13.7 years, so cycle 25 would be in the 50% of cycles that are a bit shorter than the canonical 11 yr period.

The SSN value reached during that minimum is shown on the figure to be below 10, but it is likely not that significant given the relative large error bar. Moreover, we note that while during the assimilation phase the temporal shape of the model (blue triple dot curve) can be asymmetric, this is typically not the case for the forecast, that usually displays a rather sinusoidal temporal shape. We know from historical records that the rising versus the declining phase of the solar cycle is not symmetric with respect to the maximum, being often much longer by several years. This empirical asymmetry aspect of the solar cycle, called the Waldmeier effect, is not fully taken into account in our current forecast (see model description in Sect. 2.1).

4.5. Predictions of Cycle 25 as a function of the cycle phase

To further characterize our cycle 25 predictions, we went on to apply the convergence analysis as a function of the cycle phase, as it was quite informative to study it for cycles 22, 23, and 24. The results are shown in Fig. 15 for an assimilation window extending up to 2018, 2019, 2020, 2021, 2022, 2023, and 2024. It has to be said here that because of the use of the 13 month smoothed data, the prediction shown here for 2024 actually uses data extending until July 2023. This explains why the predicted maximum here is slightly higher and later than the prediction presented in Sect. 4.2 where more updated data were used (observations until December 2023).

Fig. 15.

Same as Fig. 6 but for cycle 25. We added two extra phases of the cycle here, finishing in 2018 and 2019 (i.e., before the previous minimum).

From this figure and comparing to the same plots for the previous cycles, it seems that cycle 25 predictions were always in favour of a weak cycle (SSN below 160) with small error bars for all cycle phases, showing quite accurate results in all cases. Incorporating more recent data (cycle phases 2023 and 2024) produces an increase in the prediction and a delay in time of the peak amplitude. It is also interesting to see that the timing of the maximum has shifted consistently when more and more updated data were incorporated in the Solar Predict tool and that the most recent predictions seem to converge on a maximum time at the end of 2024 and beginning of 2025.

Our results for the phase 2019 (yellow point on the figure with a mean max SSN of 145.7 ± 26 reached in Mid 2023 with a six months uncertainty) may be compared to the prediction we published in Brun et al. (2020) where a maximum SSN of 92 ± 10 was reported to have been reached in April 2024 ± 6 months. At the time, we were using the five-year smoothed SSN time series and ten-year smoothed butterfly diagram when we are here using the 13 month smoothed time series for the SSN and the one-year smoothed data for the butterfly diagram. We also extended the ensemble by adding more values of A_W and by introducing the regularization terms J₁, J₂, and J₃. Those various changes contribute to a different prediction now reached starting from 2019 compared to Brun et al. (2020). This shows again the importance of carefully choosing the assimilated data and the various terms in the objective function for a reliable prediction.

5. Discussion and conclusion

In this work, we present a first attempt at applying our variational data assimilation framework on real solar data provided by the SIDC SILSO database and synoptic maps from various solar observatories. The assimilated dataset consists of the sunspot number time series smoothed over 13 months and the surface line-of-sight magnetic field coming from the synoptic maps over one full cycle. The output of our assimilation framework is the prediction of cycle 25 length and maximum values, by adjusting a dynamo model to the observed data. One strength of the method presented here is that an ensemble of predictions have been obtained. A first ensemble is produced by varying the different parameters of the assimilation; namely, the weighting coefficients in front of the terms appearing in the objective function to be minimized. A second ensemble was then produced by perturbing the results of our assimilation; namely, a best-fit meridional circulation and initial magnetic conditions for the forecast. This ensemble of hundreds of predictions then enabled us to precisely calculate the mean, median, and various quantile values of the distribution of the maximum sunspot number and timing of this max and to estimate the error bars on our predictions. Finally, an original feature of this work is the provision of separate predictions for the northern and southern hemispheres, since we added an asymmetry parameter that allowed our dynamo model to produce magnetic fields with different behaviours in both hemispheres.

We tested our method on previous cycles for the sake of validation and found reasonable agreements between predicted and observed maximum values and timing for cycles 22, 23, and 24. However, the results for the prediction of cycle 22 are less satisfactory, possibly owing to the fact that the previous minimum extended for almost 2 years between 1985 and 1987. Using data smoothed on longer timescales could be a way to improve the forecasts in those particular cases. It ought to be noted that most cycles present two peaks owing to the fact that the two hemispheres reach their maximum at different dates sometimes separated by several months or even years. The two peaks can even reach similar amplitudes in some cases. We might then wonder which value would correspond to the true maximum and whether the prediction could be considered successful in cases where the forecast ended up peaking between the two real values. This also puts into question the possibility of an axisymmetric mean-field dynamo model to reproduce a high degree of asymmetry between the two hemispheres. We did indeed find that our forecasted trajectories currently fail at reproducing large asymmetries between the hemispheres. Possible ways forward would be to allow for more coefficients in the meridional flow expansion to be assimilated and to relax the strong constraint we imposed of the d_1, 3 coefficient (i.e., the main driver of the north-south asymmetries) to be less than 10% of the 1-cell coefficient d_1, 2. Further investigations may be needed in this direction to make progress in this area in the future.

As far as the ongoing cycle 25 is concerned, our prediction for the maximum (as of this writing) gives a sunspot maximum of 143.1 ± 15.0 reached at the end of 2024, with similar amplitudes between the northern and southern hemispheres. Our method offers predictions that tend to vary with the extent of the assimilation window, but they do become more precise and more accurate as more and more updated data are assimilated into our model. However, the predicted values may still depend on the level of filtering of the observed SSN and butterfly diagram and on our various parameters. In this work, we have focused on the most significant influence of the weight put on the SSN observations compared to the surface radial field, but additional investigations ought to be conducted for other parameters that could influence the quality of these predictions.

Given that we have reached the maximum of cycle 25 at the time of writing (fall 2024), it is instructive to also attempt to forecast the next minimum. This is a reasonable goal since the temporal horizon window of our pipeline is around ten years (as shown in Hung et al. 2017); this is far enough to attempt such forecast, which is likely to occur within the next seven years or so. We have shown that our tool predicts the next minimum to occur in late 2029 with a one-year error bar. A no-cast forecast (i.e., just adding 11 years to the last minimum that occurred in December 2019) would imply a minimum in late 2030 to early 2031, which would be significantly later. Hence, our tool indicates that cycle 25 would be a relatively short cycle, with a length ranging from 9.5 to 10.5 years, placing it in the lower 50% of the distribution of all previous cycle lengths on records. Given that our forecast does not fully take into account the empirical Waldmeier rule, we will need to work more specifically on better incorporating this asymmetry between rising and declining phases of the solar cycle, as in Pipin et al. (2012). For instance, this would allow us to have more robust forecasting overall – an improvement we intend to focus on in the near future.

In addition, several aspects of our method may now be improved, in particular, for the forecasting step. First, we could consider other sources of poloidal field, such as an α-effect distributed across the whole convection zone, which would make the link between the simulated toroidal field at the base of the convection zone and the observed sunspot number less direct. We could also implement additional transport mechanisms, such as turbulent pumping, which would reduce the dependency of the cycle period to the meridional flow speed, as shown in Do Cao & Brun (2011) or Karak & Nandy (2012) for example. Another major improvement would be to introduce a backreaction of the Lorentz force on the fluctuations of the velocity field, in the spirit of the Malkus-Proctor effect or Λ-effect (Malkus & Proctor 1975; Kichatinov & Rudiger 1993; Bushby 2006) to better take into account nonlinearities (which are still weak in our current model). These additional nonlinearities would naturally lead to modulations of the cycle period and amplitude; thus, it could be helpful to take these into account for the predictive part of our technique. Another way forward is related to our modeling of the emergence of magnetic flux. Indeed, as flux emergence at the solar surface may also play a crucial role in the timing and value of the maximum of solar cycles (e.g., Cameron & Schüssler 2007), additional observational constraints on the tilt angles, amplitudes of the sunspot magnetic fields, and possibly morphological characteristics of sunspots could be introduced in the objective function to improve our assimilation step and consequently the accuracy of our forecast of future solar activity. This would also imply the use of a direct dynamo model which takes into account the individual characteristics of active regions, rather than their average effect on the cycle, as in the double-ring approach first suggested by Durney (1995) and then implemented, for example, in Nandy & Choudhuri (2001) or in the various 3D kinematic versions of the BL model as used in Yeates & Muñoz-Jaramillo (2013) or Kumar et al. (2019). These additional developments of the dynamo model should definitely be considered in future data assimilation techniques applied to solar physics to improve our forecasting capabilities.

Data availability

Our Solar Predict tool is freely available on the ESA SWE portal under the following link: https://swe.ssa.esa.int/irfu-federated

Acknowledgments

We acknowledge funding support by ERC grants Whole Sun #810218 and Solar Predict #640997, ESA SWESNET support, INSU/PNST grant and CNES Space Weather financial supports. We are thankful to WDC-SILSO, Royal Observatory of Belgium, Brussels, for providing free access to their sunspot number times series used in our data assimilation pipeline. Likewise, this work utilizes GONG data obtained by the NSO Integrated Synoptic Program, managed by the National Solar Observatory, which is operated by the Association of Universities for Research in Astronomy (AURA), Inc. under a cooperative agreement with the National Science Foundation and with contribution from the National Oceanic and Atmospheric Administration. The GONG network of instruments is hosted by the Big Bear Solar Observatory, High Altitude Observatory, Learmonth Solar Observatory, Udaipur Solar Observatory, Instituto de Astrofísica de Canarias, and Cerro Tololo Interamerican Observatory. We also use synoptic maps from the Wilcox Observatory for data earlier than 2000 and in case of gaps we also use SOLIS data. SOLIS data are obtained by the NSO Integrated Synoptic Program, managed by the National Solar Observatory, which is operated by the Association of Universities for Research in Astronomy (AURA), Inc. under a cooperative agreement with the National Science Foundation. AS acknowledges funding from the French Agence Nationale de la Recherche (ANR) project STORMGENESIS # ANR-22-CE31-0013-01. We are grateful to Eric Buchlin and Stephane Caminade for their help in deploying and hosting the Solar Predict tool on the IAS/Medoc servers https://idoc.osups.universite-paris-saclay.fr/medoc/

References

Basu, S., & Antia, H. M. 2010, ApJ, 717, 488 [NASA ADS] [CrossRef] [Google Scholar]
Berdyugina, S. V., & Usoskin, I. G. 2003, A&A, 405, 1121 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Bertello, L., & Marble, A. R. 2015, ArXiv e-prints [arXiv:1507.07976] [Google Scholar]
Bhowmik, P., & Nandy, D. 2018, Nat. Commun., 9, 5209 [NASA ADS] [CrossRef] [Google Scholar]
Bhowmik, P., Jiang, J., Upton, L., Lemerle, A., & Nandy, D. 2023, Space Sci. Rev., 219, 40 [Google Scholar]
Brun, A. S., Pui Hung, C., Fournier, A., et al. 2020, IAU Symp., 354, 138 [Google Scholar]
Bushby, P. J. 2006, MNRAS, 371, 772 [Google Scholar]
Bushby, P. J., & Tobias, S. M. 2007, ApJ, 661, 1289 [Google Scholar]
Cameron, R., & Schüssler, M. 2007, ApJ, 659, 801 [Google Scholar]
Charbonneau, P. 2010, Liv. Rev. Sol. Phys., 7, 3 [Google Scholar]
Choudhuri, A. R., Chatterjee, P., & Jiang, J. 2007, Phys. Rev. Lett., 98, 131103 [Google Scholar]
DeRosa, M. L., Brun, A. S., & Hoeksema, J. T. 2012, ApJ, 757, 96 [NASA ADS] [CrossRef] [Google Scholar]
Dikpati, M., & Anderson, J. L. 2012, ApJ, 756, 20 [Google Scholar]
Dikpati, M., & Gilman, P. A. 2006, ApJ, 649, 498 [NASA ADS] [CrossRef] [Google Scholar]
Dikpati, M., Anderson, J. L., & Mitra, D. 2014, Geophys. Res. Lett., 41, 5361 [Google Scholar]
Do Cao, O., & Brun, A. S. 2011, Astron. Nachr., 332, 907 [NASA ADS] [CrossRef] [Google Scholar]
Durney, B. R. 1995, Sol. Phys., 160, 213 [Google Scholar]
Finley, A. J., & Brun, A. S. 2023, A&A, 679, A29 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Fournier, A., Hulot, G., Jault, D., et al. 2010, Space Sci. Rev., 155, 247 [Google Scholar]
Harvey, J. W., Hill, F., Hubbard, R. P., et al. 1996, Science, 272, 1284 [Google Scholar]
Hathaway, D. H. 2015, Liv. Rev. Sol. Phys., 12, 4 [Google Scholar]
Hoeksema, J. T. 1991, J. Geomagn. Geoelectr., 43, 59 [Google Scholar]
Hotta, H., & Yokoyama, T. 2010, ApJ, 709, 1009 [NASA ADS] [CrossRef] [Google Scholar]
Hung, C. P., Jouve, L., Brun, A. S., Fournier, A., & Talagrand, O. 2015, ApJ, 814, 151 [Google Scholar]
Hung, C. P., Brun, A. S., Fournier, A., et al. 2017, ApJ, 849, 160 [Google Scholar]
Karak, B. B., & Choudhuri, A. R. 2011, MNRAS, 410, 1503 [NASA ADS] [Google Scholar]
Karak, B. B., & Nandy, D. 2012, ApJ, 761, L13 [Google Scholar]
Kichatinov, L. L., & Rudiger, G. 1993, A&A, 276, 96 [Google Scholar]
Klinker, E., Rabier, F., Kelly, G., & Mahfouf, J. F. 2000, Q. J. R. Meteorol. Soc., 126, 1191 [Google Scholar]
Komm, R., González Hernández, I., Howe, R., & Hill, F. 2015, Sol. Phys., 290, 3113 [Google Scholar]
Kumar, R., Jouve, L., & Nandy, D. 2019, A&A, 623, A54 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Labonville, F., Charbonneau, P., & Lemerle, A. 2019, Sol. Phys., 294, 82 [NASA ADS] [CrossRef] [Google Scholar]
Lorenz, E. N. 1963, J. Atmos. Sci., 20, 130 [CrossRef] [Google Scholar]
Malkus, W. V. R., & Proctor, M. R. E. 1975, J. Fluid Mech., 67, 417 [NASA ADS] [CrossRef] [Google Scholar]
McFadden, P. L., Merrill, R. T., McElhinny, M. W., & Lee, S. 1991, J. Geophys. Res., 96, 3923 [Google Scholar]
Nandy, D. 2021, Sol. Phys., 296, 54 [NASA ADS] [CrossRef] [Google Scholar]
Nandy, D., & Choudhuri, A. R. 2001, ApJ, 551, 576 [NASA ADS] [CrossRef] [Google Scholar]
Norton, A. A., & Gallagher, J. C. 2010, Sol. Phys., 261, 193 [NASA ADS] [CrossRef] [Google Scholar]
Petrovay, K. 2020, Liv. Rev. Sol. Phys., 17, 2 [Google Scholar]
Pipin, V. V., Sokoloff, D. D., & Usoskin, I. G. 2012, A&A, 542, A26 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Scherrer, P. H., Wilcox, J. M., Svalgaard, L., et al. 1977, Sol. Phys., 54, 353 [NASA ADS] [CrossRef] [Google Scholar]
Talagrand, O., & Courtier, P. 1987, Q. J. R. Meteorol. Soc., 113, 1311 [Google Scholar]
Temmer, M., Rybák, J., Bendík, P., et al. 2006, A&A, 447, 735 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]
Ulrich, R. K. 2010, ApJ, 725, 658 [NASA ADS] [CrossRef] [Google Scholar]
Waldmeier, M. 1971, Sol. Phys., 20, 332 [NASA ADS] [CrossRef] [Google Scholar]
Yeates, A. R., & Muñoz-Jaramillo, A. 2013, MNRAS, 436, 3366 [Google Scholar]
Yeates, A. R., Nandy, D., & Mackay, D. H. 2008, ApJ, 673, 544 [NASA ADS] [CrossRef] [Google Scholar]

Appendix A: Extension of hemispherical SSN times series

To extend this investigation further into the past (i.e., 1975) in terms of the hemispherical time series of SILSO (which starts in 1992), we made use of the UCCLE station SSN time series. The UCCLE station is run by ROB and in the file we got runs from 1950 until 1994, hence overlapping for 2 years with the SILSO hemispherical time series. We have found that a correction factor around 1.4136 was needed to cross-calibrate the data (i.e., the UCCLE data was boosted by this factor). We show both times series merged in Fig. A.1, with on the top panel the monthly SSN time series, in the middle panel the corresponding 13 month smoothed SSN time series and in the bottom panel the merging of both 13 month smoothed time series around year 1992.

Fig. A.1.

SILSO SSN hemispherical time series extension using UCCLE station at ROB. On the top panel we shown the monthly SSN time series (using black color for UCCLE and red for SILSO), in the middle panel the 13-month smoothed equivalent SSN time series and in the bottom panel the merging of the time series around 1992 using cyan color for UCCLE, red for SILSO and black color for the merged time series.

Appendix B: Prediction for the next minimum

To better estimate the typical error in the prediction of the next solar minimum discussed in Sect. 4.4, we calculated the distribution of predictions made using the 480 members described in the main text, when assimilation is performed up to April 2024. Figure B.1 shows the mean, median and first and third quartiles of the distribution. The calculation of these values allows us to have a more precise estimate of the next minimum predicted to occur in $2029 . 6_{- 0.5}^{+ 1}$ $2029.6_{-0.5}^{+1}$ , however with an error bar asymmetric with respect to the mean value, indicating in fact a probable minimum in 2030.

Fig. B.1.

Prediction of the next solar minimum, with the mean, median, and first and third quartiles of the distribution.

Appendix C: Summary of cycles 22, 23, 24, and 25 forecasts

In this section we summarize the values of the forecasts for cycles 22, 23, 24, and 25. Prediction for the sunspot number of cycles 23 and 24 are shown in Figs. C.1 and C.2 as a function of time, at different assimilated cycle phases. This is similar to Fig. 5 in the main text which illustrates the results of our validation strategy.

Fig. C.1.

Same as Fig. 5, but for cycle 23.

Fig. C.2.

Same as Figs. 5 and C.1, but for cycle 24.

Table C.1 gives the values of the forecasted value and timing of the maximum SSN for different assimilation windows for the past three cycles and the current one. In bold characters we indicate the mean values and the three quartiles are also given. The values q1, q2 and q3 indicate respectively the first, second, and third quartiles of the distributions of t_max and SSN_max. We also indicate the observed values for both quantities for cycles 22 to 24.

Table C.1.

Results for the prediction of cycles 22, 23, 24 and 25.

All Tables

Table C.1.

Results for the prediction of cycles 22, 23, 24 and 25.

In the text

All Figures

Fig. 1.

SSN time series used in the assimilation procedure. We assimilate the hemispherical data and can choose to either forecast the total or hemispherical solar activity. Shown in dark blue is the total 13-month smoothed SSN times series and in light blue shades the northern and southern sub time series (north + south = total). We superimpose in orange the monthly data, which is six months ahead by construction.

In the text

	Fig. 2. Magnetic butterfly diagram depicts the azimuthally averaged magnetic synoptic maps (coming from various sources) for each Carrington rotation from CR1625 until CR2288 and stacked in time. Top row: Unfiltered data and one-year (middle row) and five-year (bottom row) filtered versions. The later two can be used as line-of-sight magnetic field observations in our assimilation pipeline.
In the text

	Fig. 3. Schematic of our data assimilation procedure.
In the text

Fig. 4.

Trajectory of the 480 ensemble members. Example of hemispherical data assimilation starting in mid 2013 and ending in τ_exyr = mid 2023. The 12 × 40 = 480 extrapolations (forecast ensemble trajectories) computed by the solar dynamo model are starting beyond that time. We clearly see that further we look in the future larger the spread is both in amplitude and timing of the next maximum of cycle 26. Forecast of the next minimum around 2030 is within our temporal horizon limit at the time of the writing of the paper. In order not to overcomplicate the figure, we voluntarily omit to plot the 2 × 480 trajectories of each hemisphere.

In the text

	Fig. 5. Prediction for the sunspot number of cycle 22 as a function of time, at different assimilated cycle phases. For the sake of clarity, the legend (same for all plots) is indicated only on the 3rd panel.
In the text

Fig. 6.

Summary of the predicted time and sunspot number at maximum for cycles 22 (upper panel), 23 (mid panel) and 24 (lower panel) predicted from different phases of the previous cycle. The error bars are taken between the first and third quartiles of the distributions of t_max and SSN_max given in Table C.1. The second quartiles (i.e., median values) are indicated as dots and mean values as stars located at the intersection of the error bars. The cyan dots indicate the true values of the maximum with error bars on the sunspot number. For all cycles, two cyan dots are present, representing the two peaks of similar amplitude in the different cycles, visible on the 13 month smoothed sunspot number (red curves of Figs. 5, C.1, and C.2).

In the text

	Fig. 7. Predictions for cycle 22 starting from year 1986, using the 13 month smoothed SSN data (top) and the five-yr smoothed time series (bottom).
In the text

	Fig. 8. Predictions of the value and timing of the maximum sunspot number for cycle 24 for different values of A₂ and A₃. The included cases correspond to a 10-year assimilation window extending until 2011 for the prediction of cycle 24. The various symbols correspond to different values of A₂ (colors) and A₃ (markers). The cyan symbol is the true maximum.
In the text

Fig. 9.

No cast of cycle 25, where we compared cycle 25 to various historical data: the previous cycle 24 (dashed line), cycle 23 (the odd numbered previous cycle in dotted line) and two averages of the past record, a mean of the previous 24 cycles (cyan dash-dotted), and of the last 5 (pink dash-dotted). As we can see, none of these no cast or historical data provide an accurate prediction of the ongoing cycle 25.

In the text

	Fig. 10. Solar cycle 25 forecast as of April 2024, for the sunspot number evolution. The next maximum as well as next minimum values are shown for the total SSN.
In the text

	Fig. 11. Solar cycle 25 forecasts as of December 2023, for the butterfly diagram. The observed radial field is shown in the top panel and the results of the assimilation phase (from 2014 to 2024) and forecast (from 2024) are illustrated on the bottom panel.
In the text

	Fig. 12. Same as Fig. 10, but for the hemispherical version of the solar 25 activity forecast.
In the text

	Fig. 13. Difference in the meridional flow streamfunction between the result of the assimilation in one particular window and the basic one-cell meridional flow. This is chosen from the assimilation phase of a case with A_W = 21, representative of the mean (green curve shown in Fig. 12). We see that the flow never departs by more than around 20% from the 1-cell MC.
In the text

	Fig. 14. Meridional flow streamfunction and magnetic initial conditions (toroidal field in colors and poloidal field lines in grey contours) obtained at the end of the last assimilation window (corresponding to window 10 of Fig. 13). This meridional circulation and initial magnetic conditions are then used to produce a forecast for cycle 25.
In the text

	Fig. 15. Same as Fig. 6 but for cycle 25. We added two extra phases of the cycle here, finishing in 2018 and 2019 (i.e., before the previous minimum).
In the text

Fig. A.1.

SILSO SSN hemispherical time series extension using UCCLE station at ROB. On the top panel we shown the monthly SSN time series (using black color for UCCLE and red for SILSO), in the middle panel the 13-month smoothed equivalent SSN time series and in the bottom panel the merging of the time series around 1992 using cyan color for UCCLE, red for SILSO and black color for the merged time series.

In the text

	Fig. B.1. Prediction of the next solar minimum, with the mean, median, and first and third quartiles of the distribution.
In the text

	Fig. C.1. Same as Fig. 5, but for cycle 23.
In the text

	Fig. C.2. Same as Figs. 5 and C.1, but for cycle 24.
In the text

Current usage metrics show cumulative count of Article Views (full-text article views including HTML views, PDF and ePub downloads, according to the available data) and Abstracts Views on Vision4Press platform.

Data correspond to usage on the plateform after 2015. The current usage metrics is available 48-96 hours after online publication and is updated daily on week days.

Initial download of the metrics may take a while.

[1] Basu, S., & Antia, H. M. 2010, ApJ, 717, 488 [NASA ADS] [CrossRef] [Google Scholar]

[2] Berdyugina, S. V., & Usoskin, I. G. 2003, A&A, 405, 1121 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[3] Bertello, L., & Marble, A. R. 2015, ArXiv e-prints [arXiv:1507.07976] [Google Scholar]

[4] Bhowmik, P., & Nandy, D. 2018, Nat. Commun., 9, 5209 [NASA ADS] [CrossRef] [Google Scholar]

[5] Bhowmik, P., Jiang, J., Upton, L., Lemerle, A., & Nandy, D. 2023, Space Sci. Rev., 219, 40 [Google Scholar]

[6] Brun, A. S., Pui Hung, C., Fournier, A., et al. 2020, IAU Symp., 354, 138 [Google Scholar]

[7] Bushby, P. J. 2006, MNRAS, 371, 772 [Google Scholar]

[8] Bushby, P. J., & Tobias, S. M. 2007, ApJ, 661, 1289 [Google Scholar]

[9] Cameron, R., & Schüssler, M. 2007, ApJ, 659, 801 [Google Scholar]

[10] Charbonneau, P. 2010, Liv. Rev. Sol. Phys., 7, 3 [Google Scholar]

[11] Choudhuri, A. R., Chatterjee, P., & Jiang, J. 2007, Phys. Rev. Lett., 98, 131103 [Google Scholar]

[12] DeRosa, M. L., Brun, A. S., & Hoeksema, J. T. 2012, ApJ, 757, 96 [NASA ADS] [CrossRef] [Google Scholar]

[13] Dikpati, M., & Anderson, J. L. 2012, ApJ, 756, 20 [Google Scholar]

[14] Dikpati, M., & Gilman, P. A. 2006, ApJ, 649, 498 [NASA ADS] [CrossRef] [Google Scholar]

[15] Dikpati, M., Anderson, J. L., & Mitra, D. 2014, Geophys. Res. Lett., 41, 5361 [Google Scholar]

[16] Do Cao, O., & Brun, A. S. 2011, Astron. Nachr., 332, 907 [NASA ADS] [CrossRef] [Google Scholar]

[17] Durney, B. R. 1995, Sol. Phys., 160, 213 [Google Scholar]

[18] Finley, A. J., & Brun, A. S. 2023, A&A, 679, A29 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[19] Fournier, A., Hulot, G., Jault, D., et al. 2010, Space Sci. Rev., 155, 247 [Google Scholar]

[20] Harvey, J. W., Hill, F., Hubbard, R. P., et al. 1996, Science, 272, 1284 [Google Scholar]

[21] Hathaway, D. H. 2015, Liv. Rev. Sol. Phys., 12, 4 [Google Scholar]

[22] Hoeksema, J. T. 1991, J. Geomagn. Geoelectr., 43, 59 [Google Scholar]

[23] Hotta, H., & Yokoyama, T. 2010, ApJ, 709, 1009 [NASA ADS] [CrossRef] [Google Scholar]

[24] Hung, C. P., Jouve, L., Brun, A. S., Fournier, A., & Talagrand, O. 2015, ApJ, 814, 151 [Google Scholar]

[25] Hung, C. P., Brun, A. S., Fournier, A., et al. 2017, ApJ, 849, 160 [Google Scholar]

[26] Karak, B. B., & Choudhuri, A. R. 2011, MNRAS, 410, 1503 [NASA ADS] [Google Scholar]

[27] Karak, B. B., & Nandy, D. 2012, ApJ, 761, L13 [Google Scholar]

[28] Kichatinov, L. L., & Rudiger, G. 1993, A&A, 276, 96 [Google Scholar]

[29] Klinker, E., Rabier, F., Kelly, G., & Mahfouf, J. F. 2000, Q. J. R. Meteorol. Soc., 126, 1191 [Google Scholar]

[30] Komm, R., González Hernández, I., Howe, R., & Hill, F. 2015, Sol. Phys., 290, 3113 [Google Scholar]

[31] Kumar, R., Jouve, L., & Nandy, D. 2019, A&A, 623, A54 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[32] Labonville, F., Charbonneau, P., & Lemerle, A. 2019, Sol. Phys., 294, 82 [NASA ADS] [CrossRef] [Google Scholar]

[33] Lorenz, E. N. 1963, J. Atmos. Sci., 20, 130 [CrossRef] [Google Scholar]

[34] Malkus, W. V. R., & Proctor, M. R. E. 1975, J. Fluid Mech., 67, 417 [NASA ADS] [CrossRef] [Google Scholar]

[35] McFadden, P. L., Merrill, R. T., McElhinny, M. W., & Lee, S. 1991, J. Geophys. Res., 96, 3923 [Google Scholar]

[36] Nandy, D. 2021, Sol. Phys., 296, 54 [NASA ADS] [CrossRef] [Google Scholar]

[37] Nandy, D., & Choudhuri, A. R. 2001, ApJ, 551, 576 [NASA ADS] [CrossRef] [Google Scholar]

[38] Norton, A. A., & Gallagher, J. C. 2010, Sol. Phys., 261, 193 [NASA ADS] [CrossRef] [Google Scholar]

[39] Petrovay, K. 2020, Liv. Rev. Sol. Phys., 17, 2 [Google Scholar]

[40] Pipin, V. V., Sokoloff, D. D., & Usoskin, I. G. 2012, A&A, 542, A26 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[41] Scherrer, P. H., Wilcox, J. M., Svalgaard, L., et al. 1977, Sol. Phys., 54, 353 [NASA ADS] [CrossRef] [Google Scholar]

[42] Talagrand, O., & Courtier, P. 1987, Q. J. R. Meteorol. Soc., 113, 1311 [Google Scholar]

[43] Temmer, M., Rybák, J., Bendík, P., et al. 2006, A&A, 447, 735 [NASA ADS] [CrossRef] [EDP Sciences] [Google Scholar]

[44] Ulrich, R. K. 2010, ApJ, 725, 658 [NASA ADS] [CrossRef] [Google Scholar]

[45] Waldmeier, M. 1971, Sol. Phys., 20, 332 [NASA ADS] [CrossRef] [Google Scholar]

[46] Yeates, A. R., & Muñoz-Jaramillo, A. 2013, MNRAS, 436, 3366 [Google Scholar]

[47] Yeates, A. R., Nandy, D., & Mackay, D. H. 2008, ApJ, 673, 544 [NASA ADS] [CrossRef] [Google Scholar]