.. _evaluation_ifs_fesom:

IFS-FESOM Evaluation
====================

This page provides an evaluation of the IFS-FESOM historical, projection, and storyline
simulations. For model description and forcing details, see :ref:`ifs_fesom` and
:ref:`storylines`.

Assessment of the Historical Simulation
----------------------------------------

The Performance Index table below summarizes the main model biases across a large
selection of variables and regions. Each index is expressed relative to the mean of the
CMIP6 models, so values below 1 (bluish) indicate better performance than the CMIP6 mean
and values above 1 (reddish) indicate worse performance. For most variables and regions
IFS-FESOM shows exceptional performance, with values typically below 0.2. The dynamical
atmospheric variables show the best performance, while net surface radiation, surface
temperature, and sea ice concentration have relatively weaker, though still good, scores.

.. figure:: ../../../evaluation/general_evaluation/figures/climate_metrics.performance_indices.climatedt-o25.1.IFS-FESOM.historical-1990.r1.png
    :name: ifs-fesom_hist_pi

    Performance Index table for the IFS-FESOM historical simulation. Values are expressed relative to the mean of the CMIP6 models, with values below 1 indicating improved performance and values above 1 indicating degraded performance.
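
As a rough illustration of how an index of this kind can be built, the sketch below assumes one common formulation: an area-weighted squared error against a reference, scaled by the reference interannual variance, and then divided by the mean of the same error across the CMIP6 models. The exact definition behind the table is not stated on this page, and the function names ``normalized_error`` and ``performance_index`` as well as the toy numbers are hypothetical.

.. code-block:: python

    import math


    def normalized_error(model, ref, ref_var, lats_deg):
        """Area-weighted squared error, scaled by the reference interannual variance.

        model, ref, ref_var: 2-D lists indexed [lat][lon]; lats_deg: latitude per row.
        """
        num = 0.0
        wsum = 0.0
        for m_row, r_row, v_row, lat in zip(model, ref, ref_var, lats_deg):
            w = math.cos(math.radians(lat))  # cos(lat) area weight on a regular grid
            for m, r, v in zip(m_row, r_row, v_row):
                num += w * (m - r) ** 2 / v
                wsum += w
        return num / wsum


    def performance_index(model_error, cmip6_errors):
        """Model error divided by the mean error of the CMIP6 models (<1 is better)."""
        return model_error / (sum(cmip6_errors) / len(cmip6_errors))


    # Toy 2x2 example (hypothetical numbers, not taken from the simulations).
    lats = [0.0, 60.0]
    model = [[1.0, 1.0], [2.0, 2.0]]
    ref = [[0.0, 0.0], [0.0, 0.0]]
    ref_var = [[1.0, 1.0], [1.0, 1.0]]

    err = normalized_error(model, ref, ref_var, lats)
    index = performance_index(err, cmip6_errors=[8.0, 12.0])
    print(f"normalized error = {err:.2f}, performance index = {index:.2f}")

With these toy inputs the index is approximately 0.2, meaning the model error is about one fifth of the mean CMIP6 error.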

Global mean temperatures over the historical period (left panel of the figure below) track ERA5 closely, with a small mean-state cold bias of ~0.25 °C. The future projection shows an acceleration of the historical warming trend. Simulated values of top-of-atmosphere net radiation (middle panel) are within the range of observed values. The Gregory plot (right panel) shows that the simulation stays largely within the observed ranges, although it may overestimate the response to the 1991 Pinatubo eruption in the years that follow.

.. figure:: ../../../evaluation/mn5/figures/IFS-FESOM-Historical_SSP370_Tco2559_timeseries_Gregory_absoluteT-IFS_FESOM.png
    :name: ifs-fesom_hist_proj_gregory

    Left\: Time series of the globally averaged annual surface air temperature in ERA5 and the historical and scenario IFS-FESOM simulations. Middle\: Time series of the net heat fluxes at the top of the atmosphere (TOA), including observations from CERES. Right\: Gregory plot of the combined IFS-FESOM simulations. The mean values and ranges of the observed TOA fluxes and global mean surface air temperatures are included for reference.

Spatial maps of mean-state biases in annual surface air temperature (below) show very small differences with respect to ERA5 over the three major ocean basins and most continental areas (global mean bias 0.14 K, RMSE 1.22 K). The global cold bias mostly arises from the Arctic region and the Weddell Sea, likely linked to errors in sea ice representation. IFS-FESOM successfully mitigates the warm biases in eastern boundary coastal upwelling regions (e.g., Humboldt and Benguela currents) that commonly afflict coarser CMIP6 models.

.. figure:: ../../../evaluation/ifs_fesom_eval/figures/tas_annual_bias_combined_cropped.png
    :name: ifs-fesom_bias_tas

    Spatial maps of the climatological biases of annual surface air temperature in the IFS-FESOM historical simulation and the CMIP6 multi-model mean. Biases are computed against Berkeley Earth climatology over the period 1990–2014.
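
The global mean bias and RMSE quoted above are summary statistics over the bias map. A minimal sketch of how such numbers can be computed on a regular latitude-longitude grid is given below, assuming cos(latitude) area weights. The weighting actually used for the figures is not specified here, and ``area_weighted_bias_rmse`` is a hypothetical helper with toy inputs.

.. code-block:: python

    import math


    def area_weighted_bias_rmse(model, ref, lats_deg):
        """Return (mean bias, RMSE) of model minus ref with cos(lat) weights.

        model, ref: 2-D lists indexed [lat][lon]; lats_deg: latitude per row.
        """
        wsum = 0.0
        bias_sum = 0.0
        sq_sum = 0.0
        for m_row, r_row, lat in zip(model, ref, lats_deg):
            w = math.cos(math.radians(lat))  # grid cells shrink towards the poles
            for m, r in zip(m_row, r_row):
                diff = m - r
                bias_sum += w * diff
                sq_sum += w * diff * diff
                wsum += w
        return bias_sum / wsum, math.sqrt(sq_sum / wsum)


    # Toy field (hypothetical): +1 K at the equator, -2 K at 60 degrees latitude.
    lats = [0.0, 60.0]
    model = [[1.0, 1.0], [-2.0, -2.0]]
    ref = [[0.0, 0.0], [0.0, 0.0]]

    bias, rmse = area_weighted_bias_rmse(model, ref, lats)
    print(f"bias = {bias:.2f} K, RMSE = {rmse:.2f} K")

In this toy case the warm and cold bands cancel almost exactly in the mean bias while the RMSE stays at about 1.41 K, which is why a small mean bias alone does not imply small regional errors.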

In terms of mean sea level pressure (below), IFS-FESOM performs well (global mean bias -1.59 Pa, RMSE ~101 Pa), comparable to the CMIP6 multi-model mean (RMSE ~87 Pa). The polar regions show positive biases, which are characteristic of cold high-latitude mean states and dynamically consistent with excessive surface cooling and sea ice. This highlights the strongly coupled nature of tropical and polar mean-state errors.

.. figure:: ../../../evaluation/ifs_fesom_eval/figures/psl_annual_bias_combined_cropped.png
    :name: ifs-fesom_bias_psl

    Spatial maps of the climatological biases of annual mean sea level pressure in the IFS-FESOM historical simulation and the CMIP6 multi-model mean. Biases are computed against ERA5 climatology over the period 1990–2014.

IFS-FESOM shows good performance in terms of the climatological annual precipitation rate (below), sharing the broad spatial structure of the CMIP6 multi-model mean, with wet biases along the mid-latitude storm tracks and a dipole pattern in the tropical Pacific indicative of a double-ITCZ or ITCZ displacement error. The amplitude of the tropical biases in IFS-FESOM tends to be smaller than for most individual CMIP6 models.

.. figure:: ../../../evaluation/ifs_fesom_eval/figures/pr_annual_bias_combined_cropped.png
    :name: ifs-fesom_hist_pr_bias

    Spatial maps of the climatological biases of annual precipitation in the IFS-FESOM historical simulation and the CMIP6 multi-model mean. Biases are computed against MSWEP climatology over the period 1990–2014.

The Earth energy imbalance time series below shows that IFS-FESOM accurately captures the observed annual mean energy imbalance of approximately 0.5–1.0 W/m² after 2000, significantly outperforming the CMIP6 multi-model mean, which exhibits a systematic positive bias of approximately 1.0 W/m² relative to CERES observations. IFS-FESOM also reproduces the large-amplitude seasonal cycle in net TOA radiation. A transient negative energy imbalance is evident in the early 1990s, reflecting the radiative response to the Mt. Pinatubo eruption.

.. figure:: ../../../evaluation/ifs_fesom_eval/figures/radiation_imbalance_timeseries.png
    :name: ifs-fesom_eei

    Time series of the global-mean net top-of-atmosphere radiation (Earth's energy imbalance) from 1990 to 2014, comparing IFS-FESOM with CERES observations and the CMIP6 multi-model mean.
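
Because the seasonal cycle in net TOA radiation is large compared with the annual-mean imbalance, the two are usually separated by averaging over full calendar years. The sketch below illustrates this with a synthetic monthly series. The 0.8 W/m² mean and 10 W/m² seasonal amplitude are made-up illustrative values, months are weighted equally for simplicity, and ``annual_means`` is a hypothetical helper.

.. code-block:: python

    import math


    def annual_means(monthly):
        """Average a monthly series over consecutive full 12-month years.

        Trailing months that do not complete a year are ignored.
        """
        n_years = len(monthly) // 12
        return [sum(monthly[12 * y:12 * y + 12]) / 12 for y in range(n_years)]


    # Synthetic net TOA series in W/m^2: constant imbalance plus a seasonal cycle.
    mean_imbalance = 0.8
    seasonal_amplitude = 10.0
    monthly = [
        mean_imbalance + seasonal_amplitude * math.cos(2 * math.pi * m / 12)
        for m in range(24)
    ]

    yearly = annual_means(monthly)
    print([round(v, 3) for v in yearly])

The monthly values swing by about 20 W/m² over the year, yet each annual mean recovers the underlying imbalance of roughly 0.8 W/m².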

Sea ice extent (below) is significantly overestimated in the Northern Hemisphere but shows a declining trend since the 1990s, consistent with observed Arctic sea ice loss. In the Southern Hemisphere, IFS-FESOM exhibits anomalous decadal-scale variability, with a pronounced dip and recovery, and follows a different long-term evolution from observations. This behavior may reflect an initialization adjustment of the Southern Ocean rather than a forced climate signal.

.. figure:: ../../../evaluation/ifs_fesom_eval/figures/sea_ice_extent_timeseries.png
    :name: ifs-fesom_sea_ice_extent

    Time series of monthly and annual-mean sea ice extent for the Northern and Southern Hemispheres, comparing IFS-FESOM against OSI-SAF satellite observations and a CMIP6 multi-model ensemble.
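
Sea ice extent is conventionally defined as the total area of grid cells whose sea ice concentration is at least 15 %. The sketch below applies that convention per hemisphere to a flat list of cells, as would be natural for an unstructured ocean mesh. Whether the figure uses exactly this threshold is an assumption, and the function name and toy values are hypothetical.

.. code-block:: python

    def sea_ice_extent(concentration, cell_area_km2, lats_deg, hemisphere, threshold=0.15):
        """Sum the area of cells in one hemisphere with concentration >= threshold.

        concentration: fraction in [0, 1] per cell; hemisphere: "north" or "south".
        """
        if hemisphere not in ("north", "south"):
            raise ValueError("hemisphere must be 'north' or 'south'")
        total = 0.0
        for conc, area, lat in zip(concentration, cell_area_km2, lats_deg):
            in_hemisphere = lat >= 0 if hemisphere == "north" else lat < 0
            if in_hemisphere and conc >= threshold:
                total += area  # the full cell area counts, not area * concentration
        return total


    # Toy cells (hypothetical values).
    conc = [0.90, 0.10, 0.15, 0.50]
    area = [100.0, 100.0, 50.0, 25.0]
    lats = [80.0, 75.0, 70.0, -65.0]

    north = sea_ice_extent(conc, area, lats, "north")
    south = sea_ice_extent(conc, area, lats, "south")
    print(north, south)

Because the full cell area is counted once the threshold is exceeded, extent is never smaller than the concentration-weighted sea ice area, so the two quantities should not be compared directly.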

.. rubric:: Further evaluation

Additional evaluation plots for the IFS-FESOM simulations are available
in the `Climate DT Evaluation Charts <https://climatedt-evaluation-charts.destine.eu/>`_.

.. For evaluation of the IFS-FESOM storyline simulations, see :ref:`evaluation_storylines`.

.. Annual Biases
.. ^^^^^^^^^^^^^

.. Annual biases of the storyline simulations (:numref:`ifs-fesom_story_annual_biases`) provide an overview of the mean-state performance of the nudged km-scale model. Since the large-scale circulation is constrained by nudging, remaining biases reflect model physics and parameterization errors rather than large-scale dynamical drift. The 2m temperature (panel a) shows a pronounced cold bias over the Arctic (exceeding -4 °C) and a warm bias around Antarctica, both likely related to sea ice representation issues. Most mid-latitude continental and oceanic regions remain within ±1 °C. Sea surface temperature biases (panel b) are generally small, with localized warm biases in the tropical Pacific and Indian Ocean and cold biases in the Southern Ocean. Total precipitation (panel c) reveals the largest biases in the tropics, with a dipole pattern suggesting a slight displacement of the Intertropical Convergence Zone (ITCZ) and excessive precipitation over the Maritime Continent. The 500 hPa geopotential height (panel d) shows a predominantly negative bias across the extratropics, with a positive anomaly near Antarctica consistent with the warm surface bias there.

.. .. figure:: ../../../evaluation/ifs_fesom_storylines_eval/figures/annual_biases.png
..     :name: ifs-fesom_story_annual_biases

..     Annual biases of the IFS-FESOM storyline simulations.

.. Temporal Correlation
.. ^^^^^^^^^^^^^^^^^^^^

.. Temporal correlation between ERA5 reanalysis and storyline simulations (:numref:`ifs-fesom_story_temporal_corr`) measures the skill of the nudged model in replaying observed weather events. Higher temporal correlations indicate that the model faithfully reproduces the timing and sequencing of synoptic-scale weather patterns, a prerequisite for meaningful event attribution. The 500 hPa geopotential height shows uniformly high correlations (>0.9) across all latitudes, as expected since this is the field constrained by nudging. For 2m temperature, correlations are high (>0.8) over most extratropical land areas, where surface temperature variability is strongly controlled by the large-scale circulation, but drop notably in the tropics and over tropical oceans where local processes such as convection and air–sea interaction play a larger role. Precipitation correlations show a similar latitude dependence, with reasonable skill in the extratropics and weaker correlations in the tropics, where precipitation is more strongly influenced by local convective processes.

.. .. figure:: ../../../evaluation/ifs_fesom_storylines_eval/figures/temporal_correlation.png
..     :name: ifs-fesom_story_temporal_corr

..     Temporal correlation between daily ERA5 reanalysis data and the IFS-FESOM storyline simulations under present-day conditions, calculated after removing the seasonal cycle, for (a) 2m temperature, (b) precipitation, and (c) 500 hPa geopotential height.

.. Case Studies
.. ^^^^^^^^^^^^

.. The storyline framework is demonstrated through two high-impact European extreme events: Storm Boris (September 2024) and the Paris heatwave (July 2019). For both cases further details can be found in `John et al. (in review) <https://doi.org/10.22541/essoar.173160166.64258929/v1>`__.

.. **Storm Boris (September 2024)**

.. Maps of 5-day accumulated precipitation for Storm Boris (:numref:`ifs-fesom_story_boris`) compare the km-scale IFS-FESOM ensemble against a coarser AWI-CM1 nudged run and reference datasets (ERA5, MSWEP). The historical present-day IFS-FESOM ensemble accurately reproduces the spatial structure and magnitude of the observed extreme precipitation, demonstrating the added value of km-scale resolution over coarser models. The three-scenario comparison — counterfactual past, present-day, and future +2K — reveals how the thermodynamic response to warming amplifies precipitation in this event.

.. .. figure:: ../../../evaluation/ifs_fesom_storylines_eval/figures/Precip_Boris_story.png
..     :name: ifs-fesom_story_boris

..     5-day accumulated precipitation (mm) for Storm Boris (12--16 September 2024). Top row: (a) AWI-CM1 coarse-resolution nudged run, (b) ERA5 reanalysis, (c) MSWEP observations. Bottom row: ensemble mean from nudged km-scale IFS-FESOM storyline experiments for (d) counterfactual past climate, (e) historical present-day climate, (f) future +2K climate.

.. **Paris Heatwave (25 July 2019)**

.. Maximum 2m-temperature during the peak of the 25 July 2019 European heatwave (:numref:`ifs-fesom_story_heatwave`) shows that the historical IFS-FESOM ensemble faithfully captures the spatial gradients and peak magnitudes of the observed event compared to ERA5 and E-OBS. The km-scale resolution resolves fine-scale temperature contrasts — including urban heat island signatures and orographic effects — that are smoothed out in the coarser AWI-CM1 run and ERA5. Under the +2K future scenario, the area exceeding 40 °C expands substantially, extending from France and Iberia into Germany and the Benelux region, while the counterfactual past scenario shows peak temperatures remaining below 35 °C over much of France. This progression across the three climate states quantifies the contribution of warming to the regional intensification of the heatwave.

.. .. figure:: ../../../evaluation/ifs_fesom_storylines_eval/figures/HW_2019_Paris_story.png
..     :name: ifs-fesom_story_heatwave

..     Maximum 2m-temperature (°C) during the peak of the 25 July 2019 European heatwave. Top row: (a) AWI-CM1 coarse-resolution nudged run, (b) ERA5 reanalysis, (c) E-OBS observations. Bottom row: ensemble mean from nudged km-scale IFS-FESOM storyline experiments for (d) counterfactual past climate, (e) historical present-day climate, (f) future +2K climate.