Note: All work was completed at and for the Met Office, UK.
Space weather forecasting has developed rapidly in recent years, with the threat of a severe space weather event increasing in importance as society becomes ever more dependent on technology. Space weather service providers worldwide have developed monitoring systems for solar events of particular interest to space weather, namely, solar flares, coronal mass ejections (CMEs), and solar energetic particle events (SEPs). Solar flares impact near-Earth space within minutes, while SEPs can take tens of minutes, and CMEs days, to reach Earth. Forewarning of solar flares is thus particularly important for operational space weather services.
As outlined in our previous nugget, the Met Office Space Weather Operations Centre (MOSWOC) produces 24/7/365 space weather guidance, alerts, and forecasts for a wide range of government and commercial end users across the United Kingdom. Solar flare forecasts are one of these products, issued multiple times a day in two forms: forecast probabilities for each active region on the solar disk over the next 24 hours and full-disk forecast probabilities for the next 4 days.
Before calculating any flare probabilities, a MOSWOC forecaster first undertakes a thorough analysis of current solar conditions using images from the Solar Dynamics Observatory's Helioseismic and Magnetic Imager. Each active region on disk is analysed, and the forecaster manually assigns Modified Mount Wilson and McIntosh classifications to each region. Flare probabilities are then calculated from historical flare rates for each McIntosh class: a database containing 16 years of GOES X-ray flare and McIntosh classification records is used to calculate an average daily flare rate for each McIntosh classification. The MOSWOC forecaster then calculates the probabilities of M- and X-class flares in each identified active region over the next 24 hours using a Poisson statistics technique.
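The core of such a Poisson technique is the probability of at least one flare occurring in the forecast window, given the class-average daily flare rate. A minimal sketch (the operational implementation may differ in detail; the rate value below is hypothetical):

```python
import math

def poisson_flare_probability(avg_daily_rate: float, window_days: float = 1.0) -> float:
    """Probability of at least one flare in the window, assuming flares
    occur as a Poisson process with the given average daily rate."""
    mu = avg_daily_rate * window_days  # expected number of flares in the window
    return 1.0 - math.exp(-mu)

# Hypothetical example: a McIntosh class averaging 0.45 M-class flares per day
# gives roughly a 36% chance of at least one M-class flare in the next 24 hours.
print(round(poisson_flare_probability(0.45), 2))  # → 0.36
```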
The raw active region probabilities output by the model are combined to give a full-disk percentage probability, which the forecaster can then edit manually as necessary before issuing the official forecast. An example of an issued 'Radio Blackout Forecast' that originates from this method (for the Day 1 forecast) can be seen in Figure 1. Note that the Day 2, Day 3, and Day 4 forecasts are determined by forecaster experience, based on how the active regions are evolving and which active regions may be leaving or returning to the solar disk in the next few days.
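The combination step can be sketched as follows, under the common assumption that active regions flare independently (so the disk produces a flare unless every region fails to); the operational method may differ:

```python
def full_disk_probability(region_probs):
    """Combine per-region flare probabilities into a full-disk probability,
    treating regions as independent."""
    p_no_flare = 1.0
    for p in region_probs:
        p_no_flare *= (1.0 - p)  # probability that this region does not flare
    return 1.0 - p_no_flare

# Three hypothetical active regions with 10%, 25%, and 40% probabilities:
print(round(full_disk_probability([0.10, 0.25, 0.40]), 3))  # → 0.595
```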
Historically, the solar physics community has used categorical verification techniques to validate new forecast methods. This entails choosing a threshold at which the probabilistic values become a "yes/no" forecast and then calculating metrics such as the Heidke skill score and true skill statistic [5, 6, 3]. More recently, however, the community has looked to operational meteorological verification techniques more suitable for probabilistic forecasting, increasingly presenting reliability diagrams and relative operating characteristic (ROC) curves alongside these traditional skill scores [7, 8, 9].
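As an illustration of the categorical approach, both scores are computed from a 2x2 contingency table of hits, misses, false alarms, and correct negatives obtained after thresholding the forecasts (the counts below are hypothetical):

```python
def heidke_skill_score(hits, misses, false_alarms, correct_negatives):
    """Heidke skill score: fraction of correct forecasts beyond those
    expected by chance."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    n = a + b + c + d
    expected = ((a + b) * (a + c) + (c + d) * (b + d)) / n  # chance correct
    return (a + d - expected) / (n - expected)

def true_skill_statistic(hits, misses, false_alarms, correct_negatives):
    """True skill statistic: hit rate minus false alarm rate."""
    pod = hits / (hits + misses)                              # hit rate
    pofd = false_alarms / (false_alarms + correct_negatives)  # false alarm rate
    return pod - pofd

# Hypothetical contingency table: 30 hits, 10 misses, 20 false alarms,
# 140 correct negatives.
print(round(true_skill_statistic(30, 10, 20, 140), 3))  # → 0.625
print(round(heidke_skill_score(30, 10, 20, 140), 3))    # → 0.571
```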
In a recent study by Murray et al., several years of archived MOSWOC flare forecasts were validated using operational techniques. The raw Poisson output was compared with the forecaster-edited issued probabilities; an example of the results is shown in Figure 2. Reliability diagrams measure how closely the forecast probabilities of an event correspond to the actual chance of observing the event. For perfect reliability the forecast probability and the frequency of occurrence should be equal, and the plotted points should lie on the diagonal line. Here the raw probability points mainly lie below the diagonal line, highlighting a tendency to overforecast. The human influence on issued probabilities has resulted in an improvement upon the model results, with points lying closer to the diagonal. The distributions in the subplots of Figure 2 highlight that the forecasters tend to decrease the probability values, leading to less overforecasting in general.
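A minimal sketch of how the points on a reliability diagram are computed, assuming forecast probabilities are binned and paired with binary event observations (bin count and sample values are illustrative, not from the study):

```python
def reliability_curve(forecast_probs, observed, n_bins=10):
    """For each probability bin, return (mean forecast probability,
    observed event frequency); perfect reliability lies on the diagonal."""
    bins = [[] for _ in range(n_bins)]
    for p, o in zip(forecast_probs, observed):
        i = min(int(p * n_bins), n_bins - 1)  # place p == 1.0 in the last bin
        bins[i].append((p, o))
    curve = []
    for members in bins:
        if members:  # skip empty bins
            mean_p = sum(p for p, _ in members) / len(members)
            freq = sum(o for _, o in members) / len(members)
            curve.append((mean_p, freq))
    return curve

# Illustrative data: low-probability forecasts verify as non-events,
# high-probability forecasts verify as events.
print(reliability_curve([0.05, 0.05, 0.95, 0.95], [0, 0, 1, 1]))
```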
The four-day issued forecasts were also verified using operational methods. Figure 3 shows ROC curves of the results, which provide information on the hit rates and false alarm rates that can be expected from the use of different probability thresholds to trigger advisory action. A skillful forecast system will achieve hit rates that exceed the false alarm rates; thus, the closer the curve is to the top left corner of the plot, the more skillful the forecast. Here the best results are found on Day 1, with the ROC curves tending further toward the "no skill" diagonal as the days progress.
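Each point on such a ROC curve comes from converting the probabilistic forecast to a yes/no forecast at one threshold and computing the resulting hit and false alarm rates. A sketch under those assumptions (the sample values are illustrative):

```python
def roc_points(forecast_probs, observed, thresholds):
    """Return (false alarm rate, hit rate) for each probability threshold
    used to convert the probabilistic forecast to yes/no."""
    n_events = sum(observed)
    n_non_events = len(observed) - n_events
    points = []
    for t in thresholds:
        hits = sum(1 for p, o in zip(forecast_probs, observed) if p >= t and o)
        false_alarms = sum(1 for p, o in zip(forecast_probs, observed)
                           if p >= t and not o)
        points.append((false_alarms / n_non_events, hits / n_events))
    return points

# Illustrative forecasts/observations with a single 50% trigger threshold:
print(roc_points([0.1, 0.9, 0.8, 0.2], [0, 1, 1, 0], [0.5]))  # → [(0.0, 1.0)]
```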
This first verification of MOSWOC flare forecasts is presented in more detail in the Murray et al. study. The influence of human forecasters is highlighted, with human-edited forecasts outperforming original model results, and forecasting skill decreasing over longer forecast lead times. A real-time verification system for operational flare forecasting is also described, developed from operational weather verification tools already in use. This system facilitates easy, instant, post-event analysis, which, when time allows, gives forecasters a way to review their performance and (potentially) learn from any mistakes. The system also includes verification of other MOSWOC space weather products, such as geomagnetic storm forecasts.
The verification methods used in this study will also be used as part of the Horizon 2020 Flare Likelihood and Region Eruption Forecasting (FLARECAST) project, which aims to develop a fully automated solar flare forecasting system with real-time verification. FLARECAST will evaluate existing predictors to identify the best performers through the use of a variety of statistical, supervised, and unsupervised techniques and implement these best performers in a user-friendly online facility. The FLARECAST system output may prove to be a more accurate model basis for MOSWOC forecasters than the currently used Poisson method, although it is likely that the “human influence” will be needed for flare forecasting for the foreseeable future.
-  Künzel, H. (1965), Zur Klassifikation von Sonnenfleckengruppen, Astron. Nachr., 288, 177.
-  McIntosh, P. S. (1990), The classification of sunspot groups, Sol. Phys., 125, 251–267, doi:10.1007/BF00158405.
-  Bloomfield, D. S., P. A. Higgins, R. T. J. McAteer, and P. T. Gallagher (2012), Toward reliable benchmarking of solar flare forecasting methods, Astrophys. J. Lett., 747, L41, doi:10.1088/2041-8205/747/2/L41.
-  Gallagher, P. T., Y.-J. Moon, and H. Wang (2002), Active-region monitoring and flare forecasting I. Data processing and first results, Sol. Phys., 209, 171–183, doi:10.1023/A:1020950221179.
-  Barnes, G., and K. D. Leka (2008), Evaluating the performance of solar flare forecasting methods, Astrophys. J. Lett., 688, L107, doi:10.1086/595550.
-  Crown, M. D. (2012), Validation of the NOAA Space Weather Prediction Center’s solar flare forecasting look-up table and forecaster-issued probabilities, Space Weather, 10, S06006, doi:10.1029/2011SW000760.
-  Guerra, J. A., A. Pulkkinen, and V. M. Uritsky (2015), Ensemble forecasting of major solar flares: First results, Space Weather, 13, 626–642, doi:10.1002/2015SW001195.
-  Barnes, G., et al. (2016), A comparison of flare forecasting methods. I. Results from the all-clear workshop, Astrophys. J., 829, 89, doi:10.3847/0004-637X/829/2/89.
-  Cui, Y., S. Liu, A. ErCha, Q. Zhong, B. Luo, and X. Ao (2016), Verification of SPE probability forecasts at the Space Environment Prediction Center (SEPC), Sci. China Earth Sci., 59(6), 1292–1298, doi:10.1007/s11430-016-5284-x.
-  Murray, S. A., S. Bingham, M. Sharpe, and D. R. Jackson (2017), Flare forecasting at the Met Office Space Weather Operations Centre, Space Weather, 15, doi:10.1002/2016SW001579.
-  Sharpe, M., Verification of Space Weather Forecasts issued by the Met Office Space Weather Operations Centre, in preparation (to be submitted to Space Weather).