SEP Model Validation Challenge

SEP Model Validation

Validation Framework

This page describes the progress made toward validating SEP models, including the validation approach and products. The most recent progress was summarized here for ISWAT 2021.

The validation code is being written in Python for this effort and is designed to complement the SEP Scoreboard developed at CCMC. The validation code uses the same input json file format as the SEP Scoreboard, described here. The idea is that, in the long term, model forecasts submitted to the SEP Scoreboard may also be automatically validated against data without any additional work from modelers. SRAG is actively working with the M2M office to develop an automated scheme to validate forecasts coming into the SEP Scoreboard in real time.

Modelers are encouraged to submit their predictions to this challenge using the CCMC json format; however, any format is accepted. For models that produce flux time profiles, you (or Katie) can use operational_sep_quantities.py to automatically create the json files for validation.

Observational values used in validation are produced with the OpSEP code developed for this effort (described in further detail on the Data Sets page). The values are saved into a json file that mirrors the CCMC json format. This allows for easy comparison of like quantities.

To date (2021-10-21), the validation code has been written to:

Accept a list of any number of observations and any number of model predictions by any number of models
Automatically match observations with model output for each SEP event
Validate the following quantities, as appropriate for each model:
- All Clear (threshold crossed/not crossed)
- Probability
- Start Time
- End Time
- Threshold Crossing Time
- Onset Peak Flux
- Maximum Flux
- Event Fluence
- Flux Time Profile
Derive the following metrics, as appropriate to each quantity:
- Contingency Table
- Skill Scores: Hits, Misses, False Alarms, Correct Negatives, Percent Correct, Bias, Hit Rate, False Alarm Ratio, False Alarm Rate, Frequency of Hits, Frequency of Misses, Probability of Correct Negatives, Detection Failure Ratio, Frequency of Correct Negatives, Threat Score, Odds Ratio, G Skill Score, True Skill Score, Heidke Skill Score, Odds Ratio Skill Score
- Metrics: Mean Log Error, Median Log Error, Mean Absolute Error, Median Absolute Error, Pearson's Correlation Coefficient (linear and log comparisons)
- Correlation Plots with linear regression lines
- Box Plots showing distribution of SEP model results across all events
- Time profile plots showing data and model plotted together
Generate pdf reports of each model's performance
Generate a combined model report (if multiple models are validated simultaneously), comparing the performance of each model
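
As an illustration of how several of the skill scores and metrics listed above can be derived, here is a minimal Python sketch. It uses the standard textbook definitions from forecast verification and is not taken from the validation code itself, which may define or name these quantities differently. The function names, the use of NaN for undefined ratios, and the choice of log10(predicted) - log10(observed) as the log error are all assumptions made for this example, and the sample values are made up.

```python
import math
import statistics


def _ratio(numerator, denominator):
    """Return numerator / denominator, or NaN if the denominator is zero."""
    return numerator / denominator if denominator else math.nan


def contingency_table(observed, predicted):
    """Count outcomes from paired booleans (True = threshold crossed)."""
    table = {"hits": 0, "misses": 0, "false_alarms": 0, "correct_negatives": 0}
    for obs, pred in zip(observed, predicted):
        if obs and pred:
            table["hits"] += 1
        elif obs:
            table["misses"] += 1
        elif pred:
            table["false_alarms"] += 1
        else:
            table["correct_negatives"] += 1
    return table


def skill_scores(hits, misses, false_alarms, correct_negatives):
    """Compute a subset of skill scores using standard definitions."""
    a, b, c, d = hits, false_alarms, misses, correct_negatives
    hit_rate = _ratio(a, a + c)
    false_alarm_rate = _ratio(b, b + d)
    return {
        "percent_correct": _ratio(a + d, a + b + c + d),
        "bias": _ratio(a + b, a + c),
        "hit_rate": hit_rate,
        "false_alarm_ratio": _ratio(b, a + b),
        "false_alarm_rate": false_alarm_rate,
        "threat_score": _ratio(a, a + b + c),
        "true_skill_score": hit_rate - false_alarm_rate,
        "heidke_skill_score": _ratio(
            2 * (a * d - b * c), (a + c) * (c + d) + (a + b) * (b + d)
        ),
    }


def log_errors(observed_flux, predicted_flux):
    """Return log10(predicted) - log10(observed) for each event (assumed form)."""
    return [
        math.log10(p) - math.log10(o)
        for o, p in zip(observed_flux, predicted_flux)
    ]


# Made-up example: six events, True means the threshold was crossed.
observed = [True, True, False, False, True, False]
predicted = [True, False, False, True, True, False]
table = contingency_table(observed, predicted)
scores = skill_scores(**table)

# Made-up peak fluxes (pfu) for the events that were both observed and predicted.
errors = log_errors([1.0, 100.0], [10.0, 100.0])
mean_log_error = statistics.mean(errors)
median_log_error = statistics.median(errors)
```

Note that All Clear is the logical inverse of the "threshold crossed" flag used here, so a model reporting All Clear values would need them negated before building the table in this sketch.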

View an example report for UMASEP-10. Note that the results in this report are not representative of a true validation due to the small number of events.

So far, the model validation work has focused on the operational thresholds of >10 MeV, 10 pfu and >100 MeV, 1 pfu. Not all models produce forecasts for these thresholds, so we have also included >30 MeV, 1 pfu; >50 MeV, 1 pfu; and >60 MeV, 0.079 pfu.
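
To show how a threshold such as >10 MeV, 10 pfu relates to the All Clear and Threshold Crossing Time quantities, the following Python sketch scans a flux time profile for the first point at or above the threshold. This is a simplified illustration only: the function name is hypothetical, the flux values are made up, and the actual OpSEP code may apply additional criteria (for example, on how long the flux must remain above threshold) that are not reproduced here.

```python
from datetime import datetime, timedelta


def threshold_crossing(times, fluxes, threshold):
    """Return (all_clear, crossing_time) for one flux time profile.

    all_clear is True when the flux never reaches the threshold; in that
    case crossing_time is None. Treating flux == threshold as a crossing
    is an assumption of this sketch.
    """
    for time, flux in zip(times, fluxes):
        if flux >= threshold:
            return False, time
    return True, None


# Made-up hourly >10 MeV integral fluxes in pfu.
start = datetime(2000, 1, 1, 0, 0)
times = [start + timedelta(hours=h) for h in range(5)]
fluxes = [0.5, 2.0, 12.0, 40.0, 8.0]

all_clear, crossing_time = threshold_crossing(times, fluxes, threshold=10.0)
max_flux = max(fluxes)
```

The same function can be applied to any of the other energy channel and threshold pairs by passing the matching flux series and threshold value.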

The code can validate against any SEP data set, so models may produce forecasts in energy channels (both integral and differential) corresponding to any available instrumentation and still be validated with this code.

Quantities, thresholds, and metrics can always be added according to feedback from the community.

Additional description of the validation work along with a description of SEP model needs with respect to human exploration operations can be found in links on the Conferences page.