Title: Mean absolute error and root mean square error: which is the better metric for assessing model performance?
Author: Gary Brassington
Conference: EGU General Assembly 2017
Media type: Article
Language: en
Digital document: PDF
Published in: GRA - Volume 19 (2017)
Record number: 250140218
Publication (No.): Full-text document available: EGU/EGU2017-3574.pdf
 
Abstract
The mean absolute error (MAE) and root mean square error (RMSE) are two metrics that are often used interchangeably as measures of ocean forecast accuracy. Recent literature has debated which of these should be preferred, though its conclusions have largely been based on empirical arguments. We note that in general, $\mathrm{RMSE}^2 = \mathrm{MAE}^2 + \mathrm{VAR}_k[|\varepsilon|]$, such that RMSE includes both the MAE and additional information related to the variance (biased estimator) of the absolute errors $|\varepsilon|$ with sample size $k$. The greater sensitivity of RMSE to a small number of outliers is directly attributable to the variance of absolute error. Further statistical properties for both metrics are derived and compared based on the assumption that the errors are Gaussian. For an unbiased (or bias-corrected) model both MAE and RMSE are shown to estimate the total error standard deviation to within a constant coefficient, such that $\mathrm{MAE} \approx \sqrt{2/\pi}\,\mathrm{RMSE}$. Both metrics have comparable behaviour in response to model bias and asymptote to the model bias as the bias increases. MAE is shown to be an unbiased estimator while RMSE is a biased estimator. MAE also has a lower sample variance compared with RMSE, indicating MAE is the more robust choice. For real-time applications where there is a likelihood of "bad" observations we recommend $\mathrm{TESD} = \sqrt{\frac{\pi}{2}}\,\mathrm{MAE} \pm \frac{1}{\sqrt{k}}\sqrt{\frac{\pi}{2}-1}\,\sqrt{\frac{\pi}{2}}\,\mathrm{MAE}$ as an unbiased estimator of the total error standard deviation, with error estimates (one standard deviation) based on the sample variance and defined as a scaling of the MAE itself. A sample size ($k$) on the order of 90 and 9000 provides an error scaling of 10% and 1% respectively. Nonetheless, if the model performance is being analysed using a large sample of delayed-mode quality-controlled observations then RMSE might be preferred where the second-moment sensitivity to large model errors is important.
Alternatively, for model intercomparisons the information might be compactly represented by a graph with axes of $\mathrm{MAE}$ and $\sqrt{\mathrm{VAR}_k[|\varepsilon|]}$, where radials from the origin represent $\mathrm{RMSE}$.
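
The relations stated in the abstract can be checked numerically. The following is a minimal sketch, not the author's code: the function names (`mae`, `rmse`, `var_abs`, `tesd`, `diagram_point`) and the synthetic Gaussian error sample are assumptions introduced for illustration. It computes MAE, RMSE and the biased (divide-by-k) variance of the absolute errors, evaluates the TESD expression with its one-standard-deviation error estimate, and returns the coordinates a model would have on the suggested graph. By the identity RMSE^2 = MAE^2 + VAR_k[|ε|], the distance of that point from the origin equals RMSE.

```python
import math
import random


def mae(errors):
    """Mean absolute error of a sequence of errors."""
    return sum(abs(e) for e in errors) / len(errors)


def rmse(errors):
    """Root mean square error of a sequence of errors."""
    return math.sqrt(sum(e * e for e in errors) / len(errors))


def var_abs(errors):
    """Biased (divide-by-k) sample variance of the absolute errors."""
    m = mae(errors)
    return sum((abs(e) - m) ** 2 for e in errors) / len(errors)


def tesd(errors):
    """TESD estimate and its one-sigma error, following the abstract's
    expression: sqrt(pi/2)*MAE +/- (1/sqrt(k))*sqrt(pi/2 - 1)*sqrt(pi/2)*MAE.
    Assumes the errors are unbiased and Gaussian."""
    k = len(errors)
    estimate = math.sqrt(math.pi / 2) * mae(errors)
    uncertainty = math.sqrt(math.pi / 2 - 1) / math.sqrt(k) * estimate
    return estimate, uncertainty


def diagram_point(errors):
    """Coordinates (MAE, sqrt(VAR_k[|e|])) for the intercomparison graph."""
    return mae(errors), math.sqrt(var_abs(errors))


if __name__ == "__main__":
    # Synthetic, illustrative data only: zero-mean Gaussian errors.
    random.seed(0)
    sample = [random.gauss(0.0, 2.0) for _ in range(10000)]

    print("MAE            :", mae(sample))
    print("RMSE           :", rmse(sample))
    print("MAE^2 + VAR    :", mae(sample) ** 2 + var_abs(sample))
    print("RMSE^2         :", rmse(sample) ** 2)
    print("MAE / RMSE     :", mae(sample) / rmse(sample))
    print("sqrt(2/pi)     :", math.sqrt(2 / math.pi))
    print("TESD, 1-sigma  :", tesd(sample))
    x, y = diagram_point(sample)
    print("radial distance:", math.hypot(x, y))
```

The identity holds for any error sample, whereas the MAE/RMSE ratio and the TESD expression rely on the Gaussian, unbiased assumption named in the abstract, so they should only be expected to agree approximately and only for such data.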