Online Catalogue of GeoSphere Austria (Neulinggasse site)


Title	Mean absolute error and root mean square error: which is the better metric for assessing model performance?
Author	Gary Brassington
Conference	EGU General Assembly 2017
Media type	Article
Language	en
Digital document	PDF
Published	In: GRA - Volume 19 (2017)
Record number	250140218
Publication (No.)	EGU/EGU2017-3574.pdf



Abstract
The mean absolute error (MAE) and root mean square error (RMSE) are two metrics that are often used interchangeably as measures of ocean forecast accuracy. Recent literature has debated which of these should be preferred though their conclusions have largely been based on empirical arguments. We note that in general, $\mathrm{RMSE}^2 = \mathrm{MAE}^2 + \mathrm{VAR}_k[|\varepsilon|]$ such that RMSE includes both the MAE as well as additional information related to the variance (biased estimator) of the errors $\varepsilon$ with sample size $k$. The greater sensitivity of RMSE to a small number of outliers is directly attributable to the variance of absolute error. Further statistical properties for both metrics are derived and compared based on the assumption that the errors are Gaussian. For an unbiased (or bias corrected) model both MAE and RMSE are shown to estimate the total error standard deviation to within a constant coefficient such that $\mathrm{MAE} \approx \sqrt{2/\pi}\,\mathrm{RMSE}$. Both metrics have comparable behaviour in response to model bias and asymptote to the model bias as the bias increases. MAE is shown to be an unbiased estimator while RMSE is a biased estimator. MAE also has a lower sample variance compared with RMSE indicating MAE is the most robust choice. For real-time applications where there is a likelihood of “bad” observations we recommend $\mathrm{TESD} = \sqrt{\frac{\pi}{2}}\,\mathrm{MAE} \pm \frac{1}{\sqrt{k}}\sqrt{\frac{\pi}{2} - 1}\,\sqrt{\frac{\pi}{2}}\,\mathrm{MAE}$ as an unbiased estimator of the total error standard deviation with error estimates (one standard deviation) based on the sample variance and defined as a scaling of the MAE itself. A sample size ($k$) on the order of 90 and 9000 provides an error scaling of 10% and 1% respectively. Nonetheless if the model performance is being analysed using a large sample of delayed-mode quality controlled observations then RMSE might be preferred where the second moment sensitivity to large model errors is important.
Alternatively, for model intercomparisons the information might be compactly represented by a graph with axes of $\mathrm{MAE}$ and $\sqrt{\mathrm{VAR}_k[|\varepsilon|]}$, where radials from the origin represent $\mathrm{RMSE}$.
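
The relations stated in the abstract can be checked numerically. The following is a minimal sketch, not the author's code: the function names `error_metrics` and `tesd_from_mae`, the synthetic Gaussian errors, and the chosen standard deviation of 2.0 are all assumptions made for illustration. It computes MAE, RMSE and the biased variance of the absolute errors, confirms the decomposition of RMSE squared, and evaluates the MAE-based estimate of the total error standard deviation (TESD) with its one-standard-deviation spread.

```python
import math
import random


def error_metrics(errors):
    """Return (MAE, RMSE, biased variance of |error|) for a list of errors."""
    k = len(errors)
    abs_err = [abs(e) for e in errors]
    mae = sum(abs_err) / k
    rmse = math.sqrt(sum(e * e for e in errors) / k)
    # Biased estimator: divide by k, not k - 1, as stated in the abstract.
    var_abs = sum((a - mae) ** 2 for a in abs_err) / k
    return mae, rmse, var_abs


def tesd_from_mae(mae, k):
    """MAE-based TESD estimate and its spread (Gaussian errors assumed)."""
    scale = math.sqrt(math.pi / 2)
    estimate = scale * mae
    spread = math.sqrt(math.pi / 2 - 1) / math.sqrt(k) * scale * mae
    return estimate, spread


# Synthetic unbiased Gaussian errors (hypothetical data, true std = 2.0).
random.seed(0)
errors = [random.gauss(0.0, 2.0) for _ in range(9000)]

mae, rmse, var_abs = error_metrics(errors)

# The decomposition holds exactly for any sample, Gaussian or not.
assert math.isclose(rmse ** 2, mae ** 2 + var_abs, rel_tol=1e-9)

estimate, spread = tesd_from_mae(mae, len(errors))
print(f"MAE={mae:.4f}  RMSE={rmse:.4f}  sqrt(VAR)={math.sqrt(var_abs):.4f}")
print(f"TESD={estimate:.4f} +/- {spread:.4f}")
print(f"spread relative to MAE: {spread / mae:.4%}")
```

The pair `(mae, math.sqrt(var_abs))` gives the coordinates a model would take on the intercomparison graph described above, and by the decomposition its distance from the origin equals the RMSE.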