This question may seem trivial, perhaps even meaningless, so I put it up at the risk of embarrassing myself. But since it is a genuine question, I am asking it anyway. All thoughts are welcome.
Question: When quantifying the performance of a classifier, what advantages does RMSE offer over the area under the ROC curve (AUC)? And what does the AUC offer that RMSE does not? I find the AUC very intuitive and prefer using it for classification tasks, but can I give a theoretical reason for using it over RMSE, or vice versa? Review committees have different preferences: some journals prefer reporting the RMSE, some prefer the AUC, and some ask for both. As another example, the 2010 KDD Cup used RMSE, while the 2010 UCSD data mining competition used AUC.
Or is this a bad question to ask?
To paraphrase my question: in what instances is a classifier deemed “good” by the AUC measure but “not so good” by the RMSE measure? What would be the exact reason for such a difference of “opinion”? And in which situations should I use AUC, and in which RMSE?
Some background: if the two measures were equivalent, you would expect a strong linear relationship between them (with a negative correlation), since for a perfect classifier the RMSE is zero and the AUC is 1.
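To make the perfect-classifier case concrete, here is a minimal sketch (with made-up labels and scores) that computes both measures directly: RMSE from the squared differences between labels and predicted probabilities, and AUC as the fraction of (positive, negative) pairs ranked correctly.

```python
from math import sqrt

def rmse(y_true, scores):
    # root mean squared error between labels and predicted probabilities
    return sqrt(sum((y - s) ** 2 for y, s in zip(y_true, scores)) / len(y_true))

def auc(y_true, scores):
    # fraction of (positive, negative) pairs ranked correctly, ties count 0.5
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y_true = [0, 0, 1, 1]
perfect = [0.0, 0.0, 1.0, 1.0]   # a perfect probabilistic classifier
print(auc(y_true, perfect), rmse(y_true, perfect))  # 1.0 0.0
```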
I always use both for all purposes. Here is a sample graph.
This is actually a very typical graph, and there are no surprises in it. If you leave out some “bad examples” such as those at (0.4, 0.65) and (0.38, 0.7), the graph shows a good negative correlation (as measured by the line fit).
So, the question remains for me. What are the advantages and disadvantages of both?
Recommendations:
1. ROC Graphs: Notes and Practical Considerations for Researchers – Tom Fawcett
This paper (http://citeseerx.ist.psu.edu/viewdoc/summary?doi=10.1.1.73.587) looks at the relationship between AUC and 0-1 loss; perhaps some of it is relevant to RMSE. As for which one to use, IMHO it should always come down to the “real loss” behind the learning task, i.e., the actual amount of time, money, or energy wasted due to poor performance. For instance, in the Netflix challenge, RMSE was the real loss (of money) because it corresponded directly to the odds of winning the prize. 0-1 loss corresponds to a real loss (of time) when classifier errors need to be manually corrected by a human. To be honest, I can’t think of any real-life loss that corresponds to AUC.
Yaroslav,
Thank you so much for your comment. I can now say I have the “terms/keywords” for my little problem. :)
I have just returned to school after working at a data mining startup for a while. At neither place did I really get an answer to this question. I always thought it was too stupid to ask, since RMSE is a measure of some kind of loss, while AUC is just a measure for ranking classifiers based on that loss. I never really understood how to judge how good a metric is, or the pros and cons of the common ones.
And thank you for the paper by C. Cortes (the SVM superstar!). I actually went through ALL the papers that cite it.
I found these papers very interesting and will take time next weekend to read them:
1. Generalization Bounds for the Area under the ROC Curve – Shivani Agarwal et al.
2. A Data-Dependent Generalization Error Bound for the AUC – Nicolas Usunier et al.
I also found this paper by Mark Reid very interesting: Information, Divergence and Risk for Binary Experiments. And there are two rule-of-thumb articles by John Langford titled ROC vs Accuracy vs AROC and Choice of Metrics.
Thanks Again,
Shubhendu
It may help to consider situations in which the two measures disagree. Within the range of values one usually encounters, I’d expect there to be some correlation between RMSE and AUC. The question is: under what circumstances does a change cause one to improve, while the other deteriorates?
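One concrete source of disagreement, sketched below with made-up numbers: AUC depends only on how the scores rank the examples, so it is unchanged by any order-preserving transform of the scores, while RMSE is sensitive to calibration. Squaring well-calibrated probabilities keeps the ranking (and the AUC) identical but drags the positives’ scores away from 1, so the RMSE deteriorates.

```python
from math import sqrt

def rmse(y_true, scores):
    return sqrt(sum((y - s) ** 2 for y, s in zip(y_true, scores)) / len(y_true))

def auc(y_true, scores):
    # probability that a random positive outranks a random negative
    pos = [s for y, s in zip(y_true, scores) if y == 1]
    neg = [s for y, s in zip(y_true, scores) if y == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

y = [0, 0, 1, 1]
well_calibrated = [0.1, 0.2, 0.8, 0.9]
# squaring preserves the order of the scores but worsens calibration
squashed = [s ** 2 for s in well_calibrated]   # [0.01, 0.04, 0.64, 0.81]

print(auc(y, well_calibrated), auc(y, squashed))    # both 1.0
print(rmse(y, well_calibrated), rmse(y, squashed))  # second is larger
```

So a classifier can look identical under AUC yet markedly worse under RMSE whenever its scores are poorly calibrated.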
Yaroslav,
Funny that you speak of “real” loss.
As a commercial marketing data miner, I would rather speak of “real gains” when using a model. For that you take into account the fixed cost of a campaign, the variable cost (contact cost), which is a function of the number of prospects targeted, and of course the cumulative net gains (= buy probability × unit profit), as compared with a random (or experience-guided) selection of prospects.
See my blog post on this subject for a more elaborate discussion (http://bit.ly/3F99A6).
Zyxo.