scikit-learn: Multi-metric scoring is incredibly slow because it repeats predictions for every metric

Description

The implementation of _multimetric_score calls every scorer individually.

Unfortunately, these scorers are typically generated via make_scorer, so each individual metric ends up repeatedly calling predict, predict_proba, etc.

For a recent exploratory GridSearch where I was generating lots of metrics (multi-output regression, and I wanted individual statistics for each output), my scoring time was 75% as long as my fit time. That is bonkers, and I know that my scoring functions are nowhere near that slow.

Suggested Change

This code should really just be calling predict once and feeding the same predictions into each of the scorers.
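
To illustrate the difference, here is a minimal pure-Python sketch, not scikit-learn's actual code. CountingEstimator, mae, mse, score_per_metric and score_shared_predictions are all hypothetical names made up for this example. It assumes every metric takes (y_true, y_pred) and only needs the output of predict.

```python
from typing import Callable, Dict, Sequence

Metric = Callable[[Sequence[float], Sequence[float]], float]


class CountingEstimator:
    """Hypothetical toy estimator that counts how often predict is called."""

    def __init__(self) -> None:
        self.predict_calls = 0

    def predict(self, X: Sequence[float]) -> list:
        self.predict_calls += 1
        return [2.0 * x for x in X]  # stand-in for an expensive model


def mae(y_true, y_pred) -> float:
    """Mean absolute error (plain-Python stand-in for a real metric)."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred) -> float:
    """Mean squared error (plain-Python stand-in for a real metric)."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def score_per_metric(estimator, X, y, metrics: Dict[str, Metric]) -> Dict[str, float]:
    """Behaviour described in this issue: one predict call per metric."""
    return {name: metric(y, estimator.predict(X)) for name, metric in metrics.items()}


def score_shared_predictions(estimator, X, y, metrics: Dict[str, Metric]) -> Dict[str, float]:
    """Suggested behaviour: predict once, reuse the result for every metric."""
    y_pred = estimator.predict(X)
    return {name: metric(y, y_pred) for name, metric in metrics.items()}


X = [1.0, 2.0, 3.0]
y = [2.0, 4.0, 7.0]
metrics = {"mae": mae, "mse": mse}

slow = CountingEstimator()
slow_scores = score_per_metric(slow, X, y, metrics)

fast = CountingEstimator()
fast_scores = score_shared_predictions(fast, X, y, metrics)

# Same scores either way, but the number of predict calls differs.
print(slow.predict_calls, fast.predict_calls)
```

With N metrics the first variant calls predict N times and the second calls it once, which is where the scoring time goes when predict is expensive. A real fix would also have to handle metrics that need predict_proba or decision_function, which this sketch ignores.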

About this issue

  • State: closed
  • Created 6 years ago
  • Comments: 28 (28 by maintainers)

Most upvoted comments

My idea was that “most users” that use multi-metric scoring (if there are any lol)

I didn’t create this ticket just for the fun of it.

Why don’t we just allow scorers to return dictionaries and create a multi-metric scorer?
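
A rough sketch of what a dictionary-returning scorer could look like for the multi-output regression case from the description. It is only an illustration of the idea, not an existing scikit-learn API: ShiftEstimator and per_output_mse_scorer are hypothetical names, and the only thing borrowed from scikit-learn is the (estimator, X, y) calling convention for scorers.

```python
from typing import Dict


class ShiftEstimator:
    """Hypothetical toy multi-output model: shifts each input column by a fixed offset."""

    def predict(self, X):
        return [[row[0] + 1.0, row[1] - 1.0] for row in X]


def per_output_mse_scorer(estimator, X, y) -> Dict[str, float]:
    """Scorer using the (estimator, X, y) convention that returns one score per output."""
    y_pred = estimator.predict(X)  # single predict call shared by all outputs
    n_rows = len(y)
    n_outputs = len(y[0])
    scores = {}
    for j in range(n_outputs):
        sq_err = sum((y[i][j] - y_pred[i][j]) ** 2 for i in range(n_rows))
        scores[f"mse_output_{j}"] = sq_err / n_rows
    return scores


X = [[0.0, 0.0], [1.0, 1.0]]
y = [[1.0, 0.0], [2.0, 2.0]]
scores = per_output_mse_scorer(ShiftEstimator(), X, y)
print(scores)
```

One scorer call produces every per-output statistic, so the cost of predict is paid once no matter how many entries the dictionary has.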