scikit-learn: Multi-metric scoring is incredibly slow because it repeats predictions for every metric

Description

The implementation of _multimetric_score calls every scorer individually.

Unfortunately, these scorers are typically generated via make_scorer, so each individual metric ends up calling predict, predict_proba, etc. again on the same data.

For a recent exploratory GridSearch where I was generating lots of metrics (multi-output regression, where I wanted individual statistics for each output), my scoring time was 75% as long as my fit time. That is bonkers, and I know that my scoring functions are nowhere near that slow.

Suggested Change

This code should really just be calling predict once and feeding the same predictions into each of the scorers.
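A minimal sketch of the difference, in plain Python rather than scikit-learn internals. CountingModel, mae, mse, and both score_* helpers are hypothetical names invented for this illustration; the first helper mimics the per-scorer behaviour described above, the second computes predictions once and reuses them.

```python
class CountingModel:
    """Hypothetical stand-in for a fitted estimator; counts predict calls."""

    def __init__(self):
        self.predict_calls = 0

    def predict(self, X):
        self.predict_calls += 1
        return [2.0 * x for x in X]


def mae(y_true, y_pred):
    """Mean absolute error over two equal-length sequences."""
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)


def mse(y_true, y_pred):
    """Mean squared error over two equal-length sequences."""
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)


def score_per_scorer(model, X, y, metrics):
    # Behaviour described in the issue: every metric triggers its own predict.
    return {name: fn(y, model.predict(X)) for name, fn in metrics.items()}


def score_shared_predictions(model, X, y, metrics):
    # Suggested change: predict once, feed the same predictions to each metric.
    y_pred = model.predict(X)
    return {name: fn(y, y_pred) for name, fn in metrics.items()}


metrics = {"mae": mae, "mse": mse}
X, y = [1.0, 2.0, 3.0], [2.0, 4.0, 7.0]

slow_model = CountingModel()
slow_scores = score_per_scorer(slow_model, X, y, metrics)

fast_model = CountingModel()
fast_scores = score_shared_predictions(fast_model, X, y, metrics)
```

Both helpers return the same scores. The number of predict calls grows with the number of metrics in the first and stays at one in the second, which matters when predict is expensive.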

About this issue

State: closed
Created 6 years ago
Comments: 28 (28 by maintainers)

Commits related to this issue

Improve multi-metric scoring computation. Previously multi metric scoring called the `predict` method of an estimator once for each scorer, this could lead to drastic increases in costs. This change... — committed to gamazeps/scikit-learn by gamazeps 6 years ago

Most upvoted comments

My idea was that “most users” that use multi-metric scoring (if there are any lol)

I didn’t create this ticket just for the fun of it.

jimmywan on Oct 18, 2018

Why don’t we just allow scorers to return dictionaries and create a multi-metric scorer?

amueller on Oct 16, 2018
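A rough sketch of what a dictionary-returning scorer could look like for the multi-output regression case from the description. It is plain Python: ToyMultiOutputModel and per_output_mae_scorer are hypothetical names, and only the (estimator, X, y) call signature is borrowed from scikit-learn's scorer convention. It is not the API that was eventually merged.

```python
class ToyMultiOutputModel:
    """Hypothetical fitted estimator that returns two outputs per sample."""

    def predict(self, X):
        return [[x, 2 * x] for x in X]


def per_output_mae_scorer(estimator, X, y):
    """Return one mean absolute error per output from a single predict call."""
    y_pred = estimator.predict(X)  # the only predict call
    n_outputs = len(y[0])
    scores = {}
    for j in range(n_outputs):
        errors = [
            abs(row_true[j] - row_pred[j])
            for row_true, row_pred in zip(y, y_pred)
        ]
        scores[f"mae_output_{j}"] = sum(errors) / len(errors)
    return scores


X = [1, 2]
y = [[1, 2], [3, 4]]
scores = per_output_mae_scorer(ToyMultiOutputModel(), X, y)
```

Because the scorer owns the predict call, every per-output statistic comes from the same predictions, and the caller only has to merge one dictionary per split.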