ValueError: For multi-metric scoring, the parameter refit must be set to a scorer key or a callable

编程小6 (4) 2024-05-27 18:23

Hi everyone, I'm 编程小6. I write up the problems and ideas I have run into during development over the years. Today's topic is ValueError: For multi-metric scoring, the parameter refit must be set to a scorer key or a callable. I hope it helps!

ValueError: For multi-metric scoring, the parameter refit must be set to a scorer key or a callable to refit an estimator with the best parameter setting on the whole data and make the best_* attributes available for that metric. If this is not needed, refit should be set to False explicitly. True was passed.

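Problem:

When more than one scoring metric is given, GridSearchCV cannot tell which metric it should use to pick the parameters for the refit, so you have to name one explicitly. The call below leaves refit at its default of True, which triggers the error: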

clf = GridSearchCV(log_lr, parameters, cv=5, scoring=scoring)

import numpy as np
from sklearn import linear_model, datasets
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import log_loss, make_scorer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

iris = datasets.load_iris()
X = iris.data
# Collapse the original classes 0, 1, 2 into a binary problem: 0 stays 0, 1 and 2 become 1
y = np.where(iris.target==0,0,1)
# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42,shuffle=True, stratify=y)

# Parameter grid for the search
alphas = np.logspace(1, 10, 100, base = 10)
parameters = {'C':[1, 10],'solver':('liblinear','saga')}
# parameters = {'C':alphas}
# Build the logistic regression model with L1 regularization
# (max_iter must be an integer; recent scikit-learn versions reject 1e5)
log_lr = linear_model.LogisticRegression(penalty='l1',max_iter=100000,solver = 'liblinear')
# Build a log-loss scorer; greater_is_better=False negates it because lower is better.
# On scikit-learn 1.4 and later, needs_proba is deprecated in favor of response_method='predict_proba'.
LogLoss = make_scorer(log_loss, greater_is_better=False, needs_proba=True)
# GridSearchCV with two metrics
scoring = {'AUC': 'roc_auc', 'LogLoss': LogLoss}
# clf = GridSearchCV(log_lr, parameters, cv=5, scoring=LogLoss)
# clf = GridSearchCV(log_lr, parameters, cv=5)
# refit is left at its default (True), so fit() below raises the ValueError
clf = GridSearchCV(log_lr, parameters, cv=5, scoring=scoring)
# clf = GridSearchCV(log_lr, parameters, cv=5, scoring=scoring,refit='AUC')
# Fit the model
clf.fit(X_train, y_train)
print(clf.best_score_, clf.best_estimator_)
iris_model = clf.best_estimator_
# Print the classification report
print('---------------classification report-------------------')
y_pred = iris_model.predict(X_test)
print(classification_report(y_test, y_pred))

Solution:

Set refit to the key of one of the scorers in the scoring dict:

clf = GridSearchCV(log_lr, parameters, cv=5, scoring=scoring,refit='AUC')

import numpy as np
from sklearn import linear_model, datasets
from sklearn.model_selection import GridSearchCV
from sklearn.metrics import log_loss, make_scorer
from sklearn.model_selection import train_test_split
from sklearn.metrics import classification_report, confusion_matrix, accuracy_score

iris = datasets.load_iris()
X = iris.data
# Collapse the original classes 0, 1, 2 into a binary problem: 0 stays 0, 1 and 2 become 1
y = np.where(iris.target==0,0,1)
# Split the data into training and test sets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42,shuffle=True, stratify=y)

# Parameter grid for the search
alphas = np.logspace(1, 10, 100, base = 10)
parameters = {'C':[1, 10],'solver':('liblinear','saga')}
# parameters = {'C':alphas}
# Build the logistic regression model with L1 regularization
# (max_iter must be an integer; recent scikit-learn versions reject 1e5)
log_lr = linear_model.LogisticRegression(penalty='l1',max_iter=100000,solver = 'liblinear')
# Build a log-loss scorer; greater_is_better=False negates it because lower is better.
# On scikit-learn 1.4 and later, needs_proba is deprecated in favor of response_method='predict_proba'.
LogLoss = make_scorer(log_loss, greater_is_better=False, needs_proba=True)
# GridSearchCV with two metrics
scoring = {'AUC': 'roc_auc', 'LogLoss': LogLoss}
# clf = GridSearchCV(log_lr, parameters, cv=5, scoring=LogLoss)
# clf = GridSearchCV(log_lr, parameters, cv=5)
# clf = GridSearchCV(log_lr, parameters, cv=5, scoring=scoring)
# refit='AUC' tells the search to choose and refit the best parameters by AUC
clf = GridSearchCV(log_lr, parameters, cv=5, scoring=scoring,refit='AUC')
# Fit the model
clf.fit(X_train, y_train)
print(clf.best_score_, clf.best_estimator_)
iris_model = clf.best_estimator_
# Print the classification report
print('---------------classification report-------------------')
y_pred = iris_model.predict(X_test)
print(classification_report(y_test, y_pred))
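
With refit='AUC', best_score_ and best_params_ refer to AUC only, but the scores for every metric are still recorded in cv_results_. The sketch below shows one way to read them. It is a simplified variant of the code above, not the same setup: it assumes the built-in 'neg_log_loss' scorer string in place of the custom LogLoss scorer, a default LogisticRegression, and a smaller grid.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

# Same binary relabeling as above: class 0 vs. the rest
X, y_multi = load_iris(return_X_y=True)
y = np.where(y_multi == 0, 0, 1)

# Assumption: the built-in 'neg_log_loss' string stands in for the custom scorer
scoring = {"AUC": "roc_auc", "LogLoss": "neg_log_loss"}
search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    {"C": [1, 10]},
    cv=5,
    scoring=scoring,
    refit="AUC",  # must match a key of the scoring dict
)
search.fit(X, y)

# One mean_test_<key> array per scorer, with one entry per parameter candidate
auc_scores = search.cv_results_["mean_test_AUC"]
logloss_scores = search.cv_results_["mean_test_LogLoss"]
for params, auc, nll in zip(search.cv_results_["params"], auc_scores, logloss_scores):
    print(params, "AUC:", auc, "neg log loss:", nll)

# best_score_ is the AUC of the selected candidate, because refit='AUC'
print(search.best_params_, search.best_score_)
```

The key names follow the pattern mean_test_ plus the dict key, so renaming a key in scoring also renames the corresponding column in cv_results_.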

Full error:

---------------------------------------------------------------------------
ValueError Traceback (most recent call last)
<ipython-input-70-e7a5d74dc020> in <module>
27 clf = GridSearchCV(log_lr, parameters, cv=5, scoring=scoring)
28 # 模型拟合
---> 29 clf.fit(X_train, y_train)
30 print(clf.best_score_, clf.best_estimator_)
31 iris_model = clf.best_estimator_

D:\anaconda\lib\site-packages\sklearn\utils\validation.py in inner_f(*args, **kwargs)
61 extra_args = len(args) - len(all_args)
62 if extra_args <= 0:
---> 63 return f(*args, **kwargs)
64
65 # extra_args > 0

D:\anaconda\lib\site-packages\sklearn\model_selection\_search.py in fit(self, X, y, groups, **fit_params)
754 else:
755 scorers = _check_multimetric_scoring(self.estimator, self.scoring)
--> 756 self._check_refit_for_multimetric(scorers)
757 refit_metric = self.refit
758

D:\anaconda\lib\site-packages\sklearn\model_selection\_search.py in _check_refit_for_multimetric(self, scores)
719 if (self.refit is not False and not valid_refit_dict
720 and not callable(self.refit)):
--> 721 raise ValueError(multimetric_refit_msg)
722
723 @_deprecate_positional_args

ValueError: For multi-metric scoring, the parameter refit must be set to a scorer key or a callable to refit an estimator with the best parameter setting on the whole data and make the best_* attributes available for that metric. If this is not needed, refit should be set to False explicitly. True was passed.
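
The condition visible in the last traceback frame can be paraphrased as a small standalone function. This is only a sketch to make the logic readable: the function name, its arguments, and the shortened message are hypothetical and this is not scikit-learn's actual code.

```python
def check_refit_for_multimetric(refit, scorer_names):
    """Simplified paraphrase of the check in the traceback (hypothetical helper)."""
    # A string is only valid if it names one of the scorers
    valid_key = isinstance(refit, str) and refit in scorer_names
    if refit is not False and not valid_key and not callable(refit):
        raise ValueError(
            f"refit must be a scorer key, a callable, or False; {refit!r} was passed."
        )


scorer_names = {"AUC", "LogLoss"}
check_refit_for_multimetric("AUC", scorer_names)   # accepted: matches a scorer key
check_refit_for_multimetric(False, scorer_names)   # accepted: refit explicitly disabled
try:
    check_refit_for_multimetric(True, scorer_names)  # the default value of refit
except ValueError as exc:
    print(exc)
```

Read this way, the default refit=True fails for the same reason an unknown string would: it is neither False, nor a key of the scoring dict, nor a callable.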

The GridSearchCV documentation defines refit as follows:

refit : boolean, string, or callable, default=True

Refit an estimator using the best found parameters on the whole dataset. For multiple metric evaluation, this needs to be a string denoting the scorer that would be used to find the best parameters for refitting the estimator at the end. Where there are considerations other than maximum score in choosing a best estimator, refit can be set to a function which returns the selected best_index_ given cv_results_. The refitted estimator is made available at the best_estimator_ attribute and permits using predict directly on this GridSearchCV instance. Also for multiple metric evaluation, the attributes best_index_, best_score_ and best_params_ will only be available if refit is set and all of them will be determined w.r.t this specific scorer. best_score_ is not returned if refit is callable. See scoring parameter to know more about multiple metric evaluation.
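
The callable form described in that passage can be sketched as follows. The selection rule here (take the candidate with the best mean log loss) is only an illustration, and the function name pick_by_logloss, the 'neg_log_loss' scorer string, and the small grid are assumptions made for this example.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y_multi = load_iris(return_X_y=True)
y = np.where(y_multi == 0, 0, 1)  # binary problem: class 0 vs. the rest


def pick_by_logloss(cv_results):
    """Hypothetical selection rule: return the index of the best mean log loss."""
    # 'neg_log_loss' is negated, so the largest value is the best candidate
    return int(np.argmax(cv_results["mean_test_LogLoss"]))


search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    {"C": [0.1, 1, 10]},
    cv=5,
    scoring={"AUC": "roc_auc", "LogLoss": "neg_log_loss"},
    refit=pick_by_logloss,  # called with cv_results_, must return an integer index
)
search.fit(X, y)

# best_index_ and best_params_ come from the callable
print(search.best_index_, search.best_params_)
```

As the quoted documentation notes, best_score_ is not returned when refit is a callable, so read the score you care about from cv_results_ at best_index_ instead.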

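If you don't want to refit the estimator, you can set refit=False (as a boolean). To refit the estimator with one of the scorers instead, set refit to that scorer's key in the scoring dict, for example refit='AUC' for the dict used above. A metric function name such as 'precision_score' only works if it is actually one of the keys.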
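
For the refit=False case, a minimal sketch might look like this. It assumes the built-in 'neg_log_loss' scorer string and a default LogisticRegression rather than the exact model above. No final model is refitted, so the best candidate has to be read from cv_results_ by hand.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV

X, y_multi = load_iris(return_X_y=True)
y = np.where(y_multi == 0, 0, 1)  # binary problem: class 0 vs. the rest

search = GridSearchCV(
    LogisticRegression(max_iter=1000),
    {"C": [1, 10]},
    cv=5,
    scoring={"AUC": "roc_auc", "LogLoss": "neg_log_loss"},
    refit=False,  # evaluate only; do not refit a final estimator
)
search.fit(X, y)

# Without a refit there is no best_estimator_ attribute
print(hasattr(search, "best_estimator_"))

# Pick a candidate manually, here by mean AUC
best_idx = int(np.argmax(search.cv_results_["mean_test_AUC"]))
chosen_params = search.cv_results_["params"][best_idx]
print(chosen_params)
```

This is mainly useful when you only want to compare metrics across candidates and plan to train the final model yourself.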

Reference: How to fix the error “For multi-metric scoring” for OneClassSVM and GridSearchCV
Reference: GridSearchCV

That's all for today. Thanks for reading, and if this helped you, feel free to share it with others.
