katib: Error "Objective metric accuracy is not found in training logs, unavailable value is reported"
/kind bug
What steps did you take and what happened:
I have been trying to create a simple Katib experiment with the sklearn iris dataset, but am facing the error "Objective metric accuracy is not found in training logs, unavailable value is reported. metric: <name:"accuracy" value:"unavailable">".
Below is my code:
```python
import argparse
import logging
import os

import hypertune
import pandas as pd
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split
from sklearn.neighbors import KNeighborsClassifier


def main():
    parser = argparse.ArgumentParser()
    parser.add_argument('--neighbors', type=int, default=3,
                        help='value of k')
    parser.add_argument('--log-path', type=str, default='',
                        help='Path to save logs. Print to StdOut if log-path is not set')
    parser.add_argument('--logger', type=str, choices=['standard', 'hypertune'],
                        help='Logger', default='standard')
    args = parser.parse_args()

    if args.log_path == '' or args.logger == 'hypertune':
        logging.basicConfig(
            format='%(asctime)s %(levelname)-8s %(message)s',
            datefmt='%Y-%m-%dT%H:%M:%SZ',
            level=logging.DEBUG)
    else:
        logging.basicConfig(
            format='%(asctime)s %(levelname)-8s %(message)s',
            datefmt='%Y-%m-%dT%H:%M:%SZ',
            level=logging.DEBUG,
            filename=args.log_path)

    if args.logger == 'hypertune' and args.log_path != '':
        os.environ['CLOUD_ML_HP_METRIC_FILE'] = args.log_path

    # For JSON logging
    hpt = hypertune.HyperTune()

    # Load the data
    iris_data = load_iris()
    iris_df = pd.DataFrame(data=iris_data['data'], columns=iris_data['feature_names'])
    iris_df['Iris type'] = iris_data['target']
    iris_df['Iris name'] = iris_df['Iris type'].apply(
        lambda x: 'setosa' if x == 0 else ('versicolor' if x == 1 else 'virginica'))

    X = iris_df[['sepal length (cm)', 'sepal width (cm)',
                 'petal length (cm)', 'petal width (cm)']]
    y = iris_df['Iris name']
    X_train, X_test, y_train, y_test = train_test_split(X, y, random_state=0)

    knn = KNeighborsClassifier(n_neighbors=args.neighbors)
    knn.fit(X_train, y_train)
    accuracy = knn.score(X_test, y_test)

    logging.info("{{metricName: accuracy, metricValue: {:.4f}}}\n".format(accuracy))

    if args.logger == 'hypertune':
        hpt.report_hyperparameter_tuning_metric(
            hyperparameter_metric_tag='accuracy',
            metric_value=accuracy)


if __name__ == '__main__':
    main()
```
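For reference, Katib's `StdOut` metrics collector scrapes the trial's stdout with a filter that, by default, matches lines of the form `name=value` rather than the `{metricName: ..., metricValue: ...}` shape logged above. A minimal sketch of emitting the objective in the `name=value` form (the exact default filter regex should be verified against your Katib version, so treat this format as an assumption):

```python
import logging

logging.basicConfig(
    format="%(asctime)s %(levelname)-8s %(message)s",
    datefmt="%Y-%m-%dT%H:%M:%SZ",
    level=logging.INFO,
)


def format_metric(name, value):
    """Render a metric as "name=value", the pattern the StdOut
    collector's default filter is expected to parse."""
    return "{}={:.4f}".format(name, value)


# Log the objective metric so the sidecar collector can scrape it.
accuracy = 0.9737
logging.info(format_metric("accuracy", accuracy))
```

The key point is that the metric name in the log line must match `objectiveMetricName` in the Experiment spec exactly.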
Below is my yaml file:
```yaml
apiVersion: kubeflow.org/v1beta1
kind: Experiment
metadata:
  namespace: kubeflow
  name: iris-1
spec:
  parallelTrialCount: 1
  maxTrialCount: 2
  maxFailedTrialCount: 3
  objective:
    type: maximize
    goal: 0.99
    objectiveMetricName: accuracy
  metricsCollectorSpec:
    collector:
      kind: StdOut
  algorithm:
    algorithmName: random
  parameters:
    - name: neighbors
      parameterType: int
      feasibleSpace:
        min: "3"
        max: "5"
  trialTemplate:
    primaryContainerName: training-container
    trialParameters:
      - name: neighbors
        description: KNN neighbors
        reference: neighbors
    trialSpec:
      apiVersion: batch/v1
      kind: Job
      spec:
        template:
          metadata:
            annotations:
              sidecar.istio.io/inject: "false"
          spec:
            containers:
              - name: training-container
                image: e-dpiac-docker-local.docker.lowes.com/katib-sklearn:v3
                command:
                  - "python3"
                  - "/app/iris.py"
                  - "--neighbors=${trialParameters.neighbors}"
                  - "--logger=hypertune"
                resources:
                  requests:
                    memory: "6Gi"
                    cpu: "2"
                  limits:
                    memory: "10Gi"
                    cpu: "4"
            restartPolicy: Never
```
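If the training script keeps its current `{metricName: ..., metricValue: ...}` log format, the collector's filter can be overridden instead. A sketch of a `metricsCollectorSpec` with a custom filter regex (the `source.filter.metricsFormat` field is part of the v1beta1 API, but the exact regex below is an illustration and should be adapted to the real log lines):

```yaml
metricsCollectorSpec:
  source:
    filter:
      metricsFormat:
        # Capture "metricName: <name>, metricValue: <number>" pairs.
        - "metricName: ([\\w|-]+), metricValue: ((-?\\d+)(\\.\\d+)?)"
  collector:
    kind: StdOut
```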
What did you expect to happen: The metrics should have been collected and the trials should have succeeded.
Anything else you would like to add:
Environment:
- Katib version (check the Katib controller image version): katib-controller:v0.12.0
- Kubernetes version (`kubectl version`):
- OS (`uname -a`):
Impacted by this bug? Give it a 👍 We prioritize the issues with the most 👍
About this issue
- Original URL
- State: closed
- Created a year ago
- Comments: 23 (23 by maintainers)
That is correct behaviour, since you use Katib version 0.12. In that version the default is `ResumePolicy=LongRunning`, which allows you to restart your Experiment at any time by changing the `maxTrialCount` parameter. In that case the Suggestion pod is always running. In the recent release we use `ResumePolicy=Never` as the default resume policy, which won't allow you to restart an Experiment and cleans up the Suggestion pod. You can learn more about it in this doc: https://www.kubeflow.org/docs/components/katib/resume-experiment/#resume-succeeded-experiment
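For example, the resume policy can be set explicitly in the Experiment spec so the behaviour does not depend on the version default. A sketch (field name as in the v1beta1 API; the rest of the spec is elided):

```yaml
apiVersion: kubeflow.org/v1beta1
kind: Experiment
metadata:
  namespace: kubeflow
  name: iris-1
spec:
  # Never: the Suggestion pod is cleaned up when the Experiment
  # finishes, and the Experiment cannot be resumed afterwards.
  resumePolicy: Never
```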