tensorflow: The tutorial "Logging and Monitoring Basics with tf.contrib.learn" has an error.

I tried the code snippet from the section “Customizing the Evaluation Metrics with MetricSpec” of the tutorial Logging and Monitoring Basics with tf.contrib.learn. The snippet is:

validation_metrics = {
    "accuracy":
        tf.contrib.learn.metric_spec.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_accuracy,
            prediction_key=tf.contrib.learn.prediction_key.PredictionKey.
            CLASSES),
    "precision":
        tf.contrib.learn.metric_spec.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_precision,
            prediction_key=tf.contrib.learn.prediction_key.PredictionKey.
            CLASSES),
    "recall":
        tf.contrib.learn.metric_spec.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_recall,
            prediction_key=tf.contrib.learn.prediction_key.PredictionKey.
            CLASSES)
}

My TensorFlow version is r1.0. When I run my program, it prints the following error:

$ python iris.py 
Traceback (most recent call last):
  File "iris.py", line 72, in <module>
    tf.app.run()
  File "/Library/Python/2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "iris.py", line 24, in main
    "accuracy": tf.contrib.learn.metric_spec.MetricSpec(
AttributeError: 'module' object has no attribute 'metric_spec'

I found that the class tf.contrib.learn.metric_spec.MetricSpec has been renamed to tf.contrib.learn.MetricSpec.

The class tf.contrib.learn.prediction_key.PredictionKey also has been renamed to tf.contrib.learn.PredictionKey.
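
For reference, applying those two renames to the tutorial snippet gives the following (a sketch based only on the renames noted above; everything else is unchanged from the tutorial):

validation_metrics = {
    "accuracy":
        tf.contrib.learn.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_accuracy,
            prediction_key=tf.contrib.learn.PredictionKey.CLASSES),
    "precision":
        tf.contrib.learn.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_precision,
            prediction_key=tf.contrib.learn.PredictionKey.CLASSES),
    "recall":
        tf.contrib.learn.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_recall,
            prediction_key=tf.contrib.learn.PredictionKey.CLASSES)
}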

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 6
  • Comments: 29 (11 by maintainers)

Most upvoted comments

How can I use early_stopping in this environment?

@AxenGitHub I managed to run validation during training by using Experiment; see the docs here: https://www.tensorflow.org/api_docs/python/tf/contrib/learn/Experiment


experiment = tf.contrib.learn.Experiment(
    estimator=estimator,
    train_input_fn=training_input_fn,
    eval_input_fn=eval_input_fn,
    eval_steps=None,
    min_eval_frequency=1)
experiment.train_and_evaluate()

I am not sure how effective it is yet, but it did the job. Could you please share your solution for implementing a validation monitor with hooks? I asked a question on Stack Overflow: https://stackoverflow.com/questions/45417502/validation-during-training-of-estimator?noredirect=1#comment77798445_45417502

Is there any update regarding ValidationMonitor as a hook? The documentation does not seem to have been updated.

No, I did update this tutorial back in December, but haven’t yet switched to use SessionRunHook, as I was waiting on an equivalent canned hook for ValidationMonitor. That’s not yet available, correct?

In the meantime, for an example of applying a SessionRunHook to an Estimator, you can refer to the tf.layers tutorial (https://www.tensorflow.org/tutorials/layers), which covers how to configure a LoggingTensorHook.
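
For context, a rough sketch of the pattern from that tutorial (the tensor name "softmax_tensor", the estimator variable, and the input function here are placeholders, not from this thread):

# Log the value of the graph tensor named "softmax_tensor" every 50 steps.
logging_hook = tf.train.LoggingTensorHook(
    tensors={"probabilities": "softmax_tensor"},
    every_n_iter=50)

# SessionRunHooks are attached via the hooks argument of Estimator.train().
classifier.train(
    input_fn=train_input_fn,
    steps=20000,
    hooks=[logging_hook])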

@Moymix you can implement early stopping by using the continuous_eval_predicate_fn argument, available in tf.contrib.learn.Experiment.continuous_eval_on_train_data. For instance, let’s take a batch size of 10 and an early-stop count of 15. Modifying the example from the TF Layers tutorial for a bigger dataset, the code would look like this:

import numpy as np
import tensorflow as tf

BATCH_SIZE = 10
EARLY_STOP_COUNT = 15

# Model function
def model_fn(features, labels, mode):
  # ...
  eval_metric_ops = {"accuracy": accuracy}
  return tf.estimator.EstimatorSpec(
      mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)

# Early stopping function
accuracy_reg = np.zeros(EARLY_STOP_COUNT)
def early_stopping(eval_results):
  # None argument for the first evaluation
  if not eval_results: 
    return True
  
  accuracy_reg[0 : EARLY_STOP_COUNT - 1] = accuracy_reg[1 : EARLY_STOP_COUNT]
  accuracy_reg[EARLY_STOP_COUNT - 1] = eval_results["accuracy"]
  counts = 0
  for i in range(0, EARLY_STOP_COUNT - 1):
    if accuracy_reg[i + 1] <= accuracy_reg[i]:
      counts += 1
  if counts == EARLY_STOP_COUNT - 1:
    print("\nEarly stopping: %s \n" % accuracy_reg)
    return False
    
  return True

# Main function
def main(unused_argv):
  #...
  estimator = tf.estimator.Estimator(
      model_fn=model_fn)  # model_fn defined above
  # ...
  # Train the model 
  train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"data": train_data},
    y=train_labels,
    batch_size=BATCH_SIZE,
    num_epochs=None, # Continue until training steps are finished
    shuffle=True
    )
  eval_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"data": validate_data},
    y=validate_labels,
    batch_size=BATCH_SIZE,
    num_epochs=1, 
    shuffle=False
    )
  experiment = tf.contrib.learn.Experiment(
    estimator=estimator,
    train_input_fn=train_input_fn,
    eval_input_fn=eval_input_fn,
    train_steps=80000,
    eval_steps=None, # evaluate runs until input is exhausted
    eval_delay_secs=180, 
    train_steps_per_iteration=1000
    )
  experiment.continuous_train_and_eval(
    continuous_eval_predicate_fn=early_stopping)  
  
  # ...

However, keep in mind that continuous_eval_predicate_fn is experimental, so it could change at any moment.

I am in the same boat as @agniszczotka. I have successfully used a SummarySaverHook to write some stats to file and display them on TensorBoard, but I am wondering how I can track the accuracy improvement during training. Should I call estimator.evaluate with different "steps" parameters to evaluate the accuracy at different moments/checkpoints? Specifically, I am trying to replicate this: https://www.tensorflow.org/versions/r1.3/get_started/monitors#evaluating_every_n_steps
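
For context, the pattern from that page looks roughly like this (a sketch; x_train/y_train, x_test/y_test, feature_columns, and the model directory are placeholders from that tutorial, not from this thread):

validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(
    x_test,                       # validation features (numpy array)
    y_test,                       # validation labels (numpy array)
    every_n_steps=50)             # evaluate every 50 training steps

classifier = tf.contrib.learn.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[10, 20, 10],
    n_classes=3,
    model_dir="/tmp/iris_model",
    # checkpoints must be written often enough for the monitor to pick them up
    config=tf.contrib.learn.RunConfig(save_checkpoints_secs=1))

classifier.fit(x=x_train,
               y=y_train,
               steps=2000,
               monitors=[validation_monitor])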

I’ve created a ValidationHook based on the existing LoggingTensorHook.

import tensorflow as tf


class ValidationHook(tf.train.SessionRunHook):
    """Runs Estimator.evaluate() on a separate input_fn every N seconds or steps."""

    def __init__(self, model_fn, params, input_fn, checkpoint_dir,
                 every_n_secs=None, every_n_steps=None):
        self._iter_count = 0
        # A second Estimator pointed at the training checkpoint directory,
        # so evaluate() always picks up the latest checkpoint.
        self._estimator = tf.estimator.Estimator(
            model_fn=model_fn,
            params=params,
            model_dir=checkpoint_dir
        )
        self._input_fn = input_fn
        self._timer = tf.train.SecondOrStepTimer(every_n_secs, every_n_steps)
        self._should_trigger = False

    def begin(self):
        self._timer.reset()
        self._iter_count = 0

    def before_run(self, run_context):
        self._should_trigger = self._timer.should_trigger_for_step(self._iter_count)

    def after_run(self, run_context, run_values):
        # Evaluate only when enough steps/seconds have passed since the last trigger.
        if self._should_trigger:
            self._estimator.evaluate(self._input_fn)
            self._timer.update_last_triggered_step(self._iter_count)
        self._iter_count += 1

You can attach it as a hook whenever you run Estimator.train().
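
A minimal usage sketch (the model_fn, params, input functions, and model directory below are placeholders, not from the original comment):

MODEL_DIR = "/tmp/my_model"  # hypothetical checkpoint directory

estimator = tf.estimator.Estimator(
    model_fn=model_fn, params=params, model_dir=MODEL_DIR)

validation_hook = ValidationHook(
    model_fn=model_fn,
    params=params,
    input_fn=eval_input_fn,
    checkpoint_dir=MODEL_DIR,   # same directory, so it evaluates the latest checkpoint
    every_n_steps=1000)

estimator.train(
    input_fn=train_input_fn,
    steps=20000,
    hooks=[validation_hook])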

Take a look at this example: https://stackoverflow.com/questions/46326848/early-stopping-with-experiment-tensorflow

def experiment_fn(run_config):
    estimator = tf.estimator.Estimator(...)

    validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(
        early_stopping_metric="loss",
    )

    return tf.contrib.learn.Experiment(
        estimator=estimator,
        train_input_fn=train_input_fn,
        eval_input_fn=eval_input_fn,
        train_monitors=[validation_monitor])

ex = learn_runner.run(
    experiment_fn=experiment_fn,
)

@agniszczotka @alyaxey Using Experiment works and enables me to run validation along with training. However, I’ve found that the batch size is apparently encoded as a constant instead of a symbolic tensor for the input node, even though it is coded as a reshape node with a variable batch size (i.e., tf.reshape(features["x"], [-1, …])). As a result, in the Android code I have to allocate an array whose size matches the batch size to store the output (i.e., for fetch()).


@agniszczotka Thanks for your help. When I implement your suggestion, I get the following error:

File ".../anaconda2/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 253, in train
    if (config.environment != run_config.Environment.LOCAL and
AttributeError: 'RunConfig' object has no attribute 'environment'

Any idea on how to get around it?

Yes. All Monitors are deprecated. Not all of them have a direct equivalent, but there should be hooks for the main use cases. Except ValidationMonitor, as of today.

I’m also following this tutorial and having problems with it. I’m using the latest 1.0.1 release.

Is there any working example for these monitors: CaptureVariable, PrintTensor, ValidationMonitor?

@lienhua34 Yes, that’s correct. The interface has been sealed recently. You’re welcome to submit a pull request! @martinwicke Does the team have any plan to rewrite the Monitors tutorial using Hooks?