tensorflow: The tutorial "Logging and Monitoring Basics with tf.contrib.learn" has an error.

I tried the code snippet from the section “Customizing the Evaluation Metrics with MetricSpec” of the tutorial Logging and Monitoring Basics with tf.contrib.learn. The snippet is:

validation_metrics = {
    "accuracy":
        tf.contrib.learn.metric_spec.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_accuracy,
            prediction_key=tf.contrib.learn.prediction_key.PredictionKey.
            CLASSES),
    "precision":
        tf.contrib.learn.metric_spec.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_precision,
            prediction_key=tf.contrib.learn.prediction_key.PredictionKey.
            CLASSES),
    "recall":
        tf.contrib.learn.metric_spec.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_recall,
            prediction_key=tf.contrib.learn.prediction_key.PredictionKey.
            CLASSES)
}

My TensorFlow version is r1.0. When I run my program, it prints the following error:

$ python iris.py 
Traceback (most recent call last):
  File "iris.py", line 72, in <module>
    tf.app.run()
  File "/Library/Python/2.7/site-packages/tensorflow/python/platform/app.py", line 44, in run
    _sys.exit(main(_sys.argv[:1] + flags_passthrough))
  File "iris.py", line 24, in main
    "accuracy": tf.contrib.learn.metric_spec.MetricSpec(
AttributeError: 'module' object has no attribute 'metric_spec'

I found that the class tf.contrib.learn.metric_spec.MetricSpec has been renamed to tf.contrib.learn.MetricSpec.

The class tf.contrib.learn.prediction_key.PredictionKey also has been renamed to tf.contrib.learn.PredictionKey.
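
For reference, applying those two renames to the tutorial snippet gives the following (a sketch based only on the renames noted above; everything else is unchanged from the tutorial):

validation_metrics = {
    "accuracy":
        tf.contrib.learn.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_accuracy,
            prediction_key=tf.contrib.learn.PredictionKey.CLASSES),
    "precision":
        tf.contrib.learn.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_precision,
            prediction_key=tf.contrib.learn.PredictionKey.CLASSES),
    "recall":
        tf.contrib.learn.MetricSpec(
            metric_fn=tf.contrib.metrics.streaming_recall,
            prediction_key=tf.contrib.learn.PredictionKey.CLASSES)
}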

About this issue

  • State: closed
  • Created 7 years ago
  • Reactions: 6
  • Comments: 29 (11 by maintainers)

Most upvoted comments

How can I use early_stopping in this environment?

@AxenGitHub I managed to run validation during training by using Experiment; see the docs here: https://www.tensorflow.org/api_docs/python/tf/contrib/learn/Experiment


experiment = tf.contrib.learn.Experiment(
    estimator=estimator,
    train_input_fn=training_input_fn,
    eval_input_fn=eval_input_fn,
    eval_steps=None,
    min_eval_frequency=1)
experiment.train_and_evaluate()

I am not sure how effective it is yet, but it did the job. Could you please share your solution for implementing a validation monitor with hooks? I asked a question on Stack Overflow: https://stackoverflow.com/questions/45417502/validation-during-training-of-estimator?noredirect=1#comment77798445_45417502

Is there any update regarding ValidationMonitor as a hook? The documentation does not seem to have been updated.

No, I did update this tutorial back in December, but haven’t yet switched to use SessionRunHook, as I was waiting on an equivalent canned hook for ValidationMonitor. That’s not yet available, correct?

In the meantime, for an example of applying a SessionRunHook to an Estimator, you can refer to the tf.layers tutorial (https://www.tensorflow.org/tutorials/layers), which covers how to configure a LoggingTensorHook.
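
For context, a rough sketch of the pattern from that tutorial (the tensor name "softmax_tensor", the estimator variable, and the input function here are placeholders, not from this thread):

# Log the value of the graph tensor named "softmax_tensor" every 50 steps.
logging_hook = tf.train.LoggingTensorHook(
    tensors={"probabilities": "softmax_tensor"},
    every_n_iter=50)

# SessionRunHooks are attached via the hooks argument of Estimator.train().
classifier.train(
    input_fn=train_input_fn,
    steps=20000,
    hooks=[logging_hook])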

@Moymix you can implement early stopping by using the continuous_eval_predicate_fn argument, available in tf.contrib.learn.Experiment.continuous_eval_on_train_data. For instance, let’s take a batch size of 10 and an early-stop count of 15. Modifying the example from the TF Layers tutorial for a bigger dataset, the code would look like this:

import numpy as np
import tensorflow as tf

BATCH_SIZE = 10
EARLY_STOP_COUNT = 15

# Model function
def model_fn(features, labels, mode):
  # ...
  eval_metric_ops = {"accuracy": accuracy}
  return tf.estimator.EstimatorSpec(
      mode=mode, loss=loss, eval_metric_ops=eval_metric_ops)

# Early stopping function
accuracy_reg = np.zeros(EARLY_STOP_COUNT)
def early_stopping(eval_results):
  # None argument for the first evaluation
  if not eval_results: 
    return True
  
  accuracy_reg[0 : EARLY_STOP_COUNT - 1] = accuracy_reg[1 : EARLY_STOP_COUNT]
  accuracy_reg[EARLY_STOP_COUNT - 1] = eval_results["accuracy"]
  counts = 0
  for i in range(0, EARLY_STOP_COUNT - 1):
    if accuracy_reg[i + 1] <= accuracy_reg[i]:
      counts += 1
  if counts == EARLY_STOP_COUNT - 1:
    print("\nEarly stopping: %s \n" % accuracy_reg)
    return False
    
  return True

# Main function
def main(unused_argv):
  #...
  estimator = tf.estimator.Estimator(
      model_fn=model_fn)  # model_fn defined above
  # ...
  # Train the model 
  train_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"data": train_data},
    y=train_labels,
    batch_size=BATCH_SIZE,
    num_epochs=None, # Continue until training steps are finished
    shuffle=True
    )
  eval_input_fn = tf.estimator.inputs.numpy_input_fn(
    x={"data": validate_data},
    y=validate_labels,
    batch_size=BATCH_SIZE,
    num_epochs=1, 
    shuffle=False
    )
  experiment = tf.contrib.learn.Experiment(
    estimator=estimator,
    train_input_fn=train_input_fn,
    eval_input_fn=eval_input_fn,
    train_steps=80000,
    eval_steps=None, # evaluate runs until input is exhausted
    eval_delay_secs=180, 
    train_steps_per_iteration=1000
    )
  experiment.continuous_train_and_eval(
    continuous_eval_predicate_fn=early_stopping)  
  
  # ...

However, keep in mind that continuous_eval_predicate_fn is experimental, so it could change at any moment.

I am in the same boat as @agniszczotka. I have successfully used a SummarySaverHook to write some stats to file and display them on TensorBoard, but I am wondering how I can track the accuracy improvement during training. Should I call estimator.evaluate with different "steps" parameters to evaluate the accuracy at different moments/checkpoints? Specifically, I am trying to replicate this: https://www.tensorflow.org/versions/r1.3/get_started/monitors#evaluating_every_n_steps
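
For context, the pattern from that page looks roughly like this (a sketch; x_train/y_train, x_test/y_test, feature_columns, and the model directory are placeholders from that tutorial, not from this thread):

validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(
    x_test,                       # validation features (numpy array)
    y_test,                       # validation labels (numpy array)
    every_n_steps=50)             # evaluate every 50 training steps

classifier = tf.contrib.learn.DNNClassifier(
    feature_columns=feature_columns,
    hidden_units=[10, 20, 10],
    n_classes=3,
    model_dir="/tmp/iris_model",
    # checkpoints must be written often enough for the monitor to pick them up
    config=tf.contrib.learn.RunConfig(save_checkpoints_secs=1))

classifier.fit(x=x_train,
               y=y_train,
               steps=2000,
               monitors=[validation_monitor])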

I’ve created a ValidationHook based on the existing LoggingTensorHook.

import tensorflow as tf


class ValidationHook(tf.train.SessionRunHook):
    """Runs Estimator.evaluate() on a separate input_fn every N seconds or steps."""

    def __init__(self, model_fn, params, input_fn, checkpoint_dir,
                 every_n_secs=None, every_n_steps=None):
        self._iter_count = 0
        # A second Estimator pointed at the training checkpoint directory,
        # so evaluate() always picks up the latest checkpoint.
        self._estimator = tf.estimator.Estimator(
            model_fn=model_fn,
            params=params,
            model_dir=checkpoint_dir
        )
        self._input_fn = input_fn
        self._timer = tf.train.SecondOrStepTimer(every_n_secs, every_n_steps)
        self._should_trigger = False

    def begin(self):
        self._timer.reset()
        self._iter_count = 0

    def before_run(self, run_context):
        self._should_trigger = self._timer.should_trigger_for_step(self._iter_count)

    def after_run(self, run_context, run_values):
        # Evaluate only when enough steps/seconds have passed since the last trigger.
        if self._should_trigger:
            self._estimator.evaluate(self._input_fn)
            self._timer.update_last_triggered_step(self._iter_count)
        self._iter_count += 1

You can attach it as a hook whenever you run Estimator.train().
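
A minimal usage sketch (the model_fn, params, input functions, and model directory below are placeholders, not from the original comment):

MODEL_DIR = "/tmp/my_model"  # hypothetical checkpoint directory

estimator = tf.estimator.Estimator(
    model_fn=model_fn, params=params, model_dir=MODEL_DIR)

validation_hook = ValidationHook(
    model_fn=model_fn,
    params=params,
    input_fn=eval_input_fn,
    checkpoint_dir=MODEL_DIR,   # same directory, so it evaluates the latest checkpoint
    every_n_steps=1000)

estimator.train(
    input_fn=train_input_fn,
    steps=20000,
    hooks=[validation_hook])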

Take a look at this example: https://stackoverflow.com/questions/46326848/early-stopping-with-experiment-tensorflow

def experiment_fn(run_config):
    estimator = tf.estimator.Estimator(...)

    validation_monitor = tf.contrib.learn.monitors.ValidationMonitor(
        early_stopping_metric="loss",
    )

    return tf.contrib.learn.Experiment(
        estimator=estimator,
        train_input_fn=train_input_fn,
        eval_input_fn=eval_input_fn,
        train_monitors=[validation_monitor])

ex = learn_runner.run(
    experiment_fn=experiment_fn,
)

@agniszczotka @alyaxey Using Experiment works and enables me to run validation along with training. However, I’ve found that the batch size is apparently encoded as a constant instead of a symbolic tensor for the input node, even though it is coded as a reshape node with a variable batch size (i.e., tf.reshape(features["x"], [-1, …])). As a result, in the Android code I have to allocate an array whose size matches the batch size to store the output (i.e., for fetch()).


@agniszczotka Thanks for your help. When I implement your suggestion, I get the following error:

File ".../anaconda2/lib/python2.7/site-packages/tensorflow/contrib/learn/python/learn/experiment.py", line 253, in train
    if (config.environment != run_config.Environment.LOCAL and
AttributeError: 'RunConfig' object has no attribute 'environment'

Any idea on how to get around it?

Yes. All Monitors are deprecated. Not all of them have a direct equivalent, but there should be hooks for the main use cases. Except ValidationMonitor, as of today.

I’m also following this tutorial and having problems with it. I’m using the latest 1.0.1 release.

Is there any working example for these monitors: CaptureVariable, PrintTensor, ValidationMonitor?

@lienhua34 Yes, that’s correct. The interface has been sealed recently. You’re welcome to submit a pull request! @martinwicke Does the team have any plan to rewrite the Monitors tutorial using Hooks?