coref: Unable to view evaluation results
I want to test and evaluate the performance of this model on the GAP dataset. I am using the Google Colab notebook below to avoid machine-dependency issues: https://colab.research.google.com/drive/1SlERO9Uc9541qv6yH26LJz5IM9j7YVra#scrollTo=H0xPknceFORt
How can I view the results for the evaluation metrics reported in the research paper mentioned above?
Secondly, I tried running ! GPU=0 python evaluate.py $CHOSEN_MODEL in Colab, assuming it would generate the evaluation results, but I get the error below:
...
W0518 20:39:24.163360 140409457641344 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/keras/optimizer_v2/learning_rate_schedule.py:409: div (from tensorflow.python.ops.math_ops) is deprecated and will be removed in a future version.
Instructions for updating:
Deprecated in favor of operator or tf.math.divide.
W0518 20:39:24.184890 140409457641344 deprecation_wrapper.py:119] From /content/coref/optimization.py:64: The name tf.train.AdamOptimizer is deprecated. Please use tf.compat.v1.train.AdamOptimizer instead.
bert:task 199 27
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gradients_util.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
/usr/local/lib/python3.6/dist-packages/tensorflow/python/ops/gradients_util.py:93: UserWarning: Converting sparse IndexedSlices to a dense Tensor of unknown shape. This may consume a large amount of memory.
"Converting sparse IndexedSlices to a dense Tensor of unknown shape. "
2020-05-18 20:39:34.733225: I tensorflow/stream_executor/platform/default/dso_loader.cc:42] Successfully opened dynamic library libcuda.so.1
2020-05-18 20:39:34.735705: E tensorflow/stream_executor/cuda/cuda_driver.cc:318] failed call to cuInit: CUDA_ERROR_NO_DEVICE: no CUDA-capable device is detected
2020-05-18 20:39:34.735782: I tensorflow/stream_executor/cuda/cuda_diagnostics.cc:156] kernel driver does not appear to be running on this host (fcef0d45f3e2): /proc/driver/nvidia/version does not exist
2020-05-18 20:39:34.736239: I tensorflow/core/platform/cpu_feature_guard.cc:142] Your CPU supports instructions that this TensorFlow binary was not compiled to use: AVX2 FMA
2020-05-18 20:39:34.750671: I tensorflow/core/platform/profile_utils/cpu_utils.cc:94] CPU Frequency: 2200160000 Hz
2020-05-18 20:39:34.751024: I tensorflow/compiler/xla/service/service.cc:168] XLA service 0x78bf9c0 executing computations on platform Host. Devices:
2020-05-18 20:39:34.751074: I tensorflow/compiler/xla/service/service.cc:175] StreamExecutor device (0): <undefined>, <undefined>
Restoring from ./spanbert_base/model.max.ckpt
2020-05-18 20:39:40.322311: W tensorflow/compiler/jit/mark_for_compilation_pass.cc:1412] (One-time warning): Not using XLA:CPU for cluster because envvar TF_XLA_FLAGS=--tf_xla_cpu_global_jit was not set. If you want XLA:CPU, either set that envvar, or use experimental_jit_scope to enable XLA:CPU. To confirm that XLA is active, pass --vmodule=xla_compilation_cache=1 (as a proper command-line flag, not via TF_XLA_FLAGS) or set the envvar XLA_FLAGS=--xla_hlo_profile.
W0518 20:39:47.165590 140409457641344 deprecation.py:323] From /usr/local/lib/python3.6/dist-packages/tensorflow/python/training/saver.py:1276: checkpoint_exists (from tensorflow.python.training.checkpoint_management) is deprecated and will be removed in a future version.
Instructions for updating:
Use standard file APIs to check for files with this prefix.
Traceback (most recent call last):
File "evaluate.py", line 26, in <module>
model.evaluate(session, official_stdout=True, eval_mode=True)
File "/content/coref/independent.py", line 538, in evaluate
self.load_eval_data()
File "/content/coref/independent.py", line 532, in load_eval_data
with open(self.config["eval_path"]) as f:
FileNotFoundError: [Errno 2] No such file or directory: './dev.english.384.jsonlines'
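The failure at the bottom of the traceback can be reproduced in isolation: load_eval_data opens whatever file the model config's eval_path names, and the Colab working directory has no dev.english.384.jsonlines (that file is produced by the OntoNotes preprocessing, not by the GAP conversion). A minimal sketch of the failure mode, with a plain dict standing in for the real config from experiments.conf:

```python
# Stand-in for the model config loaded from experiments.conf;
# eval_path names a file that was never generated in this directory.
config = {"eval_path": "./dev.english.384.jsonlines"}

try:
    # This mirrors load_eval_data in independent.py.
    with open(config["eval_path"]) as f:
        eval_examples = [line for line in f]
except FileNotFoundError as e:
    # Raised with errno 2, exactly as in the traceback above.
    print(f"missing eval file: {e.filename}")
```

So the fix is to make eval_path point at a jsonlines file that actually exists in the working directory, rather than at the OntoNotes dev set.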
Any idea what the reason could be? I am new to this area and environment, and am just following the Colab code for now.
I am looking for suggestions or steps for the evaluation part.
Thanks in advance!
About this issue
- State: closed
- Created 4 years ago
- Comments: 16 (5 by maintainers)
I’m not sure I understand. How did you generate the input file for predict.py? Did you not use gap_to_jsonlines.py? If not, that error makes sense. Here’s the full pipeline in case it wasn’t clear:
$1/$gap_file_prefix points to the path of the original GAP file without the tsv prefix.

Sorry, I’m getting to this only now. GAP uses accuracy for evaluation, as opposed to F1 for OntoNotes. If you’ve been able to make predictions, you can convert them to tsv files and call the GAP scorer script.
python to_gap_tsv.py <prediction_jsonline_file> <tsv_output_file> should do the trick.
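Since GAP is scored by accuracy rather than OntoNotes F1, the final scoring step reduces to comparing the predicted coreference decisions for spans A and B against the gold labels. A toy sketch of that comparison follows; the field names and example rows are invented for illustration only, and real scoring should go through the official GAP scorer on the tsv files produced by to_gap_tsv.py:

```python
# Each GAP example asks two yes/no questions: does the pronoun corefer
# with candidate span A, and with candidate span B? Accuracy is the
# fraction of those label decisions the system gets right.
# (Field names and rows below are made up for illustration.)
examples = [
    {"gold": (True, False), "pred": (True, False)},   # both decisions correct
    {"gold": (False, True), "pred": (True, False)},   # both decisions wrong
    {"gold": (False, False), "pred": (False, False)}, # both decisions correct
]

correct = sum(
    (g_a == p_a) + (g_b == p_b)
    for (g_a, g_b), (p_a, p_b) in ((ex["gold"], ex["pred"]) for ex in examples)
)
total = 2 * len(examples)
print(f"accuracy = {correct / total:.3f}")  # 4 of 6 decisions match
```

This is only meant to show what "accuracy as opposed to F1" means here; the official scorer also reports masculine/feminine breakdowns that this sketch omits.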