tensorflow: CSV decode error

I am using TensorFlow 0.9.0 with CUDA driver version 7.5 and cuDNN 4 on Ubuntu 14.04. This may be related to another issue I found here.

I have a simple CSV file containing a single record like this:

"field with
newline",0

where the newline was added by pressing the Enter key in vim on Ubuntu. I am able to read this file in pandas using the read_csv function, which shows the text field as containing a single \n character.

But when I try to read it in TensorFlow, I get the following error: tensorflow.python.framework.errors.InvalidArgumentError: Quoted field has to end with quote followed by delim or end

My TensorFlow code for reading the CSV uses this function to read a single row.

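import tensorflow as tf

def read_single_example(filename_queue, skip_header_lines, record_defaults, feature_index, label_index):
    # TextLineReader emits one physical line of the file per read.
    reader = tf.TextLineReader(skip_header_lines=skip_header_lines)
    key, value = reader.read(filename_queue)
    # Parse that single line into one tensor per CSV column.
    record = tf.decode_csv(
            value,
            record_defaults=record_defaults)
    # Pick out the feature and label columns by position.
    features, label = record[feature_index], record[label_index]
    return features, label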
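
A likely explanation for the error is that the function above hands decode_csv one physical line at a time, so a quoted field containing a newline arrives as two incomplete fragments. The following is only a rough illustration of that effect using Python's standard csv module rather than TensorFlow; the variable names are made up for the example.

```python
import csv
import io

# The sample file from the question: one record whose quoted field spans two lines.
raw = '"field with\nnewline",0\n'

# Splitting on physical lines first (roughly what a line-based reader does)
# yields two fragments, neither of which is a complete CSV record.
physical_lines = raw.splitlines()

# A quote-aware parser that sees the whole stream finds a single record
# whose first field contains an embedded newline.
records = list(csv.reader(io.StringIO(raw, newline="")))

print(physical_lines)
print(records)
```

The first fragment opens a quote that never closes on that line, which is consistent with the "Quoted field has to end with quote" message.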

If I read using pandas and replace all newlines with spaces, the tensorflow code is able to parse the csv successfully.
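
A minimal sketch of that preprocessing step is shown here. It uses Python's standard csv module instead of pandas to stay dependency-free, and the helper name flatten_newlines is hypothetical, not something from the original code.

```python
import csv
import io

def flatten_newlines(src, dst):
    """Copy CSV records from src to dst, replacing newlines inside fields with spaces."""
    writer = csv.writer(dst, lineterminator="\n")
    for row in csv.reader(src):
        # Handle Windows-style line endings first, then bare newlines.
        writer.writerow(
            [field.replace("\r\n", " ").replace("\n", " ") for field in row]
        )

# In-memory stand-ins for the input and output files.
src = io.StringIO('"field with\nnewline",0\n', newline="")
dst = io.StringIO()
flatten_newlines(src, dst)
cleaned = dst.getvalue()
print(repr(cleaned))
```

After this pass every record occupies exactly one physical line, so a line-based reader followed by decode_csv should be able to parse it. The cost is that the original newlines are lost.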

It would be really helpful if newlines could be handled within the TensorFlow CSV pipeline itself.

About this issue

  • State: closed
  • Created 8 years ago
  • Comments: 19 (13 by maintainers)

Most upvoted comments

Here’s the updated link for tf.data.experimental.CsvDataset

because tf.io.decode_csv sometimes will not work at all.

I came to this issue by way of an old Stack Overflow question. For interested parties: this issue has been fixed in tf.data with tf.contrib.data.CsvDataset.
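
For anyone trying this today, a rough sketch of reading the sample file with tf.data.experimental.CsvDataset (the newer name mentioned in the comment above) could look like the following. It assumes TensorFlow 2.x with eager execution; the temporary file path and column types are chosen only for illustration.

```python
import os
import tempfile

import tensorflow as tf  # assumption: TensorFlow 2.x, eager execution enabled

# Recreate the sample file from the question in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "sample.csv")
with open(path, "w", newline="") as f:
    f.write('"field with\nnewline",0\n')

# One string column and one integer column; passing dtypes marks both as required.
dataset = tf.data.experimental.CsvDataset(
    path, record_defaults=[tf.string, tf.int32]
)

# Convert each parsed record to plain Python values for inspection.
rows = [
    (text.numpy().decode("utf-8"), int(label.numpy()))
    for text, label in dataset
]
print(rows)
```

Unlike the TextLineReader plus decode_csv combination, the dataset parses the file itself rather than receiving pre-split lines, which is why the quoted newline can stay inside the field.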