datasets: Command tfds.as_dataframe fails to make dataframe
Short description
When I call tfds.as_dataframe, it gives the error below.
Environment information
- Operating System: Ubuntu 20.04
- Python version: 3.8
- tensorflow-datasets/tfds-nightly version: tfds-nightly v3.2.1.dev202009090105
- tensorflow/tf-nightly version: tensorflow v2.3
- Does the issue still exist with the last tfds-nightly package (pip install --upgrade tfds-nightly)? Yes
Reproduction instructions
import tensorflow.compat.v2 as tf
import tensorflow_datasets as tfds
import pandas as pd

tfds.disable_progress_bar()
tf.enable_v2_behavior()

(ds_train, ds_test), ds_info = tfds.load(
    'mnist',
    split=['train', 'test'],
    shuffle_files=True,
    as_supervised=True,
    with_info=True,
)

def normalize_img(image, label):
    """Normalizes images: `uint8` -> `float32`."""
    return tf.cast(image, tf.float32) / 255., label

ds_test = ds_test.map(
    normalize_img, num_parallel_calls=tf.data.experimental.AUTOTUNE)
ds_test = ds_test.batch(128)
ds_test = ds_test.cache()
ds_test = ds_test.prefetch(tf.data.experimental.AUTOTUNE)

df = tfds.as_dataframe(ds_test.take(10), ds_info)
Link to logs
Traceback (most recent call last):
  File "mnist_test.py", line 31, in <module>
    df = tfds.as_dataframe(ds_test.take(10), ds_info)
  File "/home/ubuntu/miniconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow_datasets/core/as_dataframe.py", line 192, in as_dataframe
    columns = _make_columns(ds.element_spec, ds_info=ds_info)
  File "/home/ubuntu/miniconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow_datasets/core/as_dataframe.py", line 148, in _make_columns
    return [
  File "/home/ubuntu/miniconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow_datasets/core/as_dataframe.py", line 149, in <listcomp>
    ColumnInfo.from_spec(path, ds_info)
  File "/home/ubuntu/miniconda3/envs/tensorflow/lib/python3.8/site-packages/tensorflow_datasets/core/as_dataframe.py", line 61, in from_spec
    name = '/'.join(path)
TypeError: sequence item 0: expected str instance, int found
Expected behavior
The dataset is converted into a pandas DataFrame.
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Reactions: 3
- Comments: 15 (5 by maintainers)
Commits related to this issue
- Add `as_supervised` support for `tfds.as_dataframe` Fix #2476 PiperOrigin-RevId: 333126477 — committed to tensorflow/datasets by Conchylicultor 4 years ago
- Add `as_supervised` support for `tfds.as_dataframe` Fix #2476 PiperOrigin-RevId: 333241467 — committed to tensorflow/datasets by Conchylicultor 4 years ago
- Sync tensorflow/datasets/master with fork (#1) * Add mocking policies * Mock dataset_info file * Minor Changes * Fix imagenet_v2 dataset * CleanUP * clean oxford_flowers102 * Fix `t... — committed to axd465/datasets by axd465 4 years ago
As a local workaround, open this file: C:\Users\elver\PycharmProjects\ds_to_csv\venv\lib\site-packages\tensorflow_datasets\core\as_dataframe.py and cast each element of path to str before joining: name = '/'.join(map(str, path))
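To see why the cast helps, here is a minimal sketch in plain Python, with no TensorFlow involved. The helper name `join_path` is hypothetical and is not part of tensorflow_datasets; it only mirrors the patched expression from the comment above, and the example paths are assumptions about what the library passes in.

```python
def join_path(path):
    """Join path components into a '/'-separated column name.

    Hypothetical helper: it mirrors the patched line
    name = '/'.join(map(str, path)) and is not a tfds function.
    """
    # str.join accepts only strings, so cast every component first.
    return '/'.join(map(str, path))


# A string key, as a dict-style element would give (assumed example).
print(join_path(('image',)))
# An integer position, as a tuple-style element would give (assumed example).
print(join_path((0,)))

# The unpatched expression from the traceback fails on an integer component.
try:
    '/'.join((0,))
except TypeError as err:
    print(err)
```

Editing a file under site-packages is lost on the next reinstall, so this is best treated as a stopgap until a release containing the related commit is installed.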
When as_supervised=True is used, ds_info must be fed to the DataFrame, hence the "str-int" error. If you don't use the as_supervised flag, it's up to you whether to pass ds_info to the DataFrame; there is no error either way. Tested on: TF 2.6.0.
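The integer in the error message most likely comes from the element structure: with as_supervised=True each element is an (image, label) tuple, so its components are addressed by position rather than by feature name. The sketch below illustrates this in plain Python. `flatten_with_paths` is a hypothetical stand-in for the library's internal traversal, and the string leaves stand in for the real tensor specs.

```python
def flatten_with_paths(structure, prefix=()):
    """Yield (path, leaf) pairs for nested dicts, tuples and lists.

    Hypothetical illustration only; tfds uses its own traversal internally.
    """
    if isinstance(structure, dict):
        for key in sorted(structure):
            yield from flatten_with_paths(structure[key], prefix + (key,))
    elif isinstance(structure, (tuple, list)):
        for index, item in enumerate(structure):
            yield from flatten_with_paths(item, prefix + (index,))
    else:
        yield prefix, structure


# Placeholder specs: a dict element (default) and a tuple element
# (as_supervised=True). The leaf strings are stand-ins, not real specs.
unsupervised = {'image': 'image_spec', 'label': 'label_spec'}
supervised = ('image_spec', 'label_spec')

dict_paths = [path for path, _ in flatten_with_paths(unsupervised)]
tuple_paths = [path for path, _ in flatten_with_paths(supervised)]

print(dict_paths)   # [('image',), ('label',)]
print(tuple_paths)  # [(0,), (1,)]
```

Joining a path such as (0,) with '/'.join raises the TypeError shown in the traceback, while the dict paths join cleanly.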