wandb: [App] Table not updating at each call of log

I have some code in which I want to log some metadata at each step. I thought I could log this metadata in a Table; unfortunately, the table is not updated in the app after the first call to log.

My code is something like

run = wandb.init(**config)
with run:
    metadata = ["I love my phone", "1", "1"]
    table = wandb.Table(data=[metadata], columns=["Text", "Predicted Label", "True Label"])
    run.log({"examples": table})
    for step in steps:
        ## some code
        table.add_data(*metadata)
        run.log({"examples": table})

On the app, I can only see the initial table with its data; no rows are added to it. Note that the corresponding artifact is also not updated. Maybe this is due to wandb assigning the same identity to the initial and updated table and not logging it again?

About this issue

  • Original URL
  • State: open
  • Created 3 years ago
  • Reactions: 10
  • Comments: 48 (4 by maintainers)

Most upvoted comments

First of all, I’m sorry on behalf of W&B for not updating this thread in a while.

There are a few options here.

Use W&B History: If you’re just logging text data, not rich media like images, you can use the typical wandb.log({'predicted': 'hello', 'label': 'world'}) API and then use the UI to get a W&B Table.

  • In a workspace, click “Add Panel”
  • Add a Weave panel by clicking Weave
  • Type runs.history.concat to configure it to concatenate all of your history logs
  • You can then configure the columns shown by clicking columns in the bottom right of the Table. This will function as a Table that you can continuously log to, regardless of which run the data comes from.

See Tables Update: Create a new wandb.Table object each iteration and log it to the same key with wandb.log. This should update in the UI as you train.

Compare 2 Tables: If you want to compare logged Tables:

  • Navigate to the Artifacts tab of your project
  • Click an Artifact
  • Click File
  • Click <table_name>.table.json. That should display your Table.

Now you can click compare with a different version of your artifact. You can either concatenate the two Tables so the table shown will have all the rows from both Tables, or join them using one of the columns, so all other columns are merged.

Concatenate Tables: Note: Advanced & may be limited by performance depending on the size of your Table You can create a Weave panel in your workspace and use runs[0].loggedArtifactVersions.map((row, index) => row.file("<table_key>.table.json")) where <table_key> is the key you’ve logged the table to. This will create one big table with all of the rows from your logged Tables (make sure to create a new Table object every iteration). This is using the Weave query language which is still in active development and may change in the future.

Hello! I am also experiencing the same issue. In the first log call, the table is logged, but any future log calls with an updated table (updated via add_data) do not display the updated data in the UI. I also do not find any of the updated rows in the artifacts section.

I am also experiencing this issue.

faced with the same issue, would be nice to have a fix

I am surprised this is still not fixed over a year after being raised.

The docs here are quite misleading, as they give the impression that it’s possible to incrementally log text during training. However, it’s not possible to do that with Weights & Biases at the moment. Having to set up workarounds for this adds an annoying amount of friction when setting up WandB logging for training with LLMs.

+1 I am also facing this issue, for now I am defaulting to logging the tables after training.

+1 on this

A work around I’ve just found is creating a new instance of a Table which contains the same columns and data as the old one, and logging that.

new_table = wandb.Table(
    columns=self.wandb_table.columns, data=self.wandb_table.data
)
self.wandb.log({"predictions": new_table}, commit=False)

Before doing this, it would only create a single json for the table in the RUN_DIR/files/media/table folder, but now it’s making one for each time this command is run.

Maybe this can help narrowing down the problem?

Still facing the same issue.

I suffered from this problem too. Thanks to @amitkparekh’s solution, I can get it to work. I made a small tweak tho: I just log a shallow copy of the table:

from copy import copy
...

my_table.add_data(...)
wandb.log({"my_table": copy(my_table)})

I experienced the same problem.

It turns out that the tables are being updated when I look at the artifacts section - there is a new version under the artifact, and I can view the table by going to Files --> {path_to_table} --> {table_name.json}.

But for some reason it is not updating in the UI.

Same here

Any updates on this 6 months later?

So after the first log, any future logs do nothing. Even after the run finishes, it still does not update.

It’s like for the given instance of table, it will only ever be uploaded once. That’s why my workaround solution was to continue adding rows to the table as described in the help, but I log a brand new instantiation of Table using the columns and data from the old one.

A work around I’ve just found is creating a new instance of a Table which contains the same columns and data as the old one, and logging that.

new_table = wandb.Table(
    columns=self.wandb_table.columns, data=self.wandb_table.data
)
self.wandb.log({"predictions": new_table}, commit=False)

Before doing this, it would only create a single json for the table in the RUN_DIR/files/media/table folder, but now it’s making one for each time this command is run.

Maybe this can help narrowing down the problem?

As of March 14, 2024, this is THE MOST correct trick.

This trick does involve repeated in-memory copies and file writes, but believe me, it’s the best solution “right now”.

Also facing this issue.

This is still a problem. And same with confusion matrices.

Still same issue. Why is it not resolved yet?

Following - i’m facing this same error

Hi! We are still working on this issue - it requires a large refactor of our table-handling logic, which is why it has not been implemented yet. I’ll post back here once I have some good news about the progress of this feature!

Hi @amitkparekh is this one of tables that weren’t updating properly?

Sorry, no it isn’t the same tables.

On my own project, I’ve been having the same issue where after logging an instance of wandb.Table once, all future “wandb.log’s” are ignored.

I am following the same logic outlined above by @oumarkaba: creating a table at the start, then adding data and logging it each epoch.

It seems that after the first time you log a table, logging that same instance, with or without any data changes, does nothing. Looking within the local files for the run, no new table artifacts are created after the first logging, so something in the client-side code may be preventing it from being updated.

Using the workaround, where I just log a newly created wandb.Table instance using the same columns and data from the table within the state, it works properly — creating the local table artifact files and updating the dashboard as expected.


For completeness, I am using allennlp and have extended their WandBCallback. This is the full module of the callback I am using.

  • I’m creating the wandb.Table instance on start
  • I update the table at the end of each epoch
  • I create a new instance of the table and log the new instance, just as in the above workaround, and then discard the new instance to save memory.
import logging
from typing import Any

import wandb
from allennlp.training import GradientDescentTrainer
from allennlp.training.callbacks import TrainerCallback, WandBCallback


logger = logging.getLogger(__name__)


def get_training_stage_from_key(key: str) -> str:
    if key.startswith("training"):
        return "training"
    if key.startswith("validation"):
        return "validation"
    return "unknown"


@TrainerCallback.register("alt_wandb")
class AltWandBCallback(WandBCallback):
    def on_start(
        self,
        trainer: GradientDescentTrainer,
        is_primary: bool = True,
        **kwargs: Any,
    ) -> None:
        if not is_primary:
            return None

        super().on_start(trainer, is_primary, **kwargs)

        # Create prediction table
        columns = ["epoch", "stage", "task", "prediction", "target"]
        self.wandb_table = wandb.Table(columns=columns)

    def on_epoch(
        self,
        trainer: GradientDescentTrainer,
        metrics: dict[str, Any],
        epoch: int,
        is_primary: bool = True,
        **kwargs: Any,
    ) -> None:
        if not is_primary:
            return None

        self._log_predictions_table(metrics, epoch)

        filtered_metrics = {
            name: metric for name, metric in metrics.items() if "__" not in name
        }

        super().on_epoch(trainer, filtered_metrics, epoch, is_primary, **kwargs)

    def _log_predictions_table(self, metrics: dict[str, Any], epoch: int) -> None:
        prediction_metric_keys = {"training__predictions", "validation__predictions"}

        for key in prediction_metric_keys:
            if key not in metrics:
                continue

            stage = get_training_stage_from_key(key)

            for task_prediction in metrics[key].values():
                logger.info(f"Adding data for {task_prediction}")
                self.wandb_table.add_data(epoch, stage, *task_prediction)

        new_table = wandb.Table(
            columns=self.wandb_table.columns, data=self.wandb_table.data
        )

        self.wandb.log({"predictions": new_table}, commit=False)