pytorch_geometric: IndexError: Encountered an index error. Please ensure that all indices in 'edge_index' point to valid indices in the interval [0, X] (got interval [0, Y])

Hi, I want to do supervised binary classification of a list of labelled graphs (train_graphs):

from torch_geometric.loader import DataLoader

train_loader = DataLoader(
    train_graphs,
    batch_size=1,
    shuffle=True,
)

I get an index error and assume it is caused by the node labelling and missing nodes. In the first failing example, the largest node ID in edge_index is “405”, but the graph has only 400 nodes. In theory, though, DataLoader should be able to deal with missing nodes or graphs of different sizes, right?

def train(loader, model):
    model.train()
    for batch in loader:

        print("Run batch")
        print(batch)
        print(max(batch.edge_index[0]))
        logits = model(x=batch.x,
                       edge_index=batch.edge_index,
                       edge_attr=batch.edge_attr,
                       batch=batch.batch)

Output of print statements:

Run batch
DataBatch(x=[406, 1], edge_index=[2, 20618], edge_attr=[20618, 1], y=[1], num_nodes=406, dataType=[1], batch=[406], ptr=[2])
tensor(405)
Run batch
DataBatch(x=[406, 1], edge_index=[2, 22208], edge_attr=[22208, 1], y=[1], num_nodes=406, dataType=[1], batch=[406], ptr=[2])
tensor(405)
Run batch
DataBatch(x=[400, 1], edge_index=[2, 27101], edge_attr=[27101, 1], y=[1], num_nodes=400, dataType=[1], batch=[400], ptr=[2])
tensor(405)

Error message and traceback:

Traceback (most recent call last):
  File "miniconda3/lib/python3.9/site-packages/torch_geometric/nn/conv/message_passing.py", line 272, in _lift
    return src.index_select(self.node_dim, index)
IndexError: index out of range in self

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "gcnn_run2.py", line 424, in <module>
    main()
  File "gcnn_run2.py", line 326, in main
    train(
  File "gcnn_run2.py", line 185, in train
    logits = model(x=batch.x,
  File "miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "gcnn_run2.py", line 162, in forward
    x = layer(x, edge_index, edge_attr)
  File "miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "miniconda3/lib/python3.9/site-packages/torch_geometric/nn/models/deepgcn.py", line 94, in forward
    h = self.conv(h, *args, **kwargs)
  File "miniconda3/lib/python3.9/site-packages/torch/nn/modules/module.py", line 1190, in _call_impl
    return forward_call(*input, **kwargs)
  File "miniconda3/lib/python3.9/site-packages/torch_geometric/nn/conv/graph_conv.py", line 86, in forward
    out = self.propagate(edge_index, x=x, edge_weight=edge_weight,
  File "miniconda3/lib/python3.9/site-packages/torch_geometric/nn/conv/message_passing.py", line 461, in propagate
    coll_dict = self._collect(self._user_args, edge_index, size,
  File "miniconda3/lib/python3.9/site-packages/torch_geometric/nn/conv/message_passing.py", line 335, in _collect
    data = self._lift(data, edge_index, dim)
  File "miniconda3/lib/python3.9/site-packages/torch_geometric/nn/conv/message_passing.py", line 275, in _lift
    raise IndexError(
IndexError: Encountered an index error. Please ensure that all indices in 'edge_index' point to valid indices in the interval [0, 399] (got interval [0, 405])

Environment:

Python 3.9.12
PyG version: 2.4.0
Torch: 1.13.0
Ubuntu 20.04 LTS

Any ideas what I can do about this? Thanks!

About this issue

  • State: closed
  • Created a year ago
  • Comments: 16 (7 by maintainers)

Most upvoted comments

It looks like your model architecture is wrong, since you are using graph-level pooling readouts as input into your next GNN layer:

        h1 = self.conv1(x, edge_index, edge_attr)
        h1 = global_add_pool(h1, batch)
        h2 = self.conv2(h1, edge_index, edge_attr)
        h2 = global_add_pool(h2, batch)
        h3 = self.conv3(h2, edge_index, edge_attr)
        h3 = global_add_pool(h3, batch)
        h = torch.cat((h1, h2, h3), dim=1)

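I think it should be as follows, keeping the node-level features h as input to the next layer and using the pooled tensors only for the readout: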

        h = self.conv1(x, edge_index, edge_attr)
        h1 = global_add_pool(h, batch)
        h = self.conv2(h, edge_index, edge_attr)
        h2 = global_add_pool(h, batch)
        h = self.conv3(h, edge_index, edge_attr)
        h3 = global_add_pool(h, batch)
        h = torch.cat((h1, h2, h3), dim=1)
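
Why a pooled readout cannot be fed into the next layer can be seen from the tensor shapes alone. Below is a minimal sketch in plain PyTorch. Note that add_pool is a hypothetical stand-in written for this illustration (it sums node rows per graph, as an add readout does) and is not PyG's own implementation; the tiny graph is made up.

```python
import torch

def add_pool(h, batch, num_graphs):
    """Hypothetical stand-in for a graph-level add readout: sums node rows per graph."""
    out = torch.zeros(num_graphs, h.size(1), dtype=h.dtype)
    return out.index_add_(0, batch, h)

# Made-up example: a single graph with 4 nodes and 2 features per node.
h = torch.ones(4, 2)
batch = torch.zeros(4, dtype=torch.long)  # every node belongs to graph 0
edge_index = torch.tensor([[0, 1, 2, 3],
                           [1, 2, 3, 0]])

pooled = add_pool(h, batch, num_graphs=1)
print(h.shape)       # one row per node
print(pooled.shape)  # one row per graph

# edge_index still refers to nodes 0..3, but the pooled tensor has a single
# row, so it cannot act as the node feature matrix of the next layer.
pooled_is_valid_node_input = int(edge_index.max()) < pooled.size(0)
h_is_valid_node_input = int(edge_index.max()) < h.size(0)
print(pooled_is_valid_node_input, h_is_valid_node_input)
```

In the corrected forward pass, h keeps one row per node throughout, and only h1, h2 and h3 have one row per graph.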

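You need to fix your data for this: in the failing graph, edge_index refers to node 405, but the graph has only 400 nodes. Calling data.validate() on each graph should give you additional information.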
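
To locate the offending graphs before training, a hand-rolled range check along these lines might help. This is only a sketch: find_invalid_graphs is a hypothetical helper, not part of PyG, and the two stand-in graphs are made up. It assumes each graph object exposes edge_index and num_nodes, as the DataBatch output above suggests.

```python
from types import SimpleNamespace

import torch

def find_invalid_graphs(graphs):
    """Return (position, max_index, num_nodes) for graphs whose edge_index is out of range."""
    bad = []
    for i, g in enumerate(graphs):
        if g.edge_index.numel() == 0:
            continue  # no edges, nothing to check
        lo, hi = int(g.edge_index.min()), int(g.edge_index.max())
        if lo < 0 or hi >= g.num_nodes:
            bad.append((i, hi, g.num_nodes))
    return bad

# Made-up stand-ins that carry only the two attributes the check needs.
ok = SimpleNamespace(edge_index=torch.tensor([[0, 1], [1, 2]]), num_nodes=3)
broken = SimpleNamespace(edge_index=torch.tensor([[0, 5], [5, 0]]), num_nodes=4)

bad = find_invalid_graphs([ok, broken])
print(bad)
```

Running the same function over train_graphs would list every graph that needs fixing, rather than stopping at the first one that happens to be drawn by the shuffled loader.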

edge_index must always point to valid row indices in the feature matrix x. As you said, if you want to keep the original node IDs, you can store them in a separate attribute, e.g., data.node_id, and use that for any mapping-related work.
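
One possible way to do this relabelling is sketched below in plain PyTorch. The helper relabel_nodes is hypothetical and the IDs are made up. It assumes every node appears in at least one edge; isolated nodes would need separate handling, and the rows of x have to be ordered to match node_id.

```python
import torch

def relabel_nodes(edge_index):
    """Map arbitrary node IDs in edge_index to the contiguous range 0..N-1.

    Returns the relabelled edge_index and node_id, where node_id[i] is the
    original ID of the node that now sits at row i.
    """
    node_id, inverse = torch.unique(edge_index, sorted=True, return_inverse=True)
    return inverse.view_as(edge_index), node_id

# Made-up example: three nodes with the non-contiguous IDs 2, 7 and 405.
raw_edge_index = torch.tensor([[2, 7, 405],
                               [7, 405, 2]])

edge_index, node_id = relabel_nodes(raw_edge_index)
print(edge_index)  # indices now lie in [0, 2]
print(node_id)     # original IDs, one per row of the feature matrix
```

The resulting node_id tensor can then be stored on the graph (e.g., as data.node_id) so that predictions can be mapped back to the original IDs.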