MONAI: UNet Training Error: Size of Tensors Mismatched

I’m currently experiencing a tensor size mismatch while trying to train a UNet with BraTS 2018 data.

I’m working off of the spleen example, which has been very helpful, but I’ve been unable to complete training. I’ve referred to issues #418 and #323, but am still stuck.

My code is as follows:

Data set and Transforms

# Read the lists of T1 image and segmentation label file paths (one path per line)
text_t1 = open(r'C:\Users\jilli\Documents\MF-MRI\BraTS 2018 Training Data\Training\filename_t1.txt', 'r')
train_images = text_t1.read().split('\n')

text_segs = open(r'C:\Users\jilli\Documents\MF-MRI\BraTS 2018 Training Data\Training\filename_seg.txt', 'r')
train_labels = text_segs.read().split('\n')

# Pair each image with its label and hold out the last 9 cases for validation
data_dicts = [{'image': image_name, 'label': label_name}
              for image_name, label_name in zip(train_images, train_labels)]
train_files, val_files = data_dicts[:-9], data_dicts[-9:]

train_transforms = Compose([
    LoadNiftid(keys=['image', 'label']),
    AddChanneld(keys=['image', 'label']),
    Spacingd(keys=['image', 'label'], pixdim=(1.5, 1.5, 2.), mode=('bilinear', 'nearest')),
    Orientationd(keys=['image', 'label'], axcodes='RAS'),
    ScaleIntensityRanged(keys=['label'], a_min=0, a_max=4, b_min=0.0, b_max=1.0, clip=True),
    #CropForegroundd(keys=['image', 'label'], source_key='image'),
    ToTensord(keys=['image', 'label'])
])
val_transforms = Compose([
    LoadNiftid(keys=['image', 'label']),
    AddChanneld(keys=['image', 'label']),
    Spacingd(keys=['image', 'label'], pixdim=(1.5, 1.5, 2.), mode=('bilinear', 'nearest')),
    Orientationd(keys=['image', 'label'], axcodes='RAS'),
    ScaleIntensityRanged(keys=['label'], a_min=0, a_max=4, b_min=0.0, b_max=1.0, clip=True),
    #CropForegroundd(keys=['image', 'label'], source_key='image'),
    ToTensord(keys=['image', 'label'])
])

Cache Dataset

train_ds = monai.data.CacheDataset(
    data=train_files, transform=train_transforms, cache_rate=1.0, num_workers=0
)
# train_ds = monai.data.Dataset(data=train_files, transform=train_transforms)

# batch_size=2 loads two whole volumes per iteration
# (the RandCropByPosNegLabeld crop from the spleen example is not used in the transforms above)
train_loader = monai.data.DataLoader(train_ds, batch_size=2, shuffle=True, num_workers=0, multiprocessing_context=None)

val_ds = monai.data.CacheDataset(
    data=val_files, transform=val_transforms, cache_rate=1.0, num_workers=0
)
# val_ds = monai.data.Dataset(data=val_files, transform=val_transforms)
val_loader = monai.data.DataLoader(val_ds, batch_size=1, num_workers=0, multiprocessing_context=None)

Training

device = torch.device('cpu')
model = monai.networks.nets.UNet(dimensions=3, in_channels=1, out_channels=2, channels=(16, 32, 64, 128, 256),
                                 strides=(2, 2, 2, 2), num_res_units=2, norm=Norm.BATCH).to(device)
loss_function = monai.losses.DiceLoss(to_onehot_y=True, softmax=True)
optimizer = torch.optim.Adam(model.parameters(), 1e-4)

val_interval = 2
best_metric = -1
best_metric_epoch = -1
epoch_loss_values = list()
metric_values = list()
for epoch in range(600):
    print('-' * 10)
    print('Epoch {}/{}'.format(epoch + 1, 600))
    model.train()
    epoch_loss = 0
    step = 0
    for batch_data in train_loader:
        step += 1
        inputs, labels = batch_data['image'].to(device), batch_data['label'].to(device)
        optimizer.zero_grad()
        outputs = model(inputs)
        loss = loss_function(outputs, labels)
        loss.backward()
        optimizer.step()
        epoch_loss += loss.item()
        print('{}/{}, train_loss: {:.4f}'.format(step, len(train_ds) // train_loader.batch_size, loss.item()))
    epoch_loss /= step
    epoch_loss_values.append(epoch_loss)
    print('epoch {} average loss: {:.4f}'.format(epoch + 1, epoch_loss))

    if (epoch + 1) % val_interval == 0:
        model.eval()
        with torch.no_grad():
            metric_sum = 0.
            metric_count = 0
            for val_data in val_loader:
                val_inputs, val_labels = val_data['image'].to(device), val_data['label'].to(device)
                roi_size = (160, 160, 160)
                sw_batch_size = 4
                val_outputs = sliding_window_inference(val_inputs, roi_size, sw_batch_size, model)
                value = compute_meandice(y_pred=val_outputs, y=val_labels, include_background=False,
                                         to_onehot_y=True, mutually_exclusive=True)
                metric_count += len(value)
                metric_sum += value.sum().item()
            metric = metric_sum / metric_count
            metric_values.append(metric)
            if metric > best_metric:
                best_metric = metric
                best_metric_epoch = epoch + 1
                torch.save(model.state_dict(), 'best_metric_model.pth')
                print('saved new best metric model')
            print('current epoch {} current mean dice: {:.4f} best mean dice: {:.4f} at epoch {}'.format(
                epoch + 1, metric, best_metric, best_metric_epoch))

I get the following error:

  File "<ipython-input-20-6831040c0cf9>", line 41, in <module>
    outputs = model(inputs)
  File "C:\Users\jilli\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\jilli\AppData\Roaming\Python\Python37\site-packages\monai\networks\nets\unet.py", line 128, in forward
    x = self.model(x)
  File "C:\Users\jilli\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\jilli\Anaconda3\lib\site-packages\torch\nn\modules\container.py", line 100, in forward
    input = module(input)
  File "C:\Users\jilli\Anaconda3\lib\site-packages\torch\nn\modules\module.py", line 550, in __call__
    result = self.forward(*input, **kwargs)
  File "C:\Users\jilli\AppData\Roaming\Python\Python37\site-packages\monai\networks\layers\simplelayers.py", line 33, in forward
    return torch.cat([x, self.submodule(x)], self.cat_dim)

RuntimeError: Sizes of tensors must match except in dimension 1. Got 39 and 40 in dimension 4

Please note that image.shape = label.shape = (160, 160, 78).
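A likely source of those specific numbers (an assumption based on the shape above, consistent with the divisibility advice in the comments below): with strides=(2, 2, 2, 2) the UNet halves each spatial dimension four times on the way down (rounding up) and exactly doubles it on the way back up, so a dimension that is not divisible by 16 comes back at a slightly different size and the skip-connection concatenation fails. A rough sketch of that arithmetic for the depth of 78:

import math

# Illustrative size trace (an approximation, not taken from the MONAI source):
# each stride-2 level roughly halves the size going down and exactly doubles it
# coming back up, so the top skip connection sees 39 from the encoder and 40
# from the decoder.
size = 78
down = [size]
for _ in range(4):
    size = math.ceil(size / 2)
    down.append(size)
print(down)  # [78, 39, 20, 10, 5]

up = [down[-1]]
for _ in range(4):
    up.append(up[-1] * 2)
print(up)    # [5, 10, 20, 40, 80] -- the 40 must be concatenated with the 39 above

# With a depth of 80 (divisible by 16) both traces line up: 80, 40, 20, 10, 5.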

My current setup:

  • OS: Windows 10
  • MONAI version: 0.2.0rc1+19.ge8c26a2
  • Python version: 3.7.4 (default, Aug 9 2019, 18:34:13) [MSC v.1915 64 bit (AMD64)]
  • Numpy version: 1.19.0
  • Pytorch version: 1.5.0

Any help would be greatly appreciated! Thanks so much.

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 24 (11 by maintainers)

Most upvoted comments

This worked! Thank you. I also had to change out_channels to 1, which I believe makes sense. The image input is a single-channel image and the label is a multi-class, single-channel label, so a single output channel with a loss function with to_onehot_y=True should correctly run training, right?

Thanks again, both of you! I greatly appreciate all your help.

Sorry, I was not verifying it properly. The size should be divisible by 16, so 160, 160, 80 would work. If you want to use an image size of 160, 160, 72 or 168, 168, 80, then change the network to channels=(16, 32, 64, 128), strides=(2, 2, 2) from channels=(16, 32, 64, 128, 256), strides=(2, 2, 2, 2).
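For reference, a minimal sketch of the two options described above, reusing the transform and network names already shown in the question (other transforms from the original pipeline are omitted for brevity):

# Option 1: keep channels=(16, 32, 64, 128, 256), strides=(2, 2, 2, 2) and pad every
# volume so each spatial dimension is divisible by 16, e.g. depth 78 -> 80.
train_transforms = Compose([
    LoadNiftid(keys=['image', 'label']),
    AddChanneld(keys=['image', 'label']),
    Spacingd(keys=['image', 'label'], pixdim=(1.5, 1.5, 2.), mode=('bilinear', 'nearest')),
    Orientationd(keys=['image', 'label'], axcodes='RAS'),
    SpatialPadd(keys=['image', 'label'], spatial_size=(160, 160, 80)),
    ToTensord(keys=['image', 'label'])
])

# Option 2: keep smaller volumes (e.g. 160, 160, 72) and drop one resolution level,
# so spatial sizes only need to be divisible by 8.
model = monai.networks.nets.UNet(dimensions=3, in_channels=1, out_channels=2,
                                 channels=(16, 32, 64, 128), strides=(2, 2, 2),
                                 num_res_units=2, norm=Norm.BATCH).to(device)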

Hi @wyli

So I tried using Resized with spatial_size set to 160, 160, 72, and I get an error very similar to the one before: RuntimeError: Sizes of tensors must match except in dimension 1. Got 9 and 10 in dimension 2

This error also occurs if I set the size to a cube of dimensions 72, 72, 72

If I use SpatialPadd instead to resize to 168, 168, 80, I again get a similar error: RuntimeError: Sizes of tensors must match except in dimension 1. Got 21 and 22 in dimension 2

I have found other people reporting very similar errors, especially when dealing with UNets. They too fix it with a resize, but I’m not sure what’s going wrong here, as it appears from the Check Transforms portion of the code that the size adjustments I’m doing are being applied correctly: image shape: torch.Size([160, 160, 72]), label shape: torch.Size([160, 160, 72])
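A likely explanation, following the divisibility rule from the reply above (an assumption, since the full transform pipeline for these runs is not shown): 72 and 168 are divisible by 8 but not by 16, so those sizes only fit the shallower channels=(16, 32, 64, 128), strides=(2, 2, 2) network; with the original five-level network every spatial dimension must be divisible by 16, which (160, 160, 80) satisfies but (160, 160, 72) and (168, 168, 80) do not. A small hypothetical helper for checking this before training:

# Hypothetical helper (not from the original thread): verify that every spatial
# dimension is divisible by the product of the UNet strides.
def fits_unet(spatial_shape, strides=(2, 2, 2, 2)):
    factor = 1
    for s in strides:
        factor *= s
    return all(dim % factor == 0 for dim in spatial_shape)

print(fits_unet((160, 160, 80)))                     # True  -> works with four stride-2 levels
print(fits_unet((160, 160, 72)))                     # False -> 72 % 16 != 0
print(fits_unet((168, 168, 80)))                     # False -> 168 % 16 != 0
print(fits_unet((160, 160, 72), strides=(2, 2, 2)))  # True  -> works with three stride-2 levels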