vision-language-models-are-bows: mismatching results on compositional task

Hi, when I try to reproduce the results on ARO, I can't get the reported scores. The code I run is:

```shell
for dataset in VG_Relation VG_Attribution; do
  for resume in scratch/open_clip/src/Outputs/negclip/checkpoints/epoch_0.pt; do
    python3 main_aro.py --dataset=$dataset --model-name=$model --resume=$resume \
      --batch-size=$bs --device=cuda --download
  done
done
```

and I just got VG_Relation 68.11 and VG_Attribution 42.16, instead of the 81 and 71 reported in Table 6.

About this issue

  • Original URL
  • State: closed
  • Created a year ago
  • Comments: 25 (2 by maintainers)

Most upvoted comments

I think nondeterminism can get pretty tricky in the context of the VG-R dataset. The embeddings can be very close to each other, and minor differences in the embeddings due to nondeterminism can lead to score differences of around 0.002 to 0.005.
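To illustrate the point above, here is a minimal sketch (with made-up 2-D embeddings, not the actual ARO code) of how a nondeterministic perturbation on near-tied embeddings can flip whether an example is scored correct:

```python
import numpy as np

rng = np.random.default_rng(0)

def cosine(a, b):
    return float(a @ b / (np.linalg.norm(a) * np.linalg.norm(b)))

# Hypothetical caption embeddings that point in almost the same direction
# (as the true/false relation captions often do on VG-R), and an image
# embedding only marginally closer to the "true" caption.
true_cap = np.array([1.0, 0.0])
false_cap = np.array([0.999, 0.045])
image = np.array([1.0, 0.02])

# The decision margin between the two captions is tiny.
margin = cosine(image, true_cap) - cosine(image, false_cap)
print(f"margin: {margin:.5f}")

# A perturbation on the order of the score differences discussed above
# (a few 1e-3) can flip the sign of the margin on some runs, changing
# whether this example counts as correct.
flips = 0
for _ in range(1000):
    noisy = image + rng.normal(scale=3e-3, size=2)
    if cosine(noisy, true_cap) - cosine(noisy, false_cap) < 0:
        flips += 1
print(f"flipped on {flips}/1000 noisy runs")
```

Aggregated over a dataset, a handful of such flipped borderline examples is enough to move the reported accuracy by a few tenths of a point between runs.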

Hi, thanks! The CLIP score matches the one reported in the paper.