Otter: dataset size mismatch
As mentioned in the paper, the MIMIC-IT dataset has 2.2M instruction QA pairs. But I have downloaded all the x_instruction.json files
from Hugging Face, and the total number of instruction QA pairs is only 1171k. Am I missing anything?
- VST: 32k image-qa
- LA: 256k image-qa
- SN: 6k image-qa
- SD: 16k image-qa
- CGD: 141k image-qa
- E4D: 527k video-qa
- DC: 56k video-qa
- TVC: 137k video-qa
In short, that is 451k image-qa and 720k video-qa, for 1171k QA pairs in total.
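For anyone who wants to reproduce these totals, here is a rough sketch of how the counting could be done. It assumes that each x_instruction.json file has a top-level "data" object keyed by instruction id, which is not confirmed in this thread, so adjust the key if your files are laid out differently. The function names `count_instructions` and `count_all` are made up for illustration and are not part of the Otter repo. The last lines only re-add the per-subset figures reported above.

```python
import json
from pathlib import Path

# Per-subset counts as reported in this issue, in thousands of QA pairs.
REPORTED_K = {
    "VST": 32, "LA": 256, "SN": 6, "SD": 16, "CGD": 141,  # image-qa
    "E4D": 527, "DC": 56, "TVC": 137,                      # video-qa
}
VIDEO_SUBSETS = {"E4D", "DC", "TVC"}


def count_instructions(path):
    """Count the entries in one instruction file.

    Assumption: the file has a top-level "data" object keyed by instruction id.
    """
    with open(path, encoding="utf-8") as f:
        return len(json.load(f)["data"])


def count_all(directory):
    """Map each subset prefix (e.g. "VST") to the entry count of its file."""
    suffix = "_instruction.json"
    return {
        p.name[: -len(suffix)]: count_instructions(p)
        for p in sorted(Path(directory).glob("*" + suffix))
    }


# Sanity check of the arithmetic on the reported numbers.
image_k = sum(v for k, v in REPORTED_K.items() if k not in VIDEO_SUBSETS)
video_k = sum(v for k, v in REPORTED_K.items() if k in VIDEO_SUBSETS)
total_k = image_k + video_k
print(image_k, video_k, total_k)  # prints: 451 720 1171
```

Calling `count_all` on the folder holding the downloaded files should give per-subset counts that can be compared against the list above.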
About this issue
- State: closed
- Created 7 months ago
- Comments: 24 (7 by maintainers)
The size mismatch is probably because we iteratively cleaned the dataset after submission. We will update the paper later, once the numbers are fully confirmed.
Let me check VST's missing image_ids then.
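A check like this could be sketched roughly as follows. It assumes each instruction entry carries an "image_ids" list and that the images are available as a mapping keyed by image id; both layouts are assumptions for illustration, and `missing_image_ids` is a hypothetical helper, not something from the repo.

```python
def missing_image_ids(instructions, images):
    """Return the image ids that instructions reference but images lacks.

    Assumptions: `instructions` maps instruction id -> entry dict with an
    "image_ids" list; `images` maps image id -> image payload.
    """
    missing = set()
    for entry in instructions.values():
        for image_id in entry.get("image_ids", []):
            if image_id not in images:
                missing.add(image_id)
    return sorted(missing)


# Toy data with made-up ids, only to show the call.
toy_instructions = {
    "ins_0": {"image_ids": ["img_a", "img_b"]},
    "ins_1": {"image_ids": ["img_c"]},
}
toy_images = {"img_a": "...", "img_c": "..."}
print(missing_image_ids(toy_instructions, toy_images))  # prints: ['img_b']
```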
@Luodian Thanks very very very very much, LA is okay. Now only VST lacks some samples.
Again, thanks for your brilliant work and repo and your help, which teaches me a lot!
This: https://github.com/Luodian/Otter/issues/320
Thanks for your great contribution.
It would be much appreciated if it could be uploaded to Hugging Face. The OneDrive link is too unstable.