mmaction2: What does this training log indicate? And when training SlowFast on new data for a custom activity, is there a minimum sample size to start with?

I prepared a small custom dataset in AVA format for two activities, sweeping and walking, then trained SlowFast for 50 epochs with clip_len=16 (due to hardware limitations). The training log (JSON) is shared below. It looks like the model is not learning anything, because mAP is consistently 0 for all epochs. What could be the possible reasons?

Compiler: 10.2\nMMAction2: 0.12.0+13f42bf", "seed": null, "config_name": "custom_slowfast.py", "work_dir": "slowfast_kinetics_pretrained_r50_4x16x1_20e_ava_rgb_clean-data_new_e80", "hook_msgs": {}}

{"mode": "train", "epoch": 1, "iter": 20, "lr": 0.0562, "memory": 8197, "data_time": 0.18563, "loss_action_cls": 0.16409, "recall@thr=0.5": 0.71278, "prec@thr=0.5": 0.67664, "recall@top3": 0.90636, "prec@top3": 0.30212, "recall@top5": 0.91545, "prec@top5": 0.18309, "loss": 0.16409, "grad_norm": 0.91884, "time": 0.99759}
{"mode": "val", "epoch": 1, "iter": 22, "lr": 0.0598, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 2, "iter": 20, "lr": 0.0958, "memory": 8197, "data_time": 0.1842, "loss_action_cls": 0.10098, "recall@thr=0.5": 0.75593, "prec@thr=0.5": 0.74255, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.10098, "grad_norm": 0.34649, "time": 0.98014}
{"mode": "val", "epoch": 2, "iter": 22, "lr": 0.0994, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 3, "iter": 20, "lr": 0.1354, "memory": 8197, "data_time": 0.18966, "loss_action_cls": 0.10026, "recall@thr=0.5": 0.77377, "prec@thr=0.5": 0.77127, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.10026, "grad_norm": 0.30035, "time": 0.99118}
{"mode": "val", "epoch": 3, "iter": 22, "lr": 0.139, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 4, "iter": 20, "lr": 0.175, "memory": 8197, "data_time": 0.18845, "loss_action_cls": 0.12424, "recall@thr=0.5": 0.79485, "prec@thr=0.5": 0.78929, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.12424, "grad_norm": 0.19094, "time": 0.99367}
{"mode": "val", "epoch": 4, "iter": 22, "lr": 0.1786, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 5, "iter": 20, "lr": 0.2146, "memory": 8197, "data_time": 0.18817, "loss_action_cls": 0.11159, "recall@thr=0.5": 0.79285, "prec@thr=0.5": 0.77159, "recall@top3": 0.99545, "prec@top3": 0.33182, "recall@top5": 0.99545, "prec@top5": 0.19909, "loss": 0.11159, "grad_norm": 0.16631, "time": 0.99733}
{"mode": "val", "epoch": 5, "iter": 22, "lr": 0.2182, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 6, "iter": 20, "lr": 0.22, "memory": 8197, "data_time": 0.18938, "loss_action_cls": 0.11952, "recall@thr=0.5": 0.735, "prec@thr=0.5": 0.73273, "recall@top3": 0.98, "prec@top3": 0.32667, "recall@top5": 0.98, "prec@top5": 0.196, "loss": 0.11952, "grad_norm": 0.26395, "time": 0.99816}
{"mode": "val", "epoch": 6, "iter": 22, "lr": 0.22, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 7, "iter": 20, "lr": 0.22, "memory": 8197, "data_time": 0.19043, "loss_action_cls": 0.11324, "recall@thr=0.5": 0.82705, "prec@thr=0.5": 0.82227, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.11324, "grad_norm": 0.1336, "time": 0.9999}
{"mode": "val", "epoch": 7, "iter": 22, "lr": 0.22, "mAP@0.5IOU": 0.0}
{"mode": "train", "epoch": 8, "iter": 20, "lr": 0.22, "memory": 8197, "data_time": 0.18619, "loss_action_cls": 0.08463, "recall@thr=0.5": 0.82482, "prec@thr=0.5": 0.81927, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08463, "grad_norm": 0.11848, "time": 0.99716}
{"mode": "val", "epoch": 8, "iter": 22, "lr": 0.22, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 9, "iter": 20, "lr": 0.22, "memory": 8197, "data_time": 0.18562, "loss_action_cls": 0.09073, "recall@thr=0.5": 0.77285, "prec@thr=0.5": 0.77035, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.09073, "grad_norm": 0.12449, "time": 0.99849}
{"mode": "val", "epoch": 9, "iter": 22, "lr": 0.22, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 10, "iter": 20, "lr": 0.22, "memory": 8197, "data_time": 0.18366, "loss_action_cls": 0.09193, "recall@thr=0.5": 0.81924, "prec@thr=0.5": 0.81369, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.09193, "grad_norm": 0.09078, "time": 0.99763}
{"mode": "val", "epoch": 10, "iter": 22, "lr": 0.22, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 11, "iter": 20, "lr": 0.022, "memory": 8197, "data_time": 0.18933, "loss_action_cls": 0.09355, "recall@thr=0.5": 0.84336, "prec@thr=0.5": 0.84086, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.09355, "grad_norm": 0.08913, "time": 1.00207}
{"mode": "val", "epoch": 11, "iter": 22, "lr": 0.022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 12, "iter": 20, "lr": 0.022, "memory": 8197, "data_time": 0.18655, "loss_action_cls": 0.09352, "recall@thr=0.5": 0.84199, "prec@thr=0.5": 0.83949, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.09352, "grad_norm": 0.09578, "time": 0.99861}
{"mode": "val", "epoch": 12, "iter": 22, "lr": 0.022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 13, "iter": 20, "lr": 0.022, "memory": 8197, "data_time": 0.18258, "loss_action_cls": 0.09836, "recall@thr=0.5": 0.86856, "prec@thr=0.5": 0.86856, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.09836, "grad_norm": 0.07878, "time": 0.99762}
{"mode": "val", "epoch": 13, "iter": 22, "lr": 0.022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 14, "iter": 20, "lr": 0.022, "memory": 8197, "data_time": 0.18307, "loss_action_cls": 0.08192, "recall@thr=0.5": 0.86619, "prec@thr=0.5": 0.86619, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08192, "grad_norm": 0.07241, "time": 0.99841}
{"mode": "val", "epoch": 14, "iter": 22, "lr": 0.022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 15, "iter": 20, "lr": 0.022, "memory": 8197, "data_time": 0.18555, "loss_action_cls": 0.07062, "recall@thr=0.5": 0.84995, "prec@thr=0.5": 0.84995, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.07062, "grad_norm": 0.07792, "time": 0.99924}
{"mode": "val", "epoch": 15, "iter": 22, "lr": 0.022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 16, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18864, "loss_action_cls": 0.08495, "recall@thr=0.5": 0.86629, "prec@thr=0.5": 0.86629, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08495, "grad_norm": 0.08121, "time": 1.00141}
{"mode": "val", "epoch": 16, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 17, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18965, "loss_action_cls": 0.11092, "recall@thr=0.5": 0.8503, "prec@thr=0.5": 0.8503, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.11092, "grad_norm": 0.06323, "time": 1.00582}
{"mode": "val", "epoch": 17, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 18, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18077, "loss_action_cls": 0.08457, "recall@thr=0.5": 0.85369, "prec@thr=0.5": 0.85369, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08457, "grad_norm": 0.06237, "time": 0.9956}
{"mode": "val", "epoch": 18, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 19, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18342, "loss_action_cls": 0.08996, "recall@thr=0.5": 0.84434, "prec@thr=0.5": 0.84226, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08996, "grad_norm": 0.07551, "time": 0.99802}
{"mode": "val", "epoch": 19, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 20, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18127, "loss_action_cls": 0.08211, "recall@thr=0.5": 0.85747, "prec@thr=0.5": 0.85747, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08211, "grad_norm": 0.06186, "time": 0.99498}
{"mode": "val", "epoch": 20, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 21, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18135, "loss_action_cls": 0.0857, "recall@thr=0.5": 0.84931, "prec@thr=0.5": 0.84931, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.0857, "grad_norm": 0.07136, "time": 0.995}
{"mode": "val", "epoch": 21, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 22, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18529, "loss_action_cls": 0.08998, "recall@thr=0.5": 0.86644, "prec@thr=0.5": 0.86208, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08998, "grad_norm": 0.07752, "time": 0.99948}
{"mode": "val", "epoch": 22, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 23, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18675, "loss_action_cls": 0.07464, "recall@thr=0.5": 0.84141, "prec@thr=0.5": 0.84141, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.07464, "grad_norm": 0.07109, "time": 1.02437}
{"mode": "val", "epoch": 23, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 24, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.19255, "loss_action_cls": 0.09615, "recall@thr=0.5": 0.87189, "prec@thr=0.5": 0.87189, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.09615, "grad_norm": 0.06948, "time": 1.00467}
{"mode": "val", "epoch": 24, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 25, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18252, "loss_action_cls": 0.0939, "recall@thr=0.5": 0.86088, "prec@thr=0.5": 0.86088, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.0939, "grad_norm": 0.06941, "time": 0.99516}
{"mode": "val", "epoch": 25, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 26, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18245, "loss_action_cls": 0.09089, "recall@thr=0.5": 0.84902, "prec@thr=0.5": 0.84901, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.09089, "grad_norm": 0.05622, "time": 0.99528}
{"mode": "val", "epoch": 26, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 27, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18309, "loss_action_cls": 0.0874, "recall@thr=0.5": 0.87808, "prec@thr=0.5": 0.87808, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.0874, "grad_norm": 0.06894, "time": 0.99701}
{"mode": "val", "epoch": 27, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 28, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18577, "loss_action_cls": 0.08544, "recall@thr=0.5": 0.84664, "prec@thr=0.5": 0.84437, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08544, "grad_norm": 0.07643, "time": 0.99881}
{"mode": "val", "epoch": 28, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 29, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18908, "loss_action_cls": 0.10787, "recall@thr=0.5": 0.87369, "prec@thr=0.5": 0.87141, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.10787, "grad_norm": 0.05707, "time": 1.00178}
{"mode": "val", "epoch": 29, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 30, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18647, "loss_action_cls": 0.0934, "recall@thr=0.5": 0.8727, "prec@thr=0.5": 0.87042, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.0934, "grad_norm": 0.05735, "time": 0.99853}
{"mode": "val", "epoch": 30, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 31, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18154, "loss_action_cls": 0.07874, "recall@thr=0.5": 0.85874, "prec@thr=0.5": 0.85874, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.07874, "grad_norm": 0.06633, "time": 0.99413}
{"mode": "val", "epoch": 31, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 32, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18083, "loss_action_cls": 0.07918, "recall@thr=0.5": 0.86742, "prec@thr=0.5": 0.86492, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.07918, "grad_norm": 0.06247, "time": 0.9932}
{"mode": "val", "epoch": 32, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 33, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18088, "loss_action_cls": 0.08861, "recall@thr=0.5": 0.86927, "prec@thr=0.5": 0.86735, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08861, "grad_norm": 0.07271, "time": 0.99552}
{"mode": "val", "epoch": 33, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 34, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.1886, "loss_action_cls": 0.09317, "recall@thr=0.5": 0.86667, "prec@thr=0.5": 0.86667, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.09317, "grad_norm": 0.06294, "time": 1.00273}
{"mode": "val", "epoch": 34, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 35, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18746, "loss_action_cls": 0.089, "recall@thr=0.5": 0.87669, "prec@thr=0.5": 0.87669, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.089, "grad_norm": 0.06243, "time": 0.99921}
{"mode": "val", "epoch": 35, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 36, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18179, "loss_action_cls": 0.07702, "recall@thr=0.5": 0.86391, "prec@thr=0.5": 0.86391, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.07702, "grad_norm": 0.07411, "time": 0.99609}
{"mode": "val", "epoch": 36, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 37, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18797, "loss_action_cls": 0.08872, "recall@thr=0.5": 0.86088, "prec@thr=0.5": 0.86088, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08872, "grad_norm": 0.07458, "time": 0.99985}
{"mode": "val", "epoch": 37, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 38, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18704, "loss_action_cls": 0.08762, "recall@thr=0.5": 0.87121, "prec@thr=0.5": 0.86843, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08762, "grad_norm": 0.06538, "time": 0.99896}
{"mode": "val", "epoch": 38, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 39, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18852, "loss_action_cls": 0.08822, "recall@thr=0.5": 0.85919, "prec@thr=0.5": 0.85919, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08822, "grad_norm": 0.07977, "time": 1.0016}
{"mode": "val", "epoch": 39, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 40, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18234, "loss_action_cls": 0.09024, "recall@thr=0.5": 0.85601, "prec@thr=0.5": 0.85601, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.09024, "grad_norm": 0.06097, "time": 0.99434}
{"mode": "val", "epoch": 40, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 41, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18165, "loss_action_cls": 0.09851, "recall@thr=0.5": 0.84987, "prec@thr=0.5": 0.84737, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.09851, "grad_norm": 0.06554, "time": 0.99627}
{"mode": "val", "epoch": 41, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 42, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18597, "loss_action_cls": 0.10595, "recall@thr=0.5": 0.87117, "prec@thr=0.5": 0.87117, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.10595, "grad_norm": 0.05842, "time": 0.99769}
{"mode": "val", "epoch": 42, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 43, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.1856, "loss_action_cls": 0.08387, "recall@thr=0.5": 0.86939, "prec@thr=0.5": 0.86939, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08387, "grad_norm": 0.06906, "time": 1.00146}
{"mode": "val", "epoch": 43, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 44, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18118, "loss_action_cls": 0.08536, "recall@thr=0.5": 0.85187, "prec@thr=0.5": 0.85187, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.08536, "grad_norm": 0.0665, "time": 0.9931}
{"mode": "val", "epoch": 44, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 45, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18369, "loss_action_cls": 0.09834, "recall@thr=0.5": 0.84446, "prec@thr=0.5": 0.84169, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.09834, "grad_norm": 0.07264, "time": 0.99587}
{"mode": "val", "epoch": 45, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 46, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18497, "loss_action_cls": 0.07137, "recall@thr=0.5": 0.85472, "prec@thr=0.5": 0.85194, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.07137, "grad_norm": 0.07303, "time": 0.99785}
{"mode": "val", "epoch": 46, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 47, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18986, "loss_action_cls": 0.07812, "recall@thr=0.5": 0.86687, "prec@thr=0.5": 0.86687, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.07812, "grad_norm": 0.06059, "time": 1.00136}
{"mode": "val", "epoch": 47, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 48, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.188, "loss_action_cls": 0.09891, "recall@thr=0.5": 0.85929, "prec@thr=0.5": 0.85929, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.09891, "grad_norm": 0.05919, "time": 0.99993}
{"mode": "val", "epoch": 48, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 49, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.18616, "loss_action_cls": 0.06949, "recall@thr=0.5": 0.85987, "prec@thr=0.5": 0.85987, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.06949, "grad_norm": 0.07458, "time": 0.99806}
{"mode": "val", "epoch": 49, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

{"mode": "train", "epoch": 50, "iter": 20, "lr": 0.0022, "memory": 8197, "data_time": 0.1849, "loss_action_cls": 0.07176, "recall@thr=0.5": 0.88101, "prec@thr=0.5": 0.88101, "recall@top3": 1.0, "prec@top3": 0.33333, "recall@top5": 1.0, "prec@top5": 0.2, "loss": 0.07176, "grad_norm": 0.06244, "time": 0.99677}
{"mode": "val", "epoch": 50, "iter": 22, "lr": 0.0022, "mAP@0.5IOU": 0.0}

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 34 (15 by maintainers)

Most upvoted comments

As I mentioned earlier, you should set num_classes in the data config:

data = dict(
    ...
    train=dict(
        type='AVADataset',
        num_classes=3,
        ...
    ),
    val=dict(
        type='AVADataset',
        num_classes=3,
        ...
    ),
    test=dict(
        type='AVADataset',
        num_classes=3,
        ...
    ),
)

In your case

data = dict(
  videos_per_gpu=1,
  workers_per_gpu=4,
  val_dataloader=dict(videos_per_gpu=1),
  test_dataloader=dict(videos_per_gpu=1),
  train=dict(
    type=dataset_type,
    num_classes=3, # don't forget to add this
    ann_file=ann_file_train,
    exclude_file=exclude_file_train,
    pipeline=train_pipeline,
    label_file=label_file,
    proposal_file=proposal_file_train,
    person_det_score_thr=0.9,
    data_prefix=data_root),
  val=dict(
    type=dataset_type,
    num_classes=3,  # don't forget to add this
    ann_file=ann_file_val,
    exclude_file=exclude_file_val,
    pipeline=val_pipeline,
    label_file=label_file,
    proposal_file=proposal_file_val,
    person_det_score_thr=0.9,
    data_prefix=data_root))
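
To catch a forgotten num_classes before launching a run, a small check along these lines might help. It is only a sketch on plain dicts: `find_num_classes_mismatches` is a hypothetical helper, not part of mmaction2, and it assumes the dataset entries are ordinary dicts like the ones above.

```python
def find_num_classes_mismatches(model_num_classes, data_cfg,
                                splits=("train", "val", "test")):
    """List (split, value) pairs whose num_classes differs from the model's."""
    problems = []
    for split in splits:
        split_cfg = data_cfg.get(split)
        if split_cfg is None:
            continue  # split not configured at all, nothing to compare
        value = split_cfg.get("num_classes")
        if value != model_num_classes:
            problems.append((split, value))
    return problems


# Toy config: num_classes was forgotten in the val split.
data = dict(
    train=dict(type="AVADataset", num_classes=3),
    val=dict(type="AVADataset"),
)
print(find_num_classes_mismatches(3, data))  # [('val', None)]
```

A value of None means the key is missing, in which case the dataset would presumably fall back to its own default rather than the 3 classes used by the head.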

Select a few (say 2) annotated frames and the corresponding annotations for BOTH the training set and the val set. Train and validate on these two frames for 100 epochs to get a high mAP. This can help find bugs in the code and configs.
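
One way to build such a tiny subset is to keep only the rows of the first few keyframes in the annotation CSV. The sketch below assumes the usual AVA column order (video_id, timestamp, x1, y1, x2, y2, action_id, person_id); `keep_first_keyframes` and the sample rows are invented for illustration, and the proposal file would need to cover the same frames.

```python
import csv
import io


def keep_first_keyframes(csv_text, max_keyframes=2):
    """Keep all rows belonging to the first `max_keyframes` (video, timestamp) pairs."""
    kept_keys = []
    rows_out = []
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue
        key = (row[0], row[1])  # (video_id, timestamp) identifies a keyframe
        if key not in kept_keys:
            if len(kept_keys) >= max_keyframes:
                continue  # a new keyframe beyond the limit: drop its rows
            kept_keys.append(key)
        rows_out.append(row)
    out = io.StringIO()
    csv.writer(out, lineterminator="\n").writerows(rows_out)
    return out.getvalue()


# Made-up annotations: two boxes on the first keyframe, one on each later one.
sample_csv = (
    "vid1,0902,0.1,0.1,0.5,0.9,1,0\n"
    "vid1,0902,0.6,0.1,0.9,0.9,2,1\n"
    "vid1,0903,0.1,0.1,0.5,0.9,1,0\n"
    "vid1,0904,0.1,0.1,0.5,0.9,2,0\n"
)
print(keep_first_keyframes(sample_csv, max_keyframes=2))
```

Using the same subset file for both ann_file_train and ann_file_val gives the "train and validate on the same two frames" setup described above.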

BTW, since you're using your own dataset, maybe your dataset is not good enough (not enough data, noisy labels, etc.) to get good results.

@arvindchandel It seems you forgot to set topk=1 in bbox_head.

Your current config (copied from your log) is:

bbox_head=dict(
    type='BBoxHeadAVA',
    in_channels=2304,
    num_classes=3,
    multilabel=True,
    dropout_ratio=0.5)
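
For illustration, the same head with topk added could look like the sketch below. The training log reports recall@top3 and recall@top5, which suggests the head is currently scoring top-3 and top-5 predictions; with only 3 classes those numbers are nearly meaningless. The rule coded in `topk_is_valid` (every k must be smaller than num_classes) is my assumption about the constraint, and the helper itself is hypothetical.

```python
# The config from the log, with topk set explicitly.
bbox_head = dict(
    type="BBoxHeadAVA",
    in_channels=2304,
    num_classes=3,
    topk=(1,),  # added; assumed rule: each k must be below num_classes
    multilabel=True,
    dropout_ratio=0.5,
)


def topk_is_valid(head_cfg):
    """Check the assumed rule that every k in topk is smaller than num_classes."""
    return all(k < head_cfg["num_classes"] for k in head_cfg["topk"])


print(topk_is_valid(bbox_head))                     # True
print(topk_is_valid(dict(bbox_head, topk=(3, 5))))  # False
```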

A lot of things may lead to this: code, dataset, proposals, GT labels… If I were you, I would try to overfit on one batch first.

You need to modify the config to support 3 classes.

  • add topk=(1,) in bbox_head
  • add num_classes=3 in data dict

data = dict(
    ...
    train=dict(
        type='AVADataset',
        num_classes=3,
        ...
    ),
    val=dict(
        type='AVADataset',
        num_classes=3,
        ...
    ),
    test=dict(
        type='AVADataset',
        num_classes=3,
        ...
    ),
)

[image] Here is my GPU memory before training and during training, with videos_per_gpu=1 set.

Research has found that the setting of the learning rate has little to do with the size of the dataset, so although you reduce the size of the dataset, you can still use the above learning rate.

Please set videos_per_gpu=1

You can use clip_len=32 if you decrease videos_per_gpu (maybe videos_per_gpu=5).

When you set clip_len=32 and videos_per_gpu=5, you need to change your learning rate to 0.009375.
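
The 0.009375 figure is consistent with scaling the learning rate linearly with the total batch size. The sketch below reproduces it under an assumed reference setup of 8 GPUs x 5 videos per GPU at lr=0.075; that baseline is my guess at where the number comes from, not something stated in this thread, and `scale_lr` is a hypothetical helper.

```python
def scale_lr(base_lr, base_batch_size, new_batch_size):
    """Linear scaling rule: the learning rate grows in proportion to batch size."""
    return base_lr * new_batch_size / base_batch_size


base_lr = 0.075      # assumed reference learning rate
base_batch = 8 * 5   # assumed reference: 8 GPUs x 5 videos per GPU
new_batch = 1 * 5    # single GPU with videos_per_gpu=5

# Prints a value of approximately 0.009375.
print(scale_lr(base_lr, base_batch, new_batch))
```

If the number of GPUs or videos_per_gpu changes again, the same function gives the correspondingly scaled value.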

I also think you can try SlowOnly, which requires less memory. It makes it more convenient to track down your error.

@arvindchandel could you please share the config of the version that raises the assertion error? I may take a look.