nbgrader: Duplicating cells results in autograding failure because of duplicate IDs, and confusing error message

This is related to #981, in that fixing it may also solve this problem, but I’ll document the core problem and a workaround here.

Students have submitted notebooks in which they have copied cells. This duplicates the metadata: of importance here is "nbgrader": {"grade_id": ...}. When this happens, the following error message is given:

[AutogradeApp | WARNING] Cell with id 'Task_3_1_test' exists multiple times!
...
[AutogradeApp | WARNING] Removing failed assignment: /m/jhnas/jupyter/course/mlpython2019/files/autograded/staafv1/R1_Introduction
[AutogradeApp | ERROR] One or more notebooks in the assignment use an old version of the nbgrader metadata format. Please **back up your class files directory** and then update the metadata using: nbgrader update .

The first warning is the key to the problem. However, the error message says that the nbgrader metadata should be updated, which is not the real problem here (though I haven’t tested whether updating actually works). At the least, the error message could be improved, or the failure could be detected earlier.
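As a rough illustration of detecting this earlier, here is a minimal sketch of a pre-check that could be run on a submitted notebook before autograding. It only uses the standard library and reads the notebook as plain JSON. The function name `find_duplicate_grade_ids` is hypothetical and not part of nbgrader.

```python
import json
from collections import Counter


def find_duplicate_grade_ids(notebook: dict) -> list:
    """Return the grade_ids that appear on more than one cell.

    `notebook` is the parsed JSON of an .ipynb file. This is a hypothetical
    helper for illustration, not an nbgrader API.
    """
    counts = Counter(
        cell.get("metadata", {}).get("nbgrader", {}).get("grade_id")
        for cell in notebook.get("cells", [])
    )
    # Cells without nbgrader metadata are counted under None; skip those.
    return sorted(gid for gid, n in counts.items() if gid is not None and n > 1)


def check_notebook_file(path: str) -> list:
    """Load a notebook from disk and report duplicated grade_ids."""
    with open(path, encoding="utf-8") as f:
        return find_duplicate_grade_ids(json.load(f))
```

Running `check_notebook_file` over the submitted notebooks before autograding would point at the affected cells directly, instead of relying on the misleading metadata-format error.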

It isn’t a perfect solution, but it would be nice if nbgrader could also continue on in this case. Clearly the student has done something weird (and notebook/jupyterlab could solve it by not duplicating metadata). But could nbgrader also survive this and make some choice one way or the other itself, considering that duplicating metadata is probably a type of problem that will appear again? That way the common case won’t cause a failure stopping all autograding… and we leave it to instructors to tell of the problem later.

Who knows what the right nbgrader-only solution would be - simple would be to pick the last cell with the metadata. But many corner cases could break this.

Workaround for anyone else who hits this problem: my tests showed that you have to delete the whole nbgrader metadata dict from the duplicated notebook cell. Changing the grade_id is not enough, because nbgrader can’t find the new grade_id in the database, and removing just the grade_id entry is not enough either.
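The workaround can be scripted. The sketch below is one possible way to do it, not something nbgrader provides: it removes the whole `nbgrader` metadata dict from every cell whose grade_id has already been seen. Which copy keeps its metadata is an assumption on my part (the first by default, the last with `keep_last=True`), and the function name is hypothetical.

```python
def strip_duplicate_nbgrader_metadata(notebook: dict, keep_last: bool = False) -> int:
    """Delete the nbgrader metadata dict from duplicated cells, in place.

    `notebook` is the parsed JSON of an .ipynb file. One cell per grade_id
    keeps its metadata (the first one, or the last one if keep_last is True).
    Returns the number of cells that were stripped.
    Hypothetical helper for illustration, not an nbgrader API.
    """
    cells = notebook.get("cells", [])
    ordered = list(reversed(cells)) if keep_last else cells
    seen = set()
    removed = 0
    for cell in ordered:
        meta = cell.get("metadata", {})
        grade_id = meta.get("nbgrader", {}).get("grade_id")
        if grade_id is None:
            continue
        if grade_id in seen:
            # Remove the whole dict: removing only grade_id is not enough.
            del meta["nbgrader"]
            removed += 1
        else:
            seen.add(grade_id)
    return removed
```

The notebook can be loaded with `json.load` and written back with `json.dump` after the call. Back up the submission first, since this edits the student’s file.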

Operating system

Linux

nbgrader --version

Modified version, based on upstream d56f75b (close to current master).

About this issue

  • State: open
  • Created 5 years ago
  • Reactions: 1
  • Comments: 27 (24 by maintainers)

Most upvoted comments

Just adding a +1 to this – I ran into this exact issue in my course just now!

Just as a data point: the issue is still there with JupyterLab 3.6.0 and nbgrader 0.8.1.

About #1753: I am really glad someone worked on that issue, and I have been meaning to test it for a long time. That being said, even if it works well in practice, it feels like a workaround that fixes a posteriori what should never have happened in the first place: nbgrader cells being duplicated with their metadata when the end users really just meant to copy their content.

The thing with duplicating cells: how do we know which one should have the original content? The first or the second? What if the content was split between two cells? What if the cell was duplicated and the “original” was later deleted, maybe because it was used as a test or something? Or what if the original was modified to become the “copy”?

… but actually, according to my theory, it shouldn’t matter! If you can ignore duplicate metadata, hopefully you can ignore these problems too. I would really be interested in an analysis of when it actually matters.

Maybe the docs should be updated to warn users to add metadata to as few cells as possible, and to make it obvious that metadata should not be duplicated, even if we do other things as well?

I experienced this issue as well in the class I recently TA’d (nbgrader 0.5.4 on jupyterhub 0.9.4). My solution was a script that merges cells when it detects duplicate grade_ids: it appends the duplicate cell’s source to the first cell with that grade_id and then removes the duplicate.
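For readers who want to see the idea without the full script, here is a minimal sketch of the merge approach described above. It is not the code from the linked repository. The function names are hypothetical, and joining the two sources with a single newline is an assumption.

```python
def _source_text(cell: dict) -> str:
    """Return a cell's source as one string (ipynb allows str or list of str)."""
    src = cell.get("source", "")
    return "".join(src) if isinstance(src, list) else src


def merge_duplicate_cells(notebook: dict) -> int:
    """Merge cells sharing a grade_id into the first such cell, in place.

    `notebook` is the parsed JSON of an .ipynb file. The duplicate's source
    is appended to the first cell and the duplicate is dropped.
    Returns the number of cells removed.
    """
    first_by_id = {}
    kept = []
    original = notebook.get("cells", [])
    for cell in original:
        grade_id = cell.get("metadata", {}).get("nbgrader", {}).get("grade_id")
        if grade_id is not None and grade_id in first_by_id:
            first = first_by_id[grade_id]
            # Assumption: separate the merged sources with one newline.
            first["source"] = (
                _source_text(first).rstrip("\n") + "\n" + _source_text(cell)
            )
            continue  # drop the duplicate cell
        if grade_id is not None:
            first_by_id[grade_id] = cell
        kept.append(cell)
    notebook["cells"] = kept
    return len(original) - len(kept)
```

Merging keeps everything the student wrote, at the cost of possibly concatenating code that was never meant to run together, so the result is still worth a manual look.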

I uploaded the script as part of a small collection I wrote for that course in case it’s of use to anyone https://github.com/elesiuta/jupyter-nbgrader-helper (run with --fix AssignName NbName.ipynb). As an aside, if there’s any feature in there you think might be useful to implement properly into nbgrader I’d be happy to open a pull request and work on it.