kolibri: Stuck on UPDATE CHANNEL task
Observed behavior
When updating a large channel on a 0.13.2 Kolibri demo server, after clicking UPDATE CHANNEL I saw the progress bar jump from 0% to 100% very quickly, then go back to 0% and stay stuck there. Kolibri is getting stuck!
Ina recently encountered a similar issue: see https://learningequality.slack.com/archives/CRM17EYQ5/p1591149618002100
Expected behavior
The UPDATE CHANNEL task should complete, with progress displayed along the way.
User-facing consequences
Kolibri admins cannot update larger channels.
Errors and logs
INFO Annotating ContentNode objects with children for level 1
ERROR Job cd5f7ac88c03423f8d70b69a85b4f7e5 raised an exception: Traceback (most recent call last):
File "/.../kolibri-0.13.2....whl/kolibri/core/tasks/worker.py", line 73, in handle_finished_future
result = future.result()
File "/.../kolibri-0.13.2....whl/kolibri/dist/py2only/concurrent/futures/_base.py", line 422, in result
return self.__get_result()
File "/.../kolibri-0.13.2....whl/kolibri/dist/py2only/concurrent/futures/thread.py", line 62, in run
result = self.fn(*self.args, **self.kwargs)
File "/.../kolibri-0.13.2....whl/kolibri/core/tasks/worker.py", line 224, in wrap
return f(*args, **kwargs)
File "/.../kolibri-0.13.2....whl/kolibri/core/tasks/job.py", line 178, in y
result = func(*args, **kwargs)
File "/.../kolibri-0.13.2....whl/kolibri/core/tasks/api.py", line 728, in _remoteimport
check_for_cancel=check_for_cancel,
File "/.../kolibri-0.13.2....whl/kolibri/dist/django/core/management/__init__.py", line 131, in call_command
return command.execute(*args, **defaults)
File "/.../kolibri-0.13.2....whl/kolibri/dist/django/core/management/base.py", line 330, in execute
output = self.handle(*args, **options)
File "/.../kolibri-0.13.2....whl/kolibri/core/tasks/management/commands/base.py", line 110, in handle
return self.handle_async(*args, **options)
File "/.../kolibri-0.13.2....whl/kolibri/core/content/management/commands/importcontent.py", line 439, in handle_async
renderable_only=options["renderable_only"],
File "/.../kolibri-0.13.2....whl/kolibri/core/content/management/commands/importcontent.py", line 159, in download_content
renderable_only=renderable_only,
File "/.../kolibri-0.13.2....whl/kolibri/core/content/management/commands/importcontent.py", line 217, in _transfer
.values("content_id")
File "/.../kolibri-0.13.2....whl/kolibri/dist/django/db/models/query.py", line 364, in count
return self.query.get_count(using=self.db)
File "/.../kolibri-0.13.2....whl/kolibri/dist/django/db/models/sql/query.py", line 499, in get_count
number = obj.get_aggregation(using, ['__count'])['__count']
File "/.../kolibri-0.13.2....whl/kolibri/dist/django/db/models/sql/query.py", line 480, in get_aggregation
result = compiler.execute_sql(SINGLE)
File "/.../kolibri-0.13.2....whl/kolibri/dist/django/db/models/sql/compiler.py", line 899, in execute_sql
raise original_exception
OperationalError: Expression tree is too large (maximum depth 1000)
Steps to reproduce
- IMPORT vN of a large channel in Kolibri
- PUBLISH an updated v(N+1) of the channel with lots of changes
- Click VIEW CHANGES, then UPDATE CHANNEL in Kolibri to start the update process
Note: it is not clear what causes the error. Is it large channels, channels with many changes, or deep channels?
Suggestion: The KA-bg channel on the main Kolibri demo server has not been updated from v6 to v7, so it can be used as a test case to reproduce. Maybe @lyw07 could monitor the logs live while whoever is debugging this runs the update process and watches the JS console in parallel.
Context
- Kolibri version: 0.13.3
- Operating system: Linux
- DB: sqlite3 (for my demo server), postgres for UNW server
About this issue
- Original URL
- State: closed
- Created 4 years ago
- Comments: 20 (20 by maintainers)
To fix this, we might have to change the API of get_nodes_to_transfer (https://github.com/learningequality/kolibri/blob/develop/kolibri/core/content/management/commands/importcontent.py#L192) to return a chunked sequence of querysets instead.
For context, this is how we resolved this in the annotation case: https://github.com/learningequality/kolibri/blob/develop/kolibri/core/content/utils/annotation.py#L270 (basically, chunking the queries and doing batched operations). In this case it would be simpler, as it would just be a matter of taking the batched queries, summing the counts, and getting the list of files from them, so it should be feasible to implement.
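As a rough illustration of the chunking idea (not Kolibri's actual implementation), the sketch below uses the standard-library sqlite3 module directly instead of Django querysets. The table name file, the column node_id, the helpers chunked and count_and_collect, and the batch size of 500 are all hypothetical, chosen only to show how per-batch counts can be summed and file lists concatenated so that no single query carries a huge expression. It also assumes each file row belongs to exactly one node; if files can be shared between nodes, the results would need de-duplicating.

```python
import sqlite3
from itertools import islice


def chunked(iterable, size):
    """Yield successive lists of at most `size` items from `iterable`."""
    it = iter(iterable)
    while True:
        batch = list(islice(it, size))
        if not batch:
            return
        yield batch


def count_and_collect(conn, node_ids, batch_size=500):
    """Count and collect file ids for `node_ids`, one small query per batch.

    Hypothetical schema: a table `file` with columns `id` and `node_id`.
    Keeping each batch small keeps every individual query small.
    """
    total = 0
    file_ids = []
    for batch in chunked(node_ids, batch_size):
        placeholders = ",".join("?" * len(batch))
        rows = conn.execute(
            "SELECT id FROM file WHERE node_id IN (%s)" % placeholders,
            batch,
        ).fetchall()
        total += len(rows)                     # sum the per-batch counts
        file_ids.extend(row[0] for row in rows)  # concatenate per-batch results
    return total, file_ids


if __name__ == "__main__":
    # Small in-memory demo with made-up data.
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE file (id INTEGER PRIMARY KEY, node_id INTEGER)")
    conn.executemany(
        "INSERT INTO file (id, node_id) VALUES (?, ?)",
        [(i, i) for i in range(2500)],
    )
    total, ids = count_and_collect(conn, list(range(0, 2500, 2)))
    print(total, len(ids))
```

In a Django setting the same shape would presumably apply: iterate over a sequence of smaller querysets, add up each .count(), and gather the files from each, rather than building one very large filter.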
I have closed #7009 as this is intentional behaviour. We use CherryPy to manage our non-HTTP services, and the KOLIBRI_CHERRYPY_START flag does not actually flag whether CherryPy starts or not, just whether CherryPy serves HTTP.