kolibri: Stuck on UPDATE CHANNEL task

Observed behavior

When updating a large channel on a 0.13.2 Kolibri demo server, I clicked UPDATE CHANNEL and saw the progress bar jump from 0 to 100 very quickly, then drop back to 0% and stay stuck there. Kolibri is getting stuck!

Ina recently encountered a similar issue (screenshot: screen_shot_2020-06-03_at_11 58 05_am); see https://learningequality.slack.com/archives/CRM17EYQ5/p1591149618002100

Expected behavior

The UPDATE CHANNEL task should complete, with progress displayed along the way.

User-facing consequences

Kolibri admins cannot update larger channels.

Errors and logs

INFO     Annotating ContentNode objects with children for level 1
ERROR    Job cd5f7ac88c03423f8d70b69a85b4f7e5 raised an exception: Traceback (most recent call last):
  File "/.../kolibri-0.13.2....whl/kolibri/core/tasks/worker.py", line 73, in handle_finished_future
    result = future.result()
  File "/.../kolibri-0.13.2....whl/kolibri/dist/py2only/concurrent/futures/_base.py", line 422, in result
    return self.__get_result()
  File "/.../kolibri-0.13.2....whl/kolibri/dist/py2only/concurrent/futures/thread.py", line 62, in run
    result = self.fn(*self.args, **self.kwargs)
  File "/.../kolibri-0.13.2....whl/kolibri/core/tasks/worker.py", line 224, in wrap
    return f(*args, **kwargs)
  File "/.../kolibri-0.13.2....whl/kolibri/core/tasks/job.py", line 178, in y
    result = func(*args, **kwargs)
  File "/.../kolibri-0.13.2....whl/kolibri/core/tasks/api.py", line 728, in _remoteimport
    check_for_cancel=check_for_cancel,
  File "/.../kolibri-0.13.2....whl/kolibri/dist/django/core/management/__init__.py", line 131, in call_command
    return command.execute(*args, **defaults)
  File "/.../kolibri-0.13.2....whl/kolibri/dist/django/core/management/base.py", line 330, in execute
    output = self.handle(*args, **options)
  File "/.../kolibri-0.13.2....whl/kolibri/core/tasks/management/commands/base.py", line 110, in handle
    return self.handle_async(*args, **options)
  File "/.../kolibri-0.13.2....whl/kolibri/core/content/management/commands/importcontent.py", line 439, in handle_async
    renderable_only=options["renderable_only"],
  File "/.../kolibri-0.13.2....whl/kolibri/core/content/management/commands/importcontent.py", line 159, in download_content
    renderable_only=renderable_only,
  File "/.../kolibri-0.13.2....whl/kolibri/core/content/management/commands/importcontent.py", line 217, in _transfer
    .values("content_id")
  File "/.../kolibri-0.13.2....whl/kolibri/dist/django/db/models/query.py", line 364, in count
    return self.query.get_count(using=self.db)
  File "/.../kolibri-0.13.2....whl/kolibri/dist/django/db/models/sql/query.py", line 499, in get_count
    number = obj.get_aggregation(using, ['__count'])['__count']
  File "/.../kolibri-0.13.2....whl/kolibri/dist/django/db/models/sql/query.py", line 480, in get_aggregation
    result = compiler.execute_sql(SINGLE)
  File "/.../kolibri-0.13.2....whl/kolibri/dist/django/db/models/sql/compiler.py", line 899, in execute_sql
    raise original_exception
OperationalError: Expression tree is too large (maximum depth 1000)

Steps to reproduce

  1. IMPORT vN of a large channel in Kolibri
  2. PUBLISH an updated v(N+1) of the channel with lots of changes
  3. Click VIEW CHANGES, then UPDATE CHANNEL in Kolibri to start the update process

Note: it is not clear what causes the error. Is it large channels, channels with many changes, or deep channels?
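One possible way to narrow this down outside Kolibri: the message in the traceback is SQLite's expression depth limit, and a very long chain of OR conditions is one known way to exceed it. The sketch below uses only the standard-library sqlite3 module. The `node` table and the `count_with_or_chain` helper are hypothetical, and it is an assumption (not confirmed by the logs above) that the failing Kolibri query has this shape. Whether the long chain fails depends on how the local SQLite library was compiled, so the script reports the outcome either way.

```python
import sqlite3


def count_with_or_chain(conn, ids):
    """Count rows matching any id using one long OR chain.

    Integer literals are inlined so the query does not run into the
    separate limit on the number of bound parameters.
    """
    where = " OR ".join("id = %d" % int(i) for i in ids)
    return conn.execute("SELECT COUNT(*) FROM node WHERE " + where).fetchone()[0]


# Hypothetical stand-in table; not Kolibri's real schema.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE node (id INTEGER PRIMARY KEY)")
conn.executemany("INSERT INTO node (id) VALUES (?)", [(i,) for i in range(2000)])

# A short chain stays well under the depth limit.
small = count_with_or_chain(conn, range(100))

# A chain of 2000 terms may exceed the limit, depending on the SQLite build.
try:
    outcome = "ok: %d" % count_with_or_chain(conn, range(2000))
except sqlite3.OperationalError as exc:
    outcome = "failed: %s" % exc

print(small, outcome)
```

If the long chain fails here with the same message, that would suggest the number of conditions in a single query matters more than the depth of the channel tree.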

Suggestion: The KA-bg channel on the main Kolibri demo server has not been updated from v6 to v7, so it can be used as a test case to reproduce. Maybe @lyw07 could monitor the logs live while whoever is debugging this runs the update process and watches the JS console in parallel.

Context

  • Kolibri version: 0.13.3
  • Operating system: Linux
  • DB: sqlite3 (for my demo server), postgres for UNW server

About this issue

  • State: closed
  • Created 4 years ago
  • Comments: 20 (20 by maintainers)

Most upvoted comments

To fix this, we might have to change the API of get_nodes_to_transfer (https://github.com/learningequality/kolibri/blob/develop/kolibri/core/content/management/commands/importcontent.py#L192) so that it returns a chunked sequence of querysets instead.

For context, this is how we resolved this in the annotation case: https://github.com/learningequality/kolibri/blob/develop/kolibri/core/content/utils/annotation.py#L270 (basically, chunking the queries and doing batched operations). In this case it would be simpler: take the batched queries, sum their counts, and get the list of files from them, so it should be feasible to implement.
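A minimal sketch of the batching idea described above, written against plain sqlite3 rather than Django querysets so it runs on its own. The `file` table, the `chunked` helper, and `count_and_collect` are hypothetical names for illustration, not Kolibri's actual API, and the chunk size of 500 is an arbitrary choice. It assumes the chunks are disjoint, so that summing the per-chunk counts does not double count.

```python
import sqlite3


def chunked(items, size):
    """Yield consecutive lists of at most `size` items."""
    items = list(items)
    for start in range(0, len(items), size):
        yield items[start:start + size]


def count_and_collect(conn, node_ids, chunk_size=500):
    """Run one small query per chunk, summing counts and gathering filenames."""
    total = 0
    files = []
    for chunk in chunked(node_ids, chunk_size):
        placeholders = ",".join("?" for _ in chunk)
        rows = conn.execute(
            "SELECT filename FROM file WHERE node_id IN (%s)" % placeholders,
            chunk,
        ).fetchall()
        total += len(rows)  # per-chunk count, added to the running total
        files.extend(row[0] for row in rows)
    return total, files


# Hypothetical stand-in data: one file per node.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE file (node_id INTEGER, filename TEXT)")
conn.executemany(
    "INSERT INTO file (node_id, filename) VALUES (?, ?)",
    [(i, "file_%d.mp4" % i) for i in range(3000)],
)

total, files = count_and_collect(conn, range(3000))
print(total, len(files))
```

Each query stays small regardless of how many nodes are selected, at the cost of more round trips to the database.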

I have closed #7009 as this is intentional behaviour. We use CherryPy to manage our non-HTTP services, and the KOLIBRI_CHERRYPY_START flag does not control whether CherryPy starts, only whether CherryPy serves HTTP.