airflow: Logging bug in long runs

Apache Airflow version: 2.0.2

Environment: Kubernetes v1.18.3, OpenShift 4.5.37

What happened: We run our Python code with the Kubernetes pod operator (airflow.contrib.operators.kubernetes_pod_operator). During long runs (>10h) with log streaming turned on (get_logs=True on the operator), Airflow behaves completely normally for a while and then throws an unexpected error.

If we set get_logs=False, the DAG run succeeds; otherwise we get the same error every time.


> [2021-05-18 13:54:10,199] {} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/", line 696, in _update_chunk_length
    self.chunk_left = int(line, 16)
ValueError: invalid literal for int() with base 16: b''
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/", line 436, in _error_catcher
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/", line 763, in read_chunked
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/", line 700, in _update_chunk_length
    raise httplib.IncompleteRead(line)
http.client.IncompleteRead: IncompleteRead(0 bytes read)
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/", line 1138, in _run_raw_task
    self._prepare_and_execute_task_with_callbacks(context, task)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/", line 1311, in _prepare_and_execute_task_with_callbacks
    result = self._execute_task(context, task_copy)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/models/", line 1341, in _execute_task
    result = task_copy.execute(context=context)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/cncf/kubernetes/operators/", line 366, in execute
    final_state, _, result = self.create_new_pod_for_operator(labels, launcher)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/cncf/kubernetes/operators/", line 513, in create_new_pod_for_operator
    final_state, result = launcher.monitor_pod(pod=self.pod, get_logs=self.get_logs)
  File "/home/airflow/.local/lib/python3.6/site-packages/airflow/providers/cncf/kubernetes/utils/", line 145, in monitor_pod
    for line in logs:
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/", line 807, in __iter__
    for chunk in
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/", line 571, in stream
    for line in self.read_chunked(amt, decode_content=decode_content):
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/", line 792, in read_chunked
  File "/usr/local/lib/python3.6/", line 99, in __exit__
    self.gen.throw(type, value, traceback)
  File "/home/airflow/.local/lib/python3.6/site-packages/urllib3/", line 454, in _error_catcher
    raise ProtocolError("Connection broken: %r" % e, e)
urllib3.exceptions.ProtocolError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
[2021-05-18 13:54:10,204] {} INFO - Marking task as FAILED. dag_id=pipline, task_id=task7, execution_date=20210518T132920, start_date=20210518T133244, end_date=20210518T135410
[2021-05-18 13:54:10,280] {} INFO - Task exited with return code 1

We have an Airflow instance on another Kubernetes server, where we can run the same code with the same DAGs and get no errors.
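For readers hitting the same traceback: the failure surfaces in the `for line in logs:` loop of monitor_pod, where a dropped HTTP stream raises urllib3's ProtocolError and fails the task. Below is a minimal sketch of one way a log follower can tolerate such a drop by reopening the stream. It is not the provider's actual implementation. `StreamBroken`, `follow_logs`, `make_flaky_source`, and the `open_stream(skip)` callback are all hypothetical names for illustration; a real client would more likely resume by timestamp (for example via the Kubernetes log API's `since_seconds` option) than by line count.

```python
import time
from typing import Callable, Iterable, Iterator, List


class StreamBroken(Exception):
    """Hypothetical stand-in for urllib3.exceptions.ProtocolError."""


def follow_logs(
    open_stream: Callable[[int], Iterable[str]],
    max_reconnects: int = 3,
    backoff_seconds: float = 0.0,
) -> Iterator[str]:
    """Yield log lines, reopening the stream if it breaks mid-read.

    open_stream(skip) is an assumed callback that returns an iterable of
    log lines starting after the first `skip` lines.
    """
    seen = 0        # lines already delivered to the caller
    reconnects = 0  # how many times the stream has been reopened
    while True:
        try:
            for line in open_stream(seen):
                seen += 1
                yield line
            return  # the stream ended normally
        except StreamBroken:
            reconnects += 1
            if reconnects > max_reconnects:
                raise  # give up and let the task fail
            time.sleep(backoff_seconds)


def make_flaky_source(lines: List[str], break_after: int) -> Callable[[int], Iterator[str]]:
    """Build a fake log source that breaks once, before line `break_after`."""
    state = {"broken": False}

    def open_stream(skip: int) -> Iterator[str]:
        for index, line in enumerate(lines[skip:], start=skip):
            if index == break_after and not state["broken"]:
                state["broken"] = True
                raise StreamBroken("Connection broken: IncompleteRead(0 bytes read)")
            yield line

    return open_stream


source = make_flaky_source(["line 1", "line 2", "line 3", "line 4"], break_after=2)
collected = list(follow_logs(source))
print(collected)
```

In this simulation the stream breaks once after two lines, and the follower reconnects and delivers the remaining lines without duplicates. The reconnect cap keeps a permanently broken connection from looping forever.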

About this issue

  • State: closed
  • Created 3 years ago
  • Comments: 17 (11 by maintainers)

Most upvoted comments

And I heartily recommend “search” on the Airflow docs site. It is really fast and really good.


@sg27 Because you are looking in the wrong place. This is a Kubernetes provider fix, not an Airflow one.

@trucnguyenlam -> just upgrade to the latest cncf.kubernetes provider.
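Since the fix ships in the provider package rather than in Airflow core, it can help to confirm which provider version is actually installed in the worker environment before and after upgrading. A small sketch using only the standard library (Python 3.8+); the distribution name is the provider's PyPI name, and `installed_version` is a hypothetical helper, not part of Airflow:

```python
from importlib import metadata
from typing import Optional

# PyPI distribution name of the cncf.kubernetes provider.
PROVIDER_DIST = "apache-airflow-providers-cncf-kubernetes"


def installed_version(dist_name: str = PROVIDER_DIST) -> Optional[str]:
    """Return the installed version of a distribution, or None if it is absent."""
    try:
        return metadata.version(dist_name)
    except metadata.PackageNotFoundError:
        return None


version = installed_version()
if version is None:
    print(f"{PROVIDER_DIST} is not installed in this environment")
else:
    print(f"{PROVIDER_DIST} version: {version}")
```

Run this with the same interpreter the Airflow workers use, since the provider version can differ from the Airflow core version.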