airflow: KubernetesPodOperator Fails on empty log line

Apache Airflow Provider(s)

cncf-kubernetes

Versions of Apache Airflow Providers

apache-airflow-providers-cncf-kubernetes==1!2.1.0

Apache Airflow version

2.2.3 (latest released)

Operating System

Debian GNU/Linux 10 (buster)

Deployment

Astronomer

Deployment details

No response

What happened

Some KubernetesPodOperator tasks fail on an empty log line. From the logs below, you can see that the monitor_pod function:

  1. continuously fails to parse a timestamp, logging Error parsing timestamp. Will continue execution but won't update timestamp
  2. is unable to fetch container logs: unable to retrieve container logs for docker://
  3. eventually fails on an empty log line with Exception: Log not in "{timestamp} {log}" format. Got:

What you expected to happen

When a log line is empty, the error should be handled gracefully instead of failing the task itself.
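For illustration, here is a minimal sketch of how such a line could be handled defensively. This is not the provider's actual code; the function names and the timestamp handling are simplified assumptions. The idea is to return `(None, line)` for an empty or malformed line so the caller can keep streaming instead of failing the task:

```python
from datetime import datetime
from typing import Optional, Tuple


def _parse_ts(ts: str) -> Optional[datetime]:
    # Kubernetes emits RFC3339 timestamps with nanosecond precision,
    # e.g. "2022-02-07T23:33:23.744628744Z"; trim to microseconds so
    # strptime's %f directive can handle it.
    ts = ts.rstrip("Z")
    if "." in ts:
        head, frac = ts.split(".", 1)
        ts = head + "." + (frac + "000000")[:6]
    else:
        ts += ".000000"
    try:
        return datetime.strptime(ts, "%Y-%m-%dT%H:%M:%S.%f")
    except ValueError:
        return None


def parse_log_line(line: str) -> Tuple[Optional[datetime], str]:
    """Split a pod log line into (timestamp, message).

    Defensive variant: an empty or malformed line yields (None, line)
    so the caller can keep streaming instead of failing the whole task.
    """
    split_at = line.find(" ")
    if split_at == -1:
        # No "{timestamp} {log}" structure, e.g. an empty line or a
        # bare kubelet error string: pass it through untouched.
        return None, line
    timestamp = _parse_ts(line[:split_at])
    if timestamp is None:
        # First token is not a timestamp (e.g. "unable to retrieve
        # container logs for docker://..."): keep the line as-is.
        return None, line
    return timestamp, line[split_at + 1:].rstrip("\n")
```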

How to reproduce

I am not sure what really causes this issue, but this StackOverflow question may be related: Docker cleaning up the logs?

Anything else

Complete log stacktrace:

[2022-02-07, 23:33:23 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:33:23 UTC] {pod_launcher.py:176} INFO - rpc error: code = DeadlineExceeded desc = context deadline exceeded
[2022-02-07, 23:33:24 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:35:24 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:35:24 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:35:25 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:37:25 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:37:25 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:37:26 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:39:26 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:39:26 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:39:27 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:41:27 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:41:27 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:41:28 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:43:28 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:43:28 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:43:29 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:45:29 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:45:29 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:45:30 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:47:30 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:47:30 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:47:31 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:49:31 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:49:31 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:49:32 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:51:32 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:51:32 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:51:33 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:51:56 UTC] {taskinstance.py:1700} ERROR - Task failed with exception
Traceback (most recent call last):
  File "/usr/local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1329, in _run_raw_task
    self._execute_task_with_callbacks(context)
  File "/usr/local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1455, in _execute_task_with_callbacks
    result = self._execute_task(context, self.task)
  File "/usr/local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1506, in _execute_task
    result = execute_callable(context=context)
  File "/usr/local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 367, in execute
    final_state, remote_pod, result = self.create_new_pod_for_operator(labels, launcher)
  File "/usr/local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 524, in create_new_pod_for_operator
    final_state, remote_pod, result = launcher.monitor_pod(pod=self.pod, get_logs=self.get_logs)
  File "/usr/local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/utils/pod_launcher.py", line 175, in monitor_pod
    timestamp, message = self.parse_log_line(line.decode('utf-8'))
  File "/usr/local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/utils/pod_launcher.py", line 225, in parse_log_line
    raise Exception(f'Log not in "{{timestamp}} {{log}}" format. Got: {line}')
Exception: Log not in "{timestamp} {log}" format. Got:
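The failing branch in the traceback can be reproduced in isolation. Below is a simplified stand-in for the strict parsing in pod_launcher.py (an assumption reconstructed from the traceback, not the provider's exact source): an empty line has no space to split on, so it raises.

```python
def parse_log_line_strict(line: str):
    # Simplified stand-in for the strict parser in pod_launcher.py:
    # it requires a "{timestamp} {log}" shape and raises otherwise,
    # which is what kills the task on an empty line.
    split_at = line.find(" ")
    if split_at == -1:
        raise Exception(f'Log not in "{{timestamp}} {{log}}" format. Got: {line}')
    return line[:split_at], line[split_at + 1:]
```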

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

Code of Conduct

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Reactions: 1
  • Comments: 29 (14 by maintainers)

Commits related to this issue

Most upvoted comments

I found this issue using GCP Composer v2 with KubernetesPodOperator.

I solved the issue by adding persistence for logs

And how did you do that, @y0zg?

We’ve started seeing Error parsing timestamp (no timestamp in message ''). Will continue execution but won't update timestamp recently (I think someone upgraded something just before this started appearing) in our logs and were initially concerned, but the jobs always succeed. I’m guessing this message can be ignored, but is there something with more detail to explain why this message is safe to ignore?

Hi, why is the issue closed? I am on 2.6.1 with airflow-cncf-kubernetes 7.5.1 (latest) and still getting this error

Because we think the original issue, from a version released 2 years ago, has been fixed. You @romanzdk @ricoms (also @y0zg @tanthml) might have a SIMILAR issue which could be completely different. The best way you can help someone diagnose and solve your issue is to open a new one where you describe what happens in your case and provide evidence from your version (ideally after upgrading to the latest version of Airflow, because what you see could have been fixed since).

This is an open-source project where you get software for free, and the people who solve other people's problems most often do so in their free time: weekends and nights. The best way to get help from those people is to make it easy for them to diagnose your issue. I know it is super easy to write "I have the same issue": you do not lose any time gathering evidence and writing the issue up. But your issue is probably not the same (especially since we are talking about something raised for version 2.2, and we are 2 years later at 2.7.1, and the k8s code has been rewritten 3 times since then). It may also depend on multiple factors such as the Kubernetes version, the provider version, the type of Kubernetes cluster you have, etc. Your comment did not bring anyone any closer to knowing all those details.

So, if you open a new issue @romanzdk and provide good evidence, you increase the chances (but only that: chances) that someone will spend their free afternoon or weekend looking at your issue, and maybe even fix it.

If all you have is the question of why the issue is closed, your chances of getting it solved do not increase even by a fraction of a percent.

So, if you really care about your problem being solved, I suggest you help those who try to help you: provide a good, reproducible issue with solid evidence of what happens, ideally by looking at your system and correlating what happens there (maybe some pods were failing? maybe you can see some unusual behaviour or output from your pods, etc.). That will be a great help to those who spend their nights and weekends trying to help people who use the software completely for free.

Hi, why is the issue closed? I am on 2.6.1 with airflow-cncf-kubernetes 7.5.1 (latest) and still getting this error

I am hitting this on the very latest Cloud Composer which uses apache-airflow-providers-cncf-kubernetes==6.0.0. The logs are ingested fine until I have some unexpected warnings from an underlying library printed, which kills the task.

I solved the issue by adding persistence for logs

I’m also experiencing this problem while using dbt in Airflow:

Airflow v2.5.0 Kubernetes: 1.24.10 (Azure Kubernetes Service) apache-airflow-providers-cncf-kubernetes 7.0.0

[2023-05-24, 14:02:49 UTC] {pod_manager.py:367} INFO - 2023-05-24T13:57:48.744628Z [info     ] 13:57:48  105 of 105 START sql table model tap_mongodb.mart_name_here .................. [RUN] cmd_type=command name=dbt-*** stdio=stderr

[2023-05-24, 14:07:49 UTC] {pod_manager.py:438} ERROR - Error parsing timestamp (no timestamp in message ''). Will continue execution but won't update timestamp

In this case, we are at a step in the execution process that can take a long time, and you can see that the error occurs exactly 5 minutes after the previous line was emitted. There is no empty log line in the output that follows. My pipeline worked before, when I was dealing with a smaller dataset, because the build time of this step was short enough; I think there might be some kind of KubernetesPodOperator-related timeout at play here.

@potiuk / anyone else: Any ideas on Airflow variables I could play around with to test this hypothesis?

3.0.2 also has the same logic for parsing log lines, so it wouldn't be of much help: https://github.com/apache/airflow/blob/providers-cncf-kubernetes/3.0.2/airflow/providers/cncf/kubernetes/utils/pod_manager.py#L254

It uses a different kubernetes library. Can you please check it?