airflow: KubernetesPodOperator Fails on empty log line
Apache Airflow Provider(s)
cncf-kubernetes
Versions of Apache Airflow Providers
apache-airflow-providers-cncf-kubernetes==1!2.1.0
Apache Airflow version
2.2.3 (latest released)
Operating System
Debian GNU/Linux 10 (buster)
Deployment
Astronomer
Deployment details
No response
What happened
Some KubernetesPodOperator tasks fail with an empty log line. From the logs below, you can see that the monitor_pod function:
- continuously fails to parse a timestamp from the log message
Error parsing timestamp. Will continue execution but won't update timestamp
- is unable to fetch the container logs
unable to retrieve container logs for docker://
- eventually fails the task with an empty log line (a minimal illustration of the failing parse follows this list)
Exception: Log not in "{timestamp} {log}" format. Got:
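A minimal sketch of why an empty line blows up, modeled on the parse_log_line frames in the traceback further down (this helper is an illustration of the behaviour, not the exact provider source): the line is expected to be "<timestamp> <message>", and an empty line has no space to split on, so the function raises and the whole task fails.

```python
from typing import Tuple


def parse_log_line(line: str) -> Tuple[str, str]:
    """Split "<timestamp> <message>"; an empty line has no space to split on."""
    split_at = line.find(" ")
    if split_at == -1:
        raise Exception(f'Log not in "{{timestamp}} {{log}}" format. Got: {line}')
    timestamp = line[:split_at]
    message = line[split_at + 1:].rstrip()
    return timestamp, message


print(parse_log_line("2022-02-07T23:33:23.000000Z hello from the pod"))  # works
try:
    parse_log_line("")  # an empty line from the k8s client log stream
except Exception as exc:
    print(exc)  # -> Log not in "{timestamp} {log}" format. Got:
```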
What you expected to happen
In the case of an empty log line, we should handle the error gracefully instead of failing the task itself.
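As a rough idea of what graceful handling could look like (a sketch only, assuming a helper shaped like the provider's parse_log_line; the actual fix may differ): lines without a parsable timestamp are logged as a warning and passed through instead of raising.

```python
import logging
from typing import Optional, Tuple

log = logging.getLogger(__name__)


def parse_log_line(line: str) -> Tuple[Optional[str], str]:
    """Tolerant variant: never raise; fall back to (None, raw_line)."""
    timestamp, sep, message = line.partition(" ")
    if not sep or not timestamp:
        # Empty or malformed line: warn and continue instead of failing the task.
        log.warning(
            "Error parsing timestamp (no timestamp in message %r). "
            "Will continue execution but won't update timestamp",
            line,
        )
        return None, line
    return timestamp, message.rstrip()
```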
How to reproduce
I am not sure what actually causes this issue, but the StackOverflow question "Docker cleaning up the logs?" may be related.
Anything else
Complete log stacktrace
[2022-02-07, 23:33:23 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:33:23 UTC] {pod_launcher.py:176} INFO - rpc error: code = DeadlineExceeded desc = context deadline exceeded
[2022-02-07, 23:33:24 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:35:24 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:35:24 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:35:25 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:37:25 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:37:25 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:37:26 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:39:26 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:39:26 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:39:27 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:41:27 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:41:27 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:41:28 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:43:28 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:43:28 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:43:29 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:45:29 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:45:29 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:45:30 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:47:30 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:47:30 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:47:31 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:49:31 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:49:31 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:49:32 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:51:32 UTC] {pod_launcher.py:231} ERROR - Error parsing timestamp. Will continue execution but won't update timestamp
[2022-02-07, 23:51:32 UTC] {pod_launcher.py:176} INFO - unable to retrieve container logs for docker://c2cd33567efa4a95f555ccb6d7fda760b65bb9cf2d13087d2704e2e8b974e42a
[2022-02-07, 23:51:33 UTC] {pod_launcher.py:192} WARNING - Pod dbt-hourly-run-task.537231fc6daf403484e342380a5caf53 log read interrupted
[2022-02-07, 23:51:56 UTC] {taskinstance.py:1700} ERROR - Task failed with exception
Traceback (most recent call last):
File "/usr/local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1329, in _run_raw_task
self._execute_task_with_callbacks(context)
File "/usr/local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1455, in _execute_task_with_callbacks
result = self._execute_task(context, self.task)
File "/usr/local/lib/python3.9/site-packages/airflow/models/taskinstance.py", line 1506, in _execute_task
result = execute_callable(context=context)
File "/usr/local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 367, in execute
final_state, remote_pod, result = self.create_new_pod_for_operator(labels, launcher)
File "/usr/local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/operators/kubernetes_pod.py", line 524, in create_new_pod_for_operator
final_state, remote_pod, result = launcher.monitor_pod(pod=self.pod, get_logs=self.get_logs)
File "/usr/local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/utils/pod_launcher.py", line 175, in monitor_pod
timestamp, message = self.parse_log_line(line.decode('utf-8'))
File "/usr/local/lib/python3.9/site-packages/airflow/providers/cncf/kubernetes/utils/pod_launcher.py", line 225, in parse_log_line
raise Exception(f'Log not in "{{timestamp}} {{log}}" format. Got: {line}')
Exception: Log not in "{timestamp} {log}" format. Got:
Are you willing to submit PR?
- Yes I am willing to submit a PR!
Code of Conduct
- I agree to follow this project’s Code of Conduct
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 29 (14 by maintainers)
Commits related to this issue
- Stop crashing when empty logs are received from kubernetes client It seems that in some circumstances, the K8S client might return empty logs even if "timestamps" options is specified. That should n... — committed to potiuk/airflow by potiuk 2 years ago
- Stop crashing when empty logs are received from kubernetes client (#22566) It seems that in some circumstances, the K8S client might return empty logs even if "timestamps" options is specified. T... — committed to apache/airflow by potiuk 2 years ago
I found this issue using GCP Composer v2 with KubernetesPodOperator.
And how did you do that, @y0zg?
We’ve started seeing
Error parsing timestamp (no timestamp in message ''). Will continue execution but won't update timestamp
in our logs recently (I think someone upgraded something just before it started appearing) and were initially concerned, but the jobs always succeed. I’m guessing this message can be ignored, but is there anything with more detail explaining why it is safe to ignore?

Because we think the original issue, reported against a version released two years ago, has been fixed. You @romanzdk @ricoms (also a suggestion for @y0zg @tanthml) might have a SIMILAR issue that could be something completely different. The best way you can help someone diagnose and solve your issue is to open a new one in which you describe what happens in your case and provide evidence from your version (ideally after upgrading to the latest version of Airflow, because what you see could have been fixed since).
This is an open-source project where you get the software for free, and the people who solve other people’s problems most often do so in their free time - on weekends and at night. The best way to get help from those people is to make it easy for them to diagnose your issue. I know it is super easy to write “I have the same issue”: you lose no time gathering evidence and writing up the issue. But your issue is likely not the same one, especially since the original was raised for version 2.2, we are two years later at 2.7.1, and the k8s code has been rewritten three times since then. It might also depend on multiple factors such as the Kubernetes version, the provider version, the type of Kubernetes cluster you have, etc. Your comment did not bring anyone any closer to knowing all those details.
So, if you open a new issue @romanzdk and provide good evidence, you increase the chances (but only the chances) that someone will spend their free afternoon or weekend looking at your issue and maybe even fixing it.
If all you have is the question of why this issue is closed, your chances of getting yours solved do not increase by even a fraction of a percent.
So, if you really care about your problem being solved, I suggest you help those who try to help you and provide a good, reproducible issue with solid evidence of what happens, ideally by looking at your system and correlating what happens there (maybe some pods were failing? maybe you can see some unusual behaviour or output from your pods, etc.). That will be a great help to those who spend their nights and weekends trying to help people who use the software for free.
Hi, why is the issue closed? I am on 2.6.1 with airflow-cncf-kubernetes 7.5.1 (latest) and still getting this error
I am hitting this on the very latest Cloud Composer, which uses
apache-airflow-providers-cncf-kubernetes==6.0.0
. The logs are ingested fine until some unexpected warnings from an underlying library are printed, which kills the task.

I solved the issue by adding persistence for logs.
I’m also experiencing this problem while using dbt in Airflow:
Airflow v2.5.0
Kubernetes: 1.24.10 (Azure Kubernetes Service)
apache-airflow-providers-cncf-kubernetes 7.0.0
In this case, we’re at a step in the execution process that can take a long time, and you can see that the error occurs exactly 5 minutes after the previous line was executed. There isn’t an empty log line in the log that follows. My pipeline worked before, when I was dealing with a smaller dataset, because the build time of this step was small enough, but I think there might be some kind of KubernetesPodOperator-related timeout at play here.
@potiuk / anyone else: Any ideas on Airflow variables I could play around with to test this hypothesis?
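One knob worth trying while testing that hypothesis (a workaround sketch, not a fix; the image, command, and task names below are placeholders, and the import path is the one from the traceback above, so it may differ in newer provider versions): with get_logs=False the log-streaming path that raises in the traceback is skipped entirely, and the pod's output can still be inspected with kubectl logs.

```python
# Workaround sketch (assumption: disabling log streaming sidesteps the failing
# monitor_pod/parse_log_line path; the pod itself still runs normally).
from airflow.providers.cncf.kubernetes.operators.kubernetes_pod import (
    KubernetesPodOperator,
)

dbt_run = KubernetesPodOperator(
    task_id="dbt-hourly-run-task",
    name="dbt-hourly-run",
    image="my-dbt-image:latest",    # placeholder image
    cmds=["dbt", "run"],            # placeholder command
    get_logs=False,                 # do not stream container logs into Airflow
    is_delete_operator_pod=False,   # keep the pod so `kubectl logs` still works
)
```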
It has a different kubernetes library. Can you please check it?