airflow: Task processes killed with WARNING - Recorded pid does not match the current pid

Apache Airflow version: 2.1.3

Apache Airflow Provider versions (please include all providers that are relevant to your bug):

apache-airflow-providers-amazon==2.0.0
apache-airflow-providers-celery==2.0.0
apache-airflow-providers-cncf-kubernetes==2.0.0
apache-airflow-providers-docker==2.0.0
apache-airflow-providers-elasticsearch==2.0.2
apache-airflow-providers-ftp==2.0.0
apache-airflow-providers-google==4.0.0
apache-airflow-providers-grpc==2.0.0
apache-airflow-providers-hashicorp==2.0.0
apache-airflow-providers-http==2.0.0
apache-airflow-providers-imap==2.0.0
apache-airflow-providers-microsoft-azure==3.0.0
apache-airflow-providers-mysql==2.0.0
apache-airflow-providers-postgres==2.0.0
apache-airflow-providers-redis==2.0.0
apache-airflow-providers-sendgrid==2.0.0
apache-airflow-providers-sftp==2.0.0
apache-airflow-providers-slack==4.0.0
apache-airflow-providers-sqlite==2.0.0
apache-airflow-providers-ssh==2.0.0

Environment:

I’m using the airflow-2.1.2 container from dockerhub.

  • Cloud provider or hardware configuration: AWS
  • OS (e.g. from /etc/os-release): Debian GNU/Linux 10 (buster)
  • Kernel (e.g. uname -a): Linux fe52079d9ade 5.12.13-200.fc33.x86_64 #1 SMP Wed Jun 23 16:20:26 UTC 2021 x86_64 GNU/Linux

What happened:

When using the EMRStepSensor (set to reschedule mode) to monitor EMR steps, the task will sometimes fail while the EMR step sucessfully ran. Most of the time the sensor will work fine, but every so often this issue occurs (on the same DAG, without modifications).

EMRStepSensor task instance debug log
*** Reading local file: /opt/airflow/logs/derived.adobe_target_catalog_sporza/watch_adobe_target_catalog_sporza_job_emr_step/2021-08-07T05:28:00+00:00/1.log
[2021-08-08 05:29:20,367] {__init__.py:51} DEBUG - Loading core task runner: StandardTaskRunner
[2021-08-08 05:29:21,594] {base_task_runner.py:62} DEBUG - Planning to run as the  user
[2021-08-08 05:29:21,597] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> from DB
[2021-08-08 05:29:22,086] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:29:22,086] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Previous Dagrun State' PASSED: True, The task did not have depends_on_past set.
[2021-08-08 05:29:22,086] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Instance State' PASSED: True, Task state queued was valid.
[2021-08-08 05:29:22,128] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Ready To Reschedule' PASSED: True, The task instance is not in State_UP_FOR_RESCHEDULE or NONE state.
[2021-08-08 05:29:22,128] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Not In Retry Period' PASSED: True, The task instance was not marked for retrying.
[2021-08-08 05:29:22,129] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Instance Not Running' PASSED: True, Task is not in running state.
[2021-08-08 05:29:22,130] {taskinstance.py:896} INFO - Dependencies all met for <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:29:22,130] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Previous Dagrun State' PASSED: True, The task did not have depends_on_past set.
[2021-08-08 05:29:22,165] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Ready To Reschedule' PASSED: True, The task instance is not in State_UP_FOR_RESCHEDULE or NONE state.
[2021-08-08 05:29:22,206] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Not In Retry Period' PASSED: True, The task instance was not marked for retrying.
[2021-08-08 05:29:22,327] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Pool Slots Available' PASSED: True, ('There are enough open slots in %s to execute the task', 'default_pool')
[2021-08-08 05:29:22,327] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Concurrency' PASSED: True, Task concurrency is not set.
[2021-08-08 05:29:22,327] {taskinstance.py:896} INFO - Dependencies all met for <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:29:22,327] {taskinstance.py:1087} INFO - 
--------------------------------------------------------------------------------
[2021-08-08 05:29:22,327] {taskinstance.py:1088} INFO - Starting attempt 1 of 1
[2021-08-08 05:29:22,327] {taskinstance.py:1089} INFO - 
--------------------------------------------------------------------------------
[2021-08-08 05:29:24,417] {taskinstance.py:1107} INFO - Executing <Task(EmrStepSensor): watch_adobe_target_catalog_sporza_job_emr_step> on 2021-08-07T05:28:00+00:00
[2021-08-08 05:29:24,421] {standard_task_runner.py:52} INFO - Started process 7426 to run task
[2021-08-08 05:29:24,427] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'derived.adobe_target_catalog_sporza', 'watch_adobe_target_catalog_sporza_job_emr_step', '2021-08-07T05:28:00+00:00', '--job-id', '98940', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/adobe_target_catalog_sporza_wf.py', '--cfg-path', '/tmp/tmpucsqr5ep', '--error-file', '/tmp/tmpi1wh32jz']
[2021-08-08 05:29:24,427] {standard_task_runner.py:77} INFO - Job 98940: Subtask watch_adobe_target_catalog_sporza_job_emr_step
[2021-08-08 05:29:24,428] {cli_action_loggers.py:66} DEBUG - Calling callbacks: [<function default_action_log at 0x7fc8dfbe5d90>]
[2021-08-08 05:29:26,223] {settings.py:208} DEBUG - Setting up DB connection pool (PID 7426)
[2021-08-08 05:29:26,223] {settings.py:244} DEBUG - settings.prepare_engine_args(): Using NullPool
[2021-08-08 05:29:26,226] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [None]> from DB
[2021-08-08 05:29:27,030] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:29:27,523] {logging_mixin.py:104} INFO - Running <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> on host ip-172-31-58-121.eu-west-1.compute.internal
[2021-08-08 05:29:27,523] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:29:28,218] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:29:28,871] {taskinstance.py:677} DEBUG - Clearing XCom data
[2021-08-08 05:29:29,515] {taskinstance.py:684} DEBUG - XCom data cleared
[2021-08-08 05:29:32,916] {taskinstance.py:1302} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=vrt
AIRFLOW_CTX_DAG_ID=derived.adobe_target_catalog_sporza
AIRFLOW_CTX_TASK_ID=watch_adobe_target_catalog_sporza_job_emr_step
AIRFLOW_CTX_EXECUTION_DATE=2021-08-07T05:28:00+00:00
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2021-08-07T05:28:00+00:00
[2021-08-08 05:29:32,916] {__init__.py:146} DEBUG - Preparing lineage inlets and outlets
[2021-08-08 05:29:32,916] {__init__.py:190} DEBUG - inlets: [], outlets: []
[2021-08-08 05:29:33,486] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:29:36,437] {base_aws.py:368} INFO - Airflow Connection: aws_conn_id=aws_default
[2021-08-08 05:29:36,521] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:29:36,562] {base_job.py:227} DEBUG - [heartbeat]
[2021-08-08 05:29:38,654] {base_aws.py:179} INFO - No credentials retrieved from Connection
[2021-08-08 05:29:38,654] {base_aws.py:87} INFO - Creating session with aws_access_key_id=None region_name=None
[2021-08-08 05:29:38,655] {hooks.py:417} DEBUG - Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane
[2021-08-08 05:29:38,656] {hooks.py:417} DEBUG - Changing event name from before-call.apigateway to before-call.api-gateway
[2021-08-08 05:29:38,657] {hooks.py:417} DEBUG - Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict
[2021-08-08 05:29:38,659] {hooks.py:417} DEBUG - Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration
[2021-08-08 05:29:38,659] {hooks.py:417} DEBUG - Changing event name from before-parameter-build.route53 to before-parameter-build.route-53
[2021-08-08 05:29:38,659] {hooks.py:417} DEBUG - Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search
[2021-08-08 05:29:38,660] {hooks.py:417} DEBUG - Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section
[2021-08-08 05:29:38,663] {hooks.py:417} DEBUG - Changing event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask
[2021-08-08 05:29:38,663] {hooks.py:417} DEBUG - Changing event name from docs.*.logs.CreateExportTask.complete-section to docs.*.cloudwatch-logs.CreateExportTask.complete-section
[2021-08-08 05:29:38,663] {hooks.py:417} DEBUG - Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search
[2021-08-08 05:29:38,663] {hooks.py:417} DEBUG - Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section
[2021-08-08 05:29:38,664] {base_aws.py:157} INFO - role_arn is None
[2021-08-08 05:29:38,667] {utils.py:364} DEBUG - IMDS ENDPOINT: http://169.254.169.254/
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for credentials via: env
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for credentials via: assume-role
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for credentials via: assume-role-with-web-identity
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for credentials via: sso
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for credentials via: shared-credentials-file
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for credentials via: custom-process
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for credentials via: config-file
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for credentials via: ec2-credentials-file
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for credentials via: boto-config
[2021-08-08 05:29:38,670] {credentials.py:1974} DEBUG - Looking for credentials via: container-role
[2021-08-08 05:29:38,671] {connectionpool.py:230} DEBUG - Starting new HTTP connection (1): 169.254.170.2:80
[2021-08-08 05:29:38,673] {connectionpool.py:442} DEBUG - http://169.254.170.2:80 "GET /v2/credentials/c5f7099a-d46e-4472-a48f-6c137db9e75d HTTP/1.1" 200 1307
[2021-08-08 05:29:38,674] {loaders.py:174} DEBUG - Loading JSON file: /home/airflow/.local/lib/python3.6/site-packages/botocore/data/endpoints.json
[2021-08-08 05:29:38,683] {hooks.py:210} DEBUG - Event choose-service-name: calling handler <function handle_service_name_alias at 0x7fc8dbee8bf8>
[2021-08-08 05:29:38,691] {loaders.py:174} DEBUG - Loading JSON file: /home/airflow/.local/lib/python3.6/site-packages/botocore/data/emr/2009-03-31/service-2.json
[2021-08-08 05:29:38,695] {hooks.py:210} DEBUG - Event creating-client-class.emr: calling handler <function add_generate_presigned_url at 0x7fc8dbf947b8>
[2021-08-08 05:29:38,697] {endpoint.py:292} DEBUG - Setting elasticmapreduce timeout as (60, 60)
[2021-08-08 05:29:38,698] {loaders.py:174} DEBUG - Loading JSON file: /home/airflow/.local/lib/python3.6/site-packages/botocore/data/_retry.json
[2021-08-08 05:29:38,698] {client.py:166} DEBUG - Registering retry handlers for service: emr
[2021-08-08 05:29:38,699] {emr_step.py:73} INFO - Poking step s-581MKMYKRED7 on cluster j-XDTNJDHR23RQ
[2021-08-08 05:29:38,699] {hooks.py:210} DEBUG - Event before-parameter-build.emr.DescribeStep: calling handler <function generate_idempotent_uuid at 0x7fc8dbf06048>
[2021-08-08 05:29:38,699] {hooks.py:210} DEBUG - Event before-call.emr.DescribeStep: calling handler <function inject_api_version_header_if_needed at 0x7fc8dbf077b8>
[2021-08-08 05:29:38,699] {endpoint.py:101} DEBUG - Making request for OperationModel(name=DescribeStep) with params: {'url_path': '/', 'query_string': '', 'method': 'POST', 'headers': {'X-Amz-Target': 'ElasticMapReduce.DescribeStep', 'Content-Type': 'application/x-amz-json-1.1', 'User-Agent': 'Boto3/1.17.107 Python/3.6.13 Linux/4.14.238-182.422.amzn2.x86_64 exec-env/AWS_ECS_FARGATE Botocore/1.20.107'}, 'body': b'{"ClusterId": "j-XDTNJDHR23RQ", "StepId": "s-581MKMYKRED7"}', 'url': 'https://eu-west-1.elasticmapreduce.amazonaws.com/', 'context': {'client_region': 'eu-west-1', 'client_config': <botocore.config.Config object at 0x7fc8dba19470>, 'has_streaming_input': False, 'auth_type': None}}
[2021-08-08 05:29:38,700] {hooks.py:210} DEBUG - Event request-created.emr.DescribeStep: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x7fc8dba19438>>
[2021-08-08 05:29:38,700] {hooks.py:210} DEBUG - Event choose-signer.emr.DescribeStep: calling handler <function set_operation_specific_signer at 0x7fc8dbf01ea0>
[2021-08-08 05:29:38,700] {auth.py:380} DEBUG - Calculating signature using v4 auth.
[2021-08-08 05:29:38,700] {auth.py:381} DEBUG - CanonicalRequest:
POST
/

content-type:application/x-amz-json-1.1
host:eu-west-1.elasticmapreduce.amazonaws.com
x-amz-date:20210808T052938Z
x-amz-security-token:IQoJb3JpZ2luX2VjEGwaCWV1LXdlc3QtMSJHMEUCIQCZ4NXI3+3YJKZwyu+L4vTyY4swRSd7+zHkDO3Nwc0JiwIgdDuzR2+qJ5vFE03B1RRhzZYGV4Iy/pt7jVXRr0YjXTUqogQIhP//////////ARABGgwyODM3NzQxNDgzNTciDDl+9YdFHetmruZnfSr2A4vFCZwFNe0LEigdz0ayjIhvTZMsjY6CClml27JRBbUllyWz6t3b5SjafRWERgfEe1xIjnFFvp/UFSDIrZVuOefX94uUcgWQKV5zhz/dHdpCzpPltF+syapd2IeY5lgaUZWr+ax+Pfs75JILyfTpa7u43hg1M6P9L07zP/IUWdrXvyvu0wVh07r6/bHUDKJGy52Ok2q6Hr4ltT4MVAY/EZIEDZA8dlNdd7/UDddecp+/7/QIMftzZsgQJVyO+IIB+cfHCIV+1g/ACnsSbBkJuH2fn/B0YeP6e1135yTFknzNKVUxf6OeBIuPrkcu78HnaOXcjv+JSKb9AcFP+2zOo9JYmr040XGVqesZiAc+uDWbL74LDQ7ze5b4sLSNXfCCEvqsYVy9Nt7UlopCXoIS2lBrWgXNhsppDQ976Gr+WgyV1Wbjx2NZlJFCtSNarwF0qDPBmm6yhQxyKMYarOrUpy9aNl/18ujIhJZ2/JMUYlzH0TOBc1YCKIuZLNPwYfxZW6JmosfKXjAFUK7eDn4tU6LSXyo2GNLYzjY71OyCOpllJYtxg/NDAQBlmMCbFPpFCCTjPECU0sGnhfPAdlsyYngT3L0bSyjAZpPRfYrfPd4i3j2niPq9BwF/iOIR95T2CK7zqxXB4TIpporSE0pItmFgynGTh+IwtZm9iAY6pgENShKQaHxJvNj1O5ddYeCwW3H2SqjrHroeOt82xGkH5stZ7g28Uy7iPIPEehlkrB7ry/KjLq9DAo4zVneHhfKJBUBTBxz944AZhd0d9VvSRGgEXnmxzTsHma9hnkwQfGdbXwH4ja582o7hPEI98AAgoo5OCuV/vyMHHgrJBa7z+7eRtGnCeg+EJZy9WlsilHHGze4QGDEDc4NJypNZM4JON503hXON
x-amz-target:ElasticMapReduce.DescribeStep

content-type;host;x-amz-date;x-amz-security-token;x-amz-target
2a131454435119560c6fa09b6b1eeeb557aba91e61af62fe4883ccc24890c4f3
[2021-08-08 05:29:38,700] {auth.py:383} DEBUG - StringToSign:
AWS4-HMAC-SHA256
20210808T052938Z
20210808/eu-west-1/elasticmapreduce/aws4_request
29f0d229d76daf02f334b8c7d193cb06de4619f4adb7b3057683f53b2ea79c80
[2021-08-08 05:29:38,700] {auth.py:385} DEBUG - Signature:
66e13f874221cb3c1b275a052ac56a7e827d431e940bfe9b14414104e0a3b162
[2021-08-08 05:29:38,701] {endpoint.py:187} DEBUG - Sending http request: <AWSPreparedRequest stream_output=False, method=POST, url=https://eu-west-1.elasticmapreduce.amazonaws.com/, headers={'X-Amz-Target': b'ElasticMapReduce.DescribeStep', 'Content-Type': b'application/x-amz-json-1.1', 'User-Agent': b'Boto3/1.17.107 Python/3.6.13 Linux/4.14.238-182.422.amzn2.x86_64 exec-env/AWS_ECS_FARGATE Botocore/1.20.107', 'X-Amz-Date': b'20210808T052938Z', 'X-Amz-Security-Token': b'IQoJb3JpZ2luX2VjEGwaCWV1LXdlc3QtMSJHMEUCIQCZ4NXI3+3YJKZwyu+L4vTyY4swRSd7+zHkDO3Nwc0JiwIgdDuzR2+qJ5vFE03B1RRhzZYGV4Iy/pt7jVXRr0YjXTUqogQIhP//////////ARABGgwyODM3NzQxNDgzNTciDDl+9YdFHetmruZnfSr2A4vFCZwFNe0LEigdz0ayjIhvTZMsjY6CClml27JRBbUllyWz6t3b5SjafRWERgfEe1xIjnFFvp/UFSDIrZVuOefX94uUcgWQKV5zhz/dHdpCzpPltF+syapd2IeY5lgaUZWr+ax+Pfs75JILyfTpa7u43hg1M6P9L07zP/IUWdrXvyvu0wVh07r6/bHUDKJGy52Ok2q6Hr4ltT4MVAY/EZIEDZA8dlNdd7/UDddecp+/7/QIMftzZsgQJVyO+IIB+cfHCIV+1g/ACnsSbBkJuH2fn/B0YeP6e1135yTFknzNKVUxf6OeBIuPrkcu78HnaOXcjv+JSKb9AcFP+2zOo9JYmr040XGVqesZiAc+uDWbL74LDQ7ze5b4sLSNXfCCEvqsYVy9Nt7UlopCXoIS2lBrWgXNhsppDQ976Gr+WgyV1Wbjx2NZlJFCtSNarwF0qDPBmm6yhQxyKMYarOrUpy9aNl/18ujIhJZ2/JMUYlzH0TOBc1YCKIuZLNPwYfxZW6JmosfKXjAFUK7eDn4tU6LSXyo2GNLYzjY71OyCOpllJYtxg/NDAQBlmMCbFPpFCCTjPECU0sGnhfPAdlsyYngT3L0bSyjAZpPRfYrfPd4i3j2niPq9BwF/iOIR95T2CK7zqxXB4TIpporSE0pItmFgynGTh+IwtZm9iAY6pgENShKQaHxJvNj1O5ddYeCwW3H2SqjrHroeOt82xGkH5stZ7g28Uy7iPIPEehlkrB7ry/KjLq9DAo4zVneHhfKJBUBTBxz944AZhd0d9VvSRGgEXnmxzTsHma9hnkwQfGdbXwH4ja582o7hPEI98AAgoo5OCuV/vyMHHgrJBa7z+7eRtGnCeg+EJZy9WlsilHHGze4QGDEDc4NJypNZM4JON503hXON', 'Authorization': b'AWS4-HMAC-SHA256 Credential=ASIAUEESB34CT3URIQHE/20210808/eu-west-1/elasticmapreduce/aws4_request, SignedHeaders=content-type;host;x-amz-date;x-amz-security-token;x-amz-target, Signature=66e13f874221cb3c1b275a052ac56a7e827d431e940bfe9b14414104e0a3b162', 'Content-Length': '59'}>
[2021-08-08 05:29:38,702] {httpsession.py:50} DEBUG - Certificate path: /home/airflow/.local/lib/python3.6/site-packages/certifi/cacert.pem
[2021-08-08 05:29:38,702] {connectionpool.py:943} DEBUG - Starting new HTTPS connection (1): eu-west-1.elasticmapreduce.amazonaws.com:443
[2021-08-08 05:29:38,752] {connectionpool.py:442} DEBUG - https://eu-west-1.elasticmapreduce.amazonaws.com:443 "POST / HTTP/1.1" 200 799
[2021-08-08 05:29:38,752] {parsers.py:233} DEBUG - Response headers: {'x-amzn-RequestId': 'ecacd9e3-44b2-4a00-a4f0-48efe0d65847', 'Content-Type': 'application/x-amz-json-1.1', 'Content-Length': '799', 'Date': 'Sun, 08 Aug 2021 05:29:38 GMT'}
[2021-08-08 05:29:38,752] {parsers.py:234} DEBUG - Response body:
b'{"Step":{"ActionOnFailure":"CONTINUE","Config":{"Args":["spark-submit","--deploy-mode","cluster","--master","yarn","--driver-memory","2G","--executor-memory","8G","--num-executors","5","--executor-cores","3","--packages","io.delta:delta-core_2.11:0.6.1","--py-files","s3://vrt-datalake-artifacts-prod/adobe-target-catalog-sporza/python/adobe_target_catalog_sporza-1.0.0-py3.7_6.egg,s3://vrt-datalake-artifacts-prod/python-datalake-helpers/vrt_datalake_helpers-21-py3-none-any.whl","s3://vrt-datalake-artifacts-prod/adobe-target-catalog-sporza/python/run_6.py","prod","2021-08-08 05:28:00"],"Jar":"command-runner.jar","Properties":{}},"Id":"s-581MKMYKRED7","Name":"adobe-target-catalog-sporza-job","Status":{"State":"PENDING","StateChangeReason":{},"Timeline":{"CreationDateTime":1.628400523922E9}}}}'
[2021-08-08 05:29:38,753] {hooks.py:210} DEBUG - Event needs-retry.emr.DescribeStep: calling handler <botocore.retryhandler.RetryHandler object at 0x7fc8dba19860>
[2021-08-08 05:29:38,753] {retryhandler.py:187} DEBUG - No retry needed.
[2021-08-08 05:29:38,753] {emr_base.py:66} INFO - Job flow currently PENDING
[2021-08-08 05:29:38,753] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:29:39,718] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:29:39,748] {taskinstance.py:1906} DEBUG - Task Duration set to 17.617928
[2021-08-08 05:29:42,695] {taskinstance.py:1484} INFO - Rescheduling task, marking task as UP_FOR_RESCHEDULE
[2021-08-08 05:29:42,696] {cli_action_loggers.py:84} DEBUG - Calling callbacks: []
[2021-08-08 05:29:44,944] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:29:48,268] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [up_for_reschedule]>
[2021-08-08 05:29:48,674] {base_job.py:227} DEBUG - [heartbeat]
[2021-08-08 05:29:48,674] {local_task_job.py:149} INFO - Task exited with return code 0
[2021-08-08 05:29:48,675] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [up_for_reschedule]> from DB
[2021-08-08 05:29:49,740] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [up_for_reschedule]>
[2021-08-08 05:30:49,437] {__init__.py:51} DEBUG - Loading core task runner: StandardTaskRunner
[2021-08-08 05:30:51,569] {base_task_runner.py:62} DEBUG - Planning to run as the  user
[2021-08-08 05:30:51,572] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> from DB
[2021-08-08 05:30:52,465] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:30:52,465] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Previous Dagrun State' PASSED: True, The task did not have depends_on_past set.
[2021-08-08 05:30:52,465] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Instance State' PASSED: True, Task state queued was valid.
[2021-08-08 05:30:52,544] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Ready To Reschedule' PASSED: True, The task instance is not in State_UP_FOR_RESCHEDULE or NONE state.
[2021-08-08 05:30:52,544] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Not In Retry Period' PASSED: True, The task instance was not marked for retrying.
[2021-08-08 05:30:52,545] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Instance Not Running' PASSED: True, Task is not in running state.
[2021-08-08 05:30:52,546] {taskinstance.py:896} INFO - Dependencies all met for <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:30:52,546] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Previous Dagrun State' PASSED: True, The task did not have depends_on_past set.
[2021-08-08 05:30:52,593] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Ready To Reschedule' PASSED: True, The task instance is not in State_UP_FOR_RESCHEDULE or NONE state.
[2021-08-08 05:30:52,593] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Not In Retry Period' PASSED: True, The task instance was not marked for retrying.
[2021-08-08 05:30:52,756] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Pool Slots Available' PASSED: True, ('There are enough open slots in %s to execute the task', 'default_pool')
[2021-08-08 05:30:52,756] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Concurrency' PASSED: True, Task concurrency is not set.
[2021-08-08 05:30:52,756] {taskinstance.py:896} INFO - Dependencies all met for <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:30:52,756] {taskinstance.py:1087} INFO - 
--------------------------------------------------------------------------------
[2021-08-08 05:30:52,756] {taskinstance.py:1088} INFO - Starting attempt 1 of 1
[2021-08-08 05:30:52,756] {taskinstance.py:1089} INFO - 
--------------------------------------------------------------------------------
[2021-08-08 05:30:53,648] {taskinstance.py:1107} INFO - Executing <Task(EmrStepSensor): watch_adobe_target_catalog_sporza_job_emr_step> on 2021-08-07T05:28:00+00:00
[2021-08-08 05:30:53,651] {standard_task_runner.py:52} INFO - Started process 7972 to run task
[2021-08-08 05:30:53,655] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'derived.adobe_target_catalog_sporza', 'watch_adobe_target_catalog_sporza_job_emr_step', '2021-08-07T05:28:00+00:00', '--job-id', '98953', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/adobe_target_catalog_sporza_wf.py', '--cfg-path', '/tmp/tmppdcqbh0o', '--error-file', '/tmp/tmpdky_uy36']
[2021-08-08 05:30:53,655] {standard_task_runner.py:77} INFO - Job 98953: Subtask watch_adobe_target_catalog_sporza_job_emr_step
[2021-08-08 05:30:53,656] {cli_action_loggers.py:66} DEBUG - Calling callbacks: [<function default_action_log at 0x7fc8dfbe5d90>]
[2021-08-08 05:30:54,515] {settings.py:208} DEBUG - Setting up DB connection pool (PID 7972)
[2021-08-08 05:30:54,515] {settings.py:244} DEBUG - settings.prepare_engine_args(): Using NullPool
[2021-08-08 05:30:54,517] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [None]> from DB
[2021-08-08 05:30:57,090] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:30:58,979] {logging_mixin.py:104} INFO - Running <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> on host ip-172-31-58-121.eu-west-1.compute.internal
[2021-08-08 05:30:58,979] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:30:59,588] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:31:03,931] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:31:04,181] {taskinstance.py:677} DEBUG - Clearing XCom data
[2021-08-08 05:31:04,914] {taskinstance.py:684} DEBUG - XCom data cleared
[2021-08-08 05:31:05,896] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:31:05,934] {base_job.py:227} DEBUG - [heartbeat]
[2021-08-08 05:31:12,815] {taskinstance.py:1302} INFO - Exporting the following env vars:
AIRFLOW_CTX_DAG_OWNER=vrt
AIRFLOW_CTX_DAG_ID=derived.adobe_target_catalog_sporza
AIRFLOW_CTX_TASK_ID=watch_adobe_target_catalog_sporza_job_emr_step
AIRFLOW_CTX_EXECUTION_DATE=2021-08-07T05:28:00+00:00
AIRFLOW_CTX_DAG_RUN_ID=scheduled__2021-08-07T05:28:00+00:00
[2021-08-08 05:31:12,815] {__init__.py:146} DEBUG - Preparing lineage inlets and outlets
[2021-08-08 05:31:12,816] {__init__.py:190} DEBUG - inlets: [], outlets: []
[2021-08-08 05:31:13,431] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:31:13,718] {base_aws.py:368} INFO - Airflow Connection: aws_conn_id=aws_default
[2021-08-08 05:31:14,251] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:31:14,293] {base_job.py:227} DEBUG - [heartbeat]
[2021-08-08 05:31:14,780] {base_aws.py:179} INFO - No credentials retrieved from Connection
[2021-08-08 05:31:14,780] {base_aws.py:87} INFO - Creating session with aws_access_key_id=None region_name=None
[2021-08-08 05:31:14,781] {hooks.py:417} DEBUG - Changing event name from creating-client-class.iot-data to creating-client-class.iot-data-plane
[2021-08-08 05:31:14,783] {hooks.py:417} DEBUG - Changing event name from before-call.apigateway to before-call.api-gateway
[2021-08-08 05:31:14,784] {hooks.py:417} DEBUG - Changing event name from request-created.machinelearning.Predict to request-created.machine-learning.Predict
[2021-08-08 05:31:14,786] {hooks.py:417} DEBUG - Changing event name from before-parameter-build.autoscaling.CreateLaunchConfiguration to before-parameter-build.auto-scaling.CreateLaunchConfiguration
[2021-08-08 05:31:14,786] {hooks.py:417} DEBUG - Changing event name from before-parameter-build.route53 to before-parameter-build.route-53
[2021-08-08 05:31:14,786] {hooks.py:417} DEBUG - Changing event name from request-created.cloudsearchdomain.Search to request-created.cloudsearch-domain.Search
[2021-08-08 05:31:14,787] {hooks.py:417} DEBUG - Changing event name from docs.*.autoscaling.CreateLaunchConfiguration.complete-section to docs.*.auto-scaling.CreateLaunchConfiguration.complete-section
[2021-08-08 05:31:14,790] {hooks.py:417} DEBUG - Changing event name from before-parameter-build.logs.CreateExportTask to before-parameter-build.cloudwatch-logs.CreateExportTask
[2021-08-08 05:31:14,790] {hooks.py:417} DEBUG - Changing event name from docs.*.logs.CreateExportTask.complete-section to docs.*.cloudwatch-logs.CreateExportTask.complete-section
[2021-08-08 05:31:14,790] {hooks.py:417} DEBUG - Changing event name from before-parameter-build.cloudsearchdomain.Search to before-parameter-build.cloudsearch-domain.Search
[2021-08-08 05:31:14,790] {hooks.py:417} DEBUG - Changing event name from docs.*.cloudsearchdomain.Search.complete-section to docs.*.cloudsearch-domain.Search.complete-section
[2021-08-08 05:31:14,791] {base_aws.py:157} INFO - role_arn is None
[2021-08-08 05:31:14,794] {utils.py:364} DEBUG - IMDS ENDPOINT: http://169.254.169.254/
[2021-08-08 05:31:14,796] {credentials.py:1974} DEBUG - Looking for credentials via: env
[2021-08-08 05:31:14,796] {credentials.py:1974} DEBUG - Looking for credentials via: assume-role
[2021-08-08 05:31:14,796] {credentials.py:1974} DEBUG - Looking for credentials via: assume-role-with-web-identity
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for credentials via: sso
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for credentials via: shared-credentials-file
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for credentials via: custom-process
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for credentials via: config-file
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for credentials via: ec2-credentials-file
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for credentials via: boto-config
[2021-08-08 05:31:14,797] {credentials.py:1974} DEBUG - Looking for credentials via: container-role
[2021-08-08 05:31:14,798] {connectionpool.py:230} DEBUG - Starting new HTTP connection (1): 169.254.170.2:80
[2021-08-08 05:31:14,799] {connectionpool.py:442} DEBUG - http://169.254.170.2:80 "GET /v2/credentials/c5f7099a-d46e-4472-a48f-6c137db9e75d HTTP/1.1" 200 1307
[2021-08-08 05:31:14,800] {loaders.py:174} DEBUG - Loading JSON file: /home/airflow/.local/lib/python3.6/site-packages/botocore/data/endpoints.json
[2021-08-08 05:31:14,810] {hooks.py:210} DEBUG - Event choose-service-name: calling handler <function handle_service_name_alias at 0x7fc8dbee9bf8>
[2021-08-08 05:31:14,817] {loaders.py:174} DEBUG - Loading JSON file: /home/airflow/.local/lib/python3.6/site-packages/botocore/data/emr/2009-03-31/service-2.json
[2021-08-08 05:31:14,822] {hooks.py:210} DEBUG - Event creating-client-class.emr: calling handler <function add_generate_presigned_url at 0x7fc8dbf957b8>
[2021-08-08 05:31:14,824] {endpoint.py:292} DEBUG - Setting elasticmapreduce timeout as (60, 60)
[2021-08-08 05:31:14,825] {loaders.py:174} DEBUG - Loading JSON file: /home/airflow/.local/lib/python3.6/site-packages/botocore/data/_retry.json
[2021-08-08 05:31:14,825] {client.py:166} DEBUG - Registering retry handlers for service: emr
[2021-08-08 05:31:14,826] {emr_step.py:73} INFO - Poking step s-581MKMYKRED7 on cluster j-XDTNJDHR23RQ
[2021-08-08 05:31:14,826] {hooks.py:210} DEBUG - Event before-parameter-build.emr.DescribeStep: calling handler <function generate_idempotent_uuid at 0x7fc8dbf07048>
[2021-08-08 05:31:14,826] {hooks.py:210} DEBUG - Event before-call.emr.DescribeStep: calling handler <function inject_api_version_header_if_needed at 0x7fc8dbf087b8>
[2021-08-08 05:31:14,826] {endpoint.py:101} DEBUG - Making request for OperationModel(name=DescribeStep) with params: {'url_path': '/', 'query_string': '', 'method': 'POST', 'headers': {'X-Amz-Target': 'ElasticMapReduce.DescribeStep', 'Content-Type': 'application/x-amz-json-1.1', 'User-Agent': 'Boto3/1.17.107 Python/3.6.13 Linux/4.14.238-182.422.amzn2.x86_64 exec-env/AWS_ECS_FARGATE Botocore/1.20.107'}, 'body': b'{"ClusterId": "j-XDTNJDHR23RQ", "StepId": "s-581MKMYKRED7"}', 'url': 'https://eu-west-1.elasticmapreduce.amazonaws.com/', 'context': {'client_region': 'eu-west-1', 'client_config': <botocore.config.Config object at 0x7fc8dba0feb8>, 'has_streaming_input': False, 'auth_type': None}}
[2021-08-08 05:31:14,826] {hooks.py:210} DEBUG - Event request-created.emr.DescribeStep: calling handler <bound method RequestSigner.handler of <botocore.signers.RequestSigner object at 0x7fc8dba0fe80>>
[2021-08-08 05:31:14,827] {hooks.py:210} DEBUG - Event choose-signer.emr.DescribeStep: calling handler <function set_operation_specific_signer at 0x7fc8dbf01ea0>
[2021-08-08 05:31:14,827] {auth.py:380} DEBUG - Calculating signature using v4 auth.
[2021-08-08 05:31:14,827] {auth.py:381} DEBUG - CanonicalRequest:
POST
/

content-type:application/x-amz-json-1.1
host:eu-west-1.elasticmapreduce.amazonaws.com
x-amz-date:20210808T053114Z
x-amz-security-token:IQoJb3JpZ2luX2VjEGwaCWV1LXdlc3QtMSJHMEUCIQCZ4NXI3+3YJKZwyu+L4vTyY4swRSd7+zHkDO3Nwc0JiwIgdDuzR2+qJ5vFE03B1RRhzZYGV4Iy/pt7jVXRr0YjXTUqogQIhP//////////ARABGgwyODM3NzQxNDgzNTciDDl+9YdFHetmruZnfSr2A4vFCZwFNe0LEigdz0ayjIhvTZMsjY6CClml27JRBbUllyWz6t3b5SjafRWERgfEe1xIjnFFvp/UFSDIrZVuOefX94uUcgWQKV5zhz/dHdpCzpPltF+syapd2IeY5lgaUZWr+ax+Pfs75JILyfTpa7u43hg1M6P9L07zP/IUWdrXvyvu0wVh07r6/bHUDKJGy52Ok2q6Hr4ltT4MVAY/EZIEDZA8dlNdd7/UDddecp+/7/QIMftzZsgQJVyO+IIB+cfHCIV+1g/ACnsSbBkJuH2fn/B0YeP6e1135yTFknzNKVUxf6OeBIuPrkcu78HnaOXcjv+JSKb9AcFP+2zOo9JYmr040XGVqesZiAc+uDWbL74LDQ7ze5b4sLSNXfCCEvqsYVy9Nt7UlopCXoIS2lBrWgXNhsppDQ976Gr+WgyV1Wbjx2NZlJFCtSNarwF0qDPBmm6yhQxyKMYarOrUpy9aNl/18ujIhJZ2/JMUYlzH0TOBc1YCKIuZLNPwYfxZW6JmosfKXjAFUK7eDn4tU6LSXyo2GNLYzjY71OyCOpllJYtxg/NDAQBlmMCbFPpFCCTjPECU0sGnhfPAdlsyYngT3L0bSyjAZpPRfYrfPd4i3j2niPq9BwF/iOIR95T2CK7zqxXB4TIpporSE0pItmFgynGTh+IwtZm9iAY6pgENShKQaHxJvNj1O5ddYeCwW3H2SqjrHroeOt82xGkH5stZ7g28Uy7iPIPEehlkrB7ry/KjLq9DAo4zVneHhfKJBUBTBxz944AZhd0d9VvSRGgEXnmxzTsHma9hnkwQfGdbXwH4ja582o7hPEI98AAgoo5OCuV/vyMHHgrJBa7z+7eRtGnCeg+EJZy9WlsilHHGze4QGDEDc4NJypNZM4JON503hXON
x-amz-target:ElasticMapReduce.DescribeStep

content-type;host;x-amz-date;x-amz-security-token;x-amz-target
2a131454435119560c6fa09b6b1eeeb557aba91e61af62fe4883ccc24890c4f3
[2021-08-08 05:31:14,827] {auth.py:383} DEBUG - StringToSign:
AWS4-HMAC-SHA256
20210808T053114Z
20210808/eu-west-1/elasticmapreduce/aws4_request
a2f4ed4536a8bb05d35fc345f45eec9b42f7e071c230286854f034ea4878ced2
[2021-08-08 05:31:14,827] {auth.py:385} DEBUG - Signature:
90da79c86917ea04d8aed50abba5b1b3d5152e5327941eda3bf485e6af620e6b
[2021-08-08 05:31:14,827] {endpoint.py:187} DEBUG - Sending http request: <AWSPreparedRequest stream_output=False, method=POST, url=https://eu-west-1.elasticmapreduce.amazonaws.com/, headers={'X-Amz-Target': b'ElasticMapReduce.DescribeStep', 'Content-Type': b'application/x-amz-json-1.1', 'User-Agent': b'Boto3/1.17.107 Python/3.6.13 Linux/4.14.238-182.422.amzn2.x86_64 exec-env/AWS_ECS_FARGATE Botocore/1.20.107', 'X-Amz-Date': b'20210808T053114Z', 'X-Amz-Security-Token': b'IQoJb3JpZ2luX2VjEGwaCWV1LXdlc3QtMSJHMEUCIQCZ4NXI3+3YJKZwyu+L4vTyY4swRSd7+zHkDO3Nwc0JiwIgdDuzR2+qJ5vFE03B1RRhzZYGV4Iy/pt7jVXRr0YjXTUqogQIhP//////////ARABGgwyODM3NzQxNDgzNTciDDl+9YdFHetmruZnfSr2A4vFCZwFNe0LEigdz0ayjIhvTZMsjY6CClml27JRBbUllyWz6t3b5SjafRWERgfEe1xIjnFFvp/UFSDIrZVuOefX94uUcgWQKV5zhz/dHdpCzpPltF+syapd2IeY5lgaUZWr+ax+Pfs75JILyfTpa7u43hg1M6P9L07zP/IUWdrXvyvu0wVh07r6/bHUDKJGy52Ok2q6Hr4ltT4MVAY/EZIEDZA8dlNdd7/UDddecp+/7/QIMftzZsgQJVyO+IIB+cfHCIV+1g/ACnsSbBkJuH2fn/B0YeP6e1135yTFknzNKVUxf6OeBIuPrkcu78HnaOXcjv+JSKb9AcFP+2zOo9JYmr040XGVqesZiAc+uDWbL74LDQ7ze5b4sLSNXfCCEvqsYVy9Nt7UlopCXoIS2lBrWgXNhsppDQ976Gr+WgyV1Wbjx2NZlJFCtSNarwF0qDPBmm6yhQxyKMYarOrUpy9aNl/18ujIhJZ2/JMUYlzH0TOBc1YCKIuZLNPwYfxZW6JmosfKXjAFUK7eDn4tU6LSXyo2GNLYzjY71OyCOpllJYtxg/NDAQBlmMCbFPpFCCTjPECU0sGnhfPAdlsyYngT3L0bSyjAZpPRfYrfPd4i3j2niPq9BwF/iOIR95T2CK7zqxXB4TIpporSE0pItmFgynGTh+IwtZm9iAY6pgENShKQaHxJvNj1O5ddYeCwW3H2SqjrHroeOt82xGkH5stZ7g28Uy7iPIPEehlkrB7ry/KjLq9DAo4zVneHhfKJBUBTBxz944AZhd0d9VvSRGgEXnmxzTsHma9hnkwQfGdbXwH4ja582o7hPEI98AAgoo5OCuV/vyMHHgrJBa7z+7eRtGnCeg+EJZy9WlsilHHGze4QGDEDc4NJypNZM4JON503hXON', 'Authorization': b'AWS4-HMAC-SHA256 Credential=ASIAUEESB34CT3URIQHE/20210808/eu-west-1/elasticmapreduce/aws4_request, SignedHeaders=content-type;host;x-amz-date;x-amz-security-token;x-amz-target, Signature=90da79c86917ea04d8aed50abba5b1b3d5152e5327941eda3bf485e6af620e6b', 'Content-Length': '59'}>
[2021-08-08 05:31:14,828] {httpsession.py:50} DEBUG - Certificate path: /home/airflow/.local/lib/python3.6/site-packages/certifi/cacert.pem
[2021-08-08 05:31:14,828] {connectionpool.py:943} DEBUG - Starting new HTTPS connection (1): eu-west-1.elasticmapreduce.amazonaws.com:443
[2021-08-08 05:31:15,021] {connectionpool.py:442} DEBUG - https://eu-west-1.elasticmapreduce.amazonaws.com:443 "POST / HTTP/1.1" 200 799
[2021-08-08 05:31:15,021] {parsers.py:233} DEBUG - Response headers: {'x-amzn-RequestId': '12e9220a-4dc3-4ca0-9ec9-93a529032f9b', 'Content-Type': 'application/x-amz-json-1.1', 'Content-Length': '799', 'Date': 'Sun, 08 Aug 2021 05:31:14 GMT'}
[2021-08-08 05:31:15,021] {parsers.py:234} DEBUG - Response body:
b'{"Step":{"ActionOnFailure":"CONTINUE","Config":{"Args":["spark-submit","--deploy-mode","cluster","--master","yarn","--driver-memory","2G","--executor-memory","8G","--num-executors","5","--executor-cores","3","--packages","io.delta:delta-core_2.11:0.6.1","--py-files","s3://vrt-datalake-artifacts-prod/adobe-target-catalog-sporza/python/adobe_target_catalog_sporza-1.0.0-py3.7_6.egg,s3://vrt-datalake-artifacts-prod/python-datalake-helpers/vrt_datalake_helpers-21-py3-none-any.whl","s3://vrt-datalake-artifacts-prod/adobe-target-catalog-sporza/python/run_6.py","prod","2021-08-08 05:28:00"],"Jar":"command-runner.jar","Properties":{}},"Id":"s-581MKMYKRED7","Name":"adobe-target-catalog-sporza-job","Status":{"State":"PENDING","StateChangeReason":{},"Timeline":{"CreationDateTime":1.628400523922E9}}}}'
[2021-08-08 05:31:15,022] {hooks.py:210} DEBUG - Event needs-retry.emr.DescribeStep: calling handler <botocore.retryhandler.RetryHandler object at 0x7fc8db9b82e8>
[2021-08-08 05:31:15,022] {retryhandler.py:187} DEBUG - No retry needed.
[2021-08-08 05:31:15,023] {emr_base.py:66} INFO - Job flow currently PENDING
[2021-08-08 05:31:15,023] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:31:17,730] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:31:17,771] {taskinstance.py:1906} DEBUG - Task Duration set to 25.225634
[2021-08-08 05:31:21,580] {taskinstance.py:1484} INFO - Rescheduling task, marking task as UP_FOR_RESCHEDULE
[2021-08-08 05:31:21,581] {cli_action_loggers.py:84} DEBUG - Calling callbacks: []
[2021-08-08 05:31:24,405] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:31:26,452] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [up_for_reschedule]>
[2021-08-08 05:31:26,531] {base_job.py:227} DEBUG - [heartbeat]
[2021-08-08 05:31:26,531] {local_task_job.py:149} INFO - Task exited with return code 0
[2021-08-08 05:31:26,531] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [up_for_reschedule]> from DB
[2021-08-08 05:31:27,269] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [up_for_reschedule]>
[2021-08-08 05:32:35,215] {__init__.py:51} DEBUG - Loading core task runner: StandardTaskRunner
[2021-08-08 05:32:36,815] {base_task_runner.py:62} DEBUG - Planning to run as the  user
[2021-08-08 05:32:36,817] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> from DB
[2021-08-08 05:32:38,537] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:32:38,537] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Previous Dagrun State' PASSED: True, The task did not have depends_on_past set.
[2021-08-08 05:32:38,538] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Instance State' PASSED: True, Task state queued was valid.
[2021-08-08 05:32:38,578] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Ready To Reschedule' PASSED: True, The task instance is not in State_UP_FOR_RESCHEDULE or NONE state.
[2021-08-08 05:32:38,578] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Not In Retry Period' PASSED: True, The task instance was not marked for retrying.
[2021-08-08 05:32:38,579] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Instance Not Running' PASSED: True, Task is not in running state.
[2021-08-08 05:32:38,579] {taskinstance.py:896} INFO - Dependencies all met for <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:32:38,579] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Previous Dagrun State' PASSED: True, The task did not have depends_on_past set.
[2021-08-08 05:32:39,309] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Ready To Reschedule' PASSED: True, The task instance is not in State_UP_FOR_RESCHEDULE or NONE state.
[2021-08-08 05:32:39,310] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Not In Retry Period' PASSED: True, The task instance was not marked for retrying.
[2021-08-08 05:32:39,477] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Pool Slots Available' PASSED: True, ('There are enough open slots in %s to execute the task', 'default_pool')
[2021-08-08 05:32:39,477] {taskinstance.py:911} DEBUG - <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]> dependency 'Task Concurrency' PASSED: True, Task concurrency is not set.
[2021-08-08 05:32:39,477] {taskinstance.py:896} INFO - Dependencies all met for <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [queued]>
[2021-08-08 05:32:39,477] {taskinstance.py:1087} INFO - 
--------------------------------------------------------------------------------
[2021-08-08 05:32:39,477] {taskinstance.py:1088} INFO - Starting attempt 1 of 1
[2021-08-08 05:32:39,477] {taskinstance.py:1089} INFO - 
--------------------------------------------------------------------------------
[2021-08-08 05:32:43,976] {taskinstance.py:1107} INFO - Executing <Task(EmrStepSensor): watch_adobe_target_catalog_sporza_job_emr_step> on 2021-08-07T05:28:00+00:00
[2021-08-08 05:32:43,979] {standard_task_runner.py:52} INFO - Started process 8339 to run task
[2021-08-08 05:32:43,984] {standard_task_runner.py:76} INFO - Running: ['airflow', 'tasks', 'run', 'derived.adobe_target_catalog_sporza', 'watch_adobe_target_catalog_sporza_job_emr_step', '2021-08-07T05:28:00+00:00', '--job-id', '98962', '--pool', 'default_pool', '--raw', '--subdir', 'DAGS_FOLDER/adobe_target_catalog_sporza_wf.py', '--cfg-path', '/tmp/tmpbqux_rdl', '--error-file', '/tmp/tmpyb9gdojg']
[2021-08-08 05:32:43,984] {standard_task_runner.py:77} INFO - Job 98962: Subtask watch_adobe_target_catalog_sporza_job_emr_step
[2021-08-08 05:32:43,985] {cli_action_loggers.py:66} DEBUG - Calling callbacks: [<function default_action_log at 0x7fc8dfbe5d90>]
[2021-08-08 05:32:51,719] {settings.py:208} DEBUG - Setting up DB connection pool (PID 8339)
[2021-08-08 05:32:51,720] {settings.py:244} DEBUG - settings.prepare_engine_args(): Using NullPool
[2021-08-08 05:32:51,722] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [None]> from DB
[2021-08-08 05:32:53,315] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:32:54,868] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:32:55,812] {logging_mixin.py:104} INFO - Running <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> on host ip-172-31-58-121.eu-west-1.compute.internal
[2021-08-08 05:32:55,813] {taskinstance.py:614} DEBUG - Refreshing TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]> from DB
[2021-08-08 05:32:57,199] {taskinstance.py:649} DEBUG - Refreshed TaskInstance <TaskInstance: derived.adobe_target_catalog_sporza.watch_adobe_target_catalog_sporza_job_emr_step 2021-08-07T05:28:00+00:00 [running]>
[2021-08-08 05:32:57,203] {local_task_job.py:194} WARNING - Recorded pid 7972 does not match the current pid 8339
[2021-08-08 05:32:57,206] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 8339
[2021-08-08 05:32:57,210] {process_utils.py:66} INFO - Process psutil.Process(pid=8339, status='terminated', exitcode=<Negsignal.SIGTERM: -15>, started='05:32:43') (8339) terminated with exit code Negsignal.SIGTERM

What you expected to happen:

I’d expect the EMRStepSensor to run until the EMR step succeeded, and report a succesful run.

If my understanding is correct, these final lines in the log show the runner terminating the task process. If I’m reading the log correctly, 8339 is the correct PID for the task, and the recorded pid 7972 is the pid for a previous run. Could it be possible that this pid is not correctly being updated?

[2021-08-08 05:32:57,203] {local_task_job.py:194} WARNING - Recorded pid 7972 does not match the current pid 8339
[2021-08-08 05:32:57,206] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 8339
[2021-08-08 05:32:57,210] {process_utils.py:66} INFO - Process psutil.Process(pid=8339, status='terminated', exitcode=<Negsignal.SIGTERM: -15>, started='05:32:43') (8339) terminated with exit code Negsignal.SIGTERM

Anything else we need to know:

The symptoms look very similar to #17394, but I’m not using run_as_user, and the reported pids are not the same, so I’m not sure whether this is the same issue.

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Reactions: 16
  • Comments: 66 (24 by maintainers)

Most upvoted comments

I found that I change aiflow.cfg default_impersonation = myuser. When I set it back to default default_impersonation= this issue is resolved for me in Airflow 2.2.2. I made this change after I found Logs suppressed when impersonation is enabled https://github.com/apache/airflow/issues/18041

For those having this issue while using multiple schedulers, I added a fix of an issue I got while running two schedulers. You can try the patch. See https://github.com/apache/airflow/pull/18975

cf https://github.com/apache/airflow/issues/18041#issuecomment-915383493

For me for the moment the pb is on backfill mode. tonight my processes will run on this new airflow session I will see If I get the same errors.

UPDATE: I have modified the param scheduler_heartbeat_sec to 60 sec instead of 5 sec and it is better: so maybe a problem of performance in the backend (postgresql) creates this issue … @kiwy42 what backend are u using to store the task instances ?

Is there any workaround since my dags are no longer working properly ? tks

Hi all,

I am experiencing this on 2.3.2 with LocalExecutor (4 schedulers), Postgres, and Ubuntu 22.04.

This is, however, running a clone of our staging environment of dags that run fine on 2.1.4 and Ubuntu 16.04. I’m also running on a much smaller and less powerful instance, and so it may be exacerbating race conditions.

I did some investigation into the process state, and when this error leads to a failure, this is what I see in process executions:

  • The Scheduler task is the root of everything, as you’d expect (airflow scheduler)
  • recorded_pid , which is assigned to be the taskinstance pid (ti.pid) normally and the parent of the taskinstance pid (psutil.Process(ti.pid).ppid()) when RUN_AS_USER is set. When failing, this consistently shows up as the worker (worker -- LocalExecutor). This is a persistent and long term process.
  • The child of the recorded_pid is the pid of the current process (as reported by os.getpid()), which is the airflow task supervisor. This (and everything below) is one of the short term task-specific processes.
  • The current_pid can be different things, but always appears to be the child of the task supervisor / current pid. Often times this must be a fleeting process as I can barely catch a record of it when I’m trying to fetch a snapshot. Here are a couple that I have seen:
    • In some cases, I have seen this as the task runner’s pid - airflow tasks run [taskname]
    • I have also seen this as the airflow task su, and the tasks are RUN_AS_USER, so likely related.

I came to wonder, since this error happens because (a) the final recorded_pid is not None and (B) recorded_pid != current_pid - it doesn’t make much sense to ever be comparing against the Task Instance pid since that’s hanging around for a very long time and the heatbeat function appears to be identifying when the current task runner is zombified or missing.

As I’ve investigated further, I’ve found on task failures for RUN_AS_USER tasks in which this fails, the ti.pid is almost invariably None, which means the recorded_pid comes in as psutil.Process(None).ppid(), which will be the parent of the current process. I am currently under the impression that this was not intended - and that the error condition should only be tested when ti.pid is not None, instead of recorded_pid is not None.

I’m testing this right now and it seems to work - and if that seems to hold up I’ll put in a PR.

Thanks, @potiuk will re-raise again if the observed issue again.

@shubhampatel94 and @potiuk -

Note that since (my) patch was applied that fixed the race condition, I have occasionally seen this error when the process was killed for another reason - for example, we have OS monitors that will kill processes that are being bad citizens, or certain times when the task had an unexpected exception and died by itself.

I’ve verified in these cases that the task failed to complete and Airflow throws this error message, but Airflow is not the ultimate culprit that is terminating the tasks, and this is the message terminating the containing process of the dead task. Is it possible that this is what is happening with you?

At some point I was going to try to dig deeper to verify the situation and propose a better way to identify it and send a better error message, but I haven’t had the time.

I am still seeing this issue with Airflow 2.2.2 and the “tutorial” DAG sleep task. https://airflow.apache.org/docs/apache-airflow/stable/_modules/airflow/example_dags/tutorial.html Here is the log

[2021-11-17 14:21:40,353] {taskinstance.py:1035} INFO - Dependencies all met for <TaskInstance: tutorial.sleep scheduled__2021-11-16T20:43:29.803380+00:00 [queued]>
[2021-11-17 14:21:40,361] {taskinstance.py:1035} INFO - Dependencies all met for <TaskInstance: tutorial.sleep scheduled__2021-11-16T20:43:29.803380+00:00 [queued]>
[2021-11-17 14:21:40,361] {taskinstance.py:1241} INFO -
--------------------------------------------------------------------------------
[2021-11-17 14:21:40,361] {taskinstance.py:1242} INFO - Starting attempt 2 of 4
[2021-11-17 14:21:40,361] {taskinstance.py:1243} INFO -
--------------------------------------------------------------------------------
[2021-11-17 14:21:40,370] {taskinstance.py:1262} INFO - Executing <Task(BashOperator): sleep> on 2021-11-16 20:43:29.803380+00:00
[2021-11-17 14:21:40,371] {base_task_runner.py:141} INFO - Running on host: ***
[2021-11-17 14:21:40,371] {base_task_runner.py:142} INFO - Running: ['airflow', 'tasks', 'run', 'tutorial', 'sleep', 'scheduled__2021-11-16T20:43:29.803380+00:00', '--job-id', '5650', '--raw', '--subdir', 'DAGS_FOLDER/tutorial.py', '--cfg-path', '/tmp/tmpqizvuf8t', '--error-file', '/tmp/tmpwyrb231g']
[2021-11-17 14:21:41,834] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep [2021-11-17 14:21:41,834] {dagbag.py:500} INFO - Filling up the DagBag from /home/myuser/airflow/dags/tutorial.py
[2021-11-17 14:21:45,432] {local_task_job.py:206} WARNING - Recorded pid 17329 does not match the current pid 17330
[2021-11-17 14:21:45,435] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 17330
[2021-11-17 14:21:45,460] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep Running <TaskInstance: tutorial.sleep scheduled__2021-11-16T20:43:29.803380+00:00 [running]> on host ***
[2021-11-17 14:21:45,462] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep Traceback (most recent call last):
[2021-11-17 14:21:45,462] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/bin/airflow", line 8, in <module>
[2021-11-17 14:21:45,462] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     sys.exit(main())
[2021-11-17 14:21:45,462] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/__main__.py", line 48, in main
[2021-11-17 14:21:45,462] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     args.func(args)
[2021-11-17 14:21:45,462] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/cli/cli_parser.py", line 48, in command
[2021-11-17 14:21:45,462] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     return func(*args, **kwargs)
[2021-11-17 14:21:45,462] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/utils/cli.py", line 92, in wrapper
[2021-11-17 14:21:45,462] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     return f(*args, **kwargs)
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/cli/commands/task_command.py", line 292, in task_run
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     _run_task_by_selected_method(args, dag, ti)
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/cli/commands/task_command.py", line 107, in _run_task_by_selected_method
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     _run_raw_task(args, ti)
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/cli/commands/task_command.py", line 180, in _run_raw_task
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     ti._run_raw_task(
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/utils/session.py", line 70, in wrapper
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     return func(*args, session=session, **kwargs)
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line 1332, in _run_raw_task
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     self._execute_task_with_callbacks(context)
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line 1458, in _execute_task_with_callbacks
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     result = self._execute_task(context, self.task)
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line 1514, in _execute_task
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     result = execute_callable(context=context)
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/operators/bash.py", line 178, in execute
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     result = self.subprocess_hook.run_command(
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/hooks/subprocess.py", line 87, in run_command
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     for raw_line in iter(self.sub_process.stdout.readline, b''):
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep   File "/usr/local/lib/python3.8/dist-packages/airflow/models/taskinstance.py", line 1413, in signal_handler
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep     raise AirflowException("Task received SIGTERM signal")
[2021-11-17 14:21:45,463] {base_task_runner.py:122} INFO - Job 5650: Subtask sleep airflow.exceptions.AirflowException: Task received SIGTERM signal
[2021-11-17 14:21:45,648] {process_utils.py:66} INFO - Process psutil.Process(pid=17334, status='terminated', started='14:21:41') (17334) terminated with exit code None
[2021-11-17 14:21:45,741] {process_utils.py:66} INFO - Process psutil.Process(pid=17330, status='terminated', exitcode=1, started='14:21:40') (17330) terminated with exit code 1

airflow info

Apache Airflow version | 2.2.2 executor | LocalExecutor task_logging_handler | airflow.utils.log.file_task_handler.FileTaskHandler sql_alchemy_conn | mysql://myuser@localhost/airflow dags_folder | /home/myuser/airflow/dags plugins_folder | /home/myuser/airflow/plugins base_log_folder | /home/myuser/airflow/logs remote_base_log_folder |

System info OS | Linux architecture | x86_64 uname | uname_result(system=‘Linux’, node=‘*****’, release=‘4.15.0-159-generic’, version=‘#167~16.04.1-Ubuntu SMP Wed | Sep 22 14:59:34 UTC 2021’, machine=‘x86_64’, processor=‘x86_64’) locale | (‘en_US’, ‘UTF-8’) python_version | 3.8.9 (default, Apr 3 2021, 01:02:10) [GCC 5.4.0 20160609] python_location | /usr/bin/python3.8

Tools info git | git version 2.7.4 ssh | OpenSSH_7.2p2 Ubuntu-4ubuntu2.10, OpenSSL 1.0.2g 1 Mar 2016 kubectl | NOT AVAILABLE gcloud | NOT AVAILABLE cloud_sql_proxy | NOT AVAILABLE mysql | mysql Ver 14.14 Distrib 5.7.33, for Linux (x86_64) using EditLine wrapper sqlite3 | NOT AVAILABLE psql | NOT AVAILABLE

Paths info airflow_home | /home/myuser/airflow system_path | /usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin: python_path | /usr/local/bin:/usr/lib/python38.zip:/usr/lib/python3.8:/usr/lib/python3.8/lib-dynload:/usr/local/lib/python3.8/dist-packages:/usr/lib/python3/di | st-packages:/home/myuser/airflow/dags:/home/myuser/airflow/config:/home/myuser/airflow/plugins airflow_on_path | True

Providers info apache-airflow-providers-ftp | 2.0.1 apache-airflow-providers-http | 2.0.1 apache-airflow-providers-imap | 2.0.1 apache-airflow-providers-mysql | 2.1.1 apache-airflow-providers-sqlite | 2.0.1

For those having this issue while using multiple schedulers, I added a fix of an issue I got while running two schedulers. You can try the patch. See #18975

I think this patch has solved the issue for me (or at least reduced its occurrence substantially) - Airflow 2.1.4, Python 3.7.9, using 2 schedulers

Ideally please upgrade to 2.2.2 and get us a fresh log with this error - this might be immensely helpful to diagnose and fix the issue @brosenkrantz @imamdigmi @stephenonethree and anyone else facing the problem.

The best way to do it is to create a gist in GitHub and link to it from this issue.

Facing this same issue running Airflow 2.1.x-2.2.x using this repo’s helm chart on AKS running two schedulers.

hello , I’m facing the same issue : running airflow version 2.1.4 with python version 3.8.10 [2021-10-06 15:44:35,118] {local_task_job.py:203} WARNING - Recorded pid 3761164 does not match the current pid 3761425 [2021-10-06 15:44:35,143] {process_utils.py:100} INFO - Sending Signals.SIGTERM to GPID 3761425 [2021-10-06 15:44:35,153] {process_utils.py:66} INFO - Process psutil.Process(pid=3761425, status='terminated', exitcode=<Negsignal.SIGTERM: -15>, started='15:44:29') (3761425) terminated with exit code Negsignal.SIGTERM