apm-agent-python: ElasticAPM is hanging lambda function after processing all events in an SQS queue

I believe ElasticAPM will hang the lambda function when no more events exist in the SQS queue.

To Reproduce

  • Set up a Lambda function to process events from an SQS queue
  • Send events to an SQS queue
  • Set ELASTIC_APM_LOG_LEVEL: debug
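
For readers trying to reproduce this, an SQS-triggered handler might look roughly like the sketch below. It is not the reporter's code: `handler`, `process_message`, and the assumption that message bodies are JSON are all hypothetical, and the APM agent's instrumentation of the function is deliberately left out. It relies only on the standard shape of an SQS event, where each entry in `Records` carries the message text in `body`.

```python
import json


def process_message(payload):
    # Hypothetical placeholder for application-specific work.
    return payload


def handler(event, context):
    # An SQS-triggered invocation delivers a batch of messages in "Records".
    records = event.get("Records", [])
    processed = 0
    for record in records:
        # Assumption: each message body is a JSON document.
        payload = json.loads(record["body"])
        process_message(payload)
        processed += 1
    return {"processed": processed}
```

Lambda invokes the function once per batch, so the hang described in this report would show up after the invocation that handles the last batch.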

Environment (please complete the following information)

  • OS: debian:bullseye-slim (python:3.9-slim container image) x86
  • Python version: 3.9
  • Framework and version [e.g. Django 2.1]:
  • APM Server version: 8.4.2
  • Agent version: 6.13.2

Additional context

CloudWatch logs (can dive deeper if needed) after execution is completed: [image]

About this issue

  • State: closed
  • Created 2 years ago
  • Comments: 18 (16 by maintainers)

Most upvoted comments

I love it when it’s easy! Just frustrated I missed the config issue originally. 😅 Thanks for opening a PR!

@brett-fitz Thanks for sharing the configuration and the debug logs.

I can see that you are setting ELASTIC_APM_SERVER_URL to the APM Server’s URL, which might be causing the current issue. APM agents are supposed to proxy their calls to APM Server through the Lambda extension, but in this case the agent seems to be connecting directly to APM Server. Based on the extension’s logs, the extension never receives any metadata or the flush signal from the agent.

Can you remove/unset ELASTIC_APM_SERVER_URL, or set it to http://localhost:8200, and try the extension again? (Only ELASTIC_APM_LAMBDA_APM_SERVER should be set to point to the APM Server’s URL.)
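
As a rough illustration of the check suggested above, the following sketch inspects a set of environment variables for the misconfiguration in question. `check_apm_lambda_env` is a hypothetical helper, not part of the agent or the extension, and the rule it applies (ELASTIC_APM_SERVER_URL either unset or exactly http://localhost:8200, with ELASTIC_APM_LAMBDA_APM_SERVER set) is only a reading of the advice in this comment.

```python
import os
from urllib.parse import urlparse


def check_apm_lambda_env(env):
    """Return a list of likely configuration problems (hypothetical helper)."""
    problems = []

    # The agent should either have no server URL set, or point at the
    # extension listening locally on port 8200.
    server_url = env.get("ELASTIC_APM_SERVER_URL")
    if server_url:
        parsed = urlparse(server_url)
        if parsed.hostname != "localhost" or parsed.port != 8200:
            problems.append(
                "ELASTIC_APM_SERVER_URL points at %s; unset it or use "
                "http://localhost:8200" % server_url
            )

    # The extension, not the agent, needs the real APM Server URL.
    if not env.get("ELASTIC_APM_LAMBDA_APM_SERVER"):
        problems.append("ELASTIC_APM_LAMBDA_APM_SERVER is not set")

    return problems


if __name__ == "__main__":
    for problem in check_apm_lambda_env(os.environ):
        print("config warning:", problem)
```

Passing a plain dict instead of `os.environ` makes it easy to try the check against a function's configured variables before deploying.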

Yup, I can do that today. Due to this issue I had to gut APM from all of our serverless services running on lambda. It may take me some time to get this back up, stand by for logs.