airflow: ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: cannot allocate memory in static TLS block

I am getting this error in a lot of DAGs, seemingly at random. The full traceback is:

Traceback (most recent call last):
[2021-08-11 09:29:19,497]  INFO -   File "/home/***/.local/lib/python3.8/site-packages/MySQLdb/__init__.py", line 18, in <module>
[2021-08-11 09:29:19,498]  INFO -     from . import _mysql
[2021-08-11 09:29:19,499]  INFO - ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: cannot allocate memory in static TLS block
[2021-08-11 09:29:19,500]  INFO - 
[2021-08-11 09:29:19,501]  INFO - During handling of the above exception, another exception occurred:
[2021-08-11 09:29:19,502]  INFO - 
[2021-08-11 09:29:19,503]  INFO - Traceback (most recent call last):
[2021-08-11 09:29:19,504]  INFO -   File "/opt/***/dags/import_apple_app_store/py_s3_apple.py", line 7, in <module>
[2021-08-11 09:29:19,505]  INFO -     from ***.models import Variable
[2021-08-11 09:29:19,506]  INFO -   File "/home/***/.local/lib/python3.8/site-packages/***/__init__.py", line 46, in <module>
[2021-08-11 09:29:19,506]  INFO -     settings.initialize()
[2021-08-11 09:29:19,507]  INFO -   File "/home/***/.local/lib/python3.8/site-packages/***/settings.py", line 445, in initialize
[2021-08-11 09:29:19,508]  INFO -     configure_adapters()
[2021-08-11 09:29:19,508]  INFO -   File "/home/***/.local/lib/python3.8/site-packages/***/settings.py", line 325, in configure_adapters
[2021-08-11 09:29:19,509]  INFO -     import MySQLdb.converters
[2021-08-11 09:29:19,510]  INFO -   File "/home/***/.local/lib/python3.8/site-packages/MySQLdb/__init__.py", line 24, in <module>
[2021-08-11 09:29:19,510]  INFO -     version_info, _mysql.version_info, _mysql.__file__
[2021-08-11 09:29:19,511]  INFO - NameError: name '_mysql' is not defined
[2021-08-11 09:29:19,647] {python_file_operator.py:118} INFO - Command exited with return code 1

Characteristics:

  • Official Docker image apache/airflow:2.1.2.
  • On top of the official image I add the following (see the Dockerfile sketch after this list): RUN sudo apt-get update && sudo apt-get install -y build-essential gconf-service libasound2 libatk1.0-0 libcairo2 libcups2 libfontconfig1 libgdk-pixbuf2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libxss1 fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget libappindicator3-1 libgbm1
  • I also have a requirements file; the only MySQL-related library in it is the provider one.
  • Prod environment: Ubuntu 18.04 on an AWS EC2 instance with 128 GB of RAM.
  • Local environment: macOS with 10 GB allocated to Docker. I have this parameter in the docker-compose file: AIRFLOW__OPERATORS__DEFAULT_RAM: 2048
  • All errors occur in Python scripts executed by BashOperators.
  • All errors happen during the imports; not a single line of code after the imports is executed.
  • None of the DAGs with the error uses any MySQL library.
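
For context, a minimal sketch of how such a customization is usually layered on top of the official image; the root/airflow user switch and the requirements.txt path are assumptions, while the tag and package list are taken from the points above:

FROM apache/airflow:2.1.2
USER root
# system packages needed by the DAGs (list copied from the point above)
RUN apt-get update \
  && apt-get install -y --no-install-recommends \
        build-essential gconf-service libasound2 libatk1.0-0 libcairo2 libcups2 \
        libfontconfig1 libgdk-pixbuf2.0-0 libgtk-3-0 libnspr4 libpango-1.0-0 libxss1 \
        fonts-liberation libappindicator1 libnss3 lsb-release xdg-utils wget \
        libappindicator3-1 libgbm1 \
  && rm -rf /var/lib/apt/lists/*
USER airflow
# Python requirements mentioned above (path is an assumption)
COPY requirements.txt /requirements.txt
RUN pip install --no-cache-dir -r /requirements.txt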

First, in prod, with Python 3.6, I had the error on several DAGs (let’s say DAGs A and B failed, while C and D did not). A and B had one thing in common: they imported a custom SnowflakeHook before importing Variable. So the code was:

from snowflake_hook import SnowflakeHook  # this file does NOT use Variable anywhere
from airflow.models import Variable

And changing the order solved the problem:

from airflow.models import Variable
from snowflake_hook import SnowflakeHook

Now, locally, I was testing a Python version upgrade using the official images. Starting from Python 3.9, DAG A, which was already fixed in prod, suddenly stopped working.

Then I switched to Python 3.8 and DAG A started working again. However, DAG C, which had never had a problem, started failing with the same error. In this case there was no import of SnowflakeHook, just imports of common Python libraries. Anyway, I changed the code from:

import logging
import pandas as pd
import requests

from airflow.models import Variable

to

from airflow.models import Variable

import logging
import pandas as pd
import requests

And it worked.

Furthermore, DAG D, which also never had a problem, now fails too. I tried changing the order of the imports, and in this case it did not help. At the time of the failure, only 5 DAGs were active and just 1 of them (a light one) was running, so it does not seem to be a RAM problem.

So I am very lost about how to deal with this error. The docker-compose is very similar to the one in the official docs.

Most upvoted comments

I had a similar issue with the apache/airflow:2.1.4-python3.8 Docker image:

WARNING:root:OSError while attempting to symlink the latest log directory
Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/__init__.py", line 18, in <module>
    from . import _mysql
ImportError: /usr/lib/x86_64-linux-gnu/libstdc++.so.6: cannot allocate memory in static TLS block

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/airflow/.local/bin/airflow", line 5, in <module>
    from airflow.__main__ import main
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/__init__.py", line 46, in <module>
    settings.initialize()
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/settings.py", line 445, in initialize
    configure_adapters()
  File "/home/airflow/.local/lib/python3.8/site-packages/airflow/settings.py", line 325, in configure_adapters
    import MySQLdb.converters
  File "/home/airflow/.local/lib/python3.8/site-packages/MySQLdb/__init__.py", line 24, in <module>
    version_info, _mysql.version_info, _mysql.__file__
NameError: name '_mysql' is not defined
Running command: airflow users create

The LD_PRELOAD fix didn’t work for me.

In my case it occurred when using remote logging with Google Cloud Storage together with the Google Cloud Secret Manager secrets backend. With these two services off it did not occur. Unfortunately I haven’t had a chance to test with only one or the other off.
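
For reference, a rough sketch of the kind of configuration this refers to (bucket name and connection id are placeholders, not taken from the comment):

AIRFLOW__LOGGING__REMOTE_LOGGING=True
AIRFLOW__LOGGING__REMOTE_BASE_LOG_FOLDER=gs://my-log-bucket/airflow-logs
AIRFLOW__LOGGING__REMOTE_LOG_CONN_ID=google_cloud_default
AIRFLOW__SECRETS__BACKEND=airflow.providers.google.cloud.secrets.secret_manager.CloudSecretManagerBackend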

What fixed it was removing all the Python packages installed by the Airflow Docker image, including the Airflow providers, and then re-installing Airflow with just a subset of the providers and the required Python packages.

# remove every Python package preinstalled in the image, including all provider packages
RUN pip freeze | xargs pip uninstall -y
# make sure pip itself is up to date before re-installing
RUN python -m pip install --upgrade pip
RUN pip cache purge
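
The re-install step is only described in prose above; a minimal sketch of what it might look like, assuming Airflow 2.1.4 on Python 3.8 with only the Google provider (the extra, the constraint file and the requirements path are assumptions, adjust to your own subset):

RUN pip install --no-cache-dir "apache-airflow[google]==2.1.4" \
      --constraint "https://raw.githubusercontent.com/apache/airflow/constraints-2.1.4/constraints-3.8.txt"
# any remaining packages your DAGs need
RUN pip install --no-cache-dir -r requirements.txt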

I’m happy to give more details or investigate a little bit more if it is helpful.

Same here, it happens when using remote logging with Google Cloud Storage. (we are using Workload Identity as credentials since we are on GKE)

  1. Without any dependency on MySQL, Airflow still depends on MySQL.

It does not. Airflow only uses MySQL when MySQL is configured via the SQLAlchemy URL, see https://docs.sqlalchemy.org/en/14/dialects/mysql.html#module-sqlalchemy.dialects.mysql.mysqlconnector (this actually brings me to another option to try, @JavierLopezT): using a different MySQL driver might also help, and the driver can be chosen via a different connection string.
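
As an illustration, pointing the metadata database at the mysql-connector-python driver would look roughly like this (host, credentials and database name are placeholders; the mysql-connector-python package has to be installed for this dialect to work):

AIRFLOW__CORE__SQL_ALCHEMY_CONN=mysql+mysqlconnector://airflow_user:airflow_pass@mysql-host:3306/airflow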

  2. If I import airflow before pandas, the bug disappears. I’m not sure if this is a MySQL issue or an Airflow issue.

I think it’s neither. It looks like some third-party library (maybe even pandas) misuses the static TLS feature and causes the subsequent memory initialization via libstdc++ to crash.

There are multiple examples scattered around the internet (completely unrelated to Airflow) that show the same kind of problem, and they also show that swapping the import order might fix it:

For example here: https://www.codegrepper.com/code-examples/shell/ImportError%3A+%2Fusr%2Flib%2Faarch64-linux-gnu%2Flibgomp.so.1%3A+cannot+allocate+memory+in+static+TLS+block

I had a similar issue where I was unable to import cv2, even though I could import opencv in a console. I was importing tensorflow before cv2. For some reason switching the order (importing cv2 first) worked. I am not sure how it worked, but it might help you.

One common point: all the problems I saw involve Python packages with deeply integrated, compiled C++ parts - opencv, tensorflow, scikit-learn and so on. That would indeed imply that the problem is pandas. Maybe simply upgrading pandas manually to the latest available version will work (another thing to try, @JavierLopezT).
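
If you want to try that, a minimal sketch (the base tag is taken from above; deliberately moving past the pinned constraint is the whole point of the experiment):

FROM apache/airflow:2.1.2
# upgrade pandas past the version pinned in the Airflow constraints file
RUN pip install --no-cache-dir --upgrade pandas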

Side note: for 2.2 we just (2 days ago) got rid of pandas (https://github.com/apache/airflow/pull/17575) and numpy (https://github.com/apache/airflow/pull/17594) as "core" Airflow dependencies. I don't think it changes much in this case: they simply will not be installed by default when just "core" Airflow is installed without any extras. But at least they will not always be installed, and if you install pandas separately (because your DAGs need it), you will no longer even be "expected" to use the version specified in the constraints.

Looking at the workaround, I don’t think there is an 8.0.19 version of the MySQL client for buster (Debian 10). The closest one I could find was 8.0.20: http://repo.mysql.com/apt/debian/pool/mysql-8.0/m/mysql-community/

However, the issue looks slightly different from the one on Ubuntu.

readelf --dynamic libmysqlclient.so.21.1.25 | grep BIND

Produces no output, and that was one of the indications that there is a problem. It might be that another library is involved (and the mysql lib version has nothing to do with it); this seems to be a problem other libraries can have too, and I believe that, since we are at .25 now, Oracle should have fixed the problem already.
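
For anyone who wants to run the same check against the installed library, a rough sketch (the path is an assumption for the Debian-based image; a STATIC_TLS or BIND_NOW flag in the dynamic section is what the grep is looking for):

readelf --dynamic /usr/lib/x86_64-linux-gnu/libmysqlclient.so.21 | grep -E 'FLAGS|BIND'
# an affected build would typically show a (FLAGS) entry containing STATIC_TLS and/or BIND_NOW;
# no output, as reported above, means those flags are not set on this build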

I think, @JavierLopezT, if you could try two workarounds, that would be great - then we could see if one of them works.

1. Upgrade to latest version of mysqlclient

Try the LATEST version of the mysqlclient. Oracle released the .26 version in July, and even though I have not seen any related changes in the changelog, it might also upgrade some related libraries. The nice thing is that if this works, then 2.1.3 will automatically include that latest client and the problem will be fixed out of the box.

You just need to build a new docker image with this Dockerfile:

FROM apache/airflow:2.1.2
USER root
RUN apt-get update \
  && apt-get install -y --no-install-recommends \
        libmysqlclient21=8.0.26-1debian10  \
  && apt-get autoremove -yqq --purge \
  && apt-get clean \
  && rm -rf /var/lib/apt/lists/*
USER airflow

Run docker build . -t apache/airflow:2.1.2-upgraded-libmysqlclient and swap your deployment to use the new tag 2.1.2-upgraded-libmysqlclient.
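
If your docker-compose file follows the official one, which reads the image name from an environment variable (an assumption about your setup), swapping the image can be as simple as setting this in the .env file next to it:

AIRFLOW_IMAGE_NAME=apache/airflow:2.1.2-upgraded-libmysqlclient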

This is what gets installed when I try it!
(Reading database ... 9950 files and directories currently installed.)
Preparing to unpack .../0-mysql-client_8.0.26-1debian10_amd64.deb ...
Unpacking mysql-client (8.0.26-1debian10) over (8.0.25-1debian10) ...
Preparing to unpack .../1-libmysqlclient21_8.0.26-1debian10_amd64.deb ...
Unpacking libmysqlclient21:amd64 (8.0.26-1debian10) over (8.0.25-1debian10) ...
Preparing to unpack .../2-mysql-community-client_8.0.26-1debian10_amd64.deb ...
Unpacking mysql-community-client (8.0.26-1debian10) over (8.0.25-1debian10) ...
Preparing to unpack .../3-mysql-community-client-core_8.0.26-1debian10_amd64.deb ...
Unpacking mysql-community-client-core (8.0.26-1debian10) over (8.0.25-1debian10) ...
Preparing to unpack .../4-mysql-community-client-plugins_8.0.26-1debian10_amd64.deb ...
Unpacking mysql-community-client-plugins (8.0.26-1debian10) over (8.0.25-1debian10) ...
Preparing to unpack .../5-mysql-common_8.0.26-1debian10_amd64.deb ...
Unpacking mysql-common (8.0.26-1debian10) over (8.0.25-1debian10) ...
Setting up mysql-common (8.0.26-1debian10) ...
Setting up mysql-community-client-plugins (8.0.26-1debian10) ...
Setting up libmysqlclient21:amd64 (8.0.26-1debian10) ...
Setting up mysql-community-client-core (8.0.26-1debian10) ...
Setting up mysql-community-client (8.0.26-1debian10) ...
Setting up mysql-client (8.0.26-1debian10) ...
Processing triggers for libc-bin (2.28-10) ...

2. Set LD_PRELOAD env variable

One of the workarounds in the issue pointed to by @uranusjr was to set the LD_PRELOAD flag to preload the libstdc++ library. It’s a bit tricky to set it just for Airflow, but in order to test it, it’s enough to set it “globally” for the whole container. This will slow down the loading of executables a bit, but at least we will be able to see if it helps.

LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libstdc++.so.6

If that works, we can think of a “proper” solution - but it would be nice to see whether it helps.
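
One way to set it “globally” for the whole container, as a sketch (a small Dockerfile layered on the stock image or on the one from workaround 1; the base tag is an assumption):

FROM apache/airflow:2.1.2
# preload libstdc++ for every process started in the container
ENV LD_PRELOAD=/usr/lib/x86_64-linux-gnu/libstdc++.so.6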