airflow: Airflow python exception at login

Official Helm Chart version

1.3.0 (latest released)

Apache Airflow version

2.1.0

Kubernetes Version

1.20.9

Helm Chart configuration


images:
  airflow:
    repository: "beertechcontainerregistrydev.azurecr.io/beertech-airflow-dev"
    tag: "dev-fb73985f1f2dbe7b7a2668c5a3fd0f541a5cf4cf"
  useDefaultImageForMigration: true

nodeSelector:
  agentpool: "airflow"

ingress:
  enabled: true
  web:
    annotations: 
      kubernetes.io/ingress.class: "nginx"
    hosts: ["something"]
    tls:
      enabled: true
      secretName: ingress-tls-secret
  # flower:
  #   annotations: 
  #     kubernetes.io/ingress.class: "nginx"  
  #   path: "/flower"
  #   hosts: ["something"]
  #   tls:
  #     enabled: true
  #     secretName: ingress-tls-secret
webserver:
  defaultUser:
    enabled: true
    role: Admin
    username: devops
    email: 
    firstName: devops
    lastName: admin
    password: nottelling
dags:
  gitSync:
    enabled: true
    repo: "ssh://git@github.com/ab-inbev-beertech/beertech-airflow.git"
    branch: "main"
    subPath: "dags/"
    sshKeySecret: airflow-ssh-secret
extraSecrets:
  airflow-ssh-secret:
    data: |
      gitSshKey: "nottellin"

Docker Image customisations

FROM apache/airflow:latest-python3.7

USER root
RUN sudo apt-get update \
    && sudo apt-get install -y g++ \
    && sudo apt-get install -y unixodbc-dev \
    && sudo apt-get install -y python3.7-dev \
    && python3.7 -m pip install --upgrade pip
USER airflow
COPY ./.netrc .
RUN mv .netrc $HOME
RUN pip install --no-cache-dir --user lola_utils --index-url https://abinbev.jfrog.io/artifactory/api/pypi/beerte-virtual-pypi/simple

What happened

I get this error at login:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
    cursor, statement, parameters, context
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
psycopg2.errors.UndefinedColumn: column dag.concurrency does not exist
LINE 1: ..., dag.schedule_interval AS dag_schedule_interval, dag.concur...
                                                             ^


The above exception was the direct cause of the following exception:

Traceback (most recent call last):
  File "/home/airflow/.local/lib/python3.7/site-packages/flask/app.py", line 2447, in wsgi_app
    response = self.full_dispatch_request()
  File "/home/airflow/.local/lib/python3.7/site-packages/flask/app.py", line 1952, in full_dispatch_request
    rv = self.handle_user_exception(e)
  File "/home/airflow/.local/lib/python3.7/site-packages/flask/app.py", line 1821, in handle_user_exception
    reraise(exc_type, exc_value, tb)
  File "/home/airflow/.local/lib/python3.7/site-packages/flask/_compat.py", line 39, in reraise
    raise value
  File "/home/airflow/.local/lib/python3.7/site-packages/flask/app.py", line 1950, in full_dispatch_request
    rv = self.dispatch_request()
  File "/home/airflow/.local/lib/python3.7/site-packages/flask/app.py", line 1936, in dispatch_request
    return self.view_functions[rule.endpoint](**req.view_args)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/www/auth.py", line 34, in decorated
    return func(*args, **kwargs)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/www/views.py", line 547, in index
    filter_dag_ids = current_app.appbuilder.sm.get_accessible_dag_ids(g.user)
  File "/home/airflow/.local/lib/python3.7/site-packages/airflow/www/security.py", line 298, in get_accessible_dag_ids
    return {dag.dag_id for dag in accessible_dags}
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3535, in __iter__
    return self._execute_and_instances(context)
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/orm/query.py", line 3560, in _execute_and_instances
    result = conn.execute(querycontext.statement, self._params)
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1011, in execute
    return meth(self, multiparams, params)
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/sql/elements.py", line 298, in _execute_on_connection
    return connection._execute_clauseelement(self, multiparams, params)
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1130, in _execute_clauseelement
    distilled_params,
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1317, in _execute_context
    e, statement, parameters, cursor, context
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1511, in _handle_dbapi_exception
    sqlalchemy_exception, with_traceback=exc_info[2], from_=e
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/util/compat.py", line 182, in raise_
    raise exception
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/base.py", line 1277, in _execute_context
    cursor, statement, parameters, context
  File "/home/airflow/.local/lib/python3.7/site-packages/sqlalchemy/engine/default.py", line 593, in do_execute
    cursor.execute(statement, parameters)
sqlalchemy.exc.ProgrammingError: (psycopg2.errors.UndefinedColumn) column dag.concurrency does not exist
LINE 1: ..., dag.schedule_interval AS dag_schedule_interval, dag.concur...
                                                             ^

[SQL: SELECT dag.dag_id AS dag_dag_id, dag.root_dag_id AS dag_root_dag_id, dag.is_paused AS dag_is_paused, dag.is_subdag AS dag_is_subdag, dag.is_active AS dag_is_active, dag.last_parsed_time AS dag_last_parsed_time, dag.last_pickled AS dag_last_pickled, dag.last_expired AS dag_last_expired, dag.scheduler_lock AS dag_scheduler_lock, dag.pickle_id AS dag_pickle_id, dag.fileloc AS dag_fileloc, dag.owners AS dag_owners, dag.description AS dag_description, dag.default_view AS dag_default_view, dag.schedule_interval AS dag_schedule_interval, dag.concurrency AS dag_concurrency, dag.has_task_concurrency_limits AS dag_has_task_concurrency_limits, dag.next_dagrun AS dag_next_dagrun, dag.next_dagrun_create_after AS dag_next_dagrun_create_after 
FROM dag]
(Background on this error at: http://sqlalche.me/e/13/f405)

What you expected to happen

I expected to see the home page.

How to reproduce

Here are the steps I followed:

The purpose of this document is to outline the steps for deploying Apache Airflow, including the infrastructure and the configuration settings needed to sync with your DAG workflows.

The Airflow deployment includes the following:

  • Azure Resource Group
  • Kubernetes
  • SSH Deploy Key added to the DAGs repository
  • SSH key added to Key Vault for GitHub authentication to the DAGs repository
  • Airflow Helm Deployment
  • Nginx Ingress Controller Deployment
  • DNS A record to the Ingress Load Balancer IP
  • HTTPS enabled on the ingress

All the modules used in the products repository can be found in the terraform-modules GitHub repository. Be sure to use the most up-to-date modules.

  1. Add SSH Deployment Key to the DAGs Repository. Create your SSH key with ssh-keygen -t rsa -b 4096 -C "your_email@example.com", then add the public key to the private repo that holds the DAGs (under Settings > Deploy keys); see the Example DAGs Repo. For example:
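
     A minimal sketch of generating the deploy key (the key file name airflow_dags_deploy_key is just an example):

     # generate a dedicated key pair for the DAGs repo
     ssh-keygen -t rsa -b 4096 -C "your_email@example.com" -f ./airflow_dags_deploy_key -N ""
     # paste the public half under Settings > Deploy keys
     cat ./airflow_dags_deploy_key.pub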

  2. Create a Key Vault Secret for the SSH Key. In the Azure Portal, go to the Key Vault in which you will store your Base64-encoded SSH private key. a. To Base64 encode your private key, run base64 ./<path to private key>. Add your secret name and value and select Create. For example:
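
     A minimal sketch of storing the encoded key with the Azure CLI (the vault and secret names are placeholders):

     # Base64-encode the private key and store it as a Key Vault secret
     # (with GNU coreutils, base64 -w 0 avoids line wrapping)
     az keyvault secret set \
       --vault-name <your-key-vault> \
       --name airflow-dags-ssh-key \
       --value "$(base64 ./airflow_dags_deploy_key)"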

  3. Create a TLS Kubernetes Secret with the TLS Cert. Run the command: kubectl create secret tls ingress-tls-secret -n airflow --key ./beertech-generated-private-key.key --cert cert.crt. Note: the ingress for Airflow will use this secret; ingress-tls-secret needs to be added as the input for the secret_name variable.

  4. Nginx Controller Deployment. Create a nginx.tf file in the <product dir>/dev/azure/kubernetes folder and add the necessary variables to the variables.tf file. a. NOTE: make sure defaults are set for all variables. Commit changes, then create a pull request and merge after approvals. For example:
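
     A rough nginx.tf sketch, assuming the Terraform Helm provider is used to install the community ingress-nginx chart; the internal terraform-modules repository likely wraps something similar:

     # illustrative only - names and values are placeholders
     resource "helm_release" "ingress_nginx" {
       name             = "ingress-nginx"
       repository       = "https://kubernetes.github.io/ingress-nginx"
       chart            = "ingress-nginx"
       namespace        = "ingress-nginx"
       create_namespace = true

       set {
         name  = "controller.service.type"
         value = "LoadBalancer"
       }
     }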

  5. DNS A record to the Ingress Load Balancer IP. Add an A record to the existing zone in products, products/devops/prod/dns-zones/beertech-com.

Commit changes.

Create a pull request and merge after approvals. Example:

Note: The IP of the A record should match the LoadBalancer IP of the ingress. For example:
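
A rough sketch of that record with the azurerm Terraform provider (record name, zone, resource group, and IP are placeholders):

# illustrative only - the actual zone lives in products/devops/prod/dns-zones/beertech-com
resource "azurerm_dns_a_record" "airflow" {
  name                = "airflow"
  zone_name           = "beertech.com"
  resource_group_name = "<dns zone resource group>"
  ttl                 = 300
  records             = ["<ingress LoadBalancer IP>"]
}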

  6. Create the Airflow Helm AKS Deployment: helm upgrade --install airflow apache-airflow/airflow -f ./airflow-override-values.yaml -n airflow

Anything else

This happens every time at login.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Comments: 26 (21 by maintainers)

Most upvoted comments

All that said, I am a little surprised the wait-for-migrations init container didn’t prevent the components from starting.

It did not, because it was using the very same "default Airflow image" (which was a different version than the "regular image") - this is the whole point 😃 - both "migration" and "wait-for-migration" use that image.

      initContainers:
        - name: wait-for-airflow-migrations
          resources:
{{ toYaml .Values.scheduler.resources | indent 12 }}
          image: {{ template "airflow_image_for_migrations" . }}
          imagePullPolicy: {{ .Values.images.airflow.pullPolicy }}
          args:
          {{- include "wait-for-migrations-command" . | indent 10 }}
          envFrom:

The problem is that you did not change "defaultAirflowTag", but you rebuilt your image from the "latest" tag - which has been upgraded to 2.2.2.

If you want to keep the 2.1.0 version of Airflow you should explicitly base your image on it. In your Docker image you should use:

FROM apache/airflow:2.1.0-python3.7

The latest tag moves whenever we release a new version.

The problem is that your "custom" image is likely based on 2.2.2 (which is the current latest), but since you have useDefaultImageForMigration enabled, you are using defaultAirflowTag, which is probably 2.1.0, as that was the default version released together with the chart. This means that the migration did not actually migrate the database as needed by Airflow 2.2.2.

The solution is to make sure that both your custom image and your defaultAirflowTag point to the same version. Likely what will work for you is to replace the FROM line with 2.1.0 (verify that your defaultAirflowTag is also 2.1.0), rebuild and push your image, and make sure the new image is pulled when you deploy Airflow.
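
For illustration, a minimal sketch of keeping the two in sync, assuming you stay on 2.1.0 (the image tag below is a placeholder):

# Dockerfile: pin the base image instead of the moving "latest" tag
FROM apache/airflow:2.1.0-python3.7

# values override: defaultAirflowTag must match the version the custom image is built from
defaultAirflowTag: "2.1.0"
images:
  airflow:
    repository: "beertechcontainerregistrydev.azurecr.io/beertech-airflow-dev"
    tag: "<tag of the rebuilt 2.1.0-based image>"
  useDefaultImageForMigration: true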

Thanks for opening your first issue here! Be sure to follow the issue template!