airflow: Templating (like {{ ds }}) stopped working in PapermillOperator after upgrade from 2.3.x to 2.4.x

Apache Airflow version

2.4.1

What happened

I am using CeleryKubernetesExecutor with a Celery worker and running Papermill tasks. Suddenly, after upgrading from Airflow 2.3.4 to 2.4.1, the {{ ds }} template stopped being rendered on the worker for Papermill notebooks. I can see it rendered properly in the UI, but on the worker the parameter still contains the literal {{ ds }}, and the operator fails with ValueError: time data '{{ ds }}' does not match format '%Y-%m-%d'

    featuresNumber = PapermillOperator(
        task_id='features_number',
        input_nb=scriptsPath("features/features_number/features_number.ipynb"),
        output_nb=scriptsPath("features/features_number/features_number_rendered.ipynb"),
        parameters={
            'processingDate': "{{ ds }}",
            'dailyFeaturesInputPath': outputDataPath("daily/features_number"),
            'workdir': dgxPath("secrets/feature-matrix/")
        },
        queue="dgx"
    )

All Papermill tasks using {{ ds }} are now broken.

What you think should happen instead

{{ ds }} should be properly templated.

How to reproduce

Run an Airflow instance with a Celery worker, both on 2.4.1. Run a PapermillOperator task on the worker with {{ ds }} as a notebook parameter. In the output notebook, the parameter contains the literal {{ ds }} instead of the rendered value.
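A minimal DAG along these lines reproduces it (the dag_id, notebook paths, and queue name below are placeholders, not taken from the original deployment):

```python
from datetime import datetime

from airflow import DAG
from airflow.providers.papermill.operators.papermill import PapermillOperator

# Any parameterized notebook reproduces the issue; the paths here are hypothetical.
with DAG(
    dag_id="papermill_ds_repro",
    start_date=datetime(2022, 10, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    PapermillOperator(
        task_id="repro",
        input_nb="/opt/notebooks/repro.ipynb",
        output_nb="/opt/notebooks/repro_rendered.ipynb",
        # On 2.4.1 this stays as the literal "{{ ds }}" when run on a Celery worker.
        parameters={"processingDate": "{{ ds }}"},
        queue="celery",
    )
```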

Operating System

docker image nvidia/cuda:10.1-cudnn8-devel-ubuntu18.04 (Ubuntu 18.04)

Versions of Apache Airflow Providers

apache-airflow-providers-apache-beam 3.1.0 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-apache-cassandra 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-apache-hive 2.0.2 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-apache-spark 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-celery 2.1.0 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-cncf-kubernetes 2.0.2 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-ftp 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-google 5.1.0 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-http 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-imap 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-jdbc 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-mysql 2.1.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-papermill 3.0.0 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-postgres 2.2.0 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-sftp 2.1.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-sqlite 2.0.1 pyhd8ed1ab_0 conda-forge
apache-airflow-providers-ssh 2.1.1 pyhd8ed1ab_0 conda-forge

Deployment

Other Docker-based deployment

Deployment details

Google Kubernetes Engine for Airflow, regular Docker for celery worker

Anything else

It happened when I upgraded from Airflow 2.3.4 to 2.4.1; no other libraries were changed. I tried both apache-airflow-providers-papermill 3.0.0 and 2.2.3.

Are you willing to submit PR?

  • Yes I am willing to submit a PR!

About this issue

  • Original URL
  • State: closed
  • Created 2 years ago
  • Comments: 20 (8 by maintainers)

Most upvoted comments

You can override _render_nested_template_fields to provide rendering support for custom objects. See operators like KubernetesPodOperator (which uses it to render V1EnvVar objects) for examples.
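As a conceptual sketch of what that rendering step does (plain Python, not Airflow's actual implementation): the operator walks its template_fields recursively and renders string leaves against the task context, while values of unrecognized custom types fall through unrendered; that fall-through is the gap that overriding _render_nested_template_fields closes.

```python
from typing import Any


def render_nested(value: Any, context: dict) -> Any:
    """Recursively render '{{ key }}' placeholders in nested containers."""
    if isinstance(value, str):
        out = value
        for key, val in context.items():
            out = out.replace("{{ " + key + " }}", str(val))
        return out
    if isinstance(value, dict):
        return {k: render_nested(v, context) for k, v in value.items()}
    if isinstance(value, (list, tuple)):
        return type(value)(render_nested(v, context) for v in value)
    # Custom objects (e.g. V1EnvVar) are returned unchanged -- this is the
    # case that overriding _render_nested_template_fields handles in Airflow.
    return value


params = {"processingDate": "{{ ds }}", "workdir": "/tmp/feature-matrix/"}
print(render_nested(params, {"ds": "2022-10-01"}))
# {'processingDate': '2022-10-01', 'workdir': '/tmp/feature-matrix/'}
```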