azure-functions-durable-python: Known Regression: Activity without a $return binding returned a non-None value

Known regression notice: Activity without a $return binding returned a non-None value

Error description: We have just been notified that a subset of users are suddenly, and intermittently, experiencing errors in their Durable Functions Python apps with an exception that reads: “( Activity ) without a $return binding returned a non-None value”.

What this error means is that Activity Triggers, which are both a trigger and an output binding, are somehow not being allowed to return a value. In other words, the error complains that the Function has a return statement (the “returned a non-None value” part) even though Activities should be allowed to return data; here they are (incorrectly) being treated as if they were not.
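To make the failure mode concrete, here is a simplified model of the kind of check that produces this message. It is an illustrative sketch only, not the Python worker's actual code; the names `check_return_value` and `has_return_binding` are hypothetical.

```python
from typing import Any


def check_return_value(function_name: str, has_return_binding: bool, result: Any) -> Any:
    """Simplified model: reject a return value when no $return binding exists.

    Hypothetical helper for illustration; not taken from the Python worker.
    """
    if not has_return_binding and result is not None:
        raise RuntimeError(
            f"function {function_name!r} without a $return binding "
            "returned a non-None value"
        )
    return result
```

Under this model, an Activity should always behave as if `has_return_binding` were true. The error suggests that affected apps are instead being handled as if it were false.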

Reproducer: The error does not seem reproducible locally, but it can be reproduced, intermittently, on Azure. The simplest reproducer is to take a standard Function-chaining application, such as this one, and to modify the Activity Function to use an output binding.

For example, here’s a hello-world Activity that writes a hardcoded string to blob storage:

def main(name: str, blob) -> str:
    blob.set("a")
    return f"Hello {name}!"

and its function.json:

{
  "scriptFile": "__init__.py",
  "bindings": [
    {
      "name": "name",
      "type": "activityTrigger",
      "direction": "in"
    },
    {
      "type": "blob",
      "direction": "out",
      "name": "blob",
      "path": "test/blobtest",
      "connection": "AzureWebJobsStorage"
    }
  ]
}

After deploying and executing the orchestrator, the orchestrator may fail with the aforementioned exception.

In short - it appears this error triggers when an Activity is paired with an output binding.

Root cause theories: At this time, we are fairly confident this is not a regression in the Durable Functions SDK, nor in the Durable Functions Extension. Instead, it appears that affected applications may have gone through a Functions Host (the base Azure Functions component) upgrade that triggered this behavior.

This leads us to believe that the problem is either caused by some regression in the Functions Host or, most likely, by some regression in the Python worker (the component that allows Azure Functions to run Python code) which was bundled in the Functions Host.

The Python team is currently investigating this, with our help, to understand and patch this regression as soon as possible.

Update 8/31:

Our current understanding is that this error occurs only on Functions V4 apps that are using the latest Host version (4.9.1.1). This latest Host version includes a refactoring in the Python worker that may be to blame for this issue, but the specifics are still being investigated. That said, this is enough to provide a workaround. Please see the update for 8/31 in the “workarounds” section.

Workarounds:

Workarounds are being worked on. We need to better understand the root cause to provide them. I will update this thread as soon as possible.

Update 8/31:

If you are affected by this error, you should be able to circumvent it by reverting to a previous version of the Host. The general guidance for doing this on Linux (the only OS for Python support today) can be found here: https://docs.microsoft.com/en-us/azure/azure-functions/set-runtime-version?tabs=portal#manual-version-updates-on-linux

For this, you will need to use the Azure CLI (az CLI). The link above contains a link to download az to your local machine, but you should be able to run az CLI commands from the Azure Cloud Shell as well. Again, this is all in the link above, under the Azure CLI tab.

Just in case you’re unfamiliar with how to use the az CLI: To be able to manipulate your Azure Functions with it, you will need to first log in (you can run az login to do that) and then change the “active subscription” to be the subscription of your target app. You can read about how to change your az-CLI active subscription here: https://docs.microsoft.com/en-us/cli/azure/manage-azure-subscriptions-azure-cli#change-the-active-subscription

How to revert to a previous version of the Azure Functions Host:

You will need to modify your linuxFxVersion to pin your application to a previous Host version. We will be pinning to Host version 4.8.0, where we believe the error should be avoidable or at least much less frequent.

Here’s the command you will use:

az functionapp config set \
 -g <resource_group> \
 -n <function_app_name> \
 --subscription <subscription_id> \
 --linux-fx-version <docker_image_with_the_right_host_version>

So, for example, if your resource group is called “myResourceGroup”, your app is called “Foo”, and your subscription ID is “123”, then you’d run the following command (leaving the linux-fx-version parameter aside for now):

az functionapp config set \
 -g myResourceGroup \
 -n Foo \
 --subscription "123" \
 --linux-fx-version <docker_image_with_the_right_host_version>

The value of the linux-fx-version depends on whether your application is in a Consumption plan, or not.

If you’re using the Consumption plan, then you should use: "DOCKER|mcr.microsoft.com/azure-functions/mesh:4.8.0-python3.9". You may change the suffix “python3.9” to “python3.8” or “python3.7” according to your Python interpreter preference.

Please see my latest update on this thread (on 9/2): there seems to be a blocker preventing manual editing of linuxFxVersion in the Consumption plan. As a result, we are automatically rolling back the default Host version to 4.8.0 on Linux Consumption for Python.

If you are using the App Service plan and/or the Elastic Premium plan, then the docker image is slightly different. It is as follows: “DOCKER|mcr.microsoft.com/azure-functions/python:4.8.0-python3.9-appservice”

  • Again, you may change the suffix “python3.9” to “python3.8” or “python3.7” according to your Python interpreter preference.
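The two image names above differ only in repository, Python suffix, and the “-appservice” tail, so they can be generated rather than typed by hand. The following is a minimal sketch; the function name and the plan labels (“consumption”, “appservice”, “premium”) are hypothetical and chosen for illustration. Keep in mind the note above about manual edits being blocked on the Consumption plan.

```python
SUPPORTED_PYTHON_VERSIONS = ("3.7", "3.8", "3.9")


def pinned_linux_fx_version(plan: str, python_version: str = "3.9",
                            host_version: str = "4.8.0") -> str:
    """Build the linuxFxVersion value that pins an app to a Host version.

    The plan labels are assumptions for this sketch, not Azure identifiers.
    """
    if python_version not in SUPPORTED_PYTHON_VERSIONS:
        raise ValueError(f"unsupported Python version: {python_version}")
    if plan == "consumption":
        # Consumption plan uses the "mesh" image.
        return ("DOCKER|mcr.microsoft.com/azure-functions/mesh:"
                f"{host_version}-python{python_version}")
    if plan in ("appservice", "premium"):
        # App Service and Elastic Premium share the "-appservice" image.
        return ("DOCKER|mcr.microsoft.com/azure-functions/python:"
                f"{host_version}-python{python_version}-appservice")
    raise ValueError(f"unknown plan: {plan}")
```

For example, `pinned_linux_fx_version("premium", "3.8")` yields the Elastic Premium image name with the “python3.8” suffix.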

If you are running this command in PowerShell, be aware that the “pipe” (|) symbol in the docker image names will cause issues if you just specify the string with a single pair of quotes. To get around this, please wrap the name in two pairs of quotes: single quotes on the outside and double quotes on the inside ('"..."'). For instance, this is the full command for our example app, if it were on Linux Elastic Premium:

az functionapp config set -g myResourceGroup -n Foo --subscription "123" --linux-fx-version '"DOCKER|mcr.microsoft.com/azure-functions/python:4.8.0-python3.9-appservice"'
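If shell quoting is a recurring problem, one alternative is to invoke the same az command from Python with the arguments passed as a list, so that no shell gets a chance to interpret the “|” character. This is a sketch under assumptions: the helper names are hypothetical, az must already be installed and logged in, and it is intended for Linux, macOS, or Cloud Shell (on Windows, az is a .cmd wrapper and may need different handling).

```python
import subprocess
from typing import List


def build_pin_command(resource_group: str, app_name: str,
                      subscription_id: str, linux_fx_version: str) -> List[str]:
    """Return the az invocation from this thread as an argument list."""
    return [
        "az", "functionapp", "config", "set",
        "-g", resource_group,
        "-n", app_name,
        "--subscription", subscription_id,
        "--linux-fx-version", linux_fx_version,
    ]


def pin_host_version(resource_group: str, app_name: str,
                     subscription_id: str, linux_fx_version: str) -> None:
    """Run the command without a shell; raises CalledProcessError on failure."""
    cmd = build_pin_command(resource_group, app_name,
                            subscription_id, linux_fx_version)
    subprocess.run(cmd, check=True)
```

With the example app, `build_pin_command("myResourceGroup", "Foo", "123", "DOCKER|mcr.microsoft.com/azure-functions/python:4.8.0-python3.9-appservice")` produces the same arguments as the command above, with the image name kept as a single unquoted element.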

After invoking this command, please give your app enough time to apply the change - a minute or two should suffice. You will know this change got applied successfully if, on your Function app portal view, under the “Essentials” bar, the “Runtime version” field reads “4.8.0.0”. If you do not see this, consider restarting your app.

Finally, if this guidance does not work, or you find any typos, please report them in this thread and we’ll look to assist you. Thank you, and apologies for the inconvenience. Do note that the guidance here will need to be undone in the future to ensure your Functions Host continues getting regular updates. To revert this change, just set the linuxFxVersion to "python|3.9" (replacing 3.9 with your Python interpreter preference).

In the meantime, we’re working on a permanent fix. We will update this thread once we have it.

About this issue

  • State: closed
  • Created 2 years ago
  • Reactions: 3
  • Comments: 24

Most upvoted comments

Hi all. I worked today with the Python group and we identified the source of this regression. Using that, we should have a clearer path towards a prompt resolution. I’ll provide updates on this as soon as possible.

I need to talk with the Functions Team on Linux to discuss the mitigation plan. I should be able to update this thread with that tomorrow.

@FaCoffee1984 / @KhaoticMind : To revert your application back to “auto-updating”, it should suffice to revert linuxFxVersion to “python|3.9” (or 3.8, 3.7, depending on your preferred interpreter version). Note that the value of linuxFxVersion in the auto-upgrading mode does not specify the Host version number, just the language; I believe that’s why it is auto-updating 😃 .

I’ll triple check this guidance on reverting back the change, but I’m fairly confident of it at the moment.
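As a quick way to tell the two modes apart, the check below looks only at the part of linuxFxVersion before the “|”. It is a minimal sketch based on the two value shapes shown in this thread; the helper name is hypothetical, and other value shapes may exist that it does not account for.

```python
def is_pinned_to_host_version(linux_fx_version: str) -> bool:
    """Return True for 'DOCKER|<image>' values, False for 'python|3.9' style.

    Based only on the two shapes discussed in this thread.
    """
    runtime, _, _ = linux_fx_version.partition("|")
    return runtime.strip().upper() == "DOCKER"
```

A value such as “python|3.9” names only the language, so this returns False for it, which matches the auto-updating mode described above.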

Another update:

Regarding the long-term fix: The Python group has implemented a fix, which is currently being integration-tested. Once the tests pass, the fix should start deploying at the next available release date. Since the release date is not yet confirmed, I can’t discuss it publicly, but we’re pushing for it to occur as early as can be. I’ll provide an update on this asap.

Regarding mitigations: for Linux App Service and Premium plans, the recommendation to pin your Host version using the guidance above still applies.

For Linux Consumption users: we are automatically rolling back your Host version (for Python apps only) right now. In the next few hours, your Host version should automatically return to 4.8.0, provided that you have returned to the “default Host version” setting of your app; in other words, provided that you have set your linuxFxVersion back to python|3.9 (or 3.8, 3.7, depending on your interpreter of choice). We are doing this rollback because we have found an issue with setting the linuxFxVersion as an end-user, similar to what @FaCoffee1984 reported.

In summary:

  • A patch has been implemented, to be rolled out asap
  • For app service and premium plan users: to prevent this bug, please use guidance above to set your linuxFxVersion to Host 4.8.0
  • For consumption users: we will be reverting back the default Host version to 4.8.0 while we wait for the patch to roll out. So this error should auto-resolve soon.

I will look to provide more details on the regression root-cause at some point. For now, I’ll be focusing on following up to make sure these releases are rolled out. Thanks again for your patience.

@FaCoffee1984: I’ll respond to those questions as soon as possible, on my way to work.

In the meantime, if the workaround didn’t work and now some Functions are not even triggering, can you please confirm whether you were able to revert the change so that your Functions are at least invoking again?

Important to know if/how we can keep it “auto-updating”.

@FaCoffee1984, @KhaoticMind, @thec0dewriter - please see my update above ^