azure-functions-durable-python: DurableOrchestrationClient.start_new raises an exception when an orchestrator instance with the same instance ID already exists, instead of silently replacing it.
🐛 Describe the bug
In the Python azure-functions-durable 1.1.6 SDK, the DurableOrchestrationClient.start_new() method's documentation says:
Start a new instance of the specified orchestrator function. If an orchestration instance with the specified ID already exists, the existing instance will be silently replaced by this new instance.
Instead of silently replacing a Pending instance, it raises an exception:
```
Exception: Exception: {"Message": "Something went wrong while processing your request",
"ExceptionMessage": "An Orchestration instance with the status Pending already exists.",
"ExceptionType": "System.InvalidOperationException",
"StackTrace": "   at DurableTask.AzureStorage.AzureStorageOrchestrationService.CreateTaskOrchestrationAsync(TaskMessage creationMessage, OrchestrationStatus[] dedupeStatuses) in /_/src/DurableTask.AzureStorage/AzureStorageOrchestrationService.cs:line 1570
   at DurableTask.Core.TaskHubClient.InternalCreateOrchestrationInstanceWithRaisedEventAsync(String orchestrationName, String orchestrationVersion, String orchestrationInstanceId, Object orchestrationInput, IDictionary`2 orchestrationTags, OrchestrationStatus[] dedupeStatuses, String eventName, Object eventData, Nullable`1 startAt) in /_/src/DurableTask.Core/TaskHubClient.cs:line 614
   at Microsoft.Azure.WebJobs.Extensions.DurableTask.DurableClient.Microsoft.Azure.WebJobs.Extensions.DurableTask.IDurableOrchestrationClient.StartNewAsync[T](String orchestratorFunctionName, String instanceId, T input) in D:\a\_work\1\s\src\WebJobs.Extensions.DurableTask\ContextImplementations\DurableClient.cs:line 210
   at Microsoft.Azure.WebJobs.Extensions.DurableTask.HttpApiHandler.HandleStartOrchestratorRequestAsync(HttpRequestMessage request, String functionName, String instanceId) in D:\a\_work\1\s\src\WebJobs.Extensions.DurableTask\HttpApiHandler.cs:line 875
   at Microsoft.Azure.WebJobs.Extensions.DurableTask.HttpApiHandler.HandleRequestAsync(HttpRequestMessage request) in D:\a\_work\1\s\src\WebJobs.Extensions.DurableTask\HttpApiHandler.cs:line 306"}
```
🤔 Expected behavior
What should have happened?
Based on the documentation, I would expect it to replace the existing instance with a new one without raising an exception.
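As a stopgap, here is a minimal workaround sketch under the assumption that the stale Pending instance can be cleared: catch the error from start_new, terminate and purge the old instance, and retry once. The function name, instance ID, and HTTP-trigger binding here are placeholders rather than values from this issue, and terminating/purging an instance that is still Pending is best-effort rather than a confirmed fix.

```python
import logging

import azure.functions as func
import azure.durable_functions as df


async def main(req: func.HttpRequest, starter: str) -> func.HttpResponse:
    client = df.DurableOrchestrationClient(starter)
    instance_id = "my-singleton"  # placeholder: a fixed, reused instance id

    try:
        # Per the start_new docstring, reusing an existing instance id should
        # silently replace the old instance.
        await client.start_new("orchestrate_parse", instance_id=instance_id)
    except Exception:
        # Observed behavior: the call fails when the existing instance is
        # Pending. Best-effort fallback: clear the stale instance and retry
        # once (purging may only succeed after termination has completed).
        await client.terminate(instance_id, "replacing stale Pending instance")
        await client.purge_instance_history(instance_id)
        await client.start_new("orchestrate_parse", instance_id=instance_id)

    logging.info(f"(Re)started orchestration with ID = '{instance_id}'.")
    return client.create_check_status_response(req, instance_id)
```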
☕ Steps to reproduce
An orchestration client triggers an orchestration function.
In the orchestration client it checks the status before starting:

```python
AZURE_INACTIVE_ORCHESTRATOR_STATUS_SET = {
    OrchestrationRuntimeStatus.Completed,
    OrchestrationRuntimeStatus.Failed,
    OrchestrationRuntimeStatus.Terminated,
    OrchestrationRuntimeStatus.Pending,
    None,
}

if instance.runtime_status in AZURE_INACTIVE_ORCHESTRATOR_STATUS_SET:
    instance_id = await client.start_new(
        INSTANCE_ID, instance_id=INSTANCE_ID, client_input={'time': time_list[i]}
    )
    logging.info(f"Started orchestration with ID = '{instance_id}'.")
```
So if the orchestration instance is Pending, it should create a new one. The status of the orchestration instance can be flipped in the corresponding storage account table.
It can be reproduced locally in VS Code as well. A fuller sketch of the reproducing client function follows.
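For completeness, a self-contained sketch of the reproducing client function, assuming a timer trigger; INSTANCE_ID, the orchestrator name ("orchestrate_parse"), and the input payload are placeholders standing in for the reporter's values.

```python
import logging

import azure.functions as func
import azure.durable_functions as df
from azure.durable_functions.models.OrchestrationRuntimeStatus import OrchestrationRuntimeStatus

# Placeholder: a fixed instance id reused on every scheduled run.
INSTANCE_ID = "orchestrate_parse_order_da"

AZURE_INACTIVE_ORCHESTRATOR_STATUS_SET = {
    OrchestrationRuntimeStatus.Completed,
    OrchestrationRuntimeStatus.Failed,
    OrchestrationRuntimeStatus.Terminated,
    OrchestrationRuntimeStatus.Pending,
    None,
}


async def main(mytimer: func.TimerRequest, starter: str) -> None:
    client = df.DurableOrchestrationClient(starter)

    # Mirrors the reporter's check: only start when the existing instance is
    # inactive, which deliberately includes Pending.
    instance = await client.get_status(INSTANCE_ID)
    if instance.runtime_status in AZURE_INACTIVE_ORCHESTRATOR_STATUS_SET:
        # Raises "An Orchestration instance with the status Pending already
        # exists." instead of silently replacing the Pending instance.
        instance_id = await client.start_new(
            "orchestrate_parse",
            instance_id=INSTANCE_ID,
            client_input={"time": "2023-02-18T23:10:00Z"},
        )
        logging.info(f"Started orchestration with ID = '{instance_id}'.")
```

The function.json for this sketch would bind a timerTrigger plus a durableClient input named starter.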
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Comments: 44 (1 by maintainers)
The discrepancy you're seeing might be because there are multiple processes involved which track time differently. For Python, there is the Functions host process, which is where the triggers run, and then there's the Python process, which is where your code executes.
The process for invoking a function involves the following steps:
For your specific example, steps 1-3 are taking 17 minutes. The 230ms is probably the time that your function code actually runs. The remaining time is most likely spent waiting for a Python thread to become available inside the Python process to execute your code. Depending on your version of Python and your app configuration, you might be running on a single-threaded Python worker, in which case any concurrent operations will be blocked waiting for the single Python thread to finish executing other function invocations.
This is actually a really common problem that's documented here. There are several workarounds that you could consider:
Try adjusting some of these configuration values to see if that improves the reported end-to-end latencies of your functions.
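For instance, a hedged sketch of a local.settings.json (the same keys can be set as application settings in Azure); FUNCTIONS_WORKER_PROCESS_COUNT and PYTHON_THREADPOOL_THREAD_COUNT are the settings usually involved here, and the values below are illustrative starting points, not recommendations:

```json
{
  "IsEncrypted": false,
  "Values": {
    "FUNCTIONS_WORKER_RUNTIME": "python",
    "FUNCTIONS_WORKER_PROCESS_COUNT": "4",
    "PYTHON_THREADPOOL_THREAD_COUNT": "8"
  }
}
```

Raising these trades memory and CPU for more concurrent invocations per instance, so they are worth changing one at a time while watching end-to-end latency.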
Normally the "locks" on these messages are never released; instead, the message is deleted when processing is complete. Even an entity failing with an error counts as a success, so the message will be deleted. Only in the event of a major failure, like a process restart, will the message become visible again (it takes 5 minutes before the default visibility timeout expires).
If the number of queue messages increases, it means there are more events to be processed by your orchestrations and/or entities.
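To see which orchestrations those backed-up messages belong to, one option is to enumerate instances by runtime status. Below is a hedged sketch (the HTTP trigger, binding names, and the seven-day look-back window are assumptions) using DurableOrchestrationClient.get_status_by:

```python
import json
from datetime import datetime, timedelta

import azure.functions as func
import azure.durable_functions as df
from azure.durable_functions.models.OrchestrationRuntimeStatus import OrchestrationRuntimeStatus


async def main(req: func.HttpRequest, starter: str) -> func.HttpResponse:
    client = df.DurableOrchestrationClient(starter)

    # Look back over the last week (arbitrary window) for instances that
    # have not left the Pending state.
    pending = await client.get_status_by(
        created_time_from=datetime.utcnow() - timedelta(days=7),
        runtime_status=[OrchestrationRuntimeStatus.Pending],
    )

    rows = [
        {
            "instance_id": status.instance_id,
            "name": status.name,
            "created_time": status.created_time.isoformat() if status.created_time else None,
        }
        for status in pending
    ]
    return func.HttpResponse(json.dumps(rows), mimetype="application/json")
```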
Hi @davidmrdavid,
Managed to capture the issue this week. The first name is the instance ID, the second is the name of the actual function:
| Function app | Instance ID | Function | Timestamp | Duration | Status |
|---|---|---|---|---|---|
| prod-epex-scraper-func | orchestrate_parse_order_da | orchestrate_parse | 2023-02-18T23:10:00Z | 20ms | Pending |
| prod-epex-scraper-func | orchestrate_parse_order_history | orchestrate_parse | 2023-02-18T23:07:00Z | 13ms | Pending |
| prod-epex-scraper-func | orchestrate_parse_result_da | orchestrate_parse | 2023-02-18T23:07:00Z | 14ms | Pending |
| prod-epex-scraper-func | orchestrate_parse_result_history | orchestrate_parse | 2023-02-18T23:05:00Z | 16ms | Pending |
| prod-epex-scraper-func | orchestrate_scrape_archive_order | orchestrate_scrape | 2023-02-18T23:10:00Z | 13ms | Pending |
| prod-epex-scraper-func | orchestrate_scrape_archive_result | orchestrate_scrape | 2023-02-18T23:05:00Z | 20ms | Pending |
| prod-epex-scraper-func | orchestrate_scrape_current_order | orchestrate_scrape | 2023-02-18T23:05:00Z | 31ms | Pending |
| prod-epex-scraper-func | orchestrate_scrape_current_result | orchestrate_scrape | 2023-02-18T23:17:00Z | 14ms | Pending |
| test-epex-scraper-func | orchestrate_parse_order_da | orchestrate_parse | 2023-02-16T12:40:00Z | 26ms | Pending |
| test-epex-scraper-func | orchestrate_parse_order_history | orchestrate_parse | 2023-02-16T12:37:00Z | 42ms | Pending |
| test-epex-scraper-func | orchestrate_parse_result_da | orchestrate_parse | 2023-02-16T12:37:00Z | 26ms | Pending |
| test-epex-scraper-func | orchestrate_parse_result_history | orchestrate_parse | 2023-02-16T12:35:00Z | 130ms | Pending |
| test-epex-scraper-func | orchestrate_scrape_archive_order | orchestrate_scrape | 2023-02-16T12:40:00Z | 48ms | Pending |
| test-epex-scraper-func | orchestrate_scrape_archive_result | orchestrate_scrape | 2023-02-16T12:35:00Z | 53ms | Pending |
| test-epex-scraper-func | orchestrate_scrape_current_order | orchestrate_scrape | 2023-02-16T12:35:00Z | 22ms | Pending |
| test-epex-scraper-func | orchestrate_scrape_current_result | orchestrate_scrape | 2023-02-16T12:32:00Z | 138ms | Pending |
Hi @davidmrdavid,
Will provide the instance ID and the timestamp when it happens next.