azure-functions-host: Problems with scoped lifetime services (AddScoped)

Hi,

We are using durable functions with dependency injection. In our Startup.cs file, we register our dependencies as follows:

services.AddScoped<IMyService>(serviceProvider => new MyService(...))

In our activity function, we are using normal constructor injection to get IMyService instances.

The problem is that even though we register the service with AddScoped, during an activity function run each class that asks for the service gets a different instance of IMyService. This breaks our app logic, because IMyService users won’t see each other’s changes.
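For reference, here is a minimal sketch of the behaviour we expect from AddScoped, using the plain Microsoft.Extensions.DependencyInjection container outside of the Functions host. ConsumerA, ConsumerB and ScopedDemo are hypothetical names for illustration only, and MyService is reduced to an empty class:

```csharp
using System;
using Microsoft.Extensions.DependencyInjection;

public interface IMyService { }
public class MyService : IMyService { }

// Hypothetical consumers: both receive IMyService through constructor injection.
public class ConsumerA
{
    public IMyService Service { get; }
    public ConsumerA(IMyService service) => Service = service;
}

public class ConsumerB
{
    public IMyService Service { get; }
    public ConsumerB(IMyService service) => Service = service;
}

public static class ScopedDemo
{
    // Returns true when both consumers resolved in one scope share one IMyService.
    public static bool SameInstanceWithinScope()
    {
        var services = new ServiceCollection();
        services.AddScoped<IMyService>(serviceProvider => new MyService());
        services.AddTransient<ConsumerA>();
        services.AddTransient<ConsumerB>();

        using (var provider = services.BuildServiceProvider())
        using (var scope = provider.CreateScope())
        {
            var a = scope.ServiceProvider.GetRequiredService<ConsumerA>();
            var b = scope.ServiceProvider.GetRequiredService<ConsumerB>();
            return ReferenceEquals(a.Service, b.Service);
        }
    }
}
```

With the standard container both consumers share one instance per scope. In our function app they do not, which is the behaviour reported here.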

As a workaround to the earlier Azure Functions runtime DI issues, we had the following pinning in place: FUNCTIONS_EXTENSION_VERSION = 2.0.12673.0. Azure no longer supports the pinned version, so our function DI is now broken again.

About this issue

  • State: closed
  • Created 5 years ago
  • Reactions: 6
  • Comments: 109 (25 by maintainers)

Most upvoted comments

@jeffhollan @brettsam any update on this issue?

I ran more experiments and it seems that the DI problem occurs only when I have used services.AddHttpClient() to register HttpClients. If I comment out the services.AddHttpClient() calls, AddScoped works as expected.

Hi @fabiocav,

Do you know if there is an update on this issue yet?

Sorry to keep asking, but it’s a pretty big blocking issue for my company’s migration, which we keep pushing further and further back without being able to say when we can resume it.

Thanks

@rodhemphill sticky tape and plasters I’m afraid. Basically failing requests and relying on the runtime to execute the request again with fingers crossed for subsequent attempts. We detect the problem in constructors of classes we know should be scoped, and thus when two instances are created in a logical execution scope (determined by AsyncLocal) we throw an exception and log it. Makes a mess of my logs - see https://github.com/Azure/azure-functions-host/issues/4914#issuecomment-601251057.
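A minimal sketch of that kind of detection, assuming a hypothetical ScopeGuard helper (not part of any library): BeginInvocation is called once at the start of each function execution, and Register is called from the constructor of every class that should be scoped.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;

// Hypothetical helper: records which scoped types have already been
// constructed within the current logical execution flow.
public static class ScopeGuard
{
    // The set reference flows with the async execution context, so every
    // constructor running in the same invocation sees the same set.
    private static readonly AsyncLocal<HashSet<Type>> Seen = new AsyncLocal<HashSet<Type>>();

    // Call once at the start of each function invocation.
    public static void BeginInvocation() => Seen.Value = new HashSet<Type>();

    // Call from the constructor of each class registered as scoped.
    public static void Register(object instance)
    {
        var seen = Seen.Value;
        if (seen == null)
        {
            return; // not inside a tracked invocation
        }

        lock (seen)
        {
            if (!seen.Add(instance.GetType()))
            {
                throw new InvalidOperationException(
                    "Second instance of scoped type " + instance.GetType().Name +
                    " created in the same logical execution scope.");
            }
        }
    }
}

// Hypothetical scoped class that guards itself.
public class MyScopedThing
{
    public MyScopedThing() => ScopeGuard.Register(this);
}
```

The exception fails the current execution, so this only helps when the trigger retries the request.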

It caused quite an embarrassing false start to our UAT phase for sure (deadlines and all). The customer was quite alarmed at the number of failing transactions, and that was while I was chasing it here to get it fixed. I had to refactor in the end, as per the advice from @fabiocav. Even changing the order of parameters in constructors reduced the pain! It’s still failing on some non-critical code paths, but we can live with that.

Agreed, anything non-trivial causes it to fail. We built a polyglot solution, but one component was a full DDD stack based on Microsoft’s own patterns advice due to the domain complexity. Our simpler Function Apps (just plain HTTP and read/write storage) work without issue.

We’ve dropped functions from our list of candidate dotnet technologies for future products & projects in favour of plain ASP.NET Core until this is fixed. Obviously we lose the serverless benefits.

Still, I managed to get money back from Azure for the development headaches it caused, as it is a GA product. If you’re suffering the same I would recommend you to also raise support requests. Of course, it didn’t come close to covering the amount of expense, inconvenience, and damage to our reputation. I had a very professional support experience with Microsoft and their Functions support team were amazing. But they couldn’t fix the underlying issue - that’s clearly an engineering problem.

Perhaps they might treat this more seriously if having to refund more customers via support requests … 😉

Providing an update here at the end of the week. @brettsam and I have made some progress on the different scenarios reported, testing and investigation, hopefully we’ll have something ready before the sprint is over, but will continue to provide updates here for awareness.

@fabiocav any update on releasing “EnableEnhancedScopes” so we don’t need the feature flag anymore?

The validation phase has been completed and the release is starting to rollout globally. We don’t usually publish the Core Tools until after the deployment, but we’re making some changes to the process and will see if we can get this up ASAP so you can test locally as well.

Again, thanks for the patience with this!

@fabiocav: Not to blame anyone, but how can a basic feature like Scoped not working be labeled as a P2? This seems like rather basic functionality. Can you point us to workarounds for this issue?

The code change was small, but changes like this can be pretty impactful, which is why we decided to hide it behind a feature flag and get that change out ASAP. We wanted to let everyone try it and let us know if they run into issues while we continue testing scenarios. If all goes smoothly, we will make this the default everywhere in a future release.

As for the timeline – it’s currently in validation and the deployment is starting this week. We don’t like giving exact timetables (hotfixes, unrelated incidents, etc can delay a deployment)… but it generally takes 6-ish business days to roll out worldwide once it has started.

I haven’t been able to investigate yet – but I’ve moved it to the next sprint so it comes back on @fabiocav’s radar.

@fabiocav please can you advise as to any updates for this issue? We’ve some gaps opened up in our roadmap in August in which we’re looking forward to paying down some technical debt, and revisiting this issue is on our tracker alongside migrating to dotnet 3+.

Is this recognised as a potential https://github.com/dadhi/DryIoc issue? Is there a correlating ticket with the DryIoc project?

Thanks

We have experienced the same problem, but I also noticed this: It’s only the first time after the function started/triggered, that the scoped services are ‘wrong’. The second time, it seems to be fine. Has anyone else noticed this? (I was able to reproduce it with the code posted here by @heikkilamarko.)

This is a pretty big problem for us as well, any progress on this issue?

Some additional information:

  • Azure Functions Core Tools: 2.7.1846
  • Function Runtime Version: 2.0.12858.0

Not sure why this fixes it, any explanation would be great.

@ebwinters your function would return when it hits the first await. I suppose the functions host has no way of telling there is still work pending, because you did not return a Task for it to wait on.

The async void case is a “fire and forget”: You start the task chain, but you don’t care about when it’s finished. When the function returns, all you know is that everything up to the first await has executed. Everything after the first await will run at some unspecified point in the future that you have no access to. https://devblogs.microsoft.com/oldnewthing/20170720-00/?p=96655
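A small sketch of the difference, outside of the Functions host. AsyncVoidDemo and its members are hypothetical names; the point is only that the caller of an async void method has nothing to wait on, while the caller of an async Task method does:

```csharp
using System;
using System.Threading.Tasks;

public static class AsyncVoidDemo
{
    public static bool VoidCompleted;
    public static bool TaskCompleted;

    // Fire and forget: control returns to the caller at the first await,
    // and the caller has no Task to observe the remaining work.
    public static async void FireAndForget()
    {
        await Task.Delay(200);
        VoidCompleted = true;
    }

    // Awaitable: the returned Task completes only after the whole method
    // body has run, so a host can wait for it.
    public static async Task AwaitableAsync()
    {
        await Task.Delay(200);
        TaskCompleted = true;
    }
}
```

Calling FireAndForget() returns while VoidCompleted is still false. Awaiting AwaitableAsync() returns only after TaskCompleted has been set.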

@brettsam Glad to see the fix was very simple. Would you be able to provide a timeline on deployment?

I’ve been facing the same problem and I’ve found a workaround. While testing some scenarios, I found that the problem only happens in classes that receive the HttpClient through dependency injection. For example:

```csharp
public class SomeClass
{
    private readonly MyHttpClient myHttpClient;
    private readonly MyScopedDependency myScopedDependency;

    public SomeClass(MyHttpClient myHttpClient, MyScopedDependency myScopedDependency)
    {
        this.myHttpClient = myHttpClient;
        this.myScopedDependency = myScopedDependency;
    }

    public void SomeMethod()
    {
        myHttpClient.DoSomething();
    }
}
```

In this scenario, the “MyScopedDependency” would not be resolved as Scoped, but if I change my code to something like this:

```csharp
// Wrapper whose only job is to keep the typed HttpClient out of SomeClass's constructor.
public class ClassToHoldHttpClient
{
    public MyHttpClient MyHttpClient { get; }

    public ClassToHoldHttpClient(MyHttpClient myHttpClient) => MyHttpClient = myHttpClient;
}

public class SomeClass
{
    private readonly ClassToHoldHttpClient classToHoldHttpClient;
    private readonly MyScopedDependency myScopedDependency;

    public SomeClass(ClassToHoldHttpClient classToHoldHttpClient, MyScopedDependency myScopedDependency)
    {
        this.classToHoldHttpClient = classToHoldHttpClient;
        this.myScopedDependency = myScopedDependency;
    }

    public void SomeMethod()
    {
        classToHoldHttpClient.MyHttpClient.DoSomething();
    }
}
```

With this code, “MyScopedDependency” is now being resolved as Scoped again.

Is it a great solution? Of course not, but as this problem has been unsolved for about a year, maybe this workaround will save some people.

It’s impossible to implement Domain Events with Azure Functions without this functionality working properly… Can’t believe it does not work. Still…

I’m asking to change the priority, as this is a blocker.

Thanks

@darrenhull the workarounds will require some code refactoring at the moment. This is being assigned to a sprint for investigation, so we hope to have an update on this issue soon.

@fabiocav Hope you are having a nice day 😃 We are also hopeful that this issue will be fixed soon. We have spent a lot of time writing code to wire up our services and we really love the way scoped services work in ASP.NET Core. Let me know if there is anything I can do to help get this moving!

I believe this is the same as my issue #4914 - clearly something wrong with scoped registrations.

@fabiocav can we get an update on these tickets? Are they being investigated? Thx

The problem is not specific to durable functions. The same happens with normal TimerTrigger functions.

@Assassinbeast Your issue seems to be in the isolated model. Could you please open a new issue here: https://github.com/Azure/azure-functions-dotnet-worker/issues

@SoloHam – can you open an issue in the https://github.com/Azure/azure-functions-dotnet-worker/issues repo? Please include some example code to show us how things are set up in your scenario.

This issue is related to the in-proc DI setup, which is different from that in the isolated worker.

@Gevil The workaround we found when using HttpClients is to add an extra abstraction layer and not inject an HttpClient (or HttpClientFactory) directly. This has now been running for about a year in production without any issues.

Example:

using System.Net.Http;

public class FunctionWorkaround
{
    private readonly IHttpClientFactory httpClientFactory;

    // Constructor name must match the class name (FunctionWorkaround).
    public FunctionWorkaround(IHttpClientFactory httpClientFactory)
    {
        this.httpClientFactory = httpClientFactory;
    }

    public HttpClient CreateClient() => httpClientFactory.CreateClient(...);
}

@t-l-k I’m sending the changes to make this the default in the current sprint. Next release will have the new behavior.

This is now working great for me, my failed request rates and exception rates have basically dropped to zero, and applications are working better than ever. This brings Azure Functions back to the table for projects in flight!!!

p.s. I skipped local validation using the func.exe tools and just went for the docker image mcr.microsoft.com/azure-functions/dotnet:3.0.14785, which worked great as well.

In C# you should only use async void when you have to (event handlers, which functions are not in that sense), so that is likely the cause: https://stackoverflow.com/questions/13636648/wait-for-a-void-async-method

@espray I guess that’s some kind of development, when you can just drop things out of a scope forever and ever. As I can see 2 or 3 things in each of their sprints, that are not closed and just hang there without being tackled.

I’ve been waiting almost for a year for this thing to be closed. But I guess something like Scoped is not a priority, so I’ve just created my own Scoped extension out of Singleton and Transient for one of my customers.

Now, as an architect in my organization, I’m advising clients not to use Azure Functions, as they are still not ready for production use.

Hi @rodhemphill I’m not sure that I get you.

As I said, the Scoped lifetime works sometimes, but sometimes I get different instances per request. I can reproduce it on a “cold start”. After that everything works fine. After some idle time the function does a “cold start” again and I get this issue again.

Thanks

Assigning this to the next sprint. @Arash-Sabet, if you can open an issue detailing your scenario, we might be able to assist with a workaround until this is resolved.

This issue is a huge blocker and addressing it has to be expedited. If there’s a P1 label indicating a higher priority than P2, that one should be applied instead of P2. We cannot proceed with our project gracefully without a fix! @fabiocav

We too have had to stop using @azfunc as the scoping issue makes them unusable for our solution. This is very frustrating for a team that have embraced Azure functions from day one. I’ve been very disappointed with how long this issue has taken to be resolved @fabiocav , oh wait it’s still active! We too are seeking money back on our subscriptions.

I have had all of my teams stop all .NET Core AzFunc development; this Scoped DI issue was the final straw for me. If basic .NET Core DI does not work, who knows what other problems may show up at the wrong time. The risk is too great.

Maybe send a message/tweet to Damian Edwards @DamianEdwards and/or Scott Hanselman @shanselman about your AzFunc development experience.

@daniel-white Any pointer on where it should be fixed in 3.1? We faced a similar issue in 3.1 (using the functions DI setup). Thanks!

I find it hard to believe this is by design. If it is so, documentation should state that very clearly.