runtime: The token supplied to the function is invalid at System.Net.NTAuthentication.GetOutgoingBlob

Description

When using HttpClient on multiple threads in parallel, Windows Authentication (NTLM) fails some requests with the following error: “The token supplied to the function is invalid”. Stack trace:

   at System.Net.NTAuthentication.GetOutgoingBlob(Byte[] incomingBlob, Boolean throwOnError, SecurityStatusPal& statusCode)
   at System.Net.NTAuthentication.GetOutgoingBlob(String incomingBlob)
   at System.Net.Http.AuthenticationHelper.<SendWithNtAuthAsync>d__53.MoveNext()
   at System.Net.Http.HttpConnectionPool.<SendWithVersionDetectionAndRetryAsync>d__83.MoveNext()
   at System.Threading.Tasks.ValueTask`1.get_Result()
   at System.Runtime.CompilerServices.ConfiguredValueTaskAwaitable`1.ConfiguredValueTaskAwaiter.GetResult()
   at System.Net.Http.AuthenticationHelper.<SendWithAuthAsync>d__17.MoveNext()
   at System.Net.Http.DiagnosticsHandler.<SendAsyncCore>d__8.MoveNext()
   at System.Threading.Tasks.ValueTask`1.get_Result()
   at System.Runtime.CompilerServices.ConfiguredValueTaskAwaitable`1.ConfiguredValueTaskAwaiter.GetResult()
   at System.Net.Http.RedirectHandler.<SendAsync>d__4.MoveNext()
   at System.Net.Http.HttpClient.<<SendAsync>g__Core|83_0>d.MoveNext()
   at PCT20018NtlmTest.Program.<>c__DisplayClass1_0.<<DoSomething>b__0>d.MoveNext() in C:\Users\hajek\Scratch\PCT20018NtlmTest\PCT20018NtlmTest\Program.cs:line 45

This happens when sending requests both over HTTP and HTTPS.

Reproduction Steps

  1. I have a machine running in Azure with Windows Server 2019 Datacenter (build 10.0.17763). The server has IIS installed and Windows Authentication enabled on the default site.
  2. I run the following code (it depends on the AsyncEnumerator package):
using Dasync.Collections;
using System;
using System.Collections.Generic;
using System.Net;
using System.Net.Http;
using System.Threading.Tasks;

namespace PCT20018NtlmTest
{
	internal class Program
	{
		static void Main(string[] args)
		{
			DoSomething().Wait();
		}

		static async Task DoSomething()
		{
			var baseUrl = "http://13.94.243.22/";
			var messageHandler = new HttpClientHandler
			{
				Credentials = new NetworkCredential("AzureUser", "mysuperstrongpwd.1", ".")
			};
			var httpClient = new HttpClient(messageHandler)
			{
				BaseAddress = new Uri(baseUrl),
			};

			var collection = new List<string>();
			for (var a = 0; a < 1000; a++)
			{
				collection.Add(a.ToString());
			}

			await collection.ParallelForEachAsync(async x =>
			{
				try
				{
					var result = await httpClient.GetAsync("/");
					result.EnsureSuccessStatusCode();
				}
				catch (Exception e)
				{
					Console.WriteLine(e.Message);
				}
			}, maxDegreeOfParallelism: 20);
		}
	}
}
  3. Some of the requests succeed, but others throw the exception “The token supplied to the function is invalid”.

Expected behavior

I expect all requests to complete successfully.

Actual behavior

Some requests fail (in my case 8 out of 1,000).

Regression?

No response

Known Workarounds

No response

Configuration

Running on Windows 22H2 (25276.1000) with .NET 6.0 6.0.100-preview.7.21379.14 [C:\Program Files\dotnet\sdk] x64

Other information

Also, the code contains credentials on purpose, so you can test it from your machine.


About this issue

  • State: closed
  • Created a year ago
  • Reactions: 6
  • Comments: 31 (28 by maintainers)

Most upvoted comments

@filipnavara - since I can see you are Prague based as well, we could arrange to share the affected computer with you if it would provide any help.

I don’t want to set your expectations too high. I am just a community triager and contributor that happened to dig into the NTLM/Negotiate code a lot over the past year.

That said, drop me a note at filip.navara (at) gmail.com, and let’s see if I can come up with more ideas on what to try and what diagnostic information would be worth collecting.

The difference in behavior compared to 7.0 is also curious.

The difference between .NET 6 and .NET 7 is easy to explain. It was a deliberate change not to propagate Win32Exception from the authentication code. There are two reasons for that. First, the exception now doesn’t happen in the first place, since the status code is propagated as an enum from NegotiateAuthentication and not as an exception. Second, HttpClient.Send[Async] was never documented to throw Win32Exception. It now either ends up returning the last 401 Unauthorized response with the “invalid token”, or in other cases throws HttpRequestException. There’s still optional tracing to get the underlying status codes.
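For callers who need to behave the same on both runtimes, the outcomes described above could be handled roughly as follows. This is only a sketch, not the runtime’s own code: ProbeAsync and StubHandler are hypothetical names introduced here, and the stub merely stands in for a server whose handshake ended with a 401. It also assumes, based on the comment above, that .NET 6 lets the Win32Exception escape unwrapped.

```csharp
using System;
using System.ComponentModel;
using System.Net;
using System.Net.Http;
using System.Threading;
using System.Threading.Tasks;

// Hypothetical stand-in for a server that answers the last handshake step with 401.
sealed class StubHandler : HttpMessageHandler
{
    protected override Task<HttpResponseMessage> SendAsync(
        HttpRequestMessage request, CancellationToken cancellationToken)
        => Task.FromResult(new HttpResponseMessage(HttpStatusCode.Unauthorized));
}

static class AuthProbe
{
    // Classifies one request so .NET 6 and .NET 7+ failures can be counted together.
    public static async Task<string> ProbeAsync(HttpClient client, string path)
    {
        try
        {
            using HttpResponseMessage response = await client.GetAsync(path);
            if (response.StatusCode == HttpStatusCode.Unauthorized)
                return "unauthorized";            // .NET 7+: last 401 is returned
            return response.IsSuccessStatusCode ? "ok" : "http-error";
        }
        catch (Win32Exception)
        {
            return "win32-exception";             // .NET 6 (assumed to escape unwrapped)
        }
        catch (HttpRequestException)
        {
            return "http-request-exception";      // .NET 7+: the "other cases"
        }
    }

    static async Task Main()
    {
        using var client = new HttpClient(new StubHandler())
        {
            BaseAddress = new Uri("http://localhost/")
        };
        Console.WriteLine(await ProbeAsync(client, "/")); // prints "unauthorized"
    }
}
```

In the repro, the body of the ParallelForEachAsync lambda could call ProbeAsync and tally the returned strings instead of printing e.Message.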

Long term, I’m wondering, @filipnavara, whether it would be useful to add a switch to force the managed implementation even on platforms with GSSAPI. Or make it the default for “NTLM” and leave GSSAPI to handle “Negotiate”. But I feel that would be separate from this particular issue.

I would definitely be inclined to add this switch. Ideally we would allow switching both NTLM and SPNEGO to the managed implementation. I started some work on refactoring the code to make it possible, but it was too late to land it for .NET 7 and I haven’t revisited it since then…

I have reinstalled my computer since and haven’t encountered the issue.

Is channel binding configured on the client or server?

It was observed when connecting to an HTTP endpoint, so no TLS channel binding was present AFAICT.

Once there is a version of 8.0 to test against, please let me know. Will try it as well.

.NET 8 Preview 1 branched out this week. However, there are no changes relevant to this issue AFAICT.

Triage: we were not able to reproduce it, but we should figure out what is happening in 8.0 and fix it.

To offer some context for the trace output above: the test case fails on the affected machine. The third column in the log file is the thread id ([22] iirc). Walking back from the failure, we can see it happens when processing the Challenge NTLM token. I tried to manually take the token from the trace and inject it into NegotiateAuthentication to artificially reproduce the failure. It accepts the token just fine though…

This leaves basically three possible causes of the issue:

  1. Buffer mishandling (unlikely since it works in isolation and the buffers are not shared)
  2. The NTAuthentication/NegotiateAuthentication context being unintentionally reset between the initial HTTP request with Authorization: header and the reply with the token that is processed (no trace of this happening in the log)
  3. Intermittent failure in the Windows SSPI itself

I ran through the code with symbols loaded for .NET and got as far as the method NegotiateStreamPal.InitializeSecurityContext, which seems to return statusCode.ErrorCode = InvalidToken. That probably comes from here: https://github.com/dotnet/runtime/blob/f429780c9ccce3546e9c9e25c05ed083318428bd/src/libraries/Common/src/System/Net/Security/NegotiateStreamPal.Windows.cs#L21 (but VS doesn’t let me step into this method).

That would translate to SEC_E_INVALID_TOKEN from https://learn.microsoft.com/en-us/windows/win32/api/sspi/nf-sspi-initializesecuritycontexta
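To pick this specific failure out of the general catch block in the repro, something like the following could be used. It is a sketch under assumptions: the helper name IsInvalidToken is made up here, and it assumes the .NET 6 exception is a Win32Exception whose NativeErrorCode carries the raw SSPI status. The constant 0x80090308 is the value of SEC_E_INVALID_TOKEN in the Windows SDK headers.

```csharp
#nullable enable
using System;
using System.ComponentModel;

static class SspiStatus
{
    // SEC_E_INVALID_TOKEN as defined in the Windows SDK (winerror.h).
    public const int SecEInvalidToken = unchecked((int)0x80090308);

    // Walks the exception chain looking for a Win32Exception with that status.
    public static bool IsInvalidToken(Exception? exception)
    {
        for (Exception? e = exception; e is not null; e = e.InnerException)
        {
            if (e is Win32Exception w && w.NativeErrorCode == SecEInvalidToken)
                return true;
        }
        return false;
    }

    static void Main()
    {
        // Simulated failures only; no SSPI call is made here.
        var wrapped = new InvalidOperationException(
            "wrapper", new Win32Exception(SecEInvalidToken));
        Console.WriteLine(IsInvalidToken(wrapped));                // True
        Console.WriteLine(IsInvalidToken(new Win32Exception(5)));  // False
    }
}
```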

There’s actually one interesting thing in the Fiddler trace. The very last successful request reused an existing session and skipped the authentication. I previously didn’t consider that connection pooling may have some effect. (For reference, request 11 is the Negotiate / Challenge NTLM packets; request 17 finishes with the Authenticate NTLM packet and first 200 OK response; request 18 reuses the same authenticated connection.)
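If pooling is suspected, one cheap experiment (not something tried in this thread) would be to cap the pool at a single connection, so that only one NTLM handshake can be in flight at a time, and rerun the repro. The sketch below uses SocketsHttpHandler and its MaxConnectionsPerServer property; the CreateHandler helper and the placeholder credentials are hypothetical. If the failures disappear with a cap of 1, that would point towards concurrent handshakes rather than a single bad token, but it proves nothing on its own.

```csharp
using System;
using System.Net;
using System.Net.Http;

static class PoolingExperiment
{
    // Builds a handler whose connection pool is capped at maxConnections per server.
    public static SocketsHttpHandler CreateHandler(ICredentials credentials, int maxConnections)
        => new SocketsHttpHandler
        {
            Credentials = credentials,
            MaxConnectionsPerServer = maxConnections,
        };

    static void Main()
    {
        // Placeholder credentials; substitute the ones from the repro.
        var handler = CreateHandler(new NetworkCredential("user", "password", "."), maxConnections: 1);
        using var client = new HttpClient(handler)
        {
            BaseAddress = new Uri("http://localhost/")
        };
        Console.WriteLine(handler.MaxConnectionsPerServer); // prints 1
    }
}
```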

Unfortunately, the dump seems a bit difficult to analyze due to the async exceptions / stack traces. Not sure I will find anything useful in it.

I’ve run the code on my Windows 11 22H2 Build 22621.1105 and got the same exception (.NET 6.0).

That’s the same build I have FWIW. I tried to change the parameters (number of iterations, degree of parallelism) and it didn’t reproduce on my machine so far.

Additionally, I ran it on an Azure VM with Windows Server 2022 and it didn’t hit the exception either. 🤷‍♂️

The Fiddler trace doesn’t seem to be very useful in this particular case. It’s session-based authentication, so it’s normal to see 401 errors during the first steps. There are 6 sessions that go through the authentication process; 5 of them reach the full exchange (i.e. all three NTLM messages back and forth). One of them does not, and the last request is not present in the Fiddler trace. That likely just means that .NET 7 reports the “invalid token” error as “401 Unauthorized” (the last reply from the server). That’s an expected change of behavior that was intentionally made to match the documented exceptions. Internally it likely fails with the same API error on the Win32 API call.
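When counting how far each session got, it can help to decode the tokens from the Authorization / WWW-Authenticate headers. A raw NTLM message starts with the 8-byte signature "NTLMSSP\0" followed by a little-endian 32-bit message type (1 = Negotiate, 2 = Challenge, 3 = Authenticate). The helper below is a rough sketch with made-up names; it assumes the header carries raw NTLM rather than a SPNEGO-wrapped token, and it throws FormatException on input that is not valid base64.

```csharp
using System;
using System.Buffers.Binary;
using System.Text;

static class NtlmToken
{
    static readonly byte[] Signature = Encoding.ASCII.GetBytes("NTLMSSP\0");

    // Returns a label for the base64 token taken from an HTTP auth header.
    public static string Describe(string base64Token)
    {
        byte[] bytes = Convert.FromBase64String(base64Token);
        if (bytes.Length < 12 || !bytes.AsSpan(0, 8).SequenceEqual(Signature.AsSpan()))
            return "not a raw NTLM message";

        uint type = BinaryPrimitives.ReadUInt32LittleEndian(bytes.AsSpan(8, 4));
        return type switch
        {
            1 => "Negotiate (type 1)",
            2 => "Challenge (type 2)",
            3 => "Authenticate (type 3)",
            _ => "unknown NTLM message type " + type,
        };
    }

    // Builds a header-only sample message for demonstration; real tokens are longer.
    public static string Sample(uint type)
    {
        byte[] message = new byte[12];
        Signature.CopyTo(message, 0);
        BinaryPrimitives.WriteUInt32LittleEndian(message.AsSpan(8, 4), type);
        return Convert.ToBase64String(message);
    }

    static void Main()
    {
        for (uint type = 1; type <= 3; type++)
            Console.WriteLine(Describe(Sample(type)));
    }
}
```

A session that shows types 1 and 2 but never a type 3 would match the incomplete exchange described above.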

I’ll look at the dump next.

I could not reproduce it on my Win11 VM with 6.0.8. I agree with @filipnavara that using the current release(s) is worth a shot. And thanks for the repro & site. It is very useful.