azure-functions-host: Host crashes caused by log contention
We are experiencing unexpected host restarts when running a function. When this happens, the diagnostic logs show the normal trace messages from the execution of the function but occasionally, before the invocation completes, the logs show that the host restarted, with no indication of what has caused it to crash.
As I said, the streaming logs do not provide any information, but in the event log messages in Kudu (eventlog.xml) we have observed that every time that this happens, it seems to have been caused by contention while writing to the host logs. Here’s an excerpt of the log showing the two events that we always see before each crash.
<Event>
<System>
<Provider Name="ASP.NET 4.0.30319.0"/>
<EventID>1325</EventID>
<Level>1</Level>
<Task>0</Task>
<Keywords>Keywords</Keywords>
<TimeCreated SystemTime="2018-10-31T01:37:53Z"/>
<EventRecordID>777728515</EventRecordID>
<Channel>Application</Channel>
<Computer>…</Computer>
<Security/>
</System>
<EventData>
<Data>
An unhandled exception occurred and the process was terminated.
Application ID: /LM/W3SVC/1134994283/ROOT
Process ID: 1780
Exception: System.IO.IOException
Message: The process cannot access the file 'D:\home\LogFiles\Application\Functions\Host\2018-10-31T01-34-39Z-7f29d1a4c5.log' because it is being used by another process.
StackTrace: at System.IO.__Error.WinIOError(Int32 errorCode, String maybeFullPath)
at System.IO.FileStream.Init(String path, FileMode mode, FileAccess access, Int32 rights, Boolean useRights, FileShare share, Int32 bufferSize, FileOptions options, SECURITY_ATTRIBUTES secAttrs, String msgPath, Boolean bFromProxy, Boolean useLongPath, Boolean checkHost)
at System.IO.FileStream..ctor(String path, FileMode mode, FileAccess access, FileShare share)
at System.IO.FileInfo.Open(FileMode mode, FileAccess access, FileShare share)
at Microsoft.Azure.WebJobs.Script.FileTraceWriter.AppendToFile(FileInfo fileInfo, String content) in C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script\Diagnostics\FileTraceWriter.cs:line 279
at Microsoft.Azure.WebJobs.Script.FileTraceWriter.Flush(Boolean timerCallback) in C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script\Diagnostics\FileTraceWriter.cs:line 151
at Microsoft.Azure.WebJobs.Script.FileTraceWriter.FlushCallback(Object state) in C:\projects\azure-webjobs-sdk-script\src\WebJobs.Script\Diagnostics\FileTraceWriter.cs:line 91
at System.Threading.TimerQueueTimer.CallCallbackInContext(Object state)
at System.Threading.ExecutionContext.RunInternal(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.ExecutionContext.Run(ExecutionContext executionContext, ContextCallback callback, Object state, Boolean preserveSyncCtx)
at System.Threading.TimerQueueTimer.CallCallback()
at System.Threading.TimerQueueTimer.Fire()
at System.Threading.TimerQueue.FireQueuedTimerCompletion(Object state)
at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
at System.Threading.ThreadPoolWorkQueue.Dispatch()
at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()
</Data>
</EventData>
</Event>
<Event>
<System>
<Provider Name=".NET Runtime"/>
<EventID>1026</EventID>
<Level>1</Level>
<Task>0</Task>
<Keywords>Keywords</Keywords>
<TimeCreated SystemTime="2018-10-31T01:37:54Z"/>
<EventRecordID>777729218</EventRecordID>
<Channel>Application</Channel>
<Computer>…</Computer>
<Security/>
</System>
<EventData>
<Data>
Application: w3wp.exe
Framework Version: v4.0.30319
Description: The process was terminated due to an unhandled exception.
Exception Info: System.IO.IOException
at System.IO.__Error.WinIOError(Int32, System.String)
at System.IO.FileStream.Init(System.String, System.IO.FileMode, System.IO.FileAccess, Int32, Boolean, System.IO.FileShare, Int32, System.IO.FileOptions, SECURITY_ATTRIBUTES, System.String, Boolean, Boolean, Boolean)
at System.IO.FileStream..ctor(System.String, System.IO.FileMode, System.IO.FileAccess, System.IO.FileShare)
at System.IO.FileInfo.Open(System.IO.FileMode, System.IO.FileAccess, System.IO.FileShare)
at Microsoft.Azure.WebJobs.Script.FileTraceWriter.AppendToFile(System.IO.FileInfo, System.String)
at Microsoft.Azure.WebJobs.Script.FileTraceWriter.Flush(Boolean)
at Microsoft.Azure.WebJobs.Script.FileTraceWriter.FlushCallback(System.Object)
at System.Threading.TimerQueueTimer.CallCallbackInContext(System.Object)
at System.Threading.ExecutionContext.RunInternal(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.ExecutionContext.Run(System.Threading.ExecutionContext, System.Threading.ContextCallback, System.Object, Boolean)
at System.Threading.TimerQueueTimer.CallCallback()
at System.Threading.TimerQueueTimer.Fire()
at System.Threading.TimerQueue.FireQueuedTimerCompletion(System.Object)
at System.Threading.QueueUserWorkItemCallback.System.Threading.IThreadPoolWorkItem.ExecuteWorkItem()
at System.Threading.ThreadPoolWorkQueue.Dispatch()
at System.Threading._ThreadPoolWaitCallback.PerformWaitCallback()
</Data>
</EventData>
</Event>
- Timestamp: 2018-11-06T12:53:45.889
- Function App version (1.0 or 2.0): 1.0
- Invocation ID: d2c4757a-7444-498c-b07c-f766fabef199
- Region: East US
Any suggestions or workarounds?
Thanks
About this issue
- Original URL
- State: closed
- Created 6 years ago
- Reactions: 3
- Comments: 20 (6 by maintainers)
I have the same issue today The process cannot access the file ‘D:\home\LogFiles\Application\Functions\Function\ItemsTrigger\2019-05-05T16-16-01Z-8d04d9d4f1.log’ because it is being used by another process.
function app: cosmosdemo-function functionName: ItemsTrigger
This incident have been open for 6 months now, is somebody going to take a look at it? 😃
Same issue here. I have an HTTP triggered function that logs an exception " log.LogError(“An error has occurred.”, ex);". I’m getting the error below when logging the exception.
There’s actually an issue I’ve opened a few days ago to track this work for V1 (https://github.com/Azure/azure-functions-host/issues/4573), let’s use that to follow the progress on getting this backported. Shouldn’t take long.
I’ve just been doing some re-creation of this issue (I’d had a notification that this issue was resolved in the latest runtime release so was trying to test it before I realised it hadn’t been). I had been experiencing the issue with repeated large errors being logged (to be picked up by app insights), in recreating the issue I have found that high volume small errors have the same effect, and that the file is blocked until app insights has finished reading in the entries. From the discussion and looking at the PR for the fix I’m concerned this won’t actually be resolved. It’s easy to recreate the issue with the below code on a function with app insights connected, on my initial run it got to ~4000 errors logged, which did eventually make it into app insights, but took a while to catch up, on subsequent runs it didn’t get anywhere near as far through.
for (int i = 0; i < 10000; i++) { log.LogError(new Exception("error"+i), "error"+i); }
The same code but with LogInformation() gets significantly further but still suffers from the same issue.Same issue here, Azure Functions v2, publishing using package option.
We are a team inside Microsoft having the same issue. We discussed this problem with Azure support (ticket 119021321000209) and so far they suggested to turn off local logging, since we also have Application Insights active. However, turning off local logging did not fix this issue.
We use Azure Functions v2 and we Publish updates using the package (zip) option.