FreeRTOS-Kernel: [BUG] - TCBs and dynamic stacks leaked when scheduler is stopped
When the scheduler is stopped, TCBs and dynamic stacks of running threads are not being freed. While ports, such as the posix port, clean up their resources, it appears as if the kernel itself is not freeing these resources when the scheduler is stopped.
TCBs are leaked from here, tasks.c:1670
/* Allocate space for the TCB. */
/* MISRA Ref 11.5.1 [Malloc memory assignment] */
/* More details at: https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/main/MISRA.md#rule-115 */
/* coverity[misra_c_2012_rule_11_5_violation] */
pxNewTCB = ( TCB_t * ) pvPortMalloc( sizeof( TCB_t ) ); <----------------------------
Stacks are leaked from here, tasks.c:1662
/* MISRA Ref 11.5.1 [Malloc memory assignment] */
/* More details at: https://github.com/FreeRTOS/FreeRTOS-Kernel/blob/main/MISRA.md#rule-115 */
/* coverity[misra_c_2012_rule_11_5_violation] */
pxStack = pvPortMallocStack( ( ( ( size_t ) usStackDepth ) * sizeof( StackType_t ) ) );
if( pxStack != NULL )
{
@chinglee-iot with the merge of the posix changes I’m making use of them here for some tests. I’m not sure how it was missed before, but it looks like we overlooked the leaking of task TCBs and dynamic task stacks.
A task’s TCB appears to be freed in prvDeleteTCB(), but it’s unclear how the posix port’s cancelling and deleting of threads accomplishes this.
Should vTaskStartScheduler() delete the TCBs and dynamic stacks for any running threads before returning? It feels like the port shouldn’t really be doing this since TCBs/dynamic stacks are created and managed by FreeRTOS kernel proper.
Valgrind trace that got me started:
==1415351== HEAP SUMMARY:
==1415351== in use at exit: 37,088 bytes in 4 blocks
==1415351== total heap usage: 2,506 allocs, 2,502 frees, 491,264 bytes allocated
==1415351==
==1415351== 160 bytes in 1 blocks are possibly lost in loss record 2 of 4
==1415351== at 0x4845828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1415351== by 0x1F0BB6: pvPortMalloc (heap_3.c:65) <---------------- this is a TCB
==1415351== by 0x1EB38D: prvCreateTask (tasks.c:1670)
==1415351== by 0x1EB460: xTaskCreate (tasks.c:1722)
==1415351== by 0x1E6405: WaterTemperatureSim::Enable() (WaterTemperatureSim.h:51)
==1415351== by 0x1E59D1: WaterTemperatureSimTest(void*) (test_WaterTemperatureSim.cpp:23)
==1415351== by 0x1F1442: prvWaitForStart (port.c:507)
==1415351== by 0x4C8CAD9: start_thread (pthread_create.c:444)
==1415351== by 0x4D1D2E3: clone (clone.S:100)
==1415351==
==1415351== 4,000 bytes in 1 blocks are possibly lost in loss record 3 of 4
==1415351== at 0x4845828: malloc (in /usr/libexec/valgrind/vgpreload_memcheck-amd64-linux.so)
==1415351== by 0x1F0BB6: pvPortMalloc (heap_3.c:65) <---------------- this is a task dynamic stack
==1415351== by 0x1EB378: prvCreateTask (tasks.c:1662)
==1415351== by 0x1EB460: xTaskCreate (tasks.c:1722)
==1415351== by 0x1E6405: WaterTemperatureSim::Enable() (WaterTemperatureSim.h:51)
==1415351== by 0x1E59D1: WaterTemperatureSimTest(void*) (test_WaterTemperatureSim.cpp:23)
==1415351== by 0x1F1442: prvWaitForStart (port.c:507)
==1415351== by 0x4C8CAD9: start_thread (pthread_create.c:444)
==1415351== by 0x4D1D2E3: clone (clone.S:100)
About this issue
- State: closed
- Created 5 months ago
- Comments: 17 (17 by maintainers)
@chinglee-iot the processing of deferred deletion during vTaskEndScheduler looks like it will resolve the remaining leaks. Feel free to mention me on any PR and I can test here.
@cmorganBE
You point out some problems. vTaskDelete() and vTaskEndScheduler() also need to be updated to address them.
This is true. If the last running task, which is the task calling vTaskEndScheduler(), is deleted after the scheduler has stopped, we can get out of this dilemma: the task’s resources (such as its stack) will no longer be in use, so the task can be safely deleted.
The following jobs need to be done in vPortEndScheduler() to delete the last running task:
The application can call vTaskDelete() to delete the last running task after vTaskEndScheduler() is called.
The idle task takes care of deferred deletion. There is a chance that the idle task does not get to run before vTaskEndScheduler() is called. Therefore, the deferred deletion should also be done in vTaskEndScheduler().
I am creating a PR to address these problems. It will still take some time to check a couple of tests. Once it is ready, I will update this thread again to discuss the solution.
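To make the deferred-deletion point concrete, here is a small toy model in plain C. It is not kernel code: `ToyTask`, `toyCreate`, `toyDeleteSelf`, `toyDrainPendingDeletions` and `toyEndScheduler` are hypothetical names invented for this sketch, and the real kernel data structures differ. It only illustrates the idea that a task deleting itself is queued for later cleanup, and that the end-of-scheduler path has to drain that queue in case the idle task never got to run.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Toy model only: hypothetical stand-ins, not the FreeRTOS kernel's types. */
typedef struct ToyTask
{
    void * stack;           /* models the dynamically allocated task stack */
    struct ToyTask * next;  /* link in the pending-deletion list */
} ToyTask;

static ToyTask * pendingDeletion = NULL; /* tasks waiting for cleanup */
static int liveAllocations = 0;          /* outstanding heap blocks in the model */

/* Models task creation: one allocation for the TCB, one for the stack. */
ToyTask * toyCreate( size_t stackBytes )
{
    ToyTask * task = ( ToyTask * ) malloc( sizeof( ToyTask ) );

    if( task == NULL )
    {
        return NULL;
    }

    task->stack = malloc( stackBytes );

    if( task->stack == NULL )
    {
        free( task );
        return NULL;
    }

    task->next = NULL;
    liveAllocations += 2;
    return task;
}

/* Models a task deleting itself: it is still running on its own stack,
 * so nothing is freed here; the task is only queued for later cleanup. */
void toyDeleteSelf( ToyTask * task )
{
    assert( task != NULL );
    task->next = pendingDeletion;
    pendingDeletion = task;
}

/* Models the cleanup normally left to the idle task.
 * Returns the number of tasks whose memory was released. */
int toyDrainPendingDeletions( void )
{
    int freed = 0;

    while( pendingDeletion != NULL )
    {
        ToyTask * task = pendingDeletion;

        pendingDeletion = task->next;
        free( task->stack );
        free( task );
        liveAllocations -= 2;
        freed++;
    }

    return freed;
}

/* Models ending the scheduler: the idle task may never have run,
 * so the pending list is drained here as well. */
void toyEndScheduler( void )
{
    ( void ) toyDrainPendingDeletions();
}

/* Reports how many heap blocks the model still holds. */
int toyLiveAllocations( void )
{
    return liveAllocations;
}
```

In this model, creating two tasks and having both delete themselves leaves four blocks allocated until `toyEndScheduler()` runs. Without the drain in the end-of-scheduler path, those blocks would remain allocated, which is the same shape as the leak valgrind reported.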
There are 2 types of kernel objects -
In order to keep the resource ownership clear, I think the application should delete the kernel objects that it creates and the kernel should delete the ones it creates. This probably means that we need to update our documentation as well.
@cmorganBE Would it solve your problem? With this, you can delete all the kernel objects (including tasks) in your test harness before calling vTaskEndScheduler() and valgrind should report no memory leak.
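One way a test harness could follow this ownership rule is to record every task handle it creates and hand them all back for deletion before ending the scheduler. The sketch below is a hypothetical helper, not part of FreeRTOS: `HarnessRegistry`, `HarnessHandle`, `harnessInit`, `harnessTrack` and `harnessTakeNext` are names made up here, and `HarnessHandle` is a `void *` stand-in so that the block compiles without the kernel headers.

```c
#include <assert.h>
#include <stddef.h>

#define HARNESS_MAX_TASKS 8 /* arbitrary capacity chosen for this sketch */

/* Stand-in for a task handle; a real harness would store TaskHandle_t. */
typedef void * HarnessHandle;

typedef struct
{
    HarnessHandle handles[ HARNESS_MAX_TASKS ];
    size_t count;
} HarnessRegistry;

/* Start with an empty registry. */
void harnessInit( HarnessRegistry * reg )
{
    assert( reg != NULL );
    reg->count = 0;
}

/* Record a handle the harness created. Returns 1 on success, 0 if full. */
int harnessTrack( HarnessRegistry * reg, HarnessHandle handle )
{
    assert( reg != NULL );

    if( reg->count >= HARNESS_MAX_TASKS )
    {
        return 0;
    }

    reg->handles[ reg->count ] = handle;
    reg->count++;
    return 1;
}

/* Hand back the most recently tracked handle and forget it.
 * Returns 1 if a handle was written to *out, 0 if the registry is empty. */
int harnessTakeNext( HarnessRegistry * reg, HarnessHandle * out )
{
    assert( reg != NULL && out != NULL );

    if( reg->count == 0 )
    {
        return 0;
    }

    reg->count--;
    *out = reg->handles[ reg->count ];
    return 1;
}

/* Intended use in a real harness (assumption, not compiled here):
 *
 *     while( harnessTakeNext( &reg, &handle ) )
 *     {
 *         vTaskDelete( ( TaskHandle_t ) handle );
 *     }
 *     vTaskEndScheduler();
 */
```

Handles come back newest first, so tasks are deleted in reverse order of creation. The sketch does not cover the task that calls vTaskEndScheduler() itself; that case is the deferred-deletion problem discussed elsewhere in this thread.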
@cmorganBE Thank you for reporting back. I will take a look at this issue.