winston: [Bug]: FileDescriptors not properly released on file rotate
š Search Terms
file, disk usage
The problem
Environment: Ubuntu 18.04 LTS
On running the winston file transport, we found that with the tailable option is set to true, on rotation of logs, files are renamed up to max files, but old file descriptors remain open by the node process.
The result is that if you calculate file space usage using a tool like āduā the deleted files no longer show up, but the disk space is still āusedā and will be reported as used by tool like ādfā. If the node application using the winston file transport in this way is left running long enough, eventually all file descriptors on the partition will be in use by the node process, which will result in the failure to create any new files due to no file descriptor availability.
What version of Winston presents the issue?
v3.2.1 and v3.6.0
What version of Node are you using?
v14.19.0
If this worked in a previous version of Winston, which was it?
unknown (but works in Node v12 and Node v10)
Minimum Working Example
This example will set up the file rotation such that you will quickly see the issue. You can play with the MAX_SIZE, MAX_FILES, and TIMEOUT_MS if the issue is happening too fast.
'use strict';
const MAX_SIZE = 2000;
const MAX_FILES = 10;
const TIMEOUT_MS = 1000;
const winston = require('winston');
const {format} = require('logform');
const path = require('path');
const {combine, timestamp, json} = format;
// Set up file transport - I assume logs folder exists
const logsDir = path.resolve(process.cwd(), './logs');
const filePath = path.join(logsDir, 'example.log');
const fileTransport = {
tailable: true,
maxsize: MAX_SIZE,
maxFiles: MAX_FILES,
filename: filePath,
handleExceptions: true,
level: 'debug',
};
fileTransport.format = combine(
timestamp({format: 'YYYY-MM-DD HH:mm:ss.SSS'}),
json(),
);
// Create logger with file transport
const logger = winston.createLogger({
transports: [new winston.transports.File(fileTransport)],
exitOnError: false,
});
logger.log('info', 'Created file logger');
// Add periodic logs so that files are rotated
let i = 0;
setTimeout(timeoutLogger, TIMEOUT_MS);
function timeoutLogger() {
i++;
logger.log('info', 'another log message ' + i);
logger.log('error', 'error message ' + i);
logger.log('http', 'http log message ' + i);
logger.log('verbose', 'verbose log message ' + i);
logger.log('debug', 'debug log message ' + i);
setTimeout(timeoutLogger, TIMEOUT_MS);
}
While running this example, use a terminal to check the file descriptor status using:
lsof | grep example | grep deletedps aux | grep node(to find the node PID) and thenls -al /proc/${PID}/fd
Youāll see ādeletedā files associated with open file descriptors.
Additional information
As I mentioned above, this problem is occurring in Node 14 and we did NOT see it in Node 12 or Node 10. A check through the Node 14 documentation showed a change to the fd garbage collection that may be what is triggering this issue now: https://github.com/nodejs/node/pull/28396
I poked around the winston file transport code, and the problem seems to be tied to calling the _rotateFile() function multiple times (multiple logs triggering the logic that the max file size has been reached and the file needs to be rotated). That function then calls the _incFile() function multiple times. The asynchronous handling of the multiple calls seems to cause the file descriptor cruft.
Just to gain some proof, I modified the line lib/winston/transports/file.js:172 as follows, and that does fix the problem.
// Original line:
this._endStream(() => this._rotateFile());
// My change:
this._endStream(() => {
if (this._rotate === true) {
this._rotate = false;
this._rotateFile();
}
});
I donāt suppose that is a good and robust fix, but maybe it points to a potential solution. Iām happy to try out potential solutions if the experts have thoughts on that. Please let me know if you need any other information or if there is anything I can do to help resolve this.
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 5
- Comments: 23 (9 by maintainers)
Commits related to this issue
- Fix #2100 FileDescriptors release on file rotate — committed to fsievers/winston by deleted user 2 years ago
Yeah, that sounds great, thank you.
FWIW there was some good analysis done here https://github.com/winstonjs/winston/pull/2301#issuecomment-1523859559
If there is any specific info you can provide on what youāre doing so I can reproduce things with peertube on my machine, that would be helpful. Like if thereās any way to make the log rotation happen faster so that I donāt have to wait a long time to see if things are working or not, that would be helpful.
From what I can tell, #2301 solved the issue mentioned here though (at least I couldnāt reproduce it anymore on Mac or Debian with two different node versions), so Iām going to go ahead and close this one out. Please feel free to @ me in the new issue you open.
@andrewscfc is https://github.com/winstonjs/winston/pull/2231 still working for you?
@JohnXLivingston Iām assuming https://github.com/winstonjs/winston/pull/2301 did not fix things for you?
This solution https://github.com/winstonjs/winston/pull/2115 was also proposed earlier, but itās untested and apparently breaks an existing test. It sounds like it might be an attempt at an actual fix, though, as opposed to https://github.com/winstonjs/winston/pull/2231, which does seem to bandaid the issue but doesnāt fix whatever the root cause is here.
Actually, I guess https://github.com/winstonjs/winston/pull/2115 is what was mentioned in the issue report at the top by @kpalumbokitu , but it was mentioned that this might not be robust. Anyone have thoughts/opinions on whether thatās the case or not? If itās robust but ugly, and fixes things for everyone, perhaps we should go ahead and merge for now [if failing tests can be fixed]? Does anyone have time to volunteer to look a bit deeper?
Let me know - Iām happy to merge whatever folks think is the best solution here. Unfortunately the maintainers lack funding to investigate this more deeply, so weāll have to take the approach of merging whatever solution the community thinks is best, and going from there ā assuming there is at least something half-decent for us to merge š Thanks folks!
@andrewscfc @esoltys We have upgraded to winston version 3.9.0. The issue is however not resolved and we are still ādeletedā files associated with open file descriptors.
Thanks for the heads up @esoltys, I tried something similar I believe before ending up with this rather extreme workaround: https://github.com/winstonjs/winston/pull/2231/files
Weāll give the upgrade a go at somepoint
Can confirm we are seeing the same issue, 3.2.1. Using lsof reports hundreds of file descriptors holding onto the *1.log and *9.log files. The issue only seems to happen to applications which log in rapid bursts (thatās a separate issue which weāll address!) I havenāt had a chance to troubleshoot yet.