node-continuation-local-storage: The problems from my 1+ year experience with CLS
I’m writing this as a note to myself and maybe to others who plan to use, or already use, CLS (continuation-local-storage).
I ran into several practical problems with it. Another issue of mine has been hanging for over half a year without an answer (I’m not even sure it’s still relevant after the AL updates), so I’m not asking the devs for anything. Anyway, I’m not using CLS for anything serious any more.
I’m just noting the things I ran into over a year+ of trying to make it work (with occasional success), so that people know the difficulties ahead, because some of them are rather hidden.
The patching problem:
- There are many modules which do not support CLS: bluebird, mongoose, and a bunch of others. CLS just doesn’t work with them, or works with serious bugs (yes, I know why).
- Most modules can be monkeypatched to support CLS. But that’s only possible if the needed functionality is exported, otherwise I need to fork/really patch it.
- In practice the monkeypatch tends to break on new major versions of the modules. Repatching then means spending time diving deep into the updated module once again.
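To make the monkeypatching point concrete, here is a minimal sketch of the usual shape of such a patch. Everything in it is an assumption for illustration: `lib` is a hypothetical CLS-unfriendly module, `patchCallbackMethod` is a made-up helper, and Node's built-in `AsyncLocalStorage` and `AsyncResource.bind` stand in for a CLS namespace and its `bind` (they need a Node version that ships them).

```javascript
const { AsyncLocalStorage, AsyncResource } = require('async_hooks');

// Hypothetical helper: wrap target[methodName] so that a trailing callback
// argument is passed through `bind` before the original method sees it.
function patchCallbackMethod(target, methodName, bind) {
  const original = target[methodName];
  if (typeof original !== 'function') {
    // Nothing exported to wrap: this is the case that forces a fork.
    throw new TypeError(`${methodName} is not an exported function`);
  }
  target[methodName] = function patched(...args) {
    const last = args.length - 1;
    if (typeof args[last] === 'function') args[last] = bind(args[last]);
    return original.apply(this, args);
  };
}

// Hypothetical CLS-unfriendly library: callbacks wait in an internal queue.
const lib = {
  queue: [],
  defer(cb) { this.queue.push(cb); },
  drain() { while (this.queue.length) this.queue.shift()(); },
};

// Stand-in for binding a callback to the current CLS context.
patchCallbackMethod(lib, 'defer', (fn) => AsyncResource.bind(fn));

const store = new AsyncLocalStorage();
let seenByCallback;
store.run({ requestId: 1 }, () => {
  lib.defer(() => { seenByCallback = store.getStore(); });
});
lib.drain(); // called outside any context; the patched callback still sees requestId 1
```

The patch depends on `defer` being reachable from outside and on its argument layout staying the same, which is why a new major version of the patched module can silently break it.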
The private versions patching problem:
- Even if I manage to patch something, like bluebird v3, there’s a chance that a 3rd-party module in my app uses its own version of bluebird, say v2, deep inside its node_modules hierarchy.
- So I must watch not only the modules I need, but also track every module in the `node_modules` tree of my app, to see if there’s a private version of a CLS-unfriendly module, and patch that. Every module/submodule/subsubmodule/… must support CLS or be patched for it.
- That greatly extends the area of maintenance I need to cover to keep everything CLS’able. It also leaves a lot of room for bugs if something updates and I miss it.
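As a rough illustration of the tracking this requires, the sketch below walks a `node_modules` tree and lists every installed copy of a given package along with its version. The function names `packageDirs` and `findCopies` are made up for this example, and it assumes a plain nested layout (symlinked packages are skipped).

```javascript
const fs = require('fs');
const path = require('path');

// Return the package directories directly inside one node_modules folder,
// descending one level into @scope folders. Symlinked entries are skipped.
function packageDirs(modulesDir) {
  const dirs = [];
  for (const entry of fs.readdirSync(modulesDir, { withFileTypes: true })) {
    if (!entry.isDirectory() || entry.name.startsWith('.')) continue;
    const full = path.join(modulesDir, entry.name);
    if (entry.name.startsWith('@')) dirs.push(...packageDirs(full));
    else dirs.push(full);
  }
  return dirs;
}

// Recursively collect every installed copy of `name` below `root`.
function findCopies(root, name, found = []) {
  const modulesDir = path.join(root, 'node_modules');
  if (!fs.existsSync(modulesDir)) return found;
  for (const pkgDir of packageDirs(modulesDir)) {
    const manifest = path.join(pkgDir, 'package.json');
    if (fs.existsSync(manifest)) {
      const pkg = JSON.parse(fs.readFileSync(manifest, 'utf8'));
      if (pkg.name === name) found.push({ version: pkg.version, dir: pkgDir });
    }
    findCopies(pkgDir, name, found); // private copies nest under the dependency
  }
  return found;
}

// Usage (from the app root): findCopies(process.cwd(), 'bluebird')
```

More than one entry in the result means there is more than one copy to patch, and the list has to be rechecked after every install or update.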
The dangerous bugs problem:
- The bugs introduced by a missed CLS-unfriendly update of a module or its submodule may be subtle yet really destructive and hard to undo. Imagine a payment going to the wrong account. Or a regular user getting an admin view of a page with all the secret information.
- That prevents relying on CLS even though it mostly works.
I admire the idea behind the module/async-listener. Async chains in Node.js are like green threads. A context for an async chain is so cool, it’s a must. It’s like thread-local variables, which are widely used by so many systems.
Maybe there’s something that can be done in the node core to make it work everywhere, make every module CLS’able without critical perf loss?
It would be great to get more attention to the module and its functionality from the minds of the community.
About this issue
- Original URL
- State: open
- Created 8 years ago
- Reactions: 27
- Comments: 22 (3 by maintainers)
`AsyncWrap` is indeed a good alternative to `async-listener`, but it doesn’t provide enough functionality to cover the continuation-local-storage use cases.
OP mentioned ‘the patching problem’. I call it the ‘user space queuing problem’ (more details here). It stems from the fact that any user space code (JS or native) can queue callbacks and resume them later in a different context. The status quo has been to discover such code and monkey-patch it to ensure it continues working. This is a never-ending battle.
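The queuing problem can be reproduced in a few lines. This is only a sketch under stated assumptions: `enqueue`, `flush` and the `pending` array are a hypothetical user-space queue, and Node's built-in `AsyncLocalStorage` stands in for a CLS namespace (it requires a Node version that ships it, along with `AsyncResource.bind`).

```javascript
const { AsyncLocalStorage, AsyncResource } = require('async_hooks');

const als = new AsyncLocalStorage();

// Hypothetical user-space queue: callbacks are stored now and run later,
// by whoever happens to call flush().
const pending = [];
function enqueue(cb) { pending.push(cb); }
function flush() { while (pending.length) pending.shift()(); }

const seen = [];

// Request from "alice" queues two callbacks.
als.run({ user: 'alice' }, () => {
  // Unbound: will observe whatever context is active when flush() runs.
  enqueue(() => seen.push(['naive', als.getStore().user]));
  // Bound at enqueue time: keeps the context it was created in.
  enqueue(AsyncResource.bind(() => seen.push(['bound', als.getStore().user])));
});

// Request from "bob" happens to trigger the flush.
als.run({ user: 'bob' }, () => flush());

console.log(seen); // [ [ 'naive', 'bob' ], [ 'bound', 'alice' ] ]
```

The unbound callback silently picks up bob's context even though alice queued it, which is the kind of cross-request leak described in the original post. Binding at enqueue time fixes it, but only for queues someone has found and wrapped.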
IMO a longer term solution would be to
Ideally we want this to work with no monkey patching at all.
Authors of bluebird have indicated that they have little interest in propagating context for continuation-local-storage; they do propagate the current domain, however, since domains are part of core. This is an understandable position, and bluebird is not the only module with this disposition.
In my opinion, to get a solution that is better than ‘mostly works’, we need:
As of Node 6, Trevor Norris’s AsyncWrap API is a documented, stable part of the core Node API. It would be excellent to port CLS to take advantage of AsyncWrap when it’s available, as I believe that it addresses a number of the points raised above (and Trevor is a much better steward of AsyncWrap than I’ve been able to be for `async-listener`, which is after all only my backport of Trevor’s previous sketch at this kind of API). It would also be excellent to see CLS extended to directly incorporate all of the known shims for packages like Bluebird that require a little extra help to work with CLS.
If anyone felt like taking on that work, I would be happy to add them to the repo and list of owners on npm, because it’s quite clear to me that my other commitments don’t afford me the time to take on that work, which is pretty time-consuming and finicky.
The overall tracking issue for any changes to async-hooks is at https://github.com/nodejs/diagnostics/issues/124.
For what it is worth, we use `cls-hooked` on top of `async-hooks` in production without any real issues. It generally just works. I am pretty sure there is nothing else out there that is any better, and to be honest I also think it is good enough in the vast majority of cases.
Yes, that’s exactly the same issue that I was talking about in the beginning of the thread. So the async hook solution is no different.
My original issue is still relevant, unfortunately.
@holm and others who use cls-hooked in production.
After a close look, I really see no working difference between this module and cls-hooked. The low-level internals are different: hook-patching instead of monkey-patching 😃, but the problems are the same. The principle is the same.
Mongoose, bluebird: exactly the same failures.
Or please explain 😃
Thanks for the prompt answer! @holm @overlookmotel
I am asking because we hit a strange problem when using cls in a Node 8.x environment (not sure if this is the root cause): the context is not removed properly after the execution ends, which causes cls to return the context of another async execution (this is very serious in our use case, as we use cls to store user permission information). After studying the cls source code, we suspect that this is related to bugs in async-listener, so we are asking whether there is any plan for cls to be integrated with Async Hooks.
@ckitt I think the plan is for CLS to be rewritten to utilise Async Hooks.
The cls-hooked module claims to do exactly this, but I don’t know how stable it is at present.
Also, Async Hooks has been going through some changes to tackle problems that became evident at its first release, particularly as it relates to Promises. Again, I can’t tell you how far this work has got (I’m out of date) or whether Async Hooks is really ready for prime time at this point.