uppy: Uploads don't start until a client connects, fail in a multi-instance environment
I’m not sure I understand the logic behind this mechanism; IMO, getting progress updates is not worth delaying the start of the transfer.
From upload.js:

```js
// wait till the client has connected to the socket, before starting
// the download, so that the client can receive all download/upload progress.
logger.debug('Waiting for socket connection before beginning remote download/upload.', null, req.id)
await uploader.awaitReady()
logger.debug('Socket connection received. Starting remote download/upload.', null, req.id)
```
From Uploader.js:

```js
async awaitReady () {
  // TODO timeout after a while? Else we could leak emitters
  logger.debug('waiting for socket connection', 'uploader.socket.wait', this.shortToken)
  await new Promise((resolve) => emitter().once(`connection:${this.token}`, resolve))
  logger.debug('socket connection received', 'uploader.socket.wait', this.shortToken)
}
```
I’m setting up multiple companion pods in a k8s cluster, so it can happen that an upload starts on one instance while the client’s socket connection is received by a different instance. As a result, the first instance’s `awaitReady` is stuck forever and the upload never starts.
So basically my questions are:
- What am I missing in my multiple-instances setup?
- Can we please make this wait-for-client-connection logic configurable, or add a configurable timeout as the TODO comment suggests?
- I set up Redis as the documentation suggests; why isn’t this allowing the second instance (the one that receives the connection) to commence the upload?
About this issue
- Original URL
- State: closed
- Created 2 years ago
- Reactions: 1
- Comments: 18 (16 by maintainers)
Commits related to this issue
- document Companion multi-instance feature #3538 — committed to transloadit/uppy by mifi 2 years ago
Actually when I look more closely at the code, unless my brain is too tired, I can see that socket.js does indeed proxy/forward all websocket messages to redis. https://github.com/transloadit/uppy/blob/fc6f8f3fe5700fc59cd89dddeb21183af0129834/packages/%40uppy/companion/src/server/socket.js#L57
This would imply that we do actually support fault-tolerant scaling between multiple companion instances on the same domain 🤯
consider this:
- the client does a `/get` request on a file, handled by instance 1, which runs `await uploader.awaitReady()` and waits for the `connection:${this.token}` event sent over redis
- the client’s socket connects to instance 2, which emits the `connection:${this.token}` event over redis

So if this theory is correct, then I think it should just work if you enable redis in your setup. In that case we need to update our docs as well.
The explanation for why express sessions are being used (even though they don’t seem to work over XHR) is this: Grant uses express sessions for sending data between redirects, so that we don’t need to expose secret tokens in URLs https://github.com/simov/grant#callback-session - after that the session is no longer used. And because these are not XHR requests coming from Uppy, the CORS configuration doesn’t matter (the browser is making requests directly to companion in a new tab/window!)
If this really works, I think we should make an integration or e2e test for running with redis and a load balancer, to confirm that switching between two instances on upload really works.
I’ve done some more research and found that companion does in fact not support true fault-tolerant, non-sticky load balancing with multiple companion instances behind one (sub)domain (where any client request can go to any instance).
Instead it supports a different type of scaling where each companion instance needs its own domain name, e.g. `companion1.example.com`, `companion2.example.com`, and so on. There will be one “master” companion instance that handles all authentication, in order to avoid having to set up separate callback URLs for each instance’s subdomain on the OAuth providers’ admin pages. This master instance is configured using the options `oauthDomain` and `validHosts`. This means that you don’t need to set up sticky sessions, redis, or anything like that, because Uppy/Companion handles stickiness internally (actually the express session code isn’t really in use, AFAIK).
The only thing you need to do is set `companionUrl` in Uppy to a URL with a randomised companion instance from a list of subdomains. I found this information in #1845 and #2321
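Picking a randomised instance on the client could look something like the sketch below. The subdomain list and the `pickCompanionUrl` helper are illustrative assumptions; only the `companionUrl` option name comes from the scheme described above.

```javascript
// Hypothetical client-side helper: pick one companion subdomain at random so
// each upload session sticks to a single instance. The hostnames are
// placeholders for your own deployment.
const companionHosts = [
  'https://companion1.example.com',
  'https://companion2.example.com',
  'https://companion3.example.com',
]

function pickCompanionUrl (hosts) {
  return hosts[Math.floor(Math.random() * hosts.length)]
}
```

You would then pass the chosen URL as `companionUrl` when configuring a remote-provider plugin in Uppy, so all requests for that session land on the same instance.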
I will update the documentation because this is a frequently requested feature.