workers-rs: [BUG] WebSocket sends the wrong data

Is there an existing issue for this?

  • I have searched the existing issues

What version of workers-rs are you using?

0.0.18

What version of wrangler are you using?

3.6.0

Describe the bug

WebSocket::send_with_bytes reproducibly corrupts outgoing binary data in some situations. I have included a repro where [00 01 02] becomes [65 00 01].

Repro repo: https://github.com/paulgb/cloudflare-repro/tree/main

I been able to reproduce this on Mac and Linux.

Steps To Reproduce

  1. Clone https://github.com/paulgb/cloudflare-repro.git
  2. npm i && npm run dev
  3. open another terminal and cd into client
  4. npm i && npx ts-node ./client.ts

Observe that the output is received: <Buffer 65 00 01> rather than received: <Buffer 01 02 03>.

About this issue

  • Original URL
  • State: closed
  • Created 10 months ago
  • Reactions: 3
  • Comments: 15 (15 by maintainers)

Most upvoted comments

I hit the same issue. After some investigation it seems the bug is in web-sys itself, and may be related to how wasm_bindgen passes &[u8] to the Websocket.send() on the js side. This bug doesn’t occur if you manually create an array buffer first:

let uint8_array = Uint8Array::from(data.as_slice());
ws.as_ref().send_with_array_buffer(&uint8_array.buffer())?;

Thanks for investigating this. It’s an interesting bug!

If I understand correctly, the runtime is reading that memory concurrently with the wasm code freeing and reusing it.

What protects us from this exact bug coming back? (I assume future maintainers may not know the history of this problem.) Should there be a big comment next to any code that transfers a buffer to a websocket? This should set off alarms to rust engineers, who have become accustomed to not needing such load-bearing comments in safe code.

What if someone adds another send function in the future? How do they know what is allowed? Normally we could rely on the rust type system to uphold ownership and lifetime guarantees, but if the runtime and web_sys can’t agree on what those guarantees are, then the compiler can’t protect us, which is a big reason to use Rust in the first place.

I doubt that the optimizer would learn how to elide unnecessary-looking memory copies in FFI interfaces, but if I’m wrong, could this fail again on a newer rust toolchain?

Are there other places where the runtime makes this assumption, where lifetime/ownership assumptions are different from other runtimes (and from web_sys)?

If there are widespread disagreements over memory ownership or lifetimes between web_sys and the workers runtime, then it seems like the robust fix would be for one or the other to change so that they agree on the lifetime of that memory. Maybe the worker crate should fork the affected web_sys modules?

Ok, I’m able to reproduce your new POC.

What we believe is happening is that Rust is assuming that the runtime will fully clone the provided buffer when send is called. Our runtime does not clone as this is generally wasteful, and instead asynchronously sends the original buffer. This behavior may differ in other runtimes, but this is the first time we have found it to be an issue. This would apply in any case where the ArrayBuffer is mutated right after calling send. In JavaScript, this is generally not an issue unless the user explicitly mutates the buffer, as the memory will not be freed due to the runtime holding a strong reference to the buffer. In the case of Wasm, the memory is opaque to the runtime and Rust appears to free the buffer right away after calling send.

I don’t believe this is a security issue (only the Wasm linear memory is accessed), but it is probably unintended behavior for those calling web_sys::Websocket.send_with_u8_array directly.

In the case of workers-rs, I believe the patch that you suggested is a good fix because it explicitly forces this clone into a JavaScript object.

cc @kentonv