hyper: illegal header name from httparse

Why do we have this defined as a panic? https://github.com/hyperium/hyper/blob/dbea7716f157896bf7d2d417be7b4e382e7dc34f/src/proto/h1/role.rs#L36 https://github.com/hyperium/hyper/blob/dbea7716f157896bf7d2d417be7b4e382e7dc34f/src/proto/h1/role.rs#L47 https://github.com/hyperium/hyper/blob/dbea7716f157896bf7d2d417be7b4e382e7dc34f/src/proto/h1/role.rs#L60

Even if under debug_assertions this library has no right to panic as it’s completely normal for it to be used on invalid input(I use it in a broad web crawler and web is full of surprises - good and bad). It should always simply return an error back to the user if anything bad happened

as I understand it panics are reserved for unrecoverable errors, and because this is a library it has no right to panic and crash client code… Error “unrecoverability” is completely at library’s user discretion

About this issue

  • Original URL
  • State: closed
  • Created 3 years ago
  • Reactions: 1
  • Comments: 37 (17 by maintainers)

Most upvoted comments

To wrap up, this specific issue (the parse error) is fixed in a new httparse release. You can get by running cargo update -p httparse.

The problem was that httparse introduced the ability to parse HTTP responses without a reason-phrase, but had a bug in how it handled those when only followed by LF. The status code, the newline, and the next header name would parsed as the first header. However, http::HeaderName would return an error that a newline is not valid. The LF-only case of a reason-phrase-less response is now handled the same way it would a response that did have a reason-phrase.

Because the original issue is fixed, I’m going to close this issue.

The other part that has been discussed in this issue is around the internal assertion panicking. I’ve opened #2546 to track changing that with something more appropriate.

All browsers and many HTTP libs support that, AFAIK most of them do so by default and don’t provide a setting.

I completely forgot about this but chatting with @Geal this weekend I was reminded that RFC 7230 actually says that LF may also be accepted in addition to CRLF:

Although the line terminator for the start-line and header fields is the sequence CRLF, a recipient MAY recognize a single LF as a line terminator and ignore any preceding CR.

So that’s why httparse supports LF too (just writing this in case someone is curious about this stuff).

Checking http://watson.addy.com/ with curl, it uses 0x0a to separate header lines and 0x0a 0x0a to end headers which is of course incorrect.

Should that be parsed into headers as it’s not correct HTTP? I think that should return an error.

Probably there should be an option somewhere about how strict we want to follow http standard, and if we still can parse something malformed and this option is on(non_strict_http: true) we should make our best to parse

All browsers and many HTTP libs support that, AFAIK most of them do so by default and don’t provide a setting.

Note that httparse is already more lax than the RFCs, too.

That said, I think we can change this to something better here.

I think it’s best to keep the panic, personally. There are other cases where we found panics in h2 or hyper or whatever and we just fixed those.

Saying that it shouldn’t panic ever is in my opinion a little bit grandstanding, as tokio will catch that at task boundary anyway.

FWIW the bug is fixed in httparse 1.4.1.

seanmonstar/httparse#96

Should that be parsed into headers as it’s not correct HTTP? I think that should return an error.

Checking http://watson.addy.com/ with curl, it uses 0x0a to separate header lines and 0x0a 0x0a to end headers which is of course incorrect.

Here’s start of the response:

00000000  48 54 54 50 2f 31 2e 30  20 32 30 30 0a 43 6f 6e  |HTTP/1.0 200.Con|
00000010  74 65 6e 74 2d 74 79 70  65 3a 20 74 65 78 74 2f  |tent-type: text/|
00000020  68 74 6d 6c 0a 0a 3c 68  74 6d 6c 3e 0a 3c 68 65  |html..<html>.<he|
00000030  61 64 3e 0a 3c 73 74 79  6c 65 3e 0a 68 31 7b 0a  |ad>.<style>.h1{.|

I believe you should be able to wrap future_with_hyper_that_may_panic into std::panic::AssertUnwindSafe

@let4be Could you give us some code to repro the issue? Maybe the HTTP message that was parsed? That would be helpful.