got: Don't force query string normalization
What problem are you trying to solve?
In a URL like http://example.org/random?param=SOMETHING~SOMETHING, the special character ~ is percent-encoded before the request, resulting in http://example.org/random?param=SOMETHING%7ESOMETHING, which is not supported (decoded) by some HTTP servers.
As described by RFC 3986 in section 2.3: "URI comparison implementations do not always perform normalization prior to comparison. For consistency, percent-encoded octets in the ranges of ALPHA (%41-%5A and %61-%7A), DIGIT (%30-%39), hyphen (%2D), period (%2E), underscore (%5F), or tilde (%7E) should not be created by URI producers and, when found in a URI, should be decoded to their corresponding unreserved characters by URI normalizers."
Also in RFC 3986, section 6.2.2.2: "The percent-encoding mechanism is a frequent source of variance among otherwise identical URIs. In addition to the case normalization issue noted above, some URI producers percent-encode octets that do not require percent-encoding, resulting in URIs that are equivalent to their non-encoded counterparts. These URIs should be normalized by decoding any percent-encoded octet that corresponds to an unreserved character, as described in Section 2.3."
The percent-encoding of ~ by got happens because Node.js follows the “WHATWG URL API” (https://nodejs.org/api/url.html#url_the_whatwg_url_api), which omits ~ from the unreserved characters (https://url.spec.whatwg.org/#interface-urlsearchparams, the Note below the example, and https://url.spec.whatwg.org/#urlencoded-serializing) and, by the way, includes *.
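The difference can be reproduced in Node.js without got. This is a minimal sketch, assuming a Node.js version where URLSearchParams is available as a global; the parameter names are illustrative only.

```js
// URLSearchParams serializes with the application/x-www-form-urlencoded rules.
const params = new URLSearchParams();
params.append('param', 'SOMETHING~SOMETHING');
params.append('star', 'a*b');

const serialized = params.toString();
console.log(serialized); // param=SOMETHING%7ESOMETHING&star=a*b

// encodeURIComponent, by contrast, leaves both "~" and "*" untouched.
const component = encodeURIComponent('SOMETHING~SOMETHING*');
console.log(component); // SOMETHING~SOMETHING*
```

So the same runtime offers two encoders that disagree on ~, and the form-urlencoded one is the one that ends up producing %7E.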
Describe the feature
My proposal is to add a flag to the options that prevents the normalization by skipping the append and delete of “_GOT_INTERNAL_TRIGGER_NORMALIZATION”.
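A rough sketch of what such a flag could look like follows. The helper prepareUrl and the option name skipSearchNormalization are hypothetical and are not part of got's API; the code only mimics the append/delete step on a plain WHATWG URL object.

```js
// Hypothetical helper, NOT got's implementation: it mirrors the
// append/delete trick described above and adds an opt-out flag.
const TRIGGER = '_GOT_INTERNAL_TRIGGER_NORMALIZATION';

function prepareUrl(input, {skipSearchNormalization = false} = {}) {
  const target = new URL(input);

  if (!skipSearchNormalization && target.search) {
    // Touching searchParams makes the URL re-serialize its whole query
    // with the form-urlencoded rules, which is what encodes "~".
    target.searchParams.append(TRIGGER, '');
    target.searchParams.delete(TRIGGER);
  }

  return target;
}

const input = 'http://example.org/random?param=SOMETHING~SOMETHING';
const normalized = prepareUrl(input).href;
const untouched = prepareUrl(input, {skipSearchNormalization: true}).href;

console.log(normalized); // http://example.org/random?param=SOMETHING%7ESOMETHING
console.log(untouched);  // http://example.org/random?param=SOMETHING~SOMETHING
```

With the flag set, the query string is sent exactly as the caller wrote it; without it, the behaviour described in this issue is unchanged.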
Checklist
- I have read the documentation and made sure this feature doesn’t already exist.
About this issue
- State: closed
- Created 4 years ago
- Comments: 34 (1 by maintainers)
It’s not invalid. Your statement would be true if the format applied to the query string were application/x-www-form-urlencoded, which requires that the % character be escaped. What about a custom format that doesn’t use percent escapes? It’s still within the specs: as described before, HTTP does not have specs about the query string content (except for the # character, as it’s the terminator of the query string).
You can argue that application/x-www-form-urlencoded is a spec and Got follows it, but on the homepage I read
and not
What I really meant with
was that that’s not the problem.
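The point about query strings that are not form-urlencoded can be checked with the WHATWG URL class alone. A small sketch follows; the URL, the parameter names, and the temporary key _tmp are made up for illustration.

```js
// A query in a custom format: a literal "%" and a key without "=".
const target = new URL('http://example.org/search?q=100%&flag');
const before = target.search;

// The same append/delete round trip forces a form-urlencoded re-serialization.
target.searchParams.append('_tmp', '');
target.searchParams.delete('_tmp');
const after = target.search;

console.log(before); // ?q=100%&flag
console.log(after);  // ?q=100%25&flag=
```

The URL parser accepts the original query unchanged, but the round trip through searchParams rewrites it, so a server expecting the custom format receives something different from what the caller wrote.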
I know this “Feature request” started as “I’ve a problem with ~, damn WHATWG spec”, but after this comment (https://github.com/sindresorhus/got/issues/1234#issuecomment-625757221), it became a “Bug report” on the incorrect handling of the query string.
I wanted to discuss this issue exhaustively because, as you said, it would be a breaking change. In particular, I think it is going to break the merging of the URL params.