Commit 52b3218

Merge pull request #4797 from handrews/pctenc
v3.2: Editorial improvements to Appendix E (Percent-Encoding)
2 parents 741a0e7 + 2dc01c9 commit 52b3218

1 file changed

+14
-3
lines changed

src/oas.md

Lines changed: 14 additions & 3 deletions
Original file line number | Diff line number | Diff line change
@@ -4374,7 +4374,18 @@ For multiple values, `style: "form"` is always incorrect as name=value pairs in
43744374
_**NOTE:** In this section, the `application/x-www-form-urlencoded` and `multipart/form-data` media types are abbreviated as `form-urlencoded` and `form-data`, respectively, for readability._
43754375

43764376
Percent-encoding is used in URIs and media types that derive their syntax from URIs.
4377-
This process is concerned with three sets of characters, the names of which vary among specifications but are defined as follows for the purposes of this section:
4377+
The fundamental rules of percent-encoding are:
4378+
4379+
* The set of characters that MUST be encoded varies depending on which version of which specification you use, and (for URIs) in which part of the URI the character appears.
4380+
* The way an unencoded `+` character is decoded depends on whether you are using `application/x-www-form-urlencoded` rules or more general URI rules; this is the only time where choice of decoding algorithm can change the outcome.
4381+
* Encoding more characters than necessary is always safe in terms of the decoding process, but may produce non-normalized URIs.
4382+
* In practice, some systems tolerate or even expect unencoded characters that some or all percent-encoding specifications require to be encoded; this can cause interoperability issues with more strictly compliant implementations.
4383+
4384+
The rest of this appendix provides more detailed guidance based on the above rules.
4385+
4386+
### Percent-Encoding Character Classes
4387+
4388+
This process is concerned with three classes of characters, the names of which vary among specifications but are defined as follows for the purposes of this section:
43784389

43794390
* _unreserved_ characters do not need to be percent-encoded; while it is safe to percent-encode them, doing so produces a URI that is [not normalized](https://datatracker.ietf.org/doc/html/rfc3986#section-6.2.2.2)
43804391
* _reserved_ characters either have special behavior in the URI syntax (such as delimiting components) or are reserved for other specifications that need to define special behavior (e.g. `form-urlencoded` defines special behavior for `=`, `&`, and `+`)
@@ -4423,7 +4434,7 @@ Note that content-based serialization for `form-data` does not expect or require
44234434

44244435
#### Interoperability with Historical Specifications
44254436

4426-
In most cases, generating query strings in strict compliance with [[RFC3986]] is sufficient to pass validation (including JSON Schema's `format: "uri"` and `format: "uri-reference"`), but some `form-urlencoded` implementations still expect the slightly more restrictive [[RFC1738]] rules to be used.
4437+
In most cases, generating query strings in strict compliance with [[RFC3986]] is sufficient to pass validation (including JSON Schema's `format: "uri"` and `format: "uri-reference"` when `format` validation is enabled), but some `form-urlencoded` implementations still expect the slightly more restrictive [[RFC1738]] rules to be used.
44274438

44284439
Since all RFC1738-compliant URIs are compliant with RFC3986, applications needing to ensure historical interoperability SHOULD use RFC1738's rules.
44294440

@@ -4433,7 +4444,7 @@ WHATWG is a [web browser-oriented](https://whatwg.org/faq#what-is-the-whatwg-wor
44334444
WHATWG's percent-encoding rules for query strings are different depending on whether the query string is [being treated as `form-urlencoded`](https://url.spec.whatwg.org/#application-x-www-form-urlencoded-percent-encode-set) (where it requires more percent-encoding than [[RFC1738]]) or [as part of the generic syntax](https://url.spec.whatwg.org/#query-percent-encode-set), where it allows characters that [[RFC3986]] forbids.
44344445

44354446
Implementations needing maximum compatibility with web browsers SHOULD use WHATWG's `form-urlencoded` percent-encoding rules.
4436-
However, they SHOULD NOT rely on WHATWG's less stringent generic query string rules, as the resulting URLs would fail RFC3986 validation, including JSON Schema's `format: uri` and `format: uri-reference`.
4447+
However, they SHOULD NOT rely on WHATWG's less stringent generic query string rules, as the resulting URLs would fail RFC3986 validation, including JSON Schema's `format: uri` and `format: uri-reference` (when `format` validation is enabled).
44374448

44384449
### Decoding URIs and `form-urlencoded` Strings
44394450
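
The `+` rule and the over-encoding rule added in the first hunk can be illustrated with a minimal sketch. It is not part of the commit. It assumes Python's standard `urllib.parse` module as a stand-in for a generic URI decoder (`unquote`) and a `form-urlencoded` decoder (`unquote_plus`); the sample strings are made up for illustration.

```python
from urllib.parse import unquote, unquote_plus

# Hypothetical sample: one unencoded "+", one encoded "+" (%2B), one encoded space (%20).
raw = "a+b%2Bc%20d"

# Generic URI rules: an unencoded "+" is an ordinary character and is kept.
generic = unquote(raw)

# form-urlencoded rules: an unencoded "+" decodes to a space.
form = unquote_plus(raw)

# Over-encoding is safe to decode: %41 is "A" and %2D is "-", both unreserved,
# so the decoded value matches the plain form even though the encoded string
# is not normalized.
over_encoded = unquote("%41%2Db")

print(repr(generic))       # 'a+b+c d'
print(repr(form))          # 'a b+c d'
print(repr(over_encoded))  # 'A-b'
```

Only the unencoded `+` differs between the two decoders; `%2B` and `%20` decode the same way under both, which matches the commit's statement that this is the only case where the choice of decoding algorithm changes the outcome.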
