You may have seen this before: the frontend sends a query string that looks correct, but the backend receives corrupted text. Or a URL breaks as soon as it includes spaces, +, or non-Latin characters. In many cases, this is not a transport problem. It is a URL encoding problem caused by encoding the wrong part at the wrong time.
1. What Is URL Encoding?
A URL can safely carry only a limited character set. When you need spaces, Unicode text, emoji, or special symbols such as & and =, those characters must be converted using percent-encoding: % followed by two hexadecimal digits.
For example, a space is encoded as %20 and & as %26. If a value is not encoded correctly, a parser may treat characters inside it as structural separators and split or truncate your parameters.
2. Which Characters Should Be Encoded?
A practical way to reason about this is by character categories:
- Unreserved (typically safe as-is): A-Z, a-z, 0-9, -, _, ., ~
- Reserved (structural meaning): :, /, ?, #, [, ], @, &, = (RFC 3986 also reserves !, $, ', (, ), *, +, the comma, and ;)
- Non-ASCII (Unicode text): must be UTF-8 encoded first, then percent-encoded
The goal is not "encode everything". The goal is "encode the right characters in the right URL component".
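To make the "UTF-8 first, then percent-encode" rule concrete, here is a minimal sketch. The helper name `percentEncode` is hypothetical and not a built-in; it escapes every byte outside the unreserved set, which is stricter than `encodeURIComponent` (the built-in leaves a few extra characters such as `!` and `*` untouched).

```javascript
// Hypothetical helper for illustration: percent-encode everything
// outside the unreserved set (A-Z, a-z, 0-9, -, _, ., ~).
const UNRESERVED = /[A-Za-z0-9\-_.~]/;

function percentEncode(text) {
  // Step 1: convert the string to UTF-8 bytes.
  const bytes = new TextEncoder().encode(text);
  let out = '';
  for (const byte of bytes) {
    const ch = String.fromCharCode(byte);
    if (byte < 0x80 && UNRESERVED.test(ch)) {
      // Unreserved ASCII passes through unchanged.
      out += ch;
    } else {
      // Step 2: everything else becomes % plus two uppercase hex digits.
      out += '%' + byte.toString(16).toUpperCase().padStart(2, '0');
    }
  }
  return out;
}

console.log(percentEncode('a b'));    // a%20b
console.log(percentEncode('a&b=c'));  // a%26b%3Dc
console.log(percentEncode('\u00e9')); // %C3%A9 (é is two bytes in UTF-8)
```

In application code you would normally rely on `encodeURIComponent` rather than a hand-rolled helper; the sketch only shows what happens underneath.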
3. Why Is Space Sometimes %20 and Sometimes +?
This is a classic source of confusion. In regular percent-encoding, spaces are represented as %20. In application/x-www-form-urlencoded rules (commonly used by HTML forms), spaces are represented as +.
If one system interprets + as a literal plus sign and another treats it as a space, data mismatches are inevitable. Agree on one encoding contract across services.
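The difference is easy to reproduce with built-in APIs. The sketch below assumes a runtime where `URLSearchParams` is available globally (modern browsers and Node.js); the sample value `'a b+c'` is made up for illustration.

```javascript
const value = 'a b+c';

// Percent-encoding: space becomes %20, a literal plus becomes %2B.
console.log(encodeURIComponent(value)); // a%20b%2Bc

// Form encoding (application/x-www-form-urlencoded):
// space becomes +, a literal plus still becomes %2B.
console.log(new URLSearchParams({ q: value }).toString()); // q=a+b%2Bc

// The two decoders disagree about what a bare + means.
console.log(decodeURIComponent('a+b'));             // a+b  (plus stays a plus)
console.log(new URLSearchParams('q=a+b').get('q')); // a b  (plus becomes a space)
```

If a sender uses one convention and the receiver decodes with the other, a literal + silently turns into a space or the reverse, which is why a value like "C++" is a useful test case.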
4. JavaScript: encodeURI vs encodeURIComponent
Both functions encode, but they serve different purposes:
- encodeURI: use for an entire URL. It preserves structural characters like :, /, and ?.
- encodeURIComponent: use for a single parameter value. It encodes characters such as & and = to avoid query-string corruption.
```javascript
const keyword = 'C++ guide & examples';
const url = '/search?q=' + encodeURIComponent(keyword);
// /search?q=C%2B%2B%20guide%20%26%20examples
```
If you encode a parameter value with encodeURI, characters such as &, =, +, and # pass through unescaped, leak into the query structure, and break parsing.
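A quick way to see the failure is to round-trip the same keyword through both functions and parse the result the way a server would. This is a sketch; `URLSearchParams` stands in for whatever query parser your backend actually uses.

```javascript
const keyword = 'C++ guide & examples';

// Wrong tool: encodeURI leaves + and & untouched.
const wrong = '/search?q=' + encodeURI(keyword);
console.log(wrong); // /search?q=C++%20guide%20&%20examples

// Right tool: encodeURIComponent escapes them.
const right = '/search?q=' + encodeURIComponent(keyword);

// Parse the query part of each URL as a receiver would.
const parsedWrong = new URLSearchParams(wrong.split('?')[1]);
const parsedRight = new URLSearchParams(right.split('?')[1]);

// The & split the value into two parameters, so q is truncated.
console.log(parsedWrong.get('q') === keyword); // false
console.log(parsedRight.get('q') === keyword); // true
```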
5. Common Failure: Double Encoding
Double encoding happens when data is encoded more than once across layers (frontend, SDK, gateway, backend). A literal % becomes %25, and values look corrupted.
- Original: hello world
- Encoded once: hello%20world
- Encoded twice: hello%2520world
Document ownership clearly: who encodes, who decodes, and at what boundary.
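The progression above can be reproduced in a few lines. The sketch assumes two layers that each call `encodeURIComponent` and a receiver that decodes only once, which is the typical mismatch.

```javascript
const original = 'hello world';

// Layer 1 (for example, the frontend) encodes the value.
const once = encodeURIComponent(original);
console.log(once); // hello%20world

// Layer 2 (for example, an SDK or gateway) encodes it again: % becomes %25.
const twice = encodeURIComponent(once);
console.log(twice); // hello%2520world

// A receiver that decodes once gets the encoded form, not the original.
console.log(decodeURIComponent(twice)); // hello%20world

// Only a matching number of decodes recovers the value.
console.log(decodeURIComponent(decodeURIComponent(twice))); // hello world
```

Decoding "until it stops changing" is not a safe fix, because a value that legitimately contains text like %20 would be altered. Fixing the layer that encodes a second time is usually the better option.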
6. Treat Path, Query, and Fragment Separately
Different URL components follow different rules. A common mistake is treating the full URL as one homogeneous string.
| Component | Recommended Approach | Common Mistake |
|---|---|---|
| Path | Encode per segment while preserving route structure | Encoding the full path and breaking slashes |
| Query | Encode keys and values independently | Leaving & or = unencoded in values |
| Fragment | Handle according to frontend/router behavior | Mixing fragment and server query rules |
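One way to apply the table is to encode each component separately and only then join them. The sketch below uses made-up inputs (`segments`, `params`, `fragment`) and encodes the fragment with `encodeURIComponent`, which is an assumption; your router may expect something different.

```javascript
// Hypothetical inputs for illustration.
const segments = ['files', 'reports/2024', 'Q1 summary.pdf'];
const params = { q: 'C++ & more', lang: 'zh-TW' };
const fragment = 'section 2';

// Path: encode each segment, then join with literal slashes.
// A slash inside a segment becomes %2F instead of creating a new level.
const path = '/' + segments.map((s) => encodeURIComponent(s)).join('/');

// Query: encode every key and every value independently.
const query = Object.entries(params)
  .map(([k, v]) => `${encodeURIComponent(k)}=${encodeURIComponent(v)}`)
  .join('&');

// Fragment: encoded on its own, never mixed with query rules.
const hash = encodeURIComponent(fragment);

const url = `${path}?${query}#${hash}`;
console.log(url);
// /files/reports%2F2024/Q1%20summary.pdf?q=C%2B%2B%20%26%20more&lang=zh-TW#section%202
```

The structural characters (/, ?, &, =, #) are added only by the joining code, so nothing inside a value can be mistaken for structure.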
7. Security Note: Encoding Is Not Sanitization
URL encoding protects transport representation, not application security. It does not replace input validation, output escaping, or parameterized database access.
- Validate decoded data at the receiver.
- Escape for the target output context (HTML, SQL, shell, etc.).
- Do not inject decoded content directly into the DOM.
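The three rules above can be sketched as a decode, validate, escape sequence. The `escapeHtml` helper and the allow-list pattern are hypothetical examples, not a complete defense; real projects usually rely on their framework's escaping or a maintained library.

```javascript
// Hypothetical helper: escape text for an HTML text or attribute context.
function escapeHtml(text) {
  return text
    .replace(/&/g, '&amp;')
    .replace(/</g, '&lt;')
    .replace(/>/g, '&gt;')
    .replace(/"/g, '&quot;')
    .replace(/'/g, '&#39;');
}

// A perfectly valid URL-encoded value can still carry markup.
const raw = '%3Cimg%20src%3Dx%20onerror%3Dalert(1)%3E';
const decoded = decodeURIComponent(raw);
console.log(decoded); // <img src=x onerror=alert(1)>

// 1. Validate the decoded value (example allow-list: word chars, space, dot, dash).
const isValid = /^[\w .-]{1,50}$/.test(decoded);
console.log(isValid); // false

// 2. Escape for the output context before rendering it anywhere.
console.log(escapeHtml(decoded)); // &lt;img src=x onerror=alert(1)&gt;
```

In the browser, assigning the value to `textContent` instead of `innerHTML` avoids interpreting it as markup in the first place.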
Conclusion
Reliable URL encoding is less about memorizing character tables and more about applying consistent rules at clear boundaries. Once your team aligns on component-level encoding and ownership, many "mysterious" API bugs disappear.