Introducing the URL validation bypass cheat sheet

URL validation bypasses are the root cause of numerous vulnerabilities, including many instances of SSRF, CORS misconfiguration, and open redirection. They work by using ambiguous URLs to trigger parsing discrepancies between the code that validates a URL and the code that later uses it. However, many of these techniques are poorly documented and overlooked as a result.
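To make the idea of a parsing discrepancy concrete, here is a minimal sketch in Python. The hostnames `allowed.example` and `attacker.example` and the helper names are hypothetical, chosen only for illustration. The sketch assumes a validator that does a simple prefix check, while the component that makes the request parses the URL properly.

```python
from urllib.parse import urlsplit

# Hypothetical allow-listed origin, used only for this illustration.
ALLOWED_PREFIX = "https://allowed.example"


def naive_is_allowed(url: str) -> bool:
    """Flawed validator: treats the URL as an opaque string."""
    return url.startswith(ALLOWED_PREFIX)


def actual_host(url: str):
    """Host that a standards-following client would connect to."""
    return urlsplit(url).hostname


# Everything before "@" is userinfo, so the real host is attacker.example.
payload = "https://allowed.example@attacker.example/"

print(naive_is_allowed(payload))  # True
print(actual_host(payload))       # attacker.example
```

The validator and the parser disagree about which part of the string is the host, and that disagreement is what this class of payload relies on.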

To address this, we wanted to create a cheat sheet that consolidates all known payloads, saving you the time and effort of searching and gathering information from across the Internet. Today, we're excited to introduce a new tool designed to solve this problem: the URL Validation Bypass Cheat Sheet.

We hope you find it useful! This is a frequently updated repository of all known techniques, allowing you to quickly generate a wordlist that meets your needs.

How to get started

The URL Validation Bypass Cheat Sheet is a brand new interactive web application that automatically adjusts its settings based on your context. Currently, there are three contexts available:

Initially, the cheat sheet provides six types of payload wordlists. The advanced settings allow you to select a specific wordlist or use all of them simultaneously. Here's a brief overview of the most important ones:

Encodings

The URL Validation Bypass Cheat Sheet supports several types of string encoding:

Note: Unencoded strings should be used with caution, as Unicode values may not be transmitted correctly.
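As an illustration of why encoding matters, the following sketch percent-encodes a non-ASCII character so that it survives transport as plain ASCII. It uses Python's standard `urllib.parse` module and is not taken from the cheat sheet itself. The choice of U+3002 (ideographic full stop) is an assumption made for the example.

```python
from urllib.parse import quote, unquote

# U+3002 IDEOGRAPHIC FULL STOP, picked here only as an example character.
raw = "\u3002"

# Percent-encode the UTF-8 bytes; safe="" leaves no reserved character unencoded.
encoded = quote(raw, safe="")

print(encoded)                  # %E3%80%82
print(unquote(encoded) == raw)  # True
```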

Advanced settings

IPv4 address representation

When working with web applications, encoding IP addresses into different formats can be crucial for testing, validation, and security purposes. The cheat sheet accepts a standard IPv4 address as the attacker IP input and returns an array of encoded representations, including octal, hexadecimal, binary, and decimal formats. It also converts the IPv4 address into its IPv4-mapped IPv6 address format.
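The conversions described above can be sketched with Python's standard `ipaddress` module. This is an approximation for illustration, not the cheat sheet's own implementation. The function name `ipv4_representations` and the dictionary keys are hypothetical, and the exact set of formats the cheat sheet emits may differ.

```python
import ipaddress


def ipv4_representations(address: str) -> dict:
    """Return several alternative spellings of one IPv4 address."""
    ip = ipaddress.IPv4Address(address)
    octets = ip.packed  # the four address bytes
    number = int(ip)    # the address as a single 32-bit integer
    return {
        "dotted_decimal": str(ip),
        "decimal": str(number),
        "hex": hex(number),
        "dotted_hex": ".".join(f"0x{o:02x}" for o in octets),
        # A leading zero marks an octet as octal in many parsers.
        "dotted_octal": ".".join(f"0{o:o}" for o in octets),
        "binary": format(number, "032b"),
        "ipv4_mapped_ipv6": f"::ffff:{ip}",
    }


for name, value in ipv4_representations("127.0.0.1").items():
    print(f"{name}: {value}")
```

Whether a given representation is accepted depends on the target's URL parser and resolver, so each form needs to be tested against the application in question.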

Encoding Details:

Normalization

The wordlists include numerous payloads that exploit Unicode string normalization. For instance, the normalization of the following characters results in an empty string:

These techniques can be used to bypass Web Application Firewalls (WAFs).
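A related and well-known normalization effect can be sketched in Python. The example uses fullwidth Latin letters, which NFKC normalization folds to ASCII. These are not the empty-string characters referred to above. The blocked hostname `internal.example` and both helper functions are hypothetical. The sketch assumes a filter that inspects the raw string while a later component normalizes it.

```python
import unicodedata

# Hypothetical hostname that a filter is trying to block.
BLOCKED_HOST = "internal.example"


def naive_filter_allows(host: str) -> bool:
    """Flawed filter: inspects the raw string before any normalization."""
    return BLOCKED_HOST not in host.lower()


def normalize_host(host: str) -> str:
    """What a later component might do: apply NFKC, then lowercase."""
    return unicodedata.normalize("NFKC", host).lower()


# Shift ASCII letters into the fullwidth block (U+FF41 is fullwidth 'a').
fullwidth = "".join(chr(ord(c) + 0xFEE0) for c in "internal")
payload = fullwidth + ".example"

print(naive_filter_allows(payload))  # True
print(normalize_host(payload))       # internal.example
```

The filter sees a string that does not contain the blocked name, while the normalized value is exactly that name. Normalizing before validating, and validating the same value that is later used, avoids the mismatch.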

Another example of an allowed domain bypass occurs when a validation regular expression permits multiline strings. For instance, if the regex ^allowed_domain$ is used, the following can bypass the validation:
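The multiline behaviour can be reproduced with Python's `re` module. This is a minimal sketch. The payload string and the variable names are assumptions made for the example, and other regex engines may differ in detail.

```python
import re

# In multiline mode, ^ and $ match at the start and end of every line.
MULTILINE_CHECK = re.compile(r"^allowed_domain$", re.MULTILINE)

# \A and \Z anchor to the start and end of the whole string instead.
STRICT_CHECK = re.compile(r"\Aallowed_domain\Z")

# Hypothetical input: an attacker-controlled line, a newline, then the allowed value.
payload = "attacker.example\nallowed_domain"

print(bool(MULTILINE_CHECK.search(payload)))  # True
print(bool(STRICT_CHECK.search(payload)))     # False
```

Anchoring with `\A` and `\Z`, or using `re.fullmatch`, forces the pattern to cover the entire input rather than any single line of it.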

Credits

This cheat sheet wouldn't be possible without the web security community who share their research. Big thanks to: Gareth Heyes, James Kettle, Jann Horn, Liv Matan, Takeshi Terada, Orange Tsai, Nicolas Grégoire.

We have published all payloads on GitHub at https://github.com/PortSwigger/url-cheatsheet-data, so you can contribute to this cheat sheet by creating a new issue, or by updating the JSON files and submitting a pull request.

We look forward to your interesting discoveries using our new URL validation bypass cheat sheet!
