

Crawl options

  • Last updated: November 25, 2022

  • Read time: 8 minutes

Numerous options are available to configure the behavior of Burp Scanner during crawl-based scans. These can be configured on the fly when launching a scan, or maintained in Burp's configuration library.


Burp Suite's configuration library comes with a set of built-in configurations so you can quickly configure these options.

Crawl optimization

These settings control the behavior of the crawl logic to reflect the objectives of the crawl and the nature of the application.

The maximum link depth represents the maximum number of navigational transitions (clicking links and submitting forms) that the crawler will make from the start URL(s). Modern applications tend to build a mass of navigation into every response, in locations like menus and the page footer. For this reason, it is normally possible to reach the vast majority of an application's content and functionality within a small number of hops from the start URL. Fully covering multi-stage processes (like viewing an item, adding it to a shopping cart, and checking out) will require more hops.

Some applications contain extremely long navigational sequences that don't lead to interestingly different functionality. For example, a shopping application might have a huge number of product categories, sub-categories, and view filters. To a crawler, this can appear as a very deep nested tree of links, all returning different content. In this situation, there are clearly diminishing returns to crawling deeply into the navigational structure, so it is sensible to limit the maximum link depth to a small number, such as 8.
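The effect of a maximum link depth can be illustrated with a minimal breadth-first crawl sketch. This is not Burp Scanner's implementation; `get_links` is a hypothetical callback standing in for the crawler's real link discovery, and the depth counter models "navigational transitions from the start URL" as described above.

```python
from collections import deque

def crawl(start_url, get_links, max_link_depth=8):
    """Breadth-first crawl that stops following links from pages that
    are already `max_link_depth` navigational transitions away from
    the start URL. `get_links(url)` is a hypothetical callback that
    fetches a page and returns the URLs it links to."""
    visited = {start_url}
    queue = deque([(start_url, 0)])   # (url, link depth)
    while queue:
        url, depth = queue.popleft()
        if depth >= max_link_depth:
            continue  # page is at the depth limit: record it, but don't expand it
        for link in get_links(url):
            if link not in visited:
                visited.add(link)
                queue.append((link, depth + 1))
    return visited
```

In a deeply nested category tree, lowering `max_link_depth` prunes the long tail of filter and sub-category pages while still reaching everything within a few hops of the start URL.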

Crawl strategy

Real-world applications differ hugely in the way they organize content and navigation, the volatility of their responses, and the extent and complexity of application state. At one extreme, an application might employ a unique and stable URL for each distinct function, return deterministic content in each response, and contain no server-side state. At the other extreme, an application might employ ephemeral URLs that change each time a function is accessed, overloaded URLs that reach different functions through different navigational paths, volatile content that changes non-deterministically, and heavily stateful functions where user actions cause changes in the content and behavior that is subsequently observed.

Burp's crawler can handle both of these extremes. Where required, it can handle ephemeral and overloaded URLs, volatile content, and changes in application state. However, fully handling these cases imposes a material overhead on the amount of work involved in the crawl. You can use the crawl strategy setting to tune the approach taken for specific applications. In practice, this setting represents a trade-off between the speed of the crawl and the completeness of coverage. The default strategy strikes a balance between speed and coverage that is appropriate for typical applications. Select a strategy optimized more for speed when crawling an application with stable, unique URLs and no stateful functionality, or one optimized more for completeness when crawling an application with more volatile or overloaded URLs, or more complex stateful functionality.

"Fastest" crawl strategy

Note that Burp Scanner's "Fastest" crawl strategy differs from other available strategies in some important ways:

  • The crawler does not try to reset and reproduce the target application's state, and requests pages directly instead of navigating a path from the root directory.
  • Unlike other strategies, the crawler uses cookies from the cookie jar as initial values. If you want to perform an authenticated crawl, you should authenticate with the application using Burp's browser before crawling. If you don't want your crawl to be authenticated, log out of the application before crawling.
  • The crawler will not attempt to register a new user.
  • Authentication is attempted when potential login forms are first discovered, rather than in a separate phase. If credentials are supplied, only the first set will be used.

Crawl using my provided logins only

When this option is enabled, Burp Scanner skips the unauthenticated crawl phase if you have provided one or more application logins for it to use. It will use only your provided logins and will not attempt to self-register users or trigger login failures. This can help to reduce the overall crawl time.

If you don't provide any application logins, the crawler automatically reverts to performing an unauthenticated crawl instead.

Crawl limits

Crawling modern applications is sometimes an open-ended exercise, due to the amount of stateful functionality, volatile content, and unbounded navigation. Burp's crawler uses various techniques to maximize discovery of unique content early in the crawl. The settings for crawl limits let you impose a limit on the extent of the crawl, as it reaches the point of diminishing returns. It is generally sensible to configure a limit to the extent of the crawl, based on your knowledge of the application being scanned.

You can choose to limit the crawl based on:

  • The time taken.
  • The number of unique locations discovered. A location represents a distinct unit of content or functionality, based on the chosen crawl strategy.
  • The number of HTTP requests that are made.

Login functions

These settings control how the crawler will interact with any login functionality that is encountered during the crawl. You can configure whether the crawler should:

  • Attempt to automatically register a new user on the target website - This removes the need to manually set up a user account before the crawl. Note that you can still provide valid application logins in the scan launcher settings if you want.
  • Deliberately trigger login failures using invalid usernames - This can be useful for reaching account recovery features that are only accessed when invalid credentials are submitted. Note that the crawler will not deliberately submit an invalid password for any of the usernames that you provided as application logins. This is to avoid triggering any account locking features on these accounts.


These settings are not compatible with recorded login sequences and will be ignored when a scan uses recorded logins.

How does the crawler identify login and registration forms?

The crawler uses the following checklist to identify login and registration forms on the target site:

  • The form is a standard HTML form.
  • It contains an input field with the attribute type=password.
  • The password field has a non-empty name attribute.

If all of these criteria are met, the crawler then distinguishes registration forms from login forms by applying the following rules in order. For example, if two forms have an equal number of password fields, it will then compare the number of text fields, and so on.

The registration form is whichever form has the most:

  1. Password fields
  2. Text fields
  3. Multi-value select fields
  4. Single-value select fields

If all of the above are equal, whichever form was found first is assumed to be the registration form.
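The ordered comparison above can be sketched in a few lines of Python. This is an illustrative model, not Burp Scanner's actual code: each form is summarized as a hypothetical dict of field-type counts, and Python's `max` conveniently returns the first maximal element, which matches the "found first wins" tiebreak.

```python
def pick_registration_form(forms):
    """Given candidate forms as dicts of field-type counts (e.g.
    {"password": 2, "text": 3, "multi_select": 0, "single_select": 1}),
    return the one the heuristic treats as the registration form.
    Field types are compared in priority order; on a complete tie,
    the earlier-discovered form wins."""
    priority = ("password", "text", "multi_select", "single_select")
    # Tuples compare element by element, giving the ordered tiebreak;
    # max() keeps the first element when several compare equal.
    return max(forms, key=lambda f: tuple(f.get(t, 0) for t in priority))
```

For example, a form with two password fields (password plus confirmation) outranks a plain login form with one, so it is classified as the registration form.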

Why is the crawler not filling my login forms?

The crawler identifies login and registration forms based on the password field. However, it will only be able to enter a username or email address if the related fields:

  • Have the attribute type=email or type=text
  • Have a non-empty name attribute

If either of these conditions is not met, the crawler will successfully identify the form but will be unable to enter the corresponding data correctly.
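The two conditions above amount to a simple attribute check. The sketch below is a hypothetical model, not Burp's implementation: it takes an input element's attributes as an already-parsed dict (a stand-in for real HTML parsing) and reports whether the crawler's criteria for a fillable username field are met.

```python
def can_fill_username(attrs):
    """Check an <input> element's attributes (as a dict) against the
    two criteria: type is "email" or "text", and the name attribute
    is present and non-empty."""
    return (attrs.get("type", "").lower() in ("email", "text")
            and bool(attrs.get("name")))
```

A field like `<input type="hidden" name="user">` or one with an empty `name` would fail this check, which is why the crawler can recognize such a form without being able to fill it.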

Handling application errors during crawl

These settings control how Burp Scanner handles application errors (connection failures and transmission timeouts) that arise during the crawl phase of the scan.

You can configure the following options:

  • The number of consecutive timed out requests, or the overall percentage of timed out requests, before pausing the task.
  • The number of follow-up passes that are performed on completion of the crawl, to retry requests that timed out.

You can leave any setting blank to disable it.

Crawl project option overrides

These settings let you override the following settings for the crawl task you are configuring:

  • Timeouts - values are in seconds. If an option is set to zero, then Burp will never time out that function. Note that if an option is left blank here, Burp will use the value defined in Burp's global settings.

Miscellaneous crawl settings

These settings let you customize some details of the crawl:

  • Submit forms - This setting controls whether the crawler submits any forms it encounters during the crawl.
  • Customize User-Agent - This setting lets you specify a custom User-Agent header.
  • Request robots file - This setting controls whether to fetch the robots.txt file and extract links from it.
  • Request site map - This setting controls whether to fetch the sitemap.xml file and extract links from it. You can configure the maximum number of items to extract.
  • Follow hidden links in comments and JavaScript - This setting controls whether to parse HTML comments and JavaScript for URLs that are not visible within the page navigation. You can configure the maximum number of items to extract.
  • Parse API definitions - This setting controls whether the crawler attempts to parse any API definitions it encounters to identify potential endpoints to scan. For more information, please refer to the API scanning documentation.
  • Application uses fragments for routing - Single-page applications (SPAs) often use URL fragments for client-side routing. This enables them to display what appear to be several distinct pages without the browser making additional requests to the server. Burp Scanner needs to know whether the target application uses fragments in this way in order to crawl it effectively. By default, if a fragment contains any of the following characters, the crawler assumes that it is used for client-side routing: / \ ? = &. However, you can use this setting to control this manually if you prefer.
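The default fragment heuristic described above can be expressed as a short check. This is an illustrative sketch of the stated rule, not Burp Scanner's source: a fragment is treated as client-side routing if it contains any of the characters `/ \ ? = &`.

```python
from urllib.parse import urlsplit

# Characters whose presence in a fragment suggests client-side routing.
ROUTING_CHARS = set("/\\?=&")

def fragment_looks_like_route(url):
    """Return True if the URL's fragment contains any of the
    characters the default heuristic associates with SPA routing."""
    fragment = urlsplit(url).fragment
    return any(ch in ROUTING_CHARS for ch in fragment)
```

So `https://example.com/#/account/settings` is treated as a distinct client-side route, while `https://example.com/page#section-2` is treated as an ordinary in-page anchor.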

Burp's browser options

These settings let you control the behavior of Burp's browser:

  • Use Burp's browser for Crawl and Audit - This setting controls whether Burp Scanner uses Burp's browser to navigate the target site. We often refer to this as browser-powered scanning.

    By default, Burp Scanner uses Burp's browser if your machine appears to meet the required spec, but you can force it to always use the browser if you want. We recommend a machine with at least 2 CPU cores and 8 GB RAM. Alternatively, you can choose to disable browser-powered scanning completely. In this case, Burp Scanner will use the legacy crawling engine.

  • Fetch required resources and data from out-of-scope hosts - This setting controls whether the crawler will issue requests to out-of-scope hosts where necessary. As websites often require the browser to load externally hosted subresources or fetch data from an API in order to function correctly, allowing these requests helps maximize the coverage of your scans. Note that out-of-scope requests made by the crawler will not be audited.

  • Read timeout for site resources - This setting determines how long, in milliseconds, Burp Scanner waits when attempting to load subresources during the crawl.

  • Show the crawl in a headed browser - By default, Burp Scanner uses a headless browser for crawling. If you enable this setting, when the scan starts, a new browser window will open in which you can watch the crawler navigating around the target website in real time. This can be useful for troubleshooting.


When watching the crawl in a headed browser, you may see Burp Scanner open multiple windows and stop using existing ones. This is expected behavior and does not indicate any issue with the scan. Any redundant windows are closed automatically after a certain period of time.
