Using Burp Spider
Burp Spider is a tool for automatically crawling web applications. You can use it in conjunction with manual mapping techniques to speed up the process of mapping an application's content and functionality.
Manual Preparation
Before performing any automated spidering, it is generally preferable to carry out some manual preparatory work:
- Set up your browser and Burp Proxy to work together, and then manually map the target application. This process will populate the Target site map with a fairly complete record of the application's visible content.
- Using the Target site map, define the target scope for your work, and if desired configure a site map display filter to show only in-scope content.
- Manually review and request any unrequested items in the site map.
Note: Although this manual process is more time-consuming than proceeding directly to automated crawling, it is generally safer and more effective.
Configuring Spider Settings
Burp Spider uses various techniques to crawl application content. By default, it will follow all in-scope links, submit forms with dummy data, and make additional requests (for robots.txt, directory roots, etc.). In some situations, running an automated spider in this way can have unintended consequences, such as registering new user accounts, generating feedback emails, or changing other application state. Use any automated tool with caution and, where possible, only against non-production systems. You should also closely review the Spider settings before use, and ensure that they are suitable for your application and your requirements. In particular, you should review the following details:
- General crawler settings.
- Settings for form submission, including the dummy inputs used in automated submissions.
- Settings for application login and general session handling (if applicable).
- The defined spidering scope, which determines the links that the Spider will follow.
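To illustrate why the dummy form inputs deserve a review, here is a minimal Python sketch of rule-based form filling. It is not Burp's implementation: the rule table, the DEFAULT_VALUE fallback, and the function names are all hypothetical, chosen only to show how field names might be matched to dummy values.

```python
import re

# Hypothetical rules: (pattern matched against the field name, dummy value).
# These are assumptions for illustration, not Burp's actual defaults.
DUMMY_RULES = [
    (re.compile(r"mail", re.IGNORECASE), "tester@example.com"),
    (re.compile(r"phone|tel", re.IGNORECASE), "555-0100"),
    (re.compile(r"zip|postcode", re.IGNORECASE), "00000"),
]

# Fallback used when no rule matches the field name.
DEFAULT_VALUE = "test"


def dummy_value(field_name):
    """Return the value of the first rule whose pattern matches the field name."""
    for pattern, value in DUMMY_RULES:
        if pattern.search(field_name):
            return value
    return DEFAULT_VALUE


def fill_form(field_names):
    """Build the parameters an automated submission would send for a form."""
    return {name: dummy_value(name) for name in field_names}


filled = fill_form(["Email", "phone_number", "comment"])
```

Any value chosen this way is submitted to the application for real, so a field that triggers an email or creates a record will do so with the dummy data.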
Note: When running, the Spider will follow links for any URLs that are within the currently defined scope. For example, if you define a whole domain as being in scope, and then initiate spidering from a single branch in the site map, the Spider may still request items that are outside that branch, but within the wider scope. To ensure that the Spider only requests items within a specific branch, you should first configure the spidering scope to include only this branch.
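The difference between a domain-wide scope and a branch-only scope can be sketched in a few lines of Python. This is a simplified prefix check written for illustration; it does not reproduce Burp's own scope rules, and the in_scope function and the example.com URLs are assumptions.

```python
from urllib.parse import urlsplit


def in_scope(url, scope_prefixes):
    """Return True if the URL shares scheme and host with a prefix and sits under its path."""
    target = urlsplit(url)
    target_path = target.path or "/"
    for prefix in scope_prefixes:
        rule = urlsplit(prefix)
        rule_path = rule.path or "/"
        same_origin = (target.scheme, target.netloc) == (rule.scheme, rule.netloc)
        if same_origin and target_path.startswith(rule_path):
            return True
    return False


# Hypothetical scopes: the whole domain versus a single branch.
domain_scope = ["https://example.com/"]
branch_scope = ["https://example.com/shop/"]

outside_branch = "https://example.com/admin/users"
followed_with_domain_scope = in_scope(outside_branch, domain_scope)
followed_with_branch_scope = in_scope(outside_branch, branch_scope)
```

With the domain-wide scope, a link to /admin/users found while crawling /shop/ would still be followed; with the branch-only scope it would not.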
Initiating the Spider
If you have already performed manual application mapping, and configured a suitable spidering scope, then you can begin spidering by pressing the "Spider is running / paused" toggle button on the Control tab.
Alternatively, you can select a branch of the target site map, or a request anywhere within Burp, and initiate spidering via the context menu. If you do this for a branch or item that is not currently in the spidering scope, Burp will prompt you for confirmation. If you confirm, Burp will expand the current scope to include the specified item and any sub-items within the site map.
When spidering a selected branch of the site map, Burp will carry out the following actions (depending on your settings):
- Request any unrequested URLs already present within the branch.
- Submit any discovered forms whose action URLs lie within the branch.
- Re-request any items in the branch that previously returned 304 status codes, to retrieve fresh (uncached) copies of the application's responses.
- Parse all content retrieved to identify new URLs and forms.
- Recursively repeat these steps as new content is discovered.
- Continue spidering all in-scope areas until no new content is discovered.
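The request, parse, and repeat cycle described in the steps above can be sketched as a breadth-first crawl. This is a minimal illustration under stated assumptions, not Burp's implementation: the fetch callable, the FAKE_SITE dictionary, and the prefix-based scope check are hypothetical, and discovered forms are only recorded, not submitted.

```python
from collections import deque
from html.parser import HTMLParser
from urllib.parse import urljoin


class LinkParser(HTMLParser):
    """Collect href values from anchors and action values from forms."""

    def __init__(self):
        super().__init__()
        self.links = []
        self.forms = []

    def handle_starttag(self, tag, attrs):
        attributes = dict(attrs)
        if tag == "a" and attributes.get("href"):
            self.links.append(attributes["href"])
        elif tag == "form" and attributes.get("action"):
            self.forms.append(attributes["action"])


def crawl(seed_urls, fetch, in_scope):
    """Request each queued URL, parse it, and queue new in-scope URLs until none remain."""
    queue = deque(seed_urls)
    seen = set(seed_urls)
    form_actions = set()
    while queue:
        url = queue.popleft()
        body = fetch(url)  # fetch returns the HTML text, or None on failure
        if body is None:
            continue
        parser = LinkParser()
        parser.feed(body)
        for action in parser.forms:
            form_actions.add(urljoin(url, action))
        for href in parser.links:
            absolute = urljoin(url, href)
            if absolute not in seen and in_scope(absolute):
                seen.add(absolute)
                queue.append(absolute)
    return seen, form_actions


# Hypothetical in-memory site, used in place of real HTTP requests.
FAKE_SITE = {
    "https://example.com/app/": '<a href="items">Items</a> <a href="/other/">Other</a>',
    "https://example.com/app/items": '<a href="detail?id=1">One</a><form action="search"></form>',
    "https://example.com/app/detail?id=1": "<p>No links here</p>",
}

urls, forms = crawl(
    ["https://example.com/app/"],
    fetch=FAKE_SITE.get,
    in_scope=lambda u: u.startswith("https://example.com/app/"),
)
```

The loop ends when the queue is empty, which corresponds to the point at which no new in-scope content is being discovered. The link to /other/ is parsed but never requested because it falls outside the scope check.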
While the Spider is running, you can monitor its status on the Control tab. Any newly discovered content will be added to the Target site map.
Note: When spidering, or performing other content discovery tasks, you can easily monitor the site map to identify items that have been newly added. To do this, select the entire application within the site map tree, and sort the table view on the "Time requested" column (click the column header to cycle through ascending sort, descending sort, and unsorted). This will order the table entries according to the time they were requested, allowing you to quickly identify new items as they appear.
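The same ordering can be reproduced outside the tool if you work with exported records. The following Python sketch assumes a hypothetical list of site map entries with a time_requested field; the record layout is an assumption for illustration, not a Burp export format.

```python
from datetime import datetime

# Hypothetical site map records; the field names are assumptions for this sketch.
site_map = [
    {"url": "https://example.com/", "time_requested": datetime(2024, 1, 1, 10, 0, 0)},
    {"url": "https://example.com/login", "time_requested": datetime(2024, 1, 1, 10, 5, 0)},
    {"url": "https://example.com/help", "time_requested": datetime(2024, 1, 1, 10, 2, 0)},
]


def newest_first(entries):
    """Sort entries by request time, most recent first (a descending sort)."""
    return sorted(entries, key=lambda entry: entry["time_requested"], reverse=True)


ordered_urls = [entry["url"] for entry in newest_first(site_map)]
```

Sorting in descending order keeps the most recently requested items at the top, which is where newly discovered content appears.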