
Contents
What is Burp Spider?
Using Burp Spider
Run tab
Options tab
Scope
Settings
Burp proxy interface
Timing
Forms
Request headers
Comms tab
Results window
Tree tab
Tree legend
Table tab
Context menus
What is Burp Spider?
Burp Spider is a tool for mapping web applications. It uses
various intelligent techniques to generate a comprehensive inventory of an
application's content and functionality.
Burp Spider maps a target application by following hyperlinks found
within HTML and JavaScript, submitting forms, and using other clues such as
directory listings, source code comments and the robots.txt file. Results are
displayed in both tree and table format, providing a clear and highly detailed
overview of the target application.
Burp Spider enables you to obtain a detailed understanding of how a web
application works, avoiding the time-consuming and unreliable task of manually
following links, submitting forms and scouring HTML source code. Potentially
vulnerable application functions can be quickly identified, allowing you to
check for specific vulnerabilities such as SQL injection and directory
traversal.
Using Burp Spider
Launching Burp Spider against an application requires three simple steps:
- Provide one or more "seed URLs" (see run tab).
These are the starting points from which Burp Spider will begin.
- Configure the scope to include or exclude specific domains, directories,
URLs and filetypes (see options). By default, the
domains used in any seed URLs are included in the scope.
- Start Burp Spider.
An easy way to perform the first two steps is to select the "send to spider"
option within any of the other Burp tools. This will automatically add the
relevant seed URL and also include the relevant host and parent path within the
spider scope.
Numerous configuration options are also available to control the behaviour of
Burp Spider.
Run tab
This tab is used to set seed URLs, start and stop Burp Spider, and monitor
its progress.

Spider running - This is used to start and stop the spider. When Burp
Spider is stopped no requests will be made, although the spider will continue
to process any response data pending from prior requests.
seed URLs - These are the starting points from which Burp Spider will
begin its mapping. If all of a web site's content can be navigated to by following links
from the main page, then only a single seed URL is necessary. If the site
contains pockets of isolated content, then seed URLs from these pockets should
also be added. Seed items can be entered as standard URLs and also IP ranges,
with optional service and port number specified, for example:
- 10.1.1.1/24
- https://192.168.1-5.2:8443
- http://www.target.com/app/
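To illustrate how seed items in IP range form might be expanded into individual hosts, here is a minimal Python sketch. It is not Burp Spider's own parser: the function names are hypothetical, and reading "1-5" as an inclusive range within a single octet is an assumption based on the example above.

```python
import ipaddress
from itertools import product

def expand_cidr(seed):
    """Expand a CIDR-style seed such as '10.1.1.1/24' into host addresses."""
    # strict=False accepts a host address combined with a prefix length.
    network = ipaddress.ip_network(seed, strict=False)
    return [str(host) for host in network.hosts()]

def expand_octet_ranges(host):
    """Expand dashed octets, e.g. '192.168.1-5.2' (assumed syntax)."""
    choices = []
    for octet in host.split("."):
        if "-" in octet:
            low, high = (int(part) for part in octet.split("-"))
            choices.append([str(n) for n in range(low, high + 1)])
        else:
            choices.append([octet])
    # Combine every allowed value of every octet.
    return [".".join(parts) for parts in product(*choices)]
```

Under these assumptions, expand_cidr("10.1.1.1/24") yields the 254 usable hosts of the 10.1.1.0/24 network, and expand_octet_ranges("192.168.1-5.2") yields five addresses.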
automatically add seed URL domains to scope - If this is checked, then
the root directory of all seed domains will be added to the scope of the
Spider.
show results - This toggles display of the results
window. Because of the resources required for screen updates, Burp Spider will
run faster when the results window is not displayed.
Options tab
This tab contains various configuration options which control the behaviour
of Burp Spider, as described below. These settings can be modified after the
Spider has started running, and will be applied retrospectively to prior
results. For example, if the maximum link depth is increased, then links which
were previously outside the maximum depth will be rechecked, and requested if
appropriate.

Scope
max link depth - This is the maximum number of "hops" which
Burp Spider will navigate from any seed URL. A value of zero will cause Burp
Spider to request seed URLs only. If the box is unchecked, then in-scope links
will be followed indefinitely.
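The effect of the link depth limit can be pictured as a breadth-first traversal that stops following links once the limit is reached. The sketch below is illustrative only; the `links` dictionary is a hypothetical stand-in for the links discovered on each page, and `max_depth=None` models the unchecked box.

```python
from collections import deque

def crawl_order(seeds, links, max_depth=None):
    """Return (url, depth) pairs in breadth-first order, honouring max_depth.

    `links` is a hypothetical dict mapping each URL to the URLs it links to.
    """
    seen = set(seeds)
    queue = deque((url, 0) for url in seeds)
    visited = []
    while queue:
        url, depth = queue.popleft()
        visited.append((url, depth))
        if max_depth is not None and depth >= max_depth:
            continue  # at the limit: request the page but do not follow its links
        for target in links.get(url, []):
            if target not in seen:
                seen.add(target)
                queue.append((target, depth + 1))
    return visited
```

With a limit of zero only the seeds themselves are returned, matching the behaviour described above.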
include in / exclude from scope - These two lists control the scope of
Burp Spider. Items can be entered as IP ranges, with optional service and
port number specified, or as regular expressions to match part or all of a URL.
For example, adding "logout" to the "exclude from scope" list will prevent
Burp Spider from requesting any URL containing this expression anywhere within it.
include any URL not specifically excluded - This will cause Burp
Spider to request any discovered URLs which are not ruled out by the exclude
list. This option should be used with caution, as following off-site links may
result in an attempt to spider the entire Internet!
exclude filetypes - URLs with the specified filetypes will not be
requested. Any non-text filetypes used within the target application should be
added to this list.
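The scope rules above might be combined along the following lines. This is a simplified sketch, not Burp Spider's implementation: it handles only regular-expression items (not IP ranges), and the function name, parameter names and order of checks are assumptions.

```python
import re
from urllib.parse import urlsplit

def in_scope(url, include, exclude, excluded_filetypes=(), include_any=False):
    """Decide whether a URL should be requested (illustrative rules only)."""
    # An exclude expression matching anywhere in the URL rules it out.
    if any(re.search(pattern, url) for pattern in exclude):
        return False
    # Skip URLs whose path ends in an excluded file extension.
    path = urlsplit(url).path.lower()
    if any(path.endswith("." + ext.lower()) for ext in excluded_filetypes):
        return False
    # "include any URL not specifically excluded" bypasses the include list.
    if include_any:
        return True
    return any(re.search(pattern, url) for pattern in include)
```

For example, with "logout" in the exclude list, a URL such as http://www.target.com/app/logout.php is rejected even though its domain matches the include list.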

Settings
check robots.txt - If checked, Burp Spider will request and process
the robots.txt file from all in-scope domains. This file is used by the robots
exclusion protocol to control the behaviour of spider-like agents on the
Internet. Note that Burp Spider does not conform to the robots exclusion
protocol. Because Burp Spider is designed to comprehensively enumerate a target
application's content, all entries in robots.txt will be requested if they are
in-scope.
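As a rough sketch of treating robots.txt as a source of content rather than a set of restrictions, the following Python turns every Allow and Disallow entry into a candidate URL. The function name is hypothetical, wildcard entries are not interpreted, and the scope check described above would still need to be applied to the results.

```python
from urllib.parse import urljoin

def robots_paths(base_url, robots_txt):
    """Collect every Allow/Disallow path in robots.txt as a URL to request."""
    urls = []
    for line in robots_txt.splitlines():
        line = line.split("#", 1)[0].strip()  # drop trailing comments
        field, _, value = line.partition(":")
        if field.strip().lower() in ("allow", "disallow") and value.strip():
            urls.append(urljoin(base_url, value.strip()))
    return urls
```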
parse JavaScript for links - If checked, Burp Spider will attempt to
extract hyperlinks from within JavaScript code, in addition to HTML. Burp Spider
is effective at identifying absolute and relative URLs within JavaScript,
but it does not attempt to reassemble URLs which the script constructs
dynamically at runtime from smaller components.
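The distinction between literal URLs and dynamically assembled ones can be seen in a simple pattern-based extractor. The regular expression below is an assumption made for illustration, not the technique Burp Spider uses; it recognises only quoted strings beginning with a scheme, "/", "./" or "../".

```python
import re

# Quoted string literals that look like absolute URLs or paths (assumed heuristic).
URL_LITERAL = re.compile(r"""["']((?:https?://|/|\.\.?/)[^"'\s]+)["']""")

def links_in_script(script):
    """Return URL-like string literals found in a block of JavaScript."""
    return URL_LITERAL.findall(script)
```

Given a script containing "/app/" + page + ".php", such an extractor finds only the fragment "/app/", because the complete URL exists only at runtime.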
request root of all identified directories - If checked, Burp Spider
will request all identified directories within the target web site, in addition
to files within those directories. This option is useful if directory indexing
is available on the target site.
detect custom "not found" responses - The HTTP
protocol requires web servers to return a 404 response code if a requested
resource is not found. However, some web applications are configured to return
customised non-404 responses. If this is the case, then using this option can
prevent false positives in the mapping of site content. Burp Spider detects
custom "not found" responses by requesting several nonexistent
resources from each domain, and compiling a fingerprint with which to diagnose
"not found" responses to other requests.
use Referer header - If checked, Burp Spider will submit the relevant
Referer header when requesting any resource that was linked to from another
page.
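The custom "not found" detection described above can be sketched as follows. This is one possible fingerprinting approach under stated assumptions, not Burp Spider's actual algorithm: `fetch` is a hypothetical callable returning a status code and body text, and the fingerprint is simply the status code plus a hash of the body with the requested path removed.

```python
import hashlib
import uuid

def _signature(status, body, path):
    # Remove the echoed path so pages that quote the missing URL still match.
    normalised = body.replace(path, "")
    return status, hashlib.sha256(normalised.encode("utf-8")).hexdigest()

def build_fingerprint(fetch, probes=3):
    """Request random, presumably nonexistent paths and record their signatures.

    `fetch(path)` is a hypothetical callable returning (status_code, body_text).
    """
    signatures = set()
    for _ in range(probes):
        path = "/" + uuid.uuid4().hex
        status, body = fetch(path)
        signatures.add(_signature(status, body, path))
    return signatures

def looks_not_found(fingerprint, path, status, body):
    """Treat genuine 404s and fingerprint matches as 'not found'."""
    return status == 404 or _signature(status, body, path) in fingerprint
```

A real implementation would need a more tolerant comparison, since custom error pages often contain timestamps or other varying content.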
use cookies - If checked, Burp Spider will process Set-Cookie
instructions in server responses, and will submit any received cookies in
subsequent requests to the same domain. This option is normally necessary when
spidering web applications which use forms-based authentication.
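The basic cookie handling described here amounts to recording Set-Cookie values and replaying them in a Cookie header. The sketch below uses Python's standard http.cookies module; the function names are hypothetical, and attributes such as Path, Domain and expiry are deliberately ignored for brevity.

```python
from http.cookies import SimpleCookie

def update_jar(jar, set_cookie_headers):
    """Record name/value pairs from Set-Cookie header values into a dict."""
    for header in set_cookie_headers:
        parsed = SimpleCookie()
        parsed.load(header)
        for name, morsel in parsed.items():
            jar[name] = morsel.value
    return jar

def cookie_header(jar):
    """Build the Cookie header value to send with subsequent requests."""
    return "; ".join(f"{name}={value}" for name, value in jar.items())
```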
test session-dependence of pages - If checked, Burp Spider will check
whether each identified page responds differently if cookies are not used. This
may be useful in identifying genuine "application" pages, as opposed
to static content. This option is only available if Burp Spider is configured to
use cookies.
follow links in random order - If checked, Burp Spider will request
enumerated URLs in a random order, as opposed to the order in which they were
identified. This option is useful if stealth is required, or to spread the load
more evenly when multiple servers are included in the scope.
store responses in temp dir - If checked, Burp Spider will store
server responses on disk in a temporary directory, as opposed to volatile
memory. This option significantly reduces memory consumption when spidering very
large websites.

Burp proxy interface
These options control the interface
between Burp Proxy and Burp Spider, which allows "passive" spidering of
websites, controlled through the user's browser.
process requests made through burp proxy - If checked, Burp Spider
will process all HTTP requests made through Burp Proxy, to identify links and
forms on web pages visited. Using this option can enable Burp Spider to build up
a detailed picture of an application's contents even when you have only browsed
a subset of that content with your browser, because all content that is linked
from visited content is automatically added to the spider's map.
get cookies from requests / responses - These options control whether
Burp Spider will update the cookies it uses when making its own requests, from
the Cookie: header in web requests passing through Burp Proxy, and the
Set-Cookie: header in web responses passing through Burp Proxy.
link depth to associate with proxy requests - This option controls the
"link depth" which will be associated with web pages accessed through
Burp Proxy. To prevent Burp Spider following any links in these pages (even when the
Spider is running and these links are in-scope) set a higher value for this
option than the "maximum link depth" option above.

Timing
throttle between requests - This option enforces a time delay (in
milliseconds) between requests. This is useful to avoid overloading the target
server, or consuming excessive CPU resources on the client computer.
add random variances - This option will vary the throttle between 0.5
and 1.5 times the specified delay. Some intrusion detection systems may be
triggered by a large number of requests issued at highly regular intervals. This
option reduces the likelihood of Burp Spider's activity being diagnosed as
automated. If used in conjunction with a long request throttle and random
sequencing of requests, this allows highly stealthy spidering of a target
application.
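The throttle and random variance options can be expressed as a short calculation. In this sketch the function name is hypothetical, and drawing the scaling factor from a uniform distribution is an assumption; the text above states only the 0.5 to 1.5 range.

```python
import random

def next_delay_ms(throttle_ms, random_variance=False, rng=random):
    """Return the pause before the next request, in milliseconds."""
    if not random_variance:
        return float(throttle_ms)
    # Scale the configured delay by a factor between 0.5 and 1.5.
    return throttle_ms * rng.uniform(0.5, 1.5)
```

With a throttle of 1000 milliseconds and random variance enabled, each delay falls somewhere between 500 and 1500 milliseconds.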

Forms
parse HTML for forms - If checked, Burp Spider will extract details of
HTML forms from all pages requested.
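Extracting form details from HTML could look something like the following, using Python's standard html.parser module. The class name and the recorded fields are illustrative assumptions rather than a description of Burp Spider's internals.

```python
from html.parser import HTMLParser

class FormExtractor(HTMLParser):
    """Collect the action, method and named fields of each form on a page."""

    def __init__(self):
        super().__init__()
        self.forms = []
        self._current = None  # the form currently being parsed, if any

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag == "form":
            self._current = {
                "action": attrs.get("action") or "",
                "method": (attrs.get("method") or "get").upper(),
                "fields": [],
            }
            self.forms.append(self._current)
        elif tag in ("input", "select", "textarea"):
            # Only named fields inside a form are submitted with it.
            if self._current is not None and attrs.get("name"):
                self._current["fields"].append(attrs["name"])

    def handle_endtag(self, tag):
        if tag == "form":
            self._current = None
```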
do not submit - If selected, Burp Spider will not submit any forms
identified.
prompt for guidance - If selected, Burp Spider will prompt you for guidance
before submitting each identified form. This allows you to enter custom data
into form fields as required.
auto submit - If selected, Burp Spider will automatically submit any
forms identified (provided that the form's action URL is in scope). Text fields
will be populated with the specified value. If a form contains elements such as
select lists and checkboxes, Burp Spider will submit each form multiple times
with all permutations of available items, up to the maximum number of
submissions specified. Forms containing password fields will not be
automatically submitted, but will be presented to the user for guidance. Note:
The auto-submit option should be used with caution, as it may result in
undesirable actions (e.g. making bogus submissions to an application's user
registration page).
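The way auto submit multiplies submissions can be sketched with a capped Cartesian product. The function and parameter names below are hypothetical, and the order in which permutations are generated is an assumption.

```python
from itertools import islice, product

def form_submissions(text_fields, choice_fields, text_value, max_submissions):
    """Yield field dicts for each permutation of choice fields, up to a cap.

    `choice_fields` maps each select list or checkbox name to its options.
    """
    names = list(choice_fields)
    combos = product(*(choice_fields[name] for name in names))
    for combo in islice(combos, max_submissions):
        # Every text field receives the same configured value.
        data = {name: text_value for name in text_fields}
        data.update(zip(names, combo))
        yield data
```

A form with two select lists of ten options each has 100 permutations, which is why the cap on submissions matters.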
don't resubmit forms - This option configures the criteria for
individuating forms (action URL, method, fields, values). When Burp Spider
processes each form, it will check these criteria to determine if the form is
"new". Forms which are not new will not be submitted.
make simple GET requests to all dynamic pages - If checked, Burp
Spider will make a non-parameterised request to all in-scope form action URLs.
Dynamic pages usually respond differently if the GET or POST data expected is
not received, and this option may successfully detect additional site content.
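The "don't resubmit forms" check described earlier in this section can be modelled as building an identity key from whichever criteria are enabled, and comparing it against the keys already seen. This is a minimal sketch; the criteria names and the form dictionary layout are assumptions.

```python
def form_key(form, criteria):
    """Build a hashable identity for a form from the selected criteria."""
    parts = []
    if "action" in criteria:
        parts.append(form["action"])
    if "method" in criteria:
        parts.append(form["method"].upper())
    if "fields" in criteria:
        parts.append(tuple(sorted(form["fields"])))
    if "values" in criteria:
        parts.append(tuple(sorted(form["fields"].items())))
    return tuple(parts)

def is_new_form(form, criteria, seen):
    """Return True the first time a form identity is seen, False afterwards."""
    key = form_key(form, criteria)
    if key in seen:
        return False
    seen.add(key)
    return True
```

Enabling fewer criteria makes more forms look alike, so fewer are submitted; adding "values" treats the same form with different preset values as new.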

Request headers
This section allows customised HTTP headers to be configured which will be
used in all requests. This may be useful to meet specific requirements of
individual applications - e.g. to emulate an expected user agent.
Results window
The results window contains two views of Burp Spider's results, in tree and table format. To display the
Spider results window, use the "show results" button
in the run tab.
Tree tab
This tab displays all enumerated resources in a tree format representing the
directory structure of the target web site.
Usage tips:
- Click on any item to display details in the right hand pane.
- Hold the mouse over any item to display the page title, full URL and other
details.
- Double-click on any linked from / links to item, to navigate to that item
in the tree.
- Right-click on any item in the tree or link lists to display a context
menu with numerous options.
- Unrequested items and unsubmitted forms are displayed in grey.
- Non-200 HTTP codes are displayed in red.
- If Burp Spider is configured to detect custom "not found"
responses, these are marked with [??].

Tree legend
The tree view uses the following icons to represent individual results:
- Standard web page.
- HTML form.
- A "dynamic" page which processes client-submitted GET and POST parameters.
- A URL submitting a preset query string to a dynamic page.
- A form response submitting GET or POST data to a dynamic page.
- An item which could not be requested due to network or other problems.
- An image resource.
- A JavaScript file.
- An object resource, e.g. Java applet or Microsoft ActiveX control.
- A sound resource.
Table tab
This tab displays all requested resources in a table format. For each
requested resource, the table shows the server, URL, whether SSL was used,
whether the item is a dynamic page, whether the resource is session-dependent,
the HTTP status code, whether the resource was not found (this may apply to
non-404 responses if custom "not found" detection
is configured), the response size in bytes, whether the response set a cookie,
and the page title of HTML pages.
Usage tips:
- Click on any column heading to sort the table using that column
(shift-click to sort descending).
- Right-click on any item to display a context menu
with numerous options.

Context menus
In both the tree and table views, right-clicking on any item displays a popup
menu containing numerous options relevant to the selected item.

The following options are available on the popup menu, depending on the item
selected:
- expand / collapse branch
- find in table / tree
- add to / remove from scope
- new seed URL
- (re-)request item (this is not available if the item is already queued to
be requested)
- request all items / entire branch (applies to directories)
- request all links / all in-scope links
- copy URL to clipboard
- (re-)submit form
- send the request or form submission to other Burp tools