Burp Sequencer help

Contents

What is Burp Sequencer?

How the randomness tests work
    Character-level analysis
    Bit-level analysis

Obtaining a sample of tokens
    Performing a live capture
    Performing a manual load

Analysis results

Analysis options

 

What is Burp Sequencer?

Burp Sequencer is a tool for analysing the degree of randomness manifested by a sample of data items. In the context of web applications, the sampled data will typically consist of session tokens, anti-XSRF nonces, or other items on whose unpredictability the application depends for its security.

Burp Sequencer can be run in two modes:

  • live capture - The sample of tokens is acquired in real-time from the application. The analysis of randomness can be performed and updated while the capture is in progress.
  • manual load - The sample of tokens has already been acquired and is loaded into the tool. The analysis of randomness is then performed.

Burp Sequencer contains various options which can be configured to control both the live capture of tokens and their subsequent analysis. Individual tests for randomness can be turned on or off according to your requirements. In some cases, it may be necessary to understand the nature of the tests performed, and any unusual characteristics of your sample, in order to use Burp Sequencer most effectively.

How the randomness tests work

Burp Sequencer employs standard statistical tests for randomness. These are based on the principle of testing a hypothesis against a sample of evidence, and calculating the probability of the observed data occurring, assuming that the hypothesis is true:

  • The hypothesis to be tested is: that the tokens are randomly generated.
  • Each test observes specific properties of the sample that are likely to have certain characteristics if the tokens are randomly generated.
  • The probability of the observed characteristics occurring is calculated, working on the assumption that the hypothesis is true.
  • If this probability falls below a certain level (the "significance level") then the hypothesis is rejected and the tokens are deemed to be non-random.
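The procedure above can be illustrated with a very small sketch. It assumes the simplest possible property (the count of ones in a sequence of bits) and an exact two-sided binomial calculation; the function names are hypothetical and this is not a description of how Burp Sequencer itself computes its probabilities.

```python
from math import comb


def two_sided_binomial_p(ones: int, n: int) -> float:
    """Probability, if each bit is an unbiased coin flip, of seeing a count
    of ones at least as far from n/2 as the count actually observed."""
    deviation = abs(ones - n / 2)
    extreme = sum(comb(n, k) for k in range(n + 1) if abs(k - n / 2) >= deviation)
    return extreme / 2 ** n


def is_rejected(p_value: float, significance_level: float) -> bool:
    """The hypothesis of randomness is rejected when the probability of the
    observed data falls below the chosen significance level."""
    return p_value < significance_level


# 9 ones out of 10 bits: rejected at a 5% level, but not at a 1% level.
p = two_sided_binomial_p(9, 10)
print(p, is_rejected(p, 0.05), is_rejected(p, 0.01))
```

The same observed data leads to a different verdict depending on the significance level chosen, which is why the choice of level matters.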

The significance level is a key parameter in this methodology. Using a lower significance level means that stronger evidence is required to reject the hypothesis that the tokens are randomly generated, and so increases the chance that non-random data will be treated as random. There is no universally "right" significance level to use for any particular purpose: scientific experiments often use significance levels in the region of 1% to 5%; the standard FIPS tests for randomness (which are implemented within Burp Sequencer) use significance levels in the region of 0.002% to 0.03%. Burp Sequencer lets you choose what significance level you wish to use to interpret its findings:

  • Each individual test reports the computed probability of the observed data occurring, assuming that the hypothesis is true. This probability represents the boundary significance level at which the hypothesis would be rejected, based solely upon this test.
  • Aggregated results from multiple tests are presented in terms of the number of bits of effective entropy within the token at various significance levels, ranging from 0.001% to 10%. This summary enables you to see how your choice of significance level affects the "quantity" of randomness deemed to exist within the sample. In typical cases, this summary demonstrates that the choice of significance level is a moot point because the tokens possess either a clearly satisfactory or clearly unsatisfactory amount of randomness for any of the significance levels that you may reasonably choose.

Some important caveats arise with any statistical-based test for randomness. The results may contain false negatives and positives for the following reasons:

  • Data that is generated in a completely deterministic way may be deemed to be random by statistical-based tests. For example, a well-designed linear congruential pseudo-random number generator, or an algorithm which computes the hash of a sequential number, may produce seemingly random output even though an attacker who knows the internal state of the generator can extrapolate its output with complete reliability in both forwards and reverse directions.
  • Data that is deemed to be non-random by statistical-based tests may not actually be predictable in a practical situation because the patterns that are discernible within the data do not sufficiently narrow down the range of possible future outputs to a range that can be viably tested.
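The first caveat can be demonstrated with a short sketch. It assumes a hypothetical token scheme in which each token is the SHA-256 hash of a sequential counter; the function names are illustrative only. The output looks statistically balanced, yet anyone who knows the scheme and the counter value can reproduce every token.

```python
import hashlib


def hashed_counter_tokens(count: int, start: int = 0) -> list[str]:
    """Hypothetical scheme: each token is the hex SHA-256 of a sequential number."""
    return [hashlib.sha256(str(i).encode()).hexdigest()
            for i in range(start, start + count)]


def fraction_of_ones(tokens: list[str], bit_index: int) -> float:
    """Proportion of tokens whose bit at bit_index (counting from the left) is 1."""
    ones = 0
    for token in tokens:
        nibble = int(token[bit_index // 4], 16)
        ones += (nibble >> (3 - bit_index % 4)) & 1
    return ones / len(tokens)


tokens = hashed_counter_tokens(2000)
# Close to 0.5, as a statistical test would expect of random data ...
print(fraction_of_ones(tokens, 0))
# ... yet the "next" token is fully determined by the counter.
print(hashed_counter_tokens(1, start=2000)[0])
```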

Because of these caveats, the results of using Burp Sequencer should be interpreted only as an indicative guide to the randomness of the sampled data.

The tests performed by Burp Sequencer divide into two levels of analysis: character-level and bit-level.

Character-level analysis

The character-level tests operate on each character position of the token in its raw form. First, the size of the character set at each position is counted - this is the number of different characters that appear at each position within the sample data. Then, the following tests are performed using this information:

  • Character count analysis. This test analyses the distribution of characters used at each position within the token. If the sample is randomly generated, the distribution of characters employed is likely to be approximately uniform. At each position, the test computes the probability of the observed distribution arising if the tokens are random.
  • Character transition analysis. This test analyses the transitions between successive tokens in the sample. If the sample is randomly generated, a character appearing at a given position is equally likely to be followed in the next token by any one of the characters that is used at that position. At each position, the test computes the probability of the observed transitions arising if the tokens are random.

Based on the above tests, the character-level analysis computes an overall score for each character position - this is the lowest probability calculated at that position by any of the character-level tests. The analysis then counts the number of bits of effective entropy for various significance levels. Based on the size of its character set, each position is assigned a number of bits of entropy (2 bits if there are 4 characters, 3 bits if there are 8 characters, etc.), and the total number of bits at or above each significance level is calculated.
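As a rough illustration of the character-level bookkeeping, the following sketch counts the character set at each position, assigns bits of entropy from the set size, and computes a chi-square statistic for the character counts. The function names are hypothetical, the tokens are assumed to be of equal length, and the conversion of the statistic into a probability is omitted; it is not a description of Burp Sequencer's internal calculations.

```python
from collections import Counter
from math import log2


def charset_sizes(tokens: list[str]) -> list[int]:
    """Number of distinct characters seen at each position of the sample."""
    return [len({t[i] for t in tokens}) for i in range(len(tokens[0]))]


def bits_per_position(tokens: list[str]) -> list[float]:
    """Bits of entropy assigned to each position from its character set size
    (4 characters -> 2 bits, 8 characters -> 3 bits, and so on)."""
    return [log2(size) for size in charset_sizes(tokens)]


def char_count_chi_square(tokens: list[str], position: int) -> float:
    """Chi-square statistic comparing the observed character counts at one
    position with a uniform distribution over the characters seen there."""
    counts = Counter(t[position] for t in tokens)
    expected = len(tokens) / len(counts)
    return sum((c - expected) ** 2 / expected for c in counts.values())


sample = ["a0x", "b1x", "c0x", "d1x"]
print(charset_sizes(sample))      # [4, 2, 1]
print(bits_per_position(sample))  # [2.0, 1.0, 0.0]
```

A larger chi-square value indicates a less uniform distribution of characters at that position.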

Bit-level analysis

The bit-level tests are more powerful than the character-level tests. To enable bit-level analysis, each token is converted into a set of bits, with the total number of bits determined by the size of the character set at each character position. If any positions employ a character set whose size is not an exact power of two, some information within the sample will be lost in the conversion to a bit sequence. This loss is typically very small and does not materially affect the accuracy of the bit-level results.
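The exact conversion used by Burp Sequencer is not described here. The following sketch shows one assumed scheme, purely to illustrate why a character set whose size is not a power of two loses information: each position is given floor(log2(size)) bits, and character indices that do not fit are folded onto smaller ones. All names are hypothetical.

```python
def build_alphabets(tokens: list[str]) -> list[list[str]]:
    """Sorted list of the characters observed at each position."""
    return [sorted({t[i] for t in tokens}) for i in range(len(tokens[0]))]


def token_to_bits(token: str, alphabets: list[list[str]]) -> list[int]:
    """Assumed scheme: encode each character's index in floor(log2(size)) bits."""
    bits = []
    for ch, alphabet in zip(token, alphabets):
        width = len(alphabet).bit_length() - 1       # floor(log2(size))
        index = alphabet.index(ch) % (1 << width)    # lossy if size is not 2**width
        bits.extend((index >> shift) & 1 for shift in reversed(range(width)))
    return bits


sample = ["a0", "b1", "c0", "d1"]
alphabets = build_alphabets(sample)
print(token_to_bits("c0", alphabets))  # [1, 0, 0]
```

With five characters at a position, this scheme still allocates only two bits, so two of the characters map to the same bit pattern.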

When each token has been converted into a sequence of bits, the following tests are performed at each bit position:

  • FIPS monobit test. This test analyses the distribution of ones and zeroes at each bit position. If the sample is randomly generated, the number of ones and zeroes is likely to be approximately equal. At each position, the test computes the probability of the observed distribution arising if the tokens are random. For each of the FIPS tests carried out, in addition to reporting the probability of the observed data occurring, Burp Sequencer also records whether each bit passed or failed the FIPS test. Note that the FIPS pass criteria are recalibrated within Burp Sequencer to work with arbitrary sample sizes, whereas the formal specification for the FIPS tests assumes a sample of precisely 20,000 tokens. Hence, if you wish to obtain results that are strictly compliant with the FIPS specification, you should ensure that you use a sample of 20,000 tokens.
  • FIPS poker test. This test divides the bit sequence at each position into consecutive, non-overlapping groups of four, and derives a four-bit number from each group. It then counts the number of occurrences of each of the 16 possible numbers, and performs a chi-square calculation to evaluate this distribution. If the sample is randomly generated, the distribution of four-bit numbers is likely to be approximately uniform. At each position, the test computes the probability of the observed distribution arising if the tokens are random.
  • FIPS runs test. This test divides the bit sequence at each position into runs of consecutive bits which have the same value. It then counts the number of runs with a length of 1, 2, 3, 4, 5, and 6 and above. If the sample is randomly generated, the number of runs with each of these lengths is likely to be within a range determined by the size of the sample set. At each position, the test computes the probability of the observed runs occurring if the tokens are random.
  • FIPS long runs test. This test measures the longest run of bits with the same value at each bit position. If the sample is randomly generated, the longest run is likely to be within a range determined by the size of the sample set. At each position, the test computes the probability of the observed longest run arising if the tokens are random. Note that the FIPS specification for this test only records a fail if the longest run of bits is overly long. However, an overly short longest run of bits also indicates that the sample is not random. Therefore some bits may record a significance level that is below the FIPS pass level even though they do not strictly fail the FIPS test.
  • Spectral test. This test performs a sophisticated analysis of the bit sequence at each position, and is capable of identifying evidence of non-randomness in some samples which pass the other statistical tests. The test works through the bit sequence and treats each series of consecutive numbers as coordinates in a multi-dimensional space. It plots a point in this space at each location determined by these coordinates. If the sample is randomly generated, the distribution of points within this space is likely to be approximately uniform; the appearance of clusters within the space indicates that the data is likely to be non-random. At each position, the test computes the probability of the observed distribution occurring if the tokens are random. The test is repeated for multiple sizes of number (between 1 and 8 bits) and for multiple numbers of dimensions (between 2 and 6).
  • Correlation test. Each of the other bit-level tests operates on individual bit positions within the sampled tokens, and so the amount of randomness at each bit position is calculated in isolation. Performing only this type of test would prevent any meaningful assessment of the amount of randomness in the token as a whole: a sample of longer tokens in which every bit position of a token carries the same value may appear to contain more entropy than a sample of shorter tokens containing different values at each position. Hence, it is necessary to test for any statistically significant relationships between the values at different bit positions within the tokens. If the sample is randomly generated, a value at a given bit position is equally likely to be accompanied by a one or a zero at any other bit position. At each position, this test computes the probability of the relationships observed with bits at other positions arising if the tokens are random. To prevent arbitrary results, when a degree of correlation is observed between two bits, the test adjusts the significance level of the bit whose significance level is lower based on all of the other bit-level tests.
  • Compression test. This test does not use the statistical approach employed by the other tests, but rather provides a simple intuitive indication of the amount of entropy at each bit position. The test attempts to compress the bit sequence at each position using standard ZLIB compression. The results indicate the proportional reduction in the size of the bit sequence when it was compressed. A higher degree of compression indicates that the data is less likely to be randomly generated.
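Three of the tests above can be sketched in a few lines. This is a minimal illustration under stated assumptions, not Burp Sequencer's implementation: the monobit probability uses a normal approximation, the poker statistic generalises the FIPS chi-square formula to an arbitrary number of four-bit groups without converting it to a probability, and the function names are hypothetical. Each function takes the sequence of bits observed at one bit position across the sample.

```python
import zlib
from math import erfc, sqrt


def monobit_p_value(bits: list[int]) -> float:
    """Two-sided probability of the observed imbalance between ones and
    zeroes, using a normal approximation to the binomial distribution."""
    n = len(bits)
    imbalance = abs(2 * sum(bits) - n)
    return erfc(imbalance / sqrt(2 * n))


def poker_statistic(bits: list[int]) -> float:
    """Chi-square statistic over the counts of the 16 possible four-bit
    numbers formed from consecutive, non-overlapping groups."""
    groups = len(bits) // 4
    counts = [0] * 16
    for g in range(groups):
        a, b, c, d = bits[4 * g:4 * g + 4]
        counts[8 * a + 4 * b + 2 * c + d] += 1
    return 16 / groups * sum(c * c for c in counts) - groups


def compression_ratio(bits: list[int]) -> float:
    """Proportional size reduction when the packed bits are ZLIB-compressed."""
    packed = bytes(
        int("".join(map(str, bits[i:i + 8])).ljust(8, "0"), 2)
        for i in range(0, len(bits), 8)
    )
    return 1 - len(zlib.compress(packed)) / len(packed)


alternating = [0, 1] * 100
print(monobit_p_value(alternating))   # perfectly balanced, so 1.0
print(poker_statistic(alternating))   # very large: only one nibble value occurs
print(compression_ratio([0] * 8000))  # close to 1: highly compressible
```

The alternating sequence shows why several tests are needed: it passes the monobit test perfectly while failing the poker test badly. For very short sequences the compression ratio can be negative, because the ZLIB header outweighs any saving.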

Based on the above tests, the bit-level analysis computes an overall score for each bit position - this is the lowest probability calculated at that position by any of the bit-level tests. The analysis then counts the number of bits of effective entropy for various significance levels.
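The aggregation described above can be sketched as follows, assuming each test has already produced one probability per bit position. The function names and the input layout (a dictionary mapping a test name to its list of probabilities) are hypothetical.

```python
def overall_scores(per_test_p_values: dict[str, list[float]]) -> list[float]:
    """Overall score per bit position: the lowest probability reported
    at that position by any of the tests."""
    return [min(column) for column in zip(*per_test_p_values.values())]


def effective_entropy_bits(scores: list[float],
                           significance_levels: list[float]) -> dict[float, int]:
    """Number of bit positions whose overall score is at or above each level."""
    return {level: sum(1 for s in scores if s >= level)
            for level in significance_levels}


results = {
    "monobit": [0.5, 0.02, 0.9],
    "poker": [0.3, 0.4, 0.0005],
}
scores = overall_scores(results)
print(scores)                                               # [0.3, 0.02, 0.0005]
print(effective_entropy_bits(scores, [0.0001, 0.01, 0.1]))  # {0.0001: 3, 0.01: 2, 0.1: 1}
```

As the significance level rises from 0.01% to 10%, fewer bits qualify, which is the pattern shown in the summary of effective entropy.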

Obtaining a sample of tokens

Before it is possible to analyse the randomness of the tokens generated by an application, it is necessary to obtain a suitable sample of tokens. This can be done in two ways: by performing a live capture of tokens directly from the target, or by loading a sample of tokens that you have already acquired.

Performing a live capture

To perform a live capture, you need to locate a request within the target application which returns somewhere in its response the session token or other item that you want to analyse. You can do this using the "send to sequencer" option within any of the other Burp tools.

Now switch to the "live capture" tab of Burp Sequencer. The tool maintains a list of all the requests that have been sent to it. If the request you are interested in is not already selected, click on it in the list of requests. The response to the selected request is displayed within the "token location" panel.

The next step is to identify the location of the token you are interested in. If the token appears as the value of a Set-Cookie directive, or the value of a form field, you can select the relevant item from one of the drop-down lists. Alternatively, you can select an arbitrary position within the response where the token appears. If you do this, Burp Sequencer automatically identifies a unique prefix and delimiter which encapsulates the portion of the response you have selected. In most cases, the values automatically identified will work correctly. In some situations, you may wish to tweak these by specifying your own unique prefix or offset, or your own delimiter or token length.
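If you collect tokens with your own script, the same idea of a unique prefix followed by either a delimiter or a fixed token length can be sketched as below. The function name and parameters are hypothetical and do not correspond to any Burp API.

```python
from typing import Optional


def extract_token(response: str, prefix: str,
                  delimiter: Optional[str] = None,
                  length: Optional[int] = None) -> Optional[str]:
    """Return the text following `prefix`, ended either by `delimiter` or by a
    fixed `length`. Returns None if the token cannot be located."""
    if delimiter is None and length is None:
        raise ValueError("specify a delimiter or a token length")
    start = response.find(prefix)
    if start == -1:
        return None
    start += len(prefix)
    if length is not None:
        token = response[start:start + length]
        return token if len(token) == length else None
    end = response.find(delimiter, start)
    return response[start:end] if end != -1 else None


response = "HTTP/1.1 200 OK\r\nSet-Cookie: sessid=abc123; path=/\r\n\r\n"
print(extract_token(response, "sessid=", delimiter=";"))  # abc123
```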

When you have identified the location of the token within the application's response, you can configure various options affecting the live capture by switching to the "capture options" tab. Here you can control the speed of token acquisition, by specifying a number of request threads and a time throttle to pause between requests. In general, you should try to obtain samples as quickly as possible given the speed of your network connection and the target application, to minimise the "loss" of tokens issued to other application users.

You can also instruct Burp Sequencer to ignore tokens whose length deviates by a given threshold from the average token length. This can be useful if the application occasionally returns an anomalous response containing a different item in the location where the token normally appears.
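A comparable filter for a sample gathered by your own script might look like the following sketch. It assumes the threshold is an absolute number of characters; how Burp Sequencer expresses its threshold is not stated here, and the function name is hypothetical.

```python
def drop_anomalous(tokens: list[str], threshold: float) -> list[str]:
    """Keep only tokens whose length is within `threshold` characters
    of the average token length in the sample."""
    average = sum(len(t) for t in tokens) / len(tokens)
    return [t for t in tokens if abs(len(t) - average) <= threshold]


# Nine normal 16-character tokens and one anomalous 116-character item.
sample = ["a" * 16] * 9 + ["x" * 116]
print(len(drop_anomalous(sample, threshold=20)))  # 9
```

Note that a single very long item shifts the average, so the threshold needs to be generous enough to keep the genuine tokens.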

When you have configured any required live capture options, click the "start capture" button to begin the live capture. Burp Sequencer will repeatedly issue your request and extract the relevant token from the application's responses.

You can use the "pause" and "stop" buttons to control the progress of the live capture. You can use the "copy" and "save" buttons to retrieve the current sample of tokens, for use in any other tool.

As soon as 100 tokens have been captured, you can perform an analysis of the tokens, to get an initial rough indication of the quality of their randomness. Click the "analyse now" button to do this. If you check the "auto analyse" box, Burp Sequencer will automatically perform an analysis and update the results periodically during the live capture.

A larger sample size enables a more reliable analysis. A sample of 5,000 tokens is sufficient to perform a reliable analysis for most purposes. The live capture continues until 20,000 tokens have been captured, which is sufficient to perform FIPS-compliant statistical tests.

Performing a manual load

To perform a manual load, you first need to obtain your own sample of tokens from the target application through some means, such as your own script or the output from an earlier live capture. The tokens need to be in a simple newline-delimited text format.
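If you produce the sample with your own script, a quick check of the newline-delimited format before loading can be sketched as follows. The function names are hypothetical.

```python
def parse_tokens(text: str) -> list[str]:
    """Split newline-delimited text into tokens, ignoring blank lines
    and surrounding whitespace."""
    return [line.strip() for line in text.splitlines() if line.strip()]


def summarise(tokens: list[str]) -> dict[str, int]:
    """Basic size details for sense-checking a sample."""
    lengths = [len(t) for t in tokens]
    return {"count": len(tokens), "shortest": min(lengths), "longest": max(lengths)}


tokens = parse_tokens("abc\r\ndef\n\nghij\n")
print(tokens)             # ['abc', 'def', 'ghij']
print(summarise(tokens))  # {'count': 3, 'shortest': 3, 'longest': 4}
```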

Go to the "manual load" tab of Burp Sequencer and use the "load" or "paste" button to load your tokens into the tool. The loaded tokens, together with details of their size, are displayed for you to sense-check that the sample has loaded correctly.

To perform the analysis of the loaded tokens, click the "analyse now" button.

Analysis results

The results window contains full details of all of the tests performed. The summary tab is the first place to look to get an overall conclusion about the degree of randomness in the sample. It includes a chart showing the number of bits of effective entropy at or above each significance level. This provides an intuitive verdict on the number of bits that pass the randomness tests for different possible significance levels. In the example shown, a large number of bits pass the tests even at the strictest significance level of 10%:

Within the "character-level" and "bit-level" tabs, you can drill down into the detail of each type of test, to gain a deeper understanding of the properties of the sample, to identify the causes of any anomalies, and to assess the possibilities for token prediction. Within each group of tests, there is a summary tab showing the overall score achieved by each position within the token, and also a tab for each individual test, reporting the results of that test and the details of any anomalies identified. For example, the following shows the results of the FIPS monobit test:

Within the bit-level analysis, there is also a tab showing how the character-level data was converted into a sequence of bits to enable the bit-level tests. This will enable you to cross-reference individual bits within the token back to the original character positions, if you need to.

Analysis options

Burp Sequencer lets you configure which individual tests are performed, and how the raw token data should be interpreted, in the "options" tab. If the tokens produced by the application have variable length, these will need to be padded to enable the statistical tests to be performed. You can choose whether the padding should be applied at the start or the end of each token, and you can specify the character that will be used for padding. In most situations, padding the start of tokens with the '0' character is the most appropriate option, but you should examine the tokens produced by the application to determine whether a different setting is more effective. You can also tell Burp Sequencer to Base64-decode the raw tokens before analysing them, if that is necessary.
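The effect of the padding and Base64 options can be sketched as follows, for example when preparing a sample in your own script. The function names are hypothetical and the sketch pads to the length of the longest token in the sample.

```python
import base64


def pad_tokens(tokens: list[str], pad_char: str = "0",
               at_start: bool = True) -> list[str]:
    """Pad every token to the length of the longest one, at the start or end."""
    width = max(len(t) for t in tokens)
    if at_start:
        return [t.rjust(width, pad_char) for t in tokens]
    return [t.ljust(width, pad_char) for t in tokens]


def decode_tokens(tokens: list[str]) -> list[bytes]:
    """Base64-decode each raw token before analysis."""
    return [base64.b64decode(t) for t in tokens]


print(pad_tokens(["7", "42", "123"]))                  # ['007', '042', '123']
print(pad_tokens(["7", "42", "123"], at_start=False))  # ['700', '420', '123']
print(decode_tokens(["aGVsbG8="]))                     # [b'hello']
```

Padding numeric tokens at the start keeps the least significant digits aligned across the sample, which is why it is usually the more appropriate choice for incrementing values.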

The analysis results window also has an "options" tab which shows the options that were used to generate the current analysis. You can modify these within the results window and then click the "redo analysis" button to re-perform the analysis with your new settings. For example, this enables you to tweak the analysis options mid-way through a live capture, to reflect your better understanding of the tokens' characteristics, or to isolate the effects of any unusual characteristics manifested by your sample.

 

 

Copyright (c) 2010 PortSwigger Ltd. All rights reserved.