3. Known Vulnerabilities
The Payment Card Industry Data Security
Standard (PCI DSS), through requirement 3.4, does offer one-way
encryption as a valid option in storing the primary account number,
which is considered sensitive, in a database. The caveat is that the
one-way encryption must use a strong algorithm. Among the algorithm
options that are available to one-way encryption in SQL Server, PCI DSS
defines the SHA1 algorithm as being considered "... an example of an industry-tested and accepted hashing algorithm.", which is an acknowledgement that SHA1 meets this criteria.
The following sections review a few of the most common known vulnerabilities, when using one-way encryption:
Dictionary Attack Vulnerability
A dictionary attack is one in which a list of values
are hashed and then compared to the hash values stored in the target
data table. This method is often used in an attempt to reveal passwords
that are protected using one-way encryption.
By way of an example, consider an attempted dictionary attack is on the Borrower_Identification table of our HomeLending database, which we've protected using one-way encryption. Within the Identification_Value column are the hash values of Social Security Numbers that are generated through one-way encryption.
The attack, depicted in Figure 2, is executed as follows:
The attacker has created an "Attack
Dictionary" of hash values that are based upon a sequence of plain text
Social Security Numbers, ranging from "555-86-0622" through
"555-86-0626".
Each of the hash values in the attack dictionary is compared to the hash values stored in the Borrower_Identification table.
A match is identified in the Borrower_Identification table with the attack hash value of 0xC36F02D9AC32B2E3813EFF9B 6C23D99D6038FD9A revealing that the plain text value of "555-860625" is a valid Social Security Number within the database.
With this knowledge, the attacker gains access to associated information such as the borrower's name, address and birth date.
A dictionary attack takes advantage of the inherent
nature of one-way encryption by performing the same action that is used
when a user searches one-way encrypted data, but on a larger scale.
In our example, the attacker knows he is looking for
Social Security Numbers which, in their plain text form, have a
standard pattern. It is also known to the attacker that Social Security
Numbers are commonly stored without the dash ("-") character.
Therefore, the attacker has a finite set of base values that will
likely return some matches.
If the DBA added a series of characters to the value
of the Social Security Number, before it was encrypted, the resulting
hash value would be different than the hash value resulting from
encrypting the real Social Security Number, and would increase the
number of possible character combinations required to return a positive
match.
Rainbow Table Attack Vulnerability
Database Administrators are not the only people
interested in efficiency. Those who are interested in attacking a
database to reveal sensitive data that is protected through one-way
encryption are also interested in the efficiency of their efforts. In
order to initiate a dictionary attack on a database containing millions
of records, the attacker would require a large attack dictionary to
cover the possible combinations of plain text and hash values. This
would result in a long running attack that requires a lot of resources
from the database server, therefore increasing the risk of the attack
being detected.
Therefore, the rainbow table attack was developed. The key player in this game is the rainbow table.
The rainbow table consists of a series of rows holding two columns of
data. The first column contains the plain text values that are being
sought, for example a Social Security Number. The second column
contains a value that is the ending hash of a reduction chain. A reduction chain is the result of taking the plain text value in the first column of our rainbow table and creating an initial hash;
then, a portion of the initial hash, such as its first six digits, is
obtained and another hash value is generated. This process continues
for a number of iterations until an ending hash is derived.
The ending hash that is stored in the rainbow table
represents an array of hash values that can be programmatically derived
and iterated in an attack, through the reversal of the reduction chain
building process. This approach provides a very efficient means of
storing the seed values that are used to mount an attack on one-way
encrypted data.
Let's consider an example of how this type of an
attack can affect the sensitive data that is protected with one-way
encryption. As before, we'll assume that a rainbow table attack is in
progress on the Borrower_Identification table of our HomeLending database.
The attacker has created a rainbow table, with a
reduction chain represented by each record's last link, based upon a
sequence of plain text Social Security Numbers ranging from
"555-86-0622" through "555-86-0626", as shown in Table 1.
Table 1. The rainbow table.
Plain Text | Ending Hash in the Reduction Chain |
---|
555-86-0622 | 0x9AEE648230046F795612E1B04171F65CA164E4E1 |
555-86-0623 | 0x6CCCDD87AC087D7BDD92641EEF44635B6F4B7943 |
555-86-0624 | 0x8432814F76099E8EEE0BAA67F39CC44D5F254405 |
555-86-0625 | 0x277847607C4D9984C636628418FDFAEBA26B8B34 |
555-86-0626 | 0xE86571ACB27E193820DCEA9556F7556771F50D26 |
Each of the final reduction chain link hash values in the rainbow table is compared to the hash values stored in the Borrower_Identification
table. This step is basically identical to a dictionary attack. In our
specific example, the result of this stage does not indicate a
successful match for any of our ending hashes.
In the next stage of the attack, we revisit the
process that created our ending hash: the chain reduction process. In
our example, the reduction chain was generated based on the first six
digits of each hash. In the execution of this attack we reverse the
reduction chain by taking the first six digits of the hashed value that
is the subject of our attack. Each subsequent link in the reversed
reduction chain is compared to the ending hash that is stored in the
rainbow table. In our case, a successful match occurs for the plain
text value of "555-86-0625".
Much like a dictionary attack, a rainbow table
attack can be reasonably mitigated through the use of a salt on your
plain text, prior to applying the one-way encryption. These attacks
rely on the perpetrator having anticipated a series of plain text
values, hashing these values and then comparing the resulting hashes to
the values stored within the database. The use of a salt increases the
complexity of the plain text and reduces the likelihood that the
anticipated value is among the plain text values sought by the attacker.
Hash Collision Vulnerability
A hash collision occurs when two unique plain text
values produce an identical hash value. An example would be
both"555-86-1234" and "555-86-5298" returning the identical hash value.
Since a value secured using one-way encryption is not decrypted, and
its underlying plain text is revealed through the comparison of hash
values, a hash collision presents a situation in which the actual plain
text value cannot be determined.
The algorithm selected for the encryption process is
critical in reducing the likelihood of hash collisions. Algorithms that
produce lengthy hashes increase the array of possible values, and so
reduce the probability of a hash collision.
Of course, the larger the volume of records to which
these algorithms are applied, the higher is the risk of a hash
collision. A mathematic problem called "The Birthday Paradox"
is commonly referenced as a formula that can be used to determine the
probability of hash collisions. While not specific to determining the
probability of hash collisions, the Birthday Paradox formula can be
modified to provide this information.
For those who are not mathematics or statistics majors, let's boil this issue down to its basics.
The possible unique combination of values for a
single bit is 2 since a bit is either a 1 or a 0. The possible unique
combination of values for a single byte, which is eight bits, would be
256, represented as 28. The algorithm options that are
provided with one-way encryption return either a 128 bit or a 160 bit
hash value. The possible unique combination of a 128 bit hash would be
340,282,36 6,920,938,460,000,000,000,000,000,000,000, represented as 2128.
The possible unique combination of a 160 bit hash would be
1,461,501,637,330,902,900,000,000,000,000,000,000,000,000,000,000,
represented as 2160.
In order for the possibilities of a hash collision to occur in the Identification_Value column of the Borrower_Identification
table, using a 128 bit algorithm, to reach a meager 0.1% it would
require a volume of 830,000,000,000,000,000 records; each containing a
unique plain text value.
There are other factors that come into play that
have influence on the actual possibilities of a hash collision, such as
the internal processing that takes place within the algorithm.
Regardless, the vulnerability for the occurrence of a hash collision is
real and should be carefully considered.
With the selection of the hashing
algorithm, inclusion of a salt prior to encryption, and by avoiding use
of one-way encryption in tables that have an extremely high volume of
rows, the potential vulnerabilities of the technique can be mitigated,
and it becomes a worthy option to consider when protecting sensitive
data.