Welcome to AppFail
You last visited: never

Welcome to AppFail

Posted on 2009-06-01

Security Post

Hashing is a mathematical function that takes any string, and turns it into a relatively small number of a fixed size. This number is often displayed as a hexadecimal string to make it easier to display. In effect, a hash divides an infinite number of strings of infinite length, into a finite domain of a fixed length.

Examples ("Hashing is fun"):
MD5 decimal: 61562249911893293772428344986678853632
MD5 hexadecimal = 2e507543164679c12c8ba41b28e0686a
SHA1 decimal: 732053671603385471841390244631656351572288339968
SHA1 hexadecimal = 803a6c04abd60201db756b27ddca07ca8b32ebdc
SHA256 = d4461e1f33db9190c72806ea55a693497b564cef03adbe9e1d59519b77eaff49

Why is hashing good

Hashing provides a number of features that are useful to us:

  • It is deterministic, the same input always generates the same output
  • It works on a string of any length, whether it be your password, or a 20GB hard drive image, the hashing algorithm will always output its fixed length string
  • It is a one way function, once you have the hash, you cannot get the original string back. If the passwords were stored with ordinary encryption, there would be one "master password", that would be able to decrypt all of the users' passwords, and this password would need to be stored on the server, in such a way that the server could decrypt a users password to verify that an attempted login was valid. This would make it very easy to steal

Why we use hashes

What makes hashing so useful, is that the output for the same input is always exactly the same, but if the input changes even slightly, the output is changed entirely.

MD5 ("Hashing is fun") = 2e507543164679c12c8ba41b28e0686a
MD5 ("hashing is fun") = 855b004cfb430625cab791206079fc33
MD5 ("Hashing is fun.") = eca27a4cb2eec11f55129e1172bed0d0
MD5 ("Hashing is fum") = 9343baea3f3645418029e96bd8fa7317

Identifying the Algorithms

There are a number of popular password hashing algorithms, to make supporting many of these at once easier, a standard was adopted, this modular system allows new password hashing algorithms to easily be added to existing systems, without completely breaking backwards compatibility. The beginning of a password tells the system which algorithm, and if applicable, which settings for that algorithm are to be used for computing the hash. These hashes also include what is known as a "salt", this is added to the password to provide further randomness, extend the key space and to prevent two users with the same password, from having the same hash.

What they look like

Password hashing strings look slightly different than just a regular MD5 string, because of the inclusion of some additional information. Here are a bunch of hashes for the password "password":

		MD5 with 2 different salts (randomly generated):
		Blowfish with random salts, 6 and 7 rounds respectively:

The first field (separated by $) specifies the hashing algorithm, 1 being MD5 and 2a being blowfish. In md5 the next field is the salt, and then the password hash. With blowfish it is the number of rounds (logarithmic), then the salt is a fixed length, then the rest is the password hash. The underlined sections are the salt.

A third algorithm

HMAC-SHA1 with 24680 as the hinted # of rounds:

With the HMAC-SHA1 algorithm, the first part is the identifier (sha1), then that is followed by the number of rounds used in the hashing, in passwd.conf you specify a "hint" that can be anywhere from 0 to 2^32, but this exact number is not used, but rather just a close approximation, this means each password is encrypted with a different number of rounds. The next field is the salt, followed by the actual password hash.

How this comes in to play

Now we must apply this concept to how we actually manage authentication with these algorithms. As we know, hashing is a one-way algorithm, and cannot be reversed, so this means once the password is hashed and stored in the master.passwd file, the operating system has no way of getting the original password back. So that leads to the question, how does the operating system know if you've given it the correct password?

Verifying a password hash

When you attempt to login, you send the operating system your username, and your password (in plain text). The operating system looks up your user record in master.passwd and sees that your hash is: $1$uBTV6Z.r$pzZDWejVK5uCfsa.5Bu1E/ This means you have an MD5 hashed password, and the salt for that password is "uBTV6Z.r" So the operating system takes the password you have given and hashes that, using the salt from the hash in the password database. If your password is correct, it will generate the same MD5 hash string, if the password is incorrect, it will generate a string like this:
and because this does not match, your login attempt will fail

Brute Force Cracking

Using a brute force cracking tool like "John the Ripper", we will take the password file from the system, and attempt to find the plaintext password, so that we can login to the system.
To do this, John generates random strings (statistically modified to be more like what people will use for their passwords, rather than guessing aaaa, aaab, aaac, etc, it guesses all of those combinations, but it guesses strings with the letter S and E and so on first).
What John must do to crack a password, is take the string it wants to guess (from a wordlist, or its generator), and hash that using the salt specified in the password file, and keep guessing until it finds a hash that matches.

Combating Brute Force

This is where the configuration options for the algorithms come in, with blowfish for example, the default number of "rounds" is 7, but if you increase this to 8, generating each hash (guess) will take 10 times longer. Of course, at some point this becomes to expensive for day to day operations (if it takes 60 seconds to generate each hash, then that means every users has to wait that long every time you want to login). It is also hard on the system load, especially for a multi-user system. However, this configurability allows the system to scale as computers get faster and faster, and as more and more servers start to incorporate cryptography offloading cards.

Rainbow Tables

Rainbow tables are another approach to cracking passwords, using what is known as the "time-memory tradeoff". Instead of computing the hash for each possible password, they use a table, where the hash for each password has already been computed and stored. The problem with this type of system, is that these tables can be very large, and they only cover a finite character set, and password length. See the next slide for some examples of the limitations

Rainbow Tables - LM Hash

LMHash Rainbow Table #1 Charset:
LMHash Rainbow Table #2 Charset:
LMHash Rainbow Table Full Charset:

This should how quickly extending the character set can increase the size of a rainbow table. Just by adding the numbers 0 through 9 we have made the rainbow table almost 5 times larger. And you have to remember, the LMHash algorithm that Windows uses, is case insensitive and breaks your password into 2 separate 7 character passwords.

Rainbow Tables - MD5

MD5 Rainbow Table #1 Charset:

This rainbow table only covers passwords hashed with MD5 up to 8 characters in length, and only lowercase alpha-numeric. It also does not include salts.
Now, consider that the MD5 hashes used in modern operating systems have salts of 8 characters, and that the salts include numbers, upper and lower case letters, plus dot, slash and other characters. So, an 8 character password, plus the 8 character salt, and you need to extend the character set to include all possible characters, and the rainbow table is now over a terabyte

Key Points:

  • The salts used in password hashes should be random
  • The salts for each user should be different, if it were the same, a rainbow table could be created that included that one salt, and this would negate the entire advantage of a salt
  • The salt extends the length, and the character set of the password, making them harder to crack

Written by 
Near Source IT
Blog Post

AppFail.com profiles some of the biggest and most spectacular failures of (Web) Applications, Infrastructure, Design and Security on the internet. A wise man learns by the mistakes of others, a fool by his own.

Apps Post Welcome to AppFail.com We will test your apps, websites, security and see how bad it will fail. If you have a site that is on digg and you go down when a link is posted... You need us.

First Previous
  1 2 3 4 5

Cuiusvis hominis est errare; nullius nisi insipientis in errore perseverare - Any man can make a mistake; only a fool keeps making the same one.

Digg Proof Hosting
The key to surviving Digg and Slashdot is Infrastructure. You can't get it from a regular web host, it requires experience. The High Load Hosting Experts at ScaleEngine can make your site thrive, and avoid having your site featured on AppFail.

Cyber Security Alerts

Page Generated in 93ms