Securing Passwords with Salt, Pepper, and Rainbows

You have heard again and again that storing passwords in plain text is a bad idea. So now you store your passwords as MD5 or SHA1 hashes. If someone steals your password database, your users' passwords are safe, right?

Actually, no. They are never totally safe. You can, however, make the effort required to break into an individual account too large for all but the most dedicated attacker.

Unfortunately, most web applications I get a chance to examine don't bother making their password storage more secure, which is a shame — because it really is not that hard.

For completeness, let's start from the bottom and work our way up.

Plain Text Passwords

Anathema to account security. If your password database is compromised, every account is wide open. Congratulations, you just handed over the details of your entire user base.

It gets worse. Anyone listening to the traffic between your application and the database can pluck passwords right out of the air. Very few people secure their database connections with TLS or SSH. On an unswitched network, spying on something like MySQL traffic is as easy as running one command:

# tcpdump -l -i eth0 -w - src or dst port 3306 | strings

Queries like SELECT users.id FROM users WHERE password = 'foo' or results from SELECT users.* FROM users will show up in plain text, and you won't even know your passwords have been stolen.

In a switched environment it is possible to trick the switch into sending you traffic (although this can be detectable). Depending on the hardware, a switch failure may cause it to fail open and behave like an unswitched network anyway.

Simple Hashed Passwords

Hashing is generally seen as the solution. No passwords are stored in plain text, and it is hard to guess a password that matches a given hash. Even if the database is compromised or snooped, you should be fine.

That may have been true once, but hashes for many common words, passwords, and passphrases have already been calculated. Translating from those hashes back to a matching password is trivial. Remember: since the original password is not stored, all you need is any input that produces the same hash.

How easy is it to crack an account protected by an MD5-hashed password? Say we attacked a site and found this table:

Username : Hashed Password
Alice    : a34bc26f864ed5f404eac5b7a20cd9aa
Bob      : 7a75a532aaab234ad4bd33ed67e67242
Malory   : 39579c8d4a536eb092f959b4a3d14aa8
Zebedee  : 57208d910b63e879d2bae3b3a5f8366d

Take each hashed password and look it up in a rainbow table for the appropriate hash algorithm. Given that these are 32 hex characters, they are almost certainly MD5. Using something like GData to search an MD5 rainbow table gives us:

Username : Password
Alice    : alphabets
Bob      : ch1cken
Malory   : blue41
Zebedee  : ?????

Only Zebedee is safe, and that is only for two reasons: (1) he is a freaky little spring creature with a magnificent moustache who can do magic things, and (2) nobody has added his password — or a collision for it — to the rainbow table yet.

Rainbow tables exist for several hashing algorithms including MD5 and SHA1. If the hash is not in the table, causing a collision for a specific account would cost around USD$2,000 and take about a day for MD5.

Multiply Hashed Passwords

Rainbow tables take a long time to populate, and that time can be made longer by running the hash function multiple times before storing the result:

MD5(alphabets)                        = a34bc26f864ed5f404eac5b7a20cd9aa
MD5(a34bc26f864ed5f404eac5b7a20cd9aa) = dd3f1bf5a36529705d08fe50b966d41a
MD5(...)                              = ...
MD5(...)                              = b5fdbbd055fcbfd3958a28f15661aea0

Each iteration takes CPU time, so generating a rainbow table for these hashes costs more. But CPU time is cheap these days. The advantage is that the attacker doesn't know how many times you applied the hash function unless they also have your code. Unfortunately, they can brute-force that number by starting at the hash and working backwards through generated rainbow tables until they find a value that logs into the site. Once that magic number is established, you have just a few days before the rest of your accounts are compromised.

Peppered Hashes

Rainbow tables can be generated reasonably fast, and while they are not trivially cheap, they are no longer prohibitively expensive either. How do we make rainbow tables a less viable attack vector?

Rainbow tables are simply maps from hashes to the inputs that generate them. If we require that every password includes a bit of extra data — a piece of spice, let's call it pepper — that we define in our application, then existing rainbow tables become useless. An attacker would have to generate entirely new tables where every input includes the pepper.

The pepper lives in your application code and never reaches the database except as part of a hash. In this way it behaves much like the magic number in the multiply-hashed approach. And like that approach, once the pepper is discovered it can be used against all accounts.

Someone could — and if they are determined enough, will — calculate a new rainbow table given time. But if you pick a strong, unique pepper, at least there is no off-the-shelf table that works.

Spicy Hashes

By combining the pepper with multiple rounds of hashing, we force the attacker to guess two things: the number of iterations and the pepper.

pepper = ...aliesc3ifCTAasd4$af...
MD5(pepper + password)    = ...b5f34...
MD5(pepper + ...b5f34...) = ...ea28c...
MD5(pepper + ...ea28c...) = ...

SELECT users.id FROM users WHERE hashed_password = ...

I am not entirely sure this buys much over just applying the hash function many times, but it sure looks pretty — and I really wanted an excuse to make a pun about using lots of pepper to make hashes spicy. Sorry.

Salted Hashes

The pepper and the multiply-hashed approaches share a weakness: they use a single value for the entire database. What if there were a different value for each account? A small, unique-per-account value mixed in the same way as the pepper — a salt.

With a per-account salt, a rainbow table generated to crack one account is useless for cracking the next.

Where do we store the salt? I quite like tucking it into the first few characters of the hashed password field, though you might prefer a separate column. Yes, the salt lives right there in the password database. Sounds like it would make cracking easier, right? Not really. All the salt tells an attacker is that the password is somehow combined with this value to produce the hash. The how is still hidden in your application code, and a valid password is still several iterations of rainbow table generation away — for each individual account.

Safe Now?

Not even close. With a solid combination of the above — strong salts, a good pepper, and a decent number of hashing rounds — you have made it unlikely that someone who steals your password database can use it to access accounts. But that doesn't mean your users will pick sane passwords, that your system is bug-free, or that there aren't other ways to find those passwords.