Complex topics explained simply: Hashing

July 14, 2015 Brett van Zuiden

Why is it that websites require you to reset your password instead of just reminding you what it is? After all, they check your password when you enter it, so can’t they just look it up? It turns out they can’t: a well-built website doesn’t store your password at all, and instead uses a technique called “hashing” to check whether what you log in with matches what you set when you created your account.

This is strange! It seems like if I’m going to tell you whether or not something matches your password, I have to know what your password is, right? Well, what about this: say your password is “rabbit,” but rather than just writing down that your password is “rabbit,” I write down that your password:

is an English word,
starts with an ‘R,’
has six letters,
contains two Bs,
and the third letter comes just after the second letter in the alphabet.

Now if you came to me and said your password was “frog,” I’d say nope, just like if you told me your password was “trophy” or “ribbed.” But if you asked me what your password was, I couldn’t tell you – I can only tell you whether some word you give me matches the formula I’ve written down. In essence, I’ve created a “hash” of your password.
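If you’re curious what that formula looks like to a computer, here is a rough sketch in Python. The function name `matches_formula` and the tiny `ENGLISH_WORDS` set are made up for illustration (a real check would need a full dictionary), and this is still our toy “hash,” not something a real website would use.

```python
# Hypothetical stand-in for a real English dictionary.
ENGLISH_WORDS = {"rabbit", "frog", "trophy", "ribbed"}


def matches_formula(word: str) -> bool:
    """Return True if the word fits every rule we wrote down for the password."""
    word = word.lower()
    return (
        word in ENGLISH_WORDS                     # is an English word
        and word.startswith("r")                  # starts with an 'R'
        and len(word) == 6                        # has six letters
        and word.count("b") == 2                  # contains two Bs
        and ord(word[2]) == ord(word[1]) + 1      # third letter follows the second
    )


for guess in ["frog", "trophy", "ribbed", "rabbit"]:
    print(guess, matches_formula(guess))
```

Notice that the word “rabbit” never appears inside the function itself: the code only stores the rules, yet it can still say yes or no to any guess.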

Now, our “hash” isn’t a particularly good one – if you’re clever, you could probably come up with another word besides “rabbit” that would match the formula we wrote down (try it out!). In hashing terms, this is called a “collision.” Interestingly, while computers use a lot of complex math that makes it very, very rare that there is a collision, it is possible that you could go to Facebook, enter something other than your password, and get in. You could spend a million lifetimes trying and probably never find anything, but it’s not impossible – after all, Facebook doesn’t know your exact password, it only knows the hash – a formula about what your password “looks like” to a computer.
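To see a collision happen without waiting a million lifetimes, we can cheat and make a real hash much weaker. The sketch below is only an illustration: `tiny_hash` is a made-up function that keeps just the first byte of SHA-256, so there are only 256 possible results. Feed it 257 different inputs and at least two of them are guaranteed to share a result.

```python
import hashlib


def tiny_hash(text: str) -> int:
    """Deliberately weak hash: keep only the first byte of SHA-256 (256 possible values)."""
    return hashlib.sha256(text.encode("utf-8")).digest()[0]


def find_collision(candidates):
    """Return the first pair of different inputs that share a tiny_hash value, or None."""
    seen = {}
    for candidate in candidates:
        value = tiny_hash(candidate)
        if value in seen:
            return seen[value], candidate
        seen[value] = candidate
    return None


# 257 distinct inputs into 256 buckets: a collision must exist.
first, second = find_collision(f"password{i}" for i in range(257))
print(first, second, tiny_hash(first) == tiny_hash(second))
```

A full SHA-256 hash has so many possible results that this kind of brute-force search is hopeless in practice, which is the “very, very rare” part.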

The other reason why our “hash” isn’t particularly good is that given the formula we wrote down, you could probably figure out what the original password was. In hashing terms, this is called “reversing” or “cracking” the hash. It’s not as easy as you might think to reverse a hash, though. The formula we wrote seems easy because you know we started out with the password “rabbit,” but try coming up with the password given the hash:

is an English word,
contains four of the six vowels,
has eight common English words “inside” it (i.e. “bit” is a word inside “rabbit,” but “rat” is not),
is eight letters long,
has no duplicate letters,
and there are no other eight-letter words possible from rearranging its letters.

See if you can come up with the answer yourself. As it turns out, the way you probably tried to figure it out is similar to how computers crack passwords: come up with a list of likely words and try them out to see if they match. Security folks call this a “dictionary attack” (a “rainbow table” is a related shortcut, where the attacker works out the hashes for a big list of guesses ahead of time), and it’s why people recommend using long, hard-to-guess passwords. Just like in trying to avoid collisions, computers use a lot of complex math to make it very, very difficult to reverse hashes, and many of the modern ones even protect against dictionary attacks by mixing a random “salt” into each password and by being deliberately slow to compute, which makes it hard to guess a bunch of passwords at once. After all, if you could get your hands on a hash and reverse it, you could steal someone’s password and log into their account!
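Here is roughly what that guess-and-check approach looks like in Python. This is a simplified sketch under a few assumptions: the stolen hash is a plain, unsalted SHA-256 (a weak setup chosen to keep the example short), and the four-word `wordlist` stands in for the lists of millions of likely passwords a real attacker would try.

```python
import hashlib


def sha256_hex(password: str) -> str:
    """Hash a password with plain SHA-256 (fast and unsalted, so easy to attack)."""
    return hashlib.sha256(password.encode("utf-8")).hexdigest()


def dictionary_attack(stored_hash: str, wordlist):
    """Hash each guess and return the first one that matches the stored hash."""
    for guess in wordlist:
        if sha256_hex(guess) == stored_hash:
            return guess
    return None


stored = sha256_hex("rabbit")                      # all the attacker gets to see
wordlist = ["frog", "trophy", "ribbed", "rabbit"]  # hypothetical list of likely words
cracked = dictionary_attack(stored, wordlist)
print(cracked)
```

The attacker never reverses the math. They just keep guessing until a guess produces the same hash, so a password that appears on nobody’s list is much harder to find.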

There are many different ways to create a hash of a password, and the best methods make it incredibly unlikely to have a collision and unfathomably difficult to reverse. What they all have in common is that they allow computers to recognize passwords without having to store the password itself – so if you call us up here at Clever and ask us to remind you what you chose for your password, while we’d love to be able to help you out, we simply can’t tell you what it is!
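For the curious, here is a minimal sketch of what “recognizing a password without storing it” can look like in Python. It uses PBKDF2 from the standard library as one example of a deliberately slow hash, plus a random “salt” so that two people with the same password get different hashes. The function names and the iteration count are assumptions for this example, not a description of how Clever or any other particular site does it.

```python
import hashlib
import hmac
import secrets

ITERATIONS = 100_000  # assumed value for illustration; more iterations means slower guessing


def hash_password(password: str, salt: bytes = None):
    """Return (salt, digest). Only these two values get stored, never the password."""
    if salt is None:
        salt = secrets.token_bytes(16)  # fresh random salt for each new password
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest


def verify_password(password: str, salt: bytes, stored_digest: bytes) -> bool:
    """Re-hash the login attempt with the stored salt and compare the results."""
    _, candidate = hash_password(password, salt)
    return hmac.compare_digest(candidate, stored_digest)


salt, stored_digest = hash_password("rabbit")         # when the account is created
print(verify_password("rabbit", salt, stored_digest))  # correct password at login
print(verify_password("frog", salt, stored_digest))    # wrong password at login
```

Nothing in `salt` or `stored_digest` spells out “rabbit,” which is why a site built this way can check your password but can’t remind you what it is.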

Originally posted on Brett’s personal site.
