Frank talks about Storing Passwords in MySQL. He does, however, miss something that’s really, really important. I’m talking about the salting of passwords.
If I want to find out what the MD5 hashes 5d41402abc4b2a76b9719d911017c592 or 015f28b9df1bdd36427dd976fb73b29d represent, the first thing I’m going to try is a dictionary attack (especially if I’ve seen a table with only user and password columns). Guess what? A list of words and their MD5 sums can be used to very quickly find what these hashes represent.
I’ll probably have this dictionary in a MySQL database with an index as well. Try it yourself – you’ll probably find a dictionary with the words “hello” and “fire” in it to help. In fact, do this:
mysql> create table words (word varchar(100));
Query OK, 0 rows affected (0.13 sec)
mysql> load data local infile '/usr/share/dict/words' into table words;
Query OK, 98326 rows affected (0.85 sec)
Records: 98326  Deleted: 0  Skipped: 0  Warnings: 0
mysql> alter table words add column md5hash char(32);
Query OK, 98326 rows affected (0.39 sec)
Records: 98326  Duplicates: 0  Warnings: 0
mysql> update words set md5hash=md5(word);
Query OK, 98326 rows affected (3.19 sec)
Rows matched: 98326  Changed: 98326  Warnings: 0
mysql> alter table words add index md5_idx (md5hash);
Query OK, 98326 rows affected (2.86 sec)
Records: 98326  Duplicates: 0  Warnings: 0
mysql> select * from words where md5hash='5d41402abc4b2a76b9719d911017c592';
+-------+----------------------------------+
| word  | md5hash                          |
+-------+----------------------------------+
| hello | 5d41402abc4b2a76b9719d911017c592 |
+-------+----------------------------------+
1 row in set (0.11 sec)
mysql> select * from words where md5hash='015f28b9df1bdd36427dd976fb73b29d';
+------+----------------------------------+
| word | md5hash                          |
+------+----------------------------------+
| fire | 015f28b9df1bdd36427dd976fb73b29d |
+------+----------------------------------+
1 row in set (0.00 sec)
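The same lookup can be sketched outside MySQL. This is only an illustration, assuming Python's standard hashlib module; the helper names and the three-word sample list (a stand-in for /usr/share/dict/words) are made up for the example.

```python
import hashlib


def md5_hex(text: str) -> str:
    """Return the hex MD5 digest of a string, as MySQL's md5() does."""
    return hashlib.md5(text.encode("utf-8")).hexdigest()


def build_lookup(words):
    """Precompute hash -> word, the in-memory analogue of the indexed md5hash column."""
    return {md5_hex(word): word for word in words}


# Hypothetical sample list standing in for the real dictionary file.
sample_words = ["fire", "hello", "water"]
lookup = build_lookup(sample_words)

# One dictionary probe recovers the word behind an unsalted hash.
print(lookup.get("5d41402abc4b2a76b9719d911017c592"))  # prints: hello
```

Once the table is built, every stolen hash costs a single probe, which is why unsalted hashes fall so quickly.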
$EXCLAMATION I hear you go.
Yes, this is not a good way to “secure” passwords. Oddly enough, people have known about this for a long time and there’s a real easy solution. It’s called salting.
Salting means prepending a random string to the password before hashing it, both when you store it and when you check it.
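As a rough sketch of that store-and-check cycle (not code from this post): the function names, the six-character salt length, and the lowercase-plus-digits alphabet are all assumptions for illustration, and MD5 is used only because it is the hash under discussion here.

```python
import hashlib
import secrets
import string

# Assumed salt alphabet and length; the post does not specify either.
SALT_ALPHABET = string.ascii_lowercase + string.digits


def make_salt(length: int = 6) -> str:
    """Generate a random salt to store alongside the user's hash."""
    return "".join(secrets.choice(SALT_ALPHABET) for _ in range(length))


def hash_password(salt: str, password: str) -> str:
    """Hash salt + password, mirroring md5(CONCAT(salt, password)) in MySQL."""
    return hashlib.md5((salt + password).encode("utf-8")).hexdigest()


def check_password(salt: str, stored_hash: str, attempt: str) -> bool:
    """Recompute the salted hash for a login attempt and compare."""
    return hash_password(salt, attempt) == stored_hash


salt = make_salt()
stored = hash_password(salt, "hello")          # what goes in the md5pass column
print(check_password(salt, stored, "hello"))   # correct password
print(check_password(salt, stored, "fire"))    # wrong password
```

The salt is not a secret; it is stored next to the hash so the check can repeat the same computation.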
So, let’s look at how our new password table may look:
mysql> select * from passwords;
+------+--------+----------------------------------+
| user | salt   | md5pass                          |
+------+--------+----------------------------------+
| u1   | ntuk24 | ce6ac665c753714cb3df2aa525943a12 |
| u2   | drc,3  | 7f573abbb9e086ccc4a85d8b66731ac8 |
+------+--------+----------------------------------+
2 rows in set (0.00 sec)
As you can see, the MD5s are different than before. If we look these up in our dictionary, we won’t find a match.
mysql> select * from words where md5hash='ce6ac665c753714cb3df2aa525943a12';
Empty set (0.01 sec)
Instead, we’d have to get the salt, compute the MD5 of the salt plus each dictionary word, and see if the result matches. Guess what, no index for that! And with all the possible values for the salt, we’ve substantially increased the problem space for constructing a dictionary (I won’t go into the maths here).
mysql> create view v as select word, md5(CONCAT('ntuk24',word)) as salted from words;
Query OK, 0 rows affected (0.05 sec)
mysql> select * from v where salted='ce6ac665c753714cb3df2aa525943a12';
+-------+----------------------------------+
| word  | salted                           |
+-------+----------------------------------+
| hello | ce6ac665c753714cb3df2aa525943a12 |
+-------+----------------------------------+
1 row in set (2.04 sec)
mysql> create or replace view v as select word, md5(CONCAT('drc,3',word)) as salted from words;
Query OK, 0 rows affected (0.00 sec)
mysql> select * from v where salted='7f573abbb9e086ccc4a85d8b66731ac8';
+------+----------------------------------+
| word | salted                           |
+------+----------------------------------+
| fire | 7f573abbb9e086ccc4a85d8b66731ac8 |
+------+----------------------------------+
1 row in set (2.12 sec)
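What the view forces MySQL to do can be sketched in a few lines of Python: rehash every dictionary word with one user's salt and compare. The function names and the small word list are hypothetical, and the target hash is computed in the example rather than copied from the table above.

```python
import hashlib


def salted_md5(salt: str, word: str) -> str:
    """Equivalent of md5(CONCAT(salt, word)) in the view above."""
    return hashlib.md5((salt + word).encode("utf-8")).hexdigest()


def crack_salted(salt: str, target_hash: str, words):
    """Rehash each word with this user's salt; a precomputed index cannot help."""
    for word in words:
        if salted_md5(salt, word) == target_hash:
            return word
    return None


# Hypothetical sample list standing in for the real dictionary file.
sample_words = ["fire", "hello", "water"]

# Simulate a stolen row: the salt and the salted hash of a dictionary password.
target = salted_md5("ntuk24", "hello")
print(crack_salted("ntuk24", target, sample_words))  # prints: hello
```

The work is a full pass over the dictionary for each salt, so it has to be repeated from scratch for every user.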
So we’ve gone from essentially instantaneous retrieval to about 2 seconds per lookup. Even if I assume that one of your users is going to be stupid enough to have a dictionary password, it’s going to take me 2 seconds to check each user, as the salt is different for each user! Think about how many users are in your user table: with 1000 users, a full pass takes over half an hour (1000 × 2 seconds is about 33 minutes). For larger systems, it’s going to be hours.
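The arithmetic can be written out as a back-of-the-envelope sketch. The 2 seconds per user and the 98326-word dictionary come from the session above; the salt alphabet size (36) and length (6) are pure assumptions used to give a feel for the problem space, and the function names are made up.

```python
def full_scan_seconds(users: int, seconds_per_user: float) -> float:
    """Time to try the whole dictionary against every user's salt in turn."""
    return users * seconds_per_user


def precomputed_entries(alphabet_size: int, salt_length: int, dictionary_words: int) -> int:
    """Rows needed to precompute every (salt, word) pair ahead of time."""
    return alphabet_size ** salt_length * dictionary_words


# 1000 users at roughly 2 seconds each, as measured above.
minutes = full_scan_seconds(1000, 2.0) / 60
print(round(minutes))  # about 33 minutes

# Assumed: 6-character salts over 36 symbols, with the 98326-word dictionary.
print(precomputed_entries(36, 6, 98326))  # on the order of 10**14 rows
```

Under those assumptions a precomputed table is impractical, which leaves the attacker with the per-user scan.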
Stewart,
I think what you need to mention is what kind of attack you are trying to protect against here. If someone has stolen your database, they also have the salt, which means it does not really help. If your API is exported and does not protect against brute-forcing the password, it also does not help. It really only helps if the attacker only got the MD5s of your passwords.
This is if you’re trying to find the password for a _single_ account.
Now if you want to find the password for _any_ account, salting also helps, as you mention, but for this case you had better not lose your database.
If you have a system with 100,000+ users, it is likely to take only a couple of passwords to find someone, especially if it is something like a web service where people tend to use simple passwords.
This does not mean you should not use salting and other techniques of course :)
Thanks for pointing that out, Stewart, as salting will make the SHA1 and MD5 passwords relatively “more secure”. However, I agree with Peter that in case the data is stolen, an unlikely but possible event, the hash will be available to the hacker. Unless of course the salt isn’t stored with the data.
Frank
Hi …
For a fairly big MD5 database, you may take a look at my project at http://md5.rednoize.com. Currently you can search 5,482,473 MD5 strings. E.g. you will find the matching string for 0b73943118d782b789a9ed910b79e40e there ;)
marcel
If anyone has access to your DB, don’t you think whether the passwords are hashed or not is the least of your concerns?
If it’s write access, it’s a different story than if it’s just read access, or if they get a dump of it (e.g. are able to copy the file from the file system).
Also, non-salted passwords are *really* easy to crack, so with just a copy of the data from the db (e.g. an exploit that lets you run a SELECT query on it) you can then get the full access of that user.
I like the concept of salting. In fact I was thinking about it without knowing the term. I was looking at various articles on the web over this.
How about adding some pepper too? How about using multiple levels of salting, hashing the pattern of the salted elements again, and, to make it even more complex, using the MD5 hash of one of the salting elements?
As we increase the complexity of the passwords one of the things that is going to be of importance is the storage of the function which verifies the password.
If this is not secure (which basically means, if you have a very insecure password for your website), all this discussion is just a “moo point”…