A hacker recently released 32 million passwords obtained through an attack on a "legacy" server at rockyou.com. The folks at Imperva analyzed this password list and came to the usual conclusions and offered the usual well-meaning advice.
I think most of their conclusions are wrong or irrelevant. They are traditional password complaints, and I have made some of these recommendations myself, but the threats have changed, and people haven't, and it is time to fix this.
The size of the data spill is unprecedented. It reminds me of password sniffing successes discovered on ISP monitoring computers in 1996. At that time, traffic loads were light enough that a Sun workstation positioned on a network backbone could sniff all the cleartext passwords that passed it. A number of these maintenance machines were hacked and sniffers installed. The malware was discovered when disk partitions mysteriously filled up. Investigation revealed millions of captured passwords. (It is no surprise that attacks on .gov and .mil sites increased at that time.) These passwords were not published, but they were analyzed.
Research over three decades has shown that people do not pick passwords that are resistant to dictionary attacks, and they don't like being forced to do so.
The Imperva folks used NASA password advice to analyze the data. The analysis is fine, but NASA's rules are not. Let's look at NASA's list:
I call these eye-of-newt password rules, because the recipes are reminiscent of magical potions. The rules vary per authentication service, which means we have to create different ones in different ways for different sites. The sites don't usually list their rules to those logging in, so users have to consult notes.
Imperva adds their own advice for users echoing security maven Bruce Schneier:
Rule 3 is generally good advice, but it depends on the third party. Spouses who share bill-paying may need to share the banking password or other authentication token. But it is a bad idea to give one web service access to another on your behalf: you are adding potentially weak links to the chain.
Bruce suggests in rule 2 that it is okay to write down passwords, or better, write a reminder about them, and I agree. Previous admonitions against writing down passwords contemplated local attacks—people reading your Post-it notes on your terminal in the office for example. Most attacks come from distant malefactors, and they will never see your terminal. But do beware of leaks to family members, like curious teenagers or a divorcing spouse. Of course, it would be better if the password were memorable, or generated by a token.
Item one has a reasonable suggestion for creating a password resistant to dictionary attacks. But the result is not easy to type, not easy to associate with a particular service, and may not contain enough types of characters (see NASA rule 2) to be accepted.
But most importantly, dictionary attacks are no longer a common threat to most Internet users. Passwords are usually obtained by keyboard sniffing software or phishing websites. Under these threat models, all the eye-of-newt rules are what many users suspected: annoying, tedious bureaucratic rules that don't actually help security!
Consider: it didn't matter what kind of password the 32 million people chose: the bad guys got all of them, strong and weak, for reasons beyond the users' control. It was the site administrators who screwed up and succumbed to an old and well-known SQL injection attack.
(The folks at RockYou have explained the measures they have taken since the attack at http://www.rockyou.com/help/securityMessage.php. It is interesting to note the URL of this security notice: the message was generated by PHP, a popular and easy-to-use language. According to my sources, PHP has fundamental security problems and is easily misconfigured. A large percentage of the break-ins to Linux and *BSD servers are through PHP.)
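For readers who have not seen the attack class, here is a minimal sketch of SQL injection and its usual defense, parameterized queries, using Python's built-in `sqlite3` module. The table, column names, function names, and sample rows are invented for illustration; nothing here is claimed about RockYou's actual code or schema.

```python
import sqlite3

def find_user_unsafe(conn, name):
    # Vulnerable: user input is pasted directly into the SQL text.
    query = "SELECT name FROM users WHERE name = '" + name + "'"
    return conn.execute(query).fetchall()

def find_user_safe(conn, name):
    # The value travels separately from the SQL text, so it is never parsed as SQL.
    return conn.execute("SELECT name FROM users WHERE name = ?", (name,)).fetchall()

# Hypothetical in-memory database with two made-up accounts.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.executemany("INSERT INTO users VALUES (?, ?)", [("alice", "a"), ("bob", "b")])

attack = "nobody' OR '1'='1"
unsafe_rows = find_user_unsafe(conn, attack)  # the injected OR clause matches every row
safe_rows = find_user_safe(conn, attack)      # no user has that literal name
```

The unsafe version hands back every account for a name that does not exist; the safe version treats the same string as an odd-looking name and finds nothing.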
Let's look at Imperva's further advice to administrators:
Users will choose weak passwords, but trying to get them to do otherwise is simply poor engineering. It doesn't work very well, is inconvenient as hell, and doesn't stop the Bad Guys.
This is fine advice. There is certainly no point in leaking cleartext passwords: we have SSL and ssh. This doesn't address shoulder-surfing, keyboard sniffing, and phishing, but it does raise the bar. Most sites have been getting this right for a long time.
It's not a bad idea to digest passwords for database storage. This means that theft of the database comes with the added need for dictionary attacks on the individual passwords. This was probably not a bad idea for Unix passwords in the late 1970s, but it was proved less than useful a long time ago. A much more useful alternative rule: Make your authentication server highly resistant to attacks. If you can't or won't, use the server of a third party who will. Authentication service is not a job for amateurs.
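A minimal sketch of what digesting a password for storage can look like, here with PBKDF2 from Python's standard library. The choice of PBKDF2, the iteration count, the salt size, and the function names are assumptions of this illustration, not part of Imperva's advice.

```python
import hashlib
import hmac
import os

ITERATIONS = 100_000  # illustrative work factor only, not a recommendation

def digest_password(password, salt=None):
    # A fresh random salt per user forces the attacker to attack each digest separately.
    if salt is None:
        salt = os.urandom(16)
    digest = hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, ITERATIONS)
    return salt, digest

def verify_password(password, salt, expected):
    # Recompute with the stored salt and compare in constant time.
    _, candidate = digest_password(password, salt)
    return hmac.compare_digest(candidate, expected)

salt, stored = digest_password("eye of newt")
```

Only `salt` and `stored` go into the database. A thief who gets the table still has to run a dictionary attack per user, which is exactly the limited protection described above: it slows the attacker down but does not save a weak password.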
This, finally, is excellent advice. See below for further discussion. However, it is not clear how well the CAPTCHA idea is going to work: there is an arms race there that we seem to be losing.
Password change policies are over-used, and probably a bad idea in general. Good passphrases are hard to think up and remember, and trying to remember changes is a problem. Of course, it is a good idea to change your password when a breach is suspected.
This is good. Passphrases are better than the eye-of-newt passwords: they are easier to remember and type, and perhaps harder to shoulder surf. On devices with spelling correction (like the iPhone when not in password mode), they can simplify password entry. Many authentication servers reject perfectly good high entropy passphrases because they fail the eye-of-newt tests.
How do we get out of this mess? We know that grandma is not going to pick the high-entropy authentication tokens we want. (And this is true for all values of grandma: my children's grandma wrote disk controller software for the UNIVAC I.)
I see two general solutions. The best is a hardware or software token that provides a one-time password on demand, given a PIN. The RSA time-based token seems to have won the battle. It is available as a small hardware device hung on a keychain, or as an app on a smart phone. There is no password picking, no eye-of-newt, no password changes, and even eavesdropping is probably okay.
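The RSA token's algorithm is proprietary, so as a stand-in here is a sketch of the open time-based one-time password scheme (TOTP, RFC 6238, built on RFC 4226), which works on the same principle: a shared secret plus the current time yields a short-lived code. The function name and defaults are choices made for this illustration, and the PIN step is omitted.

```python
import hashlib
import hmac
import struct

def totp(secret, unix_time, step=30, digits=6):
    # The moving factor is the number of whole time steps since the Unix epoch.
    counter = int(unix_time) // step
    mac = hmac.new(secret, struct.pack(">Q", counter), hashlib.sha1).digest()
    # Dynamic truncation (RFC 4226): the low nibble of the last byte picks a 4-byte window.
    offset = mac[-1] & 0x0F
    code = struct.unpack(">I", mac[offset:offset + 4])[0] & 0x7FFFFFFF
    return str(code % 10 ** digits).zfill(digits)

# Shared secret and timestamp taken from the RFC 6238 test vectors.
code = totp(b"12345678901234567890", 59, digits=8)  # "94287082"
```

The server runs the same computation with its copy of the secret and accepts the code only for the current time step (usually with a step or two of slack), which is why a sniffed code is of little use a minute later.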
The problem is that you have to carry these around, and maybe carry many of them. The software solution isn't bad: most iPhone users always have the device with them. I am not sure if it handles multiple authenticators, but it certainly could. But many of us may have a dozen or more sites that require logins.
This would seem to be the perfect place for a trusted authentication service as a business. Why shouldn't sites like Facebook or RockYou trust some well-respected company offering this service? There might be a few competing services, like MasterCard, Visa, AT&T, and RSA. One would have to have tokens for each.
This has been tried, but it hasn't caught on. It seems to me that it is time to try again. Perhaps previous solutions were too expensive. I would suggest that the service be offered to small sites (fewer than 30 authentications a month, say) for free, through a well-documented interface similar to OpenID or OpenAuth.
Since the early 1970s, the US banking system has used four-digit PINs for ATM access. Eight-digit PINs are available in Europe, along with some disparagement of the shorter US ones, but the US standard has stood up well.
The four-digit PIN is okay because the user only has a few tries to get it right. If the banks let you choose a PIN, it may not be a moronic one. By limiting the number of tries, and locking the account, grandma can have a nearly hassle-free password. Here is one rule to rule them all:
Don't pick a moronic password: one that a friend or relative can guess in a few tries, or a shoulder surfer can easily spot as you type.
This gets rid of the useless eye-of-newt rules. Password length is not constrained. Dictionary words are okay, even encouraged. You might want to change the password occasionally, but pick another word in a dictionary if you'd like. Pet and spouse names are out. So are "1234", "password", and "facebook". Grandma can understand these rules, and the need for them. Humans get out of the high-entropy game.
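A rough sketch of how a site might enforce the no-moron rule at enrollment. The three blocked words come from the text; the function name, the `personal_words` parameter (pet names, spouse names, the site's own name), and the run-of-digits test are assumptions of this illustration, and a real deployment would want a longer blocklist.

```python
COMMON = {"1234", "password", "facebook"}  # the examples named in the text

def is_moronic(password, personal_words=()):
    p = password.lower()
    if p in COMMON:
        return True
    # Names a friend or relative would try first: pets, spouse, the site itself.
    if p in {w.lower() for w in personal_words}:
        return True
    # Empty, or one character repeated ("111111", "000000").
    if len(set(p)) <= 1:
        return True
    # Straight runs up or down the digit row ("123456", "654321", "1234567890").
    if p.isascii() and p.isdigit():
        steps = {(int(b) - int(a)) % 10 for a, b in zip(p, p[1:])}
        if steps == {1} or steps == {9}:
            return True
    return False
```

With this check, `is_moronic("123456")` and `is_moronic("rex", ["Rex"])` are both true, while an ordinary dictionary word such as "marmalade" passes.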
The cost here is locked accounts. I suggest that this inconvenience is less likely to happen with simpler passwords. When it does, allow a few further attempts after increasing lengths of time. This scheme has passed the test of time. This should cut support costs, without reducing security.
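The counting and timed back-off can be sketched in a few lines. The class name, the three free tries, the one-minute base delay, and the doubling schedule are all assumptions chosen for illustration; the caller supplies the clock so the logic is easy to test.

```python
class LoginThrottle:
    def __init__(self, free_tries=3, base_delay=60.0):
        self.free_tries = free_tries
        self.base_delay = base_delay
        self.failures = {}  # user -> (consecutive failures, time of last failure)

    def wait_remaining(self, user, now):
        # Seconds the user must still wait before another attempt is accepted.
        count, last = self.failures.get(user, (0, 0.0))
        if count < self.free_tries:
            return 0.0
        # Each failure past the free tries doubles the lockout period.
        delay = self.base_delay * 2 ** (count - self.free_tries)
        return max(0.0, last + delay - now)

    def record(self, user, success, now):
        # A success clears the slate; a failure bumps the counter and timestamp.
        if success:
            self.failures.pop(user, None)
        else:
            count, _ = self.failures.get(user, (0, 0.0))
            self.failures[user] = (count + 1, now)
```

A login handler would call `wait_remaining(user, time.monotonic())` before checking the password and refuse the attempt if the result is nonzero, then call `record` with the outcome. An attacker limited to a few guesses per lengthening interval cannot run a dictionary, which is what makes a simple password tolerable.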
A password hint is not a bad idea, but it should refer to the one true password, not a secondary password. Mother's-maiden-name passwords seldom pass the no-moron rule.
Password counting and timed back-off require an authentication server, but not necessarily a heavyweight one. On Unix systems, there is a moribund PAM module named pam_tally, around for over a decade, that counts login attempts. It should be dusted off and used widely.
The prevalence of repeated security problems has become boring. We don't seem to learn lessons taught long ago. Legacy solutions have legacy drawbacks. The solutions I have proposed above are not new, not radical, and not particularly expensive or inconvenient. Single sign-on can reduce the authentication load on the user in many cases.
Experts have shown the ability to run major Internet offerings without major catastrophes. Most of the published data leakages have come from smaller providers who may have had less experience with the security problems involved.
We can leverage these solutions to actually make things better. This is how normal engineering works, and it has been hard to find in computer security.
Besides, we have bigger fish to fry. Threats like supply chain attacks are now real. We seem to be slowly losing the virus detection battle. Malicious users quickly analyze and exploit weaknesses derived from software patches. Nation states are involved in offensive operations. How are we going to solve these if we can't get something like authentication right?
Almost 16% of the passwords released were PINs (i.e. all digits). Here's a quick analysis.
The top ten were clearly moronic:
Rank | Count | PIN |
---|---|---|
1 | 290,729 | 123456 |
2 | 79,076 | 12345 |
3 | 76,789 | 123456789 |
4 | 21,725 | 1234567 |
5 | 20,553 | 12345678 |
6 | 13,984 | 654321 |
7 | 13,272 | 111111 |
8 | 13,028 | 000000 |
9 | 9,516 | 123123 |
10 | 8,676 | 1234567890 |
It took some brief thought to figure out some in the top 100. These are much less clear to me, and are probably less moronic than your anniversary or birthday:
Rank | Count | PIN |
---|---|---|
20 | 4,576 | 159753 |
37 | 2,491 | 5201314 |
69 | 1,345 | 14344 |
85 | 1,076 | 1435254 |
The PIN lengths were distributed thus:
Digits | Count |
---|---|
1 | 55 |
2 | 149 |
3 | 1,048 |
4 | 20,661 |
5 | 213,543 |
6 | 2,278,919 |
>6 | 2,678,615 |
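
A tally like the one above takes only a few lines to produce. This is a sketch of one way to do it, not the script actually used; the function name and the tiny sample list are made up for illustration.

```python
from collections import Counter

def pin_length_histogram(passwords):
    # Count all-digit passwords by length, lumping everything over six digits together.
    hist = Counter()
    for pw in passwords:
        if pw.isascii() and pw.isdigit():
            hist[len(pw) if len(pw) <= 6 else ">6"] += 1
    return hist

# Hypothetical sample; the real input would be the leaked list, one password per line.
sample = ["123456", "hello", "0000", "1234567890", "abc123", "654321"]
hist = pin_length_histogram(sample)
```

For the sample, `hist` holds two six-digit PINs, one four-digit PIN, and one longer than six digits; the two passwords containing letters are skipped.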
My least important password appeared more than two dozen times. I could find no others that I recall using in the past twenty years.