In December 2009, a critical data breach occurred on the Internet. Around 32 million user passwords of the rockyou.com web portal were stolen by a hacker who used SQL injection for the attack. He obtained all the passwords and made them available for download on the Internet in anonymized form (i.e. without usernames).
Security experts started analyzing the passwords, and Imperva released a study on their security level. The study produced the following results:
[Figures: Key findings; The 20 most commonly used passwords; Password length distribution]
As the figure shows, about 60% of the passwords are quite insecure: they contain only lower-case letters, only upper-case letters, or only digits. The remaining 40% are more secure and contain mixed-case letters, digits and/or special characters.
As security experts keep repeating, a secure password must contain lower- and upper-case letters, numbers and special characters. This makes passwords more resistant to brute-force and dictionary attacks.
This raises the following question: do two passwords of the same length, containing the same number of lower-case letters, upper-case letters, numbers and special characters, provide the same level of security? The answer is no. Consider the following two passwords: “z6iFk#rdlr” and “Password1.”. Both contain 7 lower-case letters, 1 upper-case letter, 1 number and 1 special character. But the first is more secure than the second, since it appears to be randomly generated, whereas the second follows a pattern that jeopardizes its security. If many passwords share the same pattern, an attacker can exploit it to run automated attacks similar to dictionary attacks. The pattern in this example consists of the following aspects:
- The first letter is a capital letter.
- The password is based on a dictionary word.
- A number and a special character are appended to the dictionary word, in that order.
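To make the gap between the two passwords concrete, here is a rough back-of-the-envelope sketch in Python. The dictionary size of 100,000 words is purely an assumption for illustration and does not come from the analysis; 94 is the number of printable ASCII letters, digits and punctuation symbols in Python's `string` constants.

```python
import string

# Printable ASCII characters an attacker must consider for a random password.
ALPHABET_SIZE = len(string.ascii_letters + string.digits + string.punctuation)

# Hypothetical dictionary size; an assumption for illustration only.
ASSUMED_DICTIONARY_WORDS = 100_000


def random_keyspace(length):
    """Number of candidates for a truly random password of the given length."""
    return ALPHABET_SIZE ** length


def pattern_keyspace(words=ASSUMED_DICTIONARY_WORDS):
    """Candidates for 'capitalized dictionary word + one digit + one symbol'."""
    return words * len(string.digits) * len(string.punctuation)


if __name__ == "__main__":
    print(f"random, 10 characters: {random_keyspace(10):.3e}")
    print(f"pattern-based        : {pattern_keyspace():.3e}")
```

Under these assumptions the pattern-based search space is many orders of magnitude smaller than that of a random 10-character password, even though both passwords satisfy the same composition rules.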
People with security in mind would like to follow the recommendations for choosing secure passwords, but they are not capable of remembering complicated, randomly generated ones. My feeling was always that they have found a middle way: they choose a mixed password that is still easy to remember. This leads them to apply “password patterns”. To check this assumption, I carried out a further analysis of the 32.6 million passwords. The aim of my analysis is to define some password patterns and check their usage ratio within the password list.
The Analysis
For the analysis, I imported the 32.6 million passwords into a database table (the exact number is 32,603,348). I used the [:alpha:], [:digit:] and [:punct:] definitions to group the different character sets within passwords. These definitions represent the following character sets:
Character class | Character set |
---|---|
[:alpha:] | Any alpha character A to Z or a to z |
[:digit:] | Only the digits 0 to 9 |
[:punct:] | Punctuation symbols (i.e. . , " ' ? ! ; : # $ % & ( ) * + - / < > = @ [ ] \ ^ _ { } \| ~) |
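The same grouping can be approximated outside a database. The following is a minimal Python sketch, not the queries used for the analysis. Python's `re` module does not support POSIX bracket classes such as `[:alpha:]`, so the sketch assumes that the constants of the standard `string` module are an acceptable stand-in; the function names are hypothetical.

```python
import string

# Approximations of the POSIX classes using Python's string constants
# (an assumption; the original analysis used [:alpha:], [:digit:], [:punct:]).
CHAR_CLASSES = {
    "alpha": set(string.ascii_letters),
    "digit": set(string.digits),
    "punct": set(string.punctuation),
}


def char_class(ch):
    """Return 'alpha', 'digit', 'punct', or 'other' for a single character."""
    for name, members in CHAR_CLASSES.items():
        if ch in members:
            return name
    return "other"


def classes_used(password):
    """Return the set of character classes that occur in the password."""
    return {char_class(ch) for ch in password}


if __name__ == "__main__":
    print(classes_used("Password1."))
```

Characters outside the three classes (for example spaces or non-ASCII letters) are reported as `other` rather than silently dropped.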
Password Patterns
The first pattern I analyzed is “concatenation” of different character sets. According to this pattern, people append one character set to another set or sets (for example, “password.” or “password1.”). The first is an example of alpha+punct dual concatenation; the second is an example of the alpha+digit+punct triple concatenation pattern.
The second pattern I analyzed is “replacement” of certain alpha letters. According to this pattern, people replace certain letters in passwords with a digit or punctuation character. An example is “passw0rd”, where the letter o is replaced with the number zero.
1. Concatenation Password Pattern
People concatenate different character sets. For example, they append a single number (mostly 1) or the “.” symbol to a dictionary word. The following sections give the frequencies of all possible concatenations between the different character sets.
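One way to assign a password to a concatenation pattern is to collapse each run of same-class characters into a single label. The sketch below illustrates that idea in Python; it is an assumption about how such a classification could be done, not the method used for the analysis, and `char_class` and `signature` are hypothetical helper names.

```python
from itertools import groupby
import string


def char_class(ch):
    """Map one character to 'alpha', 'digit', 'punct' or 'other'."""
    if ch in string.ascii_letters:
        return "alpha"
    if ch in string.digits:
        return "digit"
    if ch in string.punctuation:
        return "punct"
    return "other"


def signature(password):
    """Collapse runs of same-class characters into a pattern name."""
    return "+".join(cls for cls, _run in groupby(password, key=char_class))


if __name__ == "__main__":
    for pw in ("password.", "password1.", "04maxima"):
        print(pw, "->", signature(pw))
```

A signature with one label corresponds to the "no concatenation" case, two labels to dual concatenation and three labels to triple concatenation. Passwords that switch class more often, such as "abc123abc", get longer signatures and fall outside the patterns counted here.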
1.1. No Concatenation
For the sake of completeness, I analyzed the “no concatenation” case as well. That is, I searched for passwords containing only alpha, only digit or only punctuation characters. The following table shows the number of occurrences in the password list for each character set. According to the results, 44% of the passwords contain only alpha characters (i.e. lower- and/or upper-case letters).
Character set | Occurrences |
---|---|
alpha | 14,366,751 (44%) |
digit | 5,192,998 (16%) |
punct | 4,860 (0.015%) |
1.2. Dual Concatenation
In this pattern, I searched for passwords that match any of the “alpha+digit”, “alpha+punct” or “digit+punct” concatenations, including their reverse combinations. For the alpha part, I did not check whether it is a dictionary word or not, but most of them appear to be dictionary words. The following table shows the frequencies of the possible concatenations.
Alpha+Digit | Alpha+Punct | Digit+Alpha | Digit+Punct | Punct+Alpha | Punct+Digit |
---|---|---|---|---|---|
9,834,095 (30%) | 240,993 (0.74%) | 895,916 (2.75%) | 12,646 (0.04%) | 16,090 (0.05%) | 3,395 (0.01%) |
mekster11, khas8950, emilio1, holiday2, caitlin1, cats13, toohott69, cheer99, may2204, betteroff6, love1129 | olives!, skittles?, cheaphat!, skating., junkbox!, easymac*, itsmiller!, balboa!, bobbiedee!, hotbitch., password!, sowhat?, iloveyou!, redbag., yankees!, princess!, iluvyou! | 04maxima, 33orange, 12344321a, 1234567a, 118jefferson, 98101ef, 36987l, 1sweetness, 1simpleplan, 1loveyou, 5pointstar, 98765432q, 12345a, 1capital, 123xyz, 16inches, 50cent | 78963., 13659*, 83593113$$, 123456], 369*, 1977.., 022590!!, 8825##, 92102310., 3636369., 1457., 963., 24824** | *forever, !cheeky, $tevenrules, *phsyco, -angel, []dauoa, !qwert, !loveu , $prite, .com, *Twist, $upersonic, *jordan, $tennis , *jessica | ,123456, /8520, *41681, .31331, $$$4369, +2511161897, .09164232572, -11185, !034780, ~@~@~@123, *13961, ****1, ~123456, {0106860511 |
1.3. Triple Concatenation
In this pattern, I searched for passwords that match any of the following triple combinations: “alpha+digit+punct”, “alpha+punct+digit”, “digit+alpha+punct”, “digit+punct+alpha”, “punct+alpha+digit” or “punct+digit+alpha”. For the alpha part, I did not check whether it is a dictionary word or not, but most of them appear to be dictionary words.
Alpha + Digit + Punct | Alpha + Punct + Digit | Digit + Alpha + Punct | Digit + Punct + Alpha | Punct + Alpha + Digit | Punct + Digit + Alpha |
---|---|---|---|---|---|
82,151 (0.25%) | 185,610 (0.57%) | 13,298 (0.04%) | 18,218 (0.06%) | 9,940 (0.03%) | 12,592 (0.04%) |
teenager1@, abc123., karl143., windowsxp1!, kelvin258/, jessie18;, pretti7*, jordans07., JUNE24,, briana20., softball4!, blue42!, space1*, class08!, sonny21., mkjoy8!, Mas28@*, abc123!, roach89!, any83* | kaitlyn.1, poopp<3, t=48697123, franco_1, dude!2, chris#6, tommy.2359, iloveyou*1, Summer#5, watru^2, beautiful_01 | 1hawaiian!, 1wish!, 072305AJ$, 1TIKA!!, 4evergreen!!, 123abc., 1love!, 707sucks!, 123loveme!, 1fighter/, 50cent., 1andonly., 1srael** | 11!!JesusS, 6.five, 555-oup, 7-boss, 1!iloveyou, 1*princess, 305-boy, 123!qaz, 100%jumper, 1986@Jessica, 15-red, 1-Love | .disney2, @$$baba82, *k123456, $hortii88, *supergirl12, *ILOVEYA7, *june7, $iloveu40, !batman76, @love2, $outh408, .loveable1, `cpecan10, *martin23. | #1CHRIZ, #1kingsfan, <3ilovemanuel, !11Mom, *789ab, #1hawaiian, #1carlos, #1lover, #1lady |
Based on the concatenation statistics, the most commonly used dual combination is “alpha+digit” and the most commonly used triple combination is “alpha+punct+digit”.
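The percentages and the two "most common" results can be re-derived from the counts in the tables above. The short Python sketch below only re-does that arithmetic; the counts and the total of 32,603,348 passwords are copied from this article, while the function names are hypothetical.

```python
TOTAL_PASSWORDS = 32_603_348

# Counts copied from the dual and triple concatenation tables.
DUAL = {
    "alpha+digit": 9_834_095,
    "alpha+punct": 240_993,
    "digit+alpha": 895_916,
    "digit+punct": 12_646,
    "punct+alpha": 16_090,
    "punct+digit": 3_395,
}
TRIPLE = {
    "alpha+digit+punct": 82_151,
    "alpha+punct+digit": 185_610,
    "digit+alpha+punct": 13_298,
    "digit+punct+alpha": 18_218,
    "punct+alpha+digit": 9_940,
    "punct+digit+alpha": 12_592,
}


def share(count, total=TOTAL_PASSWORDS):
    """Share of the whole password list, in percent."""
    return 100.0 * count / total


def most_common(table):
    """Name of the pattern with the highest count."""
    return max(table, key=table.get)


if __name__ == "__main__":
    for table in (DUAL, TRIPLE):
        name = most_common(table)
        print(f"{name}: {share(table[name]):.2f}%")
```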
2. Replacement Password Pattern
The second password pattern is replacement. People tend to replace certain letters in words with digits or punctuation characters. For example, “o” is replaced with zero (0), and “S” is replaced with “$” or five (5). The following table gives some examples of the replacement pattern. The numbers in the second column are not exact, since they include false positives.
Alpha letter replaced with a digit | Occurrences (approx.) | Examples |
---|---|---|
o -> zero (0) | 30,485 | il0veyou, ge0rge, m0vie, br0ken, passw0rd, c0llege, br0ther, n0thing, t0psecret, m0nkey, 1o/22/2003 |
i/l -> one (1) | 57,456 | 1loveyou, P1ayer, mel1ssa, stup1d, denn1s, w1lliams, f1lipana, pr1ncess, 1srael** |
s -> five (5) | 9,867 | du5tin, ju5tin, east5ide, augu5t, it5easy, eclip5e |
b/g -> six (6) | 7,059 | straw6erry, soccer6irl, short6one, hun6ry |
g -> nine (9) | 6,599 | an9els, en9ine |
Alpha letter replaced with a punctuation character | | |
s -> $ | n.a. | $prite, be$tfriend, ju$tin, two$hort, $pecial, $ummer, $upersonic, $tevenrules, $outh |
i/l -> \| | n.a. | love\|y, my\|ove, actual\|y, M\|ChElLe |
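A replacement can be detected by reversing the substitutions from the table and checking whether the result is a known word. The following Python sketch shows one possible approach under stated assumptions: the tiny `SAMPLE_WORDS` set is a hypothetical stand-in for a real dictionary, and the function names are made up for illustration. It is not the procedure used to produce the counts above.

```python
from itertools import product

# Substitutions from the table, reversed: digit or symbol -> possible letters.
REVERSE_SUBSTITUTIONS = {
    "0": "o",
    "1": "il",
    "5": "s",
    "6": "bg",
    "9": "g",
    "$": "s",
    "|": "il",
}

# Tiny stand-in word list; a real run would need a full dictionary (assumption).
SAMPLE_WORDS = {"password", "monkey", "justin", "angels", "lovely", "hungry"}


def undo_replacements(password):
    """Yield every lowercase candidate obtained by reversing the substitutions."""
    options = [REVERSE_SUBSTITUTIONS.get(ch, ch) for ch in password.lower()]
    for combo in product(*options):
        yield "".join(combo)


def matches_replacement_pattern(password, words=SAMPLE_WORDS):
    """True if a substituted character is present and a candidate is a word."""
    if not any(ch in REVERSE_SUBSTITUTIONS for ch in password):
        return False
    return any(candidate in words for candidate in undo_replacements(password))


if __name__ == "__main__":
    for pw in ("passw0rd", "ju$tin", "love|y", "password"):
        print(pw, matches_replacement_pattern(pw))
```

Checking against a word list is what keeps false positives down: a simple search for "any password containing a 0" would also count dates and plain numbers. Note that the number of candidates doubles for every ambiguous character (1, 6 or |), so very long digit strings are better filtered out first.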
3. Additional Patterns
There are also some additional interesting password patterns within the list that can be taken into consideration:
Pattern | Occurrences | Examples |
---|---|---|
Dates | 4,167 | 4/30/04, 12/02/03, 06/27/00, 19/03/1988 |
Keyboard sequences | n.a. | 123456 (in top 10), 12345678 (in top 10), qwerty (in top 20), qwertz (97), asdf (157), asdfg (1,190), asdfgh (2,908) |
Keyboard reverse sequences | n.a. | 654321 (in top 20), trewq (14), ytrewq (160) |
Starting with #1 | 8,617 | #1kingsfan |
Ending with 1. | 3,047 | dark1. |
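These additional patterns are simple enough to check with a regular expression and a few string operations. The Python sketch below is one hedged way to do it; the keyboard rows, the minimum sequence length of four and all function names are assumptions chosen for illustration, and the date check only looks at the shape of the string, not at whether the date is valid.

```python
import re

# Slash-separated dates such as 4/30/04 or 19/03/1988 (shape only).
DATE_RE = re.compile(r"\d{1,2}/\d{1,2}/(\d{2}|\d{4})")

# Keyboard rows to scan; QWERTY and QWERTZ top rows are both included.
KEYBOARD_ROWS = ("1234567890", "qwertyuiop", "qwertzuiop", "asdfghjkl", "zxcvbnm")


def is_date_like(password):
    """True if the whole password looks like d/m/y or m/d/y with slashes."""
    return DATE_RE.fullmatch(password) is not None


def is_keyboard_sequence(password, min_length=4):
    """True if the password is a run along one keyboard row, in either direction."""
    p = password.lower()
    if len(p) < min_length:
        return False
    return any(p in row or p in row[::-1] for row in KEYBOARD_ROWS)


def starts_with_hash_one(password):
    """True for passwords such as #1kingsfan."""
    return password.startswith("#1")


def ends_with_one_dot(password):
    """True for passwords such as dark1."""
    return password.endswith("1.")


if __name__ == "__main__":
    for pw in ("4/30/04", "qwertz", "ytrewq", "#1kingsfan", "dark1."):
        print(pw, is_date_like(pw), is_keyboard_sequence(pw),
              starts_with_hash_one(pw), ends_with_one_dot(pw))
```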
The Symbols
People prefer certain symbols over others. The most commonly used punctuation character is the period (.) with 0.7%. The second is the underscore (_) with 0.58% and the third is the exclamation mark (!) with 0.55%. The frequency of each punctuation symbol in the password list is given in the following table.
Symbol | Occurrences |
---|---|
. | 226,980 (0.7%) |
, | 27,722 (0.09%) |
" | 3,172 (0.01%) |
' | 16,097 (0.05%) |
? | 24,744 (0.08%) |
! | 179,666 (0.55%) |
; | 14,378 (0.044%) |
: | 7,239 (0.022%) |
# | 60,016 (0.18%) |
$ | 31,501 (0.1%) |
% | 11,282 (0.03%) |
& | 28,553 (0.088%) |
( | 16,557 (0.05%) |
) | 18,349 (0.056%) |
* | 95,400 (0.3%) |
+ | 24,000 (0.073%) |
- | 126,908 (0.39%) |
/ | 37,836 (0.12%) |
< | 11,856 (0.036%) |
> | 2,755 (0.008%) |
= | 18,741 (0.057%) |
@ | 104,336 (0.32%) |
[ | 7,722 (0.02%) |
] | 10,731 (0.033%) |
\ | 4,149 (0.013%) |
^ | 5,863 (0.018%) |
_ | 187,603 (0.58%) |
{ | 1,056 (0.003%) |
} | 933 (0.003%) |
\| | 506 (0.002%) |
~ | 5,823 (0.018%) |
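A table like this can be produced with a simple counter. The Python sketch below assumes that each symbol is counted once per password that contains it; the article does not state whether the figures are per password or per occurrence, so that choice is an assumption, as is the function name.

```python
from collections import Counter
import string

PUNCTUATION = set(string.punctuation)


def symbol_frequencies(passwords):
    """Map each symbol to (number of passwords containing it, share in percent)."""
    counts = Counter()
    total = 0
    for password in passwords:
        total += 1
        # set() ensures a symbol is counted once per password, not per occurrence.
        counts.update(set(password) & PUNCTUATION)
    return {sym: (n, 100.0 * n / total) for sym, n in counts.most_common()}


if __name__ == "__main__":
    sample = ["olives!", "abc123.", "kaitlyn.1", "1TIKA!!", "emilio1"]
    for sym, (n, pct) in symbol_frequencies(sample).items():
        print(f"{sym} | {n} ({pct:.1f}%)")
```

To count every occurrence instead, `counts.update(ch for ch in password if ch in PUNCTUATION)` could replace the set intersection.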
Conclusion
My pattern analysis produced the following statistical results:
- The most commonly used special character is the period (.).
- The most commonly used dual concatenation of alpha-digit-punct characters is “alpha+digit” with 30%.
- The most commonly used triple concatenation of alpha-digit-punct characters is “alpha+punct+digit” with 0.57%.
- For the replacement pattern, replacing the letter i or l with the number 1 is the most commonly used pattern.
Password patterns might be the basis for the next generation of dictionary attacks. Once common password patterns are known, attackers can enhance their tools to mount pattern-based brute-force attacks.
Finally, I recommend the following measures against password patterns:
- Do not choose and use any password based on a common pattern!
- Let a random password generator (e.g. the pwgen Firefox add-on) create strong passwords for you.
- If you are bad at remembering passwords, create a single strong password (i.e. a master password), remember it, and use a password manager (e.g. Sxipper, KeePass) protected with the master password. Then let the password manager generate strong, unique passwords and store them for you.
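For readers who want to see what a random password generator does, here is a minimal Python sketch based on the standard `secrets` module. It is an illustration only and is unrelated to how pwgen, Sxipper or KeePass work internally; the default length of 16 and the function name are assumptions.

```python
import secrets
import string

ALPHABET = string.ascii_letters + string.digits + string.punctuation


def generate_password(length=16):
    """Return a random password with at least one character of each class."""
    if length < 4:
        raise ValueError("length must be at least 4 to fit all four classes")
    while True:
        candidate = "".join(secrets.choice(ALPHABET) for _ in range(length))
        # Retry until every class is present, so no fixed position is imposed.
        if (any(c in string.ascii_lowercase for c in candidate)
                and any(c in string.ascii_uppercase for c in candidate)
                and any(c in string.digits for c in candidate)
                and any(c in string.punctuation for c in candidate)):
            return candidate


if __name__ == "__main__":
    print(generate_password())
```

Drawing whole candidates and rejecting the ones that miss a class avoids reintroducing a pattern, such as always putting the digit and the symbol at the end.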