Nilsson took a close look at the accounts that used an email address as the username. Many are webmail addresses (15,000 Hotmail and 2,000 Gmail). There were 6,046 country-specific addresses, including some invalid entries; of those, 5,736 used a ‘.com.br’ address.
“So, almost 95% of the country-specific e-mails are from Brazil (.com.br)!” he wrote. “I think this is probably the result of either a leak of a big Brazilian hacked website, or a Brazil-targeted phishing, combined with 9000 Twitter-spam accounts.” Later, he added, “the e-mail ones seem to originate from the June 2011 leak by Lulzsec.”
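The “almost 95%” figure can be checked directly from the two counts quoted above. The short Python sketch below does that arithmetic and also shows one way a suffix count of this kind might be produced; the helper name `count_with_suffix` and the sample addresses are invented for illustration and are not taken from Nilsson’s analysis.

```python
def count_with_suffix(usernames, suffix):
    """Count email-style usernames whose domain ends with the given suffix."""
    total = 0
    for name in usernames:
        if "@" not in name:
            continue  # skip usernames that are not email addresses
        domain = name.rsplit("@", 1)[1].lower()
        if domain.endswith(suffix):
            total += 1
    return total

# Made-up sample data, for illustration only.
sample = ["ana@example.com.br", "bob@example.com", "rui@mail.com.br", "plainuser"]
sample_count = count_with_suffix(sample, ".com.br")

# Counts quoted in the article: 5736 '.com.br' addresses out of 6046 country-specific ones.
share = 5736 / 6046 * 100
print(f"{share:.1f}% of country-specific addresses end in .com.br")
```

The result rounds to 94.9%, which matches the “almost 95%” in the quote.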
Part of Nilsson’s analysis involved the use of Pipal, a password analyzer developed by DigiNinja (Robin Wood, who works as a senior security engineer at RandomStorm). The purpose of the analyzer is to help security audit companies (such as RandomStorm) develop new approaches to password auditing – which basically involves testing the strength of passwords by trying to crack them. “Quite often,” he explained to Infosecurity, “when you grab passwords and try to crack them you don't manage to get all of them. Running a Pipal analysis on the ones cracked can give ideas on what to try when trying to crack the rest. We also look at the trends and build them into our general cracking strategy for the next audit we do.”
Wood ran his own Pipal test on the dump. “The base words section,” he told Infosecurity, “shows words that passwords are built from.” In this dump, the top ten base words included ‘junior’, ‘brasil’, ‘rafa’ and ‘carlos’. “From these base words here I would have thought that most of the passwords in the list came from a Spanish or Latin American country,” he added, with the inclusion of ‘brasil’ supporting Nilsson’s Brazil-hack conclusion.
Wood also looked at the use of years within the passwords: they peak between 2008 and 2010. “The years are interesting,” he said. “2008-2010 are the most popular, which backs up Nilsson’s assertion that the leak happened a while ago.”
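To give a rough idea of what a base-word and year tally involves, here is a minimal Python sketch. It is not Pipal’s actual implementation: the functions `base_word` and `years_in`, the stripping rule (lowercase, then remove non-letters from both ends, ASCII only) and the sample passwords are all assumptions made for illustration.

```python
import re
from collections import Counter

def base_word(password):
    """Lowercase the password and strip non-letter characters from both ends.

    Simplified, ASCII-only rule assumed for this sketch.
    """
    return re.sub(r"^[^a-z]+|[^a-z]+$", "", password.lower())

def years_in(password):
    """Return standalone four-digit runs that look like years from 1900 to 2099."""
    return re.findall(r"(?<!\d)(?:19|20)\d{2}(?!\d)", password)

# Made-up passwords, for illustration only.
passwords = ["junior2009", "Brasil10", "carlos2010!", "rafa123", "junior2009"]

# Tally base words (ignoring passwords that reduce to an empty string) and years.
base_counts = Counter(w for w in map(base_word, passwords) if w)
year_counts = Counter(y for p in passwords for y in years_in(p))

print(base_counts.most_common(3))
print(year_counts.most_common(3))
```

On a real dump, the most common entries in tallies like these are what would suggest a language, region or time period, as in Wood’s reading of the base words and years.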
Wood believes that his analysis of the strength of passwords shows that Twitter could do better. “Twitter does have a minimum requirement of 6 characters,” he said, “but that isn't really much with the speed of brute forcing or cracking passwords.” It’s almost a personal crusade. “I regularly try to name and shame sites which have bad, or no, password policies. I had a big argument with PayPal and Amex about this as they both cap at about 20 characters. There is no technical reason for this if they are storing their passwords correctly; that is, hashing them not storing in clear text. Hashes all come out at the same length in the end.”
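Wood’s point about hash length is easy to demonstrate. The Python sketch below derives a salted hash with the standard library’s `hashlib.pbkdf2_hmac` for a 6-character and a 500-character password; the choice of PBKDF2 with SHA-256, the iteration count and the helper name `hash_password` are assumptions for illustration, not a description of how any of the sites mentioned store passwords.

```python
import hashlib
import os

def hash_password(password, salt, iterations=100_000):
    """Derive a fixed-length hash from a password of any length (PBKDF2-HMAC-SHA256)."""
    return hashlib.pbkdf2_hmac("sha256", password.encode("utf-8"), salt, iterations)

salt = os.urandom(16)  # random per-password salt

short_hash = hash_password("abc123", salt)   # 6-character password
long_hash = hash_password("x" * 500, salt)   # 500-character password

# Both digests have the same length, regardless of the password's length.
print(len(short_hash), len(long_hash))
```

Because the stored value has the same size either way, a cap of around 20 characters saves no storage space when passwords are hashed.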