
Statistically improving password guessing


ranma

LulzSec has released a large password list, and similar lists were already available before it.

I was wondering whether any scientific research has been conducted on this data. For example, I am thinking of turning the data into an n-gram model. An n-gram model is a statistical model of how often sequences of items occur. A unigram model counts single words or characters, a bigram model counts sequences of two, and so on up to sequences of length n. After tallying up the counts, you can "smooth" them so that sequences missing from the sample do not end up with zero probability, which gives better estimates of the actual data in the world (there are different ways to do this).
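To make the idea concrete, here is a rough sketch of a character-level bigram model with add-one (Laplace) smoothing, which is only one of the possible smoothing methods. The four sample passwords, the `^` and `$` boundary markers, and the function name `bigram_prob` are all made up for illustration; a real experiment would read one of the leaked lists instead.

```python
from collections import Counter

# Hypothetical toy sample; a real experiment would load a leaked list.
passwords = ["password", "pass123", "letmein", "123456"]

START, END = "^", "$"  # assumed boundary markers, not used inside the samples

unigrams = Counter()  # how often each character appears as a left-hand context
bigrams = Counter()   # how often each adjacent character pair appears
for pw in passwords:
    chars = [START] + list(pw) + [END]
    unigrams.update(chars[:-1])
    bigrams.update(zip(chars, chars[1:]))

# Every symbol that can follow a context: the seen characters plus END.
vocab = {c for pw in passwords for c in pw} | {END}

def bigram_prob(prev, cur):
    """Add-one smoothed estimate of P(cur | prev)."""
    return (bigrams[(prev, cur)] + 1) / (unigrams[prev] + len(vocab))
```

With add-one smoothing, a pair that never occurs in the sample still gets a small non-zero probability, while pairs that occur often (such as "p" followed by "a" in this toy sample) get a larger one.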

I am not exactly sure how the info would be used, but it could facilitate password guessing.
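One possible use, sketched here as an assumption rather than an established method, is to score candidate guesses with the model and try the most probable ones first. The helper names `train` and `log_score`, the toy training list, and the candidate list are all hypothetical.

```python
import math
from collections import Counter

def train(passwords):
    """Count character bigrams, using ^ and $ as assumed start/end markers."""
    context, pairs, vocab = Counter(), Counter(), {"$"}
    for pw in passwords:
        chars = ["^"] + list(pw) + ["$"]
        vocab.update(pw)
        context.update(chars[:-1])
        pairs.update(zip(chars, chars[1:]))
    return context, pairs, vocab

def log_score(candidate, model):
    """Sum of add-one smoothed log P(next char | previous char)."""
    context, pairs, vocab = model
    chars = ["^"] + list(candidate) + ["$"]
    return sum(
        math.log((pairs[(a, b)] + 1) / (context[a] + len(vocab)))
        for a, b in zip(chars, chars[1:])
    )

# Hypothetical toy data and guesses, for illustration only.
model = train(["password", "pass123", "letmein", "123456"])
candidates = ["pass", "xqzv", "123"]

# Most probable candidates first.
ranked = sorted(candidates, key=lambda c: log_score(c, model), reverse=True)
```

Note that a plain product of probabilities favors short strings, so a real ranking would probably need some kind of length normalization or a separate model of password length.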

Furthermore, there are machine learning models, such as Boltzmann machines, that can be used to extract patterns from raw data. I was wondering whether any such statistical ideas have been applied to speed up password guessing. As I learn more about these models, I will try to apply them to the password data out there.