samedi 21 janvier 2012

Breaking MD5 hashs

Hello world!

Here is an easy to understand article about breaking md5 hashs. Breaking md5 hashs is a subject really discussed on the Internet. If you are new in the hacking community, the first thing that you learn to do is to break md5 hashs. Actually a lot of developers use the md5 hash function to hash passwords before to store it in a database. So if a hacker is able to compromise a website or a system and he is able to retrieve the hashs, he has to break md5 hashs to recover plaintexts.
To break a md5 hash there are a lot of technics, for example :
  • Dictionary attack
  • Rainbow table attack
  • Brute force attack
For courses I had a challenge where I had to break 5 md5 hashs. So I wanted to write up how I proceeded to break this hashs. In our case, we know that the hashs are randomly generated so it will be difficult to retrieve the plaintext with a dictionary attack. Dictionary attacks are used
when we know that people use common word of the dictionary for their password. We also know that the length of the passwords is between 6 and 10 characters. On the one hand we will see what material I used to break this challenge, on the other hand we will see how I proceeded to recover the plaintexts and then again we will see the good practices to have for a secure password.

Material for the challenge

To brute force this challenge, a simple personal computer was used. Here is the configuration of the computer :
  • Intel Core i5 CPU 760 2.80Ghz 8 threads, 4 CPUs
  • ATI Radeon HD 6870
  • 8.00Go of RAM
So it's not an amazing configuration, it's a personal computer. You can buy this configuration for 600 euros. To solve this challenge I chose to use my CPU
and also my GPU because we know that the GPU is more efficient for brute force. With my configuration I noticed that with just my CPU I was able to test 1 000 000 hashs per second and with my GPU I was able to test 5 000 000 000 md5 hashs per second which is not negligible. We know that passwords are between
6 and 10 characters length. Here is a comparison of performance to compute all passwords of 6 - 7 - 8 - 9 - 10 characters length with a charset of 81 characters using this computer.

So as we can see with a naive brute force and this computer, it's annoying to brute force a password superior or equal of 8 characters length.
And it's also impossible to brute force a password of 10 characters length.

Break MD5 hashs

It exists a lot of password breaker programs ( or in a more ethical way "password recovery" programs), for example John the ripper, Durandal, cuda-md5-breaker,
hashcat and so on... For my part I decided to use oclHashcat-plus because this one supports OpenCL (for my ATI Radeon GPU card), it's available on Windows and Linux, and I chose this program especially because it's easy to create his own heuristic to speed up the computation ( masking attack ). oclHashcat has also
a really good wiki with a lot of documentation.
When you want to break a md5 hash, the first thing to do is to check that this one doesn't exist in the database of an online md5 breaker (for example : In our case, thanks to this online md5 breaker I was able to recover the first hash :

With a brute force on the digit charset, it would have taken less than 1 second with our machine. So this password is really weak, the entropy of the password is :
With oclHashcat-plus we can create our own heuristic easily. What I call heuristic is this :
And you can use it like this :
It's also possible to specify his own charset for each position of the password, which is really convenient to speed up the brute force. According to the document Consumer Password Wost Practices of Imperva, the distribution of the password length is this one :
We also know that the repartition of characters is this one :

According to the document Hacker Intelligence Initiative, Monthly Trend Report 7, there are typical schemes:

It's easy to deduce typical schemes for common passwords and to build good heuristics.
Example of an advanced masking attack using a specified charset :

The entropy of this masking attack is :
 It's possible to do this masking attack in 5 minutes :

We can compare it with a naive brute force attack, the entropy is :
The computation time to do the attack is 4 days :
The time saved is not negligible and that's why when there is a low entropy on passwords, heuristic attacks are really powerfull.
Knowing these statistics, I was able to recover all the passwords really fast. Here is the table summarizing my masking attacks :
Note : the charset used for "?1" is the same charset of the above masking attack example.
So with heuristic attacks we were able to retrieve quickly all the plaintexts.

Good practices

A good password is a password with a good margin of security. What I mean by a good margin of security is even with a super computer ( a super computer
can do ~ 10^16 tests per second ), you need all your life to brute force the password. So to set up a good password we have to fix the minimal
length :
So the minimal length for your password should be 13 characters. The charset is the charset of 81 characters defined previously. But of course you can use
a more complicated charset to increase the security margin. But this is not sufficient, the scheme of you password should be unpredictable, so with
a maximal entropy. A way to generate your password is to take randomly 13 chars in the given charset. It's important to be sure of the security of your
random function.

Example of code to generate passwords :
Here is an example of password generated with this code : "THQ>Y83Lq*.6d". This password is a secure password.

The entropy of the password :
But this one is really difficult to learn and you will write it on a post-it. Doing that your system will be compromised by your
colleagues or the housekeeper. The password should be easy to learn!
An example to generate a good password, with the same entropy, is to take your favorite sentence and to translate it into a password :
You can also mix between uppercase and lowercase, take a letter every 3 letters and so on. It's easy to learn this password, the charset used is :
The entropy of the password :
The entropy of this "easy to learn" password is even superior of our complicated password. We did a lot of suppositions like taking a minimal charset to compute
the entropy, and even with these suppositions this password is better! The scheme of this password is really difficult to predict, that's why it's a good password.
If you have no information about the password it will be really difficult to break it!


To conclude, we saw that it's easy to break weak passwords. A weak password is a password with a small length and with low entropy. We saw that we can use "strategies" to speed up our brute-force, and even passwords of "10" characters length with predictable schemes are breakable. It's just a story of entropy. We also saw what is a good password and how to learn it easily.

2 commentaires:

  1. I am surprised to learn about these three working methods which are used to break md5. This article taught me that its necessary to set a password that is difficult to break.
    electronic signature software

  2. An effective SEO tool that is useful in encoding passwords, credit card numbers and any other sensitive data in regard to.
    md5 online generator