Monday, 17 December 2007

[Going internal] Server resources

The first problem encountered while coding the project was the size of the database.
As you can easily compute, MD5 produces 128-bit hashes, which makes 2^128 different hashes. Storing the hashes alone, without the corresponding plain text, would take 2^128 * 16 bytes = 2^132 bytes = 2^92 terabytes = ~5.0*10^27 terabytes. In other words: just unfeasible (at least for me, actually :))
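The arithmetic above can be checked with a few lines of Python. This is only a back-of-the-envelope sketch: it assumes each digest is stored as 16 raw bytes with no index or overhead, and it uses the binary terabyte (2^40 bytes).

```python
# Back-of-the-envelope storage estimate for every possible MD5 digest.
HASH_BITS = 128                  # MD5 digest size in bits
HASH_BYTES = HASH_BITS // 8      # 16 bytes per stored digest
TERABYTE = 2 ** 40               # binary terabyte (assumption)

total_hashes = 2 ** HASH_BITS
total_bytes = total_hashes * HASH_BYTES
total_terabytes = total_bytes // TERABYTE

# Both totals are exact powers of two, so bit_length() - 1 gives the exponent.
print(f"bytes     = 2^{total_bytes.bit_length() - 1}")      # bytes     = 2^132
print(f"terabytes = 2^{total_terabytes.bit_length() - 1}")  # terabytes = 2^92
print(f"terabytes ~ {total_terabytes:.2e}")                 # terabytes ~ 4.95e+27
```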

But the main purpose of this project is to recover user-entered texts from databases, which would only be composed of printable characters. According to the ASCII table, printable characters go from 0x20 (space) to 0x7e (~); leaving out the space, that is 94 possibilities.

Going further with the length of the passwords, we can guess (or hope) that the standard password length is about 8 characters, so the size of this new scope of hashes is 94^8, which is about 2^52. That's much more accessible, but...
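Here is a small sketch of that estimate. It assumes the alphabet is the printable ASCII set without the space and that only passwords of exactly 8 characters are counted; the raw size at the end assumes 16 bytes per hash plus 8 bytes per plain text, with no index or other overhead.

```python
import math
import string

# Printable ASCII without the space: letters, digits and punctuation.
alphabet = string.ascii_letters + string.digits + string.punctuation
LENGTH = 8  # assumed typical password length

keyspace = len(alphabet) ** LENGTH
print(len(alphabet))                    # 94
print(keyspace)                         # 6095689385410816
print(round(math.log2(keyspace), 1))    # 52.4

# Rough raw size of a full table: one 16-byte hash plus the plain text per row.
raw_bytes = keyspace * (16 + LENGTH)
print(f"{raw_bytes / 2 ** 40:.0f} TB before any index or overhead")
```

So the reduced scope is no longer astronomical, but a complete table would still be far too large for a single small server.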

After the database grew to 3 GB, the small web server I used to host this project began to swap and overload, so I quickly migrated it to a dedicated server hosted at home, reachable through my ADSL line. That way I can add as much RAM and disk space as I want almost for free, and the bandwidth requirement is actually not that large.

New external databases

Three more external databases have just been added.
  • The project now looks up 13 external databases plus its own internal database, which count more than 80'000'000 hashes!

You can retry the hashes that were not found in a previous search, or submit them with your email address to the project and be contacted once they are known.

Sunday, 16 December 2007

Aggregating existing MD5 databases

The starting point of the project, begun in 2005 and dedicated to MD5 hashes only, was to aggregate other existing databases. That way a single query searches several databases, which gives a better chance of recovering a plain text.
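To illustrate the idea, here is a minimal sketch of an aggregated lookup. It is not the project's actual code: the names `Backend`, `make_dict_backend` and `aggregate_lookup` are hypothetical, and the remote databases are replaced by small in-memory tables so the example runs without any network access.

```python
import hashlib
from typing import Callable, Iterable, Optional

# A backend takes a hex digest and returns the plain text, or None if unknown.
Backend = Callable[[str], Optional[str]]

def make_dict_backend(words: Iterable[str]) -> Backend:
    """Build a toy in-memory backend standing in for one external database."""
    table = {hashlib.md5(w.encode("utf-8")).hexdigest(): w for w in words}
    return table.get

def aggregate_lookup(digest: str, backends: Iterable[Backend]) -> Optional[str]:
    """Ask each backend in turn and return the first verified plain text."""
    digest = digest.lower()
    for lookup in backends:
        try:
            plain = lookup(digest)
        except Exception:
            continue  # one failing source must not break the whole query
        # Do not trust a source blindly: recompute the hash before answering.
        if plain is not None and hashlib.md5(plain.encode("utf-8")).hexdigest() == digest:
            return plain
    return None

backends = [
    make_dict_backend(["admin", "letmein"]),
    make_dict_backend(["password", "secret"]),
]
target = hashlib.md5(b"password").hexdigest()
print(aggregate_lookup(target, backends))    # password
print(aggregate_lookup("0" * 32, backends))  # None
```

Recomputing the hash of every answer is cheap and protects the aggregator against a source returning a wrong plain text.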

So I began browsing the web to find other similar projects. The external databases the project aggregates are given here below :

AuthSecu has a really impressive md5 database, so my goal is to build an even bigger one ;).

The next databases planned for aggregation in the near future are -, and

If you know another database which "knows" millions of hashes, feel free to drop me an email; I would be pleased to add it to my project.

What is a reverse hash lookup database ?

A hash function is a one-way function, meaning that if you know only the output, it is hard to recover the original input.
Still, every input has an output. If you can store every (input, output) pair in a database, you can then search by output to recover the input. This is what is called a reverse lookup database.
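As a toy illustration of this principle, the sketch below stores a few (hash, plain text) pairs in an in-memory SQLite table and searches it by hash. The table name `pairs` and the helpers `store` and `reverse` are made up for this example and say nothing about how the real database is organised.

```python
import hashlib
import sqlite3

# In-memory database holding (hash, plain text) pairs, indexed by hash.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pairs (hash TEXT PRIMARY KEY, plain TEXT NOT NULL)")

def store(plain: str) -> None:
    """Compute the MD5 of a plain text and store the pair."""
    digest = hashlib.md5(plain.encode("utf-8")).hexdigest()
    conn.execute(
        "INSERT OR IGNORE INTO pairs (hash, plain) VALUES (?, ?)",
        (digest, plain),
    )

def reverse(digest: str):
    """Return a plain text whose MD5 is `digest`, or None if it is unknown."""
    row = conn.execute(
        "SELECT plain FROM pairs WHERE hash = ?", (digest.lower(),)
    ).fetchone()
    return row[0] if row else None

for word in ("hello", "password", "123456"):
    store(word)

print(reverse("5d41402abc4b2a76b9719d911017c592"))  # hello
print(reverse("f" * 32))                            # None
```

The lookup can only ever return inputs that were hashed and stored beforehand; nothing is actually inverted.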

The goal of the RMD5DB project (Reverse MD5 DataBase) is to store as many (plain text, hash) pairs as possible in a database, in order to query for a particular hash and recover (one of) its possible plain texts. The hash functions targeted are mainly md5 and sha1.

The project is available at the following address : Reverse Hash DataBase