MD5 (Message-Digest Algorithm 5) is an algorithm widely used as a cryptographic hash function. It takes input data of arbitrary length and produces a 128-bit (16-byte) hash value. MD5 is no longer recommended for security-sensitive applications because an attacker can deliberately construct messages that collide. Without an intentional attack, however, MD5 is extremely unlikely to produce a collision, and it is fast to compute (test code is given later in this post). It therefore remains useful as a hash function where security is not a concern but accidental collisions must be avoided.
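To put "extremely unlikely" into numbers, the birthday bound gives an upper estimate of the chance that any two of n inputs share a digest: the number of input pairs, n(n-1)/2, divided by 2^128. The sketch below is illustrative only and is not part of the test code in this post. The name collision_bound is made up here, and the estimate assumes the hash behaves like a uniformly random function, which is reasonable only for inputs that were not crafted by an attacker.

```c
/* Birthday-bound estimate: an upper bound on the probability that at
 * least two of n inputs collide under an ideal hash with `bits` output
 * bits. Hypothetical helper for illustration only. */
static double collision_bound(double n, int bits)
{
    double p = n * (n - 1.0) / 2.0;   /* number of input pairs */
    int i;

    for (i = 0; i < bits; ++i)
        p /= 2.0;                     /* divide by 2^bits */
    return p > 1.0 ? 1.0 : p;         /* a probability never exceeds 1 */
}
```

For example, collision_bound(1e9, 128) evaluates to roughly 1.5e-21, so even a billion non-adversarial URLs are very unlikely to produce a single accidental collision.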

Reference 2 links to many implementations of MD5 in various programming languages. This post takes L. Peter Deutsch's implementation and adds a test function to measure how fast MD5 runs.

The test uses a list of URL strings stored in a file. It reads the strings from the file and computes the MD5 of each URL. The test function is shown below:

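#include <stdio.h>
#include <string.h>
#include <sys/time.h>
#include "md5.h"

/* Hash up to UNIQUE_URLS lines from ./urllist.txt, READ_FILE_TIMES times
 * over, and print the number of hashes and the elapsed time.
 * Returns 0 on success, -1 if the file cannot be opened. */
static int do_n_test(void) {
    FILE *testFile;
    int UNIQUE_URLS = 1000;
    char url[UNIQUE_URLS][4096];   /* about 4 MB on the stack */
    int numOfUrlsTested = 0, numLoaded;
    int i = 0, j, READ_FILE_TIMES = 1;
    struct timeval stTime, edTime;
    long elapsedUsec;

    testFile = fopen("./urllist.txt", "r");
    if (testFile == NULL) {
        printf("cannot open test file\n");
        return -1;
    }
    /* Stop early if the file has fewer than UNIQUE_URLS lines.
     * fgets keeps the trailing newline, so it is hashed as well. */
    for (; i < UNIQUE_URLS; ++i) {
        if (fgets(url[i], 4096, testFile) == NULL)
            break;
    }
    numLoaded = i;
    fclose(testFile);

    gettimeofday(&stTime, NULL);
    for (i = 0; i < READ_FILE_TIMES; ++i) {
        for (j = 0; j < numLoaded; ++j) {
            md5_state_t state;
            md5_byte_t digest[16];
            char hex_output[16*2 + 1];
            int di;

            md5_init(&state);
            md5_append(&state, (const md5_byte_t *)url[j], (int)strlen(url[j]));
            md5_finish(&state, digest);
            /* Uncomment to print each digest (this slows the timing loop):
            for (di = 0; di < 16; ++di)
                sprintf(hex_output + di * 2, "%02x", digest[di]);
            printf("%d: %s\n", numOfUrlsTested, hex_output);*/
            ++numOfUrlsTested;
        }
    }
    gettimeofday(&edTime, NULL);

    /* Combine seconds and microseconds before subtracting; the difference
     * of tv_usec alone can be negative. */
    elapsedUsec = (long)(edTime.tv_sec - stTime.tv_sec) * 1000000L
                + (long)(edTime.tv_usec - stTime.tv_usec);
    printf("%d: %ld us\n", numOfUrlsTested, elapsedUsec);
    return 0;
}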
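The commented-out loop in the test function turns the 16-byte digest into printable hex. As a rough sketch, the same conversion can be pulled into a small helper. The name digest_to_hex is hypothetical and not part of md5.h, and the helper takes plain unsigned char on the assumption that md5_byte_t is a typedef for it.

```c
#include <stdio.h>

/* Convert a 16-byte MD5 digest into a 32-character lowercase hex string.
 * `out` must have room for 33 bytes (32 hex digits plus the terminator).
 * digest_to_hex is a hypothetical name, not part of md5.h. */
static void digest_to_hex(const unsigned char digest[16], char out[33])
{
    int i;

    /* Each call writes two hex digits and a terminator, which the next
     * call overwrites; the last terminator lands in out[32]. */
    for (i = 0; i < 16; ++i)
        snprintf(out + i * 2, 3, "%02x", digest[i]);
}
```

It would be called right after md5_finish, passing the digest array and a 33-byte buffer, and the buffer can then be printed with %s.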

The test file contains more than 3000 URLs, but loading all of them at once may exceed the available stack memory, since each URL slot takes 4096 bytes. Set the “UNIQUE_URLS” variable to adjust the number of unique URLs to load, and the “READ_FILE_TIMES” variable to set how many times the loaded list is hashed. The total number of MD5 computations is “UNIQUE_URLS” x “READ_FILE_TIMES”.
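One way around the stack limit is to keep the URLs on the heap instead. The following is a minimal sketch of that idea, not the code used for the measurements in this post. The names load_lines and MAX_URL_LEN are hypothetical, and the 4096-byte line limit is carried over from the test function.

```c
#include <stdio.h>
#include <stdlib.h>
#include <string.h>

#define MAX_URL_LEN 4096  /* same per-line limit as the test function */

/* Read up to max_lines lines from fp into heap-allocated strings.
 * Returns an array holding *count strings, or NULL if the array itself
 * cannot be allocated. The caller frees each string, then the array.
 * load_lines is a hypothetical helper, not part of the original code. */
static char **load_lines(FILE *fp, size_t max_lines, size_t *count)
{
    char buf[MAX_URL_LEN];
    char **lines = calloc(max_lines, sizeof *lines);
    size_t n = 0;

    *count = 0;
    if (lines == NULL)
        return NULL;
    while (n < max_lines && fgets(buf, sizeof buf, fp) != NULL) {
        size_t len = strlen(buf);

        lines[n] = malloc(len + 1);
        if (lines[n] == NULL)
            break;                      /* keep what was loaded so far */
        memcpy(lines[n], buf, len + 1); /* copy including the terminator */
        ++n;
    }
    *count = n;
    return lines;
}
```

Only one 4096-byte buffer lives on the stack here, and each stored URL takes just the memory it needs, so the whole file can be loaded.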

To compile the code, use the command,

gcc -o test md5.c md5main.c -lm

To run test,

./test -test

On my Ubuntu machine (quad-core Intel i5 at 1197 MHz, 4 GB of memory), one MD5 computation takes about 4 microseconds.
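The per-hash figure is the total elapsed time divided by the number of hashes. A sketch of that arithmetic follows, with elapsed_usec and usec_per_hash as hypothetical helper names. The seconds and microseconds fields of the two gettimeofday samples are combined before subtracting, because the microsecond field on its own wraps around every second.

```c
#include <sys/time.h>

/* Elapsed wall-clock time between two gettimeofday() samples, in
 * microseconds. Both fields are combined so that a run crossing a
 * second boundary still yields a correct, non-negative result. */
static long long elapsed_usec(const struct timeval *start,
                              const struct timeval *end)
{
    return (long long)(end->tv_sec - start->tv_sec) * 1000000LL
         + (long long)(end->tv_usec - start->tv_usec);
}

/* Average cost of one hash in microseconds; 0 if nothing was hashed. */
static double usec_per_hash(long long total_usec, long long hashes)
{
    return hashes > 0 ? (double)total_usec / (double)hashes : 0.0;
}
```

With a single pass over 1000 URLs, a total of about 4000 microseconds corresponds to the 4 microseconds per hash quoted above. Raising “READ_FILE_TIMES” makes the average less sensitive to timer resolution.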

You can download the entire code from here. Note that the main MD5 computation code is from reference 2 by L. Peter Deutsch.

References:
1. MD5 Wikipedia Page: http://en.wikipedia.org/wiki/Md5
2. MD5 Homepage (unofficial): http://userpages.umbc.edu/~mabzug1/cs/md5/md5.html
3. Hash functions: An empirical comparison: http://www.strchr.com/hash_functions

 
