The Usage of memset

memset is declared in <string.h> as follows:

void* memset(void* s, int c, size_t n);

The function sets each of the first n bytes of the memory pointed to by s to c and returns s. Note that although the second parameter has type int, it is converted to unsigned char before the memory is written, so only a single byte value is stored. Therefore, c should be in the range [0, 255].

memset works on bytes. If you use it to fill an array of ints or doubles with a nonzero value, in most cases it will not produce the value you expect. memset works in the following situations (there may be more):

  • initialize char, int, or float arrays to all 0s (for float, this assumes the usual IEEE 754 representation, where all-zero bytes mean 0.0).
  • initialize a char array to any arbitrary value.
  • initialize an int array to a value whose 4 bytes are all the same (e.g. 0x01010101 = 16843009).

The last case is seldom needed in practice.

memset and C Arrays

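If the array has a fixed size (for example, int a[10][10]), its memory is allocated on the stack when it is a local variable, or in static storage when it is global or declared static. Either way, the memory is contiguous for both 1-D and multi-dimensional arrays.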

If the array is dynamic, the memory is allocated on the heap. If you allocate the array with malloc, malloc only guarantees that the memory returned by each individual call is contiguous. So if the array is 1-D, its memory is contiguous; otherwise, contiguity is not guaranteed (since you are calling malloc multiple times).

memset works on a single block of contiguous memory. For a fixed-size array, say int a[10][10], you can simply use memset(a, 0, sizeof(a[0][0])*10*10), or equivalently memset(a, 0, sizeof(a)), to initialize all elements to 0. For a dynamic array, the example below gives the idea:

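#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main() {
   int **a;
   int i, j;

   //allocate 10 row pointers, then 10 ints for each row
   a = (int **)malloc(sizeof(int *)*10);
   if (a == NULL) {
       return 1;
   }
   for (i = 0; i < 10; ++i) {
       a[i] = (int *)malloc(sizeof(int)*10);
       if (a[i] == NULL) {
           return 1;
       }
   }

   //memset to all 0s, one call per row
   for (i = 0; i < 10; ++i) {
       memset(a[i], 0, sizeof(a[i][0])*10);
   }

   //print out the value
   for (i = 0; i < 10; ++i) {
       for (j = 0; j < 10; ++j) {
           printf("%d ", a[i][j]);
       }
       printf("\n");
   }

   //release the rows first, then the row pointers
   for (i = 0; i < 10; ++i) {
       free(a[i]);
   }
   free(a);
   return 0;
}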

Essentially, one memset call is needed for each malloc'ed block that holds data.

Use memset instead of Loops

memset is usually optimized by compilers and C libraries to run faster than an equivalent hand-written loop. Below is a test program that compares the performance of loop initialization and memset.

/*
this is a small test program that tests which method of initializing an array is faster: memset or loops
*/
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>   //gettimeofday and struct timeval

#define D1 50
#define D2 90
#define D3 90

unsigned char (*pData)[D1][D2][D3];
unsigned char data1[D1][D2][D3];
unsigned char data2[D1][D2][D3];

//print the elapsed time as seconds:microseconds
void printElapsed(struct timeval st, struct timeval ed) {
   long sec = (long)(ed.tv_sec - st.tv_sec);
   long usec = (long)(ed.tv_usec - st.tv_usec);
   if (usec < 0) {    //borrow one second
       sec -= 1;
       usec += 1000000;
   }
   printf("%ld:%ld\n", sec, usec);
}

int main() {
   struct timeval stTime, edTime;
   int LOOP_COUNT = 100;
   int cnt, i, j, k;

   memset(data1, 0x01, D1*D2*D3);
   memset(data2, 0x00, D1*D2*D3);

   //using memset
   pData = &data1;
   gettimeofday(&stTime, NULL);
   /*for (cnt = 0; cnt < LOOP_COUNT; ++cnt) {
       for (i = 0; i < D1; ++i) {
           for (j = 0; j < D2; ++j) {
               memset(&(*pData)[i][j][0], 1, sizeof(unsigned char)*D3);
           }
       }
   }*/
   for (cnt = 0; cnt < LOOP_COUNT; ++cnt) {
       memset(&(*pData)[0][0][0], 1, sizeof(unsigned char)*D1*D2*D3);
   }
   gettimeofday(&edTime, NULL);
   printElapsed(stTime, edTime);

   //using loop
   pData = &data2;
   gettimeofday(&stTime, NULL);
   for (cnt = 0; cnt < LOOP_COUNT; ++cnt) {
       for (i = 0; i < D1; ++i) {
           for (j = 0; j < D2; ++j) {
               for (k = 0; k < D3; ++k) {
                   (*pData)[i][j][k] = 1;
               }
           }
       }
   }
   gettimeofday(&edTime, NULL);
   printElapsed(stTime, edTime);

   //(*pData)[0][0][0] = 88;    //uncomment to see it fail the check below

   //to check two methods get same results
   for (i = 0; i < D1; ++i) {
       for (j = 0; j < D2; ++j) {
           for (k = 0; k < D3; ++k) {
               if (data1[i][j][k] != data2[i][j][k]) {
                   printf("it doesn't work\n");
                   return 1;
               }
           }
       }
   }
   return 0;
}

Compiling the code with “gcc -o test test.c” and running it with “./test” on a Lenovo laptop (dual-core 2.0 GHz, 3 GB memory, Ubuntu 10.04) gives the results below (seconds:microseconds, memset first):

0:6470
0:121570

memset is almost 19 times as fast as the loop method (121570 / 6470 ≈ 18.8). Keep in mind that the program was compiled without optimization flags, so the gap may be smaller in an optimized build. Note also that the commented-out code contains a test that uses memset for only the innermost loop; it still beats the loop method. This means that even for a dynamically allocated array, where memset can only replace the innermost loop, it can still improve performance over the all-loops method.

References:

Arrays and Pointers: http://www.lysator.liu.se/c/c-faq/c-2.html

 
