Use memset to Avoid Loops in C

The Usage of memset

memset has the following definition,

void* memset(void* s, int c, size_t n);

The method set each of the first n bytes of memory pointed by s to c and return s. Note that although the second parameter has type of int, it’s converted to unsigned char before the memory operation is carried out. Therefore, c has to be in the range of [0, 255].

memset works on bytes. If you use memset for array of ints, or doubles, most of the case it won’t work. memset works on the following situations (there may be more cases),

  • initialize char, int, float arrays to 0s.
  • initialize char to any arbitrary value.
  • initialize int to values with same byte value for all 4 bytes. (e.g. 0x01010101 = 16843009)

But we seldom encounter this case.

memset and C Arrays

If the array is static, the memory is allocated at stack. The memory is continuous for both 1-D and multi-dimensional arrays.

If the array is dynamic, the memory is allocated at heap. If you’re allocating array memory using malloc, malloc can only assue the memory allocated on each call is continous. So if the array allocated is 1-D, the memory is continuous, otherwise, it’s not guaranteed (since you’re calling malloc multiple times).

memset works on a block of continuous memory. For static array, say int a[10][10], you can simply use memset(a, 0, sizeof(a[0][0])*10*10) to initialize all elements to 0. For dynamic array, the example below gives the idea,

#include <stdio.h>

#include <string.h>

#include <stdlib.h>

int main() {

   int **a;

   int i, j;

   a = (int **)malloc(sizeof(int *)*10);

   for (i = 0; i < 10; ++i) {

       a[i] = (int*)malloc(sizeof(int)*10);

   }

   //memset to all 0s

   for (i = 0; i < 10; ++i) {

       memset(a[i], 0, sizeof(a[i][0])*10);

   }

   //print out the value

   for (i = 0; i < 10; ++i) {

       for (j = 0; j < 10; ++j) {

           printf("%d ", a[i][j]);

       }

       printf("n");

   }

   return 0;

}

Essentially, one memset is needed for one malloc.

Use memset instead of Loops

memset is usually optimized by compilers to produce more efficient binary code than loops. Below is a testing code to compare the performance of loop initialization and memset.

/*

this is a small test program tests which method to get the file size is faster

*/

#include <stdio.h>

#include <stdlib.h>

#include <string.h>

#define D1 50

#define D2 90

#define D3 90

unsigned char (*pData)[D1][D2][D3];

unsigned char data1[D1][D2][D3];

unsigned char data2[D1][D2][D3];

int main() {

   struct timeval stTime, edTime;

   int LOOP_COUNT = 100;

   int cnt, i, j, k;

   memset(data1, 0x01, D1*D2*D3);

   memset(data2, 0x00, D1*D2*D3);

   pData = &data1;

   //using memset 

   gettimeofday(&stTime, NULL);

   /*for (cnt = 0; cnt < LOOP_COUNT; ++cnt) {

       for (i = 0; i < D1; ++i) {

           for (j = 0; j < D2; ++j) {

       memset(&(*pData)[i][j][0], 1, sizeof(unsigned char)*D3);

           }

       }

   }*/

   for (cnt = 0; cnt < LOOP_COUNT; ++cnt) {

       memset(&(*pData[0][0][0]), 1, sizeof(unsigned char)*D1*D2*D2);

   }

   gettimeofday(&edTime, NULL);

   printf("%u:%un", (unsigned int)(edTime.tv_sec - stTime.tv_sec), (unsigned int)(edTime.tv_usec - stTime.tv_usec));

   //using loop

   pData = &data2;

   gettimeofday(&stTime, NULL);

   for (cnt = 0; cnt < LOOP_COUNT; ++cnt) {

       for (i = 0; i < D1; ++i) {

           for (j = 0; j < D2; ++j) {

       for (k = 0; k < D3; ++k) {

           (*pData)[i][j][k] = 1;

       }

           }

       }

   }

   gettimeofday(&edTime, NULL);

   printf("%u:%un", (unsigned int)(edTime.tv_sec - stTime.tv_sec), (unsigned int)(edTime.tv_usec - stTime.tv_usec));

   //(*pData)[_stFrame][_edH][_edW] = 88;    //uncomment see it fails the check below

   //to check two methods get same results

   for (i = 0; i < D1; ++i) {

       for (j = 0; j < D2; ++j) {

   for (k = 0; k < D3; ++k) {

               if (data1[i][j][k] != data2[i][j][k]) {

                   printf("it doesn't workn");

                   return;

               }

           }

       }

   }

}

Compile the code with “gcc -o test test.c” and run the code with “./test” on a Levono Laptop Dual Core 2.0 GHz, 3G memory Ubuntu 10.04 system gives the results below,

0:6470
0:121570

memset is almost 20 times as fast as the loop method. Note that the commented code contains a test for doing memset for only the inner most loop, it actually still beats the loop method. This means even for dynamic allocated array which we can only use memset for inner most loop, it can still improve the performance over all loops method.

References:

Arrays and Pointers: http://www.lysator.liu.se/c/c-faq/c-2.html

Simple TCP Socket Client and Server Communication in C Under Linux

This post doesn’t provide details about how Linux socket works, its design etc. It mainly for providing source code of simple TCP socket client and server in C. I’m writing this because I found myself need simple TCP client and server for testing from time to time. 🙂
1. The TCP Server Code

You can refer to TCP Server code below (save it as tcpserver.c),

#include <stdio.h>

#include <stdlib.h>

#include <string.h>

#include <unistd.h>

#include <fcntl.h>

#include <sys/types.h> 

#include <sys/socket.h>

#include <netinet/in.h>

 

void nonblock(int sockfd)

{

    int opts;

    opts = fcntl(sockfd, F_GETFL);

    if(opts < 0)

    {

        fprintf(stderr, "fcntl(F_GETFL) failedn");

    }

    opts = (opts | O_NONBLOCK);

    if(fcntl(sockfd, F_SETFL, opts) < 0) 

    {

        fprintf(stderr, "fcntl(F_SETFL) failedn");

    }

}

 

int main(int argc, char *argv[])

{

     int BUFLEN = 2000;

     int sockfd, newsockfd, portno;

     socklen_t clilen;

     char buffer[BUFLEN];

     struct sockaddr_in serv_addr, cli_addr;

     int n, i;

     int one = 1;

 

     if (argc < 2) {

         fprintf(stderr,"please specify a port numbern");

         exit(1);

     }

     sockfd = socket(AF_INET, SOCK_STREAM, 0);

     if (sockfd < 0) {

        perror("ERROR create socket");

        exit(1);

     }

     setsockopt(sockfd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);    //allow reuse of port

     //bind to a local address

     bzero((char *) &serv_addr, sizeof(serv_addr));

     portno = atoi(argv[1]);

     serv_addr.sin_family = AF_INET;

     serv_addr.sin_addr.s_addr = INADDR_ANY;

     serv_addr.sin_port = htons(portno);

     if (bind(sockfd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) {

        perror("ERROR on bind");

        exit(1);

     }

     //listen marks the socket as passive socket listening to incoming connections, 

     //it allows max 5 backlog connections: backlog connections are pending in queue

     //if pending connections are more than 5, later request may be ignored

     listen(sockfd,5);

     //accept incoming connections

     clilen = sizeof(cli_addr);

     newsockfd = accept(sockfd, (struct sockaddr *) &cli_addr, &clilen);

     //nonblock(newsockfd);        //if we want to set the socket as nonblock, we can uncomment this

     if (newsockfd < 0) {

        perror("ERROR on accept");

        exit(1);

     }

     printf("connection acceptedn");

     for (i = 0; i < 5; ++i) {

         bzero(buffer,BUFLEN);

         n = read(newsockfd,buffer,BUFLEN);

         if (n < 0) {

            perror("ERROR read from socket");

         }

         printf("received: %s",buffer); 

         n = write(newsockfd, buffer, n);

         printf("sent: %s", buffer);

         if (n < 0) {

            perror("ERROR write to socket");

         }

     }

     close(newsockfd);

     close(sockfd);

     return 0; 

}

The code first create a socket of SOCK_STREAM type in AF_INET domain. SOCK_STREAM corresponds to TCP, and AF_INET refers to IPv4.

It calls setsockopt to make the socket resuable. For example, your server open up a socket at a port number, and then it exits. Without this line of code, sometimes you may not bind the socket to same port.

The program then binds the socket to a local IP address with a specific port number. listen(sockfd, 5) will set the socket as passive socket listening to incoming connections. 5 means the maximum number of backlog is 5. Backlog connections are pending connect request in queue. If pending request are more than 5, later request may be ignored.

The code then calls accept method to accept incoming connections. This is a block call, and it returns when a new connect request is received, or an error occurred.

In the for loop, the tcp server listens to incoming packets and echo the message back. Note that the nonblock(newsockfd) has been commented out. You can uncomment it to enable nonblock read and write, but you may want to modify the read and write part to make it work properly.

2. The TCP Client Code
The TCP client code is as below (save it as tcpclient.c),

#include <stdio.h>

#include <stdlib.h>

#include <unistd.h>

#include <string.h>

#include <fcntl.h>

#include <sys/types.h>

#include <sys/socket.h>

#include <netinet/in.h>

#include <netdb.h> 

 

void nonblock(int sockfd)

{

    int opts;

    opts = fcntl(sockfd, F_GETFL);

    if(opts < 0)

    {

        fprintf(stderr, "fcntl(F_GETFL) failedn");

    }

    opts = (opts | O_NONBLOCK);

    if(fcntl(sockfd, F_SETFL, opts) < 0) 

    {

        fprintf(stderr, "fcntl(F_SETFL) failedn");

    }

}

 

int main(int argc, char *argv[])

{

    int BUFLEN = 2000;

    int sockfd, portno, n;

    struct sockaddr_in serv_addr;

    struct hostent *server;

    int i;

 

    char buffer[BUFLEN];

    if (argc < 3) {

       fprintf(stderr,"usage: %s hostname_or_ip portn", argv[0]);

       exit(0);

    }

    portno = atoi(argv[2]);

    sockfd = socket(AF_INET, SOCK_STREAM, 0);

    if (sockfd < 0) {

        perror("ERROR creating socket");

        exit(1);

    }

    //get the address info by either host name or IP address

    server = gethostbyname(argv[1]);

    if (server == NULL) {

        fprintf(stderr,"ERROR, no such hostn");

        exit(1);

    }

    bzero((char *) &serv_addr, sizeof(serv_addr));

    serv_addr.sin_family = AF_INET;

    bcopy((char *)server->h_addr, (char *)&serv_addr.sin_addr.s_addr, server->h_length);

    serv_addr.sin_port = htons(portno);

    if (connect(sockfd,(struct sockaddr *) &serv_addr,sizeof(serv_addr)) < 0)  {

        perror("ERROR connecting");

        exit(1);

    }

    printf("connection establishedn");

    //nonblock(sockfd);    //uncomment this line if we want to make the socket non-block

    for (i = 0; i < 5; ++i) {

        printf("Please enter the message: ");

        bzero(buffer,BUFLEN);

        fgets(buffer,BUFLEN,stdin);

        n = write(sockfd,buffer,strlen(buffer));

        printf("sent: %s", buffer);

        if (n < 0) {

             perror("ERROR writing to socket");

        }

        bzero(buffer,BUFLEN);

        n = read(sockfd,buffer,BUFLEN);

        if (n < 0) {

             perror("ERROR reading from socket");

        }

        printf("received: %s",buffer);

    }

    close(sockfd);

    return 0;

}

The code creates a TCP socket. It then tries to connect to a server address with specified IP and port number. The connect is a block call, it returns when the connection is established, which means it receives the ACK from tcp server, which is third packet of TCP 3-way handshake (SYN, ACK+SYN, ACK).

In the loop the tcp client sends and receives messages. Also the nonblock code has been disabled.

3. Compile and Run
To compile the code, use the commands below,

gcc -o client tcpclient.c
gcc -o server tcpserver.c

A sample run could be,

./server 12345 (at one terminal)
./client localhost 12345 (at another terminal)

Understand Call Stack–With Examples in C

Stack is a data structure that used widely in computer programming. The implementation of stack is simple, and its FILO (First-In-Last-Out) rule with PUSH/POP/TOP operations are also straight forward.

This article focuses on a special stack used in C programming runtime memory environment, called Call Stack.

Call Stack stores the information about function calling. These information (including input parameters, return address, previous stack frame pointer, and local variables) are called activation record or stack frame.

The stack frame is essential for the program execution to switch to a called function (callee) context and restore to the caller function context after the callee is finished execution. I’ll illustrate with a simple Hello World C program as below,

#include <stdio.h>

 

void foo(int x) {

    char str[12] = "hello world";

    printf("%s:%dn", str, x);

}

 

int main() {

    foo(1);

    return 0;

}

C and other high-level programming languages usually hides call stack from programmers. In order to understand what happens underneath, we’ll compile the code into assembly language with the following command,

gcc –S test.c

The –S option will stop the build process (compile->assemble->link) after the compilation stage, and output a test.s file containing the assembly code. In my machine, the assembly code looks like below,

#The assembly code for test.c

    .file    "test.c"        #indicate the source file name generates this assembly file

    .section    .rodata     #read only data

.LC0:

    .string    "%s:%dn"

    .text                   #this indicates the code section

.globl foo                  #indicate foo is global label, can be seen by other parts of the program

    .type    foo, @function

foo:

    pushl    %ebp            #push ebp to stack => save stack frame pointer for main method context

    movl    %esp, %ebp      #save current stack pointer to ebp

    subl    $40, %esp        #decrease the stack pointer by 40=>as stack in Intel processor grows 

                            #from higher address to lower address, this reverse 40 bytes memory

    movl    %gs:20, %eax    #these and the next two lines of code are probably for memory check

    movl    %eax, -12(%ebp) #(canary words??)i'm not 100% sure

    xorl    %eax, %eax      #

    movl    $1819043176, -24(%ebp)  # =0x6C6C6568 => 68 65 6C 6C  => h e l l

    movl    $1870078063, -20(%ebp)  # =0x6F77206F => 6F 20 77 6F  => o   w o 

    movl    $6581362, -16(%ebp)     # =0x00646C72 => 72 6C 64 00  => r l d NULL

    movl    $.LC0, %eax     #move address of "%s:%dn" to eax register

    movl    8(%ebp), %edx   #move input parameter x to edx

    movl    %edx, 8(%esp)   #move input parameter x to certain address at stack

    leal    -24(%ebp), %edx #load address of "hello world" to edx register

    movl    %edx, 4(%esp)   #copy address of "hello world" to certain address at stack (right before x)

    movl    %eax, (%esp)    #copy address of "%s:%dn" to memory pointed by stack pointer

    call    printf          #call printf pre-defined routine, this will save the return address to stack

                            #after this pre-defined routine return, the return address should be popped 

                            #out from stack. The return address should be the address for next  

                            #instruction, which is the address for next line

    movl    -12(%ebp), %eax #this three lines (current + next 2 lines) are also probably for memory 

    xorl    %gs:20, %eax    #stack overflow check. again not 100% sure???

    je    .L3                 #stack check and if pass, call leave and ret

    call    __stack_chk_fail#call __stack_chk_fail in case stack check failed        

.L3:

    leave                 #pop local variable str, also pop saved ebp value back to ebp

    ret                   #return control back to main routine by popping the saved instruction 

                          #pointer (the address next instruction) from stack

    .size    foo, .-foo

.globl main                 #it tells the assembler that "main" is a globl label, allows other parts see it

    .type    main, @function

main:                     #main method is seen as a subroute for the startup code

    pushl    %ebp          #save previous frame stack (stack frame of the start up code) 

                          #pointer to call stack

    movl    %esp, %ebp    #move stack pointer to ebp register, now ebp stores the base address 

                          #for referencing the local variables declared for main routine

    andl    $-16, %esp    #esp & 0xFFFFFFF0 (=-16 in 2's complement form), this operation reset 

                          #the last 4 bits of esp to 0, makes sure esp is dividable by 16.  

                          #This align the stack with next lowest 16-byte boundary

    subl    $16, %esp     #reserve 16 bytes at the stack

    movl    $1, (%esp)    #move 1 to the memory location pointed by stack pointer, which is the 

                          #input parameter for subroutine foo()

    call    foo           #now call foo subroutine, it will save the address of the next 

                          #instruction(next line) to stack, which is the return address of foo subroutine. 

                          #upon return of the foo function, pops input parameters for foo, which is value 1

    movl    $0, %eax      #put the return value 0 to eax register

    leave                 #no local variables decleared, pop saved ebp value back to ebp

    ret                   #returns control to start up code by popping the saved instruction pointer 

                          #(the address for next instruction) from stack

    .size    main, .-main

    .ident    "GCC: (Ubuntu 4.4.3-4ubuntu5) 4.4.3"

    .section    .note.GNU-stack,"",@progbits

Note: if you find the program difficult to follow, skip the code and pictures below. Read Things Occurred at Function Call part and Return first and refer back to the assembly code.

The assembly code has detailed comments for each step. The overall picture for the Call Stack of the program can be illustrated as 2 diagrams,

push

Figure 1. When Stack Frame Info are Pushed

pop

Figure 2. When Stack Frame are Popped

Things Occurred at Function Call

To summarize the analysis of the code above.

1. What a Stack Frame/Activation Record Looks Like

The figure below illustrates what a stack frame looks like for a function/method/routine,

stackframe

Figure 3. Stack Frame/Activation Record for a Function

2. What Happens at a Function Call

Upon calling, the calling function will,

Save registers

Push parameters on call stack

Push return address on call stack

Jump to the beginning of the called function

The called function will,

Push stack frame pointer (ebp) of the calling function to call stack, and set stack frame pointer (ebp) to stack top (esp).

Reserve memory for local variables and push local variables to call stack

Upon Return of a function call, the called function will,

Set return values

Pop out local variables from call stack

Pop out calling function stack frame pointer and restore it to ebp register

Pop out return address and jump to the return address on stack

The calling function will,

Get return values

Pops input parameters for called function out from call stack

Restore saved registers

To be Continued

Followed Posts will discuss stack overflow and its protection, and how understanding of call stack help us to debug C/C++ programs.

References:

1. X86 Assembly/GAS Syntax: http://en.wikibooks.org/wiki/X86_Assembly/GAS_Syntax

Command Line Arguments Parsing in glibc

Many Linux programs are command line based and sometimes the options can be complicated. Luckily, the GNU C library glibc provides some APIs to simplify the command line option parsing.

Specifically, there’re two methods for parsing the commands, getopt and getopt_long. getopt() is used to parse the single character option, and getopt_long() works with both long options and single-character options. It’s recommended to use getopt_long().

Explanation of getopt_long() Prototype

The function getopt_long has the following prototype,

int getopt_long (int argc, char *const *argv, const char *shortopts, const struct option *longopts, int *indexptr)

argc and argv are the argument count and argument vector, the same as the two input parameters in the standard main prototype.

shortopts is a string specifying the option characters that are valid as input options. An option character can be followed by a colon (‘:’) to indicate it takes a required argument. The character can also be followed by two colons (‘::’) to indicate the argument is optional. If nothing follows the option character, the option doesn’t take any arguments.

longopts describes the long options to accept, which is an array of struct option. longopts must be terminated by a struct option of all 0s.

The struct option describes a single long option, which is defined as below,

struct option {

const char *name;    //the name of the option

int has_arg;  //can be no_argument, required_argument, or optional_argument

int *flag; //for flag option only, set to NULL for non-flag options

int val;  //for flag, val is the value to store in the flag when the flag option is seen;

              //for other options, val is a value that is uniquely identify the long option

};

indexptr takes address of an integer variable, and getopt_long will store the index of the matched long option in the array longopts into it.

The Command Line Parse in Mini Firewall

#include <stdio.h>

#include <stdlib.h>

#include <getopt.h>

 

#define print_value(x) (x==NULL?"-" : x)

 

static struct mf_rule_struct {

    int in_out;

    char *src_ip;

    char *src_netmask;

    char *src_port;            //default to -1 

    char *dest_ip;

    char *dest_netmask;

    char *dest_port;

    char *proto;

    char *action;

} mf_rule;

 

static struct mf_delete_struct {

    char *cmd;

    char *row;

} mf_delete;

 

void send_rule_to_proc()

{

    printf("TODO: send_rule_to_procn");

}

 

void send_delete_to_proc()

{

    printf("TODO: send_delete_to_procn");

}

 

void print_rule()

{

    printf("TODO: print_rulen");

    return;

}

 

int main(int argc, char **argv)

{

    int c; int action = 1;    //1: new rule; 2: print; 3: delete

    mf_rule.in_out = -1; mf_rule.src_ip = NULL; mf_rule.src_netmask = NULL; mf_rule.src_port = NULL;

    mf_rule.dest_ip = NULL; mf_rule.dest_netmask = NULL; mf_rule.dest_port = NULL;mf_rule.proto = NULL;

    mf_rule.action = NULL;

    while (1) 

    {

        static struct option long_options[] = 

        {

        /*set a flag*/

            {"in", no_argument, &mf_rule.in_out, 1},

            {"out", no_argument, &mf_rule.in_out, 0},

        /*These options don't set a flag.

            We distinguish them by their indices.*/

            {"print", no_argument, 0, 'o'},

            {"delete", required_argument, 0, 'd'},

            {"srcip", required_argument, 0, 's'},

            {"srcnetmask", required_argument, 0, 'm'},

            {"srcport", required_argument, 0, 'p'},

            {"destip", required_argument, 0, 't'},

            {"destnetmask", required_argument, 0, 'n'},

            {"destport", required_argument, 0, 'q'},

            {"proto", required_argument, 0, 'c'},

            {"action", required_argument, 0, 'a'},

            {0, 0, 0, 0}

        };

        int option_index = 0;

        c = getopt_long(argc, argv, "od:s:m:p:t:n:q:c:a:", long_options, &option_index);

        /*Detect the end of the options. */

        if (c == -1)

            break;

        action = 1;

        switch (c)

        {

            case 0:

              printf("flag option: %s, mf_rule.in_out = %dn", long_options[option_index].name, mf_rule.in_out);

              break;

            case 'o':

                action = 2;    //print

              break;

            case 'd':

              action = 3;       //delete

              mf_delete.cmd = (char *)long_options[option_index].name;

              mf_delete.row = optarg;

              break;

            case 's':

              mf_rule.src_ip = optarg;  //src ip

              break; 

            case 'm':

              mf_rule.src_netmask = optarg; //srcnetmask:

              break;

            case 'p':

              mf_rule.src_port = optarg;    //srcport:

              break;

            case 't':

              mf_rule.dest_ip = optarg;     //destip:

              break;

            case 'n':

              mf_rule.dest_netmask = optarg;    //destnetmask

              break;

            case 'q':

              mf_rule.dest_port = optarg;    //destport

              break;

            case 'c':

              mf_rule.proto = optarg; //proto

              break;

            case 'a':

              mf_rule.action = optarg;//action

              break;

            case '?':

              /* getopt_long printed an error message. */

              break;

            default:

              abort();

        }

    if (c != 0)

        printf("%s = %sn",  long_options[option_index].name, optarg);

    }

    if (action == 1) {

        send_rule_to_proc();

    } else if (action == 2) {

        print_rule();

    } else if (action == 3) {

        send_delete_to_proc();

    }

    if (optind < argc)

    {

        printf("non-option ARGV-elements: ");

        while (optind < argc)

        printf("%s ", argv[optind++]);

        putchar('n');

    }

}

 

In order to understand the code above, you may need to read the section below,

How getopt_long Matches the Input Options

For short option, getopt_long returns the character code for the option, and stores the option’s argument (if it has one) in optarg.

For long option, getopt_long takes action based on flag and val fields of the option matched.

If the flag is NULL, meaning the option is not a flag. getopt_long returns val. Normally you can use the short option’s character code in val if long option is equivalent to the short option.

If the flag is not NULL, meaning it’s a flag option. getopt_long will return 0, and the flag value will be set accordingly.

For all options, getopt_long stores the matched option’s index in array longopts to *indexptr. You can access the name of the option by longopts[*indexptr].name.

getopt_long will also put the argument value in optarg if the option has one. Otherwise, optarg is set to NULL.

When getopt_long has no more options to handle, it returns –1. The index of next remaining argument in argv is stored in variable optind.

Sample Input and Output of the Mini Firewall Command Line Parsing

Firstly, save the program as cmd.c and compile the code using the command below,

gcc -o cmd cmd.c

Input 1:

./cmd --in --srcip 10.2.10.2 --srcnetmask 255.255.255.0 --srcport 1000 --destip 72.23.13.23 --destnetmask 255.255.0.0 --destport 80 --proto 6 --action 0

Output 1:

flag option: in, mf_rule.in_out = 1

srcip = 10.2.10.2

srcnetmask = 255.255.255.0

srcport = 1000

destip = 72.23.13.23

destnetmask = 255.255.0.0

destport = 80

proto = 6

action = 0

TODO: send_rule_to_proc

Input 2:

./cmd --delete 1

Output 2:

delete = 1

TODO: send_delete_to_proc

Note:

This post is part of the tutorial: How to write a Linux Firewall in Less than 1000 Lines

Part 1: Overview

Part 2: Command Line Arguments Parsing in glibc

Part 3.1: Linux Kernel Module Basics and Hello World

Part 3.2: Linux Kernel Programming – Linked List

Part 3.3 Linux Kernel Programming – Memory Allocation

Part 4.1: How to Filter Network Packets using Netfilter – Part 1 Netfilter Hooks

Part 4.2 How to Filter Network Packets using Netfilter – Part 2 Implement the Hook Function

Part 5: Linux procfs Virtual File System

Part 6: Put Everything Together

Reference:

http://www.gnu.org/s/libc/manual/html_node/Getopt.html

How to Reset USB Device in Linux–using libusb

This is a follow up of the previous entry How to Reset USB Device in Linux. The previous blog covers a method to reset usb device using ioctl system call. This blog gives an alternative approach to reset usb device, using libusb.

libusb is a cross-platform library giving user level application uniform access to usb devices. You’ll need to install the development package first. Simply typing the following commands to your terminal (using Ubuntu as an example),

sudo apt-get install libusb-1.0-0-dev

After installation, you can use the methods in libusb. Below is the C source code to reset usb device using libusb API,

#include <stdio.h>
#include <stdlib.h>
#include <libusb-1.0/libusb.h>

//compile: gcc usbreset.c -o usbreset -lusb-1.0
//usage: ./usbreset 2 6
//use lsusb to check out the bus number and device number

struct libusb_device_handle *devh;
struct libusb_device *dev;
struct libusb_device **devs;

void resetUSB() {
    int success;
    int bpoint = 0;
    do {
        success = libusb_reset_device(devh);
        if ((bpoint % 10) == 0) {
            printf(".");
        }
        ++bpoint;
        if (bpoint > 100) {
            success = 1;
        }
    } while (success < 0);
    if (success) {
        printf("nreset usb device failed:%dn", success);
    } else {
        printf("nreset usb device okn");
    }
}

struct libusb_device* search_device(int _busNum, int _devNum) { 
    libusb_device *l_dev;
    int i = 0;
    int l_busNum, l_devNum;
   
    while ((l_dev = devs[i++]) != NULL) {
        printf("check against %d devicen", i);
        l_busNum =(int) libusb_get_bus_number(l_dev);
        l_devNum =(int) libusb_get_device_address(l_dev);
        printf("bus number: %d; device number: %dn", l_busNum, l_devNum);
        if ((l_busNum == _busNum) && (l_devNum == _devNum)) {
            printf("found devicen");
            return l_dev;
        }
    }
    return NULL;
}

int main(int argc, char **argv) {
    //parse the input parameters to get the bus number and device number
    int l_busNum, l_devNum;
    int l_ret;

    printf("program started!n");
    if (argc < 3) {
        printf("not enough arguments!n");
        printf("usage: ./usbreset <bus number> <dev number>n");
        return 0;
    }
    printf("bus number: %sn", argv[1]);
    printf("dev number: %sn", argv[2]);

    l_busNum = atoi(argv[1]);
    l_devNum = atoi(argv[2]);
    printf("bus number: %d; dev number: %dn", l_busNum, l_devNum);

    l_ret = libusb_init(NULL);
    if (l_ret < 0) {
        return l_ret;
    }
    l_ret = libusb_get_device_list(NULL, &devs);
    if (l_ret < 0) {
        return (int) l_ret;
    }
    dev = search_device(l_busNum, l_devNum);
    if (dev == NULL) {
        printf("device not foundn");
        return 0;
    }
    l_ret = libusb_open(dev, &devh);    
    if (l_ret == 0) {
        printf("got the usb handle successfully.n");
    } else {
        printf("error getting usb handle.n");
    }
    //reset the usb device
    resetUSB();
    //free the device list
    libusb_free_device_list(devs, 1);
    libusb_exit(NULL);

    return 0;
}


The code can be compiled using

gcc usbreset.c -o usbreset -lusb-1.0

To use the complied program usbreset, you can simply use,

sudo ./usbreset <bus number> <device number>

To check the bus number and device number of a usb device, you can use lsusb command.

To check if a device is resetted or not, you can use tail -f /var/log/messages.

An detailed example is given in How to Reset USB Device.

Depth-First Graph Traversal using Stack — a C Implementation

There’re two common ways of visiting all nodes of a graph, depth-first search and breadth-first search. These two graph traversal approaches relate to two popular data structures, stack and queue respectively. (One can also do it in a recursive manner. It’s more intuitive but less efficient.)

This post covers depth-first search with a stack data structure. The algorithm expressed in pseudo code is as below,

DFS (Graph, root):
   create a stack S
   push root node to S
   while S is not empty:
       pop an itme v from S
       mark the item v as visited
       for each node w that is directed from v:
           if w is not visited:
               push w to S

The first step of the implementation is to implement a stack. The stack implemented below has the basic functions like push, pop, ifEmpty, etc.

1. Stack Implementation

The header file stack.h, defines the data structures used in Stack and Graph.

#include <stdio.h>
#include <stdlib.h>
typedef struct Point {
   int h;
   int w;
} Point;
typedef struct GraphNode {
   struct Point pvalue;
   struct GraphNode* left;
   struct GraphNode* right;
} GraphNode;
typedef struct StackElement {
   struct GraphNode value;
   struct StackElement *up;
   struct StackElement *down;
} StackElement;
typedef struct Stack {
   struct StackElement* topP;
   struct StackElement* bottomP;
} Stack;
void initStack(struct Stack *s);
void push(struct Stack *s, struct GraphNode _value);
void pop(struct Stack *s);
struct GraphNode top(struct Stack *s);
int ifEmpty(struct Stack *s);

The actual implementation stack.c

/**
a C stack implementation using linked list
*/
#include "stack.h"
/*initialize the stack*/
void initStack(struct Stack *s) {
   s->topP = NULL;
   s->bottomP = NULL;
}
/*add an element to the top of the stack*/
void push(struct Stack *s, struct GraphNode _value) {
   //allocate a new StackElement for _value
   StackElement *newElement;
   newElement = (StackElement*) malloc(sizeof(StackElement));
   newElement->value = _value;
   newElement->up = NULL;
   newElement->down = NULL;
   if (s->topP == NULL) {
        //first element
       s->topP = newElement;
        s->bottomP = newElement;
   } else {
        //push it to the top
        newElement->down = s->topP;
        s->topP = newElement;
   }
}
/*delete the top element from the stack*/
void pop(struct Stack *s) {
   StackElement *element;
   if (s->topP == NULL) {
        //empty stack
        return;
   } else {
        element = s->topP;
        s->topP = element->down;
        if (s->topP != NULL)
             s->topP->up = NULL;
        free(element);
   }
}
/*get the top value of the stack, but don't delete it*/
struct GraphNode top(struct Stack *s) {
   return s->topP->value;
}
/*check if the stack is empty*/
int ifEmpty(struct Stack *s) {
   return (s->topP == NULL ? 1:0);
}

Note that the top() function above returns the top element of the stack and the pop() function deletes the top element from the stack. Some stack implementations has the pop() function does both return and delete. But I personally think separating them is more convenient as sometimes you just want to take a look at the stack but not delete anything from it.

The Stack above will take input element of GraphNode type, but you’re free to modify it to fit your needs.

2. The Depth-First Graph Traversal

For simplicity, an example of visiting all nodes of a binary tree is given below. The tree structure is included in the comment of the source code file dfs.c,

#include <stdio.h>
#include <stdlib.h>
#include "stack.h"
/* a simple graph (binary tree)
         (0,    3)   <– root node
          /     
      (1, 2)    (1,4)
       /       /   
   (2,1) (2,2) (2,3) (2,5)
*/
typedef struct Graph {
   struct GraphNode* root;
} Graph;
struct GraphNode initNode(int _h, int _w) {
   struct GraphNode node;
   node.pvalue.h = _h;
   node.pvalue.w = _w;
   node.left = NULL;
   node.right = NULL;
   return node;
}
int main() {
   struct Graph g;
   struct GraphNode currentNode;
   /*read in the graph from top to bottom*/
   g.root = (GraphNode*) malloc(sizeof(GraphNode));
   *g.root = initNode(0, 3);
   //level 2 nodes
   g.root->left = (GraphNode*) malloc(sizeof(GraphNode));
   *g.root->left = initNode(1, 2);
   g.root->right = (GraphNode*) malloc(sizeof(GraphNode));
   *g.root->right = initNode(1, 4);
   //level 3 nodes
   (*g.root->left).left = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->left).left = initNode(2, 1);
   (*g.root->left).right = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->left).right = initNode(2, 2);
   (*g.root->right).left = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->right).left = initNode(2, 3);
   (*g.root->right).right = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->right).right = initNode(2, 5);
   /*DFS traversal*/
   Stack s;
   initStack(&s);
   push(&s, *g.root);
   while (!ifEmpty(&s)) {
       currentNode = top(&s);
        pop(&s);
        //mark as visited, here we simply print the graph node value out
       printf("(%d, %d)n", currentNode.pvalue.h, currentNode.pvalue.w);
        //at most two children, left and right. As the binary tree is acyclic,
        // both nodes should not be visited yet, for cyclic graphs, 
        //one need additional methods to check if the nodes is visited or not before push
       if (currentNode.left != NULL) {
           push(&s, *currentNode.left);
       } 
       if (currentNode.right != NULL) {
           push(&s, *currentNode.right);
        } 
   }
   return 0;
}

3. Output

To compile the code in Linux, type the following command,

gcc stack.c dfs.c -o dfs

Then run the binary dfs,

        ./dfs

The output we’ll get is,

(0, 3)

(1, 4)

(2, 5)

(2, 3)

(1, 2)

(2, 2)

(2, 1)

which corresponds to the expected nodes visiting order in depth-first search. As we push the left node first, the traversal is from right to left.

Breadth-First Graph Traversal using Queue – a C Implementation

There’re two common ways of visiting all nodes of a graph, depth-first search and breadth-first search. These two graph traversal approaches relate to two popular data structures, stack and queue respectively. (One can also do it in a recursive manner. It’s more intuitive but less efficient.)

This post covers breadth-first search with a queue data structure. The algorithm expressed in pseudo code is as below,

BFS (Graph, root):
    create a queue Q
    enqueue root node to Q
    while Q is not empty:
        dequeue an itme v from Q
        mark the item v as visited
        for each node w that is directed from v:
            if w is not visited:
                enqueue w to Q

The first step of the implementation is to implement a queue. The queue implemented below has the basic functions like enqueue, dequeue, ifEmpty, etc.

1. A Queue Implementation

The header file queue.h, defines the data structures used in Queue and Graph.

#include <stdio.h>
#include <stdlib.h>
typedef struct Point {
   int h;
   int w;
} Point;
typedef struct GraphNode {
   struct Point pvalue;
   struct GraphNode* left;
   struct GraphNode* right;
} GraphNode;
typedef struct QueueElement {
   struct GraphNode value;
   struct QueueElement *next;
} QueueElement;
typedef struct Queue {
   struct QueueElement* head;
   struct QueueElement* tail;
} Queue;
void initQueue(struct Queue *q);
void enqueue(struct Queue *q, struct GraphNode _value);
void dequeue(struct Queue *q);
struct GraphNode front(struct Queue *q);
int ifEmpty(struct Queue *q);

The source file queue.c,

/**
a C queue implementation using linked list
*/
#include "queue.h"
/*initialize the queue*/
void initQueue(struct Queue *q) {
   q->head = NULL;
   q->tail = NULL;
}
/*insert an element to the end of the queue*/
void enqueue(struct Queue *q, struct GraphNode _value) {
   //allocate a new QeueuElement for _value
   QueueElement *newElement;
   newElement = (QueueElement*) malloc(sizeof(QueueElement));
   newElement->value = _value;
   newElement->next = NULL;
   if (q->head == NULL) {
        //first element
       q->head = newElement;
       q->tail = newElement;
   } else {
        //put it to the tail
        q->tail->next = newElement;
        q->tail = newElement;
   }
}
/*delete the first element from the queue*/
void dequeue(struct Queue *q) {
   QueueElement *element;
   if (q->head == NULL) {
        //empty queue
        return;
   } else {
        element = q->head;
        q->head = q->head->next;
        free(element);
   }
}
/*get the front value of the queue, but don't delete it*/
struct GraphNode front(struct Queue *q) {
   return q->head->value;
}
/*check if the queue is empty*/
int ifEmpty(struct Queue *q) {
   return (q->head == NULL ? 1:0);
}

Note that the front() function above returns the first element of the queue and the dequeue() function deletes the first element from the queue. Some queue implementations has the dequeue() function does both return and delete. But I personally think separating them is more convenient as sometimes you just want to take a look at the queue but not delete anything from it.

The Queue above will take input element of GraphNode type, but you’re free to modify it to fit your needs.

2. The Breadth-First Graph Traversal

For simplicity, an example of visiting all nodes of a binary tree is given below. The tree structure is included in the comment of the source code file bfs.c,

#include <stdio.h>
#include <stdlib.h>
#include "queue.h"
/* a simple graph (binary tree)
         (0,    3)   <– root node
          /     
      (1, 2)    (1,4)
       /       /   
   (2,1) (2,2) (2,3) (2,5)
*/
typedef struct Graph {
   struct GraphNode* root;
} Graph;
struct GraphNode initNode(int _h, int _w) {
   struct GraphNode node;
   node.pvalue.h = _h;
   node.pvalue.w = _w;
   node.left = NULL;
   node.right = NULL;
   return node;
}
int main() {
   struct Graph g;
   struct GraphNode currentNode;
   /*read in the graph from top to bottom*/
   g.root = (GraphNode*) malloc(sizeof(GraphNode));
   *g.root = initNode(0, 3);
   //level 2 nodes
   g.root->left = (GraphNode*) malloc(sizeof(GraphNode));
   *g.root->left = initNode(1, 2);
   g.root->right = (GraphNode*) malloc(sizeof(GraphNode));
   *g.root->right = initNode(1, 4);
   //level 3 nodes
   (*g.root->left).left = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->left).left = initNode(2, 1);
   (*g.root->left).right = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->left).right = initNode(2, 2);
   (*g.root->right).left = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->right).left = initNode(2, 3);
   (*g.root->right).right = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->right).right = initNode(2, 5);
   /*BFS traversal*/
   Queue q;
   initQueue(&q);
   enqueue(&q, *g.root);
   while (!ifEmpty(&q)) {
       currentNode = front(&q);
       //dequeue(&q); //you can either put dequeue here or at the end of while
        //mark as visited, here we simply print the graph node value out
       printf("(%d, %d)n", currentNode.pvalue.h, currentNode.pvalue.w);
        //at most two children, left and right. As the binary tree is acyclic,
        // both nodes should not be visited yet, for cyclic graphs, 
        //one need additional methods to check if the nodes is visited or not before enqueue
       if (currentNode.left != NULL) {
           enqueue(&q, *currentNode.left);
       } 
       if (currentNode.right != NULL) {
           enqueue(&q, *currentNode.right);
        } 
        dequeue(&q);
   }
   return 0;
}

3. Output

To compile the code in Linux, type the following command,

gcc queue.c bfs.c -o bfs

Then run the binary bfs,

        ./bfs

The output we’ll get is,

(0, 3)

(1, 2)

(1, 4)

(2, 1)

(2, 2)

(2, 3)

(2, 5)

which corresponds to the expected nodes visiting order in breadth-first search, level 0 node, follows by level 1 nodes, and finally level 2 nodes.

 

Pre-increment and Post-increment

Side Note: First draft on Mar 26 2011. Back in undergraduate, I was in a programming competition team. We were writing small programs that have to be extremely efficient as there’re time constraint for running these programs. One night, a teammate of mine told me that “you know, pre-increment is more efficient than post-increment”.

Well. The different of pre-increment and post-increment is not difficult to tell. People has basic programming background usually know pre-increment returns the value after increment (think it as increment first, then return), while post-increment returns the value before increment (think it as return the value first, then increment). For example (in C programming language),

#include <stdio.h>

int main(int argc, char **argv)
{
    int x = 5;
 
    printf("x=%dn", ++x);	//pre-increment
    printf("x=%dn", x++);	//post-increment
    printf("x=%dn", x);
 
    return 0;
}

The program will give you the following result,

x = 6
x = 6
x = 7

That’s the difference most people know about. If we don’t care about the value returned by the increments, instead using it for increasing the value only, is any difference between them?

The answer is YES. And this goes down to the implementation of the two operations. Here we just use C syntax to illustrate the difference, the actual implementation for different languages may vary, but you should get the idea that pre-increment is more efficient.

int pre_increment(int* _input) {
    *_input = *_input + 1;
    return *_input;
}

int post_increment(int* _input) {
    int temp = *_input;
    *_input = *_input + 1;
    return temp;
}

As you can tell from the code, post_increment needs to introduce a temporary variable to hold the value before the increment operation. That’s why it is less efficient.

So next time, you are writing a loop increment operation, you know pre-increment is a better choice. Smile

for (int i = 0; i < N; ++i) {
    //do some processing
}

Some Sorting Algorithms

Side Note: I found this on my computer. I wrote this some time in 2009. Just to put everything to a single place.

Insertion sort:
Basic Idea:

        1. starting from index 1, going through the array and every iteration insert the current element into a proper position at the already sorted previous positions.
        2. The reason the outer loop starts from 1 instead of 0 is that the first element is considered sorted itself.

Source code in C++

#include <iostream>
using namespace std;

class InsertionSort {
  public:
    static void insertionSort(int * a, int n) {
        int i, j, t;
        for (i = 1; i < n; ++i) {
        j = i;
            t = a[i];
            while (j > 0 && a[j-1] > t) {
            a[j] = a[j-1];
                --j;
        }
            a[j] = t;
    }
    }   
}; int main() {
    int test[5] = {2, 3, 8, 4, 7};
    InsertionSort::insertionSort(test, 5);
    for (int i = 0; i < 5; ++i) {
        cout << test[i] << endl;
    }
    return 0;
}

   
Selection Sort:

Basic Ideas:

        1. Starting from index 0, going through the entire array, every time select the smallest element from the unsorted part and swap it with the current element.

Source code in C++:


#include <iostream>

using namespace std;

class SelectionSorter {

  public:

    static void selectionSort(int *a, int n) {

        int i, j, k, t;

        for (i = 0; i < n; ++i) {

        t = a[i];

        k = i;

        for (j = i+1; j < n; ++j) {

            if (a[j] < t) {

           t = a[j];

           k = j;

        }

        }

        a[k] = a[i];

        a[i] = t;

    }

    }

};

int main() {

    int test[5] = {3, 4, 2, 5, 8};

    SelectionSorter::selectionSort(test, 5);

    for (int i = 0; i < 5; ++i) {

        cout << test[i] << endl;

    }

    system("pause");

    return 0;

}

Bubble sort: 

Basic Ideas:

        1. It starts from index 0, by swapping the neighboring elements if they’re not in order. At every iteration, one element is put to the right position at the tail of the array.

Source code in C++:


#include <iostream>

using namespace std;

class BubbleSorter {

  public:

    static void bubbleSort(int *a, int n) {

        int i, j, t;

        for (i = 0; i < n; ++i) {

        for (j = 1; j < n-i; ++j) {

            if (a[j-1] > a[j]) {

           t = a[j-1];

           a[j-1] = a[j];

           a[j] = t;

        }

        }

    }

    }

};

int main() {

    int test[5] = {2, 3, 8, 4, 7};

    BubbleSort::bubbleSort(test, 5);

    for (int i = 0; i < 5; ++i) {

        cout << test[i] << endl;

    }

    system("pause");

    return 0;

}

C Bitwise Operations

C provides six bitwise operations. They can apply to integer data type only, including char, short, int and long. The advantage of using bitwise operators is speed. It generally makes your program faster.

The six bitwise operators are,
&  AND (eg. 0011&0101 = 0001)
|  OR (inclusive, eg. 0011|0101 = 0111)
^  XOR (exclusive, eg. 0011^0101 = 0110)
~  ONE'S COMPLEMENT(unary, eg. ^01 = 10)
<<  LEFT SHIFT
>>  RIGHT SHIFT

The right operands for shift operators must be non-negative. For example, x >> 2 will shift the variable x right by 2 digits, which is equivalent to multiply the variable by 4. A worth-noting point for shift operators is that if the left operand is unsigned, zeros will be fit into the vacated bits. While for signed, it’s machine-dependent, some use zeros, some use the sign bit. The applications for bitwise operators,

To "set" a bit (where n is the bit number, and bit 0 is
the least significant bit):
unsigned char a |= (1 << n);
To "clear" a bit:
unsigned char b &= ~(1 << n);

 

To "toggle" a bit:
unsigned char c ^= (1 << n);
To "test" a bit:
unsigned char e = d & (1 << n);
To check a variable is even or odd:
var&1, if it's true, then var is odd, otherwise, it's even.

 


To swap two variables (This is cool!):
int a, int b;   a = a^b; b = a^b; a = a^b;