Slice Resynchronization in MPEG4

MPEG-4 Part 2 introduced three error resilience tools: resynchronization, data partitioning, and reversible VLC. This post discusses resynchronization only.

The Problem

The bitstream of an MPEG4 video frame (as in many other video codecs) is encoded using VLC (Variable Length Coding). Because the number of bits for each coefficient varies and the length is implicit, a VLC bitstream is sensitive to errors. If an error causes the wrong number of bits to be decoded for one coefficient, the decoder starts reading the next coefficient at the wrong bit position, and so on for every coefficient after it. The decoder essentially loses synchronization with the encoder. In this way the error propagates and the video quality suffers.
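
To make the loss of synchronization concrete, here is a minimal sketch of a decoder for a toy prefix code. The code table (A = 0, B = 10, C = 110, D = 111) and the function name toy_vlc_decode are assumptions made up for illustration; they are not taken from any MPEG4 VLC table.

```c
#include <assert.h>
#include <stddef.h>

/* Toy prefix code (an assumption for illustration, not an MPEG4 table):
 *   A = 0, B = 10, C = 110, D = 111
 * Decodes a string of '0'/'1' characters into out and returns the number
 * of symbols. Trailing bits that do not form a full code word are ignored. */
size_t toy_vlc_decode(const char *bits, char *out, size_t out_cap)
{
    size_t n = 0;
    size_t ones = 0;   /* consecutive '1' bits seen in the current code word */

    assert(bits != NULL && out != NULL && out_cap > 0);
    for (size_t i = 0; bits[i] != '\0' && n + 1 < out_cap; ++i) {
        if (bits[i] == '0') {
            out[n++] = (char)('A' + ones);   /* 0 -> A, 10 -> B, 110 -> C */
            ones = 0;
        } else if (++ones == 3) {
            out[n++] = 'D';                  /* 111 -> D */
            ones = 0;
        }
    }
    out[n] = '\0';
    return n;
}
```

With this table, the bits 0101100 decode to ABCA. Flipping only the first bit gives 1101100, which decodes to CCA: a single bit error moves the code word boundaries and even changes the number of decoded symbols.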

GOB (Group of Blocks) in H.261 & H.263

H.261 and H.263 organize the macroblocks into groups called Groups of Blocks (GOBs). Each GOB contains one or more rows of macroblocks and a GOB header with a resynchronization marker and other information that can be used to resynchronize the decoder.

The GOB approach is based on spatially periodic resynchronization: a resynchronization marker and the other GOB header information are inserted when a particular macroblock position is reached during encoding. As a result, each GOB contains a different number of bits, because the number of encoded bits varies from macroblock to macroblock. In picture areas where more bits are used to encode the scene, the resynchronization markers are sparser in the bitstream, which makes it more difficult to conceal errors in those areas.

Slice in MPEG4 (Packet-Based Resynchronization)

MPEG4 adopts a video packet based resynchronization scheme. In the encoding process, a frame is divided into one or more video packets (sometimes also called slices). The length of each slice/packet is not based on the number of macroblocks. Instead, if the number of bits in the current slice exceeds a predetermined threshold, the current slice is ended and a new slice is created at the start of the next macroblock.
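
The bit-budget rule can be sketched as below. This is only an illustration of the rule as described here, not encoder code: the per-macroblock bit counts are a hypothetical input, and the name split_into_packets is made up.

```c
#include <assert.h>
#include <stddef.h>

/* Splits a frame into video packets by bit budget.
 * mb_bits[i] is the encoded size of macroblock i (hypothetical input).
 * A new packet starts at the next macroblock once the running total of
 * the current packet exceeds threshold_bits. The index of the first
 * macroblock of each packet is written to starts[]; the return value is
 * the number of packets recorded. */
size_t split_into_packets(const unsigned *mb_bits, size_t mb_count,
                          unsigned threshold_bits,
                          size_t *starts, size_t starts_cap)
{
    size_t packets = 0;
    unsigned bits_in_packet = 0;

    assert(mb_bits != NULL && starts != NULL);
    for (size_t i = 0; i < mb_count; ++i) {
        if (i == 0 || bits_in_packet > threshold_bits) {
            if (packets == starts_cap)
                break;                 /* no room to record more packets */
            starts[packets++] = i;     /* a resync marker would be written here */
            bits_in_packet = 0;
        }
        bits_in_packet += mb_bits[i];
    }
    return packets;
}
```

For macroblock sizes of 100, 300, 50, 50, 400 and 20 bits and a threshold of 350 bits, the packets start at macroblocks 0, 2 and 5. The packets hold a similar number of bits but a different number of macroblocks, which is the opposite of the GOB approach.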

The structure of a slice is as below,

Resync Marker | MB_number | quant_scale | HEC | MB data

A resync marker is used to indicate the start of a new slice. It’s different from all possible VLC code words and from the VOP start code. In addition, the information that is necessary to restart the decoding process is provided, including,

macroblock_number: macroblock position of the first macroblock in the video packet, which facilitates spatial resynchronization.

quantization_scale: quantization parameters needed to decode the first macroblock, which facilitates resynchronization of differential decoding.

HEC: Header Extension Code. A single bit indicating if additional information is following it.  When set to 1, additional info is available in the video packet header: modulo_time_base, vop_time_increment, vop_coding_type, intra_dc_vlc_thr, vop_fcode_forward and vop_fcode_backward.

Note that when HEC is equal to 1, the slice header contains all necessary information to decode the slice, thus the slice can be decoded independently. If HEC is set to 0, the decoder still needs some information from somewhere else to decode the slice.

When the slice resynchronization tool is used, some of the encoding tools are modified to remove dependencies between video packets. For example, predictive encoding must be confined within a video packet to prevent propagation of errors. In other words, a slice boundary is treated as a VOP boundary for AC/DC prediction and motion vector prediction.

Fixed-Interval Resynchronization

Packet-based resynchronization produces video packets of similar, but not exactly the same, length, so a resync marker can appear at almost any position in the bitstream. If an error happens to produce a bit pattern identical to a resync marker, the decoder won’t be able to tell it from a real one. This is normally known as start code emulation.

To avoid this problem, MPEG4 also adopts a method called fixed interval resynchronization. It requires that VOP start codes and video packet resynchronization markers appear only at legal fixed interval positions in the bitstream. The fixed interval is achieved by stuffing the video packet with a leading ‘0’ and zero or more ‘1’s.

At decoding, the decoder only needs to search for VOP start code and resynchronization marker at the beginning of each fixed interval. Therefore, emulating a VOP start code or resynchronization marker in the middle of a fixed interval cannot confuse the decoder.
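
A rough sketch of the stuffing step is shown below. It assumes that positions are counted in bits from the start of the bitstream and that at least the leading ‘0’ is always written; the function name make_stuffing and the interval values are made up for illustration and are not taken from the standard.

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Builds the stuffing pattern (a '0' followed by zero or more '1's) that
 * advances bit position pos_bits to the next multiple of interval_bits.
 * The pattern is written to out as a string of '0'/'1' characters and its
 * length in bits is returned. */
size_t make_stuffing(size_t pos_bits, size_t interval_bits,
                     char *out, size_t out_cap)
{
    size_t pad;

    assert(interval_bits > 0 && out != NULL);
    pad = interval_bits - (pos_bits % interval_bits);   /* 1..interval_bits */
    assert(pad < out_cap);                               /* room for the '\0' */
    out[0] = '0';
    memset(out + 1, '1', pad - 1);
    out[pad] = '\0';
    return pad;
}
```

With an 8-bit interval, a packet ending at bit 13 gets the stuffing 011, so the next marker starts at bit 16. A decoder that knows the interval only has to test for markers at multiples of 8.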


1. The MPEG-4 Book, by Fernando C.N. Pereira, Touradj Ebrahimi
2. MPEG-4 Standard, Part 2, Annex E.1 Error Resilience

TCP TIME_WAIT State and Address Already in Use Error

0. The Problem
Recently I worked on a project that involves TCP socket programming on Linux. I frequently encountered errno 98 (address already in use) and errno 99 (cannot assign requested address). I wrote a small test program to reproduce the issue. The test code is as below,

#include <stdio.h>
#include <stdlib.h>
#include <unistd.h>
#include <string.h>
#include <errno.h>
#include <sys/types.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <netdb.h>
#include <arpa/inet.h>

#define PROXY_CLIENT_ST_PORT 0x6B77    /* 27511, the first local port to try */

unsigned int bindError = 0;            /* number of failed bind attempts */

int main(void) {
    int socketFd;
    int one = 1;
    int mapIdx = 0;
    int i;
    int err;
    struct sockaddr_in localAddr, serv_addr;
    struct addrinfo hints, *res;
    char *hostname = "";               /* host name of the server to connect to */

    memset(&hints, 0, sizeof(hints));
    hints.ai_socktype = SOCK_STREAM;
    hints.ai_family = AF_INET;

    if ((err = getaddrinfo(hostname, NULL, &hints, &res)) != 0) {
        printf("error %d\n", err);
        return 1;
    }

    memset(&serv_addr, 0, sizeof(serv_addr));
    serv_addr.sin_family = AF_INET;
    serv_addr.sin_addr.s_addr = ((struct sockaddr_in *)(res->ai_addr))->sin_addr.s_addr;
    serv_addr.sin_port = htons(80);
    freeaddrinfo(res);

    printf("ip address : %s\n", inet_ntoa(serv_addr.sin_addr));

    printf("before loop...\n");
    for (i = 0; i < 3; ++i) {
        socketFd = socket(AF_INET, SOCK_STREAM, 0);
        if (socketFd < 0) {
            printf("ERROR opening proxy tcp client socket %d: %d\n", mapIdx, errno);
            return 1;
        }
        //if the line below is enabled, error occurs at connect
        setsockopt(socketFd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);

        memset(&localAddr, 0, sizeof(struct sockaddr_in));
        localAddr.sin_family = AF_INET;
        localAddr.sin_addr.s_addr = INADDR_ANY;
        localAddr.sin_port = htons((PROXY_CLIENT_ST_PORT + mapIdx + bindError)%65535);

        /* keep trying the next port until bind succeeds */
        for (;;) {
            if (bind(socketFd, (struct sockaddr *)&localAddr, sizeof(struct sockaddr_in)) < 0) {
                int bindErrno = errno;   /* save errno before printf can change it */
                printf("Error %d, tcp proxy client binding to %d\n", bindErrno, ntohs(localAddr.sin_port));
                printf("error binding tcp proxy client: %s\n", strerror(bindErrno));
                ++bindError;
                localAddr.sin_port = htons((PROXY_CLIENT_ST_PORT + mapIdx + bindError)%65535);
            } else {
                break;
            }
        }
        printf("bound socket %d to tcp client port %u\n", socketFd, ntohs(localAddr.sin_port));

        if (connect(socketFd, (struct sockaddr *) &serv_addr, sizeof(serv_addr)) < 0) {
            int conErrno = errno;
            printf("error code: %d\n", conErrno);
            printf("error connecting to ori server from proxy tcp client: %s\n", strerror(conErrno));
            if (conErrno == EADDRNOTAVAIL) {
                printf("The specified address %d is not available from the local machine.\n", ntohs(localAddr.sin_port));
            }
        } else {
            printf("connected: %d\n", mapIdx);
        }
        /* closing from the client side leaves the local port in TIME_WAIT */
        close(socketFd);
    }
    return 0;
}

Save the code to test.c, then compile it using “gcc -o test test.c”. Running the program multiple times will likely give you one of the two errors. With the line “setsockopt(socketFd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one);” enabled, the error occurs at connect; with the line disabled, the error occurs at bind.

When the error occurs, run the netstat command “netstat | grep 27511” (from the program output, I know the error occurs at port 27511). Below is the screenshot,

Figure 1. Cannot Assign Requested Address

As shown in figure 1, the TCP port is in the TIME_WAIT state.

1. The TCP State Transition at Closure

To close an established TCP connection, both endpoints send FIN packets to indicate there’s no more data. Upon receiving the other party’s FIN packet, each endpoint needs to ACK it.

The FIN packets are sent when a program calls exit(), close() or shutdown(). The ACKs are handled by the kernel after close() returns. Therefore, it is possible that the program finishes before the kernel releases the associated network resources, and another process won’t be able to use them until the kernel has freed them.

Below is a figure of detailed state transitions for an endpoint when TCP connection closes. It follows different paths depending on which side initiated the closure.

Figure 2. TCP State Transition at Closure (diagram from reference 4)

Note that TIME_WAIT only occurs at the endpoint which initiated the closure.


2. Why TIME_WAIT Is Needed

After the TCP connection is closed, there might still be live packets in the network. If a new connection is established with the exact same (client IP, client port, server IP, server port) tuple, the packets from the previous connection could be mistaken for packets of the new connection.

To avoid this, the TIME_WAIT duration is generally set to twice the maximum packet lifetime. This is long enough that all packets of the old connection will have died out by the time it expires. Note that having TIME_WAIT at one endpoint is enough to make sure that no two identical (client IP, client port, server IP, server port) tuples are in use at the same time.

3. How to Avoid the Problem

TIME_WAIT only occurs at the side which initiates the TCP connection closure, so a natural solution is to avoid being the side that calls close() first. If you have control over both client and server, you may want to let the client close first, so the server won’t end up with lots of ports in TIME_WAIT.

As indicated in the testing program, setsockopt() with SO_REUSEADDR allows you to bind a socket to a port which is in TIME_WAIT. If you use the socket as a client and try connecting to the same (server address, server port) tuple, you’ll fail at the connect stage. However, connecting to another (server address, server port) tuple is allowed.

If you use the socket as a server socket, you can also use SO_REUSEADDR. I’ve not tested what happens if the same (client address, client port) tries to connect again, but I guess the connection request will be denied.
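
For the server case, a minimal sketch of a listening socket with SO_REUSEADDR might look like the following. The function name open_reusable_listener, the loopback address and the backlog of 16 are choices made for this sketch only; the important detail is that the option is set before bind().

```c
#include <assert.h>
#include <string.h>
#include <unistd.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <arpa/inet.h>

/* Opens a TCP listening socket on the loopback address with SO_REUSEADDR
 * set, so bind() is not refused just because the port has connections in
 * TIME_WAIT. Passing port 0 lets the kernel pick a free port.
 * Returns the socket descriptor, or -1 on error. */
int open_reusable_listener(unsigned short port)
{
    struct sockaddr_in addr;
    int one = 1;
    int fd = socket(AF_INET, SOCK_STREAM, 0);

    if (fd < 0)
        return -1;
    /* The option has to be set before bind() to have any effect. */
    if (setsockopt(fd, SOL_SOCKET, SO_REUSEADDR, &one, sizeof one) < 0) {
        close(fd);
        return -1;
    }
    memset(&addr, 0, sizeof addr);
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = htonl(INADDR_LOOPBACK);
    addr.sin_port = htons(port);
    if (bind(fd, (struct sockaddr *)&addr, sizeof addr) < 0 ||
        listen(fd, 16) < 0) {
        close(fd);
        return -1;
    }
    return fd;
}
```

A server restarted right after shutdown would call open_reusable_listener() with its usual port instead of waiting for the old connections to leave TIME_WAIT.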

It’s also possible to modify the TIME_WAIT values on some operating systems.


1. Setting TIME_WAIT TCP, stackoverflow
2. TIME_WAIT and its design implications for protocols and scalable client server systems
3. The TIME_WAIT state in TCP and Its Effect on Busy Servers
4. Bind: Address Already in Use or How to Avoid this Error when Closing TCP Connections

Video Converter Android Has > 50,000 Downloads

Video Converter Android reached 50,000 downloads on Jan 21 2012. It reached this number in slightly more than one month.

Though it’s still a beta release with lots of issues and not so good reviews (~3.0), the download figure makes me believe it’s an app that people need. The app was even listed in the Top 50 of the Video & Media category for about two weeks.

The next release will be around the end of Jan 2012. I’ll also release a paid version for ARMv7 devices with NEON support. The new features include more settings for video conversion, such as resolution and bitrate. A basic ffmpeg command line interface will also be provided.

Stay tuned!

Android Custom Notification with Progress Bar

When your Android app is doing something that the user needs to wait for, it’s better to tell the user how much has been done and how much longer the wait will be. A progress bar comes in handy. Sometimes the user switches to another app but still wants to check on the task your app is doing; then a custom notification with a progress bar is your choice.

It’s used in Android Market downloads. When you’re downloading an app, a notification is posted to the status bar with a progress bar indicating the progress. This post is about how to make your own custom notification with a progress bar and what you should avoid. The screenshot is as below,

Figure 1. Custom Notification with Status Bar

If you simply want the code, go to the end of the post and download it. The Android doc (reference 1) provides a good step-by-step tutorial on how to post a notification and create a custom notification, so it’s not repeated here.

The Code

The code to create the notification bar is as below,

nm = (NotificationManager) this.getSystemService(Context.NOTIFICATION_SERVICE);
CharSequence tickerText = "hello";
long when = System.currentTimeMillis();
noti = new Notification(R.drawable.ic_launcher, tickerText, when);
context = this.getApplicationContext();
Intent notiIntent = new Intent(context, ProgressBarNotificationActivity.class);
// the intent targets an Activity, so getActivity is used rather than getService
PendingIntent pi = PendingIntent.getActivity(context, 0, notiIntent, 0);
noti.flags |= Notification.FLAG_AUTO_CANCEL;
CharSequence title = "Downloading initializing...";
RemoteViews contentView = new RemoteViews(getPackageName(), R.layout.noti);
// the view ids are the ones declared in noti.xml
contentView.setImageViewResource(R.id.status_icon, R.drawable.ic_launcher);
contentView.setTextViewText(R.id.status_text, title);
contentView.setProgressBar(R.id.status_progress, 100, 0, false);
noti.contentView = contentView;
noti.contentIntent = pi;


You’ll need to update the progress bar using a background thread. The code is as below,

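new Thread(new Runnable() {
    public void run() {
        int mCount = 0;
        mRun = true;
        while (mRun) {
            // stand-in for one step of the real task; the one second sleep is an assumption
            ++mCount;
            try {
                Thread.sleep(1000);
            } catch (InterruptedException e) {
                return;
            }
            CharSequence title = "Downloading: " + mCount%100 + "%";
            noti.contentView.setTextViewText(R.id.status_text, title);
            noti.contentView.setProgressBar(R.id.status_progress, 100, mCount%100, false);
            nm.notify(STATUS_BAR_NOTIFICATION, noti);
        }
    }
}).start();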

The XML layout is noti.xml,

<?xml version="1.0" encoding="utf-8"?>

<RelativeLayout xmlns:android=""




    <ImageView android:id="@+id/status_icon"






    <RelativeLayout android:layout_width="fill_parent"




        <TextView android:id="@+id/status_text" 





        <ProgressBar android:id="@+id/status_progress"






            style="?android:attr/progressBarStyleHorizontal"  />






What to Avoid

If your task is long running, DO NOT update the progress very frequently. Frequent custom notification updates consume lots of CPU and may hurt your app’s performance. A post here describes the details.


You can download the source file here or get it from my github.

1. Android Developer Doc: Status Bar Notification, available at

Frequent Custom Status Bar Notification is Evil on Android

The Story Behind

In my recent app, Video Converter for Android, I added a progress notification on the status bar so users can view the video conversion progress at any time. I also made several other changes, and then started running the final tests before release.

Then I found the app was much slower than the previous version, and I couldn’t think of why. I turned on the Power Tutor app and found that the system process was consuming a lot of CPU. The notification bar is updated through NotificationManager, which is a system service. I guessed it was because I was updating the status bar progress every second.

In the end, I changed the status progress update interval and the app worked as normal. But just to demonstrate that frequent custom status bar notification is really evil, I programmed a simple testing app that does nothing but update a status bar notification.

Why I Say it’s Evil

The key part of the code is as below (you can get the entire code at the end of the post),


CharSequence title = "Freq noti is evil: " + mCount;
CharSequence content = "Freq notification update takes too much CPU";
// the condition is missing from this listing; useCustomView is a hypothetical
// flag that selects the custom layout
if (useCustomView) {
    // view ids assumed to match the noti.xml layout from the progress bar post
    noti.contentView.setTextViewText(R.id.status_text, title);
    noti.contentView.setProgressBar(R.id.status_progress, 100, mCount%100, false);
} else {
    Intent notiIntent = new Intent(context, StatusBarNotificationActivity.class);
    PendingIntent pi = PendingIntent.getActivity(context, 0, notiIntent, 0);
    noti.setLatestEventInfo(context, title, content, pi);
}
//nm = (NotificationManager) getSystemService(Context.NOTIFICATION_SERVICE);
//nm = (NotificationManager) getApplicationContext().getSystemService(getApplicationContext().NOTIFICATION_SERVICE);


The testing app supports both custom and default status notification. The custom status bar notification contains an icon, a text view and a progress bar. It updates the notification once every second. 

The Power Tutor CPU power consumption curve is as below for default status notification,

Figure 1. CPU Power Consumption with Frequent Default Status Bar Notification Update

The left one is before testing app starts, the right one is when testing app is running. There’s almost no difference.

The Power Tutor CPU power consumption curve is as below for custom status notification,


Figure 2. CPU Power Consumption with Frequent Custom Status Bar Notification

The first one is before testing app starts, the second one is when testing app is running, and the third one is after testing app is finished. Clearly frequent update of the custom notification status consumes lots of CPU resources.

I also tried without updating the progress bar (so only the title text of the notification is updated), the consumption is less, but still much higher than default one.

Figure 3. Custom Notification without Progress bar

You can download the testing app here or from my github. Note that you might want to run the testing app for a while to see the results. In some of my tests, the CPU consumption is low initially but rises after a while.

The testing is done on Nexus One device with Android 2.3, the android sdk used for development is 2.1 and 4.0.

Power Tutor–Android App Measures Power Consumption of App

Mobile computing and smart phones would not be what we see today without powerful and energy efficient batteries. Mobile application developers, however, are not used to thinking about building energy efficient apps. Sometimes energy use can affect the application a lot.

I recently found a tool, Power Tutor, that runs on Android devices to measure the power consumption of the device. This post is about how Power Tutor works and how it can help in developing Android apps. It doesn’t cover every detail of the theory (refer to the research paper in reference 1 if interested), but briefly describes how it works in general.

Component-Based Measurement

Power Tutor measures energy consumption by component, including CPU, WiFi, 3G, display, etc. Each component has different states, and the power consumption rate differs from state to state. Power Tutor includes a set of parameters for each state describing how fast power is consumed in that state. The parameters were obtained by its developers from offline experiments. Using these parameters, they constructed a power model to estimate the energy. Below are screenshots of Power Tutor measurements for an Android phone.
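
The idea of a state-based model can be sketched in a few lines. This is not Power Tutor’s actual model or code: the struct, the function name estimate_energy_mj and the numbers in the usage note are hypothetical, and the real model in the paper is more detailed.

```c
#include <assert.h>
#include <stddef.h>

/* One component (CPU, WiFi, ...) observed in one power state. */
struct state_sample {
    double power_mw;   /* power draw of the state in milliwatts */
    double seconds;    /* time spent in the state */
};

/* Energy in millijoules: the sum of power (mW) * time (s) over all samples. */
double estimate_energy_mj(const struct state_sample *samples, size_t count)
{
    double total = 0.0;

    assert(samples != NULL || count == 0);
    for (size_t i = 0; i < count; ++i)
        total += samples[i].power_mw * samples[i].seconds;
    return total;
}
```

For example, a component that spends 2 seconds in a 400 mW state and 10 seconds in a 30 mW state is charged 400 * 2 + 30 * 10 = 1100 mJ.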

Figure 1. Power Measurement for Android Phone

The parameters used in the current app were obtained from several HTC phones, so it works best for those models (refer to reference 1 for model details). The method proposed by the Power Tutor developers/researchers is able to obtain the parameters on other phones, but the app doesn’t provide such a function.

Power Consumption for Each App

Power Tutor has an “Application Viewer” function to view the power consumption of each app.

On Android, each app/process is treated as a separate user with its own UID. Under the /proc/uid_stat/<UID>/ directory, lots of information is available about the app/process, including data transmitted, memory usage, etc. Power Tutor maps the process id to the UID, and then uses the stats found under the <UID> folder to decide the state of each component. Then, based on the power model, Power Tutor computes the energy consumption.

Power Tutor treats the energy consumption of each app as independent. In other words, Power Tutor assumes app A consumes the same amount of energy with or without app B running. In this way, based on the per-component statistics for each UID/app, Power Tutor obtains the power consumption of each UID/app, then sums them up to get the power consumption of the entire system.

If your app runs as a super user program, the Android system assigns it a UID of 0. This breaks the Android rule of “each app is treated as a separate user”. Power Tutor shows UID 0 as the “kernel” process.

Below are a screenshot of the stats for all running apps and a screenshot of the power consumption of the video converter app,

Figure 2. App-Based Power Measurement

How to Use it to Help Development

First of all, it can be used to measure the power performance of your app. Because Power Tutor breaks down the power consumption by component, you can also get an idea of what kind of activity is draining the battery in your app. Is it network communication through WiFi? Or too much CPU consumption?

Also, Power Tutor measures the power based on statistics obtained from the /proc and /sys directories, so it indirectly reflects the usage of hardware components. If your app constantly consumes a lot of CPU power, is it CPU bound? I give an example in another post here.

1. Power Tutor website:

Use memset to Avoid Loops in C

The Usage of memset

memset has the following definition,

void* memset(void* s, int c, size_t n);

The function sets each of the first n bytes of the memory pointed to by s to c and returns s. Note that although the second parameter has type int, it’s converted to unsigned char before the memory operation is carried out. Therefore, c should be in the range [0, 255].

memset works on bytes. If you use memset to fill an array of ints or doubles with an arbitrary value, in most cases it won’t work. memset works in the following situations (there may be more cases),

  • initialize char, int, float arrays to 0s.
  • initialize char arrays to any arbitrary value.
  • initialize int arrays to values with the same byte value for all 4 bytes (e.g. 0x01010101 = 16843009).

But we seldom encounter the last case.
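
The byte-wise behaviour is easy to check with a small helper. The sketch below uses int32_t so that every element is exactly 4 bytes; the function name fill_int32_bytes is made up for this example.

```c
#include <assert.h>
#include <stdint.h>
#include <string.h>

/* Sets every byte of an int32_t array to byte_value. Each element then
 * holds that byte repeated four times, e.g. 0x01 gives 0x01010101. */
void fill_int32_bytes(int32_t *a, size_t count, unsigned char byte_value)
{
    assert(a != NULL);
    memset(a, byte_value, count * sizeof a[0]);
}
```

Filling with 0x01 leaves every element equal to 16843009 rather than 1, filling with 0x00 gives 0, and filling with 0xFF gives -1. Only values whose four bytes are identical can be produced this way.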

memset and C Arrays

If the array has a fixed size (say int a[10][10]), the memory is allocated on the stack for a local array, or in static storage for a global or static array. Either way, the memory is contiguous for both 1-D and multi-dimensional arrays.

If the array is dynamic, the memory is allocated on the heap. If you’re allocating array memory using malloc, malloc only assures that the memory allocated by each call is contiguous. So if the array allocated is 1-D, the memory is contiguous; otherwise, it’s not guaranteed (since you’re calling malloc multiple times).

memset works on a block of contiguous memory. For a fixed-size array, say int a[10][10], you can simply use memset(a, 0, sizeof(a[0][0])*10*10) to initialize all elements to 0. For a dynamic array, the example below gives the idea,

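#include <stdio.h>
#include <string.h>
#include <stdlib.h>

int main() {
   int **a;
   int i, j;

   //one malloc for the row pointers, then one malloc per row
   a = (int **)malloc(sizeof(int *)*10);
   for (i = 0; i < 10; ++i) {
       a[i] = (int*)malloc(sizeof(int)*10);
   }

   //memset to all 0s, one call per row
   for (i = 0; i < 10; ++i) {
       memset(a[i], 0, sizeof(a[i][0])*10);
   }

   //print out the value
   for (i = 0; i < 10; ++i) {
       for (j = 0; j < 10; ++j) {
           printf("%d ", a[i][j]);
       }
       printf("\n");
   }

   //release the rows, then the row pointers
   for (i = 0; i < 10; ++i) {
       free(a[i]);
   }
   free(a);
   return 0;
}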
Essentially, one memset is needed for one malloc.
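
If the dimensions are known when the array is allocated, another option is to allocate the whole 2-D array with a single malloc, so that a single memset covers it. The helper below is a sketch of that idea; the name alloc_zeroed_grid and the flat i*cols + j indexing are choices made for this example.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

/* Allocates rows*cols ints as one contiguous block and zeroes it.
 * Element (i, j) lives at index i*cols + j. The caller frees the result.
 * Returns NULL if the allocation fails. */
int *alloc_zeroed_grid(size_t rows, size_t cols)
{
    int *grid;

    assert(rows > 0 && cols > 0);
    grid = malloc(rows * cols * sizeof *grid);
    if (grid != NULL)
        memset(grid, 0, rows * cols * sizeof *grid);   /* one malloc, one memset */
    return grid;
}
```

A 10 by 10 grid is then created with alloc_zeroed_grid(10, 10) and element (i, j) is accessed as grid[i*10 + j]. For the all-zero case, calloc gives the same result in one call.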

Use memset instead of Loops

memset is usually optimized by compilers to produce more efficient binary code than loops. Below is a testing code to compare the performance of loop initialization and memset.


/*
 * this is a small test program that compares memset with loop initialization
 */
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <sys/time.h>

#define D1 50
#define D2 90
#define D3 90

unsigned char (*pData)[D1][D2][D3];
unsigned char data1[D1][D2][D3];
unsigned char data2[D1][D2][D3];

//elapsed time between two timestamps in microseconds
static long elapsedUs(struct timeval st, struct timeval ed) {
   return (long)(ed.tv_sec - st.tv_sec) * 1000000L + (long)(ed.tv_usec - st.tv_usec);
}

int main() {
   struct timeval stTime, edTime;
   int LOOP_COUNT = 100;
   int cnt, i, j, k;

   memset(data1, 0x01, D1*D2*D3);
   memset(data2, 0x00, D1*D2*D3);

   //using memset
   pData = &data1;
   gettimeofday(&stTime, NULL);
   /* memset for the innermost dimension only
   for (cnt = 0; cnt < LOOP_COUNT; ++cnt) {
       for (i = 0; i < D1; ++i) {
           for (j = 0; j < D2; ++j) {
               memset(&(*pData)[i][j][0], 1, sizeof(unsigned char)*D3);
           }
       }
   }*/
   for (cnt = 0; cnt < LOOP_COUNT; ++cnt) {
       memset(&(*pData)[0][0][0], 1, sizeof(unsigned char)*D1*D2*D3);
   }
   gettimeofday(&edTime, NULL);
   printf("memset: %ld us\n", elapsedUs(stTime, edTime));

   //using loop
   pData = &data2;
   gettimeofday(&stTime, NULL);
   for (cnt = 0; cnt < LOOP_COUNT; ++cnt) {
       for (i = 0; i < D1; ++i) {
           for (j = 0; j < D2; ++j) {
               for (k = 0; k < D3; ++k) {
                   (*pData)[i][j][k] = 1;
               }
           }
       }
   }
   gettimeofday(&edTime, NULL);
   printf("loop: %ld us\n", elapsedUs(stTime, edTime));

   //(*pData)[0][0][0] = 88;    //uncomment to see the check below fail
   //to check two methods get same results
   for (i = 0; i < D1; ++i) {
       for (j = 0; j < D2; ++j) {
           for (k = 0; k < D3; ++k) {
               if (data1[i][j][k] != data2[i][j][k]) {
                   printf("it doesn't work\n");
                   return 1;
               }
           }
       }
   }
   return 0;
}
Compiling the code with “gcc -o test test.c” and running it with “./test” on a Lenovo laptop (dual core 2.0 GHz, 3 GB memory, Ubuntu 10.04) gives the results below,


memset is almost 20 times as fast as the loop method. Note that the commented code contains a test that uses memset for only the innermost loop, and it still beats the loop method. This means that even for a dynamically allocated array, where we can only use memset for the innermost dimension, memset can still improve performance over the all-loops method.



How to Associate Process with Port Number using lsof — in Linux and Android

Sometimes it’s good to know the port numbers used by certain processes. Developers can use this information for application-based packet filtering, per-app network statistics, etc. Admins can use it to diagnose the network, detect malware, etc.

Unfortunately, Linux doesn’t provide a direct API for this. But it does provide a tool called lsof (list open files), which lists information about files opened by processes. And in the Linux philosophy, everything is a file. Network sockets are no exception to this rule.

Yes, we can use lsof to associate a process with a port number.

For example, say I want to know which ports my Chrome browser is using. First I use ps -ax to get the process id (let’s say it’s 6903). Then I use the following command to get the UDP ports opened by it,

lsof -p 6903 | grep UDP

Below is a screenshot showing the results,

The command gets the files opened by process 6903 and then filters the results with “UDP”, so only the lines for files opened by process 6903 that contain “UDP” are displayed.

For TCP traffic,

lsof -p 6903 | grep TCP

And the screenshot,
As shown in the figures, both the source and destination port numbers are displayed.
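
If you want to use the port numbers in a program, they can be pulled out of the last column of the lsof output. The sketch below assumes numeric IPv4 output of the form local_address:port->remote_address:port (host and service names would need lsof’s -n and -P options to be turned into numbers). The function name parse_lsof_ports and the addresses in the usage note are made up for illustration.

```c
#include <assert.h>
#include <stdio.h>

/* Extracts the source and destination ports from an lsof NAME field such
 * as "192.168.1.5:51234->93.184.216.34:443 (ESTABLISHED)".
 * Returns 1 if both ports were found, 0 otherwise. */
int parse_lsof_ports(const char *name, int *src_port, int *dst_port)
{
    assert(name != NULL && src_port != NULL && dst_port != NULL);
    /* %*[^:] skips each address up to the colon in front of its port. */
    return sscanf(name, "%*[^:]:%d->%*[^:]:%d", src_port, dst_port) == 2;
}
```

For the string “192.168.1.5:51234->93.184.216.34:443 (ESTABLISHED)” the function reports source port 51234 and destination port 443. A socket that has no remote end, such as “*:5353”, makes it return 0.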

Lsof on Android

My phone is rooted and has BusyBox installed, so I don’t know whether lsof comes with Android by default or was installed by BusyBox. Either way, the one on my phone doesn’t work. Below is a screenshot,

Lots of things cannot be displayed properly and are replaced by question marks. Fortunately I found a workable binary from reference 2. You can also download the binary here.
Use the adb command to put the binary on your phone and give it executable permission. A sample execution gives the screenshot below,

Note that the “2>/dev/null” redirection discards some error messages printed by lsof.


1. Inside Geinimi Android Trojan. Chapter Two: How to check remotely the presence of the trojan.
2. Linux man page for lsof:

Bug Caused by pthread_create Input Parameters Passing

0. The Context

I was working on a multi-threaded application using the pthread library on Linux. One part of the application uses a thread pool with 100 threads. Each thread is supposed to finish its job and exit. The execution normally takes no more than a few seconds (we’ll call it a short-living thread).

1. Locate the Bug

The application behaved abnormally, so I added a few printf statements to find out what went wrong. It seemed the input parameters passed into the short-living thread changed inside the thread execution function. I know that usually means I’m not handling memory correctly.

2. The Reason and Fix

The input parameter is a data structure. I passed a pointer to the data structure to the thread execution function, and the content of the data structure is copied inside the thread execution function. Then the data structure instance is reused for the next thread.

Wait a minute. What if multiple threads are created by pthread_create at almost the same time? Before the copy operation is carried out within a thread, the content may already have been overwritten with the parameters for another thread.

I realized that I was not handling parameter passing in pthread correctly. I should allocate a separate data structure instance for each short-living thread in the thread pool, so that the data can never be messed up between threads.
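
A minimal sketch of the fix is shown below. It is not the code of my application: the struct job_args, the pool size of 8 and the doubling “work” are placeholders. The point is that every pthread_create call gets its own heap-allocated argument block, which the thread frees when it is done with it.

```c
#include <assert.h>
#include <pthread.h>
#include <stdlib.h>

#define WORKER_COUNT 8        /* hypothetical pool size for the sketch */

struct job_args {             /* hypothetical per-thread parameters */
    int id;
    int *result_slot;         /* where this worker writes its output */
};

static void *worker(void *p)
{
    struct job_args *args = p;

    *args->result_slot = args->id * 2;   /* stand-in for the real work */
    free(args);                          /* the worker owns its own copy */
    return NULL;
}

/* Starts WORKER_COUNT threads, each with its own heap-allocated arguments,
 * waits for them, and leaves one result per worker in results[].
 * Returns 0 on success, -1 if a thread could not be started. */
int run_workers(int results[WORKER_COUNT])
{
    pthread_t tids[WORKER_COUNT];
    int started = 0;
    int rc = 0;

    assert(results != NULL);
    for (int i = 0; i < WORKER_COUNT; ++i) {
        struct job_args *args = malloc(sizeof *args);
        if (args == NULL) {
            rc = -1;
            break;
        }
        args->id = i;
        args->result_slot = &results[i];
        if (pthread_create(&tids[i], NULL, worker, args) != 0) {
            free(args);
            rc = -1;
            break;
        }
        ++started;
    }
    for (int i = 0; i < started; ++i)
        pthread_join(tids[i], NULL);
    return rc;
}
```

Because no argument block is ever reused, it doesn’t matter how the threads are scheduled: worker i always sees id i. Compile with “gcc -pthread”.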

3. How to Avoid Such Bugs

Everybody who has programmed a multi-threaded application knows how difficult debugging can sometimes be. Shared data, scheduling, deadlock avoidance: there’s a lot to be careful about.

Program slowly, think before changing the code, and don’t leave everything to debugging.