Priority Queue and an Implementation using Heap in C

You may want to read the post “heap” first.

Priority Queue

A priority queue is a data structure that supports at least the following two operations,

1. insert: insert an item with priority

2. pull_min_element/pull_max_element: depends on your priority queue type (min priority-queue or max-priority-queue), the operation find the item with min/max priority, return the item and remove it from the queue.

A priority queue is usually implemented using a heap, but it doesn’t have to be.
A priority queue can be implemented using a simple linked list/array. Every time an insert operation occurs, we simply append it to the end of the list/array and when a pull request is issued, we iterate through the list/array and find the element to return and remove. Alternatively, we can keep the linked list/array sorted and simply remove the first item every time.

A priority queue can also be implemented using a self-balancing binary search tree (e.g. red-black tree). In this case, both insert and pull take O(logn) worst case running time.
But the most commonly used data structure for priority queue is heap. The simplest heap is binary heap. For the convenient of discussion, the description below assumes min binary heap.

When binary heap is used to implement a priority queue, both insert and pull can be done in O(logN) worst case time. Insert has an average of constant time. To build the heap from already exist N items, it takes O(N) worst case time.

C Implementation using Heap

The implementation of priority queue is just a simple wrapper of the heap implementation in the post “heap”.

#include <stdio.h>

#include <stdlib.h>

 

typedef struct heapData {

    //dummy

} heapData;

 

typedef struct heapNode {

    int value;

    heapData data;               //dummy

} heapNode;

 

typedef struct PQ {

    heapNode* heap;

    int size;

} PQ;

 

void insert(heapNode aNode, heapNode* heap, int size) {

    int idx;

    heapNode tmp;

    idx = size + 1;

    heap[idx] = aNode;

    while (heap[idx].value < heap[idx/2].value && idx > 1) {

    tmp = heap[idx];

    heap[idx] = heap[idx/2];

    heap[idx/2] = tmp;

    idx /= 2;

    }

}

 

void shiftdown(heapNode* heap, int size, int idx) {

    int cidx;        //index for child

    heapNode tmp;

    for (;;) {

        cidx = idx*2;

        if (cidx > size) {

            break;   //it has no child

        }

        if (cidx < size) {

            if (heap[cidx].value > heap[cidx+1].value) {

                ++cidx;

            }

        }

        //swap if necessary

        if (heap[cidx].value < heap[idx].value) {

            tmp = heap[cidx];

            heap[cidx] = heap[idx];

            heap[idx] = tmp;

            idx = cidx;

        } else {

            break;

        }

    }

}

 

heapNode removeMin(heapNode* heap, int size) {

    int cidx;

    heapNode rv = heap[1];

    //printf("%d:%d:%dn", size, heap[1].value, heap[size].value);

    heap[1] = heap[size];

    --size;

    shiftdown(heap, size, 1);

    return rv;

}

void enqueue(heapNode node, PQ *q) {

    insert(node, q->heap, q->size);

    ++q->size;

}

 

heapNode dequeue(PQ *q) {

   heapNode rv = removeMin(q->heap, q->size);

   --q->size;

   return rv; 

}

 

void initQueue(PQ *q, int n) {

   q->size = 0;

   q->heap = (heapNode*)malloc(sizeof(heapNode)*(n+1));

}

 

int main(int argc, char **argv) {

    int n; 

    int i;

    PQ q;

    heapNode hn;

    n = atoi(argv[1]);

    initQueue(&q, n);

    srand(time(NULL));

    for (i = 0; i < n; ++i) {

        hn.value = rand()%10000;

        printf("enqueue node with value: %dn", hn.value);

        enqueue(hn, &q);

    }

    printf("ndequeue all values:n");

    for (i = 0; i < n; ++i) {

        hn = dequeue(&q);

        printf("dequeued node with value: %d, queue size after removal: %dn", hn.value, q.size);

    }

}

Compile the code using,

gcc -o pq pq.c

A sample run would be as below,

References:
1. Wikipedia: http://en.wikipedia.org/wiki/Binary_heap
2. Programming Perls. 2nd Edition

Heap

Heap and Its Properties
Heap is a data structure frequently used to implement heap sort and priority queue. It has structure property and heap-order property. The simplest heap is binary heap. For the convenient of discussion, the description below assumes min binary heap.

Structure property: A heap is a complete binary tree. It means all except the bottom level of the heap is completely filled, and the bottom level is filled from left to right.

Heap-order property: For every node N in a heap, the value of its parent is smaller than or equal to the value of N, except for the root node, which has no parent. This means the smallest element is the root.

Insertion and Removal

Both insertion and root removal at a heap can be done in O(logN) worst case time. Insert has an average of constant time. To build the heap from already exist N items, it takes O(N) worst case time.

Insertion: to insert an element into a heap, first put the element at the first empty slot at the bottom of the heap binary tree. Then we check if the element is smaller than its parent. If no, the element insertion is done. If yes, we swap the element and its parent and the check is done again. The process repeats until the element is greater than or equal to its parent or it has reached the root.

Removal of Root: to remove root from a heap, we use the last element M at the bottom level to replace the element to be removed. After that, we check if M is smaller than all its children. If yes, the removal is done; otherwise, we select the smallest element from its children and swap M with that element. The process is done repeatedly until M is smaller than or equal to all its children or M has no child.

Build a Heap: to build a heap for existing N elements, one can insert N elements into heap, this takes NlogN time. There is better approach than this. We can initialize the heap without considering the order property. Then try to build the order property from bottom up to the root of the element. It’s proven that this consumes only O(n) time.

Implementation

Heap is often implemented using array. If we put the root at index 1, then the following relationship holds,

left child index = parent index * 2
right child index = parent index * 2 + 1
parent index = child index / 2

This makes traversal through the heap easy.

Below is a sample implementation of heap in C,

#include <stdio.h>

#include <stdlib.h>

 

typedef struct heapData {

    //dummy

} heapData;

 

typedef struct heapNode {

    int value;

    heapData data;               //dummy

} heapNode;

 

void insert(heapNode aNode, heapNode* heap, int size) {

    int idx;

    heapNode tmp;

    idx = size + 1;

    heap[idx] = aNode;

    while (heap[idx].value < heap[idx/2].value && idx > 1) {

    tmp = heap[idx];

    heap[idx] = heap[idx/2];

    heap[idx/2] = tmp;

    idx /= 2;

    }

}

 

void shiftdown(heapNode* heap, int size, int idx) {

    int cidx;        //index for child

    heapNode tmp;

    for (;;) {

        cidx = idx*2;

        if (cidx > size) {

            break;   //it has no child

        }

        if (cidx < size) {

            if (heap[cidx].value > heap[cidx+1].value) {

                ++cidx;

            }

        }

        //swap if necessary

        if (heap[cidx].value < heap[idx].value) {

            tmp = heap[cidx];

            heap[cidx] = heap[idx];

            heap[idx] = tmp;

            idx = cidx;

        } else {

            break;

        }

    }

}

 

heapNode removeMin(heapNode* heap, int size) {

    int cidx;

    heapNode rv = heap[1];

    //printf("%d:%d:%dn", size, heap[1].value, heap[size].value);

    heap[1] = heap[size];

    --size;

    shiftdown(heap, size, 1);

    return rv;

}

 

void buildHeap(heapNode* heap, int size) {

    int i;

    //apply shiftdown to all elements that has children, 

    //it can be proven that it runs on O(n) worst-cast time

    for (i = size/2; i > 0; --i) { 

        shiftdown(heap, size, i);

    }

}

 

int main(int argc, char **argv) {

    int n; 

    int i;

    heapNode *heap, *heap2;

    heapNode hn;

    n = atoi(argv[1]);

    //insert the element one by one to heap

    //store elements in heap2 without keeping the order-property

    //then buildHeap

    heap = (heapNode*)malloc(sizeof(heapNode)*(n+1));

    heap2 = (heapNode*)malloc(sizeof(heapNode)*(n+1));

    srand(time(NULL));

    for (i = 0; i < n; ++i) {

        hn.value = rand()%10000;

        printf("insert node: %dn", hn.value);

        insert(hn, heap, i); 

        heap2[i+1] = hn;

    }

    buildHeap(heap2, n);

    for (i = 1; i <= n; ++i) {

        printf("%d:%dn", heap[i].value, heap2[i].value);

    }

    

    printf("remove root values:n");

    for (i = n; i > 0; --i) {

        hn = removeMin(heap, i);

        printf("heap: min: %d, heap size before removal: %dn", hn.value, i);

        hn = removeMin(heap2, i);

        printf("heap2: min: %d, heap size before removal: %dn", hn.value, i);

    }

}

Compile the code with

gcc -o heap heap.c

A sample run would be as below,

heap

Figure 1. Sample Run of Heap Functions

The program builds two heaps. The first one insert the element one by one, the second heap is built with all elements available. As shown in the screenshot, the heap structures are different, but both the structure and order property are maintained in both heaps.

Heap is frequently used to implement priority queue and heapsort.

References:

1. Programming Pearls, 2nd edition.

2. Data Structures and Algorithms Analysis in C++, 3rd edition.

Depth-First Graph Traversal using Stack — a C Implementation

There’re two common ways of visiting all nodes of a graph, depth-first search and breadth-first search. These two graph traversal approaches relate to two popular data structures, stack and queue respectively. (One can also do it in a recursive manner. It’s more intuitive but less efficient.)

This post covers depth-first search with a stack data structure. The algorithm expressed in pseudo code is as below,

DFS (Graph, root):
   create a stack S
   push root node to S
   while S is not empty:
       pop an itme v from S
       mark the item v as visited
       for each node w that is directed from v:
           if w is not visited:
               push w to S

The first step of the implementation is to implement a stack. The stack implemented below has the basic functions like push, pop, ifEmpty, etc.

1. Stack Implementation

The header file stack.h, defines the data structures used in Stack and Graph.

#include <stdio.h>
#include <stdlib.h>
typedef struct Point {
   int h;
   int w;
} Point;
typedef struct GraphNode {
   struct Point pvalue;
   struct GraphNode* left;
   struct GraphNode* right;
} GraphNode;
typedef struct StackElement {
   struct GraphNode value;
   struct StackElement *up;
   struct StackElement *down;
} StackElement;
typedef struct Stack {
   struct StackElement* topP;
   struct StackElement* bottomP;
} Stack;
void initStack(struct Stack *s);
void push(struct Stack *s, struct GraphNode _value);
void pop(struct Stack *s);
struct GraphNode top(struct Stack *s);
int ifEmpty(struct Stack *s);

The actual implementation stack.c

/**
a C stack implementation using linked list
*/
#include "stack.h"
/*initialize the stack*/
void initStack(struct Stack *s) {
   s->topP = NULL;
   s->bottomP = NULL;
}
/*add an element to the top of the stack*/
void push(struct Stack *s, struct GraphNode _value) {
   //allocate a new StackElement for _value
   StackElement *newElement;
   newElement = (StackElement*) malloc(sizeof(StackElement));
   newElement->value = _value;
   newElement->up = NULL;
   newElement->down = NULL;
   if (s->topP == NULL) {
        //first element
       s->topP = newElement;
        s->bottomP = newElement;
   } else {
        //push it to the top
        newElement->down = s->topP;
        s->topP = newElement;
   }
}
/*delete the top element from the stack*/
void pop(struct Stack *s) {
   StackElement *element;
   if (s->topP == NULL) {
        //empty stack
        return;
   } else {
        element = s->topP;
        s->topP = element->down;
        if (s->topP != NULL)
             s->topP->up = NULL;
        free(element);
   }
}
/*get the top value of the stack, but don't delete it*/
struct GraphNode top(struct Stack *s) {
   return s->topP->value;
}
/*check if the stack is empty*/
int ifEmpty(struct Stack *s) {
   return (s->topP == NULL ? 1:0);
}

Note that the top() function above returns the top element of the stack and the pop() function deletes the top element from the stack. Some stack implementations has the pop() function does both return and delete. But I personally think separating them is more convenient as sometimes you just want to take a look at the stack but not delete anything from it.

The Stack above will take input element of GraphNode type, but you’re free to modify it to fit your needs.

2. The Depth-First Graph Traversal

For simplicity, an example of visiting all nodes of a binary tree is given below. The tree structure is included in the comment of the source code file dfs.c,

#include <stdio.h>
#include <stdlib.h>
#include "stack.h"
/* a simple graph (binary tree)
         (0,    3)   <– root node
          /     
      (1, 2)    (1,4)
       /       /   
   (2,1) (2,2) (2,3) (2,5)
*/
typedef struct Graph {
   struct GraphNode* root;
} Graph;
struct GraphNode initNode(int _h, int _w) {
   struct GraphNode node;
   node.pvalue.h = _h;
   node.pvalue.w = _w;
   node.left = NULL;
   node.right = NULL;
   return node;
}
int main() {
   struct Graph g;
   struct GraphNode currentNode;
   /*read in the graph from top to bottom*/
   g.root = (GraphNode*) malloc(sizeof(GraphNode));
   *g.root = initNode(0, 3);
   //level 2 nodes
   g.root->left = (GraphNode*) malloc(sizeof(GraphNode));
   *g.root->left = initNode(1, 2);
   g.root->right = (GraphNode*) malloc(sizeof(GraphNode));
   *g.root->right = initNode(1, 4);
   //level 3 nodes
   (*g.root->left).left = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->left).left = initNode(2, 1);
   (*g.root->left).right = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->left).right = initNode(2, 2);
   (*g.root->right).left = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->right).left = initNode(2, 3);
   (*g.root->right).right = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->right).right = initNode(2, 5);
   /*DFS traversal*/
   Stack s;
   initStack(&s);
   push(&s, *g.root);
   while (!ifEmpty(&s)) {
       currentNode = top(&s);
        pop(&s);
        //mark as visited, here we simply print the graph node value out
       printf("(%d, %d)n", currentNode.pvalue.h, currentNode.pvalue.w);
        //at most two children, left and right. As the binary tree is acyclic,
        // both nodes should not be visited yet, for cyclic graphs, 
        //one need additional methods to check if the nodes is visited or not before push
       if (currentNode.left != NULL) {
           push(&s, *currentNode.left);
       } 
       if (currentNode.right != NULL) {
           push(&s, *currentNode.right);
        } 
   }
   return 0;
}

3. Output

To compile the code in Linux, type the following command,

gcc stack.c dfs.c -o dfs

Then run the binary dfs,

        ./dfs

The output we’ll get is,

(0, 3)

(1, 4)

(2, 5)

(2, 3)

(1, 2)

(2, 2)

(2, 1)

which corresponds to the expected nodes visiting order in depth-first search. As we push the left node first, the traversal is from right to left.

Breadth-First Graph Traversal using Queue – a C Implementation

There’re two common ways of visiting all nodes of a graph, depth-first search and breadth-first search. These two graph traversal approaches relate to two popular data structures, stack and queue respectively. (One can also do it in a recursive manner. It’s more intuitive but less efficient.)

This post covers breadth-first search with a queue data structure. The algorithm expressed in pseudo code is as below,

BFS (Graph, root):
    create a queue Q
    enqueue root node to Q
    while Q is not empty:
        dequeue an itme v from Q
        mark the item v as visited
        for each node w that is directed from v:
            if w is not visited:
                enqueue w to Q

The first step of the implementation is to implement a queue. The queue implemented below has the basic functions like enqueue, dequeue, ifEmpty, etc.

1. A Queue Implementation

The header file queue.h, defines the data structures used in Queue and Graph.

#include <stdio.h>
#include <stdlib.h>
typedef struct Point {
   int h;
   int w;
} Point;
typedef struct GraphNode {
   struct Point pvalue;
   struct GraphNode* left;
   struct GraphNode* right;
} GraphNode;
typedef struct QueueElement {
   struct GraphNode value;
   struct QueueElement *next;
} QueueElement;
typedef struct Queue {
   struct QueueElement* head;
   struct QueueElement* tail;
} Queue;
void initQueue(struct Queue *q);
void enqueue(struct Queue *q, struct GraphNode _value);
void dequeue(struct Queue *q);
struct GraphNode front(struct Queue *q);
int ifEmpty(struct Queue *q);

The source file queue.c,

/**
a C queue implementation using linked list
*/
#include "queue.h"
/*initialize the queue*/
void initQueue(struct Queue *q) {
   q->head = NULL;
   q->tail = NULL;
}
/*insert an element to the end of the queue*/
void enqueue(struct Queue *q, struct GraphNode _value) {
   //allocate a new QeueuElement for _value
   QueueElement *newElement;
   newElement = (QueueElement*) malloc(sizeof(QueueElement));
   newElement->value = _value;
   newElement->next = NULL;
   if (q->head == NULL) {
        //first element
       q->head = newElement;
       q->tail = newElement;
   } else {
        //put it to the tail
        q->tail->next = newElement;
        q->tail = newElement;
   }
}
/*delete the first element from the queue*/
void dequeue(struct Queue *q) {
   QueueElement *element;
   if (q->head == NULL) {
        //empty queue
        return;
   } else {
        element = q->head;
        q->head = q->head->next;
        free(element);
   }
}
/*get the front value of the queue, but don't delete it*/
struct GraphNode front(struct Queue *q) {
   return q->head->value;
}
/*check if the queue is empty*/
int ifEmpty(struct Queue *q) {
   return (q->head == NULL ? 1:0);
}

Note that the front() function above returns the first element of the queue and the dequeue() function deletes the first element from the queue. Some queue implementations has the dequeue() function does both return and delete. But I personally think separating them is more convenient as sometimes you just want to take a look at the queue but not delete anything from it.

The Queue above will take input element of GraphNode type, but you’re free to modify it to fit your needs.

2. The Breadth-First Graph Traversal

For simplicity, an example of visiting all nodes of a binary tree is given below. The tree structure is included in the comment of the source code file bfs.c,

#include <stdio.h>
#include <stdlib.h>
#include "queue.h"
/* a simple graph (binary tree)
         (0,    3)   <– root node
          /     
      (1, 2)    (1,4)
       /       /   
   (2,1) (2,2) (2,3) (2,5)
*/
typedef struct Graph {
   struct GraphNode* root;
} Graph;
struct GraphNode initNode(int _h, int _w) {
   struct GraphNode node;
   node.pvalue.h = _h;
   node.pvalue.w = _w;
   node.left = NULL;
   node.right = NULL;
   return node;
}
int main() {
   struct Graph g;
   struct GraphNode currentNode;
   /*read in the graph from top to bottom*/
   g.root = (GraphNode*) malloc(sizeof(GraphNode));
   *g.root = initNode(0, 3);
   //level 2 nodes
   g.root->left = (GraphNode*) malloc(sizeof(GraphNode));
   *g.root->left = initNode(1, 2);
   g.root->right = (GraphNode*) malloc(sizeof(GraphNode));
   *g.root->right = initNode(1, 4);
   //level 3 nodes
   (*g.root->left).left = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->left).left = initNode(2, 1);
   (*g.root->left).right = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->left).right = initNode(2, 2);
   (*g.root->right).left = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->right).left = initNode(2, 3);
   (*g.root->right).right = (GraphNode*) malloc(sizeof(GraphNode));
   *(*g.root->right).right = initNode(2, 5);
   /*BFS traversal*/
   Queue q;
   initQueue(&q);
   enqueue(&q, *g.root);
   while (!ifEmpty(&q)) {
       currentNode = front(&q);
       //dequeue(&q); //you can either put dequeue here or at the end of while
        //mark as visited, here we simply print the graph node value out
       printf("(%d, %d)n", currentNode.pvalue.h, currentNode.pvalue.w);
        //at most two children, left and right. As the binary tree is acyclic,
        // both nodes should not be visited yet, for cyclic graphs, 
        //one need additional methods to check if the nodes is visited or not before enqueue
       if (currentNode.left != NULL) {
           enqueue(&q, *currentNode.left);
       } 
       if (currentNode.right != NULL) {
           enqueue(&q, *currentNode.right);
        } 
        dequeue(&q);
   }
   return 0;
}

3. Output

To compile the code in Linux, type the following command,

gcc queue.c bfs.c -o bfs

Then run the binary bfs,

        ./bfs

The output we’ll get is,

(0, 3)

(1, 2)

(1, 4)

(2, 1)

(2, 2)

(2, 3)

(2, 5)

which corresponds to the expected nodes visiting order in breadth-first search, level 0 node, follows by level 1 nodes, and finally level 2 nodes.

 

Sequential Containers and Container Adapters in C++ STL

Side Note: This is a post I wrote on Jul 7 2008. It was on my previous blog.

This article discusses the usage of three sequential containers in C++ STL, namely vectors, deques, and lists. It also covers two container adapters based on these containers, stack and queue.

<0> Sequential Containers
<0.0> Common operations for all sequential containers

a. Size operations.

There are three size operations for all of the three sequential containers.

size() returns the actual number of elements of the container.

empty() indicates whether the container is empty or not, it is usually more efficient than size() == 0.

max_size() returns the maximum number of elements a container might contain. The number is usually limited by PC hardwareor the maximum value of the type of the index.

b. Comparisons.

The common comparison operators ==, !=, <, >, <=, >= are applicable for containers. And they follow the rules listed below,

    b.0    Both containers must have the same type.

    b.1    Two containers are equal if both the elements and orders are the same.

    b.2    A lexicographical comparison is done when check if a container is less than the other.

c. [Speed Performance] Assignment and swap()

swap() is a preferrable way to do assignment if you don’t need the source container any more.

When you do assignment with containers, it copies all the elements of the source container and remove all the old elements in the destination container. The operation is usually of linear complexity.

swap() only swaps the internal pointers and it takes constant time.

<0.1> Vectors

The internal nature of a vector is dynamic array. Some points worth note-taking are,

a. random access: if the position is known, one can access any element directly using index in constant time. The iterator for a vector is random access iterator, so all the algorithms implemented in STL can be used.

b. Insert and delete: Insert [push_back()] and delete [pop_back()] at the end of a vector takes constant time. But the inerstions [insert()] and deletions [erase()] occurs at other parts of a vector normally requires linear time because these operations may need to displace a lot of elements.

c. Capacity: vector provides one additional size operation in addition to the three mentioned above, capacity(). It returns the number of elements a vector could hold without having to reallocate memory.

[Speed Performance] Reallocation takes linear time, to avoid it, we can use reserve() to ensure enough capacity for our vectors. reserve() takes linear time also, but it reduces the number of reallocations.

d. [Memory Performance] vector<bool>: it usually uses internally only 1 bit for an element. One can apply vectorName.flip() or vectorName[index].flip() to negate all the bits or a single bit.

<0.2> Deques

a. deques are very similar to vector, except that the underlying dynamic array is open at both ends for a deque.

Because of this differences, deque is fast for insertions and deletions at both ends.

b. Deque provides no support to control the capacity at the reallocation, so no capacity() and reserve() operations for deque.

<0.3> Lists

a. list is quite different from vector and deque, and it manages its elements as doubly linked list. The results are the following,

    a.0    A list does not provide random access.

    a.1    Deletion and insertion takes constant time at any position.

[Speed Performance] a.2 Lists provide special members for moving elements. These are faster versions of general algorithms that have the same names, because they only redirect pointers rather than copy and move the values.

            These functions include remove(), merge(), splice(), unique(), sort() and so on.

<1> Container Adapters

Container adapters adapt standard STL containers to fit special needs.

There are three container adapters in C++ STL, namely stack, queue and priority queue. Because priority queue is closely related to some other data structure called heaps, it is not covered here.

<1.0> Stacks

a. operations: push(), pop(), top(), empty(), size().

b. Implementation: the default implementation is deque. But since it only requires the push and pop elements at the top, it can also be implemented by vector or list also.

<1.1> Queues

a. operations: push(), pop(), front(), back(), size(), empty().

b. Implementation: the default implementation is deque. Because it requires push at the end and pop at the front, it can implemented by list also, but not vector.