Selenium: Reuse the HTTP Session in Firefox Browser

Selenium is a browser automation framework that can be used for automating web application testing, crawling web pages, and so on. A quick start guide is available on the Selenium project page.

This post gives an example of reusing an existing HTTP session in a Firefox browser with Selenium. The example is based on Selenium version 2.26.0.

The basic idea is to check whether a Firefox browser started by Selenium already exists. If so, we simply connect to it and reuse the browser session.

Below is an example that uses the same Firefox browser session to request 10 pages from Google Play. Note that the code does not work for an existing Firefox browser window that was not started by Selenium.






import java.io.IOException;
import java.net.InetSocketAddress;
import java.net.Socket;
import java.net.URL;

import org.openqa.selenium.WebDriver;
import org.openqa.selenium.firefox.FirefoxDriver;
import org.openqa.selenium.remote.DesiredCapabilities;
import org.openqa.selenium.remote.RemoteWebDriver;

public class FirefoxDriverReuseExample {

    //base URL of the pages to request; fill in before running
    private static String ROOT_URL = "";

    //IDs of the apps to request; fill in before running
    private static String[] appIds = new String[]{
    };

    private static void retrieveAllUrls() {
        WebDriver driver = null;
        int cnt = 0;
        boolean isRunning = false;
        for (int i = 0; i < appIds.length; ++i) {
            String url = ROOT_URL + appIds[i];
            try {
                //see if the web driver is running
                final Socket s = new Socket();
                s.connect(new InetSocketAddress("localhost", 7055));
                s.close();
                isRunning = true;
            } catch (IOException io) {
                isRunning = false;
            }
            if (!isRunning) {
                //start a new session
                driver = new FirefoxDriver();
                driver.get(url);
                String pageSource = driver.getPageSource();
                ++cnt;
            } else {
                //reuse the existing session
                try {
                    if (null == driver) {
                        //if not initialized yet, connect
                        driver = new RemoteWebDriver(new URL("http://localhost:7055/hub"), DesiredCapabilities.firefox());
                    }
                    driver.get(url);
                    String pageSource = driver.getPageSource();
                    ++cnt;
                } catch (Exception e) {
                    e.printStackTrace();
                }
            }
        }
        System.out.println("Total No. of requests and responses: " + cnt);
    }

    public static void main(String[] args) {
        retrieveAllUrls();
    }
}


Note that I used selenium-server-standalone-2.26.0.jar as the referenced library for the code.


Selenium project page:

HtmlUnit Memory Leak — A Workaround

HtmlUnit is a programmable browser without a GUI. It’s written in Java and exposes APIs that allow us to open a window, load a web page, execute JavaScript, fill in forms, click buttons and links, etc. HtmlUnit is typically used for website testing or for crawling information from web sites.

Recently I worked on a task that uses HtmlUnit 2.10 to retrieve information from web pages with fairly complex JavaScript. The JavaScript engine seems to leak memory: after loading a few web pages, memory usage grows beyond 1 GB and eventually an OutOfMemoryError occurs. The HtmlUnit FAQ page suggests calling WebClient.closeAllWindows(). I tried it, but it didn’t help.

Instead of digging into the JavaScript engine to find out why the error happened, I decided to use a workaround: a two-process approach. The main process keeps track of which pages have been crawled, what to crawl next, and so on, and creates a child process to do the actual retrieval using HtmlUnit. After the child process has crawled several pages, it exits, and the main process creates a new child process to crawl the next few pages. To keep things simple, the two processes use file I/O for inter-process communication (IPC): the child process writes out which pages it has crawled, and the main process reads that file to update its record of crawled pages.

Because all memory allocated to the child process is freed when it terminates, this approach works even with the memory leak unfixed, at the cost of some performance.
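The control flow of the two-process approach can be sketched in a few lines of shell. This is only an illustration under assumptions: the file names, the batch size and the crawl_batch function are made up for the example, and the child here just echoes page names where the real child would be a separate JVM running HtmlUnit.

```shell
#!/usr/bin/env bash
# Sketch of the parent/child batching loop; names and paths are hypothetical.
set -eu

workdir=$(mktemp -d)
todo="$workdir/todo.txt"         # every page that has to be crawled
done_file="$workdir/done.txt"    # IPC file: children append the pages they finished
batch_size=3                     # pages handled by one short-lived child

printf 'page%d\n' 1 2 3 4 5 6 7 > "$todo"
: > "$done_file"

# Stand-in for the real child process. It runs in its own process (bash -c),
# reads page names on stdin, "crawls" them, reports them in the IPC file and
# exits, so whatever memory it used is released by the operating system.
crawl_batch() {
    bash -c 'while read -r page; do echo "$page"; done' >> "$done_file"
}

children=0
while true; do
    # Parent bookkeeping: pages listed in todo but not yet reported as done.
    remaining=$(grep -vxFf "$done_file" "$todo" || true)
    if [ -z "$remaining" ]; then
        break
    fi
    printf '%s\n' "$remaining" | head -n "$batch_size" | crawl_batch
    children=$((children + 1))
done

crawled=$(wc -l < "$done_file" | tr -d ' ')
echo "crawled $crawled pages using $children child processes"
```

With seven pages and a batch size of three, the parent starts three children in turn. A smaller batch size keeps the memory of each child lower but pays the process start-up cost more often.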

1. HtmlUnit website:

git: Merge A Range of Commits from One Branch to Another

Suppose we have two branches, master and work. We developed a few features and fixed a few bugs on the work branch, and we want to merge only the bug fixes into the master branch. Below are the steps that should be carried out.

0. Check the commit range

Make sure we’re on the work branch. We can use either “git log” or a GUI tool like gitk to find the SHA1 IDs of the start and end commits. Below is a sample output of “git log”.

Suppose we want to merge the first two commits listed (the two most recent ones) from the work branch into the master branch. Their SHA1 IDs are 315125… and 46e768… respectively.

1. Check out a temporary branch

This can be done with the command below,

git checkout master
git checkout -b tmp

The tmp branch is identical to the master branch, and we are now on the tmp branch.

2. Move master branch ahead

This can be done with the following command,

git branch -f master 315125

We only need to provide the first few digits of the SHA1 ID; git is smart enough to figure out the rest. This sets the master branch to contain all the commits up to commit 315125.

3. Rebase

This is done with the command below,

git rebase --onto tmp 46e768~1 master

46e768 is the first commit we want to merge into the master branch. The command rebases the master branch onto the tmp branch, replaying all commits from 46e768 to 315125 on top of it. Since the tmp branch contains exactly what the original master branch contained, this is equivalent to adding the commits from 46e768 to 315125 to the master branch.

If there are any conflicts, we’ll need to resolve them and run

git rebase --continue

After it’s done, we’ll be on the master branch, ready to push the changes.
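To try the steps safely, the whole procedure can be replayed in a throwaway repository. The script below is a sketch under assumptions: the commit names (base, feature, fix1, fix2), the demo identity and the temporary directory are invented for the example, and tmp is deleted at the end, which the steps above do not mention.

```shell
#!/usr/bin/env bash
# Replays a range of commits from work onto master in a throwaway repository.
set -eu

repo=$(mktemp -d)
cd "$repo"
git init -q
git symbolic-ref HEAD refs/heads/master   # make sure the first branch is called master
git config user.name "Demo User"          # hypothetical identity for the demo commits
git config user.email "demo@example.com"

commit() {   # helper: create one file and commit it
    echo "$1" > "$1.txt"
    git add "$1.txt"
    git commit -q -m "$1"
}

commit base
git checkout -q -b work
commit feature        # a feature commit we do NOT want on master
commit fix1           # start of the range
commit fix2           # end of the range
start=$(git rev-parse work~1)
end=$(git rev-parse work)

git checkout -q master
git checkout -q -b tmp                       # step 1: tmp marks the old master
git branch -f master "$end"                  # step 2: move master to the end commit
git rebase -q --onto tmp "$start~1" master   # step 3: replay start..end onto tmp
git branch -q -D tmp                         # tmp is no longer needed

git log --format=%s master
```

The final log lists fix2, fix1 and base, and feature.txt is absent from the working tree, which shows that only the chosen range reached master.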

HTTP/HTTPS Load Testing with http_load–Part 2

This is the second part of HTTP/HTTPS load testing with http_load. Please read part 1 first.

2.3 Investigation of Weird Results

We add printf statements to print out the connect time and read response time. The original three lines of code are as below (they’re at different places of http_load.c file).

struct timeval now;

handle_connect( cnum, &now, 1 );

handle_read( cnum, &now );

The modified code is as below.

struct timeval now, nowt, nowtt;

(void) gettimeofday( &nowt, (struct timezone*) 0 );

handle_connect( cnum, &now, 1 );

(void) gettimeofday( &nowtt, (struct timezone*) 0 );

long connect_t = delta_timeval(&nowt, &nowtt);

printf("handle_connect time: %ld\n", connect_t);

(void) gettimeofday( &nowt, (struct timezone*) 0 );

handle_read( cnum, &now );

(void) gettimeofday( &nowtt, (struct timezone*) 0 );

long read_t = delta_timeval(&nowt, &nowtt);

printf("handle_read time: %ld\n", read_t);

We run the test using surls.txt and surls2.txt again. Below are the results.

First, surls.txt.

Then surls2.txt.

The connect time for github is more than 30 times the connect time for the site in surls2.txt. When we have multiple parallel connections, connecting takes so long that reading the responses is delayed. This has to do with http_load’s design, which we’ll discuss next.

3. The Drawbacks of http_load

http_load has the following drawbacks.

3.1. Single thread multiple connections

http_load uses the select system call to handle multiple connections with a single thread. This design works well when both the connect time and the response read time are short, which is the case for most HTTP requests but not for most HTTPS requests. As shown above, it takes almost half a second to connect to github.

When the connect time is long and there are parallel connections, events are likely to queue up. In other words, when a response arrives at the http_load host, http_load cannot read it immediately because it is busy establishing other connections or reading other responses. Therefore, the measured response time is not accurate, as it includes the time the response waited before http_load read it.

A better design could be a thread-pool approach, which would reduce the wait time because multiple responses could be read by multiple threads concurrently.

With the current http_load, when testing HTTPS it’s better to set -parallel 1 and run multiple instances of http_load. Note that setting -rate may not help, as it can still introduce parallel connections.
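Running several single-connection instances side by side can be scripted as below. This is a minimal sketch under assumptions: the variable names, the instance count of 5 and the log file names are made up, and when no http_load binary is found at the given path the script only prints the commands it would run.

```shell
#!/usr/bin/env bash
# Starts several http_load instances, each limited to one connection.
set -eu

HTTP_LOAD=${HTTP_LOAD:-./http_load}     # path to the http_load binary
INSTANCES=${INSTANCES:-5}               # hypothetical number of simulated users
RUN_SECONDS=${RUN_SECONDS:-10}          # test duration passed to -seconds
URL_FILE=${URL_FILE:-surls.txt}         # URL file from the example above

started=0
for i in $(seq 1 "$INSTANCES"); do
    cmd="$HTTP_LOAD -parallel 1 -seconds $RUN_SECONDS $URL_FILE"
    if [ -x "$HTTP_LOAD" ]; then
        # one process per simulated user, each writing its own report
        $cmd > "http_load_$i.log" 2>&1 &
    else
        echo "dry run: $cmd > http_load_$i.log"
    fi
    started=$((started + 1))
done
wait    # block until every background instance has finished
echo "launched $started instances"
```

Each instance prints its own statistics, so the per-instance logs have to be combined by hand to get overall figures.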

3.2. Read all URLs into memory

http_load loads every URL into memory before it starts to send out HTTP requests. This makes it easy to pick a URL at random for each request, but a big URL file may cause a large amount of memory allocation and a long initialization time. I encountered cases where a 10-second test took more than 10 minutes to finish when the input file contained about half a million URLs.

In my opinion, Apache JMeter is a better alternative for HTTPS testing.

1. http_load website:

HTTP/HTTPS Load Testing with http_load–Part 1

http_load is a simple open source tool that sends multiple HTTP requests in parallel. It runs in a single thread and picks URLs at random from an input file. Both HTTP and HTTPS are supported.

0. Build http_load

Building http_load is quite simple.

  • Extract the source archive: tar xvf http_load-12mar2006.tar.gz
  • Open a terminal, go to the extracted directory, and type make to build it.
  • Start http_load with the ./http_load command.

1. Use http_load for HTTP testing

1.1 Controlling Request Rate

http_load supports a few options. We can control the request rate with either -parallel or -rate. -parallel sets the number of concurrent connections. For example, if we specify -parallel 5, http_load will try to establish and maintain 5 connections (creating a new connection whenever one is closed) and send one request through each connection.

-rate controls the request rate in a different manner: it is the number of requests sent out per second. If we specify -rate 5, http_load will make 5 requests per second and determine internally how many connections to open. We can specify either -parallel or -rate, but not both.

1.2 Controlling Test Time

Another important parameter is -seconds, which specifies how long the test should run. Alternatively, we can specify -fetches, the number of requests to send over the entire test.

1.3 Specify Test URLs

The last parameter is always the url_file, a file that contains the list of URLs to test. http_load reads all URLs at the beginning and picks one at random for each request. There are a few other parameters; refer to the http_load help message for more information.

1.4 An Example

As an example, suppose we have a URL file named urls.txt that contains the following URLs.

Then if we want to simulate 5 users sending requests for 10 seconds, we can specify the command as below,

./http_load -parallel 5 -seconds 10 urls.txt

http_load will output the results once the test is completed.

From the results, we can read the min, max and average response time, and throughput (No. of fetches per second) etc.

2. Use http_load for HTTPS Testing

2.1 Compile http_load with HTTPS support

By default, HTTPS support is disabled. We can enable it by uncommenting the lines below in the Makefile.

#SSL_TREE =     /usr/local/ssl
#SSL_DEFS =     -DUSE_SSL
#SSL_INC =      -I$(SSL_TREE)/include
#SSL_LIBS =     -L$(SSL_TREE)/lib -lssl -lcrypto

Once the above lines are uncommented, simply type the following commands in a terminal to rebuild.

make clean
make

2.2 An Example

HTTPS testing works the same way as HTTP testing. Suppose we have the URLs below in a file named surls.txt.

And another file named surls2.txt.

We first test with surls.txt, and below is the result.

Then surls2.txt, and below is the result.

The result for surls.txt seems weird. As we increase the number of parallel connections, the response time increases too. For a website like github, this is unlikely to be a server-side issue. Could it be a client-side issue? No, because the test with surls2.txt shows that the client can handle 30 parallel connections just fine. So why? We investigate this issue in the next post.

Apache JMeter: Input from File, with HTTP Testing as An Example

The previous post covers how to set up basic HTTP testing with Apache JMeter. We often need to read input from a file, say, a list of URLs for testing, and Apache JMeter can easily be configured to do so. We’ll walk through an HTTP test that reads URLs from an input text file.

0. Add Thread Group: Start JMeter, right click Test Plan > Add > Thread Group.

1. Add HTTP Request: Right click Thread Group > Add > Sampler > HTTP Request, specify the Path as ${PATH}, and leave everything else as default. This is shown below,

2. Add CSV Data Set Config: Right click Thread Group > Add > Config Element > CSV Data Set Config. Set Filename to the file that contains the input URLs, Variable Names to “PATH”, and Delimiter to “\n”, so that JMeter parses the URLs line by line and feeds each line to the PATH variable, which we used in the HTTP Request at step 1. Two other important settings are Recycle on EOF and Stop Thread on EOF.

Recycle on EOF: if set to true, the file is re-read from the beginning once it reaches EOF. In this test, we set it to false.

Stop Thread on EOF: whether the thread should be stopped on EOF when Recycle on EOF is set to false. In this test, we set it to true.

Below is a screenshot of what the settings look like,

3. Add Listeners: Add listeners to show the test results. We added Graph Results, View Results Tree and View Results in Table in our test.

4. Adjust Thread Group Settings: In our test, we set the settings as below,

In this case, JMeter will start 5 threads within 1 second, and each thread will fetch 200 URLs from the input text file. Thread 1 fetches the first line, thread 2 fetches the second line, and so on. Since we set Recycle on EOF in the CSV config to false, the test ends either when the end of the input file is reached or when 1000 URLs have been requested.

If we set Recycle on EOF to true, the test ends only when 1000 URLs have been requested, as JMeter re-reads the URLs from the beginning of the input file once it reaches EOF.
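The request counts described above can be worked out with a little arithmetic. The sketch below is a back-of-the-envelope estimate rather than a measurement: the thread and loop counts are the ones from this test, while the 600-line input file is a hypothetical size chosen for the example.

```shell
#!/usr/bin/env bash
# Rough request-count estimate for the Thread Group and CSV settings above.
set -eu

threads=5               # Number of Threads in the Thread Group
loops=200               # Loop Count in the Thread Group
url_lines=${1:-600}     # hypothetical number of lines in the input file

planned=$((threads * loops))    # upper bound set by the Thread Group

# Recycle on EOF = false, Stop Thread on EOF = true: the file is read once,
# so the test cannot request more URLs than the file has lines.
if [ "$url_lines" -lt "$planned" ]; then
    no_recycle=$url_lines
else
    no_recycle=$planned
fi

# Recycle on EOF = true: the file wraps around, so only the Thread Group limits the test.
recycle=$planned

echo "recycle=false: about $no_recycle requests"
echo "recycle=true:  $recycle requests"
```

Passing a different line count as the first argument shows when the file, rather than the Thread Group, becomes the limiting factor.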

5. Test results: We can run the test and view the results in the three listeners, each in a different format.

Apache JMeter user manual:

Apache JMeter for HTTP Testing

Apache JMeter is a Java application used for various kinds of testing and performance evaluation. It was originally designed for web testing but has been extended to support other functions.

This post covers the basic steps of setting up JMeter for HTTP testing on Ubuntu.

0. Install JMeter

On Ubuntu Linux, type the command below,

$ sudo apt-get install jmeter

Alternatively, one can go to the official website to download the latest release. Once installed, start JMeter with the jmeter command.

1. Right click Test Plan > Add > Thread Group. The Thread Group has several fields that allow us to adjust the testing behavior.

Figure 1. Thread Group

2. Right click Thread Group > Add > Config Element > HTTP Request Defaults. The settings in this configuration element will be applied to all HTTP requests we create later. We simply set the server name and leave the rest of the settings as default. This is shown below,

Figure 2. HTTP Request Defaults

3. Right click Thread Group > Add > Sampler > HTTP Request. We leave all settings unchanged except for the path, which we set to the URL we want to test. In this case, we want to test the YouTube About page. This is shown below,

Figure 3. HTTP Request

We also add another HTTP Request to test the Press & Blogs page.

4. Right click Thread Group > Add > Listener > View Results in Table. This will allow us to view all the test results.

5. Save the test plan and run the test by clicking Run > Start.

6. Below are the test results.

Figure 4. View Results in Table

7. We can add other listeners like View results in Tree, Graph Results etc. Each listener allows us to see the results in a different way.

1. Apache JMeter website:

How to Import a Maven Project into Eclipse

A project managed by Maven cannot be imported into Eclipse by default. However, the conversion step is really simple.

1. Convert the project

Issue the command below at the project directory where pom.xml is found.

$ mvn eclipse:eclipse

This will generate the Eclipse files, including .project and .classpath.

2. Add the local Maven repository to the classpath in Eclipse.

  • In the Eclipse IDE, select Window > Preferences
  • Select Java > Build Path > Classpath Variables
  • Click the New button and define a new variable named M2_REPO that points to your local Maven repository. On Linux, it is ~/.m2/repository by default; on Windows, it is C:\Documents and Settings\username\.m2\repository by default.
  • Click OK all the way to finish the setting.

3. In Eclipse, select File > Import > Existing Projects into Workspace, just as you would for a typical Eclipse project.

Publish Git Repository to Server

At times we’ll need to create a git repository and put it on a server where other developers can access it. Of course you can do this with github or another git hosting service, but what if you need to put it on your own server?

1. Create git directory locally

1.a To make your current directory a git repository, enter the command,

git init

1.b To make another directory a git repository, enter the command,

git init <directory name>

After initializing the git repository, you can add some files. If the repository is empty when you publish it to the server, it’ll be more troublesome to configure.

2. Add new files

git add <project files and folders>
git commit -am "check in msg"

3. Publish repository to server

git clone --bare <directory path>/<directory name> <project name>.git
scp -r <project name>.git <ssh username>@<ssh server address>:<your git server path>

4. Commit changes to server

git push <ssh username>@<ssh server address>:<your git server path>/<project name>.git

or simply

git push

5. Clone the project

git clone <ssh username>@<ssh server address>:<your git server path>/<project name>.git

6. Get Changes

git pull <ssh username>@<ssh server address>:<your git server path>/<project name>.git

or simply

git pull
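The whole publish, clone, push and pull cycle can be rehearsed on one machine before touching a real server. The script below is a sketch under assumptions: a local directory stands in for the server, cp -r plays the role of scp -r, and the directory names and demo identities are made up. It publishes a bare clone (git clone --bare), because by default git refuses a push into a repository whose checked-out branch would be updated.

```shell
#!/usr/bin/env bash
# Local rehearsal of the publish/clone/push/pull cycle; a directory stands in for the server.
set -eu

base=$(mktemp -d)
server="$base/server"        # hypothetical stand-in for <your git server path>
mkdir -p "$server"

# Steps 1 and 2: create a repository and commit a file
git init -q "$base/project"
cd "$base/project"
git config user.name "Demo User"
git config user.email "demo@example.com"
echo "hello" > README
git add README
git commit -q -m "check in msg"

# Step 3: publish a bare copy (cp -r plays the role of scp -r)
git clone -q --bare "$base/project" "$base/project.git"
cp -r "$base/project.git" "$server/"

# Step 5: another developer clones from the "server"
git clone -q "$server/project.git" "$base/dev2"
cd "$base/dev2"
git config user.name "Second Dev"
git config user.email "dev2@example.com"
echo "more" >> README
git commit -q -am "second change"

# Step 4: push the new commit back to the "server"
git push -q

# Step 6: the original repository pulls the change
cd "$base/project"
branch=$(git rev-parse --abbrev-ref HEAD)
git pull -q "$server/project.git" "$branch"
cat README
```

After the pull, README in the original repository contains both lines, which confirms that the change travelled through the published repository.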

How to Set up Nagios

A Nagios monitoring system is distributed: it consists of a central server and a number of client nodes. Two modes of monitoring are supported, namely active checks and passive checks. Active checks require the server to execute the checks and can therefore lead to a high processing load on the server as the number of client nodes increases. Passive checks, on the other hand, are executed on the client nodes and the results are sent to the server.

This post covers how to set up Nagios with passive checks. Each client is customized to monitor the tasks running on it and send the results to the server. The central server receives the results, renders them through a web interface and sends email notifications for critical issues.

1. Nagios Server Set Up

The Ubuntu repository has Nagios packages. Though they’re not the latest, they are stable and easy to install. Simply issue the command below,

sudo apt-get install nagios3

This will likely install the postfix mail transfer agent (MTA) too. postfix is required to send out notification emails. Simply follow the installation screens to set up postfix.

1.1 Web Interface Configuration

The Nagios server provides a web interface with the help of the Apache web server. Normally everything is taken care of by the installation process, and one can enter the URL in a browser to view the web interface. The URL is normally http://<ip or hostname>/nagios3/. Below is a screenshot of the Nagios display on my setup.

Figure 1. Nagios Web Interface

1.1.1 Apache Virtual Host

In case your machine has some sites set up already, and you want to configure Nagios web interface as a virtual host, this can be done easily.

The nagios installation comes with a sample apache configuration file. One can simply use the configuration with slight modifications.

a. Copy the configuration files

cp -rpf /usr/share/doc/nagios3-common/examples/apache2.conf /etc/apache2/sites-available/nagios

ln -sf /etc/apache2/sites-available/nagios /etc/apache2/sites-enabled/nagios

b. Edit the file

Then we’ll need to add the following two lines.

To the beginning of the file,

<VirtualHost *:80>

To the end of the file,

</VirtualHost>

c. Restart apache2 and nagios3

sudo service apache2 restart

sudo service nagios3 restart


1.2 Configuration

Nagios3 configuration files are under the directory /etc/nagios3/.

nagios.cfg is the main configuration. You can read information about other configuration files and directories from the file.

By default, there’s also a commands.cfg file, which defines the commands, and a cgi.cfg file, which contains the configuration for the CGIs.

More configuration files can be found under /etc/nagios3/conf.d directory.

1.2.1 Host and Host Group Configuration

In order to monitor a client node, we’ll need to define it as a host. We can create a new file named host.cfg under the conf.d directory and add the host definition to it. A sample host definition is as below,

define host {
   use        generic-host
   host_name  testMachine
   alias      some remote host
}

This definition inherits the generic-host definition, which is available in the conf.d/generic-host_nagios2.cfg file.

We can define as many hosts as we want, in the same file or in different files.

When the number of hosts increases, we want to group them into host groups for easier management and configuration. By default, a number of host groups are defined in the hostgroup_nagios2.cfg file. We can add our own host group by appending the following to the end of the file,

define hostgroup {
               hostgroup_name  test-servers
               alias           SSH servers
               members         testMachine
}

When you have multiple hosts for a group, you can use “,” to separate them.

1.2.2 Service Configuration

The default services are defined in the conf.d/services_nagios2.cfg file. Here we describe how to define a passive service.

First, create a file named passive-service_nagios2.cfg under the conf.d directory and copy the content below into it,

define service {
       use     generic-service
       name    passive_service
       active_checks_enabled   0
       passive_checks_enabled  1
       flap_detection_enabled  0
       register                0
       is_volatile             0
       check_period            24x7
       max_check_attempts      1
       normal_check_interval   5
       retry_check_interval    1
       check_freshness         0
       contact_groups  admins
       check_command   check_dummy!0
       notification_interval   120
       notification_period     24x7
       notification_options    w,u,c,r
       stalking_options        w,c,u
}

This basically defines a generic passive service, in which active checks are disabled and passive checks are enabled.

Now create a file, say test.cfg, and put the content below into it,

define service {
       use     passive_service
       service_description     TestMessage
       host_name       testMachine
}

Alternatively, you can use the host group we defined previously instead of the host name. This can be handy when you have lots of client nodes.

define service {
       use     passive_service
       service_description     TestMessage
       hostgroup_name  test-servers
}

1.2.3 Define the Passive Check Command

In the passive_service defined above, we called a command “check_dummy!0”, which has not been defined yet.

Normally the external commands exist as plug-ins for Nagios. The executables can be found under /usr/lib/nagios/plugins/ directory. The configuration files for the plug-ins are found at /etc/nagios-plugins/config/ directory.

To continue our example, we create a configuration file passive_check.cfg under /etc/nagios-plugins/config/ with the following content,

define command {
       command_name    check_dummy
       command_line    $USER1$/check_dummy $ARG1$
}

Note that the check_dummy executable already exists, so that’s all we need to do.

1.2.4 Enable Passive Checks

By default, passive checks are disabled. We’ll need to open /etc/nagios3/nagios.cfg, the main configuration file, and set “check_external_commands=1”. You may also want to adjust command_check_interval to control how often Nagios checks for updates from client nodes.

1.2.5 Install NSCA at Server

nsca is an add-on that allows the clients to send passive check results to the server. One can download nsca from reference 1.

To compile nsca,

./configure
make all

If everything goes right, the executables nsca and send_nsca will be under the src/ directory.

We can copy nsca and sample-config/nsca.cfg to the /etc/nagios3 directory and start the nsca server with ./nsca -c /etc/nagios3/nsca.cfg.

nsca is then ready to accept check results from client nodes and forward them to the Nagios server.

1.2.6 Email Notification

Notifications in Nagios can be configured with great flexibility. Here we simply show how to enable straightforward email notification.

In the conf.d directory, open contacts_nagios2.cfg and change the email address for contact_name root to our own email address.

We can use the following command to test whether the email server works.

echo "hello" | /usr/bin/mail -s "hello" <your email address>

2. Client Node Configuration

The client node configuration is simple. Create a folder, say /opt/nagios/, for all the client-side files. Copy the send_nsca and sample-config/send_nsca.cfg files to this folder. Now we’re ready to write check scripts.

The script below checks whether certain processes are running and then sends the result to the server.








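#!/bin/bash
HOST="testMachine"           #must match the host_name defined on the server
SERVICE="TestMessage"        #the nagios nsca service name, must match service_description
NSCA_PATH="/opt/nagios/send_nsca"          #assumes the folder created above
CONFIG_PATH="/opt/nagios/send_nsca.cfg"
OUT_PATH="/opt/nagios/out.txt"
PSs=()     #fill in the names of the processes to check
FPSs=()    #processes found not running
SUCCESS_DESP="All required processes are running"
FAIL_DESP="At least one required process is not running"
for pc in "${PSs[@]}"; do
   if ps ax | grep -v grep | grep "$pc" > /dev/null; then
       echo "$pc is running"
   else
       FPSs=("${FPSs[@]}" "$pc")
   fi
done
echo ${#FPSs[@]}
if [ "${#FPSs[@]}" -eq "0" ]; then
   printf "%s\t%s\t%s\t%s\n" "$HOST" "$SERVICE" 0 "$SUCCESS_DESP" > "$OUT_PATH"
else
   FAIL_DESP="${#FPSs[@]} processes are not running:"
   for fps in "${FPSs[@]}"; do
       FAIL_DESP="$FAIL_DESP $fps "
   done
   printf "%s\t%s\t%s\t%s\n" "$HOST" "$SERVICE" 2 "$FAIL_DESP" > "$OUT_PATH"
fi
"$NSCA_PATH" -H localhost -p 5667 -c "$CONFIG_PATH" < "$OUT_PATH"   #use the Nagios server address instead of localhost on a remote client

If we check out.txt, the following is what gets sent to the nsca server,

testMachine TestMessage 0 All required processes are running

The fields are separated by tabs (\t): host name, service name, status (0 for OK, 1 for warning and 2 for critical), and a description message. The line ends with a newline character.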
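The line format is easy to get wrong when building it by hand, so a small helper can be used to assemble and check it. This is a sketch under assumptions: the nsca_line function is a hypothetical helper, the host and service names are the ones from the definitions earlier, and the failing process name apache2 is invented for the example.

```shell
#!/usr/bin/env bash
# Builds one passive check result in the tab-separated form described above.
set -eu

nsca_line() {   # usage: nsca_line <host> <service> <status> <message>
    printf '%s\t%s\t%s\t%s\n' "$1" "$2" "$3" "$4"
}

# A critical result (status 2) for a hypothetical failed process.
line=$(nsca_line "testMachine" "TestMessage" 2 "1 processes are not running: apache2")

# Sanity checks: exactly four tab-separated fields, status in the third one.
fields=$(printf '%s\n' "$line" | awk -F'\t' '{ print NF }')
status=$(printf '%s\n' "$line" | cut -f3)
echo "fields=$fields status=$status"
```

The output of nsca_line can be written to a file or piped straight into send_nsca. Spaces inside the message are fine because only tabs separate the fields.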

3. Others

For debugging,

tail -f /var/log/syslog

For starting, stopping and restarting nagios3:

sudo service nagios3 start

sudo service nagios3 stop

sudo service nagios3 restart

For reloading config files,

sudo service nagios3 reload


1. NSCA:

2. Nagios Installation guide.

3. Nagios NSCA Installation guide.