Hough Transform for Line Detection – An Intro and Sample Code

The Hough transform was originally proposed to detect lines in images. It was later generalized to detect arbitrary shapes that can be represented by a set of parameters. This blog post only covers the Standard Hough Transform for line detection.

The Basic Idea

The basic idea of the Standard Hough Transform can be divided into three phases.

1. Represent each point of the image boundary in some parameter space.

2. Carry out a voting procedure in the parameter space.

3. Compare the local maxima found in the voting against certain thresholds to decide whether an instance of the searched object exists.

Take line detection as an example. A straight line can be represented as,

 ρ = x cosθ + y sinθ            (1)

where ρ is the distance between the line and the origin, and θ is the angle of the vector from the origin to the nearest point of the line. (To indicate that θ and ρ are constants for a given line, we mark them as θ0 and ρ0.) You can see this in the left part of Figure 1 below,

Figure 1. A Straight Line in Image Domain and Hough Domain

For formula (1) above, if we view x, y as constants and ρ, θ as parameters, we get the right part of Figure 1. In other words, when ρ, θ vary and x, y stay constant (say x = x0, y = y0), the sinusoidal curve in the right part represents all the possible lines passing through the point (x0, y0) in the image domain.

Thus, for each point in the image domain, there's a corresponding curve in the Hough domain.

If there’s a line in the image domain, which consists of many points, then there’ll be an intersection for all the curves of those points in hough domain. (The intersected point will be (θ0, ρ0) as mentioned before) In other words, a line in image domain is essentially a point in hough domain. Therefore, the problem of detecting lines is equivalent to find an intersection in hough space.

In the voting stage, an accumulator is established over a discrete set of (θ, ρ) values. For each point in the image, we calculate the value of ρ for a sampled set of θ values, and the vote count in the corresponding accumulator cell is incremented by 1.

At the end of the voting, the local maxima are retrieved, and if a vote count exceeds a threshold, we consider it highly likely that a line with the corresponding (θ, ρ) value exists.

A Sample Implementation

The Hough transform line detection algorithm is implemented in Matlab as below,

function [ O ] = hough_transform( fPath )
%   fPath: the path to the image
%   this program implements the Standard Hough Transform
%   to detect lines; each row of O is the [rho theta] of a line
    rhoStep=1;
    thetaStep=1;
    rhoDiffThresholdForLine=rhoStep/8;
    I=imread(fPath, 'gif');
    %find the edges of the image
    BW=edge(I,'canny');
    %imshow(BW);
    %hough transform
    %define the accumulator range; rho can be negative for
    %theta in (90, 180), so the range covers [-D, D]
    D=sqrt((size(BW,1))^2 + (size(BW,2))^2);
    rho=-D:rhoStep:D;
    theta=0:thetaStep:180-thetaStep;
    accu=zeros(length(rho), length(theta));
    %get the pixel indices that contain an edge point
    [rowInd, colInd]=find(BW);
    %for each point, plot all the (sampled) lines passing
    %through it in the theta-rho plane
    for li=1:1:length(rowInd)
        for lk=1:1:length(theta)
            ltheta=theta(lk)*pi/180;
            lrho=colInd(li)*cos(ltheta) + rowInd(li)*sin(ltheta);
            %bin the lrho value
            diffs=abs(lrho-rho);
            %we only increase the count of the most similar bins,
            %introducing a threshold instead of choosing the min
            minDiff=min(diffs);
            if (minDiff<rhoDiffThresholdForLine)
               minDiffInd=find(diffs==minDiff);
               for lm=1:1:length(minDiffInd)
                   accu(minDiffInd(lm),lk) = accu(minDiffInd(lm),lk) + 1;
               end
            end
        end
    end
    %find local maxima of the accumulator
    accuBMax=imregionalmax(accu);
    [rho_candi, theta_candi]=find(accuBMax==1);
    %keep the points in the theta-rho plane whose vote counts
    %exceed the threshold
    linePoints=0;
    %get a list of detected lines with their rho and theta values
    rhoLines=[];
    thetaLines=[];
    for li=1:1:length(rho_candi)
        l_accu=accu(rho_candi(li), theta_candi(li));
        %imregionalmax also marks flat zero regions, so require
        %a count strictly above the voting threshold
        if (l_accu > 25)
            linePoints=linePoints+1;
            rhoLines=[rhoLines;rho(rho_candi(li))];
            thetaLines=[thetaLines;theta(theta_candi(li))];
        end
    end
    %return the detected lines, one [rho theta] pair per row
    %(theta in degrees)
    O=[rhoLines thetaLines];
end
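
Assuming the function above is saved as hough_transform.m, a hypothetical call looks like this (the image file name is just an example),

O = hough_transform('test.gif');   %each row of O is [rho theta]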

 

You can also find another implementation at Reference 4 below.

References:

1. Hough transform, Wikipedia: http://en.wikipedia.org/wiki/Hough_transform

2. Hough Transform: http://www.cs.jhu.edu/~misha/Fall04/GHT1.pdf

3. IDL Reference Guide: http://idlastro.gsfc.nasa.gov/idl_html_help/HOUGH.html

4. Matlab code, Line Detection via Standard Hough Transform: http://www.mathworks.com/matlabcentral/fileexchange/4983-line-detection-via-standard-hough-transform

All Focused Image by Focal Stacking

Side Note: First Draft on Apr 12 2011. This post is based on Dr. Michael S. Brown’s assignment for a course I took and enjoyed. Some of the test pictures are from the assignment.

Digital cameras have limited depth of field. Sometimes it's difficult to capture a single image with all the important elements of the scene in focus.

Focal stacking is a technique to “stack” multiple images with different focus distances into a single all-focused image.

Below is an example borrowed from Wikipedia,

[Image: focus stacking example of a Tachinid fly, from Wikipedia]

If you look at the images carefully, you'll find that the first image has one part of the fly in focus, the second image has another part in focus, and the third image is the combined all-focused image.

Focal Stacking Implementation in Matlab

Here I introduce a simple method of focal stacking, assuming that the input image sequence is aligned. The input image sequence can be found at another page here.

The idea is as follows,

1. A focused image will have a lot of sharp edges. Edges can normally be found using derivative filters; a Laplacian filter is used here to retrieve these sharp edges. The Laplacian values can therefore be used to indicate the sharpness level of each pixel.

2. We want the output image to be all-in-focus, but we also want the image to look smooth and natural. In other words, we want a smooth transition when we take pixels from different images. If we take 10 neighboring pixels from 10 different images, the output is likely to be noisy, as below,

A Focused Image based purely on Laplacian

Therefore, we want to somehow smooth the image without introducing noticeable blur. We can do this by smoothing the Laplacian we computed for every image. Since the Laplacian values are the choice indicator, smoothing them smooths our choices.

The smoothing can be done using an averaging filter, and the result I got is shown below,

A Focused Image by Smoothed Laplacian

The image above is much less noisy than the one based purely on the Laplacian. However, there is still obvious noise at the boundary between the foreground (with a lot of sharp edges) and the background (no sharp edges).

3. An error-correction technique is used to improve the result, again based on the Laplacian. First, I obtain the sum of the Laplacians of all input images; this sum-of-Laplacian image is considered to contain all the correct edges of the input images. We then start from the coordinate of the sharpest pixel in this sum image to pick pixels for our output image. For every pixel, if its corresponding sum-image pixel has a value bigger than 0 (the pixel falls on an edge), we select the pixel from the input image with the highest Laplacian value at that location. If the corresponding sum-image pixel has a value of 0 (the pixel doesn't fall on an edge), we select the same source image as one of its neighbors, so the final output is smooth; among the neighbors' source images, we again pick the one with the highest Laplacian value. (A Matlab sketch of steps 1 and 2 appears after the output figure below.) Below is the output,

A Focused Image by Laplacian with Reference to Sum of Laplacian of Individual Image
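
For reference, here is a minimal Matlab sketch of steps 1 and 2 above (it does not include the error correction of step 3). The function name, the cell-array input, and the 9x9 averaging window are my assumptions,

function [ O ] = focal_stack_sketch( imgs )
%   imgs: a cell array of aligned grayscale images of class double
    n=numel(imgs);
    [nr, nc]=size(imgs{1});
    sharp=zeros(nr, nc, n);
    lap=fspecial('laplacian');
    avg=fspecial('average', 9);
    for k=1:n
        %step 1: absolute Laplacian response as per-pixel sharpness
        L=abs(imfilter(imgs{k}, lap, 'replicate'));
        %step 2: smooth the sharpness map so the per-pixel choice
        %of source image varies smoothly
        sharp(:,:,k)=imfilter(L, avg, 'replicate');
    end
    %pick each pixel from the image with the highest smoothed response
    [~, idx]=max(sharp, [], 3);
    O=zeros(nr, nc);
    for k=1:n
        mask=(idx==k);
        O(mask)=imgs{k}(mask);
    end
end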

Focal Stacking in Photoshop

Photoshop has built-in functionality for focal stacking. You can find details here. Below is the result I got using Photoshop CS5. Honestly, it's slightly better than my Matlab result.

[Image: focal stacking result from Photoshop CS5]

You can download the Matlab code here.

Image Histogram Equalization

Side Note: First draft on Mar 26 2011.

1. Concepts

An image histogram is a graphical representation of the tonal distribution of a digital image. It plots the number of pixels at each tonal level (according to Wikipedia).

Image histogram equalization is an image processing method for contrast adjustment. Its nice property is that it adapts to the tonal distribution of the input image. It can be very effective for images that have low contrast. Below is an example,

[Image: the Mandrill test image, before and after histogram equalization]

2. Matlab Implementation

Matlab has a built-in histogram equalization function, histeq. But one can easily implement it oneself.

A simple histogram equalization makes use of the Cumulative Distribution Function (CDF),

 Sk = T(rk) = (n0 + n1 + … + nk) / N

where Sk is the transformed tonal level for an input tonal level rk, k = 0, 1, …, 2^numOfBitsPerChannelOfAPixel − 1 spans the tonal level range, nj is the total number of pixels at the jth tonal level, and N is the total number of pixels. The equation says that the number of pixels with gray level no more than the kth level, divided by the total number of pixels, is the CDF value of input tonal level k.

The CDF maps the input tonal level of every pixel into the range [0, 1]; once we scale this range to [0, 255], we get the output tonal level for every pixel.
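
As a quick worked example with hypothetical numbers: for an 8-bit image with N = 100 pixels, of which n0 = 30, n1 = 50, and the remaining 20 at higher levels, S1 = (30 + 50)/100 = 0.8, so input level 1 maps to an output level of 0.8 × 255 = 204.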

The matlab implementation for Gray image and explanation is as below,

function[O]=feipeng_histeq(fpath)
    %1. read the image file into a matrix
    I=imread(fpath);
    %2. get the total number of pixels
    N=numel(I);
    %3. get a one-dimensional matrix indicating the number of
    %pixels at each grey-level
    A=zeros(1,256);
    for i=1:1:256
        A(1,i)=numel(I(I==i-1));
    end
    figure, imhist(I);  %for test
    %4. get the cumulative number of pixels at each grey-level
    %[note] can be improved by reusing the previously
    %calculated B value for the current B
    B=zeros(1,256);
    for i=1:1:256
        for j=1:1:i
            B(1,i)=B(1,i)+A(1,j);
        end
    end
    %5. get the normalized cumulative distribution at each 
    % grey-level
    for i=1:1:256
        B(1,i)=B(1,i)/N;
    end
    figure, plot(0:255, B);
    %6. for each pixel, compute its output intensity
    [nrows,ncols]=size(I);
    O=zeros(nrows,ncols, 'uint8');
    for i=1:1:nrows
        for j=1:1:ncols
            O(i,j)=B(1,I(i,j)+1)*255;
        end
    end
    AO = zeros(1, 256);
    for i=1:1:256
        AO(1, i) = numel(O(O==i-1));
    end
    figure, imhist(O);
    figure,imshow(O);
    %for comparison with the built-in histogram equalization
    O1 = histeq(I);
    figure, imhist(O1);
    figure, imshow(O1);
end
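
A hypothetical call (the file name is just an example; any grayscale image on the Matlab path works),

O = feipeng_histeq('mandrill.gif');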

The histogram plots are shown below,

The three plots are the histogram of the input image, the histogram of the output image from the implementation above, and the histogram of the output from the Matlab built-in implementation.

3. Photoshop

Photoshop (I'm not sure starting with which version, but definitely CS5) has histogram equalization built-in under Image => Adjustments => Curves.

Denoising and Detail Transfer with Flash and No-Flash Image Pairs — Part 2: Matlab Implementation

Side Note: First Draft on 21 Mar 2011.

If you haven’t read the theory part yet, please read the theory part first: http://www.roman10.net/?p=53 . This post describes the implementation in a step by step manner. You can also download the entire matlab file at the end.

1. Denoising of the Non-Flash Image

The bilateral filter weights each neighboring pixel by the product of a Gaussian on spatial distance and a Gaussian on intensity difference. If the image is in RGB, we can apply it to each of the 3 channels separately.

1.1 Apply the Basic Bilateral Filter

As the flash image contains shadows and specularities, the joint bilateral filter cannot produce better results than the basic bilateral filter in these regions. Therefore, we still need to compute the basic bilateral filtered non-flash image.

The basic bilateral filter is expressed as

Abase(p) = (1/k(p)) Σp'∈Ω gd(p' − p) gr(A(p') − A(p)) A(p')

where Ω is the window around pixel p and k(p) is the normalization factor (the sum of all the weights).

The Matlab code for the implementation is along the lines of the sketch below,

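
(A minimal sketch, assuming a square window clipped at the image boundary and an input image of class double in [0, 1]; the function name and loop structure are my assumptions, while Wsize, sigma_d and sigma_r follow the parameter names discussed below.)

function [ O ] = bilateral_filter( A, Wsize, sigma_d, sigma_r )
%   A: input image of class double in [0, 1], grayscale or RGB
    half=floor(Wsize/2);
    %distance gaussian via the built-in fspecial; alternatively,
    %build it by hand:
    %[X,Y]=meshgrid(-half:half);
    %gd=exp(-(X.^2+Y.^2)/(2*sigma_d^2));
    gd=fspecial('gaussian', Wsize, sigma_d);
    [nr, nc, nch]=size(A);
    O=zeros(size(A));
    for ch=1:nch
        for i=1:nr
            for j=1:nc
                %clip the window at the image boundary
                rMin=max(i-half,1); rMax=min(i+half,nr);
                cMin=max(j-half,1); cMax=min(j+half,nc);
                region=A(rMin:rMax, cMin:cMax, ch);
                %intensity difference gaussian
                gr=exp(-(region-A(i,j,ch)).^2/(2*sigma_r^2));
                %the matching block of the distance gaussian
                g=gd(rMin-i+half+1:rMax-i+half+1, cMin-j+half+1:cMax-j+half+1);
                w=g.*gr;
                %normalize by k(p), the sum of the weights
                O(i,j,ch)=sum(w(:).*region(:))/sum(w(:));
            end
        end
    end
end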

There’re two ways to compute the gd, using matlab built-in fspecial or implement our own 2-D Gaussian filter. In the code above, the first method is used, and the second method is included in the comment. More information about how Gaussian filter can be found at How to Gaussian in Matlab.

The intensity difference Gaussian is computed for every pixel across all R, G, B channels. One needs to be careful with pixels near the boundary, as the neighboring region may go outside the image.

There’re three parameters one can adjust for the bilteral filter, the windows size Wsize, the variance for distance gaussian sigma_d and the intensity difference gaussian variance sigma_r. The parameters should be fine-tuned to output best denosing results while maintaining the details of the picture.

1.2 Apply the Joint Bilateral Filter

The joint bilateral filter is expressed as,

ANR(p) = (1/k(p)) Σp'∈Ω gd(p' − p) gr(F(p') − F(p)) A(p')

Basically, we compute the intensity difference Gaussian with the flash image data rather than the non-flash image data.

The Matlab code is along the lines of the sketch below,

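
(A minimal sketch mirroring the basic bilateral filter above; only the range weight changes. The function name is my assumption; F is the flash image aligned with the no-flash image A.)

function [ O ] = joint_bilateral_filter( A, F, Wsize, sigma_d, sigma_r )
%   A: no-flash image, F: aligned flash image, both double in [0, 1]
    half=floor(Wsize/2);
    gd=fspecial('gaussian', Wsize, sigma_d);
    [nr, nc, nch]=size(A);
    O=zeros(size(A));
    for ch=1:nch
        for i=1:nr
            for j=1:nc
                rMin=max(i-half,1); rMax=min(i+half,nr);
                cMin=max(j-half,1); cMax=min(j+half,nc);
                %the range weight now comes from the flash image F
                regionF=F(rMin:rMax, cMin:cMax, ch);
                gr=exp(-(regionF-F(i,j,ch)).^2/(2*sigma_r^2));
                g=gd(rMin-i+half+1:rMax-i+half+1, cMin-j+half+1:cMax-j+half+1);
                w=g.*gr;
                %but the averaged pixel values still come from A
                regionA=A(rMin:rMax, cMin:cMax, ch);
                O(i,j,ch)=sum(w(:).*regionA(:))/sum(w(:));
            end
        end
    end
end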

As you can see, the code is quite similar to the basic bilateral code. But here we use 0.02 of the maximum intensity difference across all RGB channels of the flash image as the variance for the intensity difference Gaussian. According to the reference paper [1] and our experiments, 0.02 gives good results.

2. Compute the Detail Image

The detail image is expressed as,

Fdetail = (F + ε) / (Fbase + ε)

where Fbase is the basic bilateral filtered flash image, and ε is set to 0.02 following the reference paper [1]. As the code for Fbase is very similar to the basic bilateral filter code above, it is not shown here.
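
Once Fbase is available, the detail computation itself is a one-liner (a sketch; F and Fbase are double arrays on the same scale),

epsilon=0.02;
Fdetail=(F+epsilon)./(Fbase+epsilon);   %quotient (ratio) image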

3. Compute the Shadow and Specularities Mask

The shadow and specularity masks for the flash image can be quite complicated. Here we just demonstrate a very simple method to illustrate the idea.

We first get a luminance component of the input flash image and find the minimum and maximum values of the luminance. We define the shadow threshold as shadowTh = min(Lumi) + (max(Lumi) − min(Lumi))*0.06, and the specularity threshold as specularityTh = max(Lumi) − (max(Lumi) − min(Lumi))*0.06. Any pixel with luminance lower than shadowTh is considered shadow; any pixel with luminance higher than specularityTh is considered specularity.

The code is along the lines of the sketch below,

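
(A minimal sketch; the luminance weights and the morphological clean-up are my assumptions, while Lumi, shadowTh and specularityTh follow the names above. F is the flash image, double RGB.)

%luminance component of the flash image
Lumi=0.299*F(:,:,1) + 0.587*F(:,:,2) + 0.114*F(:,:,3);
range=max(Lumi(:)) - min(Lumi(:));
shadowTh=min(Lumi(:)) + range*0.06;
specularityTh=max(Lumi(:)) - range*0.06;
%mark shadow and specularity pixels
M=(Lumi < shadowTh) | (Lumi > specularityTh);
%clean up the mask with a morphological closing
M=double(imclose(M, strel('disk', 3)));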

4. Synthesize the Final Image

The final output image is computed as below,

AFinal = (1 − M) · ANR · Fdetail + M · Abase

where M is the aggregation of the shadow and specularity masks, ANR is the joint bilateral filtered image, Fdetail is the detail image computed from the flash image, and Abase is the basic bilateral filtered non-flash image.
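
In Matlab the synthesis is a direct translation of the formula (a sketch; ANR, Fdetail and Abase are the arrays from the previous steps, and the 2-D mask M is replicated across the color channels),

M3=repmat(M, [1 1 size(ANR,3)]);
AFinal=(1-M3).*ANR.*Fdetail + M3.*Abase;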

5. Some Sample Results

In the order of flash image, non-flash image and output. You'll need to zoom in to see the effects. You can find more test images at the Reference [1] link.


You can download the entire Matlab code here.

Reference:

[1] http://research.microsoft.com/en-us/um/redmond/projects/FlashNoFlash/

Denoising and Detail Transfer with Flash and No-Flash Image Pairs — Part 1: Theory

Side Note: First Draft on Mar 17 2011. I've been “imaginarily” interested in photography for quite a while. Well, I once bought a digital camera and used it less than 10 times. I used to think digital image processing was faking natural stuff, but now I find it's actually fun. :)

Photographers hate dark scenes. There are several techniques to deal with them, all with disadvantages. First, you can use a longer exposure time, but it gives you a motion-blurred image. The second option is to open up the aperture more, but that narrows your depth of field. The third choice is to increase your CCD gain (higher ISO settings), but noise gets magnified too. Of course, you can use the flash, but it changes the illumination completely, and we all know different lights give people different feelings.

The idea to solve the problem is to synthesize an image from two images, captured with and without flash at the same focal length and aperture. Supposing we adjust the ISO setting properly, we get a no-flash image with the illumination we want but noisy, and a flash image with details and less noise.

The general approach to synthesizing a better image is to use the information and details in the flash image to reduce the noise in, and transfer better detail to, the image without flash.

But how are we going to do it?

1. Bilateral Filter

To reduce noise in an image, a smoothing/low-pass filter can be used. The rock star Gaussian filter is often the choice. However, the Gaussian filter operates based on the spatial distance of pixels only; it applies the same kernel everywhere. While it reduces noise, it also takes away a lot of detail and blurs the entire image.

A better filter in this case is the bilateral filter, which not only considers the spatial distance of neighboring pixels, but also takes the pixel intensity difference into consideration. In this sense, it adapts to the pixel values. It can be expressed in the formula below,

Abase(p) = (1/k(p)) Σp'∈Ω gd(p' − p) gr(A(p') − A(p)) A(p')

Here gd is the Gaussian function of the pixel distance, gr is the Gaussian function of the intensity difference, and k(p) is the normalization factor.

The effect of the smoothing can be controlled by the variances of the two Gaussian functions.

2. Joint Bilateral Filter

Think about it: the flash image contains the details, which are essentially expressed by the intensity values of its pixels. In other words, the flash image presents the high frequency components better. So by using the flash image intensity information to calculate the Gaussian range weight gr, we can achieve better results in terms of preserving the details.

ANR(p) = (1/k(p)) Σp'∈Ω gd(p' − p) gr(F(p') − F(p)) A(p')

Here we use the flash image pixel intensities for the second Gaussian function gr.

3. Detail Transfer from Flash Image to Non-Flash Image

The joint bilateral filter can denoise the non-flash image while preserving the details in it. However, the non-flash image itself doesn't capture detail that well, and the flash image does a better job.

To transfer the details from the flash image to the non-flash image, we begin by computing a detail layer from the flash image,

Fdetail = (F + ε) / (Fbase + ε)

Fbase is the bilateral filtered F. The ratio is called the quotient image or ratio image, and it captures the local detail variation in F.

To transfer the detail, AFinal = ANR · Fdetail.

4. Flash Shadows and Specularities

Flash shadows and specularities are artifacts caused by the flash, and we don't want to transfer them over to the non-flash image. A mask is computed for both shadows and specularities to improve the final result,

AFinal = (1 − M) · ANR · Fdetail + M · Abase

M is the mask. With this formula, the shadows and specularities are ignored when transferring the details.

The mask is computed by comparing the intensities with some thresholds and applying image morphological operations.

Reference: http://research.microsoft.com/en-us/um/redmond/projects/FlashNoFlash/