Real-Time Video Bilayer Segmentation–Theory–Part 1 Bayesian Estimation
Side Note: First draft on Mar 30 2011.
This article is mainly based on reference papers 1 and 2.
Video bilayer segmentation refers to the process of dividing the video frames into foreground and background. Here we introduce a video bilayer segmentation process which is close to real-time.
The entire process is illustrated in the figure below,
Figure 1. Process Overview of the real-time Bilayer Segmentation
The input includes the video and the segmentation mask for the first frame. The segmentation of the first frame can be done using background subtraction, interactive graph cut, image snapping, lazy snapping and so on.
The segmentation of the remaining video frames is done one by one, automatically, by the process illustrated above.
1. Bayesian Estimation
For each pixel p in a video frame, the probability Iprob(p) that the pixel belongs to the foreground follows from Bayes' rule,
Iprob(p) = P(F|Cp) = P(Cp|F) P(F) / (P(Cp|F) P(F) + P(Cp|B) P(B))
where Cp is the color vector of pixel p, and F and B denote foreground and background respectively. The likelihoods P(Cp|F) and P(Cp|B) are calculated by accumulating foreground and background color histograms. The priors P(F) and P(B) are computed from the previous segmentation mask.
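As a minimal sketch of this step for a single pixel, assuming Iprob(p) is the Bayes posterior P(F|Cp) and that P(B) = 1 - P(F): the function name foreground_posterior and the fallback for colors never seen in either histogram are illustrative assumptions, not something taken from the reference papers.

```python
def foreground_posterior(lik_fg, lik_bg, prior_fg):
    """Posterior P(F|Cp) for one pixel via Bayes' rule.

    lik_fg   = P(Cp|F), read from the foreground histogram
    lik_bg   = P(Cp|B), read from the background histogram
    prior_fg = P(F); P(B) is taken as 1 - P(F) (assumption)
    """
    prior_bg = 1.0 - prior_fg
    evidence = lik_fg * prior_fg + lik_bg * prior_bg
    if evidence == 0.0:
        # Assumption: a color absent from both histograms carries no
        # information, so fall back to the prior.
        return prior_fg
    return lik_fg * prior_fg / evidence


# A color four times more likely under the foreground model, equal priors.
p_fg = foreground_posterior(lik_fg=0.02, lik_bg=0.005, prior_fg=0.5)
```

With equal priors the result depends only on the ratio of the two likelihoods; a strong prior from the previous mask can override a weak color cue.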
1.1 Calculation of likelihood P(Cp|F) and P(Cp|B)
The likelihood is the probability of observing color Cp given that the pixel is foreground (as in P(Cp|F)) or background (as in P(Cp|B)), estimated from the color distribution of all previous segmentation results.
To build a color histogram (here we use a 2-dimensional histogram as an example), we set up a two-dimensional grid in which each bin covers a certain range of values. For example,
[0..10, 0..10][0..10, 11..20]…[0..10, 251..260]
[11..20, 0..10][11..20, 11..20]…[11..20, 251..260]
…
[251..260, 0..10][251..260, 11..20]…[251..260, 251..260]
We then count the number of pixels that fall into each bin of this 2-dimensional grid. As a video frame pixel normally contains 3 color components, a 3-dimensional histogram can be built in the same way.
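The counting step can be sketched as follows for 3-component colors. This is only an illustration: the bin width of 10, the integer-division binning (which gives bins 0..9, 10..19, and so on, rather than the exact ranges listed above), and the names bin_index and build_histograms are assumptions. The mask convention of 255 for foreground and 0 for background is the one used later in this article.

```python
from collections import Counter

BIN_WIDTH = 10  # hypothetical bin width, echoing the grid example


def bin_index(color, bin_width=BIN_WIDTH):
    """Map an (r, g, b) color with 0..255 components to its 3-D bin."""
    return tuple(c // bin_width for c in color)


def build_histograms(pixels, mask, bin_width=BIN_WIDTH):
    """Count colors per bin, separately for foreground and background.

    pixels: list of (r, g, b) tuples
    mask:   list of labels, 255 for foreground and 0 for background
    """
    fg_hist, bg_hist = Counter(), Counter()
    for color, label in zip(pixels, mask):
        target = fg_hist if label == 255 else bg_hist
        target[bin_index(color, bin_width)] += 1
    return fg_hist, bg_hist


# Two reddish foreground pixels and two bluish background pixels.
pixels = [(200, 30, 30), (205, 35, 32), (10, 10, 240), (12, 14, 250)]
mask = [255, 255, 0, 0]
fg_hist, bg_hist = build_histograms(pixels, mask)
```

A Counter keeps only the bins that are actually occupied, which is convenient for a sparse 3-dimensional grid; a dense array would work equally well.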
There are two ways of creating the color histograms for likelihood calculation. The first is the accumulative histogram. As the foreground and background change, the accumulative histogram incorporates these changes into the calculation. As segmentation always contains some errors, the accumulation process can be improved by only updating the histogram bins which have zero values. In this case, the error pixels don't accumulate and only have very small values in the histogram, with limited influence.
The other method is to use the first segmentation result to build the color histogram and use it for all subsequent processing. This is useful if we know the foreground and background color distributions are not going to change much.
Once the color histograms are built, we can normalize them and read off the values for the likelihoods P(Cp|F) and P(Cp|B).
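One possible reading of the zero-bin update rule is sketched below: counts from a new frame are copied only into bins that are still empty in the accumulated histogram. The function name accumulate_zero_bins and the dictionary-based histogram layout are assumptions made for illustration.

```python
from collections import Counter


def accumulate_zero_bins(accumulated, frame_hist):
    """Merge a new frame's histogram into the accumulated one.

    Only bins that are still empty in `accumulated` take counts from
    `frame_hist`; bins that already hold counts are left unchanged, so
    repeated segmentation errors cannot pile up in them.
    """
    for bin_key, count in frame_hist.items():
        if accumulated.get(bin_key, 0) == 0:
            accumulated[bin_key] = count
    return accumulated


accumulated = Counter({(20, 3, 3): 120, (19, 3, 3): 80})
frame_hist = Counter({(20, 3, 3): 50, (5, 5, 5): 2})
accumulate_zero_bins(accumulated, frame_hist)
```

Here the bin (20, 3, 3) keeps its count of 120, while the previously empty bin (5, 5, 5) receives 2. If those 2 pixels were segmentation errors, they stay at 2 in later frames instead of growing.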
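Normalizing and reading a likelihood might look like the sketch below. It assumes a histogram stored as a mapping from 3-D bin index to count and a bin width of 10 with integer-division binning; the names normalize and likelihood are hypothetical.

```python
def normalize(hist):
    """Turn bin counts into probabilities that sum to 1."""
    total = sum(hist.values())
    if total == 0:
        return {}
    return {bin_key: count / total for bin_key, count in hist.items()}


def likelihood(prob_hist, color, bin_width=10):
    """Look up P(Cp|layer) for an (r, g, b) color; unseen bins give 0."""
    bin_key = tuple(c // bin_width for c in color)
    return prob_hist.get(bin_key, 0.0)


fg_counts = {(20, 3, 3): 3, (19, 3, 3): 1}
fg_prob = normalize(fg_counts)
p_color_given_fg = likelihood(fg_prob, (203, 31, 38))  # falls in bin (20, 3, 3)
```

The same two functions would be applied to the background histogram to obtain P(Cp|B).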
1.2 Compute Priors P(F) and P(B)
The priors are computed based on the previous frame, in consideration of temporal correlations. The computation can be expressed as the formula below,
where a(t-1) is the previous segmentation mask, with 255 indicating foreground and 0 indicating background. G3x3 and G7x7 are Gaussian filters. The Resize operations are scaling transformations. The result Mt is a smoothed mask.
With Mt, P(F) and P(B) are calculated by,
With the priors and likelihoods, Iprob can be calculated.
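To give a feel for what one Gaussian filtering stage does to a 0/255 mask, here is a small sketch of a 3x3 smoothing pass. It covers a single stage only, not the full computation of Mt. The binomial kernel (1 2 1; 2 4 2; 1 2 1)/16 is a common approximation of a 3x3 Gaussian and is an assumption here, as are the edge-replication border handling and the function name gaussian_3x3.

```python
# Assumed 3x3 kernel: binomial approximation of a Gaussian, weights sum to 16.
KERNEL_3X3 = [[1, 2, 1],
              [2, 4, 2],
              [1, 2, 1]]


def gaussian_3x3(mask):
    """Smooth a 2-D mask (rows of 0/255 values) with the 3x3 kernel.

    Border pixels reuse the nearest valid pixel (edge replication).
    """
    height, width = len(mask), len(mask[0])
    out = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            acc = 0
            for dy in (-1, 0, 1):
                for dx in (-1, 0, 1):
                    yy = min(max(y + dy, 0), height - 1)
                    xx = min(max(x + dx, 0), width - 1)
                    acc += KERNEL_3X3[dy + 1][dx + 1] * mask[yy][xx]
            out[y][x] = acc / 16
    return out


# Left half background (0), right half foreground (255).
mask = [[0, 0, 255, 255],
        [0, 0, 255, 255],
        [0, 0, 255, 255]]
smoothed = gaussian_3x3(mask)
```

After smoothing, pixels far from the boundary keep their value of 0 or 255, while pixels next to the boundary take intermediate values, which is what lets the mask express uncertainty near the previous frame's object edge.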
Reference:
1. Real-Time Video Matting Based on Bilayer Segmentation
2. Live Video Segmentation in Dynamic Backgrounds Using Thermal Vision