Hierarchical ?
We saw that the major issue with region-based segmentation is the
difficulty to get the right region quantity. Indeed, each object
owns its proper scale. In the 512x512 pixels "house" image below
from the INRIA's MOVI project, the letters on the ground are
around 50 pixels big, windows are 500 pixels big and the front of
the house is 10000 pixels big.
If the size of the regions is constrained above 5000 pixels, the
front of the house is correctly segmented but the letters are
gone. On the other side, allowing region size of 50 pixels leads
to plenty of small regions on the front of the house, which
therefore looses its global aspect:
=> Every object owns its proper scale
Initial Image | Min size=50px (320 régions) Letters scale | Min size=500px (98 régions) Windows scale | Min size=5000px (21 régions) House front scale | This explains why a single region image is NOT a sufficient
result that can be exploited: in an ideal way, one region
image per scale is needed. Therefore the "hierarchical" term.
Introduction to the algorithm
The used method is close to a merge algorithm :
The hierarchical property of the algorithm comes from
looping in increasing order on the specified minimum sizes and saving
the results for each of them. As the algorithm restarts from the
result of the previous minimum size, processing an additional
scale is not only fast but ensures also that new regions will be
a collection of one or several regions of the previous scale.
This property is very important because the result can be thus
formatted as a region pyramid.
Output informations differs from the classical ones:
Note: A image denoise phase (similar to a sequential alternate
filter but faster) has been added before the merge phase in
order to improve noise robustness.
Note: The algorithm uses the color information when available.
Execution time
The objection which shall rise is : apply this algorithm as
is leads to prohibitive execution times (O(N²) with
the pixel quantity in the image). Some minor approximations,
in particular the recomputation of the "gradient between
regions", and an efficient implementation of the region
neighbors gradient sort overcome this.
To give an idea, the previous result on the 512x512 pixels
"house" image, has been obtained in 0.35 seconds (0.31 s
for the two first scales (50 and 500) and 0.28 s for the first
one alone (50)).
The algorithm has been implemented in C++ (compiled with GNU
g++ 3.2.2 option -O2 under Linux, on an AMD 1800+ processor).
Other result examples
The first row displays images with special colors to enhance
contrasts. The region color is the average color.
The second row displays images with random colors for each area.
Algorithm parameters :
640x480 image of the crane from the Microsoft building in
Cambridge, processed in 0.39 s
Initial image | Min size=40px (697 regions) | Min size=300px (172 regions) | Min size=2000px (31 regions) |
768x576 image of the Leuven castle, processed in 0.63 s
Initial image | Min size=100px (969 regions) | Min size=500px (260 regions) | Min size=2500px (58 regions) |
|