Selective search
In computer vision and image processing, selective search is a method of object detection that takes an image and captures objects within that image in defined spaces called bounding boxes (defined by an origin point, a width, and a height). These boxes then act as Regions of Interest (ROI) for an image classifier to classify what object is present in the bounding box.[1]
The selective search method is an attempt to make object detection less computationally taxing than exhaustive search while capturing the benefits of segmentation, basing the boundaries of the boxes on the shape of the object being classified.
It avoids the need for many correct parameter guesses by using an ensemble of segmentations rather than a single model. It also avoids a fixed bounding-box size, and can therefore detect objects at different scales, by merging segments into regions of varying sizes rather than searching over a fixed-size grid overlaid on the image, as computationally feasible exhaustive approaches do.
Uses
It is the fundamental Region of Interest extractor for the original Region-Based Convolutional Neural Network (R-CNN) as well as Fast R-CNN.
How it works
The image is taken in and split into an initial set of small starting regions by the fast method of Felzenszwalb and Huttenlocher.[2] A greedy algorithm is then used to iteratively group regions together: it takes a given region, calculates the similarities between it and all neighbouring regions, and merges the two most similar regions. This repeats until a single region spanning the image remains. The Hierarchical Grouping Algorithm is as follows:
Input: (color) image
Output: set of object location hypotheses L

Obtain initial regions R = {r1, ..., rn} using the fast method of Felzenszwalb and Huttenlocher
Initialize similarity set S = Ø
foreach neighbouring region pair (ri, rj) do
    calculate similarity s(ri, rj)
    S = S ∪ {s(ri, rj)}
while S ≠ Ø do
    get highest similarity s(ri, rj) = max(S)
    merge corresponding regions: rt = ri ∪ rj
    remove similarities regarding ri: S = S \ s(ri, r*)
    remove similarities regarding rj: S = S \ s(r*, rj)
    calculate similarity set St between rt and its neighbours
    S = S ∪ St
    R = R ∪ {rt}
Extract object location boxes L from all regions in R
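The grouping loop above can be sketched in Python. This is a minimal, hypothetical illustration, not the paper's implementation: regions are represented as sets of pixel coordinates, and only the size-based similarity term is used in place of the full similarity measure.

```python
# Minimal sketch of hierarchical grouping (assumptions: regions are sets of
# pixel coordinates; similarity is the size term only; 4-adjacency defines
# neighbouring regions).

def s_size(ri, rj, image_size):
    # Size similarity: encourages small regions to merge early.
    return 1.0 - (len(ri) + len(rj)) / image_size

def are_neighbours(ri, rj):
    # Two regions touch if any pixel in ri is 4-adjacent to a pixel in rj.
    for (x, y) in ri:
        for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)):
            if n in rj:
                return True
    return False

def bounding_box(region):
    xs = [p[0] for p in region]
    ys = [p[1] for p in region]
    return (min(xs), min(ys), max(xs), max(ys))

def hierarchical_grouping(initial_regions, image_size):
    R = [frozenset(r) for r in initial_regions]
    all_regions = list(R)                 # boxes come from every region ever formed
    S = {}
    for i in range(len(R)):
        for j in range(i + 1, len(R)):
            if are_neighbours(R[i], R[j]):
                S[(R[i], R[j])] = s_size(R[i], R[j], image_size)
    while S:
        ri, rj = max(S, key=S.get)        # most similar pair
        rt = ri | rj                      # merge the two regions
        # Drop all similarities involving ri or rj.
        S = {pair: v for pair, v in S.items() if ri not in pair and rj not in pair}
        R = [r for r in R if r not in (ri, rj)]
        # Add similarities between the merged region and its neighbours.
        for r in R:
            if are_neighbours(rt, r):
                S[(rt, r)] = s_size(rt, r, image_size)
        R.append(rt)
        all_regions.append(rt)
    return [bounding_box(r) for r in all_regions]
```

On a toy 2×2 image split into four single-pixel regions, this yields seven boxes in total (four initial regions plus three merges), the last one spanning the whole image.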
Selective search relies on variety in how it assembles larger regions, and it achieves this variety in three ways: using a variety of color spaces with different invariance properties, using different similarity measures s(ri, rj), and varying the starting regions of the calculation.
The similarity measure s(ri, rj) in the above algorithm is defined as:

s(ri, rj) = a1·Scolor(ri, rj) + a2·Stexture(ri, rj) + a3·Ssize(ri, rj) + a4·Sfill(ri, rj), where ai ∈ {0, 1}
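The weighted combination can be sketched as follows. The region representation here (a normalized color histogram, a texture histogram, a pixel count, and a bounding box per region) is an assumption for illustration; the hedged point is only how the four components combine under the on/off weights ai.

```python
# Sketch of the combined similarity measure (hypothetical region
# representation: dicts with "color", "texture", "size", and "box" fields).

def hist_intersection(h1, h2):
    # Scolor and Stexture are histogram intersections of normalized histograms.
    return sum(min(a, b) for a, b in zip(h1, h2))

def s_size(ri, rj, image_size):
    # Encourages small regions to merge early.
    return 1.0 - (ri["size"] + rj["size"]) / image_size

def s_fill(ri, rj, image_size):
    # Measures how well ri and rj fill their joint bounding box.
    x0 = min(ri["box"][0], rj["box"][0]); y0 = min(ri["box"][1], rj["box"][1])
    x1 = max(ri["box"][2], rj["box"][2]); y1 = max(ri["box"][3], rj["box"][3])
    bb_area = (x1 - x0 + 1) * (y1 - y0 + 1)
    return 1.0 - (bb_area - ri["size"] - rj["size"]) / image_size

def similarity(ri, rj, image_size, a=(1, 1, 1, 1)):
    # Each a[i] in {0, 1} switches one similarity component on or off.
    return (a[0] * hist_intersection(ri["color"], rj["color"])
            + a[1] * hist_intersection(ri["texture"], rj["texture"])
            + a[2] * s_size(ri, rj, image_size)
            + a[3] * s_fill(ri, rj, image_size))
```

Varying the weight vector a across runs is one of the diversification strategies described above.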
Efficient Graph-Based Image Segmentation: Felzenszwalb and Huttenlocher (Fast Method for Image Component Generation)
For the initial segment list R above, a method has to be employed to actually generate regions of the image that have some kind of structure. In other words, pixels have to be associated into groups that share some kind of meaning.
These groups are called components and the segments list (R) is filled with them.
The method used in the original selective search paper was created by Felzenszwalb and Huttenlocher.[2] Its algorithm is as follows:
Input: graph G = (V, E)
Output: segmentation list of components, where each component is a list of vertices

Note: the image used to generate the graph is typically smoothed with a Gaussian blur of radius 0.8 to reduce artifacts in the final segmentation list.

Edge = (vertex1, vertex2, weight)
Component = (list of vertices, internal_diff)
THRESHOLD = arbitrary value used to grow or shrink the total number of components by raising or lowering the bar for merging when components are compared

Sorted_Edges = edges of G sorted from smallest to largest by weight
Segmentation_list = []
for vertex in V:
    Segmentation_list.append(Component([vertex], internal_diff = 0))
for edge in Sorted_Edges:
    if edge.v1 and edge.v2 are not in the same component
       and edge.weight <= min(Component(v1).internal_diff + THRESHOLD,
                              Component(v2).internal_diff + THRESHOLD):
        merge the two components
        merged component internal_diff = max(edge.weight, Component(v1).internal_diff, Component(v2).internal_diff)
        update Segmentation_list with the union of the two components
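A compact Python sketch of this segmentation using a disjoint-set (union-find) structure follows. It is a simplified illustration, not the authors' implementation: the graph is taken as a ready-made edge list, and the constant THRESHOLD stands in for the adaptive per-component threshold term of the original method.

```python
# Sketch of graph-based segmentation with union-find (assumptions: edges are
# given as (v1, v2, weight) tuples; a constant threshold replaces the
# original adaptive threshold term).

class DisjointSet:
    def __init__(self, n):
        self.parent = list(range(n))
        self.internal_diff = [0.0] * n   # max edge weight inside each component

    def find(self, x):
        # Find the root of x's component, with path halving.
        while self.parent[x] != x:
            self.parent[x] = self.parent[self.parent[x]]
            x = self.parent[x]
        return x

    def union(self, a, b, weight):
        ra, rb = self.find(a), self.find(b)
        self.parent[rb] = ra
        # Merged internal difference = max of the joining edge and both halves.
        self.internal_diff[ra] = max(weight,
                                     self.internal_diff[ra],
                                     self.internal_diff[rb])

def segment(num_vertices, edges, threshold):
    # Process edges from smallest to largest weight, merging components whose
    # internal differences (plus the threshold) tolerate the joining edge.
    ds = DisjointSet(num_vertices)
    for v1, v2, w in sorted(edges, key=lambda e: e[2]):
        r1, r2 = ds.find(v1), ds.find(v2)
        if r1 != r2 and w <= min(ds.internal_diff[r1] + threshold,
                                 ds.internal_diff[r2] + threshold):
            ds.union(v1, v2, w)
    # Group vertices by their component root.
    components = {}
    for v in range(num_vertices):
        components.setdefault(ds.find(v), []).append(v)
    return list(components.values())
```

For example, a four-vertex chain with edge weights (1, 10, 1) and threshold 2 splits at the heavy middle edge, yielding two components.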
References