Object Recognition Project
18-10-2010, 11:13 AM
Viewpoint-independent recognition of free-form objects and their segmentation in the presence of clutter and occlusions is a challenging task. We present a novel 3D model-based algorithm which performs this task automatically and efficiently. A 3D model of an object is automatically constructed offline from its multiple unordered range images (views). These views are converted into multidimensional table representations (which we refer to as tensors). Correspondences are automatically established between these views by simultaneously matching the tensors of a view with those of the remaining views using a hash table-based voting scheme. This results in a graph of relative transformations, which is used to register the views before they are integrated into a seamless 3D model. These models and their tensor representations constitute the model library.
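To make the "graph of relative transformations" more concrete, here is a minimal sketch of how pairwise transformations could be chained so that every view ends up expressed in one reference frame. It is only an illustration under stated assumptions, not the authors' implementation: the function names, the 4x4 homogeneous-matrix convention, and the breadth-first traversal are all hypothetical choices, and the example uses pure translations to keep the numbers easy to check.

```python
from collections import defaultdict, deque

def matmul(a, b):
    """Multiply two 4x4 matrices stored as nested lists."""
    return [[sum(a[i][k] * b[k][j] for k in range(4)) for j in range(4)]
            for i in range(4)]

def invert_rigid(t):
    """Invert a rigid transform [R | p]; the inverse is [R^T | -R^T p]."""
    rt = [[t[j][i] for j in range(3)] for i in range(3)]
    p = [-sum(rt[i][k] * t[k][3] for k in range(3)) for i in range(3)]
    return [rt[i] + [p[i]] for i in range(3)] + [[0.0, 0.0, 0.0, 1.0]]

def translation(tx, ty, tz):
    """Homogeneous matrix for a pure translation (identity rotation)."""
    return [[1.0, 0.0, 0.0, tx], [0.0, 1.0, 0.0, ty],
            [0.0, 0.0, 1.0, tz], [0.0, 0.0, 0.0, 1.0]]

def poses_in_root_frame(edges, root=0):
    """edges[(i, j)] maps coordinates of view j into the frame of view i.

    Returns, for every view reachable from root, the transform that maps
    its coordinates into the root view's frame.
    """
    graph = defaultdict(list)
    for (i, j), t_ij in edges.items():
        graph[i].append((j, t_ij))
        graph[j].append((i, invert_rigid(t_ij)))  # edge usable both ways
    poses = {root: translation(0.0, 0.0, 0.0)}
    queue = deque([root])
    while queue:
        i = queue.popleft()
        for j, t_ij in graph[i]:
            if j not in poses:
                # (root <- i) composed with (i <- j) gives (root <- j)
                poses[j] = matmul(poses[i], t_ij)
                queue.append(j)
    return poses

# Hypothetical graph: view 1 is known relative to view 0, view 1 relative to view 2.
edges = {(0, 1): translation(1.0, 0.0, 0.0),
         (2, 1): translation(0.0, 0.0, 3.0)}
poses = poses_in_root_frame(edges)
offset_of_view_2 = [row[3] for row in poses[2][:3]]
```

Here view 2 is never matched directly against view 0; its pose is recovered by passing through view 1, which is why a connected graph is sufficient to register all views.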
During online recognition, a tensor from the scene is simultaneously matched with those in the library by casting votes. Similarity measures are calculated for the model tensors which receive the most votes. The model with the highest similarity is transformed to the scene and, if it aligns accurately with an object in the scene, that object is declared as recognized and is segmented. This process is repeated until the scene is completely segmented.
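The voting idea can be sketched in a few lines. The following is a simplified illustration only: it assumes each tensor has already been reduced to a set of hashable keys (for example, quantized grid indices), which is an assumption made here and not the representation described in the source. The names `build_hash_table` and `cast_votes` are hypothetical.

```python
from collections import Counter, defaultdict

def build_hash_table(library):
    """Index library tensors by key.

    library maps (model_name, tensor_id) -> iterable of hashable keys.
    The table maps each key to the tensors that contain it.
    """
    table = defaultdict(list)
    for tensor_ref, keys in library.items():
        for key in set(keys):
            table[key].append(tensor_ref)
    return table

def cast_votes(table, scene_keys):
    """Each key of the scene tensor votes for every library tensor sharing it."""
    votes = Counter()
    for key in set(scene_keys):
        for tensor_ref in table.get(key, ()):
            votes[tensor_ref] += 1
    return votes

# Toy library with made-up keys, for illustration only.
library = {
    ("duck", 0): [(1, 2, 0), (1, 3, 1), (2, 2, 1)],
    ("duck", 1): [(0, 0, 0), (4, 1, 2)],
    ("cup", 0): [(1, 2, 0), (5, 5, 5)],
}
table = build_hash_table(library)
scene_keys = [(1, 2, 0), (1, 3, 1), (9, 9, 9)]
votes = cast_votes(table, scene_keys)
best_ref, best_votes = votes.most_common(1)[0]
```

Because the scene tensor is looked up once in a shared table, it is compared against all library tensors at the same time, and only the top-voted candidates need the more expensive similarity measure and alignment check.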
The aim of object recognition is to correctly identify objects in a scene and estimate their pose (location and orientation). Object recognition in complex scenes in the presence of clutter (due to noise and the presence of unwanted objects) and occlusions (due to the presence of multiple objects) is a challenging task. Object recognition from 2D images is an appealing approach due to the widespread availability of cameras. However, 2D recognition techniques are sensitive to illumination, shadows, scale, pose, and occlusions.
Three-dimensional object recognition, on the other hand, does not suffer from these limitations. An important paradigm of 3D object recognition is the model-based approach (as opposed to the view-based approach), whereby 3D models of objects are constructed offline and stored in a model library using a suitable representation. During online recognition, a range image of the scene is converted into a similar representation and matched with the models in the library in order to recognize library objects.
A 3D model of a free-form object is constructed by acquiring its range images from multiple viewpoints so that its surface is completely covered. These views are then registered in a common coordinate basis. Registration is performed in two steps, namely, coarse and fine registration. Coarse registration can be performed manually or automatically through system calibration or feature matching. We will focus on automatic coarse registration using feature matching, also known as correspondence identification. Coarse registration is followed by fine registration, using, for example, the Iterative Closest Point (ICP) algorithm. After fine registration, the views are integrated and reconstructed to form a seamless 3D model.
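The fine registration step can be illustrated with a small ICP loop. The sketch below is deliberately simplified and rests on several assumptions: it works on 2D points so that the best rigid motion has a closed form without a linear algebra library (the 3D case typically uses an SVD-based solution), it uses brute-force nearest-neighbour search, and it assumes coarse registration has already brought the two point sets close together. The function names are hypothetical.

```python
import math

def best_rigid_2d(src, dst):
    """Least-squares rotation angle and translation mapping paired src onto dst."""
    n = len(src)
    sx = sum(x for x, _ in src) / n
    sy = sum(y for _, y in src) / n
    dx = sum(x for x, _ in dst) / n
    dy = sum(y for _, y in dst) / n
    dot = cross = 0.0
    for (ax, ay), (bx, by) in zip(src, dst):
        ax, ay, bx, by = ax - sx, ay - sy, bx - dx, by - dy
        dot += ax * bx + ay * by
        cross += ax * by - ay * bx
    theta = math.atan2(cross, dot)
    c, s = math.cos(theta), math.sin(theta)
    # Translation that carries the rotated source centroid onto the target centroid.
    return theta, dx - (c * sx - s * sy), dy - (s * sx + c * sy)

def apply_rigid(points, theta, tx, ty):
    """Rotate points by theta about the origin, then translate by (tx, ty)."""
    c, s = math.cos(theta), math.sin(theta)
    return [(c * x - s * y + tx, s * x + c * y + ty) for x, y in points]

def icp_2d(src, dst, iterations=20):
    """Alternate closest-point pairing and rigid re-estimation."""
    current = list(src)
    for _ in range(iterations):
        matched = [min(dst, key=lambda q: (q[0] - p[0]) ** 2 + (q[1] - p[1]) ** 2)
                   for p in current]
        theta, tx, ty = best_rigid_2d(current, matched)
        current = apply_rigid(current, theta, tx, ty)
    return current

# An L-shaped target and a slightly misaligned copy (as if after coarse registration).
dst = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0), (0.0, 1.0), (0.0, 2.0)]
src = apply_rigid(dst, 0.1, 0.05, -0.05)
aligned = icp_2d(src, dst)
max_error = max(math.hypot(p[0] - q[0], p[1] - q[1]) for p, q in zip(aligned, dst))
```

ICP only converges to the right answer from a good starting pose, which is why the coarse registration step comes first.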
The main challenge in 3D modeling is the automatic establishment of correspondences between overlapping views. This problem becomes more challenging when the views are unordered (i.e., the order in which the views were acquired is unknown and, hence, there is no a priori knowledge about which view pairs overlap). A pairwise correspondence algorithm is not practical in such cases because it must exhaustively search for correspondences between N(N-1)/2 view pairs (O(N^2)), where N is the total number of views.
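A quick count shows how fast the exhaustive pairwise search grows. The snippet below simply enumerates unordered view pairs; the view names and the helper `pairwise_matchings` are made up for illustration.

```python
from itertools import combinations

def pairwise_matchings(n_views):
    """Number of unordered view pairs an exhaustive pairwise search must try."""
    return n_views * (n_views - 1) // 2

# Enumerate the pairs explicitly for a small, hypothetical set of five views.
views = ["view%d" % i for i in range(5)]
pairs = list(combinations(views, 2))

# Pair counts for a few larger view sets.
counts = {n: pairwise_matchings(n) for n in (5, 10, 20, 40)}
```

Doubling the number of views roughly quadruples the number of pairs to test, which is the quadratic growth that makes the exhaustive approach impractical for unordered views.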
In the case of unordered views, a multiview correspondence algorithm is more suitable. We define multiview correspondence as a one-to-many correspondence approach whereby a single view is simultaneously matched with multiple views. Our major contribution in the model database construction is a novel multiview correspondence algorithm which is an extension of our pairwise correspondence algorithm. Existing correspondence techniques such as the RANSAC-based DARCES, bitangent curve matching, spin image matching, geometric histogram matching, three-tuple matching, and SAI matching are all pairwise correspondence techniques and, therefore, cannot be efficiently applied to solve the multiview correspondence problem. Huber and Hebert proposed a framework for automatic 3D modeling from unordered views. Their framework is, however, based on an exhaustive search to find correspondences between all possible pairs of views in order to initialize a graph of relative pose