Weakly Supervised Object Localization with ConvNet Feature and Multi-fold MIL Training


Object localization is the problem of placing a bounding box around an object in an image. Weakly supervised learning refers to training on incompletely annotated data, in this case images without object bounding box annotations. This study focuses on performing object localization in a weakly supervised fashion. The main contribution is the use of features extracted from a pre-trained convolutional neural network (ConvNet) together with multi-fold multiple instance learning (MIL) training [1]. ConvNet features have been shown to be effective descriptors for various vision tasks [2]. Selective Search is used as the region proposal algorithm to generate training windows and candidate object locations in the test phase [3].

object localization, weakly supervised, multiple instance learning, convolutional neural network

The objective of this study is to improve the accuracy of weakly supervised object localization in images by combining ConvNet features with multi-fold MIL training.


In the first phase, a region proposal algorithm such as Selective Search is used to generate candidate object windows from each input image.
ConvNet features are then extracted from each candidate region.
The resulting training data is used to train a detector with multi-fold weakly supervised (MIL) training.
The trained detector can then be used to predict object locations on the test set.
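
The multi-fold training step above can be illustrated with a minimal sketch. It assumes that the candidate windows of each positive image have already been converted to feature vectors (plain Python lists stand in for ConvNet features), that every positive image starts with its first window selected, and that a simple difference-of-class-means scorer replaces the actual detector. The function names and the initialization are hypothetical and chosen only for illustration; they are not taken from [1].

```python
import random


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def mean_vec(vectors):
    n = len(vectors)
    return [sum(col) / n for col in zip(*vectors)]


def train_scorer(pos_feats, neg_feats):
    # Stand-in for the real detector (assumption): a linear scorer whose
    # weight vector is the difference between the class means.
    mean_pos, mean_neg = mean_vec(pos_feats), mean_vec(neg_feats)
    return [p - q for p, q in zip(mean_pos, mean_neg)]


def multifold_mil(pos_bags, neg_feats, n_folds=3, n_iters=5, seed=0):
    """Select one window per positive image with multi-fold MIL.

    pos_bags:  one bag per positive image; each bag is a list of
               feature vectors, one per candidate window.
    neg_feats: feature vectors of windows taken from negative images.
    Returns (selected window index per bag, final weight vector).
    """
    rng = random.Random(seed)
    n_bags = len(pos_bags)
    selected = [0] * n_bags  # hypothetical initialization: first window
    order = list(range(n_bags))
    rng.shuffle(order)
    folds = [order[k::n_folds] for k in range(n_folds)]

    for _ in range(n_iters):
        new_selected = list(selected)
        for held_out in folds:
            held = set(held_out)
            # Train only on the windows currently selected in the OTHER folds.
            train_pos = [pos_bags[i][selected[i]]
                         for i in range(n_bags) if i not in held]
            w = train_scorer(train_pos, neg_feats)
            # Re-localize: pick the top-scoring window in each held-out image.
            for i in held_out:
                scores = [dot(w, f) for f in pos_bags[i]]
                new_selected[i] = max(range(len(scores)),
                                      key=scores.__getitem__)
        if new_selected == selected:  # selections stopped changing
            break
        selected = new_selected

    final_pos = [pos_bags[i][selected[i]] for i in range(n_bags)]
    return selected, train_scorer(final_pos, neg_feats)
```

Because the windows of an image are re-selected by a scorer that never saw that image during training, the selection is less likely to lock onto whatever window happened to be chosen at initialization. This is the motivation for splitting the positive images into folds.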

The performance will then be compared with other state-of-the-art methods for weakly supervised object localization on the PASCAL VOC 2007 dataset, using average precision (AP) as the metric.
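
As an illustration of the metric, the following sketch computes AP with the 11-point interpolation used in the PASCAL VOC 2007 evaluation. It assumes that each detection has already been matched against the ground truth and marked as a true or false positive; that matching step is not shown. The function name and input format are hypothetical.

```python
def voc07_average_precision(detections, n_ground_truth):
    """11-point interpolated average precision.

    detections:     list of (score, is_true_positive) pairs for one class
                    (assumed to be matched to ground truth beforehand).
    n_ground_truth: number of ground-truth objects of that class.
    """
    if n_ground_truth <= 0:
        raise ValueError("n_ground_truth must be positive")

    # Rank detections by decreasing confidence score.
    ranked = sorted(detections, key=lambda d: d[0], reverse=True)

    precisions, recalls = [], []
    true_positives = 0
    for rank, (_, is_tp) in enumerate(ranked, start=1):
        if is_tp:
            true_positives += 1
        precisions.append(true_positives / rank)
        recalls.append(true_positives / n_ground_truth)

    # Average the best precision reachable at recall >= 0.0, 0.1, ..., 1.0.
    total = 0.0
    for k in range(11):
        threshold = k / 10
        reachable = [p for p, r in zip(precisions, recalls) if r >= threshold]
        total += max(reachable) if reachable else 0.0
    return total / 11
```

For example, with two ground-truth objects and three ranked detections that are correct, wrong, and correct, the interpolated precision is 1 for recall levels up to 0.5 and 2/3 above that, giving an AP of about 0.85.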


