written by hyojung chang
0. About Faster R-CNN
... Our object detection system, called Faster R-CNN, is composed of two modules. The first module is a deep fully convolutional network that proposes regions, and the second module is the Fast R-CNN detector [2] that uses the proposed regions. The entire system is a single, unified network for object detection (Figure 2). Using the recently popular terminology of neural networks with ‘attention’ [31] mechanisms, the RPN module tells the Fast R-CNN module where to look. In Section 3.1 we introduce the designs and properties of the network for region proposal. In Section 3.2 we develop algorithms for training both modules with features shared. ...omitted below Reference : <Link> |
Faster R-CNN is composed of two networks, one is the Region Proposal Network and the other is a Detector that detects objects through the proposed regions above. At this time, Region Proposal Network(= RPN) is the core idea of Faster R-CNN. Faster R-CNN inherits Fast R-CNN structure, eliminates selective search, and calculates RoI through RPN. The RPN improves accuracy and running time as well as avoids to generate excess of proposal boxes because the RPN reduces the cost by sharing computation on convolutional features. RPN and Fast R-CNN are merged into a single network by sharing their convolutional features. This combination helps Faster R-CNN to have leading performance on accuracy but leads to its architecture as a two-stage network which reduces the speed of processing of this method.
A detailed description of Faster R-CNN process is posted in [All for nothing] section. If you are curious, please refer to the note.
Anyway, Faster R-CNN extracts the feature map first and then pass it to the RPN to calculate the RoI. RoI pooling is performed with the obtained RoI, and then classification is performed for object detection.
1. Why we choose Faster R-CNN for disease detection? (Comparison between Faster R-CNN, YOLO and SSD)
Reference : <Link> | ... So far, detection models are divided into two main approaches, namely, one-stage approach and two-stage approach. Models in the one-stage approach is known as detectors which have better and more efficient detection in comparison to another approach. The efficiency here has the potential power to run in real time and is able to apply them to practical applications. However, the trade-off between accuracy and speed is a difficult challenge which needs to be taken into the account in order to balance the gap. However, models in the two-stage approach have their reputation of region-based detectors which have high accuracy but are too low in speed to apply them to real world. This drawback comes from the computation of networks. ...omitted below Reference : <Link> |
Based on the above evidence, we chose Faster R-CNN because accuracy is more important than the training time of the model. And also, we thought that Faster R-CNN process fits our goal(our goal is to predict lung diseases and propose disease areas from input image(CT image in our project)).
Therefore, we decided to train the model using Faster R-CNN and perform prediction through the derived model.