Learning-AI

Machine Learning Papers Notes (CNN)

Compiled by Patrick Liu

These notes cover advances in computer vision and image processing powered by convolutional neural networks (CNNs), across increasingly challenging tasks: from image classification to object detection to segmentation.

Image Classification

Goal: Predict a label, with a confidence score, for an entire image.
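As a minimal sketch of what "a label with confidence" can look like in practice, the snippet below turns raw class scores (logits) into a predicted label and a softmax probability. The function name `classify`, the class names, and the example scores are hypothetical and do not come from any of the papers listed here.

```python
import math

def classify(logits, class_names):
    """Return (label, confidence) for one image from its raw class scores."""
    # Subtract the max logit before exponentiating for numerical stability.
    m = max(logits)
    exps = [math.exp(z - m) for z in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    # The predicted label is the class with the highest probability.
    best = max(range(len(probs)), key=probs.__getitem__)
    return class_names[best], probs[best]

# Hypothetical scores for a single image over three made-up classes.
label, confidence = classify([2.0, 0.5, -1.0], ["cat", "dog", "car"])
print(label, round(confidence, 3))
```

The confidence here is simply the softmax probability of the winning class; a real network would produce the logits from its final layer.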

Evolution from AlexNet, VGGNet, GoogLeNet (Inception) to ResNet.

AlexNet (NIPS 2012)

VGG16 (ICLR 2015, 09/2014)

Object Detection

Goal: Predict a label with confidence, as well as the coordinates of a box bounding each object in an image.
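To make the detection output concrete, here is a small sketch of one possible representation: each detection carries a label, a confidence, and box coordinates. The `Detection` class, the `(x_min, y_min, x_max, y_max)` box convention, the 0.5 threshold, and the `iou` helper are all illustrative assumptions. Intersection over union (IoU) is included only as a common way to compare a predicted box with a reference box; it is not taken from the text above.

```python
from dataclasses import dataclass
from typing import List, Tuple

# Assumed box convention: (x_min, y_min, x_max, y_max) in pixel coordinates.
Box = Tuple[float, float, float, float]

@dataclass
class Detection:
    label: str         # predicted class
    confidence: float  # score in [0, 1]
    box: Box           # bounding box around the object

def keep_confident(detections: List[Detection], threshold: float = 0.5) -> List[Detection]:
    """Drop detections whose confidence falls below the threshold."""
    return [d for d in detections if d.confidence >= threshold]

def iou(a: Box, b: Box) -> float:
    """Intersection over union of two axis-aligned boxes."""
    inter_w = max(0.0, min(a[2], b[2]) - max(a[0], b[0]))
    inter_h = max(0.0, min(a[3], b[3]) - max(a[1], b[1]))
    inter = inter_w * inter_h
    area_a = (a[2] - a[0]) * (a[3] - a[1])
    area_b = (b[2] - b[0]) * (b[3] - b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0

# Hypothetical detections for one image.
detections = [
    Detection("dog", 0.92, (10, 20, 110, 220)),
    Detection("cat", 0.30, (150, 40, 260, 200)),
]
confident = keep_confident(detections)
print([d.label for d in confident])
```

Unlike classification, the output is a variable-length list: one entry per object found in the image.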

Evolution from R-CNN (regions with CNN features) through Fast R-CNN and Faster R-CNN to YOLO (including YOLOv2 and YOLO9000) and SSD.

Review blogs

A Brief History of CNNs in Image Segmentation: From R-CNN to Mask R-CNN

R-CNN

OverFeat

Fast R-CNN

Faster R-CNN

YOLO

YOLOv2 and YOLO9000

SSD

Extended reading

Segmentation

Goal: Semantic segmentation groups pixels in a semantically meaningful way and is therefore a pixel-wise task. It predicts a label with confidence for each pixel in the image.
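The per-pixel prediction can be sketched as a softmax over classes at every pixel, followed by picking the most probable class. The function name `segment` and the nested-list layout `scores[class][row][column]` are assumptions made for this illustration; real models would operate on tensors.

```python
import math

def segment(scores):
    """Turn per-class score maps into a label map and a confidence map.

    scores[c][y][x] is the raw score of class c at pixel (y, x).
    """
    num_classes = len(scores)
    height = len(scores[0])
    width = len(scores[0][0])
    labels = [[0] * width for _ in range(height)]
    confidences = [[0.0] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            # Softmax over the class axis at this single pixel.
            pixel = [scores[c][y][x] for c in range(num_classes)]
            m = max(pixel)
            exps = [math.exp(s - m) for s in pixel]
            total = sum(exps)
            best = max(range(num_classes), key=lambda c: exps[c])
            labels[y][x] = best
            confidences[y][x] = exps[best] / total
    return labels, confidences

# Hypothetical 2x2 image with two classes.
scores = [
    [[2.0, 0.0], [1.0, 0.0]],  # class 0 score map
    [[0.0, 1.0], [0.0, 3.0]],  # class 1 score map
]
labels, confidences = segment(scores)
print(labels)
```

The label map has the same height and width as the input, which is what distinguishes segmentation from whole-image classification.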

Instance segmentation is more challenging in that it also includes object detection: pixels must be assigned to individual object instances, not only to classes. See illustration below for an example.

Review blogs

FCN (Fully convolutional networks)

U-net

V-Net

FPN (Feature pyramid network)

Instance/Object segmentation

Instance segmentation combines the challenges of object detection with bounding boxes and of semantic segmentation. Facebook AI Research (FAIR) has published a progressive series of work on DeepMask, SharpMask and MultiPath Network. Here is a blog post review by Piotr Dollar, and here is another one.

DeepMask

SharpMask

MultiPath Network

Mask R-CNN

Polygon RNN (CVPR 2017)

Medical applications

ChestX-ray8

CNN feature extractor for TB