Mar 2019
tl;dr: A convolutional deep belief network (CDBN) is trained to perform recognition and retrieval on 3D voxel grids. It can also hallucinate the missing parts of depth maps.
The paper builds on deep belief networks, which were popular at the time, and uses a new way to train a 3D network. It also tackles several other tasks, such as joint classification and completion of 2.5D input, and next-best-view prediction. However, it was later surpassed by many other methods, such as MVCNN and the famed PointNet.
This paper is from CVPR 2015, only 4 years old at the time of writing, but I already find it harder to read than more recent papers because of its archaic terminology and methods.
Gibbs sampling is used to estimate the posterior distribution $p(y | x_o)$.
A joint distribution $p(x, y)$ is learned. Recognizing the object amounts to estimating $p(y | x)$ (the probability of y given x). To estimate this posterior, Gibbs sampling is used: x is propagated forward (with the unobserved part $x_u$ randomly initialized) and y is propagated backward, alternately, with the network weights held fixed. This gives the completed shape and the class prediction simultaneously. The procedure is run many times in parallel, and the predicted class is the one sampled most frequently.
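
To make the sampling loop concrete for myself, here is a minimal sketch of the idea. It is not the paper's model: I swap the convolutional DBN for a single toy RBM with random (untrained) weights, whose visible layer holds both the voxels and a one-hot label. All names (`gibbs_complete_and_classify`, `W_x`, `W_y`, `observed_mask`, the sizes) are my own assumptions for illustration. The observed voxels $x_o$ stay clamped, while $x_u$ and $y$ are resampled on every step.

```python
import numpy as np

def sigmoid(a):
    return 1.0 / (1.0 + np.exp(-a))

def gibbs_complete_and_classify(x_obs, observed_mask, W_x, W_y, b_h, b_x, b_y,
                                n_chains=64, n_steps=50, seed=0):
    """Toy clamped Gibbs sampler (assumed single-RBM stand-in, not the CDBN)."""
    rng = np.random.default_rng(seed)
    n_vox, n_cls = x_obs.shape[0], W_y.shape[0]

    # Random initialization of every chain; observed voxels are then clamped.
    x = rng.integers(0, 2, size=(n_chains, n_vox)).astype(float)
    x[:, observed_mask] = x_obs[observed_mask]
    labels = rng.integers(0, n_cls, size=n_chains)
    y = np.eye(n_cls)[labels]

    for _ in range(n_steps):
        # Up pass: sample hidden units given voxels and label.
        p_h = sigmoid(x @ W_x + y @ W_y + b_h)
        h = (rng.random(p_h.shape) < p_h).astype(float)

        # Down pass: resample voxels, then re-clamp the observed ones.
        p_x = sigmoid(h @ W_x.T + b_x)
        x_new = (rng.random(p_x.shape) < p_x).astype(float)
        x = np.where(observed_mask, x_obs, x_new)

        # Down pass: resample the label from a softmax over classes.
        logits = h @ W_y.T + b_y
        logits = logits - logits.max(axis=1, keepdims=True)
        p_y = np.exp(logits)
        p_y = p_y / p_y.sum(axis=1, keepdims=True)
        labels = np.array([rng.choice(n_cls, p=p) for p in p_y])
        y = np.eye(n_cls)[labels]

    # The prediction is the class sampled most often across parallel chains.
    votes = np.bincount(labels, minlength=n_cls)
    return x, labels, int(votes.argmax())

# Hypothetical sizes: a 3x3x3 voxel grid flattened to 27 units, 4 classes.
n_vox, n_hid, n_cls = 27, 16, 4
init = np.random.default_rng(1)
W_x = init.normal(0.0, 0.1, size=(n_vox, n_hid))  # untrained, random weights
W_y = init.normal(0.0, 0.1, size=(n_cls, n_hid))
b_h, b_x, b_y = np.zeros(n_hid), np.zeros(n_vox), np.zeros(n_cls)

x_obs = init.integers(0, 2, size=n_vox).astype(float)
observed_mask = np.arange(n_vox) < 14  # pretend the first 14 voxels are observed

completed, labels, predicted_class = gibbs_complete_and_classify(
    x_obs, observed_mask, W_x, W_y, b_h, b_x, b_y)
```

With random weights the predicted class is meaningless; the point is only the control flow: clamp $x_o$, alternate up and down passes with fixed weights, then take a majority vote over the chains. Each row of `completed` is one sampled completion of the shape.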