THESIS
2020
xiii, 101 pages : illustrations ; 30 cm
Abstract
Freehand sketching is a form of artistic expression frequently adopted in human-human and human-computer
communication. However, interpreting the semantics of sketches remains an algorithmic
challenge. Humans commonly introduce various levels of abstraction and distortion into their
creations, which cannot be straightforwardly captured by hand-crafted features or rules. The recent
availability of large-scale sketch datasets and 3D geometry datasets opens up new opportunities
to analyze sketches in a data-driven manner. In this thesis, we draw on these two types of data
and present a line of data-driven techniques for semantic sketch analysis, including recognition
with vector inputs, segmentation with 3D geometry labeling transfer, and reconstruction with 3D
geometry templates. To process the 3D geometries extensively used in the sketch interpretation,
we also propose a robust local multi-view descriptor.
First, as a global analysis of sketches, we develop an end-to-end network architecture named
Sketch-R2CNN for sketched object recognition. Existing studies commonly cast the problem as an
image recognition task by rasterizing input sketches to pixel images. Instead, we propose to extract
descriptive features from the vector sketch representation with recurrent neural networks (RNNs).
We design a differentiable line rasterization module that renders the vector sketches and the RNN
features into point feature maps. Subsequent convolutional neural networks (CNNs) then take the
informative point feature maps as input for object category prediction.
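To make the idea of a point feature map concrete, the following is a minimal forward-only sketch in plain Python. It is not the Sketch-R2CNN module: the function name rasterize_stroke, the scalar per-point features, and the Gaussian falloff with width sigma are all assumptions made for illustration, and no gradients are computed here. Each pixel takes the feature linearly interpolated at the closest position on the stroke, attenuated by its distance to that position.

```python
import math


def rasterize_stroke(points, feats, size, sigma=0.75):
    """Render a polyline with one scalar feature per point onto a size x size grid.

    points: list of (x, y) in pixel coordinates (x = column, y = row).
    feats:  one float per point (a stand-in for a per-point RNN feature).
    Illustrative assumption: Gaussian falloff of width sigma around the stroke.
    """
    grid = [[0.0] * size for _ in range(size)]
    for row in range(size):
        for col in range(size):
            best_d2, best_f = float("inf"), 0.0
            # Visit every segment (p, q) with its endpoint features (fp, fq).
            for p, q, fp, fq in zip(points, points[1:], feats, feats[1:]):
                dx, dy = q[0] - p[0], q[1] - p[1]
                len2 = dx * dx + dy * dy
                # Parameter of the closest point on the segment, clamped to [0, 1].
                t = 0.0 if len2 == 0 else ((col - p[0]) * dx + (row - p[1]) * dy) / len2
                t = min(1.0, max(0.0, t))
                cx, cy = p[0] + t * dx, p[1] + t * dy
                d2 = (col - cx) ** 2 + (row - cy) ** 2
                if d2 < best_d2:
                    # Linearly interpolate the feature along the segment.
                    best_d2, best_f = d2, (1.0 - t) * fp + t * fq
            grid[row][col] = best_f * math.exp(-best_d2 / (2.0 * sigma * sigma))
    return grid


# A horizontal stroke whose feature ramps from 0.0 to 1.0.
feature_map = rasterize_stroke([(1, 1), (5, 1)], [0.0, 1.0], size=7)
```

In a trainable setting the same interpolation would be expressed with differentiable tensor operations, and each point would carry a feature vector rather than a single scalar.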
Second, as a step towards finer-level analysis, we introduce an efficient segmentation method to
identify semantic parts in sketched objects. Due to the lack of sketch datasets with segmentation labelings,
we resort to segmented 3D geometry datasets for synthesizing line drawings. Our method,
combining CNNs and multi-label graph cuts, can effectively transfer segmentation labelings from
3D geometries to freehand sketches.
Third, with the above global and part-level analysis, we explore a template-based method for
sketch reconstruction. We retrieve 3D geometries with part structures similar to input sketches.
Then the 3D geometries serve as proxies for lifting 2D sketches to 3D, which is formulated as a
quadratic energy optimization problem.
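As an illustration of what a quadratic energy for lifting can look like, the sketch below solves for one depth value per stroke point. The abstract does not state the actual energy terms, so everything here is an assumption: a data term pulling each depth towards a depth read from the 3D proxy, a smoothness term between consecutive points, and the hypothetical names lift_depths, proxy_depths, weights, and lam. The energy is E(z) = sum_i w_i (z_i - d_i)^2 + lam * sum_i (z_{i+1} - z_i)^2, whose minimizer satisfies a tridiagonal linear system.

```python
def lift_depths(proxy_depths, weights, lam=1.0):
    """Minimize sum_i w_i (z_i - d_i)^2 + lam * sum_i (z_{i+1} - z_i)^2.

    proxy_depths: target depth d_i per stroke point (ignored where w_i == 0).
    weights:      confidence w_i >= 0 per point; at least one must be positive.
    Setting the gradient to zero gives a tridiagonal system, solved here with
    the Thomas algorithm (forward elimination, then back substitution).
    """
    n = len(proxy_depths)
    # Diagonal: data weight plus lam for each neighbour along the stroke.
    diag = [weights[i] + lam * ((i > 0) + (i < n - 1)) for i in range(n)]
    rhs = [weights[i] * proxy_depths[i] for i in range(n)]
    off = -lam  # constant off-diagonal entry

    for i in range(1, n):
        m = off / diag[i - 1]
        diag[i] -= m * off
        rhs[i] -= m * rhs[i - 1]

    z = [0.0] * n
    z[-1] = rhs[-1] / diag[-1]
    for i in range(n - 2, -1, -1):
        z[i] = (rhs[i] - off * z[i + 1]) / diag[i]
    return z


# Only the two endpoints have proxy depths; interior points are filled in smoothly.
depths = lift_depths([0.0, 0.0, 0.0, 0.0, 4.0], [1.0, 0.0, 0.0, 0.0, 1.0], lam=1.0)
```

Because the energy is quadratic, the solution is obtained from a single linear solve, with no iterative optimization.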
Lastly, we propose a robust learning-based 3D local descriptor, assisting the processing (e.g.,
orientation alignment) of 3D geometries collected online for the sketch analysis. We represent 3D
local geometry as multi-view images through a differentiable renderer in neural networks. The
viewpoints used in rendering are optimizable instead of being fixed with hand-crafted rules. We
also design an effective soft-view pooling module for integrating the visual features extracted from
each view into a single compact descriptor.
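One plausible reading of soft pooling across views is a softmax-weighted average of per-view feature vectors, sketched below in plain Python. This is an assumption, not the module proposed in the thesis: the name soft_view_pool and the per-view scores (which in a network would come from a learned sub-module) are hypothetical.

```python
import math


def soft_view_pool(view_feats, view_scores):
    """Fuse per-view feature vectors into one descriptor.

    view_feats:  list of equal-length feature vectors, one per rendered view.
    view_scores: one unnormalized score per view (assumed to be given).
    Scores are turned into weights with a softmax, so equal scores give average
    pooling and one dominant score approaches selecting that single view.
    """
    m = max(view_scores)  # subtract the maximum for numerical stability
    exps = [math.exp(s - m) for s in view_scores]
    total = sum(exps)
    weights = [e / total for e in exps]
    dim = len(view_feats[0])
    return [sum(w * f[k] for w, f in zip(weights, view_feats)) for k in range(dim)]


views = [[1.0, 0.0], [3.0, 2.0]]
averaged = soft_view_pool(views, [0.0, 0.0])     # equal scores: plain mean
dominated = soft_view_pool(views, [0.0, 100.0])  # second view dominates
```

Unlike hard max pooling over views, every view contributes to the result, so every view would receive a gradient in a trainable version.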