Original paper
GroupViT: Semantic Segmentation Emerges from Text Supervision
Published: Jun 1, 2022
Abstract
Grouping and recognition are important components of visual scene understanding, e.g., for object detection and semantic segmentation. With end-to-end deep learning systems, grouping of image regions usually happens implicitly via top-down supervision from pixel-level recognition labels. Instead, in this paper, we propose to bring back the grouping mechanism into deep networks, which allows semantic segments to emerge automatically with only...