Semantically Coherent Out-of-Distribution Detection

Jingkang Yang¹, Haoqi Wang², Litong Feng², Xiaopeng Yan², Huabin Zheng², Wayne Zhang²,³,⁴, Ziwei Liu¹
¹ S-Lab, Nanyang Technological University
² SenseTime Research
³ Qing Yuan Research Institute, Shanghai Jiao Tong University
⁴ Shanghai AI Laboratory, Shanghai, China
Proceedings of the IEEE International Conference on Computer Vision (ICCV 2021)


  • SC-OOD benchmarks: benchmarks that encourage OOD detectors to detect semantic shifts while remaining robust to negligible covariate shifts.
  • Unsupervised Dual Grouping (UDG): an end-to-end SC-OOD detection method that makes effective use of a realistic external unlabeled set.

SC-OOD Benchmarks

Current out-of-distribution (OOD) detection benchmarks are commonly built by defining one dataset as in-distribution (ID) and all others as OOD. However, these benchmarks introduce unwanted and impractical goals, e.g., perfectly distinguishing CIFAR dogs from ImageNet dogs, even though the two share the same semantics and differ only by a negligible covariate shift. Such unrealistic goals push models toward an extremely narrow range of capabilities and greatly limit their use in real applications.

To overcome these drawbacks, we re-design the benchmarks and propose the SC-OOD benchmarks. On them, existing methods suffer large performance degradation, suggesting that they are extremely sensitive to low-level discrepancies between data sources while ignoring the inherent semantics of the data. An effective SC-OOD approach is still needed.
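The difference between the two labeling conventions can be illustrated with a minimal sketch. This is not the code used to build the SC-OOD benchmarks; the `Sample` fields, the function names, and the example images are all hypothetical and only contrast a dataset-level OOD rule with a semantics-level one.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sample:
    source: str    # dataset the image came from (hypothetical field)
    semantic: str  # human-readable class name (hypothetical field)


def dataset_level_is_ood(sample: Sample, id_source: str) -> bool:
    """Conventional rule: anything from another dataset counts as OOD."""
    return sample.source != id_source


def semantic_level_is_ood(sample: Sample, id_classes: set) -> bool:
    """Semantically coherent rule: OOD only if the class is outside the ID label set."""
    return sample.semantic not in id_classes


id_source = "CIFAR-10"
id_classes = {"airplane", "automobile", "bird", "cat", "deer",
              "dog", "frog", "horse", "ship", "truck"}

samples = [
    Sample("CIFAR-10", "dog"),
    Sample("ImageNet", "dog"),    # same semantics, different source
    Sample("ImageNet", "zebra"),  # genuine semantic shift
]

for s in samples:
    print(s.source, s.semantic,
          "dataset-level OOD:", dataset_level_is_ood(s, id_source),
          "semantic-level OOD:", semantic_level_is_ood(s, id_classes))
```

Under the dataset-level rule the ImageNet dog is marked OOD, whereas the semantic-level rule treats it as ID and flags only the zebra.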

Our SC-OOD benchmarks can be downloaded through either Microsoft OneDrive or Google Cloud (1.7 GB).


Unsupervised Dual Grouping (UDG)

We propose an SC-OOD approach that uses a realistic external unlabeled set. Unlike standard outlier exposure methods, whose unlabeled set is purely OOD, our unlabeled set is contaminated by a portion of ID samples. We believe this is a more realistic setting: a powerful image crawler can easily collect millions of unlabeled images, but it will inevitably pick up ID samples that are expensive to filter out.

With the help of a realistic unlabeled set for SC-OOD, we design an elegant framework featuring unsupervised dual grouping (UDG) for the joint modeling of labeled and unlabeled data. UDG enhances the semantic expression ability of the model by exploring the unlabeled data with an unsupervised deep clustering task. The grouping information generated by this auxiliary task also dynamically separates the ID and OOD samples in the unlabeled set. ID samples separated from the unlabeled set join the given ID samples for classifier training, and the remaining samples are forced to produce a uniform posterior distribution, as in other outlier exposure methods. In this way, ID classification and OOD detection performance are improved simultaneously.
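The two uses of the unlabeled set described above can be sketched in NumPy. This is an illustrative sketch under stated assumptions, not the authors' implementation: the function names, the purity-threshold rule for deciding which groups are ID, and the threshold value are all assumptions, and the group assignments are taken as given rather than produced by a deep clustering step.

```python
import numpy as np


def filter_id_by_group(group_ids, labels, purity_threshold=0.7):
    """Assign a pseudo-label to unlabeled samples (label -1) in a group when
    labeled samples of one class make up more than `purity_threshold` of that
    group. The rule and the threshold value are assumptions for illustration."""
    group_ids = np.asarray(group_ids)
    labels = np.asarray(labels)
    pseudo = labels.copy()
    for g in np.unique(group_ids):
        members = group_ids == g
        labeled = labels[members & (labels >= 0)]
        if labeled.size == 0:
            continue  # no labeled evidence in this group
        classes, counts = np.unique(labeled, return_counts=True)
        top = counts.argmax()
        if counts[top] / members.sum() > purity_threshold:
            pseudo[members & (labels < 0)] = classes[top]
    return pseudo


def uniform_posterior_loss(logits):
    """Cross-entropy between a uniform target and the softmax posterior,
    averaged over the batch; it is minimized when the posterior is uniform."""
    logits = np.asarray(logits, dtype=float)
    shifted = logits - logits.max(axis=1, keepdims=True)  # numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return float(-log_probs.mean(axis=1).mean())


# Group 0 is dominated by labeled class 2; group 1 is mostly unlabeled.
group_ids = [0, 0, 0, 0, 0, 1, 1, 1, 1]
labels = [2, 2, 2, 2, -1, 0, -1, -1, -1]
pseudo = filter_id_by_group(group_ids, labels)
print(pseudo)

# Samples still labeled -1 would be trained toward a uniform posterior.
print(uniform_posterior_loss(np.zeros((3, 4))), np.log(4))
```

In this toy input, the unlabeled sample in group 0 inherits class 2 and would join classifier training, while the unlabeled samples in group 1 keep the label -1 and would receive the uniform-posterior loss.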




author = {Yang, Jingkang and Wang, Haoqi and Feng, Litong and Yan, Xiaopeng and Zheng, Huabin and Zhang, Wayne and Liu, Ziwei},
title = {Semantically Coherent Out-of-Distribution Detection},
booktitle={Proceedings of the IEEE International Conference on Computer Vision},
year = {2021}}