Welcome to the Visual Intelligence & Perception Lab at SUSTech

The VIP Lab aims to explore the theory and application of multimodality in computer vision. In the past few years, rapid progress has been made in the areas of multimodal retrieval, pedestrian re-identification, and RGBD tracking.

There are four main areas of research:

  1. Vision-Language-Audio Multi-modal Learning: Deep learning algorithms are currently applied mostly to uni-modal tasks, but multi-modal joint representation is necessary for AGI. Our research group aims to mine cross-modal correlations and learn multi-modal representations, both to solve multi-modal tasks involving image, video, text, and audio data and to enhance learning on uni-modal tasks. Specific research areas include vision-language pre-training, image/video description, audio-visual recognition and localization, cross-modal retrieval, and video summarization.

  2. Multi-Modal Perception: We address the challenges of visual perception by using multi-modal perception to achieve accurate and robust scene perception and understanding. We aim to perform detection, tracking, and segmentation in multi-modal videos that combine various sensors and modalities, such as RGB, depth, thermal, point cloud, event, and language. Our research covers both aligned and unaligned multi-modal fusion, and it targets video analysis and understanding for class-agnostic objects and scenes. We also apply our research to different platforms, such as UAVs and unmanned vehicles, with resource efficiency in mind. We are currently constructing a series of multi-modal perception platforms.

  3. Robust Learning: Our group focuses on out-of-distribution detection, corruption-robust learning, and adversarial robustness in machine learning. One of our main goals is to develop methods for detecting out-of-distribution instances, which are inputs that differ significantly from the training data. This is essential for ensuring the reliability and safety of machine learning systems in real-world applications.

    We also work on building models that are resistant to data corruption and noise, which are common in real-world scenarios. Ensuring the robustness of models to various forms of data corruption is critical for deploying machine learning models in safety-critical applications.

    Another focus of our group is adversarial robustness, which involves making models resistant to adversarial attacks. These attacks are deliberate attempts to fool machine learning models by making small perturbations to the input data. Adversarial defense is essential to ensure the security and reliability of machine learning systems in critical applications. We use techniques such as adversarial training to enhance the resilience of models against adversarial attacks.

  4. Visual Anomaly Detection: Anomaly detection is an important machine learning problem. Unlike most existing machine learning methods, which assume a static, closed system, it studies how models can handle unknown and uncertain information in an open, dynamic environment. Under this open-environment assumption, learning systems for anomaly detection are usually expected to leverage knowledge of the knowns (normal data and patterns) to infer the unknowns (abnormal or novel patterns that differ from the normal ones). Anomaly detection approaches usually extract, characterize, and model patterns from the available normal data, and then build anomaly detectors that discover novel or abnormal patterns in newly observed data. When the target data are images, the task is called visual anomaly detection or image anomaly detection.
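
As a rough illustration of the anomaly detection pipeline described in item 4 (model the normal data, then score newly observed data), here is a minimal sketch in Python. It is not a method from the lab's publications. The feature vectors are synthetic stand-ins for features that would be extracted from images, and the function names (fit_normal_model, anomaly_score, pick_threshold) and the 0.99 quantile are assumptions made for this example.

```python
import math
import random
import statistics


def fit_normal_model(normal_features):
    """Estimate a per-dimension mean and standard deviation from normal-only data."""
    dims = list(zip(*normal_features))
    means = [statistics.fmean(d) for d in dims]
    # Guard against zero variance so the score below never divides by zero.
    stds = [statistics.pstdev(d) or 1e-8 for d in dims]
    return means, stds


def anomaly_score(x, model):
    """Root-mean-square z-score of x under the fitted per-dimension Gaussian."""
    means, stds = model
    z_squared = [((v - m) / s) ** 2 for v, m, s in zip(x, means, stds)]
    return math.sqrt(sum(z_squared) / len(z_squared))


def pick_threshold(normal_features, model, quantile=0.99):
    """Choose a threshold as a high quantile of the scores on normal data."""
    scores = sorted(anomaly_score(x, model) for x in normal_features)
    index = min(len(scores) - 1, int(quantile * len(scores)))
    return scores[index]


# Synthetic "normal" features: 500 samples, 8 dimensions, standard Gaussian.
rng = random.Random(0)
normal = [[rng.gauss(0.0, 1.0) for _ in range(8)] for _ in range(500)]

model = fit_normal_model(normal)
threshold = pick_threshold(normal, model)

typical = [0.0] * 8   # close to the normal pattern
shifted = [6.0] * 8   # far from anything seen during fitting

print("typical flagged:", anomaly_score(typical, model) > threshold)
print("shifted flagged:", anomaly_score(shifted, model) > threshold)
```

Only normal samples are used for fitting and for choosing the threshold, which mirrors the setting above where abnormal patterns are unknown in advance. A practical visual detector would replace the per-dimension Gaussian with a richer model of image features.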


Joining VIP Lab

If you are interested in joining, please go to the recruitment page.



Mar. 20, 2023

Three papers have been accepted by CVPR! Congratulations to Jinyu, Teng, and Tiantian!

Nov. 23, 2022

One paper has been accepted by IEEE TMM! Congratulations to Zhu!

Nov. 23, 2022

Two papers have been accepted by AAAI 2023! Congratulations to Jingfei and Mingchen!

Sept. 17, 2022

One paper has been accepted by NeurIPS 2022! Congratulations to Xi Jiang!

Aug. 16, 2022

Two papers have been accepted by IEEE TIP! Congratulations to Chenyang and Fufu!

July 31, 2022

One paper has been accepted by IEEE TIP! Congratulations to Tiantian!

July 04, 2022

Three papers have been accepted by ECCV 2022! Congratulations to Jinyu, Yawen, and Zhongqun!

July 03, 2022

Three papers have been accepted by ACMMM 2022! Congratulations to Jinbao, Guoyang, Jinyu, and Wujing!

May 17, 2022

One paper has been accepted by ICML 2022! Congratulations to Teng!