Xiang Li (李翔)
Associate Professor, College of Computer Science, Nankai University
Address: No. 38, Tongyan Road, Haihe Education Park, Tianjin, China
Email: xiang.li.implus [at] {nankai.edu.cn}
Research Group Page: IMPlus@PCALab

About Me [GitHub] [Google Scholar] [Research Group]

I'm an Associate Professor in the College of Computer Science, Nankai University, in the team of Ming-Ming Cheng. I received my PhD degree from the Department of Computer Science and Technology, Nanjing University of Science and Technology (NJUST) in 2020. My advisor is Prof. Jian Yang from NJUST, who is a Changjiang Scholar. My vice-advisor is Prof. Xiaolin Hu from Tsinghua University. I started my postdoctoral career at NJUST as a candidate for the 2020 Postdoctoral Innovative Talent Program, supervised by Prof. Jinhui Tang. In 2016, I spent 8 months as a research intern at Microsoft Research Asia, supervised by Prof. Tao Qin and Prof. Tie-Yan Liu. I was also a visiting scholar at Momenta, mainly focusing on monocular perception algorithms.

My recent works are mainly on:
  • neural architecture design, CNN/Transformer
  • object detection/recognition
  • unsupervised learning
  • knowledge distillation

We are looking for self-motivated PhD candidates! Please feel free to contact me by email (with your CV attached). During your PhD, you can expect:

  • joint supervision with well-known research institutes (e.g., Megvii, SenseTime Research, Huawei Noah's Ark Lab, BAAI)
  • hands-on guidance toward publishing your early papers
  • relatively flexible and free research space
We will not push hard, but you should always be self-driven toward your own goal, i.e., making solid and impactful contributions to the CV/AI community.

Honors

(See more details (code, solutions, summaries) in [AICompetition of Group Page])
  • 1st place in Change Detection in High-resolution and Multi-temporal Optical Images, 2nd place in Forgery Detection in Multi-scenario Remote Sensing Images of Typical Objects in 2024 TC I Contest on Intelligent Interpretation for Multi-modal Remote Sensing Application, total 13,000 RMB bonus
  • Second place in the IACC International Algorithm Case Competition, namely the remote sensing detection, 100,000 RMB bonus (2nd of 116 teams)
  • Champion of the 2022 Jittor AI competition, namely the landscape picture generation, 50,000 RMB bonus (1st of 154 teams)
  • Second place in the 2020 Zhengtu Cup's first AI competition, namely the industrial defect detection algorithm, 150,000 RMB bonus (2nd of 900 teams)
  • Champion of the 2016 Didi Tech Di-Tech's first big data competition, namely the travel demand prediction algorithm, 100,000 US dollars bonus (1st of 7664 teams)
  • Champion of the 2015 Alibaba Tianchi's first big data competition, namely Ali mobile recommendation algorithm, 300,000 RMB bonus (1st of 7186 teams)
  • 2015 Dean Medal of School of Computer Science, Nanjing University of Science and Technology, 2016 Presidential Medal of Nanjing University of Science and Technology, 2016 National Scholarship
  • ACM-ICPC Asia Regional Contest, Silver Medal (1st)

News

Selected Publications

(* indicates equal contribution, # corresponding author)
Large Selective Kernel Network for Remote Sensing Object Detection
Yuxuan Li, Qibin Hou, Zhaohui Zheng, Ming-Ming Cheng, Jian Yang#, Xiang Li#
in ICCV, 2023
[Paper] [BibTex] [Code]

LSKNet can dynamically adjust its large spatial receptive field to better model the ranging context of various categories of objects in remote sensing scenarios.
Curriculum Temperature for Knowledge Distillation
Zheng Li, Xiang Li#, Lingfeng Yang, Borui Zhao, Renjie Song, Lei Luo, Jun Li, Jian Yang#
in AAAI, 2023
[Paper] [BibTex] [Code]
CTKD organizes the distillation task from easy to hard through a dynamic and learnable temperature. The temperature is learned during the student’s training process with a reversed gradient that aims to maximize the distillation loss in an adversarial manner.
RecursiveMix: Mixed Learning with History
Lingfeng Yang*, Xiang Li*, Borui Zhao, Renjie Song, Jian Yang#
in NeurIPS (Spotlight), 2022
[Paper] [BibTex] [Code]
RecursiveMix is a simple but effective data augmentation technique that is the first to leverage the historical input-prediction-label triplets.
DTG-SSOD: Dense Teacher Guidance for Semi-Supervised Object Detection
Gang Li, Xiang Li#, Yujie Wang, Yichao Wu, Ding Liang, Shanshan Zhang#
in NeurIPS, 2022
[Paper] [BibTex] [Code(to be released)]
DTG-SSOD explores a novel “dense-to-dense” paradigm, instead of the traditional “sparse-to-dense” paradigm, for effective semi-supervised object detection.
PseCo: Pseudo Labeling and Consistency Training for Semi-Supervised Object Detection
Gang Li, Xiang Li#, Yujie Wang, Yichao Wu, Ding Liang, Shanshan Zhang#
in ECCV, 2022
[Paper] [BibTex] [Code] [Blog(Chinese)] [Video(Chinese)]
PseCo delves into two key techniques of semi-supervised learning (i.e., pseudo labeling and consistency training) for SSOD, and integrates object detection properties into them.
Uniform Masking: Enabling MAE Pre-training for Pyramid-based Vision Transformers with Locality
Xiang Li, Wenhai Wang, Lingfeng Yang, Jian Yang#
in arXiv, 2022
[Paper] [BibTex] [Code] [Blog(Chinese)]
UM-MAE is an efficient and general technique that supports MAE-style MIM Pre-training for popular Pyramid-based Vision Transformers (e.g., PVT, Swin).
Dynamic MLP for Fine-Grained Image Classification by Leveraging Geographical and Temporal Information
Lingfeng Yang, Xiang Li#, Renjie Song, Borui Zhao, Juntian Tao, Shihao Zhou, Jiajun Liang, Jian Yang#
in CVPR (Oral), 2022
[Paper] [BibTex] [Code]
A very simple and effective approach for fine-grained recognition tasks using auxiliary knowledge such as geographical/temporal information. We achieve SOTA results and took third place in the iNaturalist challenge at FGVC8 (CVPR21 workshop).
Knowledge Distillation for Object Detection via Rank Mimicking and Prediction-guided Feature Imitation
Gang Li*, Xiang Li*, Yujie Wang, Shanshan Zhang#, Yichao Wu, Ding Liang
in AAAI, 2022
[Paper] [BibTex]
Rank mimicking and prediction-guided feature imitation for knowledge distillation of dense object detection: a simple and effective approach!
PVTv2: Improved Baselines with Pyramid Vision Transformer
Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan, Kaitao Song, Ding Liang, Tong Lu, Ping Luo, Ling Shao
Technical Report, 2021
[Paper] [Code] [中文解读] [Report] [Talk] [BibTex]
A better PVT.
Pyramid Vision Transformer: A Versatile Backbone for Dense Prediction without Convolutions
Wenhai Wang, Enze Xie, Xiang Li, Deng-Ping Fan#, Kaitao Song, Ding Liang, Tong Lu#, Ping Luo, Ling Shao
in ICCV, 2021 (oral presentation)
[Paper] [Code] [中文解读] [Report] [Talk] [BibTex]
A pure Transformer backbone for dense prediction, such as object detection and semantic segmentation.
PAN++: Towards Efficient and Accurate End-to-End Spotting of Arbitrarily-Shaped Text
Wenhai Wang*, Enze Xie*, Xiang Li, Xuebo Liu, Ding Liang, Zhibo Yang, Tong Lu#, Chunhua Shen
TPAMI, 2021
[Paper] [Code] [BibTex]
We extend PSENet (CVPR'19) and PAN (ICCV'19) to a text spotting system.
Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection
Xiang Li*, Wenhai Wang, Xiaolin Hu, Jun Li, Jinhui Tang, Jian Yang
in CVPR, 2021
[Paper] [Code] [BibTex]
The improved version of GFocal!
Generalized Focal Loss: Learning Qualified and Distributed Bounding Boxes for Dense Object Detection
Xiang Li*, Wenhai Wang, Lijun Wu, Shuo Chen, Xiaolin Hu, Jun Li, Jinhui Tang, Jian Yang
in NeurIPS, 2020
[Paper] [Code]
We propose the generalized focal loss for learning improved representations for dense object detectors. GFocal is officially included in [MMDetection], and is an important part of the [winning solution] in the GigaVision contest (object detection and tracking tracks) hosted at an ECCV 2020 workshop (winner: DeepBlueAI team).
Selective kernel networks
Xiang Li*, Wenhai Wang, Xiaolin Hu, Jian Yang
in CVPR, 2019
[Paper] [BibTex] [Code]
We propose a selective kernel mechanism for convolution.
Understanding the disharmony between dropout and batch normalization by variance shift
Xiang Li*, Shuo Chen, Xiaolin Hu, Jian Yang
in CVPR, 2019
[Paper] [BibTex]
We explore and address the disharmony between dropout and batch normalization.
Understanding the disharmony between weight normalization family and weight decay
Xiang Li*, Shuo Chen, Jian Yang
in AAAI, 2020
[Paper]
We explore and address the disharmony between weight normalization family and weight decay.
LightRNN: Memory and computation-efficient recurrent neural networks
Xiang Li*, Tao Qin, Jian Yang, Tie-Yan Liu
in NeurIPS, 2016
[Paper] [BibTex]
We propose a memory- and computation-efficient recurrent neural network for language modeling.
Stacked conditional generative adversarial networks for jointly learning shadow detection and shadow removal
Jifeng Wang*, Xiang Li*, Jian Yang
in CVPR, 2018
[Paper] [BibTex] [Dataset]
We release a new dataset for joint shadow detection and removal.
Shape Robust Text Detection with Progressive Scale Expansion Network
Wenhai Wang*, Enze Xie*, Xiang Li*, Wenbo Hou, Tong Lu, Gang Yu, Shuai Shao
in CVPR, 2019
[Paper] [Poster] [BibTex] [Code]
We propose a segmentation-based text detector that can precisely detect text instances with arbitrary shapes.
Mixed Link Networks
Wenhai Wang*, Xiang Li*, Jian Yang, Tong Lu
in IJCAI, 2018
[Paper] [Poster] [BibTex] [Code]
We propose a parameter-efficient convolutional neural network for image classification.

Review Services

Journal Reviewer
IEEE Transactions on Neural Networks and Learning Systems (TNNLS)
IEEE Transactions on Image Processing (TIP)
IEEE Transactions on Multimedia (TMM)
International Journal of Computer Vision (IJCV)
IEEE Transactions on Pattern Analysis and Machine Intelligence (TPAMI)

Conference Reviewer
IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2020, 2021, 2022, 2023
AAAI Conference on Artificial Intelligence (AAAI), 2019, 2020, 2021, 2022, 2023
European Conference on Computer Vision (ECCV), 2022