Title: Deep Domain Adaptation for Computer Vision
Deep learning methods suffer from domain shift between the source domain of the training data and the target domain of the application. In recent years, several domain adaptation methods have been proposed to reduce this shift so that deep neural network models can be used in practical applications. In the first part of the tutorial, we provide an overview of different domain adaptation problems and survey some representative domain adaptation techniques, including distribution alignment, adversarial domain adaptation, and data augmentation with image domain transfer. We also discuss examples of applying domain adaptation techniques to face recognition and face anti-spoofing.
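As a rough illustration of the distribution-alignment idea mentioned above, the sketch below measures how far apart two sets of feature vectors are using the maximum mean discrepancy (MMD) with an RBF kernel. MMD is used here as an assumption: it is one common alignment criterion, and the abstract does not say which criteria the tutorial covers. The function names, the `gamma` value, and the synthetic Gaussian "features" are all hypothetical and chosen only for illustration.

```python
import math
import random


def rbf_kernel(x, y, gamma):
    """RBF kernel exp(-gamma * ||x - y||^2) between two feature vectors."""
    sq_dist = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * sq_dist)


def mmd_squared(source, target, gamma=1.0):
    """Biased estimate of the squared MMD between two sets of feature vectors."""

    def mean_kernel(xs, ys):
        total = sum(rbf_kernel(x, y, gamma) for x in xs for y in ys)
        return total / (len(xs) * len(ys))

    return (
        mean_kernel(source, source)
        + mean_kernel(target, target)
        - 2.0 * mean_kernel(source, target)
    )


def sample_features(n, mean, rng):
    """Hypothetical 2-D features drawn from a unit-variance Gaussian."""
    return [[rng.gauss(mean, 1.0), rng.gauss(mean, 1.0)] for _ in range(n)]


rng = random.Random(0)
source = sample_features(200, 0.0, rng)   # stand-in for source-domain features
shifted = sample_features(200, 2.0, rng)  # target features with a domain shift
aligned = sample_features(200, 0.0, rng)  # target features after ideal alignment

mmd_shifted = mmd_squared(source, shifted)
mmd_aligned = mmd_squared(source, aligned)
print(f"MMD^2 with shift:    {mmd_shifted:.4f}")
print(f"MMD^2 after aligning: {mmd_aligned:.4f}")
```

In a training setup, a quantity of this kind would typically be added to the task loss so that the feature extractor is encouraged to produce source and target features with a small discrepancy.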
In the second part of the tutorial, we focus on GAN-based image-translation models that achieve domain adaptation under large and complex domain shifts, such as day-to-night. AugGAN can transform on-road driving images to a desired domain while preserving the objects in the image. AugGAN was later extended to a multimodal structure-consistent GAN, called Multimodal AugGAN, which can transform daytime on-road driving images to nighttime counterparts with different ambient light levels. We will discuss results of these image-translation models across different weather conditions, times of day, and datasets, and their application to training object detectors on a target domain.
In the third part, we discuss several other problem settings of domain adaptation. In semantic segmentation, when adapting from a synthetic source domain to an unlabeled target domain, some recent methods employ within-domain adaptation to alleviate semantic inconsistency. Multi-source domain adaptation focuses on learning a domain-agnostic model, which must handle conflicts across multiple source domains as well as narrow the domain gap between the source and target domains. Another setting is source-free adaptation, a special case of model adaptation in which the source dataset is unavailable during adaptation. Finally, we discuss domain generalization, the setting in which multiple source domains are available during training but no target domain is available for distribution alignment.
Part 1 (Shang-Hong Lai)
● Overview of domain adaptation (15 minutes)
○ Problem description
○ Categorization of domain adaptation problems
● Deep domain adaptation techniques (45 minutes)
○ Distribution alignment
○ Adversarial domain adaptation
○ Image domain transfer
○ Applications on face recognition and anti-spoofing
Part 2 (Che-Tsung Lin)
● Domain adaptation via GAN-based image-to-image translation (60 minutes)
○ Domain adaptation via image-translation
○ Unimodal structure-consistent image-to-image translation
○ Multimodal structure-consistent image-to-image translation
○ Results of image-translation across different weather conditions, times of day, and datasets
Part 3 (Chiou-Ting Hsu)
● Other domain adaptation settings (60 minutes)
○ Cross-domain adaptation vs. within-domain adaptation
○ Multi-source domain adaptation
○ Source-free adaptation
○ Source-only domain generalization
Principal Researcher, Microsoft AI R&D Center, Taiwan
Professor, National Tsing Hua University, Taiwan
Email: email@example.com ; firstname.lastname@example.org
Shang-Hong Lai received the Ph.D. degree from the University of Florida, Gainesville, USA, in 1995. He worked at Siemens Corporate Research in Princeton, New Jersey, USA, as a member of technical staff from 1995 to 1999. In 1999, he joined the Department of Computer Science, National Tsing Hua University (NTHU), Taiwan, where he is now a professor. Since the summer of 2018, Dr. Lai has been on leave from NTHU to join Microsoft AI R&D Center, Taiwan. He is currently a principal researcher at Microsoft AI R&D Center and leads a science team focusing on computer vision research for face-related applications.
Dr. Lai’s research interests are mainly focused on computer vision, image processing, and machine learning. He has authored more than 300 papers published in refereed international journals and conferences in these areas. In addition, he has been awarded around 30 patents for his research in computer vision. He has been involved in the organization of a number of international conferences in computer vision and related areas, including ICCV, CVPR, ECCV, ACCV, and ICIP. Furthermore, he has served as an associate editor for Pattern Recognition and Journal of Signal Processing Systems.
Postdoctoral Researcher, Chalmers University of Technology, Sweden
Email: email@example.com ; firstname.lastname@example.org
Che-Tsung Lin received his B.S. degree in Mechanical Engineering from National Taiwan University of Science and Technology in 2003, his M.S. degree from the Institute of Applied Mechanics, National Taiwan University, in 2005, and his Ph.D. degree from the Department of Computer Science, National Tsing Hua University, in 2020. He was an associate researcher from 2006 to 2014, a researcher from 2014 to 2020, and a senior researcher in 2020 in the Intelligent Mobility Division, Mechanical and Systems Lab, Industrial Technology Research Institute, Taiwan. From April to October 2013, he was a visiting researcher at the Department of Computer Science, University of California, Santa Barbara, USA. He is currently a postdoctoral researcher at Chalmers University of Technology, Sweden. His research focuses on object detection, semantic segmentation, domain adaptation, and their applications in ADAS and autonomous vehicles.
Professor, National Tsing Hua University, Taiwan
Chiou-Ting Hsu received the Ph.D. degree in computer science and information engineering from National Taiwan University, Taipei, Taiwan, in 1997. From 1998 to 1999, she was with Philips Innovation Center, Taipei, Philips Research, as a senior research engineer. In 1999, she joined the Department of Computer Science, National Tsing Hua University, Taiwan, where she is now a professor. She was a visiting scholar at Columbia University, New York, USA, in 2005, at the University of Maryland, College Park, USA, in 2009, and at EURECOM, France, in 2019. She was an associate editor of Advances in Multimedia and the IEEE Transactions on Information Forensics and Security (2012-2015), and is currently an associate editor of Journal of Visual Communication and Image Representation and EURASIP Journal on Image and Video Processing. She was an elected member of the IEEE Information Forensics and Security Technical Committee (2013-2015) and of the APSIPA Image, Video, and Multimedia Technical Committee (2013-2016).