Domain Adaptation Paper Reading Notes
Table of Contents
- Domain Adaptation Paper Reading Notes
- 1. Method Summary
- Unsupervised Domain Adaptation by Backpropagation
- Learning Transferable Features with Deep Adaptation Networks
- Coupled Generative Adversarial Networks
- Domain Separation Networks
- DiDA: Disentangled Synthesis for Domain Adaptation
- Unsupervised Domain Adaptation in the Wild via Disentangling Representation Learning
- Unsupervised Domain Adaptation via Disentangled Representations: Application to Cross-Modality Liver Segmentation
- Learning from Synthetic Data: Addressing Domain Shift for Semantic Segmentation
- Contrastive Adaptation Network for Unsupervised Domain Adaptation (CVPR 2019)
- MME: Semi-supervised Domain Adaptation via Minimax Entropy
- PAC: Surprisingly Simple Semi-Supervised Domain Adaptation with Pretraining and Consistency (BMVC 2021)
- Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation
- 2. Experiment part
- 1. (Unsupervised) Domain Adaptation
- 2. Joint-Domain Learning
- 3. Analysis part
What is Domain Adaptation (DA)? It attempts to map representations between the two domains, or to learn to extract features that are domain-invariant.
The source domain has labels; the target domain does not.
1. Method Summary
Unsupervised Domain Adaptation by Backpropagation
The gradients from the domain classifier pass through a gradient reversal layer (GRL), so that the information the encoder extracts becomes useless for domain classification, i.e., a domain-invariant feature.
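A minimal PyTorch sketch of such a gradient reversal layer (the fixed `lambd` here is an assumption; the paper schedules it during training):

```python
import torch

class GradReverse(torch.autograd.Function):
    """Identity in the forward pass; multiplies the gradient by -lambd backward."""

    @staticmethod
    def forward(ctx, x, lambd):
        ctx.lambd = lambd
        return x.view_as(x)

    @staticmethod
    def backward(ctx, grad_output):
        # Reversed (and scaled) gradient flows back into the encoder;
        # no gradient is needed for lambd itself.
        return -ctx.lambd * grad_output, None

def grad_reverse(x, lambd=1.0):
    return GradReverse.apply(x, lambd)

# Usage: domain_logits = domain_classifier(grad_reverse(encoder(x)))
# Minimizing the domain-classification loss then trains the classifier normally
# while pushing the encoder toward domain-invariant features.
```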
Learning Transferable Features with Deep Adaptation Networks
(https://blog.csdn.net/weixin_40526176/article/details/79065861)
- Multi-layer adaptation
- Adapt the last 3 layers: the last 3 layers (of AlexNet) are considered task-specific; for other networks the similarity has to be computed separately
- Multi-kernel MMD (Maximum Mean Discrepancy)
- Used to measure the distance between features from different domains; effectively replaces the "maximize domain error" of method 1 with "minimize MMD" here (see the sketch below)
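A minimal sketch of a multi-kernel RBF MMD estimate between two feature batches; the `gammas` bandwidths are illustrative assumptions, not the paper's values:

```python
import torch

def mk_mmd(x, y, gammas=(0.5, 1.0, 2.0)):
    """Biased estimate of MMD^2 with a sum of RBF kernels.
    x: (n, d) source features; y: (m, d) target features."""
    def k(a, b):
        d2 = torch.cdist(a, b) ** 2          # pairwise squared distances
        return sum(torch.exp(-g * d2) for g in gammas)
    return k(x, x).mean() + k(y, y).mean() - 2 * k(x, y).mean()

# Added to the task loss, mk_mmd(source_feat, target_feat) replaces the
# adversarial "maximize domain error" objective with "minimize MMD".
```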
Coupled Generative Adversarial Networks
- Even without labeled cross-domain pairs, the joint distribution of the two domains can be learned via weight sharing and adversarial learning: feeding the same vector z to the two generators yields a pair of outputs that are semantically related while each keeps its own domain's characteristics.
- Weight sharing is applied as shown in the highlighted part (z is a random vector). Because of the weight sharing, the layers corresponding to high-level semantic information are guaranteed to process that information in the same way.
- This is not DA per se, but the framework can be applied to DA and seems to work quite well: although the target domain has no labels, the source domain does, and the weight-sharing mechanism means the two generators' outputs g(z) should, in theory, depict the same digit (see the sketch below).
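A minimal sketch of the weight-sharing idea, assuming fully-connected generators with illustrative layer sizes: the shared trunk decodes the high-level semantics, while per-domain heads add domain-specific appearance.

```python
import torch
import torch.nn as nn

class CoupledGenerators(nn.Module):
    """Two generators sharing the early layers (high-level semantics) while
    keeping separate output heads (domain-specific appearance)."""

    def __init__(self, z_dim=100, img_dim=784):
        super().__init__()
        self.shared = nn.Sequential(
            nn.Linear(z_dim, 256), nn.ReLU(),
            nn.Linear(256, 512), nn.ReLU(),
        )
        self.head_a = nn.Sequential(nn.Linear(512, img_dim), nn.Tanh())
        self.head_b = nn.Sequential(nn.Linear(512, img_dim), nn.Tanh())

    def forward(self, z):
        h = self.shared(z)                       # same semantics for both domains
        return self.head_a(h), self.head_b(h)    # a semantically paired image pair

# img_a, img_b = CoupledGenerators()(torch.randn(8, 100))
```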
Domain Separation Networks
Building a framework that directly extracts only domain-invariant features has a pitfall: "these representations might trivially include noise that is highly correlated with the shared representation."
- Overall Loss: in the paper, $\mathcal{L} = \mathcal{L}_{\text{task}} + \alpha\,\mathcal{L}_{\text{recon}} + \beta\,\mathcal{L}_{\text{difference}} + \gamma\,\mathcal{L}_{\text{similarity}}$
- Reconstruction Loss:
- Use scale-invariant MSE, because plain MSE "penalizes predictions that are correct up to a scaling term", whereas scale-invariant MSE "penalizes differences between pairs of pixels. This allows the model to learn to reproduce the overall shape of the objects being modeled without expending modeling power on the absolute color or intensity of the inputs." (Why would absolute scale distract the model?) A sketch follows this list.
- L_dif: encourages orthogonality between the shared and private representations (why? In the paper it is $\|H_c^\top H_p\|_F^2$, which reaches zero exactly when the shared and private features are orthogonal).
- L_sim:
- domain classifier (gradient reversal layer) + CE loss
- MMD loss
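A hedged sketch of the two DSN-specific losses above (the scale-invariant reconstruction loss and the difference loss); the tensor shapes and per-sample flattening are assumptions:

```python
import torch

def scale_invariant_mse(pred, target):
    """DSN's scale-invariant MSE: penalizes differences between pairs of pixels
    rather than absolute intensity. k = number of pixels per sample."""
    diff = (pred - target).flatten(start_dim=1)
    k = diff.size(1)
    return (diff.pow(2).mean(dim=1) - diff.sum(dim=1).pow(2) / k ** 2).mean()

def difference_loss(shared, private):
    """Soft orthogonality constraint ||H_c^T H_p||_F^2: zero exactly when the
    shared (n, d) and private (n, d) representations are orthogonal."""
    return (shared.t() @ private).pow(2).sum()
```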
DiDA: Disentangled Synthesis for Domain Adaptation
By alternating between the two steps of domain adaptation and disentangled synthesis, progressively better labeled target data are obtained.
- DA: train a domain-invariant common feature
- Disentangle: on top of DA, train a specific feature; the combination of common and specific features must reconstruct the input, but the specific feature must be useless for classification (perhaps via a GRL here?)
Unsupervised Domain Adaptation in the Wild via Disentangling Representation Learning
- As the category information between the source and the target domains can be imbalanced, directly aligning latent feature representations may lead to negative transfer.
- So they disentangle the latent feature to category related code (global code) as well as style related code (local code).
Unsupervised Domain Adaptation via Disentangled Representations: Application to Cross-Modality Liver Segmentation
- For each domain, extract a style code $s_i$ and a content code $c_i$, then feed these codes into G (how exactly are they fed in?) to obtain the corresponding image (is disentanglement necessary to obtain a content-only image?); a hedged sketch follows this list
- Through this training, content-only images can be obtained
- The content-only images are then used to train a new model
- This method can be used for domain adaptation, and also for joint-domain learning
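A heavily hedged sketch of one possible way to feed the codes into G: concatenating content and style, with a zero style code standing in for "content-only" synthesis. `G`, `E_c`, `E_s` are hypothetical modules; the paper's actual injection mechanism is not recorded in these notes.

```python
import torch

def reconstruct(G, E_c, E_s, x):
    """Encode content c_i and style s_i, then recombine them in G."""
    c, s = E_c(x), E_s(x)
    return G(torch.cat([c, s], dim=1))

def content_only(G, E_c, x, style_dim):
    """'Content-only' synthesis under the assumption that a zero style code
    acts as a neutral style."""
    c = E_c(x)
    s0 = torch.zeros(c.size(0), style_dim, device=c.device)
    return G(torch.cat([c, s0], dim=1))
```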
Learning from Synthetic Data: Addressing Domain Shift for Semantic Segmentation
(https://github.com/swamiviv/LSD-seg)
- Discriminator: a variant of the patch discriminator, where "each pixel in the output map indicates real/fake probabilities across source and target domains hence resulting in four classes per pixel: src-real, src-fake, tgt-real, tgt-fake."
- Auxiliary Classifier GAN (ACGAN) idea: "by conditioning G during training and adding an auxiliary classification loss to D, they can realize a more stable GAN training and even generate large scale images." Perhaps this could be used to reconstruct the full image.
Iteratively update:
- The training of F is tied to the cross-domain objective (see the sketch below): "To update F, we use the gradients from D that lead to a reversal in domain classification, i.e. for source embeddings, we use gradients from D corresponding to classifying those embeddings as from target domain."
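A hedged sketch of the flipped-label update for F against the four-class patch discriminator; which of the two opposite-domain classes to flip to ("tgt-real" / "src-real" here) is an assumption.

```python
import torch
import torch.nn.functional as nnf

# Per-pixel classes from the four-class patch discriminator:
# 0 src-real, 1 src-fake, 2 tgt-real, 3 tgt-fake.
def f_update_loss(d_logits_src, d_logits_tgt):
    """d_logits_*: (N, 4, H, W) scores from D. Domain labels are flipped so the
    gradients from D push source embeddings toward the target domain and
    vice versa."""
    n, _, h, w = d_logits_src.shape
    as_tgt = torch.full((n, h, w), 2, dtype=torch.long, device=d_logits_src.device)
    n, _, h, w = d_logits_tgt.shape
    as_src = torch.full((n, h, w), 0, dtype=torch.long, device=d_logits_tgt.device)
    return nnf.cross_entropy(d_logits_src, as_tgt) + nnf.cross_entropy(d_logits_tgt, as_src)
```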
Contrastive Adaptation Network for Unsupervised Domain Adaptation (CVPR 2019)
- Similar to MMD: proposes CDD (Contrastive Domain Discrepancy) to pull the target and source features at the fc layer together
- Alternating optimization: first cluster to obtain pseudo target labels, then use these labels with CDD to compute the intra-class and inter-class discrepancies, which in turn leads to better clusters (sketch below)
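A hedged sketch of the alternation, using k-means for the pseudo-labels and reusing `mk_mmd` from the DAN section per class as a simplified stand-in for the full CDD estimator:

```python
import torch
from sklearn.cluster import KMeans

def cdd_like(src_feat, src_labels, tgt_feat, num_classes):
    """One alternation step: k-means pseudo-labels on target features, then a
    per-class cross-domain discrepancy (here mk_mmd, a simplification of CDD)."""
    pseudo = KMeans(n_clusters=num_classes, n_init=10).fit_predict(
        tgt_feat.detach().cpu().numpy())
    pseudo = torch.as_tensor(pseudo, device=tgt_feat.device)
    intra = tgt_feat.new_zeros(())
    for c in range(num_classes):
        s, t = src_feat[src_labels == c], tgt_feat[pseudo == c]
        if len(s) and len(t):
            intra = intra + mk_mmd(s, t)   # pull same-(pseudo)class features together
    return intra  # the full CDD additionally *maximizes* inter-class discrepancy
```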
MME: Semi-supervised Domain Adaptation via Minimax Entropy
- First train F + C on labeled data: F extracts features, and C holds a set of weight vectors ("the weight vectors can be regarded as estimated prototypes for each class") that turn a feature into similarities to domain-invariant prototypes
- Then minimize entropy with respect to F, yielding discriminative features
- Maximize entropy (similarity) with respect to C, so that each class prototype (C's weights) moves close to the unlabeled target features (sketch below)
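A hedged sketch of the minimax-entropy term on unlabeled target features, reusing `grad_reverse` from the first section; the temperature `T` and weight `lambd` values are assumptions.

```python
import torch
import torch.nn.functional as nnf

def mme_loss(features, prototypes, T=0.05, lambd=0.1):
    """Minimizing this loss maximizes entropy w.r.t. the prototype weights C
    (prototypes move toward target features), while the reversed gradient makes
    F minimize entropy (discriminative features)."""
    feats = grad_reverse(nnf.normalize(features, dim=1), 1.0)
    logits = feats @ nnf.normalize(prototypes, dim=1).t() / T   # cosine similarity
    p = nnf.softmax(logits, dim=1)
    return lambd * (p * torch.log(p + 1e-8)).sum(dim=1).mean()  # = -lambd * entropy
```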
PAC: Surprisingly Simple Semi-Supervised Domain Adaptation with Pretraining and Consistency (BMVC 2021)
First pretrain with a rotation pretext task, then fine-tune for domain adaptation: labeled images must be classified correctly, and each unlabeled image's perturbed version should produce roughly the same output as the original (sketch below).
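A hedged, FixMatch-style sketch of the consistency part of the fine-tuning; `perturb` and the confidence threshold are assumptions, and the rotation-pretraining stage is omitted.

```python
import torch
import torch.nn.functional as nnf

def pac_step(model, x_lab, y_lab, x_unlab, perturb, thresh=0.9):
    """Supervised CE on labeled images + consistency on unlabeled images:
    the perturbed image should predict the clean image's (confident) label."""
    loss_sup = nnf.cross_entropy(model(x_lab), y_lab)
    with torch.no_grad():
        p = nnf.softmax(model(x_unlab), dim=1)     # predictions on the clean image
        conf, pseudo = p.max(dim=1)
    logits_aug = model(perturb(x_unlab))           # predictions on the perturbed image
    loss_cons = (nnf.cross_entropy(logits_aug, pseudo, reduction='none')
                 * (conf > thresh).float()).mean() # keep only confident targets
    return loss_sup + loss_cons
```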
Learning Invariant Representations and Risks for Semi-supervised Domain Adaptation
- A fairly theoretical paper; the core idea is to simultaneously learn an invariant representation and an invariant risk (data are collected from multiple environments with different distributions, where spurious correlations are due to dataset biases; such spurious correlations will confuse the model into building predictions on unrelated correlations rather than true causal relations)
- Align their optimal predictors?
2. Experiment part
1. (Unsupervised) Domain Adaptation
- Train: source
- Test: target
Method | Exp setup |
---|---|
Unsupervised Domain Adaptation by Backpropagation | Works reasonably when the source is more complex than the target; works poorly when the source is simpler than the target |
Learning Transferable Features with Deep Adaptation Networks | 1. Unsupervised adaptation -> use all source examples with labels and all target examples without labels. 2. Semi-supervised adaptation -> randomly down-sample the source examples, and further require 3 labeled target examples per category. |
Domain-Adversarial Training of Neural Networks | |
Coupled Generative Adversarial Networks | |
Domain Separation Networks | Uses two baselines as lower and upper bounds (no DA: train only on source, or only on target) |
2. Joint-Domain Learning
- Mix data from multiple domains and train on them together
- Goal: obtain better results than training on any single domain alone
3. Analysis part
Visualization (t-SNE): show that the features the model produces on the target domain are:
- easier to discriminate
- better aligned with the source
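A minimal sketch of this t-SNE check, embedding source and target features in one plot (`sklearn` and `matplotlib` assumed):

```python
import numpy as np
import matplotlib.pyplot as plt
from sklearn.manifold import TSNE

def plot_tsne(src_feat, tgt_feat):
    """Embed source and target features together; overlapping yet well-separated
    clusters indicate discriminative, source-aligned target features."""
    feats = np.concatenate([src_feat, tgt_feat], axis=0)
    emb = TSNE(n_components=2, init='pca').fit_transform(feats)
    n = len(src_feat)
    plt.scatter(emb[:n, 0], emb[:n, 1], s=4, label='source')
    plt.scatter(emb[n:, 0], emb[n:, 1], s=4, label='target')
    plt.legend()
    plt.show()
```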