CVPR2021 Complete Roundup: Papers by Category / Code / Projects / Paper Interpretations (Continuously Updated) [Computer Vision]

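As one of the three top conferences in computer vision, CVPR2021 has now released the IDs of all accepted papers. A total of 1,663 papers were accepted, an acceptance rate of 23.7%. Although the acceptance rate is higher than last year's, the competition remains fierce. Related coverage: "CVPR 2021 acceptance results are out! 1,663 papers accepted, acceptance rate up. Did your paper make it?"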

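In this post we sort the latest CVPR2021 papers by category, and we will follow up with interpretations, reports, and live technical talks on outstanding papers. We will keep tracking and categorizing CVPR2021 papers as they appear. Click the follow button at the end of the post to get the latest updates to this thread.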

We previously compiled and categorized the papers of CVPR2020 and CVPR2019 as well.

All of our CVPR paper compilations are collected in our GitHub project, which has earned 6,500 stars so far.
GitHub project: https://github.com/extreme-assistant/CVPR2021-Paper-Code-Interpretation

The sections below organize the CVPR2021 papers by research direction:

Categories:

1. Detection

2. Segmentation

3. Image Processing

4. Estimation

5. Image & Video Retrieval/Video Understanding

6. Face

7. 3D Vision

8. Object Tracking

9. Medical Imaging

10. Text Detection/Recognition

11. Remote Sensing Image

12. GAN/Generative/Adversarial

13. Image Generation/Image Synthesis

14. Scene Graph

15. Visual Localization

16. Visual Reasoning/VQA

17. Image Classification

18. Neural Network Structure Design

19. Model Compression

20. Model Training/Generalization

21. Model Evaluation

22. Neural Architecture Search (NAS)

23. Data Processing

24. Active Learning

25. Few-shot Learning/Zero-shot Learning/Meta-Learning

26. Continual Learning/Life-long Learning

27. Transfer Learning/Domain Adaptation

28. Metric Learning

29. Contrastive Learning

30. Reinforcement Learning

31. Meta Learning

32. Audio-visual Learning

33. Vision-based Prediction

34. Dataset

Uncategorized



Detection

Image Object Detection

[22] Adaptive Class Suppression Loss for Long-Tail Object Detection

paper | code

[21] DAP: Detection-Aware Pre-training with Weak Supervision

paper

[20] Dense Relation Distillation with Context-aware Aggregation for Few-Shot Object Detection

paper | code

[19] Scale-aware Automatic Augmentation for Object Detection

paper | code

[18] Data-Uncertainty Guided Multi-Phase Learning for Semi-Supervised Object Detection

paper

[17] OTA: Optimal Transport Assignment for Object Detection

paper | code

[16] Distilling Object Detectors via Decoupled Features

paper | code

[15] I^3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors

paper

[14] Robust and Accurate Object Detection via Adversarial Learning

paper

[13] You Only Look One-level Feature

paper | code

[12] End-to-End Object Detection with Fully Convolutional Network

paper | code

Interpretation: Drop the Transformer; an FCN can also achieve end-to-end detection

[11] FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding

paper

[10] Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection

paper | code

Interpretation: Generalized Focal Loss V2 in plain language

[9] MeGA-CDA: Memory Guided Attention for Category-Aware Unsupervised Domain Adaptive Object Detection

paper

[8] OPANAS: One-Shot Path Aggregation Network Architecture Search for Object

paper | code

[7] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

paper

[6] General Instance Distillation for Object Detection

paper

[5] Instance Localization for Self-supervised Detection Pretraining

paper | code

[4] Multiple Instance Active Learning for Object Detection

paper | code

[3] Towards Open World Object Detection

paper | code

[2] Positive-Unlabeled Data Purification in the Wild for Object Detection

[1] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

paper | code

Interpretation: Unsupervised pre-training for detectors

Video Object Detection

[4] Dogfight: Detecting Drones from Drones Videos

paper

[3] Depth from Camera Motion and Object Detection

paper

[2] There is More than Meets the Eye: Self-Supervised Multi-Object Detection and Tracking with Sound by Distilling Multimodal Knowledge

paper | video | project

[1] Dogfight: Detecting Drones from Drone Videos

3D Object Detection

[12] Objects are Different: Flexible Monocular 3D Object Detection

paper | code

[11] HVPR: Hybrid Voxel-Point Representation for Single-stage 3D Object Detection

paper

[10] GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection

paper | code

[9] Delving into Localization Errors for Monocular 3D Object Detection

paper | code

[8] Depth-conditioned Dynamic Message Propagation for Monocular 3D Object Detection

paper | code

[7] LiDAR R-CNN: An Efficient and Universal 3D Object Detector

paper | code

[6] M3DSSD: Monocular 3D Single Stage Object Detector

paper

[5] MonoRUn: Monocular 3D Object Detection by Self-Supervised Reconstruction and Uncertainty Propagation

paper

[4] ST3D: Self-training for Unsupervised Domain Adaptation on 3D Object Detection

paper | code

[3] Center-based 3D Object Detection and Tracking

paper | code

[2] 3DIoUMatch: Leveraging IoU Prediction for Semi-Supervised 3D Object Detection

paper | code | project | video

[1] Categorical Depth Distribution Network for Monocular 3D Object Detection

paper

Human-Object Interaction Detection (HOI Detection)

[5] Affordance Transfer Learning for Human-Object Interaction Detection

paper | code

[4] Detecting Human-Object Interaction via Fabricated Compositional Learning

paper | code

[3] Reformulating HOI Detection as Adaptive Set Prediction

paper | code

[2] QPIC: Query-Based Pairwise Human-Object Interaction Detection with Image-Wide Contextual Information

paper | code

[1] End-to-End Human Object Interaction Detection with HOI Transformer

paper | code

Camouflaged Object Detection

[2] Uncertainty-aware Joint Salient Object and Camouflaged Object Detection

paper

[1] Simultaneously Localize, Segment and Rank the Camouflaged Objects

paper | code

Rotation Object Detection

[2] ReDet: A Rotation-equivariant Detector for Aerial Object Detection

paper | code

[1] Dense Label Encoding for Boundary Discontinuity Free Rotation Detection

paper | code | interpretation: DCL, a new method for rotated object detection

Salient Object Detection

[3] Weakly Supervised Video Salient Object Detection

paper

[2] Group Collaborative Learning for Co-Salient Object Detection

paper | project

[1] Deep RGB-D Saliency Detection with Depth-Sensitive Attention and Automatic Multi-Modal Fusion

paper

Anomaly Detection in Images

[1] Multiresolution Knowledge Distillation for Anomaly Detection

paper

Keypoint Detection

[1] Skeleton Merger: an Unsupervised Aligned Keypoint Detector

paper | code



Segmentation

Image Segmentation

[9] Adaptive Prototype Learning and Allocation for Few-Shot Segmentation

paper

[8] DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation

paper

[7] Self-Guided and Cross-Guided Learning for Few-Shot Segmentation

paper

[6] Locate then Segment: A Strong Pipeline for Referring Image Segmentation

paper

[5] Boundary IoU: Improving Object-Centric Image Segmentation Evaluation

paper | code

[4] PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation

paper

[3] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

paper | code

[2] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

paper | code

[1] PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation

Panoptic Segmentation

[4] Panoptic Segmentation Forecasting

paper

[3] Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation

paper

[2] Cross-View Regularization for Domain Adaptive Panoptic Segmentation

paper

[1] 4D Panoptic LiDAR Segmentation

paper

Semantic Segmentation

[20] Progressive Semantic Segmentation

paper

[19] InverseForm: A Loss Function for Structured Boundary-Aware Segmentation

paper

[18] 3D-to-2D Distillation for Indoor Scene Parsing

paper

[17] One Thing One Click: A Self-Training Approach for Weakly Supervised 3D Semantic Segmentation

paper

[16] Background-Aware Pooling and Noise-Aware Loss for Weakly-Supervised Semantic Segmentation

paper

[15] PiCIE: Unsupervised Semantic Segmentation using Invariance and Equivariance in Clustering

paper | code

[14] Source-Free Domain Adaptation for Semantic Segmentation

paper

[13] RobustNet: Improving Domain Generalization in Urban-Scene Segmentation via Instance Selective Whitening

paper | code

[12] Coarse-to-Fine Domain Adaptive Semantic Segmentation with Photometric Alignment and Category-Center Regularization

paper

[11] Cross-Dataset Collaborative Learning for Semantic Segmentation

paper

[10] BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation

paper

[9] Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations

paper

[8] Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion

paper

[7] Capturing Omni-Range Context for Omnidirectional Segmentation

paper

[6] MetaCorrection: Domain-aware Meta Loss Correction for Unsupervised Domain Adaptation in Semantic Segmentation

paper

[5] Learning Statistical Texture for Semantic Segmentation

paper

[4] Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation

paper

[3] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation

paper

[2] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

paper | code

[1] PLOP: Learning without Forgetting for Continual Semantic Segmentation

paper

Instance Segmentation

[7] DARCNN: Domain Adaptive Region-based Convolutional Neural Network for Unsupervised Instance Segmentation in Biomedical Images

paper

[6] Weakly-supervised Instance Segmentation via Class-agnostic Learning with Salient Images

paper | code

[5] FAPIS: A Few-shot Anchor-free Part-based Instance Segmenter

paper

[4] Weakly Supervised Instance Segmentation for Videos with Temporal Mask Consistency

paper

[3] Deep Occlusion-Aware Instance Segmentation with Overlapping BiLayers

paper | code

[2] BBAM: Bounding Box Attribution Map for Weakly Supervised Semantic and Instance Segmentation

paper

[1] End-to-End Video Instance Segmentation with Transformers

paper | code


Superpixel

[1] Learning the Superpixel in a Non-iterative and Lifelong Manner

paper

Video Object Segmentation

[3] Efficient Regional Memory Network for Video Object Segmentation

paper

[2] Learning to Recommend Frame for Interactive Video Object Segmentation in the Wild

paper | code

[1] Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion

paper | project

Matting

[1] Real-Time High Resolution Background Matting

paper | code | project | video

Dense Prediction

[3] Generic Perceptual Loss for Modeling Structured Output Dependencies

paper

[2] Densely connected multidilated convolutional networks for dense prediction tasks

paper

[1] Dense Contrastive Learning for Self-Supervised Visual Pre-Training

paper | code

Estimation

Human Pose Estimation

[12] Multi-View Multi-Person 3D Pose Estimation with Plane Sweep Stereo

paper | code

[11] Bottom-Up Human Pose Estimation Via Disentangled Keypoint Regression

paper | code

[10] Monocular 3D Multi-Person Pose Estimation by Integrating Top-Down and Bottom-Up Networks

paper | code

[9] Reconstructing 3D Human Pose by Watching Humans in the Mirror

paper | project

[8] SimPoE: Simulated Character Control for 3D Human Pose Estimation

paper | project

[7] Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-localization in Large Scenes from Body-Mounted Sensors

paper | project

[6] Graph Stacked Hourglass Networks for 3D Human Pose Estimation

paper

[5] From Synthetic to Real: Unsupervised Domain Adaptation for Animal Pose Estimation

paper | code

[4] DCPose: Deep Dual Consecutive Network for Human Pose Estimation

paper | code

[3] Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

paper | code

[2] CanonPose: Self-supervised Monocular 3D Human Pose Estimation in the Wild

[1] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers

paper

Gesture Estimation

[4] Fingerspelling Detection in American Sign Language

paper

[3] Read and Attend: Temporal Localisation in Sign Language Videos

paper | [project](https://www.robots.ox.ac.uk/~vgg/research/bslattend/)

[2] Skeleton Based Sign Language Recognition Using Whole-body Keypoints

paper | code

[1] Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration

paper | code

Optical Flow/Pose/Motion Estimation

[10] DSC-PoseNet: Learning 6DoF Object Pose Estimation via Dual-scale Consistency

paper

[9] Learning optical flow from still images

paper | project

[8] Learning Optical Flow from a Few Matches

paper | code

[7] FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds

paper

[6] Wide-Depth-Range 6D Object Pose Estimation in Space

paper

[5] Deep Two-View Structure-from-Motion Revisited

paper

[4] FS-Net: Fast Shape-based Network for Category-Level 6D Object Pose Estimation with Decoupled Rotation Mechanism

paper

[3] GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

paper | code

[2] Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments

paper | project

[1] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

paper | code

Depth Estimation

[10] Self-supervised Learning of Depth Inference for Multi-view Stereo

paper | code

[9] Depth Completion with Twin Surface Extrapolation at Occlusion Boundaries

paper

[8] S2R-DepthNet: Learning a Generalizable Depth-specific Structural Representation

paper

[7] RGB-D Local Implicit Function for Depth Completion of Transparent Objects

paper | code

[6] LED2-Net: Monocular 360 Layout Estimation via Differentiable Depth Rendering

paper | project

[5] Deep Two-View Structure-from-Motion Revisited

paper

[4] Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging

paper | project

[3] Generalizing to the Open World: Deep Visual Odometry with Online Adaptation

paper

[2] Beyond Image to Depth: Improving Depth Prediction using Echoes

paper | code

[1] PLADE-Net: Towards Pixel-Level Accuracy for Self-Supervised Single-View Depth Estimation with Neural Positional Encoding and Distilled Matting Loss

paper


Image Processing

[1] Invertible Image Signal Processing

paper | code

Super Resolution

[6] Unsupervised Degradation Representation Learning for Blind Super-Resolution

paper | code

[5] Flow-based Kernel Prior with Application to Blind Super-Resolution

paper | code

[4] ClassSR: A General Framework to Accelerate Super-Resolution Networks by Data Characteristic

paper | interpretation: ClassSR accelerates image super-resolution, cutting computation by 50% without degrading performance

[3] Learning Continuous Image Representation with Local Implicit Image Function

paper | code | video | project

[2] Data-Free Knowledge Distillation For Image Super-Resolution (a super-resolution version of the DAFL algorithm)

[1] AdderSR: Towards Energy Efficient Image Super-Resolution (applies adder networks to image super-resolution)

paper | code

Interpretation: Huawei open-sources its adder neural network

Image Restoration/Image Enhancement

[2] NeX: Real-time View Synthesis with Neural Basis Expansion

paper | code

[1] Multi-Stage Progressive Image Restoration

paper | code

Image Shadow Removal/Image Reflection Removal

[3] From Shadow Generation to Shadow Removal

paper

[2] Robust Reflection Removal with Reflection-free Flash-only Cues

paper | code

[1] Auto-Exposure Fusion for Single-Image Shadow Removal

paper | code

Image Denoising/Deblurring/Deraining/Dehazing

[4] Explore Image Deblurring via Blur Kernel Space

paper

[3] Semi-Supervised Video Deraining with Dynamic Rain Generator

paper

[2] ARVo: Learning All-Range Volumetric Correspondence for Video Deblurring

paper

[1] DeFMO: Deblurring and Shape Recovery of Fast Moving Objects

paper | code | video

Image Editing/Inpainting

[8] TransFill: Reference-guided Image Inpainting by Merging Multiple Color and Spatial Transformations

paper

[7] DeFLOCNet: Deep Image Editing via Flexible Low-level Controls

paper

[6] Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE

paper | code

[5] PISE: Person Image Synthesis and Editing with Decoupled GAN

paper | code

[4] DeFLOCNet: Deep Image Editing via Flexible Low level Controls

[3] PD-GAN: Probabilistic Diverse GAN for Image Inpainting

[2] Anycost GANs for Interactive Image Synthesis and Editing

paper | code

[1] Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

Image Translation

[6] ReMix: Towards Image-to-Image Translation with Limited Data

paper

[5] Closing the Loop: Joint Rain Generation and Removal via Disentangled Image Translation

paper

[4] CoMoGAN: continuous model-guided image-to-image translation

paper | code

[3] Spatially-Adaptive Pixelwise Networks for Fast Image Translation

paper | project

[2] Image-to-image Translation via Hierarchical Style Disentanglement

paper | code | interpretation: Hierarchical style disentanglement makes multi-attribute face manipulation controllable at last (CVPR2021 Oral)

[1] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

paper | code | project

Image Quality Assessment

[1] SDD-FIQA: Unsupervised Face Image Quality Assessment with Similarity Distribution Distance

paper

Style Transfer

[2] ArtFlow: Unbiased Image Style Transfer via Reversible Neural Flows

paper

[1] Rethinking Style Transfer: From Pixels to Parameterized Brushstrokes

paper


Face

[5] Towards High Fidelity Face Relighting with Realistic Shadows

paper

[4] Unsupervised Disentanglement of Linear-Encoded Facial Semantics

paper

[3] High-fidelity Face Tracking for AR/VR via Deep Lighting Adaptation

paper | project

[2] Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes

paper | code&project

[1] SDD-FIQA: Unsupervised Face Image Quality Assessment with Similarity Distribution Distance

paper

Facial Recognition/Detection

[9] IronMask: Modular Architecture for Protecting Deep Face Template

paper

[8] HLA-Face: Joint High-Low Adaptation for Low Light Face Detection

paper | project

[7] Dive into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty Estimation for Facial Expression Recognition

paper

[6] Affective Processes: stochastic modelling of temporal context for emotion and facial expression recognition

paper

[5] Cross-Domain Similarity Learning for Face Recognition in Unseen Domains

paper

[4] MagFace: A Universal Representation for Face Recognition and Quality Assessment

paper | code

[3] CRFace: Confidence Ranker for Model-Agnostic Face Detection Refinement

paper

[2] A 3D GAN for Improved Large-pose Facial Recognition

paper

[1] WebFace260M: A Benchmark Unveiling the Power of Million-Scale Deep Face Recognition

paper | benchmark

Face Generation/Face Synthesis/Face Reconstruction/Face Forgery/Face Editing

[11] Riggable 3D Face Reconstruction via In-Network Optimization

paper | code

[10] Everything's Talkin': Pareidolia Face Reenactment

paper | project

[9] Face Forensics in the Wild (a face forgery dataset)

paper | paper

[8] High-Fidelity and Arbitrary Face Editing

paper

[7] Frequency-aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection

paper

[6] 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction

paper | project

[5] ForgeryNet: A Versatile Benchmark for Comprehensive Forgery Analysis

paper | code

[4] Image-to-image Translation via Hierarchical Style Disentanglement

paper | code

[3] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

paper | code

[2] PISE: Person Image Synthesis and Editing with Decoupled GAN

paper | code

[1] Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders

paper | code | project

Face Anti-Spoofing

[3] MagDR: Mask-guided Detection and Reconstruction for Defending Deepfakes

paper

[2] Cross Modal Focal Loss for RGBD Face Anti-Spoofing

paper

[1] Multi-attentional Deepfake Detection

paper


Object Tracking

[16] Multiple Object Tracking with Correlation Learning

paper

[15] Learning to Track Instances without Video Annotations

paper

[14] STMTrack: Template-free Visual Tracking with Space-time Memory Networks

paper | code

[13] Online Multiple Object Tracking with Cross-Task Synergy

paper | code

[12] Towards More Flexible and Accurate Object Tracking with Natural Language: Algorithms and Benchmark

paper

[11] Learnable Graph Matching: Incorporating Graph Partitioning with Deep Feature Learning for Multiple Object Tracking

paper | code

[10] IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking

paper | code

[9] Transformer Tracking

paper | code

[8] Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

paper

[7] Track to Detect and Segment: An Online Multi-Object Tracker

paper | code

[6] Learning a Proposal Classifier for Multiple Object Tracking

paper | code

[5] Center-based 3D Object Detection and Tracking

paper | code

[4] HPS: localizing and tracking people in large 3D scenes from wearable sensors

[3] Track to Detect and Segment: An Online Multi-Object Tracker

project | video

[2] Probabilistic Tracklet Scoring and Inpainting for Multiple Object Tracking

paper

[1] Rotation Equivariant Siamese Networks for Tracking

paper




Image & Video Retrieval/Video Understanding

[6] Thinking Fast and Slow: Efficient Text-to-Visual Retrieval with Transformers

paper

[5] StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

paper

[4] More Photos are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

paper | code

[3] Revamping Cross-Modal Recipe Retrieval with Hierarchical Transformers and Self-supervised Learning

paper

[2] On Semantic Similarity in Video Retrieval

paper | code

[1] QAIR: Practical Query-efficient Black-Box Attacks for Image Retrieval

paper

Action/Activity Recognition, Detection, Segmentation, and Localization

[19] Self-Supervised Learning for Semi-Supervised Temporal Action Proposal

paper

[18] Anchor-Constrained Viterbi for Set-Supervised Action Segmentation

paper

[17] Action Shuffle Alternating Learning for Unsupervised Action Segmentation

paper

[16] Self-supervised Motion Learning from Static Images

paper

[15] CoLA: Weakly-Supervised Temporal Action Localization with Snippet Contrastive Learning

paper

[14] Recognizing Actions in Videos from Unseen Viewpoints

paper

[13] No frame left behind: Full Video Action Recognition

paper

[12] Learning Salient Boundary Feature for Anchor-free Temporal Action Localization

paper | code

[11] Temporal Context Aggregation Network for Temporal Action Proposal Refinement

paper

[10] The Blessings of Unlabeled Background in Untrimmed Videos

paper

[9] Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

paper | code

[8] Coarse-Fine Networks for Temporal Activity Detection in Videos

paper

[7] Learning Discriminative Prototypes with Dynamic Time Warping

paper

[6] Temporal Action Segmentation from Timestamp Supervision

paper

[5] ACTION-Net: Multipath Excitation for Action Recognition

paper | code

[4] BASAR: Black-box Attack on Skeletal Action Recognition

paper

[3] Understanding the Robustness of Skeleton-based Action Recognition under Adversarial Attack

paper

[2] Temporal Difference Networks for Efficient Action Recognition

paper | code

[1] Behavior-Driven Synthesis of Human Dynamics

paper | code

Re-identification

[8] Neural Feature Search for RGB-Infrared Person Re-Identification

paper

[7] Group-aware Label Transfer for Domain Adaptive Person Re-identification

paper

[6] Lifelong Person Re-Identification via Adaptive Knowledge Accumulation

paper

[5] Anchor-Free Person Search

paper | code

[4] Intra-Inter Camera Similarity for Unsupervised Person Re-Identification

paper

[3] Watching You: Global-guided Reciprocal Learning for Video-based Person Re-identification

paper

[2] Joint Noise-Tolerant Learning and Meta Camera Shift Adaptation for Unsupervised Person Re-Identification

paper

[1] Meta Batch-Instance Normalization for Generalizable Person Re-Identification

paper

Image/Video Captioning

[6] Human-like Controllable Image Captioning with Verb-specific Semantic Roles

paper | code

[5] Co-Grounding Networks with Semantic Attention for Referring Expression Comprehension in Videos

paper | project

[4] Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles

paper

[3] Open-book Video Captioning with Retrieve-Copy-Generate Network

paper

[2] VX2TEXT: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

paper

[1] Scan2Cap: Context-aware Dense Captioning in RGB-D Scans

paper | code | project | video


Medical Imaging

[12] DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation

paper

[11] Confluent Vessel Trees with Accurate Bifurcations

paper

[10] Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ4Net

paper

[9] XProtoNet: Diagnosis in Chest Radiography with Global and Local Explanations

paper

[8] FedDG: Federated Domain Generalization on Medical Image Segmentation via Episodic Learning in Continuous Frequency Space

paper | code

[7] Multiple Instance Captioning: Learning Representations from Histopathology Textbooks and Articles

paper

[6] Discovering Hidden Physics Behind Transport Dynamics

paper

[5] DeepTag: An Unsupervised Deep Learning Method for Motion Tracking on Cardiac Tagging Magnetic Resonance Images

paper

[4] Multi-institutional Collaborations for Improving Deep Learning-based Magnetic Resonance Image Reconstruction Using Federated Learning

paper | code

[3] 3D Graph Anatomy Geometry-Integrated Network for Pancreatic Mass Segmentation, Diagnosis, and Quantitative Patient Management

[2] Deep Lesion Tracker: Monitoring Lesions in 4D Longitudinal Imaging Studies

paper

[1] Automatic Vertebra Localization and Identification in CT by Spine Rectification and Anatomically-constrained Optimization

paper


Text Detection/Recognition

[5] Scene Text Retrieval via Joint Text Detection and Similarity Learning

paper | code

[4] MetaHTR: Towards Writer-Adaptive Handwritten Text Recognition

paper

[3] MOST: A Multi-Oriented Scene Text Detector with Localization Refinement

paper

[2] Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

paper | code

[1] What If We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels

paper | code


Remote Sensing Image

[2] PointFlow: Flowing Semantics Through Points for Aerial Image Segmentation

paper

[1] Deep Gradient Projection Networks for Pan-sharpening (super-resolution)

paper | code


Neural Architecture Search (NAS)

[10] NetAdaptV2: Efficient Neural Architecture Search with Fast Super-Network Training and Architecture Optimization

paper | project

[9] One-Shot Neural Ensemble Architecture Search by Diversity-Guided Search Space Shrinking

paper | code

[8] Dynamic Slimmable Network

paper | code

[7] Prioritized Architecture Sampling with Monto-Carlo Tree Search

paper | code

[6] Searching by Generating: Flexible and Efficient One-Shot NAS with Architecture Generator

paper | code

[5] Contrastive Neural Architecture Search with Neural Architecture Comparators

paper | code

[4] OPANAS: One-Shot Path Aggregation Network Architecture Search for Object Detection

paper | code

[3] AttentiveNAS: Improving Neural Architecture Search via Attentive Sampling

paper

[2] ReNAS: Relativistic Evaluation of Neural Architecture Search (the importance of ranking loss in NAS predictors)

paper

[1] HourNAS: Extremely Fast Neural Architecture Search Through an Hourglass Lens (reducing the cost of NAS)

paper


GAN/Generative/Adversarial

[19] Regularizing Generative Adversarial Networks under Limited Data

paper | project | code

[18] Content-Aware GAN Compression

paper

[17] Lipstick ain't enough: Beyond Color Matching for In-the-Wild Makeup Transfer

paper | code

[16] LiBRe: A Practical Bayesian Approach to Adversarial Detection

paper

[15] DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network

paper

[14] Diverse Semantic Image Synthesis via Probability Distribution Modeling

paper | code

[13] HumanGAN: A Generative Model of Humans Images

paper

[12] MetaSimulator: Simulating Unknown Target Models for Query-Efficient Black-box Attacks

paper | code

[11] Soft-IntroVAE: Analyzing and Improving Introspective Variational Autoencoders

paper | code | project

[10] LOHO: Latent Optimization of Hairstyles via Orthogonalization

paper

[9] PISE: Person Image Synthesis and Editing with Decoupled GAN

paper | code

[8] Closed-Form Factorization of Latent Semantics in GANs

paper | code

[7] PD-GAN: Probabilistic Diverse GAN for Image Inpainting

[6] Anycost GANs for Interactive Image Synthesis and Editing

paper | code

[5] Efficient Conditional GAN Transfer with Knowledge Propagation across Classes

paper | code

[4] Exploiting Spatial Dimensions of Latent in GAN for Real-time Image Editing

[3] Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs

paper

[2] Encoding in Style: a StyleGAN Encoder for Image-to-Image Translation

paper | code | project

[1] A 3D GAN for Improved Large-pose Facial Recognition

paper


Image Generation/Image Synthesis

[15] Variational Transformer Networks for Layout Generation

paper

[14] VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization

paper

[13] A Closer Look at Fourier Spectrum Discrepancies for CNN-generated Images Detection

paper | code

[12] Semi-supervised Synthesis of High-Resolution Editable Textures for 3D Humans

paper

[11] Few-Shot Human Motion Transfer by Personalized Geometry and Texture Modeling

paper | code

[10] Brain Image Synthesis with Unsupervised Multivariate Canonical CSCℓ4Net

paper

[9] Context-Aware Layout to Image Generation with Enhanced Object Appearance

paper

[8] DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network

paper

[7] HumanGAN: A Generative Model of Humans Images

paper

[6] PISE: Person Image Synthesis and Editing with Decoupled GAN

paper | code

[5] SMPLicit: Topology-aware Generative Model for Clothed People

paper | code

[4] Diversifying Sample Generation for Data-Free Quantization

paper

[3] Diverse Semantic Image Synthesis via Probability Distribution Modeling

paper | code

[2] When Age-Invariant Face Recognition Meets Face Age Synthesis: A Multi-Task Learning Framework

paper | code

[1] Anycost GANs for Interactive Image Synthesis and Editing

paper | code

View Synthesis

[4] Layout-Guided Novel View Synthesis from a Single Indoor Panorama

paper | project

[3] NeX: Real-time View Synthesis with Neural Basis Expansion

paper | code

[2] ID-Unet: Iterative Soft and Hard Deformation for View Synthesis

paper

[1] Self-Supervised Visibility Learning for Novel View Synthesis

paper


3D Vision

[2] A Deep Emulator for Secondary Motion of 3D Characters

paper

[1] 3D CNNs with Adaptive Temporal Feature Resolutions

paper

Point Cloud

[21] DeepI2P: Image-to-Point Cloud Registration via Deep Classification

paper | code

[20] FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds

paper

[19] Denoise and Contrast for Category Agnostic Shape Completion

paper

[18] Panoptic-PolarNet: Proposal-free LiDAR Point Cloud Panoptic Segmentation

paper

[17] ReAgent: Point Cloud Registration using Imitation and Reinforcement Learning

paper

[16] Equivariant Point Network for 3D Point Cloud Analysis

paper

[15] PAConv: Position Adaptive Convolution with Dynamic Kernel Assembling on Point Clouds

paper | code

[14] Skeleton Merger: an Unsupervised Aligned Keypoint Detector

paper | code

[13] Cycle4Completion: Unpaired Point Cloud Completion using Cycle Transformation with Missing Region Coding

paper

[12] Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion

paper

[11] How Privacy-Preserving are Line Clouds? Recovering Scene Details from 3D Lines

paper | code

[10] PointDSC: Robust Point Cloud Registration using Deep Spatial Consistency

paper | code

[9] Robust Point Cloud Registration Framework Based on Deep Graph Matching

paper | code

[8] TPCN: Temporal Point Cloud Networks for Motion Forecasting

paper | code

[7] PointGuard: Provably Robust 3D Point Cloud Classification

paper

[6] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

paper | code

[5] SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration

paper | code

[4] MultiBodySync: Multi-Body Segmentation and Motion Estimation via 3D Scan Synchronization

paper | code

[3] Diffusion Probabilistic Models for 3D Point Cloud Generation

paper | code

[2] Style-based Point Generator with Adversarial Rendering for Point Cloud Completion

paper

[1] PREDATOR: Registration of 3D Point Clouds with Low Overlap

paper | code | project

3D Reconstruction

[12] Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction

paper

[11] Reconstructing 3D Human Pose by Watching Humans in the Mirror

paper | project

[10] Fostering Generalization in Single-view 3D Reconstruction by Learning a Hierarchy of Local and Global Shape Priors

paper

[9] NeuralRecon: Real-Time Coherent 3D Reconstruction from Monocular Video

paper | project

[8] Learning Parallel Dense Correspondence from Spatio-Temporal Descriptors for Efficient and Robust 4D Reconstruction

paper | code

[7] POSEFusion: Pose-guided Selective Fusion for Single-view Human Volumetric Capture

paper | project

[6] Deep Implicit Moving Least-Squares Functions for 3D Reconstruction

paper | code

[5] Model-based 3D Hand Reconstruction via Self-Supervised Learning

paper

[4] 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction

paper | project

[3] Learning Compositional Representation for 4D Captures with Neural ODE

paper

[2] SMPLicit: Topology-aware Generative Model for Clothed People

paper | code

[1] PCLs: Geometry-aware Neural Reconstruction of 3D Pose with Perspective Crop Layers

paper


Model Compression

[3] Content-Aware GAN Compression

paper

[2] Dynamic Slimmable Network

paper | code

[1] Learning Student Networks in the Wild (a model compression and acceleration technique that needs no original training data)

paper | code

Explainer: Huawei Noah's Ark Lab proposes a data-free network compression technique

Knowledge Distillation

[10] 3D-to-2D Distillation for Indoor Scene Parsing

paper

[9] Complementary Relation Contrastive Distillation

paper

[8] Distilling Object Detectors via Decoupled Features

paper | code

[7] Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation

paper | code

[6] Knowledge Evolution in Neural Networks

paper | code

[5] Semantic-aware Knowledge Distillation for Few-Shot Class-Incremental Learning

paper

[4] Teachers Do More Than Teach: Compressing Image-to-Image Models (https://arxiv.org/abs/2103.03467)

paper | code

[3] General Instance Distillation for Object Detection

paper

[2] Multiresolution Knowledge Distillation for Anomaly Detection

paper

[1] Distilling Object Detectors via Decoupled Features (distillation that separates foreground from background)

Pruning

[3] Convolutional Neural Network Pruning with Structural Redundancy Reduction

paper

[2] Neural Response Interpretation through the Lens of Critical Pathways

paper | code1 | code2

[1] Manifold Regularized Dynamic Network Pruning

paper

Quantization

[3] Network Quantization with Element-wise Gradient Scaling

paper

[2] Zero-shot Adversarial Quantization

paper | code

[1] Learnable Companding Quantization for Accurate Low-bit Neural Networks

paper


Neural Network Structure Design

[11] Convolutional Hough Matching Networks

paper

[10] Capsule Network is Not More Robust than Convolutional Network

paper

[9] Diverse Branch Block: Building a Convolution as an Inception-like Unit

paper | code

[8] Scaling Local Self-Attention For Parameter Efficient Visual Backbones

paper

[7] Fast and Accurate Model Scaling

paper

[6] Involution: Inverting the Inherence of Convolution for Visual Recognition

paper | code

[5] Inception Convolution with Efficient Dilation Search

paper | code | explainer: Inception convolution

[4] Coordinate Attention for Efficient Mobile Network Design

paper

[3] Rethinking Channel Dimensions for Efficient Model Design

paper | code

[2] Inverting the Inherence of Convolution for Visual Recognition

[1] RepVGG: Making VGG-style ConvNets Great Again

paper | code

Explainer: RepVGG: minimalist architecture, SOTA performance, making VGG-style models great again

Transformer

[3] Transformer Interpretability Beyond Attention Visualization

paper | code

[2] UP-DETR: Unsupervised Pre-training for Object Detection with Transformers

paper | code

Explainer: an unsupervised pre-trained detector

[1] Pre-Trained Image Processing Transformer (a pre-trained model for low-level vision)

paper | explainer: Transformers take another stronghold! Top spots on several low-level tasks claimed as Peking University, Huawei and others jointly propose the pre-trained model IPT

Graph Neural Networks (GNN)

[2] Quantifying Explainers of Graph Neural Networks in Computational Pathology

paper

[1] Sequential Graph Convolutional Network for Active Learning

paper


Data Processing

Data Augmentation

[2] AutoDO: Robust AutoAugment for Biased Data with Label Noise via Scalable Probabilistic Implicit Differentiation

paper

[1] KeepAugment: A Simple Information-Preserving Data Augmentation

paper

Representation Learning

[10] Seeing Out of tHe bOx: End-to-End Pre-training for Vision-Language Representation Learning

paper

[9] Self-supervised Video Representation Learning by Context and Motion Decoupling

paper

[8] Jigsaw Clustering for Unsupervised Visual Representation Learning

paper | code

[7] Learning by Aligning Videos in Time (video representation)

paper

[6] Vectorization and Rasterization: Self-Supervised Learning for Sketch and Handwriting

paper | code

[5] Neural Parts: Learning Expressive 3D Shape Abstractions with Invertible Neural Networks

paper

[4] VideoMoCo: Contrastive Video Representation Learning with Temporally Adversarial Examples

paper

[3] Spatially Consistent Representation Learning

paper

[2] Removing the Background by Adding the Background: Towards Background Robust Self-supervised Video Representation Learning

paper | code | project | explainer

[1] VirTex: Learning Visual Representations from Textual Annotations

paper | code

Normalization/Regularization (Batch Normalization)

[3] Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

paper | code

[2] Meta Batch-Instance Normalization for Generalizable Person Re-Identification

paper

[1] Representative Batch Normalization with Feature Calibration

Image Clustering

[4] Structure-Aware Face Clustering on a Large-Scale Graph with 10^7 Nodes

paper | code&project

[3] COMPLETER: Incomplete Multi-view Clustering via Contrastive Prediction

paper | code

[2] Improving Unsupervised Image Clustering With Robust Learning

paper | code

[1] Reconsidering Representation Alignment for Multi-view Clustering

paper | code

Image Compression

[4] Learning Scalable ℓ∞-constrained Near-lossless Image Compression via Joint Lossy Image and Residual Compression

paper | code

[3] Checkerboard Context Model for Efficient Learned Image Compression

paper

[2] Slimmable Compressive Autoencoders for Practical Neural Image Compression

paper

[1] Attention-guided Image Compression by Deep Reconstruction of Compressive Sensed Saliency Skeleton

paper

Anomaly Detection

[2] MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection

paper

[1] Learning Placeholders for Open-Set Recognition

paper


Model Training/Generalization

[6] Differentiable Patch Selection for Image Recognition

paper | code

[5] Towards Evaluating and Training Verifiably Robust Neural Networks

paper | code

[4] Student-Teacher Learning from Clean Inputs to Noisy Inputs

paper

[3] Uncertainty-guided Model Generalization to Unseen Domains

paper

[2] Knowledge Evolution in Neural Networks

paper | code

[1] PGT: A Progressive Method for Training Models on Long Videos

paper | code

Noisy Label

[1] Partially View-aligned Representation Learning with Noise-robust Contrastive Loss

paper | code

Long-Tailed Distribution

[7] Adversarial Robustness under Long-Tailed Distribution

paper | code

[6] Adaptive Class Suppression Loss for Long-Tail Object Detection

paper | code

[5] Improving Calibration for Long-Tailed Recognition

paper | code

[4] Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification

paper

[3] PML: Progressive Margin Loss for Long-tailed Age Classification

paper

[2] MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition

paper

[1] Distribution Alignment: A Unified Framework for Long-tail Visual Recognition

paper | code



Model Evaluation

[1] Are Labels Necessary for Classifier Accuracy Evaluation? (can an unlabeled test set be used to evaluate a model?)

paper | explainer



Audio-visual Learning

[5] Looking into Your Speech: Learning Cross-modal Affinity for Audio-visual Speech Separation

paper | project

[4] Localizing Visual Sounds the Hard Way

paper

[3] Can audio-visual integration strengthen robustness under multimodal attacks?

paper

[2] Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation

paper

[1] Positive Sample Propagation along the Audio-Visual Event Line

paper | code



Vision-based Prediction

[5] SGCN: Sparse Graph Convolution Network for Pedestrian Trajectory Prediction

paper

[4] LaPred: Lane-Aware Prediction of Multi-Modal Future Trajectories of Dynamic Agents

paper

[3] Multimodal Motion Prediction with Stacked Transformers

paper | code

[2] Video Prediction Recalling Long-term Motion Context via Memory Alignment Learning

paper

[1] MotionRNN: A Flexible Model for Video Prediction with Spacetime-Varying Motions (a general video prediction model for complex spatiotemporal motion)

paper | explainer


Dataset

[11] The Multi-Agent Behavior Dataset: Mouse Dyadic Social Interactions

paper | dataset

[10] Deep Animation Video Interpolation in the Wild

paper | code&dataset

[9] Towards Rolling Shutter Correction and Deblurring in Dynamic Scenes

paper | dataset&code

[8] UAV-Human: A Large Benchmark for Human Behavior Understanding with Unmanned Aerial Vehicles

paper

[7] Visual Semantic Role Labeling for Video Understanding

paper | code&dataset

[6] Face Forensics in the Wild (a face forgery dataset)

paper | code&dataset

[5] Benchmarking Representation Learning for Natural World Image Collections (natural image classification)

paper

[4] Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark

paper | project&dataset

[3] 3DCaricShop: A Dataset and A Baseline Method for Single-view 3D Caricature Face Reconstruction

paper | project

[2] Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

paper | code

[1] Re-labeling ImageNet: from Single to Multi-Labels, from Global to Localized Labels

paper | code


Active Learning

[3] Vab-AL: Incorporating Class Imbalance and Difficulty with Variational Bayes for Active Learning

paper | code

[2] Multiple Instance Active Learning for Object Detection

paper | code

[1] Sequential Graph Convolutional Network for Active Learning

paper


Few-shot Learning/Zero-shot Learning

[9] Self-Guided and Cross-Guided Learning for Few-Shot Segmentation

paper

[8] Contrastive Embedding for Generalized Zero-Shot Learning

paper | code

[7] Learning Dynamic Alignment via Meta-filter for Few-shot Learning

paper

[6] Goal-Oriented Gaze Estimation for Zero-Shot Learning

paper | code

[5] Few-Shot Segmentation Without Meta-Learning: A Good Transductive Inference Is All You Need?

paper | code

[4] Counterfactual Zero-Shot and Open-Set Visual Recognition

paper | code

[3] Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

paper

[2] Few-shot Open-set Recognition by Transformation Consistency

[1] Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning

paper


Continual Learning/Life-long Learning

[5] Rectification-based Knowledge Retention for Continual Learning

paper

[4] Rainbow Memory: Continual Learning with a Memory of Diverse Samples

paper | code

[3] Efficient Feature Transformations for Discriminative and Generative Continual Learning

paper

[2] Rainbow Memory: Continual Learning with a Memory of Diverse Samples

[1] Learning the Superpixel in a Non-iterative and Lifelong Manner

paper


Scene Graph

Scene Graph Generation

[4] Bipartite Graph Network with Adaptive Message Passing for Unbiased Scene Graph Generation

paper

[3] Fully Convolutional Scene Graph Generation

paper

[2] Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation

paper

[1] Exploiting Edge-Oriented Reasoning for 3D Point-based Scene Graph Analysis

paper

Scene Graph Prediction

[1] SceneGraphFusion: Incremental 3D Scene Graph Prediction from RGB-D Sequences

paper

Scene Graph Understanding

[4] Semantic Scene Completion via Integrating Instances and Scene in-the-Loop

paper | code

[3] 3D-to-2D Distillation for Indoor Scene Parsing

paper

[2] Bidirectional Projection Network for Cross Dimension Scene Understanding

paper | code

[1] Monte Carlo Scene Search for 3D Scene Understanding

paper



Visual Localization

[1] LoFTR: Detector-Free Local Feature Matching with Transformers (image feature matching)

paper | project



Visual Reasoning/VQA

[7] PQA: Perceptual Question Answering

paper

[6] Domain-robust VQA with diverse datasets and methods but no target labels

paper

[5] AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning

paper

[4] Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution

paper | project | supplementary

[3] ACRE: Abstract Causal REasoning Beyond Covariation

paper | project | supplementary

[2] TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning over Traffic Events

paper | project

[1] Transformation Driven Visual Reasoning

paper | code | project



Image Classification

[5] Benchmarking Representation Learning for Natural World Image Collections

paper

[4] Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark

paper | project&dataset

[3] Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification

paper

[2] PML: Progressive Margin Loss for Long-tailed Age Classification

paper

[1] A Realistic Evaluation of Semi-Supervised Learning for Fine-Grained Classification

paper



Transfer Learning/Domain Adaptation

[19] Instance Level Affinity-Based Transfer for Unsupervised Domain Adaptation

paper | code

[18] Unsupervised Multi-source Domain Adaptation Without Access to Source Data

paper

[17] Curriculum Graph Co-Teaching for Multi-Target Domain Adaptation

paper

[16] Divergence Optimization for Noisy Universal Domain Adaptation

paper

[15] Prototypical Cross-domain Self-supervised Learning for Few-shot Unsupervised Domain Adaptation

paper | project

[14] Progressive Domain Expansion Network for Single Domain Generalization

paper

[13] Dynamic Domain Adaptation for Efficient Inference

paper

[12] Adaptive Methods for Real-World Domain Generalization

paper

[11] OTCE: A Transferability Metric for Cross-Domain Cross-Task Representations

paper

[10] DRANet: Disentangling Representation and Adaptation Networks for Unsupervised Cross-Domain Adaptation

paper

[9] MetaAlign: Coordinating Domain Alignment and Classification for Unsupervised Domain Adaptation

paper

[8] Transferable Semantic Augmentation for Domain Adaptation

paper | code

[7] Dynamic Transfer for Multi-Source Domain Adaptation

paper

[6] Semi-supervised Domain Adaptation based on Dual-level Domain Mixing for Semantic Segmentation

paper

[5] Multi-Source Domain Adaptation with Collaborative Learning for Semantic Segmentation

paper

[4] Continual Adaptation of Visual Representations via Domain Randomization and Meta-learning

paper

[3] Domain Generalization via Inference-time Label-Preserving Target Projections

paper

[2] MetaSCI: Scalable and Adaptive Reconstruction for Video Compressive Sensing

paper | code

[1] FSDR: Frequency Space Domain Randomization for Domain Generalization

paper



Metric Learning

[3] Noise-resistant Deep Metric Learning with Ranking-based Instance Selection

paper

[2] Embedding Transfer with Label Relaxation for Improved Metric Learning

paper

[1] Dynamic Metric Learning: Towards a Scalable Metric Space to Accommodate Multiple Semantic Scales

paper | code



Contrastive Learning

[3] Contrastive Learning based Hybrid Networks for Long-Tailed Image Classification

paper

[2] AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations from Self-Trained Negative Adversaries

paper | code | explainer: AdCo, adversarial contrastive learning

[1] Fine-grained Angular Contrastive Learning with Coarse Labels

paper




Reinforcement Learning

[2] Unsupervised Visual Attention and Invariance for Reinforcement Learning

paper

[1] Unsupervised Learning for Robust Fitting: A Reinforcement Learning Approach

paper




Meta Learning

[2] Meta-Mining Discriminative Samples for Kinship Verification

paper

[1] MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition

paper



Uncategorized

Cross-Modal Center Loss for 3D Cross-Modal Retrieval (multimodal retrieval)

paper | code

SOLD2: Self-supervised Occlusion-aware Line Description and Detection (image matching)

paper | code

Progressive Temporal Feature Alignment Network for Video Inpainting

paper | code

SMD-Nets: Stereo Mixture Density Networks (stereo matching)

paper | project

De-rendering the World's Revolutionary Artefacts

paper | project

Learning Triadic Belief Dynamics in Nonverbal Communication from Videos (video summarization)

paper

Beyond Short Clips: End-to-End Video-Level Learning with Collaborative Memories

paper

Passive Inter-Photon Imaging

paper

PhySG: Inverse Rendering with Spherical Gaussians for Physics-based Material Editing and Relighting

paper | project

Learning Camera Localization via Dense Scene Matching

paper | code

Few-Shot Incremental Learning with Continually Evolved Classifiers

paper

DER: Dynamically Expandable Representation for Class Incremental Learning

paper

SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification

paper | code

Online Learning of a Probabilistic and Adaptive Scene Representation

paper

Embracing Uncertainty: Decoupling and De-bias for Robust Temporal Grounding

paper

Model-Contrastive Federated Learning

paper

Repopulating Street Scenes

paper

Visual Room Rearrangement

paper

Tuning IR-cut Filter for Illumination-aware Spectral Reconstruction from RGB

paper

Video Rescaling Networks with Joint Optimization Strategies for Downscaling and Upscaling

paper

Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction

paper | project

Picasso: A CUDA-based Library for Deep Learning over 3D Meshes (mesh simplification)

paper | library

Cloud2Curve: Generation and Vectorization of Parametric Sketches

paper

Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression

paper

SSLayout360: Semi-Supervised Indoor Layout Estimation from 360° Panorama

paper

Convex Online Video Frame Subset Selection using Multiple Criteria for Data Efficient Autonomous Driving

paper

Scene-Intuitive Agent for Remote Embodied Visual Grounding

paper

Relation-aware Instance Refinement for Weakly Supervised Visual Grounding

paper | code

Context-aware Biaffine Localizing Network for Temporal Sentence Grounding

paper

Dynamic Face Video Segmentation via Reinforcement Learning

paper | code

Back to the Feature: Learning Robust Camera Localization from Pixels to Pose

paper | code

Rotation Coordinate Descent for Fast Globally Optimal Rotation Averaging (optimization)

paper

Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality

paper

Deep Graph Matching under Quadratic Constraint

paper

Deep Gaussian Scale Mixture Prior for Spectral Compressive Imaging

paper | code

Limitations of Post-Hoc Feature Alignment for Robustness

paper

Consensus Maximisation Using Influences of Monotone Boolean Functions

paper

Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food

paper

Structured Scene Memory for Vision-Language Navigation

paper | code

Learning Asynchronous and Sparse Human-Object Interaction in Videos

paper

Self-supervised Geometric Perception

paper

Quantifying Explainers of Graph Neural Networks in Computational Pathology

paper

Exploring Data-Efficient 3D Scene Understanding with Contrastive Scene Contexts

paper | project | video

Data-Free Model Extraction

paper

Patch-NetVLAD: Multi-Scale Fusion of Locally-Global Descriptors for Place Recognition

paper | code

Right for the Right Concept: Revising Neuro-Symbolic Concepts by Interacting with their Explanations

paper

Hierarchical and Partially Observable Goal-driven Policy Learning with Goals Relational Graph

paper

Domain Generalization via Inference-time Label-Preserving Target Projections

paper

DeRF: Decomposed Radiance Fields

project

Multi-Objective Interpolation Training for Robustness to Label Noise

paper | code

CDFI: Compression-Driven Network Design for Frame Interpolation

paper | code

FLAVR: Flow-Agnostic Video Representations for Fast Frame Interpolation

paper | code | project

Deep Animation Video Interpolation in the Wild

paper | code&dataset

Probabilistic Embeddings for Cross-Modal Retrieval

paper

Self-supervised Simultaneous Multi-Step Prediction of Road Dynamics and Cost Map

IIRC: Incremental Implicitly-Refined Classification

paper | project

Fair Attribute Classification through Latent Space De-biasing

paper | code | project

Information-Theoretic Segmentation by Inpainting Error Maximization

paper

Kaleido-BERT: Vision-Language Pre-training on Fashion Domain

paper | code

UC2: Universal Cross-lingual Cross-modal Vision-and-Language Pretraining (video-language learning)

Less is More: CLIPBERT for Video-and-Language Learning via Sparse Sampling

paper | code

D-NeRF: Neural Radiance Fields for Dynamic Scenes

paper | project

Weakly Supervised Learning of Rigid 3D Scene Flow

paper | code | project

WeChat official account: 极市平台 (ID: extrememart)
The latest computer vision content, delivered daily