DXA-Net: Dual-Task Cross-Lingual Alignment Network for Zero-Shot Cross-Lingual Spoken Language Understanding
I&S-ViT: An Inclusive & Stable Method for Post-Training ViTs Quantization
Single Voter Spreading for Efficient Correspondence Grouping and 3D Registration
Step-Wise Distribution-Aligned Style Prompt Tuning for Source-Free Cross-Domain Few-Shot Learning
Toward Size-Invariant Salient Object Detection: A Generic Evaluation
Sample-Level Prototypical Federated Learning
Zero-Shot Sparse Mixture of Low-Rank Experts Construction From Pre-Trained Foundation Models
Adaptive Batch Size Time Evolving Stochastic Gradient Descent for Federated Learning
Lagrangian Motion Fields for Long-Term Motion Generation
Semantic Concentration for Self-Supervised Dense Representations Learning
BRACTIVE: A Brain Activation Approach to Human Visual Brain Learning
StyleShot: A Snapshot on Any Style
Revisiting Deformable Convolution on Graphs: Large-Range Modeling and Robustness
TransFace++: Rethinking the Face Recognition Paradigm With a Focus on Accuracy, Efficiency, and Security
PRVR: Partially Relevant Video Retrieval
An Enhanced Adaptive Confidence Margin for Semi-Supervised Facial Expression Recognition
LargeAD: Large-Scale Cross-Sensor Data Pretraining for Autonomous Driving
Zero-Shot Learning for Limited Photon Budget Denoising in Structured Illumination Microscopy
Learning Knowledge-Based Prompts for Robust 3D Mask Presentation Attack Detection
IPF-RDA: An Information-Preserving Framework for Robust Data Augmentation
Learning Efficient Meshflow and Optical Flow From Event Cameras
CAIT: Triple-Win Compression Toward High Accuracy, Fast Inference, and Favorable Transferability for ViTs
M3C: Resist Agnostic Attacks by Mitigating Consistent Class Confusion Prior
$\beta$-DARTS++: Bi-Level Regularization for Proxy-Robust Differentiable Architecture Search
Schedule-Robust Continual Learning
To Fold or Not to Fold: Graph Regularized Tensor Train for Visual Data Completion
CDTFusion: Crossing Domain and Task for Infrared and Visible Image Fusion
Toward Free-Form Local Feature Matching
Data-Dependent Rectangular Bounding Processes
Clustering Diffusion Model With Frequency-Signal Modulation for Variational Graph Autoencoders
A Survey on Video Temporal Grounding With Multimodal Large Language Model
Model-Free Test Time Adaptation for Out-of-Distribution Detection
UniSOT: A Unified Framework for Multi-Modality Single Object Tracking
Specific Emitter Identification by Edge Pattern Detection and Incremental Open-World Learning
EvLight++: Low-Light Video Enhancement With an Event Camera: A Large-Scale Real-World Dataset, Novel Method, and More
Multi-Matrix Completion: A Novel Framework for Structurally Missing Elements
Unified Cross-Modal Medical Image Synthesis With Hierarchical Mixture of Product-of-Experts
Learning Roles With Emergent Social Value Orientations
Wonder3D++: Cross-Domain Diffusion for High-Fidelity 3D Generation From a Single Image
PiercingEye: Dual-Space Video Violence Detection With Hyperbolic Vision-Language Guidance
Deeper Insights Into Deep Graph Convolutional Networks: Stability and Generalization
ID-Guard: A Universal Framework for Combating Facial Manipulation via Breaking Identification
DTL: Parameter- and Memory-Efficient Disentangled Vision Learning
Layer-Adaptive-Augmentation-Based Graph Contrastive Learning With Feature Decorrelation
Explicit Visual Prompting for Universal Foreground Segmentations
SMPLest-X: Ultimate Scaling for Expressive Human Pose and Shape Estimation
Neural Prediction Errors as a Unified Cue for Abstract Visual Reasoning
Release the Potential of Memory Buffer in Continual Learning: A Dynamic System Perspective
Fixing Background Misclassification in Few-Shot Object Detection via Product of Experts
Orthogonal Decoupling Contrastive Regularization: Toward Uncorrelated Feature Decoupling for Unpaired Image Restoration