Foundation Models Defining a New Era in Vision: A Survey and Outlook
A Unified Framework for Event-Based Frame Interpolation With Ad-Hoc Deblurring in the Wild
Self-Supervised High-Order Information Bottleneck Learning of Spiking Neural Network for Robust Event-Based Optical Flow Estimation
Glissando-Net: Deep Single View Category Level Pose Estimation and 3D Reconstruction
Multi-Objective Convex Quantization for Efficient Model Compression
Efficient Signed Graph Sampling via Balancing & Gershgorin Disc Perfect Alignment
Infrared and Visible Image Fusion: From Data Compatibility to Task Adaption
Cross-Modal 3D Shape Retrieval via Heterogeneous Dynamic Graph Representation
Hyper-YOLO: When Visual Object Detection Meets Hypergraph Computation
Filter Pruning by High-Order Spectral Clustering
Demystify Transformers & Convolutions in Modern Image Deep Networks
STDatav2: Accessing Efficient Black-Box Stealing for Adversarial Attacks
Federated Multi-View K-Means Clustering
Practically Unbiased Pairwise Loss for Recommendation With Implicit Feedback
JM3D & JM3D-LLM: Elevating 3D Representation With Joint Multi-Modal Cues
Quantum Gated Recurrent Neural Networks
RankFeat&RankWeight: Rank-1 Feature/Weight Removal for Out-of-Distribution Detection
Explicit View-Labels Matter: A Multifacet Complementarity Study of Multi-View Clustering
Towards High-Quality and Disentangled Face Editing in a 3D GAN
Interpretable Optimization-Inspired Unfolding Network for Low-Light Image Enhancement
Training Networks in Null Space of Feature Covariance With Self-Supervision for Incremental Learning
Enhancing Object Detection With Fourier Series
Hybrid-Prediction Integrated Planning for Autonomous Driving
DHVT: Dynamic Hybrid Vision Transformer for Small Dataset Recognition
DeepSN-Net: Deep Semi-Smooth Newton Driven Network for Blind Image Restoration
TDGI: Translation-Guided Double-Graph Inference for Document-Level Relation Extraction
Anchors Crash Tensor: Efficient and Scalable Tensorial Multi-View Subspace Clustering
Towards Accurate Post-Training Quantization of Vision Transformers via Error Reduction
Robust Asymmetric Heterogeneous Federated Learning With Corrupted Clients
Heterogeneous Feature Re-Sampling for Balanced Pedestrian Attribute Recognition
Instruction-Guided Scene Text Recognition
Auto-Pairing Positives Through Implicit Relation Circulation for Discriminative Self-Learning
HybrIK-X: Hybrid Analytical-Neural Inverse Kinematics for Whole-Body Mesh Recovery
Non-Uniform Exposure Imaging via Neuromorphic Shutter Control
Generalized Task-Driven Medical Image Quality Enhancement With Gradient Promotion
Image Quality Assessment: Exploring Joint Degradation Effect of Deep Network Features via Kernel Representation Similarity Analysis
Latent Weight Quantization for Integerized Training of Deep Neural Networks
Conditional Diffusion Models for Camouflaged and Salient Object Detection
Distributionally Location-Aware Transferable Adversarial Patches for Facial Images
One-for-All: Towards Universal Domain Translation With a Single StyleGAN
Clarify Confused Nodes via Separated Learning
Learning the Optimal Discriminant SVM With Feature Extraction
Video DataFlywheel: Resolving the Impossible Data Trinity in Video-Language Understanding
Few-Shot Class-Incremental Learning for Classification and Object Detection: A Survey
Torsion Graph Neural Networks
VimTS: A Unified Video and Image Text Spotter for Enhancing the Cross-Domain Generalization
Scaling Spike-Driven Transformer With Efficient Spike Firing Approximation Training
On Testing and Learning Quantum Junta Channels
Towards Robust Point Cloud Recognition With Sample-Adaptive Auto-Augmentation
DiffTF++: 3D-Aware Diffusion Transformer for Large-Vocabulary 3D Generation