Research
Research Interests
Improving the efficiency and performance of large-scale Vision-Language Models (VLMs)
AI security, including agentic defensive AI for autonomous security systems and deepfake detection
Large-scale data filtering and curation for efficient pre-training
Social AI, focusing on human-centric and socially aware intelligence
Reinforcement Learning for adaptive and generalizable decision-making systems
Autonomous driving-related perception and decision-making tasks
Generative modeling with diffusion models
Improving the efficiency and performance of large-scale Vision-Language Models (VLMs)
- [P10] Isotropic Embedding Perturbations for Robust Vision Language Encoders
- [P8] ECC: Encoder-Centric Corruption for Fine-Grained Vision in VLMs
- [P6] Enhancing Alignment for Unified Multimodal Models via Semantically-Grounded Supervision
- [P3] iConFormer: Dynamic Parameter-Efficient Tuning with Input-Conditioned Adaptation
AI security, including agentic defensive AI for autonomous security systems and deepfake detection
- [P5] Stabilizing Robustness Transfer in Adversarial Distillation with Controlled Teacher Adaptation
- [P10] Isotropic Embedding Perturbations for Robust Vision Language Encoders
Large-scale data filtering and curation for efficient pre-training
- [P9] CORE: Corruption-Reconstruction based Data Filtering Network
- [C7] Salience-Based Adaptive Masking: Revisiting Token Dynamics for Enhanced Pre-training
- [C6] Emerging Property of Masked Token for Effective Pre-training
- [P2] SG-MIM: Structured Knowledge Guided Efficient Pre-training for Dense Prediction
Social AI, focusing on human-centric and socially aware intelligence
Reinforcement Learning for adaptive and generalizable decision-making systems
- [C8] A Simple Framework for Generalization in Visual RL under Dynamic Scene Perturbations
- [C5] Environment Agnostic Representation for Visual Reinforcement Learning
- [C4] Local-Guided Global: Paired Similarity Representation for Visual Reinforcement Learning
- [J1] Learning Disentangled Skills for Hierarchical Reinforcement Learning through Trajectory Autoencoder with Weak Labels
Autonomous driving-related perception and decision-making tasks
- [C10] RobIA: Robust Instance-aware Continual Test-time Adaptation for Deep Stereo
- [C9] TADFormer: Task-Adaptive Dynamic TransFormer for Efficient Multi-Task Learning
- [C2] Sequential Cross Attention Based Multi-Task Learning
- [C1] Adaptive Confidence Thresholding for Monocular Depth Estimation
- [J5] UniTT-Stereo: Unified Training of Transformer for Enhanced Stereo Matching
- [J4] Global Structural Knowledge Distillation for Semantic Segmentation
- [J3] MaDis-Stereo: Enhanced Stereo Matching via Distilled Masked Image Modeling
Generative modeling with diffusion models
- [P7] Adaptive Noise Injection, Bootstrapping Denoising Diffusion for Generalized Recognition
Research Directions
AI Principles – Study the fundamental structure and learning mechanisms of modern AI models
Scalability & Efficiency – Design architectures and training strategies for scalable and efficient AI
Generalization – Develop AI systems that adapt across domains, tasks, and data modalities
Foundation Model – Research large-scale models from a core architectural and training perspective
Academic Sustainability – Make large models trainable with the limited compute and data available in academia relative to industry