I am a Ph.D. student in Computer Science at Yonsei University.
My research focuses on investigating the behavior and inner workings of large-scale multimodal transformers, with an emphasis on interpretability-driven model improvement and alignment in systems such as Large Vision-Language Models (LVLMs) and Diffusion Transformers (DiTs).
I am also interested in research that analyzes and develops novel user experiences built on the latest multimodal transformer science and engineering.
NeurIPS 2025
NeurIPS 2025 W
Technical Report
CVPR 2025
ICLR 2025
Technical Report