
Seil Kang

PhD Student
Yonsei University
seil [at] yonsei [dot] ac [dot] kr


Short Bio

I am a Ph.D. student in Computer Science at Yonsei University.

My research focuses on investigating the behavior and inner workings of large-scale multimodal transformers, with an emphasis on interpretability-driven model improvement and alignment in systems such as Large Vision-Language Models (LVLMs) and Diffusion Transformers (DiTs).

I am also interested in analyzing and designing novel user experiences built on recent advances in multimodal transformer science and engineering.


News

Publications [Google Scholar]

  1. Interpretable Motion-Attentive Maps: Spatio-Temporally Localizing Concepts in Video Diffusion Transformers CVPR 2026

  2. ViKey: Enhancing Temporal Understanding in Videos via Visual Prompting CVPR 2026
    Yeonkyung Lee, Dayun Ju, Youngmin Kim, Seil Kang, Seong Jae Hwang
    CVPR 2026

  3. Rare Text Semantics Were Always There in Your Diffusion Transformer NeurIPS 2025
    Seil Kang*, Woojung Han*, Dayun Ju, Seong Jae Hwang
    NeurIPS 2025

  4. Interpreting Attention Heads for Image-to-Text Information Flow in Large Vision-Language Models NeurIPS 2025 W
    Jinyeong Kim, Seil Kang, Jiwoo Park, Junhyeok Kim, Seong Jae Hwang
    NeurIPS 2025 Mechanistic Interpretability Workshop (Spotlight, <13%)

  5. Neuron-Level Approach for Multi-Hop Reasoning in Large Vision-Language Models Technical Report
    Seil Kang, Jinyeong Kim, Seong Jae Hwang
    Technical Report

  6. Your Large Vision Language Model Only Needs A Few Attention Heads for Visual Grounding CVPR 2025
    Seil Kang, Jinyeong Kim, Junhyeok Kim, Seong Jae Hwang
    CVPR 2025 (Highlight, <3%)

  7. See What You Are Told: Visual Attention Sink in Large Multimodal Models ICLR 2025
    Seil Kang*, Jinyeong Kim*, Junhyeok Kim, Seong Jae Hwang
    ICLR 2025

  8. FALCON: Frequency Adjoint Link with CONtinuous Density Mask for Fast Single Image Dehazing CVPRW 2025

  9. WoLF: Wide-scope Large Language Model Framework for CXR Understanding Technical Report
    Seil Kang, Donghyun Kim, Junhyeok Kim, Hyo Kyoung Lee, Seong Jae Hwang
    Technical Report