Thoughts on AI, Research, and Life
BrainExplore: Large-Scale Discovery of Interpretable Visual Representations in the Human Brain
An automated framework that discovers thousands of interpretable visual concepts encoded across the human visual cortex, from "hands holding objects" to "forest scenes", revealing fine-grained representations that had not previously been reported.
The Evolution of Multimodal Model Architectures: A Taxonomy
A comprehensive walkthrough of Wadekar et al.'s taxonomy of multimodal architectures. Discover the four distinct architectural patterns—Type-A through Type-D—that define how models like GPT-4V, LLaVA, and Gemini combine vision, language, and other modalities.
The Linear Representation Hypothesis and the Geometry of Large Language Models
A walkthrough of Park et al.'s ICML 2024 paper on the linear representation hypothesis. We explore how high-level concepts are represented as linear directions in language model representation spaces, with formal definitions and theoretical foundations.