Divya Kothandaraman
I am a senior researcher at Dolby Laboratories. I earned my PhD in Computer Science from the University of
Maryland, College Park. My dissertation, "Learning from Less Data: Perception and Synthesis," explores
data-efficient AI. Prior to this, I was an undergraduate at the Indian Institute of Technology Madras, where
I obtained a bachelor's degree in Electrical Engineering and a master's degree in Data Sciences.
My research interests lie at the intersection of generative AI, computer vision, and multi-modal learning.
My recent work ranges from novel methods for controllable image and video generation, such as personalization,
novel-view synthesis, and prompt mixing, to deep-learning-based solutions for computer vision tasks such as
domain adaptation and video action recognition.
Email / CV / Google Scholar / Twitter / GitHub
Latest News
- (Jan 2025) Joined Dolby Laboratories as a Senior Researcher!
- (Nov 2024) ImPoster has been accepted to COLING 2025!
- (Nov 2024) Defended my PhD!
- (Sep 2024) Gave a talk on novel view synthesis at the ECCV 2024 Wild3D Workshop!
- (May 2024) New paper on prompt mixing using the Black-Scholes model is on arXiv.
- (Mar 2024) Gave a talk at UCL! Slides here.
- (Nov 2023) HawkI is on arXiv!
- (Sep 2023) Aerial Diffusion has been accepted to SIGGRAPH Asia 2023.
- (May 2023) Interning at Google DeepMind.
- (Jan 2023) Differentiable FAR has been accepted to ICRA 2023.
- (Oct 2022) Two papers have been accepted to WACV 2023.
- (Jul 2022) FAR: Fourier Aerial Video Recognition has been accepted to ECCV 2022.
Prompt Mixing in Diffusion Models using the Black Scholes Algorithm
Divya Kothandaraman, Ming Lin, Dinesh Manocha
arXiv
arXiv / GitHub
An approach for prompt mixing using novel perspectives from the Black-Scholes model in economics and finance.
HawkI: Homography and Mutual Information Guidance for 3D-free Single Image to Aerial View
Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha
arXiv
arXiv / GitHub
Mutual information and inverse perspective mapping guidance for text-controlled aerial view synthesis from a single input image using diffusion models.
ImPoster: Text and Frequency Guidance for Subject Driven Action Personalization using Diffusion Models
Divya Kothandaraman, Kuldeep Kulkarni, Sumit Shekhar, Balaji Vasan Srinivasan, Dinesh Manocha
COLING 2025
arXiv / GitHub
An approach for subject and action personalization using prompting techniques and concepts from image and signal processing.
Text Prompting for Multi-Concept Video Customization by Autoregressive Generation
Divya Kothandaraman, Kihyuk Sohn, Ruben Villegas, Paul Voigtlaender, Dinesh Manocha, Mohammad Babaeizadeh
AI4CC Workshop at CVPR 2024
arXiv
Sequential and controlled autoregressive generation of the desired custom concepts for multi-concept customized video generation with transformer models.
Aerial Diffusion: Text Guided Ground-to-Aerial View Translation from a Single Image using Diffusion Models
Divya Kothandaraman, Tianyi Zhou, Ming Lin, Dinesh Manocha
SIGGRAPH Asia 2023 (Conference Proceedings, Technical Communications)
arXiv / GitHub
A text-guided image-to-image diffusion model that generates aerial views from a single ground-view image.
Differentiable Frequency-based Disentanglement for Aerial Video Action Recognition
Divya Kothandaraman, Ming Lin, Dinesh Manocha
ICRA 2023
arXiv / GitHub
A differentiable feature disentanglement method to learn "static salient" and "dynamic salient" regions for aerial video action recognition.
SALAD: Source-free Active Label Agnostic Domain Adaptation
Divya Kothandaraman, Sumit Shekhar, Abhilasha Sancheti, Manoj Ghuhan, Tripti Shukla, Dinesh Manocha
WACV 2023
arXiv / GitHub
A generic source-free active domain adaptation method that can handle shifts in output label space.
FAR: Fourier Aerial Video Recognition
Divya Kothandaraman, Tianrui Guan, Xijun Wang, Sean Hu, Ming Lin, Dinesh Manocha
ECCV 2022
Project Page / arXiv / GitHub
An efficient aerial video action recognition method built on novel frequency-domain techniques, namely Fourier object disentanglement and Fourier attention.
GANav: Group-wise Attention Network for Classifying Navigable Regions in Unstructured Outdoor Environments
Tianrui Guan, Divya Kothandaraman, Rohan Chandra, Dinesh Manocha
IROS 2022 and RSS 2022
Project Page / arXiv / bibtex
An attention-based segmentation method for identifying safe and navigable regions in off-road terrains.
SS-SFDA: Self-Supervised Source-Free Domain Adaptation for Road Segmentation in Hazardous Environments
Divya Kothandaraman, Rohan Chandra, Dinesh Manocha
ICCV Workshops 2021
Project Page / arXiv / YouTube / GitHub / bibtex
A self-supervised learning approach for source-free unsupervised road segmentation in adverse weather and low-light conditions.
BoMuDA: Boundless Multi-Source Domain Adaptive Segmentation in Unconstrained Environments
Divya Kothandaraman, Rohan Chandra, Dinesh Manocha
ICCV Workshops 2021
Project Page / arXiv / YouTube / GitHub / bibtex
A multi-source boundless unsupervised domain adaptation algorithm for semantic segmentation in unstructured environments.
Domain Adaptive Knowledge Distillation for Driving Scene Semantic Segmentation
Divya Kothandaraman, Athira Nambiar, Anurag Mittal
WACV Workshops 2021
Paper / YouTube / GitHub / bibtex
An approach for domain adaptive semantic segmentation in models with limited memory.
Deep Atrous Guided Filter for Image Restoration in Under Display Cameras
Varun Sundar, Sumanth Hegde*, Divya Kothandaraman, Kaushik Mitra
ECCV Workshops 2020
arXiv / YouTube / Project Page / bibtex
Guided filters, when incorporated into a deep network, can efficiently recover severely degraded, megapixel-resolution images.