Research
I'm interested in developing systems that integrate vision-language models with formal logic to improve the reliability and interpretability of video-based reasoning. My research focuses on constructing pipelines to reason about temporal event sequences for video understanding, generation, and agents by combining deep learning with automata-theoretic methods.
NeuS-QA: Grounding Long-Form Video Understanding in Temporal Logic and Neuro-Symbolic Reasoning
Sahil Shah, S. P. Sharan*, Harsh Goel*, Minkyu Choi, Mustafa Munir, Manvik Pasula, Radu Marculescu, Sandeep Chinchali
AAAI, 2026
arXiv
Training-free pipeline that identifies logical event sequences in video, boosting VQA accuracy by over 10% on causal and multi-step reasoning tasks.
A Challenge to Build Neuro-Symbolic Video Agents
Sahil Shah, Harsh Goel, Sai Shankar Narasimhan, Minkyu Choi, S. P. Sharan, Oguzhan Akcin, Sandeep Chinchali
NeuS, 2025
paper / arXiv / code
Combining neuro-symbolic reasoning with video perception enables agents that can interpret, predict, and act on temporal events rather than merely recognize them.
We'll Fix it in Post: Improving Text-to-Video Generation with Neuro-Symbolic Feedback
Minkyu Choi*, S. P. Sharan*, Harsh Goel, Sahil Shah, Sandeep Chinchali
Submitted to ICLR, 2026
arXiv
Neuro-symbolic feedback enables zero-shot refinement of generated videos, boosting temporal and semantic alignment by nearly 40% without retraining.
Neuro-Symbolic Evaluation of Text-to-Video Models using Formal Verification
S. P. Sharan*, Minkyu Choi*, Sahil Shah, Harsh Goel, Mohammad Omama, Sandeep Chinchali
CVPR, 2025
project page / paper / arXiv / code
Formally verifying a video against a temporal logic specification yields a text-to-video metric that aligns 5× better with human judgment than existing scores.
COFFEE: a High-Performance Approach to Convex Optimization for Thermodynamic Equilibrium Computations
Fu-Yao Yu*, Sahil Shah*, Yash Mittal, Paul Bessler, Aamir Mohsin, Jeffrey Geng, Arnav Vats, David Soloveichik
SIEDS, 2025 (Best Paper)
project page / paper / code
A trust-region convex solver for molecular equilibrium that is 2× faster and 10⁷× more accurate than prior tools and scales to large biochemistry datasets.
Real-Time Privacy Preservation for Robot Visual Perception
Minkyu Choi*, Yunhao Yang*, Neel P Bhatt*, Kushagra Gupta, Sahil Shah, Aditya Rai, David Fridovich-Keil, Ufuk Topcu, Sandeep Chinchali
TMLR, 2025
arXiv
Blurring objects based on logical specifications enables real-time video privacy with 95%+ compliance, provable guarantees, and seamless robot deployment.
Towards Neuro-Symbolic Video Understanding
Minkyu Choi, Harsh Goel*, Mohammad Omama*, Yunhao Yang, Sahil Shah, Sandeep Chinchali
ECCV, 2024 (Oral Presentation)
project page / paper / arXiv / code
Decoupling perception from temporal reasoning with temporal logic (TL) based state machines boosts long-range event identification by up to 15% on self-driving datasets.