Aman Dalmia
Machine Learning Engineer
Staples
Aman Dalmia is a Machine Learning Engineer specializing in Natural Language Processing (NLP) and Multimodal AI. With a background in AI research and development, he has worked on NLP applications, including recommendation systems, intent recognition and document intelligence. Currently, he is working at Staples, where he enhances AI-driven product search, recommendations and ranking. Passionate about AI's impact on real-world applications, Aman continues to explore advancements in machine learning, large language models, and multimodal AI.
20 November 2025 12:30 - 13:00
Visual question answering: Solving geometry problems with VLMs
This talk presents GeoCoder, a novel approach to solving geometry problems in Visual Question Answering (VQA) using vision-language models (VLMs) that generate and execute modular code. Traditional VLMs often struggle with precise calculations and correct formula use. GeoCoder addresses this by fine-tuning a VLM (LLaVA 1.5 7B) to produce executable code, improving accuracy and interpretability. We demonstrate how this semi-parametric method outperforms chain-of-thought reasoning across various problem complexities, leveraging retrieval-augmented generation and code execution for more reliable solutions.
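To give a feel for the "generate and execute modular code" idea, here is a minimal Python sketch. It is not GeoCoder's implementation: the function library (`pythagoras_hypotenuse`, `triangle_area`), the `GENERATED_CODE` string standing in for model output, and the `run_generated` helper are all hypothetical names chosen for illustration. The point is only that the model picks formulas from a fixed library and leaves the arithmetic to an interpreter.

```python
import math

# Hypothetical formula library. These names are illustrative assumptions,
# not the functions used by GeoCoder.
def pythagoras_hypotenuse(a: float, b: float) -> float:
    """Length of the hypotenuse of a right triangle with legs a and b."""
    return math.hypot(a, b)

def triangle_area(base: float, height: float) -> float:
    """Area of a triangle from its base and height."""
    return 0.5 * base * height

LIBRARY = {
    "pythagoras_hypotenuse": pythagoras_hypotenuse,
    "triangle_area": triangle_area,
}

# Stand-in for the code a fine-tuned VLM might emit for a right triangle
# with legs 3 and 4 (assumed output, written by hand for this sketch).
GENERATED_CODE = """
hyp = pythagoras_hypotenuse(3, 4)
area = triangle_area(3, 4)
answer = hyp + area
"""

def run_generated(code: str) -> float:
    """Execute generated code with access to the library only.

    Emptying __builtins__ limits what the code can call, but it is not a
    real sandbox; untrusted code needs proper isolation.
    """
    namespace = {"__builtins__": {}, **LIBRARY}
    exec(code, namespace)
    return namespace["answer"]

result = run_generated(GENERATED_CODE)
print(result)
```

Because the numeric work happens in the interpreter, a wrong answer can be traced to a specific function call rather than to an arithmetic slip in free-form text, which is the interpretability benefit the abstract refers to.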