15 April 2026 16:00 - 16:30
Scaling and optimizing agentic workflows
As agentic systems move from prototype to production, managing inference costs, latency, and throughput becomes a critical engineering challenge. This technical session explores three practical architectural patterns for optimizing large language model operations in autonomous agents.
First, we will dive into query-aware routing, demonstrating how to dynamically match incoming agent tasks with the most efficient model size without sacrificing capability.
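As a preview of the routing idea, here is a minimal sketch. The model identifiers, keyword list, and length threshold are illustrative assumptions, not material from the session: a production router would typically use a small classifier or an LLM-based judge rather than keyword heuristics.

```python
# Query-aware routing sketch: pick a model tier per task based on rough
# complexity signals. All names and thresholds here are hypothetical.

MODEL_TIERS = {
    "small": "small-model-v1",   # cheap, fast model (placeholder name)
    "large": "large-model-v1",   # capable, expensive model (placeholder name)
}

# Keywords that loosely signal reasoning-heavy work (illustrative only).
COMPLEX_KEYWORDS = {"plan", "analyze", "prove", "refactor", "multi-step"}

def route(task: str) -> str:
    """Return a model identifier based on simple complexity heuristics."""
    words = task.lower().split()
    # Long tasks, or tasks containing reasoning-heavy keywords,
    # are sent to the large model; everything else goes to the small one.
    is_complex = len(words) > 40 or any(
        w.strip(".,") in COMPLEX_KEYWORDS for w in words
    )
    return MODEL_TIERS["large" if is_complex else "small"]

print(route("Summarize this paragraph."))                        # small-model-v1
print(route("Plan a multi-step refactor of the auth module."))   # large-model-v1
```

The point of the pattern is that the routing decision is cheap relative to the inference it avoids, so even a crude classifier can pay for itself.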
Next, we will cover semantic caching strategies that bypass redundant model calls for frequently executed tasks, drastically reducing latency. Finally, we will examine advanced prompt-design techniques for token-efficient reasoning, contrasting them with traditional verbose reasoning methods to show how to minimize computational overhead while preserving logical rigor.
By the end of this session, participants will have concrete methodologies to engineer highly efficient and economically viable agentic architectures.