15 April 2025 16:10 - 16:30
Trust is a feature: Building AI evaluations
Gartner found that, by the end of last year, at least 50% of generative AI projects had been abandoned. The ones that shipped share one trait: the team treated evaluation as a product process, not a QA checkbox.
In this session, Pearl will make the case that trust is not a safety disclaimer but a measurable, shippable product feature.
Drawing on the Harmless-Honest-Helpful (HHH) alignment framework, recent LLM evaluation research, and real deployment patterns from teams in production, she'll walk through how product and engineering teams can close the gap between a compelling demo and a system that users actually rely on.