
Coy Cardwell
Principal Engineer
First Line Software
Coy is a Principal Engineer who’s obsessed with making technology actually work for people, not the other way around. He brings a practical, human-centred approach to engineering, turning complex systems into tools that teams genuinely enjoy using.
02 December 2025 14:30 - 15:00
Panel | Designing robust LLM evaluation frameworks for performance and alignment
In this session, we’ll explore how to build robust evaluation frameworks that go beyond benchmarks to capture safety, bias, hallucination, and task-specific performance. We’ll cover practical methods for stress-testing LLMs in enterprise settings, defining custom evaluation metrics, and building scalable pipelines to validate model behavior over time.
→ Key techniques for measuring alignment, reliability, and safety.
→ How to operationalise evaluation workflows across teams and use cases.