AI testing and evaluation
A deep dive into AI quality and security, evaluation frameworks, bias detection, and building reliable, robust AI systems. Hosted by Aleksandr Meshkov, an AI evaluation architect with 13 years of experience.
Podcasting since 2026 • 1 episode
Latest Episodes
AI Evaluation. Episode 1. Practical approach to using LLM-as-a-Judge effectively
Episode Description: In this episode, we dive into a practical, three-step approach to transforming LLMs from unpredictable evaluators into reliable, transparent tools. Stop relying on vague instructions like "evaluate relevance" ...
Season 1 • Episode 1 • 20:33