Expert course

RAG-LLM Evals & Test Automation for Beginners

Understand, Evaluate & Test RAG-LLMs (AI-based Systems) from Scratch using the RAGAS-Python-Pytest Framework

Rating: 4.4 • 1,443 ratings • 10,656 students • 8.5 total hours • 59 lectures

Course facts

Last updated 01/2026
Language: English • Subtitles: English [Auto], French [Auto], 5 more
Instructor: Rahul Shetty Academy
retrieval-augmented generation and knowledge systems

What you'll learn

Practical outcomes

How custom Large Language Models (LLMs) are designed using Retrieval Augmented Generation (RAG) architecture
Common benchmarks/metrics used in evaluating RAG-based LLMs
Introduction to the RAGAS evaluation framework for evaluating/testing LLMs
Practical script generation to automate and assert the metric scores of LLMs
Automate scenarios such as single-turn and multi-turn interactions with LLMs using the RAGAS framework
Generate test data for evaluating the metrics of LLMs using the RAGAS framework
Create a RAGAS Pytest evaluation framework to assert the metrics of RAG (custom) LLMs

Curriculum

13 sections • 59 lectures • 8h 43m total length

Introduction to AI concepts - LLMs & RAG LLMs • 6 lectures • 44min

What this course offers? FAQs - Must Watch • 09:12
Course outcome - Setting the stage of expectation • 00:35
Introduction to Artificial Intelligence and LLMs - How they work • 06:17
Overview of popular LLMs and Challenges with these general LLMs • 06:15
What is Retrieval Augmented Generation (RAG)? Understand its Architecture • 11:00
End to end flow in RAG Architecture and its key advantages • 10:32

Understand RAG (Retrieval Augmented Generation) - LLM Architecture with Use Case • 3 lectures • 19min

Misconceptions - Why RAG LLMs? Can't we solve the problem with traditional methods? • 05:26
Should I use LLM data if the information in RAG is missing? - Best Practices • 07:09
Optional - Overview of how code looks in building RAG LLM applications • 06:51

Getting started with Practice LLMs and the approach to evaluate/test • 5 lectures • 30min

Course resources download • 00:09
Demo of Practice RAG LLMs to evaluate and write test automation scripts • 06:51
Understanding implementation part of practice RAG LLMs to understand context • 08:36
Understand conversational LLM scenarios and how they are applied to RAG Arch • 05:47
Understand the Metric benchmarks for Document Retrieval system in LLM • 08:12

Setup Python & Pytest Environment with RAGAS LLM Evaluation Package Libraries • 4 lectures • 31min

Install and set the path of Python in Windows OS • 10:16
Install and set the path of Python in macOS • 10:26
Install RAGAS Framework packages and setup the LLM Test project • 09:35
Python & Pytest Basics - Where to find them in the tutorial? • 00:30

Programmatic solution to evaluate LLM Metrics with Langchain and RAGAS Libraries • 5 lectures • 1hr

Making connection with OpenAI using Langchain Framework for RAGAS • 15:49
End to end - Evaluate LLM for ContextPrecision metric with SingleTurn Test data • 20:38
Metrics document download • 00:04
Communicate with LLMs using API POST call to dynamically get responses • 09:51
Evaluate LLM for Context Recall Metric with RAGAS Pytest Test example • 13:22

Optimize LLM Evaluation tests with Pytest Fixtures & Parameterization techniques • 3 lectures • 31min

Build Pytest fixtures to isolate OpenAI and LLM Wrapper common utils from test • 07:56
Introduction to Pytest Parameterization fixtures to drive test data externally • 10:13
Reusable utils to isolate API calls of LLM and have test only on Metric logic • 13:18

Evaluate LLM Core Metrics and importance of EvalDataSet in RAGAS Framework • 5 lectures • 47min

Understand LLM's Faithfulness and Response relevance metrics conceptually • 04:56
Build LLM Evaluation script to test Faithfulness benchmarks using RAGAS • 09:42
Reading Test data from external JSON file to LLM evaluation scripts • 09:58
Understand how Metrics are used at different places of RAG LLM Architecture • 10:34
Factual Correctness - Build a single Test to evaluate multiple LLM metrics • 12:02

Upload LLM Evaluation results & Test LLM for Multi Conversational Chat History • 5 lectures • 44min

Understand EvaluationDataSet and how it helps in evaluating Multiple metrics • 09:41
Important Note • 00:23
Upload the LLM Metrics evaluation results into RAGAS dashboard portal visually • 08:21
How to evaluate RAG LLM with multi conversational history chat • 07:59
Build LLM Evaluation Test which can evaluate multi conversation - example • 17:42

Create Test Data dynamically to evaluate LLM & Generate Rubrics Evaluation Score • 4 lectures • 56min

How to Create Test Data using RAGAS Framework to evaluate LLM • 15:02
Load the external docs into Langchain utils to analyze and extract test data • 08:52
Install and configure NLTK package to scan the LLM documents & generating tests • 20:11
Generate Rubrics based Criteria Scoring to evaluate the quality of LLM responses • 11:46

Conclusion and next steps! • 1 lecture • 4min

1 slide Recap of concepts learned from the course • 04:29

Optional - Learn Python Fundamentals with examples • 14 lectures • 2hr 4min

Python hello world Program with Basics • 08:35
Datatypes in Python and how to get the Type at run time • 05:17
List Datatype and its operations to manipulate • 12:47
Tuple and Dictionary Data types in Python with examples • 08:28
If else condition in Python with working examples • 03:10
How to Create Dictionaries at run time and add data into it • 07:55
How loops work in Python and importance of code indentation • 08:58
Programming examples using for loop - 1 • 04:17
Programming examples using while loop - 2 • 10:27
What are functions? How to use them in Python • 10:46
OOPS Principles: Classes and objects in Python • 07:38
What is Constructor and its role in Object oriented programming • 13:38
Inheritance concepts with examples in Python • 12:12
Strings and its functions in Python • 09:53

Optional - Overview of Pytest Framework basics with examples • 3 lectures • 32min

What are pytest fixtures and how they help in enhancing tests • 10:29
Understand scopes in Pytest fixtures with examples • 11:59
Setup and teardown setup using Python fixtures with yield keyword • 09:04

Bonus Lecture • 1 lecture • 1min

Bonus Lecture • 01:29

Who it is for

Software Engineers
Quality Assurance Engineers
Software Testers

Course description

Overview

LLMs are everywhere! Every business is building its own custom AI-based RAG-LLMs to improve customer service. But how are engineers testing them? Unlike traditional software testing, AI-based systems need a special methodology for evaluation. This course starts from the ground up, explaining the architecture of how AI systems (LLMs) work behind the scenes. Then, it dives deep into LLM evaluation metrics. This course shows you how to effectively use the RAGAS framework library to evaluate LLM metrics through scripted examples. This allows you to use Pytest assertions to check metric benchmark scores and design a robust LLM Test/evaluation automation framework.
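
The idea of asserting a metric score with Pytest can be sketched roughly as follows. This is a minimal illustration, not course material and not the RAGAS implementation: RAGAS relies on an LLM judge, whereas the `context_recall` function here is a hypothetical stand-in that uses plain substring matching so the sketch runs offline. The threshold value and the sample data are also made up for illustration.

```python
# Stand-in scorer (assumption): NOT how RAGAS computes context recall.
# A case-insensitive substring check plays the role of the LLM judge.
def context_recall(reference_claims, retrieved_contexts):
    """Fraction of reference claims found in at least one retrieved context."""
    if not reference_claims:
        raise ValueError("reference_claims must not be empty")
    contexts = [ctx.lower() for ctx in retrieved_contexts]
    supported = sum(
        any(claim.lower() in ctx for ctx in contexts)
        for claim in reference_claims
    )
    return supported / len(reference_claims)


RECALL_THRESHOLD = 0.7  # hypothetical benchmark chosen for this sketch


def test_context_recall_meets_threshold():
    # Made-up retrieval output and reference claims for one test question.
    retrieved = [
        "Pytest fixtures can use the yield keyword for teardown.",
        "Fixture scopes include function, class, module and session.",
    ]
    claims = ["yield keyword", "function, class, module and session"]
    score = context_recall(claims, retrieved)
    # The evaluation becomes an ordinary assertion on a numeric score.
    assert score >= RECALL_THRESHOLD, f"context recall {score:.2f} is below threshold"
```

Pytest discovers any function whose name starts with `test_`, so a failing score surfaces like any other failed assertion. In a real setup the stand-in scorer would presumably be replaced by a metric from the evaluation library.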

What will you learn from the course?

High-level overview of Large Language Models (LLMs)
Understand how custom LLMs are built using Retrieval Augmented Generation (RAG) architecture
Common benchmarks/metrics used in evaluating RAG-based LLMs
Introduction to the RAGAS evaluation framework for evaluating/testing LLMs
Practical script generation to automate and assert the metric scores of LLMs
Automate scenarios such as single-turn and multi-turn interactions with LLMs using the RAGAS framework
Generate test data for evaluating the metrics of LLMs using the RAGAS framework

By the end of the course, you will be able to create a RAGAS Pytest evaluation framework to assert the metrics of RAG (custom) LLMs.

Important note: This course covers the top 7 metrics commonly used to evaluate and test LLMs. The same logic can be applied to any other metric evaluation.

Hands-on experience: The course provides a practice RAG-LLM for hands-on work, but at the scripting phase you need a basic OpenAI subscription to access their APIs (a minimal $10 credit will suffice).

Course prerequisites: Python and Pytest basics are required to understand the framework. Two dedicated sections at the end of this course give you the Python and Pytest knowledge needed to follow along. Basic knowledge of API testing is also required.
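
To give a feel for one of the retrieval metrics named in the curriculum, here is a small sketch of a rank-aware context precision score. It assumes the common formulation in which precision@k is summed over the ranks holding a relevant chunk and divided by the number of relevant chunks. The 0/1 relevance flags are supplied by hand here, whereas an evaluation framework would typically obtain them from an LLM judge. The function name is hypothetical.

```python
def context_precision(relevance):
    """Rank-aware precision over retrieved chunks.

    relevance: list of 0/1 flags, one per retrieved chunk, in rank order
    (hand-labelled in this sketch).
    """
    total_relevant = sum(relevance)
    if total_relevant == 0:
        return 0.0
    hits = 0
    weighted_sum = 0.0
    for rank, flag in enumerate(relevance, start=1):
        if flag:
            hits += 1
            # precision@rank only counts at ranks holding a relevant chunk
            weighted_sum += hits / rank
    return weighted_sum / total_relevant


# Same two relevant chunks, different ranking: earlier hits score higher.
early = context_precision([1, 0, 1])  # (1/1 + 2/3) / 2
late = context_precision([0, 1, 1])   # (1/2 + 2/3) / 2
assert early > late
```

The comparison at the end shows why such a metric suits retrieval testing: it rewards a retriever for placing relevant chunks near the top, not only for including them.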

Instructor

Rahul Shetty Academy

Helping 1M+ Students Become Job-Ready QA Engineers

"Nothing is Impossible. It all depends on how we are Trained on it."
"Teaching is my Passion. And it's my Profession. The only Business I know is Spreading Knowledge."

I'm Rahul Shetty (aka Venkatesh), a QA instructor with a 15-year track record. Over 1 million QA professionals from 195 countries have taken my courses on Selenium, Playwright, AI Testing, Software Testing (Jira), API Testing, Cypress, Postman, Appium, JMeter, and more.

I lead top QA initiatives both online and offline through Rahul Shetty Academy, one of the leading EdTech platforms for QA training; QASummit, a premier offline conference brand; and RS TekSolutions, my software consulting firm. Together, these ventures have helped hundreds of thousands of students master testing and automation, transforming their careers as Automation Engineers.

"Many QA professionals aspire to learn cutting-edge automation, but 90% abandon their goals. It's not a lack of courses, but finding the right mentor who understands the QA mindset and tailors their teaching accordingly."

"As a QA engineer with nearly two decades of experience, I get it. I've built my courses strategically, focusing on practical skills and career growth. I believe I've cracked the code for teaching automation testing, and I'm thrilled to share it with you."

"My online courses are the most comprehensive available. You gain not only up-to-date, job-relevant knowledge but also a lifelong mentor who's helped countless QA engineers level up."

There isn't a day when I don't receive student success emails from across the globe about landing a new job, how my courses have changed their lives and careers for the better, and how they are respected and appreciated at the workplace after gaining new knowledge and experience from my courses.

"Join my courses and gain the skills and mentorship to achieve your QA career goals!"