Intermediate course
Multimodal RAG: AI Search & Recommender Systems with GPT-4
Mastering Multimodal RAG: Build AI-Powered Search & Recommender Systems with GPT-4, CLIP, and ChromaDB
Course facts
- Last updated 11/2024
- Language: English; subtitles: English [Auto], French [Auto], 3 more
- Instructor: Paulo Dichone | Software Engineer, AWS Cloud Practitioner & Instructor
- retrieval-augmented generation and knowledge systems
What you'll learn
Practical outcomes
- Understand and implement Retrieval-Augmented Generation (RAG) with multimodal data (text, images).
- Build AI-powered search and recommender systems using GPT-4, CLIP, and ChromaDB.
- Generate and utilize text and image embeddings to perform multimodal searches.
- Develop interactive applications with Streamlit to handle user queries and provide AI-driven recommendations.
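To make the embedding-based search outcome above concrete, here is a minimal sketch of nearest-neighbor retrieval in a shared embedding space. It is not the course's code: the three-dimensional vectors, the file names, and the `search` helper are hypothetical stand-ins for real CLIP embeddings and a ChromaDB collection, chosen only so the example runs with the Python standard library.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


# Hypothetical toy "image embeddings". A real system would obtain these
# from an image encoder such as CLIP and keep them in a vector database.
IMAGE_INDEX = {
    "red_dress.jpg": [0.9, 0.1, 0.0],
    "blue_jeans.jpg": [0.1, 0.9, 0.1],
    "running_shoes.jpg": [0.0, 0.2, 0.9],
}


def search(query_embedding, index, top_k=2):
    """Return the names of the top_k entries most similar to the query."""
    scored = [
        (cosine_similarity(query_embedding, vector), name)
        for name, vector in index.items()
    ]
    scored.sort(reverse=True)  # highest similarity first
    return [name for _, name in scored[:top_k]]


# Hypothetical "text embedding" for a query like "a red evening dress".
# Because text and images share one vector space, a text query can rank images.
query = [0.8, 0.2, 0.1]
results = search(query, IMAGE_INDEX)
print(results)
```

The design point is that search reduces to comparing vectors, so the same `search` call works whether the query vector came from text or from an image.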
Curriculum
8 sections • 24 lectures • 1h 35m total length
Introduction • 3 lectures • 6min
- Introduction & Prerequisites • 02:37
- Course Structure • 00:32
- WATCH THIS DEMO - What You'll Build in This Course • 02:37
Download Code and Resources • 2 lectures • 2min
- PLEASE Watch this - How To Get the Source Code • 02:16
- Download source code • 00:01
Development Environment Setup • 1 lecture • 1min
- Development Environment Setup - Overview • 01:04
RAG (Retrieval-Augmented Generation) and Multimodal Systems Deep Dive • 3 lectures • 17min
- RAG Systems - Deep Dive Crash Course • 04:17
- RAG Benefits and Practical Application • 04:14
- Multimodal RAG - Overview & Motivation and Benefits - How it Works • 08:54
Search in a Multimodal RAG System • 3 lectures • 11min
- How Search is Integrated into a Multimodal RAG System - Full Workflow • 03:19
- Why Multimodal Search is so Powerful • 02:53
- Visual Explanation Why Multimodal Search is so Powerful • 04:47
Hands-on: Multimodal Search RAG System • 2 lectures • 17min
- Multimodal Search System Setup - Create Embeddings of Images • 07:57
- Finish the Multimodal Search System • 09:17
Hands-On - Multimodal Recommender System • 8 lectures • 36min
- Multimodal Recommender System - Overview • 03:28
- Getting our Dataset from Hugging Face & Showing Number of Rows • 03:27
- Saving all Images Locally • 03:00
- Saving Image Embeddings to Vector Database • 04:51
- Testing our Multimodal Recommender System - Fetching the Correct Images • 03:52
- Setting up the RAG Flow - Part 1 • 08:12
- Putting it all Together and Testing the Multimodal Recommender RAG System • 06:29
- Adding a UI to the Multimodal Recommender System - Streamlit • 02:58
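The retrieve-then-generate flow named in the recommender lectures above can be pictured roughly as follows. This is a sketch under stated assumptions, not the course's implementation: the `build_prompt` helper, the item fields, and the prompt wording are all hypothetical, and the final step of sending the prompt to a language model such as GPT-4 is deliberately left out.

```python
def build_prompt(user_query, retrieved_items):
    """Assemble a grounded prompt from items returned by the retrieval step.

    retrieved_items is assumed to be a list of dicts with hypothetical
    'name' and 'caption' keys, e.g. metadata stored alongside each embedding.
    """
    context_lines = [
        f"- {item['name']}: {item['caption']}" for item in retrieved_items
    ]
    context = "\n".join(context_lines)
    return (
        "You are a recommendation assistant. "
        "Answer using only the context below.\n"
        f"Context:\n{context}\n"
        f"Question: {user_query}"
    )


# Hypothetical retrieval output; a real system would get this from a
# vector database query over image embeddings.
retrieved = [
    {"name": "red_dress.jpg", "caption": "red evening dress"},
    {"name": "black_heels.jpg", "caption": "black high-heeled shoes"},
]

prompt = build_prompt("What should I wear to a formal dinner?", retrieved)
print(prompt)
```

Keeping prompt assembly in its own function separates retrieval from generation, so either side can be swapped or tested independently.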
Next Steps • 2 lectures • 5min
- Next steps • 02:53
- Bonus Lecture • 01:55
Who it is for
- Aspiring AI Developers: Individuals looking to build AI-powered applications that integrate text and image data.
- Data Scientists: Professionals aiming to enhance their skills in multimodal data processing and retrieval.
- Machine Learning Engineers: Those seeking to implement advanced search and recommender systems using state-of-the-art models.
Course description
Overview
Are you ready to dive into the cutting-edge world of AI-powered search and recommender systems? This course will guide you through the process of building Multimodal Retrieval-Augmented Generation (RAG) systems that combine text and image data for advanced information retrieval and recommendations. In this hands-on course, you'll learn how to leverage state-of-the-art tools such as GPT-4, CLIP, and ChromaDB to build AI systems capable of processing multimodal data—enhancing traditional search methods with the power of machine learning and embeddings.
What You’ll Learn:
- Master Multimodal RAG: Understand the concept of Retrieval-Augmented Generation (RAG) and how to implement it for both text and image-based data.
- Build AI-Powered Search & Recommendation Systems: Learn how to construct search engines and recommender systems that can handle multimodal queries, using powerful AI models like GPT-4 and CLIP.
- Utilize Embeddings for Cross-Modal Search: Gain practical experience generating and using embeddings to enable search and recommendations based on text or image input.
- Develop Interactive Applications with Streamlit: Create user-friendly applications that allow real-time querying and recommendations based on user-provided text or image data.
Key Technologies You'll Work With:
- GPT-4: A cutting-edge language model that powers the AI-driven recommendations.
- CLIP: An advanced AI model for generating image and text embeddings, making it possible to search images with text.
- ChromaDB: A high-performance vector database that enables fast and efficient querying for multimodal embeddings.
- Streamlit: A simple yet powerful framework for building interactive web applications.
No prior experience with multimodal systems? No problem! This course is designed to make advanced AI concepts accessible, with detailed, step-by-step instructions that guide you through each process, from generating embeddings to building complete AI systems. Basic Python knowledge and a curiosity for AI are all you need to get started.
Enroll today and take your AI development skills to the next level by mastering the art of multimodal RAG systems!
Instructor
Paulo Dichone | Software Engineer, AWS Cloud Practitioner & Instructor
Android, Flutter, AWS, Best Selling Instructor
Hi, I’m Paulo – Your Guide to Mastering Development, Cloud, and AI Engineering
With a passion for empowering learners, I’ve had the privilege of teaching over 350,000 students across 175 countries. Whether you’re diving into Android, Java, Flutter, AWS Cloud, or venturing into the world of AI engineering, I’m here to help you unlock your full potential.
My Expertise
I bring extensive hands-on experience in:
- AI Engineering
- Mobile App Development (Android & iOS)
- Cross-Platform Development (Flutter, Dart, and JavaFX)
- AWS Cloud Solutions
And now, I’m also focused on the AI engineering landscape, helping developers leverage the power of machine learning and automation in their projects.
My Mission
No matter where you are in your journey, whether you're just starting or looking to sharpen advanced skills, my courses are designed to make you an exceptional developer and AWS Cloud Practitioner, equipped to tackle real-world challenges.
Beyond coding, I enjoy spending time with my growing family, playing the guitar and mandolin, and traveling whenever I get the chance.
Ready to Get Started?
Android Development:
- The Comprehensive Android Development Masterclass: Learn Android from scratch. This beginner-friendly course covers everything you need to build Android apps confidently, no prior experience required.
- The Complete Intermediate Android Masterclass: Master essential Android topics like WorkManager API, ROOM Database, and background operations to level up your mobile development skills.
Cross-Platform & Web Development:
- Flutter & Dart - The Complete App Development Course: Develop beautiful iOS and Android apps with a single codebase using Dart and Flutter.
- AngularDart - Build Dynamic Web Apps with Angular & Dart: Learn one of the most powerful web frameworks, Angular, combined with Dart to create interactive web applications.
- TornadoFX - Build JavaFX Applications with Kotlin: Craft amazing desktop apps using Kotlin and JavaFX, taking advantage of Kotlin’s simplicity and expressiveness.
AWS Cloud Mastery:
- Amazon EC2 Master Class (Includes Auto Scaling & Load Balancer)
- Amazon ECS & Fargate Masterclass
- Amazon EKS with Kubernetes
- AWS AppSync & Amplify
- AWS Lambda and Serverless Framework
These courses are designed to make you proficient with cloud technologies, covering key AWS services to help you build scalable and efficient cloud solutions.
Master Java:
- Java Masterclass - Beginner to Expert Guide: Learn Java from the ground up and gain the skills to build powerful applications.
- Java Design Patterns - The Complete Masterclass: Develop reliable, maintainable software using proven design patterns that are fundamental to professional Java programming.
The Future of Development: AI Engineering
I'm passionate about helping students explore the intersection of AI and software development. In my upcoming AI courses, I’ll show you how to integrate AI solutions into mobile apps and cloud systems, empowering you to become a cutting-edge developer with AI capabilities.
I look forward to welcoming you to my courses and being part of your journey to becoming the best developer, cloud practitioner, and AI engineer you can be. See you inside?
