
RAG Low-Cost

Course Description 

As organisations increasingly rely on large volumes of internal documents, the ability to build reliable, cost-effective AI systems for document search and question answering has become a key technical capability. This intensive module provides participants with a comprehensive, hands-on understanding of modern document processing, natural language processing, large language models, and Retrieval-Augmented Generation (RAG) pipelines.

The course focuses on applied data science techniques for text and documents, covering embeddings, semantic similarity, vectorisation, and vector search. Participants will learn how to design and implement a complete RAG pipeline locally, including document ingestion, chunking strategies, indexing, retrieval, and controlled text generation. Particular attention is given to understanding the limitations of large language models, common sources of hallucination, and the quality-control mechanisms required to produce trustworthy outputs.

Through a realistic, compliance-oriented business scenario, participants will build and evaluate a low-cost RAG system tailored to SME needs. The module emphasises transparency, traceability, and validation, teaching participants how to enforce source citation, verify retrieved content, and assess system reliability. Open-source and lightweight local solutions are used throughout to minimise operational and integration costs.

By the end of the course, participants will have designed and deployed a fully functional RAG pipeline, documented their architectural and design decisions, and gained the practical expertise needed to build internal AI-powered knowledge systems that are accurate, auditable, and fit for real-world organisational use.

Course Content 

Day | Content
1 | Theory: Data Science, NLP, LLMs, embeddings, chunking, vector databases (FAISS/ChromaDB)
2 | Practice: Building a low-cost RAG pipeline, data handling, indexing strategies
3 | Use case: Full pipeline application on a regulatory compliance scenario, testing, deliverables


Learning Outcomes 
  • Understand the foundations of applied Data Science for text and documents 
  • Master embeddings, semantic similarity, and vectorisation 
  • Build an end-to-end RAG pipeline locally: ingestion, chunking, indexing, retrieval, generation
  • Understand LLM limitations, hallucination sources, and quality-control mechanisms 
  • Apply RAG techniques to a realistic, compliance-focused business scenario 
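
To give a feel for the stages named above (ingestion, chunking, indexing, retrieval), here is a minimal sketch in plain Python. It is an illustration only, not the course material: the `embed` function is a toy bag-of-words stand-in for a real embedding model, the in-memory list stands in for a vector database such as FAISS or ChromaDB, and the document names and texts in `DOCS` are invented for the example. The generation step is left out.

```python
import math
import re
from collections import Counter


def chunk_text(text, max_words=40, overlap=10):
    """Split text into overlapping word windows (one simple chunking strategy)."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + max_words]))
        if start + max_words >= len(words):
            break  # the last window already reached the end of the text
    return chunks


def embed(text):
    """Toy 'embedding': a bag-of-words count vector (assumption: stand-in for a real model)."""
    return Counter(re.findall(r"[a-z0-9]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def build_index(documents):
    """Ingest {source name: text} and return a list of chunks with vectors and metadata."""
    index = []
    for source, text in documents.items():
        for i, chunk in enumerate(chunk_text(text)):
            index.append(
                {"source": source, "chunk_id": i, "text": chunk, "vector": embed(chunk)}
            )
    return index


def retrieve(index, query, top_k=2):
    """Return the top_k (score, entry) pairs, most similar first."""
    q = embed(query)
    scored = [(cosine(q, entry["vector"]), entry) for entry in index]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored[:top_k]


# Hypothetical internal documents, invented for this example.
DOCS = {
    "leave_policy.txt": "Employees are entitled to twenty days of paid annual leave per year.",
    "security_policy.txt": "Passwords must be rotated every ninety days and never shared.",
}

index = build_index(DOCS)
results = retrieve(index, "How many days of annual leave do employees get?", top_k=1)
top_score, top_entry = results[0]
print(top_entry["source"], round(top_score, 2))
```

Keeping the source name and chunk number next to each vector is what later makes citation and traceability possible: every retrieved passage can be traced back to the document it came from.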
Practical Work 
  • Build a RAG system using official internal documents 
  • Implement strict citation of sources and full traceability 
  • Create validation mechanisms (similarity checks, source verification) 
  • Deploy using open-source and lightweight local solutions 
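
As a rough illustration of what strict citation and validation can look like, the sketch below filters retrieved chunks by a similarity threshold and rejects any answer that cites nothing, or that cites a source which was not retrieved. Everything here is an assumption made for the example: the `[source: name]` citation format, the threshold value, the shape of the `hits` dictionaries, and the function names. A real system would tune the threshold on its own documents.

```python
import re

MIN_SIMILARITY = 0.35  # hypothetical threshold; tune on your own documents


def filter_by_similarity(hits, threshold=MIN_SIMILARITY):
    """Similarity check: keep only retrieved chunks whose score clears the threshold."""
    return [h for h in hits if h["score"] >= threshold]


def cited_sources(answer):
    """Extract source names cited as [source: name] (an assumed citation format)."""
    return set(re.findall(r"\[source:\s*([^\]]+?)\s*\]", answer))


def validate_answer(answer, hits):
    """Source verification: return (ok, reason).

    An answer passes only if it cites at least one source and every cited
    source is among the retrieved hits.
    """
    allowed = {h["source"] for h in hits}
    cited = cited_sources(answer)
    if not cited:
        return False, "no citation"
    unknown = cited - allowed
    if unknown:
        return False, "unverified source: " + ", ".join(sorted(unknown))
    return True, "ok"


# Invented retrieval results for the example.
hits = [
    {"source": "leave_policy.txt", "score": 0.60},
    {"source": "security_policy.txt", "score": 0.10},
]
kept = filter_by_similarity(hits)

print(validate_answer("Staff get twenty days [source: leave_policy.txt].", kept))
print(validate_answer("Staff get twenty days.", kept))
print(validate_answer("Staff get thirty days [source: hr_memo.pdf].", kept))
```

Checks like these do not prove an answer is correct, but they make failures visible: an uncited or wrongly cited answer is flagged instead of being shown to the user.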
Deliverables 
  • A fully functioning low-cost RAG pipeline (script/GitHub)
  • A structured vector index (embeddings + metadata) 
  • A simple architecture diagram of the pipeline 
  • A mini technical report with design decisions and recommendations  


Target Audience

Technical professionals looking to build internal knowledge systems with AI-assisted answers. Developers, data analysts, and IT staff wanting to reduce integration costs using open-source solutions. 

Prerequisite

Python knowledge required 

Trainer

Ilker Makine is an engineer and AI practitioner working at the intersection of data, knowledge management, and real-world applications. He is the founder of Tekno-Family ASBL and actively works on applied AI projects in regulated environments. This training focuses on RAG (Retrieval-Augmented Generation): from fundamentals to concrete architectures. Participants will explore document ingestion, retrieval strategies, evaluation, and limitations. The goal is to move from theory to deployable, trustworthy RAG systems.


Price 

Thanks to the support of the European Commission and Innoviris in the framework of the EDIH sustAIn.brussels, SMEs and midcaps receive this training free of charge (0€) in the context of de minimis aid. Large companies and participants without a company pay 3067€ per participant.

Practical Information: 

Language: English (bilingual FR/EN exchanges welcome) 

Location: BeCentral, Cantersteen 12, 1000 Brussels. 

Format: In person, interactive, hands-on. 

Participants: Max 18 participants. 

Duration: 12 hours (over 3 days) 

Questions: 

Yavuz Sarikaya - Programme Manager 

 yavuz.sarikaya@ulb.be