Tuesday, March 3, 2026

Building a Web Scraping RAG Pipeline in .NET Using GitHub Models

Retrieval‑Augmented Generation (RAG) has quickly become one of the most practical ways to build AI systems that work with real‑world data instead of relying only on a model’s training knowledge.

Most RAG examples today are written in Python and often depend on frameworks like LangChain. While these frameworks are powerful, the Python focus leaves a gap for developers working primarily in .NET and C#.

In this post, I’ll walk through a .NET Console application I built that implements a complete RAG pipeline using GitHub Models for embeddings and chat completion.

What Does This Project Do?

At a high level, this application allows you to:

Provide a website URL
Automatically scrape and clean the page content
Convert the content into embeddings
Store embeddings in an in‑memory vector store
Ask questions about the website
Get context‑aware answers that an AI model generates from the retrieved content

All of this runs inside a .NET Console App, making it easy to debug, extend, and later expose as an API.
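
To make the "in‑memory vector store" step more concrete, here is a minimal sketch of what such a store could look like. This is not the project's actual code: the names InMemoryVectorStore, VectorEntry, Add, and Search are hypothetical, and ranking by cosine similarity is an assumption (it is a common choice for comparing embeddings). The embeddings themselves would come from the embedding model, so the sketch simply accepts float arrays.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical types for illustration only; not taken from the project itself.
public sealed record VectorEntry(string Text, float[] Embedding);

public sealed class InMemoryVectorStore
{
    private readonly List<VectorEntry> _entries = new();

    public int Count => _entries.Count;

    // Stores a text chunk together with its embedding vector.
    public void Add(string text, float[] embedding)
    {
        if (embedding is null || embedding.Length == 0)
            throw new ArgumentException("Embedding must be non-empty.", nameof(embedding));
        _entries.Add(new VectorEntry(text, embedding));
    }

    // Returns the topK stored chunks most similar to the query embedding,
    // ordered from most to least similar.
    public IReadOnlyList<(string Text, double Score)> Search(float[] query, int topK)
    {
        return _entries
            .Select(e => (e.Text, Score: CosineSimilarity(query, e.Embedding)))
            .OrderByDescending(r => r.Score)
            .Take(topK)
            .ToList();
    }

    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
    public static double CosineSimilarity(float[] a, float[] b)
    {
        if (a.Length != b.Length)
            throw new ArgumentException("Vectors must have the same length.");

        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += (double)a[i] * b[i];
            normA += (double)a[i] * a[i];
            normB += (double)b[i] * b[i];
        }

        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}
```

In a pipeline like the one described, each cleaned chunk of page text would be passed to Add along with its embedding, and at question time the question's embedding would be passed to Search to pick the chunks that go into the chat prompt. A linear scan like this is fine for a single page; a larger corpus would likely call for a dedicated vector database.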
