Tuesday, March 3, 2026

Building a Web Scraping RAG Pipeline in .NET Using GitHub Models

Retrieval‑Augmented Generation (RAG) has quickly become one of the most practical ways to build AI systems that work with real‑world data instead of relying only on a model’s training knowledge.

Most RAG examples today are written in Python and often depend on frameworks like LangChain. These tools are powerful, but they leave a gap for developers who work primarily in .NET and C#.

In this post, I’ll walk through a .NET Console application I built that implements a complete RAG pipeline using GitHub Models for embeddings and chat completion.

What This Project Does

At a high level, this application allows you to:

  1. Provide a website URL
  2. Automatically scrape and clean the page content
  3. Convert the content into embeddings
  4. Store embeddings in an in‑memory vector store
  5. Ask questions about the website
  6. Get context‑aware answers generated by an AI model and grounded in the scraped content

All of this runs inside a .NET Console App, making it easy to debug, extend, and later expose as an API.
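
To make steps 3 to 5 concrete, here is a minimal sketch of the chunk, embed, store, and retrieve loop. It is an illustration under stated assumptions, not the project's actual code: `TextChunker`, `ToyEmbedder`, and `InMemoryVectorStore` are hypothetical names, the page text is hard-coded in place of a real scrape, and the hashed bag‑of‑words embedding is a stand‑in for a real embedding call to GitHub Models so that the sample runs offline.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

public static class TextChunker
{
    // Splits text into chunks of at most wordsPerChunk words each.
    public static IEnumerable<string> Chunk(string text, int wordsPerChunk)
    {
        var words = text.Split(new[] { ' ', '\t', '\r', '\n' }, StringSplitOptions.RemoveEmptyEntries);
        for (int i = 0; i < words.Length; i += wordsPerChunk)
            yield return string.Join(" ", words.Skip(i).Take(wordsPerChunk));
    }
}

public static class ToyEmbedder
{
    // ASSUMPTION: stand-in for a real embedding model. Each word is hashed
    // into one slot of a fixed-size vector (a hashed bag of words).
    public static float[] Embed(string text, int dimensions = 64)
    {
        var vector = new float[dimensions];
        var separators = new[] { ' ', '\t', '\r', '\n', '.', ',', '?', '!', ';', ':' };
        foreach (var word in text.ToLowerInvariant().Split(separators, StringSplitOptions.RemoveEmptyEntries))
        {
            // Hand-rolled hash so results are stable across runs
            // (string.GetHashCode is randomized per process on modern .NET).
            int hash = 17;
            foreach (char c in word) hash = unchecked(hash * 31 + c);
            vector[(hash & int.MaxValue) % dimensions] += 1f;
        }
        return vector;
    }
}

public sealed class InMemoryVectorStore
{
    private readonly List<(string Text, float[] Vector)> _items = new List<(string Text, float[] Vector)>();

    public void Add(string text, float[] vector) => _items.Add((text, vector));

    // Returns the topK stored chunks ranked by cosine similarity to the query vector.
    public IReadOnlyList<(string Text, double Score)> Search(float[] query, int topK) =>
        _items
            .Select(item => (item.Text, Score: CosineSimilarity(query, item.Vector)))
            .OrderByDescending(result => result.Score)
            .Take(topK)
            .ToList();

    public static double CosineSimilarity(float[] a, float[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return normA == 0 || normB == 0 ? 0 : dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }
}

public static class Program
{
    public static void Main()
    {
        // ASSUMPTION: hard-coded text standing in for scraped, cleaned page content.
        string pageText =
            "Contoso sells handmade ceramic mugs. Orders ship within three business days. " +
            "Returns are accepted within thirty days of delivery. Support is available by email.";

        var store = new InMemoryVectorStore();
        foreach (var chunk in TextChunker.Chunk(pageText, wordsPerChunk: 8))
            store.Add(chunk, ToyEmbedder.Embed(chunk));

        string question = "When are returns accepted?";
        var matches = store.Search(ToyEmbedder.Embed(question), topK: 2);

        // The retrieved chunks become the context portion of the chat prompt.
        string context = string.Join("\n", matches.Select(m => m.Text));
        Console.WriteLine($"Context:\n{context}\n\nQuestion: {question}");
    }
}
```

In a real pipeline, `ToyEmbedder.Embed` would be replaced by a call to an embedding model, and the printed context plus question would be sent to a chat completion model instead of the console. The store itself needs nothing more than a list and a cosine similarity function, which is why an in‑memory version is a reasonable starting point for a console app.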
