Tuesday, March 3, 2026

Building a Web Scraping RAG Pipeline in .NET Using GitHub Models

Retrieval‑Augmented Generation (RAG) has quickly become one of the most practical ways to build AI systems that work with real‑world data instead of relying only on a model’s training knowledge.

Most RAG examples today are written in Python and often depend on frameworks like LangChain. While powerful, this leaves a gap for developers working primarily in .NET and C#.

In this post, I’ll walk through a .NET Console application I built that implements a complete RAG pipeline using GitHub Models for embeddings and chat completion.

What This Project Does

At a high level, this application allows you to:

  1. Provide a website URL
  2. Automatically scrape and clean the page content
  3. Convert the content into embeddings
  4. Store embeddings in an in‑memory vector store
  5. Ask questions about the website
  6. Get accurate, context‑aware answers generated by an AI model

All of this runs inside a .NET Console App, making it easy to debug, extend, and later expose as an API.

Architecture Overview

The application follows a simple but effective RAG architecture:

[Diagram: Project RAG architecture]

Each step is implemented as a separate service, making the system easy to test and reuse.

How the RAG Pipeline Works

  1. Web Scraping

    The application fetches the website HTML and extracts visible content from tags such as p, h1–h6, li, span, and div.

    Script and style tags are removed to reduce noise.
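
The extraction step above can be sketched as follows. The post doesn't name the HTML parser used, so this assumes HtmlAgilityPack (a common NuGet choice for .NET scraping); the sample HTML and variable names are illustrative.

```csharp
using System;
using System.Linq;
using HtmlAgilityPack; // NuGet: HtmlAgilityPack — assumed parser, not confirmed by the post

// A hardcoded page stands in for the fetched HTML (normally HttpClient.GetStringAsync(url)).
var html = "<html><body><script>var x=1;</script>" +
           "<h1>Title</h1><p>Hello world.</p><li>Item</li></body></html>";

var doc = new HtmlDocument();
doc.LoadHtml(html);

// Remove script/style tags first to reduce noise.
foreach (var noise in doc.DocumentNode.Descendants()
         .Where(n => n.Name == "script" || n.Name == "style").ToList())
    noise.Remove();

// Extract visible text from the content tags listed above.
var tags = new[] { "p", "h1", "h2", "h3", "h4", "h5", "h6", "li", "span", "div" };
var text = string.Join(" ",
    doc.DocumentNode.Descendants()
       .Where(n => tags.Contains(n.Name))
       .Select(n => n.InnerText.Trim())
       .Where(t => t.Length > 0));

Console.WriteLine(text); // "Title Hello world. Item"
```

Note that nested containers like div and span can duplicate text through InnerText; the real service presumably de-duplicates or restricts the tag set.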

  2. Text Chunking

    The cleaned text is split into overlapping chunks. This improves retrieval accuracy and mirrors how modern RAG systems operate. Example: a chunk size of 300 characters with an overlap of 50 characters.
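
A minimal chunker using those sizes might look like this (the method name is illustrative, not taken from the repo):

```csharp
using System;
using System.Collections.Generic;

// Split cleaned text into overlapping chunks (300-char chunks, 50-char overlap).
static List<string> ChunkText(string text, int chunkSize = 300, int overlap = 50)
{
    var chunks = new List<string>();
    if (string.IsNullOrEmpty(text)) return chunks;

    int step = chunkSize - overlap;                 // advance window by size minus overlap
    for (int start = 0; start < text.Length; start += step)
    {
        int length = Math.Min(chunkSize, text.Length - start);
        chunks.Add(text.Substring(start, length));
        if (start + length >= text.Length) break;   // final (possibly shorter) chunk
    }
    return chunks;
}

var chunks = ChunkText(new string('a', 700));
Console.WriteLine(chunks.Count); // windows start at 0, 250, 500 -> prints 3
```

The 50-character overlap means a sentence cut at a chunk boundary still appears whole in the neighboring chunk, which is what helps retrieval.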

  3. Embeddings with GitHub Models

    Each chunk is converted into a vector embedding using GitHub Models, allowing the system to perform semantic similarity searches.

    The same embedding model is used for:

    • Website content
    • User questions

    This ensures meaningful vector comparisons.
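
As a sketch, embedding a chunk through GitHub Models' OpenAI-compatible REST API could look like this. The endpoint URL and model name here are assumptions — verify them against the current GitHub Models documentation — and a GITHUB_TOKEN with models access is required.

```csharp
using System;
using System.Linq;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

var http = new HttpClient();
http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue(
    "Bearer", Environment.GetEnvironmentVariable("GITHUB_TOKEN"));

var payload = JsonSerializer.Serialize(new
{
    model = "text-embedding-3-small",   // assumed model name; reuse it for questions too
    input = new[] { "chunk text to embed" }
});

// Assumed GitHub Models inference endpoint (OpenAI-compatible).
var response = await http.PostAsync(
    "https://models.inference.ai.azure.com/embeddings",
    new StringContent(payload, Encoding.UTF8, "application/json"));
response.EnsureSuccessStatusCode();

using var json = JsonDocument.Parse(await response.Content.ReadAsStringAsync());
float[] vector = json.RootElement.GetProperty("data")[0].GetProperty("embedding")
    .EnumerateArray().Select(e => e.GetSingle()).ToArray();

Console.WriteLine($"Embedding dimension: {vector.Length}");
```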

  4. Vector Similarity Search

    When a question is asked:

    • The query is embedded
    • Cosine similarity is calculated against all stored vectors
    • The top K most relevant chunks are selected

    This acts as the knowledge retrieval step.
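
The retrieval step reduces to cosine similarity plus a sort. Here is a self-contained sketch with toy 2-D vectors standing in for real embeddings (the store shape and names are illustrative):

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

static double CosineSimilarity(float[] a, float[] b)
{
    double dot = 0, magA = 0, magB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot  += a[i] * b[i];
        magA += a[i] * a[i];
        magB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(magA) * Math.Sqrt(magB) + 1e-10); // epsilon avoids divide-by-zero
}

// Toy in-memory vector store: (chunk text, its embedding).
var store = new List<(string Chunk, float[] Vec)>
{
    ("about cats", new float[] { 1, 0 }),
    ("about dogs", new float[] { 0, 1 }),
    ("about both", new float[] { 1, 1 }),
};

float[] query = { 1, 0 };                        // the embedded question
var topK = store
    .Select(e => (e.Chunk, Score: CosineSimilarity(query, e.Vec)))
    .OrderByDescending(x => x.Score)
    .Take(2)                                     // K = 2
    .ToList();

foreach (var (chunk, score) in topK)
    Console.WriteLine($"{chunk} -> {score:F2}"); // highest-scoring chunks first
```

A linear scan like this is fine for a single scraped page; a dedicated vector database only becomes necessary at much larger scale.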

  5. Prompt Construction

    The final prompt sent to the chat model contains:

    • Retrieved context chunks
    • Conversation history
    • The user’s question

    This keeps responses accurate, grounded, and conversational.
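
Assembling those three parts is plain string building. A hypothetical helper (the section markers and wording are mine, not the repo's):

```csharp
using System;
using System.Collections.Generic;
using System.Text;

static string BuildPrompt(IReadOnlyList<string> contextChunks,
                          IReadOnlyList<string> history,
                          string question)
{
    var sb = new StringBuilder();
    sb.AppendLine("Answer using ONLY the context below. If the context does not " +
                  "contain the answer, say that clearly.");
    sb.AppendLine("--- Context ---");
    foreach (var chunk in contextChunks) sb.AppendLine(chunk);
    sb.AppendLine("--- Conversation history ---");
    foreach (var turn in history) sb.AppendLine(turn);
    sb.AppendLine("--- Question ---");
    sb.AppendLine(question);
    return sb.ToString();
}

var prompt = BuildPrompt(
    new[] { "Chunk about pricing.", "Chunk about features." },
    new[] { "User: hi", "Assistant: hello" },
    "What does the product cost?");

Console.WriteLine(prompt);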

  6. Chat Completion

    The prompt is sent to a GitHub Models chat model, which generates the final answer. If the context does not contain relevant information, the model is instructed to clearly say so.
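
The chat call mirrors the embeddings call. Again, the endpoint and model name are assumptions to check against the GitHub Models catalog; the grounded prompt from the previous step goes in as the user message:

```csharp
using System;
using System.Net.Http;
using System.Net.Http.Headers;
using System.Text;
using System.Text.Json;

var http = new HttpClient();
http.DefaultRequestHeaders.Authorization = new AuthenticationHeaderValue(
    "Bearer", Environment.GetEnvironmentVariable("GITHUB_TOKEN"));

var body = JsonSerializer.Serialize(new
{
    model = "gpt-4o-mini",   // assumed chat model name
    messages = new object[]
    {
        new { role = "system",
              content = "Answer only from the provided context. If it is missing, say so." },
        new { role = "user",
              content = "<retrieved context + history + question>" } // the built prompt
    }
});

// Assumed GitHub Models chat endpoint (OpenAI-compatible).
var resp = await http.PostAsync(
    "https://models.inference.ai.azure.com/chat/completions",
    new StringContent(body, Encoding.UTF8, "application/json"));
resp.EnsureSuccessStatusCode();

using var json = JsonDocument.Parse(await resp.Content.ReadAsStringAsync());
Console.WriteLine(json.RootElement
    .GetProperty("choices")[0].GetProperty("message")
    .GetProperty("content").GetString());
```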

Running the Application

dotnet restore
dotnet run

Console Commands

  • Enter a website URL to initialize the RAG pipeline
  • Ask questions about the website content
  • Type new to load another website
  • Type quit to exit
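
The command loop above can be sketched like this; IngestWebsiteAsync and AnswerAsync are hypothetical stand-ins for the real pipeline services, stubbed here so the sketch compiles on its own:

```csharp
using System;
using System.Threading.Tasks;

while (true)
{
    Console.Write("> ");
    var line = Console.ReadLine();
    if (line is null) break;                    // EOF ends the session
    var input = line.Trim();
    if (input.Length == 0) continue;

    if (input.Equals("quit", StringComparison.OrdinalIgnoreCase)) break;

    if (input.Equals("new", StringComparison.OrdinalIgnoreCase))
    {
        Console.Write("Website URL: ");
        await IngestWebsiteAsync(Console.ReadLine());
        continue;
    }

    Console.WriteLine(await AnswerAsync(input)); // answer questions about the site
}

// Hypothetical stubs — replace with the real scraping/embedding and RAG services.
static Task IngestWebsiteAsync(string? url)
    => Task.Run(() => Console.WriteLine($"(stub) ingesting {url}"));

static Task<string> AnswerAsync(string question)
    => Task.FromResult($"(stub) answer for: {question}");
```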

Check out the full source code on GitHub:


Happy coding!! 😊
