Retrieval-Augmented Generation · GraphRAG · Hallucination Mitigation

RCM Knowledge Assistant

A true RAG system — not prompt-stuffing — over public revenue-cycle-management references. It retrieves from a vector index, augments with a domain knowledge graph, answers only from what it found with inline citations, and refuses when its sources don’t cover the question.

Public docs: 31 · Indexed chunks: 32 · Graph nodes/edges: 47/590 · Vector dim: 384

Grounded RAG · inline citations · confidence-gated

Ask me anything about revenue-cycle management — CARC/RARC codes, denials, appeals, prior authorization, the claim lifecycle. Every answer is grounded in retrieved public sources with inline [n] citations, and I refuse to answer when my sources don't cover it.

How it works

The retrieval pipeline

Each question runs the full guardrailed RAG loop below. Embeddings are computed locally with Xenova/all-MiniLM-L6-v2 (free, no embedding API); generation uses Groq llama-3.3-70b-versatile.

1. Guard

Requests are rate-limited by IP, then a prompt-injection filter blocks attempts to override the “answer only from context” contract or exfiltrate the system prompt.
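
A minimal sketch of this guard stage, assuming a fixed-window counter per IP and a small regex deny-list. The names (`RateLimiter`, `looksLikeInjection`, `guard`), the limits, and the patterns are illustrative assumptions, not the demo's actual rules.

```typescript
// Hypothetical guard stage: fixed-window rate limit per IP, then a regex deny-list.
// Limits and patterns are illustrative assumptions, not the demo's real values.

class RateLimiter {
  private windows: Record<string, { start: number; count: number }> = {};
  private maxRequests: number;
  private windowMs: number;

  constructor(maxRequests: number, windowMs: number) {
    this.maxRequests = maxRequests;
    this.windowMs = windowMs;
  }

  // Returns true if the request is allowed. `now` is injectable for testing.
  allow(ip: string, now: number = Date.now()): boolean {
    const w = this.windows[ip];
    if (w === undefined || now - w.start >= this.windowMs) {
      this.windows[ip] = { start: now, count: 1 }; // open a fresh window
      return true;
    }
    if (w.count >= this.maxRequests) return false;
    w.count += 1;
    return true;
  }
}

// Phrases that try to override the grounding contract or leak the system prompt.
const INJECTION_PATTERNS: RegExp[] = [
  /ignore (all |any )?(previous|prior|above) instructions/i,
  /(reveal|show|print|repeat) (me )?(your|the) system prompt/i,
  /you are now\b/i,
];

function looksLikeInjection(question: string): boolean {
  return INJECTION_PATTERNS.some((p) => p.test(question));
}

interface GuardResult {
  ok: boolean;
  reason?: "rate_limited" | "injection";
}

// Rate limit first, then the injection filter, matching the order in the text.
function guard(limiter: RateLimiter, ip: string, question: string, now?: number): GuardResult {
  if (!limiter.allow(ip, now)) return { ok: false, reason: "rate_limited" };
  if (looksLikeInjection(question)) return { ok: false, reason: "injection" };
  return { ok: true };
}
```

A deny-list like this only catches known phrasings, so it works as a first filter in front of the strict system prompt rather than a replacement for it.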

2. Embed

The query is embedded into the same 384-dimension MiniLM vector space as the corpus, locally and for free — no embedding-API cost or key.

3. Retrieve + gate

Chunks are ranked by cosine similarity to the query and the top k are kept. A confidence gate refuses when the best score is below 0.35, so an out-of-scope question gets a refusal instead of a hallucinated answer. This is the core hallucination-mitigation control.
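
The ranking and the gate can be sketched as follows. The 0.35 threshold comes from the text; the `Chunk` shape, the function names, and the brute-force scan over an in-memory array are assumptions made for illustration.

```typescript
// Hypothetical chunk record; the real index layout is not described in the source.
interface Chunk {
  id: string;
  text: string;
  vector: number[];
}

interface Scored {
  chunk: Chunk;
  score: number;
}

// Cosine similarity of two equal-length vectors; returns 0 for a zero vector.
function cosine(a: number[], b: number[]): number {
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

const CONFIDENCE_THRESHOLD = 0.35; // gate value stated in the text

interface RetrievalResult {
  refused: boolean;
  confidence: number; // best cosine score among the ranked chunks
  hits: Scored[];
}

// Rank every chunk, keep the top k, and refuse if the best score is below the gate.
function retrieve(query: number[], corpus: Chunk[], k: number): RetrievalResult {
  const ranked = corpus
    .map((chunk) => ({ chunk, score: cosine(query, chunk.vector) }))
    .sort((x, y) => y.score - x.score);
  const hits = ranked.slice(0, k);
  const confidence = hits.length > 0 ? hits[0].score : 0;
  if (confidence < CONFIDENCE_THRESHOLD) {
    return { refused: true, confidence, hits: [] };
  }
  return { refused: false, confidence, hits };
}
```

Gating on the single best score keeps the refusal decision independent of k: a question either has at least one sufficiently close chunk or it does not.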

4. Graph-augment

A domain knowledge graph (CARC/RARC codes → denial reasons → appeal steps) expands the vector hits with graph-adjacent chunks, surfacing related context pure vector search would miss.
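
One way to picture the expansion step is a one-hop walk from the nodes the vector hits are attached to. This is a sketch under stated assumptions: the `KnowledgeGraph` shape, the node ids, the one-hop limit, and the `maxExtra` cap are all hypothetical and not taken from the demo.

```typescript
// Hypothetical graph layout: adjacency lists plus a node-to-chunk index.
interface KnowledgeGraph {
  neighbors: Record<string, string[]>;    // node id -> adjacent node ids
  chunksByNode: Record<string, string[]>; // node id -> ids of chunks about that node
}

// Returns the vector hits followed by up to `maxExtra` graph-adjacent chunk ids.
function graphExpand(hitIds: string[], graph: KnowledgeGraph, maxExtra: number): string[] {
  // Seed nodes: every node that at least one vector hit is attached to.
  const seeds = Object.keys(graph.chunksByNode).filter((node) =>
    graph.chunksByNode[node].some((id) => hitIds.indexOf(id) !== -1)
  );

  const result = hitIds.slice(); // vector hits keep their rank order
  for (const seed of seeds) {
    for (const neighbor of graph.neighbors[seed] || []) {
      for (const chunkId of graph.chunksByNode[neighbor] || []) {
        if (result.length - hitIds.length >= maxExtra) return result;
        if (result.indexOf(chunkId) === -1) result.push(chunkId); // skip duplicates
      }
    }
  }
  return result;
}

// Toy graph following the code -> denial reason -> appeal step chain from the text.
const toyGraph: KnowledgeGraph = {
  neighbors: {
    "CARC-16": ["denial:missing-information"],
    "denial:missing-information": ["appeal:resubmit-corrected-claim"],
  },
  chunksByNode: {
    "CARC-16": ["c1"],
    "denial:missing-information": ["c2"],
    "appeal:resubmit-corrected-claim": ["c3"],
  },
};

const expanded = graphExpand(["c1"], toyGraph, 2); // c1 plus its one-hop neighbor chunk c2
```

In the toy graph, a hit on the chunk about the code pulls in the chunk about its denial reason even if that chunk scored poorly on cosine similarity. The appeal-step chunk is two hops away and stays out unless the walk is deepened.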

5. Ground & cite

A strict system prompt instructs the model to answer only from the numbered context and cite sources as [n]. After generation, the citation list is reconciled against the [n] markers that actually appear in the answer, so only sources the answer used are returned.
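
A sketch of the numbering and reconciliation, assuming citations are plain `[n]` markers that index into the context in order. The prompt wording, the `Source` shape, and the function names are hypothetical.

```typescript
// Hypothetical source record passed to the model as numbered context.
interface Source {
  title: string;
  url: string;
  text: string;
}

// Illustrative wording only; the demo's actual system prompt is not shown in the source.
const SYSTEM_PROMPT =
  "Answer ONLY from the numbered context. Cite each claim as [n]. " +
  "If the context does not cover the question, say so.";

// Builds the numbered context block that the [n] markers refer to.
function buildContext(sources: Source[]): string {
  return sources.map((s, i) => `[${i + 1}] ${s.title}\n${s.text}`).join("\n\n");
}

interface Citation {
  n: number;
  title: string;
  url: string;
}

// Keeps only markers that appear in the answer AND point at a real source,
// de-duplicated and sorted by number.
function reconcileCitations(answer: string, sources: Source[]): Citation[] {
  const used: number[] = [];
  const marker = /\[(\d+)\]/g;
  let match: RegExpExecArray | null;
  while ((match = marker.exec(answer)) !== null) {
    const n = parseInt(match[1], 10);
    const inRange = n >= 1 && n <= sources.length;
    if (inRange && used.indexOf(n) === -1) used.push(n);
  }
  return used
    .sort((a, b) => a - b)
    .map((n) => ({ n, title: sources[n - 1].title, url: sources[n - 1].url }));
}
```

Dropping out-of-range markers means a citation the model invented, such as `[7]` when only three sources were supplied, never reaches the sources panel.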

6. Return

The API returns { answer, citations, confidence }. The UI renders inline [n] chips, a sources panel with the real public URLs, and a confidence badge.
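
The response shape might be typed as below. Only the three top-level field names come from the text; the field types, the `Citation` members, the refusal wording, and the helper names are assumptions.

```typescript
// Hypothetical citation entry backing one inline [n] chip and one sources-panel row.
interface Citation {
  n: number;     // the number shown in the inline [n] chip
  title: string;
  url: string;   // public source URL
}

// Top-level fields named in the text: { answer, citations, confidence }.
interface AskResponse {
  answer: string;
  citations: Citation[];
  confidence: number; // assumed here to be the best retrieval score
}

// Illustrative wording; the demo's actual refusal message is not shown in the source.
const REFUSAL_TEXT = "My sources don't cover that question.";

// A refusal uses the same shape, with no citations, so the UI needs no special case.
function refusalResponse(confidence: number): AskResponse {
  return { answer: REFUSAL_TEXT, citations: [], confidence };
}

function groundedResponse(answer: string, citations: Citation[], confidence: number): AskResponse {
  return { answer, citations, confidence };
}
```

Returning a refusal in the same shape as a grounded answer lets the confidence badge render either way.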

Why this exists

A domain-grounded RAG proof of work

This is a portfolio demo for an AI Architect application. It targets the role’s exact asks — RAG and GraphRAG, retrieval guardrails, and hallucination mitigation — applied to the revenue-cycle domain. The guardrails are real: a confidence gate that refuses, an injection filter, citation reconciliation, rate limiting, and a deterministic extractive fallback so the system is grounded even without an LLM key. It is labeled honestly as graph-augmented retrieval, and built entirely on public standards (X12 / WPC / CMS).