Technically


What is RAG?

Retrieval Augmented Generation is a way to make AI models more personalized

Justin
Oct 07, 2024
∙ Paid

Subscribers have been asking me to write more about AI and how it works, and so I shall. This post — and others going into the mechanics of AI — is for paid subscribers only.

TL;DR

Retrieval Augmented Generation (RAG) is a way to make LLMs like GPT-4 more accurate and personalized to your specific data.

  • LLMs are powerful as hell, but they’re also generic: they’re trained on a huge swath of the public internet, not on your data

  • RAG helps you get more personalized responses by retrieving the relevant pieces of your data and including them in your model prompts

  • RAG relies on the model’s context window, which is how much data it can take in a single prompt

  • Today’s RAG pipelines are pretty complex and rely on embedding models and vector databases

Alongside old-school fine-tuning, RAG is becoming the standard way to get better, more personalized results out of state-of-the-art LLMs.
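
To make the bullets above a little more concrete, here is a minimal sketch of the retrieve-then-prompt loop in Python. Everything in it is an assumption for illustration, not how any particular product does it: `embed` is a toy word-count stand-in for a real embedding model, the `DOCS` list stands in for a vector database, `max_chars` is a rough character-based stand-in for a context window that is really measured in tokens, and the function names are made up for this example.

```python
import math
import re
from collections import Counter

# Hypothetical stopword list so filler words don't dominate the toy similarity.
STOPWORDS = {"a", "the", "is", "what", "for", "of", "on", "to", "are", "our"}

def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    words = re.findall(r"[a-z0-9]+", text.lower())
    return Counter(w for w in words if w not in STOPWORDS)

def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def retrieve(question: str, documents: list[str], k: int = 2) -> list[str]:
    """Return the k documents most similar to the question."""
    q = embed(question)
    ranked = sorted(documents, key=lambda d: cosine(q, embed(d)), reverse=True)
    return ranked[:k]

def build_prompt(question: str, documents: list[str], k: int = 2,
                 max_chars: int = 500) -> str:
    """Put retrieved documents into the prompt, stopping at a size budget."""
    context, used = [], 0
    for doc in retrieve(question, documents, k):
        if used + len(doc) > max_chars:  # crude proxy for the context window
            break
        context.append(doc)
        used += len(doc)
    joined = "\n".join(f"- {d}" for d in context)
    return f"Answer using only this context:\n{joined}\n\nQuestion: {question}"

# Made-up "your data" for the example.
DOCS = [
    "Our refund policy allows returns within 30 days of purchase.",
    "The office is closed on public holidays.",
    "A refund goes back to the original payment method.",
]
QUESTION = "What is the refund policy for a purchase?"

if __name__ == "__main__":
    # The resulting string is what you would send to the LLM.
    print(build_prompt(QUESTION, DOCS))
```

The two refund documents end up in the prompt and the office-hours one is left out, which is the whole idea: the model only sees the slice of your data that looks relevant to the question. Shrinking `max_chars` drops lower-ranked documents first, which is roughly the trade-off a small context window forces.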

Back to the future: training models

The funny thing about RAG is that the basic concept has been around for as long as machine learning has. Longtime readers will recall that back in the day, I studied Data Science in undergrad. “Old school” machine learning, before everyone was calling it AI, was entirely predicated on training a new model for every problem.

© 2025 Justin