LLMs Demystified

Understand how Large Language Models actually work

Why I Built This

When ChatGPT launched, I was like everyone else—completely blown away. But as a DevOps engineer, I have this annoying habit: I can't just use something without understanding how it works.

So I tried to learn. And hit a wall.

Every article threw around words like transformers, attention mechanisms, embeddings, and tokens as if I should already know what they mean. Papers were filled with equations that made my eyes glaze over.

And the explanations? "Neural networks learn patterns from data." Cool. But HOW? What does that actually mean?

I could use the API. I could prompt engineer my way through problems. But I had no idea what was happening inside that black box.

So I'm building this page. Not because I've figured it all out—but because teaching is how I learn. If you're frustrated like I was, let's figure this out together.

LLMs Are Not Magic. They're Prediction Machines.

Here's the uncomfortable truth that took me way too long to grasp: Large Language Models don't "understand" anything. They don't "think." They don't have opinions or consciousness.

An LLM is just a very sophisticated autocomplete. Given some text, it predicts what token comes next.

The Core Loop

1. 📝 You type text: 'The quick brown fox'

2. 🔢 The LLM converts it to numbers: Text → Tokens → Vectors

3. 🎯 It predicts the next token: statistically, 'jumps' is likely

Then it appends that token and repeats. Again. And again. That's it. That's the entire magic trick.
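The loop above can be sketched in a few lines of Python. The "model" here is a toy bigram frequency table built from a tiny made-up corpus, not a neural network; the point is the predict-append-repeat loop, which real LLMs share.

```python
from collections import Counter, defaultdict

# Toy "model": bigram counts from a tiny corpus (stands in for a real LLM).
corpus = ("the quick brown fox jumps over the lazy dog . "
          "the quick brown fox sleeps .").split()
bigrams = defaultdict(Counter)
for prev, nxt in zip(corpus, corpus[1:]):
    bigrams[prev][nxt] += 1

def predict_next(token):
    """Return the statistically most likely next token."""
    return bigrams[token].most_common(1)[0][0]

# The core loop: predict one token, append it, repeat.
tokens = "the quick brown fox".split()
for _ in range(3):
    tokens.append(predict_next(tokens[-1]))

print(" ".join(tokens))  # the quick brown fox jumps over the
```

A real LLM does the same thing, except "most likely next token" comes out of billions of learned parameters instead of a frequency table.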

The Pipeline We'll Explore

Input Block

  • Tokenization
  • Token Embeddings
  • Position Encoding

Processor

  • Attention
  • Feed-Forward
  • Layer Norm

Output Block

  • Prediction Head
  • Softmax
  • Sampling

We're starting with the Input Block—how text becomes numbers.
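A toy illustration of that conversion, using a made-up four-word vocabulary and a random embedding table. Real models use learned subword (BPE) vocabularies with ~50k entries and learned vectors with hundreds of dimensions; this sketch only shows the shape of the pipeline.

```python
import random

random.seed(0)

# Hypothetical word-level vocabulary; real LLMs use subword (BPE) vocabularies.
vocab = {"the": 0, "quick": 1, "brown": 2, "fox": 3}

# Embedding table: one vector per token id. Here 4-dimensional and random;
# in a real model these are learned parameters (e.g. 768-dimensional in GPT-2).
embedding_table = [[random.uniform(-1, 1) for _ in range(4)] for _ in vocab]

text = "the quick brown fox"
token_ids = [vocab[word] for word in text.split()]  # text  -> tokens
vectors = [embedding_table[i] for i in token_ids]   # tokens -> vectors

print(token_ids)        # [0, 1, 2, 3]
print(len(vectors[0]))  # 4 numbers per token
```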

BY THE END OF THIS SECTION, YOU'LL UNDERSTAND:

  • Tokenization: How "Hello" becomes [15496] and why that matters
  • Embeddings: How tokens become 768-dimensional vectors that capture meaning
  • Why "cat" and "dog" are closer than "cat" and "chair" in vector space

Ready to demystify the black box? Let's start with the first step: Tokenization.