This week’s system design refresher:
Why Everyone Should Know About AI Evals: The Fundamentals Explained (Youtube video)
MCP vs Skills, Clearly Explained
5 Way to Defend Prompt Injection
How the X Algorithm Works
Why Everyone Should Know About AI Evals: The Fundamentals Explained
MCP vs Skills, Clearly Explained
Both MCP and Skills extend what an agent can do. But they solve different problems, and picking the wrong one adds cost or complexity you don't need.
The diagram breaks down the five dimensions that matter.

Integration: MCP is a client-server protocol that connects N agents to M backends through one interface. Agent Skills are folders with a SKILL. md that the agent loads on trigger.
Architecture: MCP runs as a separate process with its own runtime, speaking JSON-RPC. A Skill is just a directory: SKILL. md, optional scripts, references, and assets.
Invocation: MCP tools are called with typed parameters validated against a schema, and can be chained. Skills are invoked by the agent reading SKILL. md and running whatever commands it describes like bash, python, or curl.
Runtime: MCP servers often run in their own container or service. Skills run in the agent's own environment with no extra infra.
Where it fits: Use MCP to connect agents to live systems and data. Use Skills to give agents reusable know-how and instructions.
Over to you: What's the most interesting Skill you've come across recently?
5 Way to Defend Prompt Injection
Prompt injection tops the OWASP LLM Top 10 and there's no single fix.
Instead, you stack defenses, each one catching what the others miss.

Defenses come in two families: model-level and system-level.
Model-level defenses teach the model to resist injection.
Spotlighting wraps untrusted text in control tags like <UNTRUSTED>...</UNTRUSTED> and tells the model to treat anything inside as data, not instructions.
Instruction Hierarchy fine-tunes the model to rank the developer's system prompt above the user's message, and both above third-party content.
System-level defenses build a system around the LLM that bounds the damage.
Least-Privilege Tools: Give the agent the minimum tools it needs.
Human-in-the-Loop: Require explicit user approval before any sensitive action runs.
Planner / Executor Split: Two separate LLMs. The planner has tool access but never sees untrusted content. The executor reads untrusted content but has no tools.
No single defense is enough. Production systems like Gmail stack them, and together they make indirect injection manageable.
Over to you: what's the one defense you've seen work in production that isn't on this list?
How the X Algorithm Works

Here are the key steps:
Everything starts with a Feed Request.
The Home Mixer, the system’s orchestration layer, kicks things off by pulling your engagement history and preferences through Query Hydration.
Next, it gathers candidate posts from two sources: Thunder (posts from accounts you follow) and Phoenix Retrieval (posts from accounts you don’t follow, discovered through ML)
These candidates get enriched with metadata like author info and media details during Hydration, then pass through Filtering, which removes duplicates, old posts, blocked authors, and muted keywords.
Then comes scoring. A Grok-based transformer predicts engagement, a Weighted Scorer combines those predictions, and an Author Diversity Scorer prevents any single account from dominating your feed.
Top-scoring posts are selected, go through a final visibility filter, and become your Ranked Feed.
Over to you: What else will you add to the list of steps?
Disclaimer: This post is based on the publicly shared GitHub repo of the X algorithm by xAI
