First-principles breakdowns of AI architecture — LLM, Function Calling, MCP, Agent, RAG and beyond. No hype. No buzzwords. Just how it actually works. For engineers building AI workflows. By Harrison Guo.
Subscribe
RSS Feed: /podcast/feed.xml — Add this to Apple Podcasts, Spotify, or any podcast app.
Building an AI agent that works is easy. Building one that doesn't break is 90% of the work. In this episode, I break down the five pillars of agent architecture, the LLM vs. Code divide, and how I improved a production agent from 40% to 60% using code changes alone.
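One concrete reading of "code changes alone": the reliability gains live in deterministic code wrapped around the model call, not in prompt tweaks. A minimal sketch of that idea, with hypothetical names (`call_llm`, `parse_report`) standing in for whatever the production agent actually uses:

```python
import json

def call_llm(prompt: str) -> str:
    # Stand-in for a real model call (hypothetical stub so the
    # sketch runs offline). The LLM can only ever return text.
    return '{"city": "Berlin", "temperature_c": 21}'

def parse_report(raw: str) -> dict:
    # Deterministic validation: code enforces the contract that
    # the prompt can only request.
    data = json.loads(raw)  # raises on malformed output
    if not isinstance(data.get("city"), str):
        raise ValueError("missing or invalid 'city'")
    if not isinstance(data.get("temperature_c"), (int, float)):
        raise ValueError("missing or invalid 'temperature_c'")
    return data

def robust_report(prompt: str, max_retries: int = 3) -> dict:
    # A retry loop is a pure code change: it can lift end-to-end
    # success without touching the model or the prompt.
    last_error: Exception | None = None
    for _ in range(max_retries):
        try:
            return parse_report(call_llm(prompt))
        except (json.JSONDecodeError, ValueError) as err:
            last_error = err  # record and retry instead of failing
    raise RuntimeError(f"output never validated: {last_error}")

print(robust_report("Report the weather in Berlin as JSON."))
```

Validation, retries, timeouts, and fallbacks all sit on the code side of the divide, which is why they can move a production metric without a single prompt edit.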
The extended 48-minute deep dive into every layer of the AI stack — tokenization costs, Function Calling in production, MCP server architecture, real-world agents (Claude Code, Cursor, Copilot), progressive disclosure, and token economics. For engineers who want the full picture.
A 22-minute first-principles breakdown of the entire AI stack. An LLM can only output text — the program does everything else. Learn how Function Calling, MCP, Agents, and Skills all follow one pattern: LLM talks, program walks.
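That pattern fits in a dozen lines. In this sketch (hypothetical stub and tool names; no real model API behind `call_llm`), the LLM's only act is emitting text, and the program parses that text and does the actual work:

```python
import json
from datetime import datetime, timezone

def call_llm(messages: list[dict]) -> str:
    # LLM talks: its entire output is a string. Here, a canned
    # tool-call request (hypothetical stub instead of a real model).
    return '{"tool": "get_time", "arguments": {"zone": "UTC"}}'

def get_time(zone: str) -> str:
    # Program walks: real behavior lives in ordinary code.
    # (zone is illustrative; this stub always reports UTC.)
    return datetime.now(timezone.utc).isoformat()

TOOLS = {"get_time": get_time}  # the program owns the dispatch table

def run_turn(messages: list[dict]) -> str:
    reply = call_llm(messages)           # text out of the model
    request = json.loads(reply)          # program parses the text
    tool = TOOLS[request["tool"]]        # program chooses what runs
    return tool(**request["arguments"])  # program executes the call

print(run_turn([{"role": "user", "content": "What time is it?"}]))
```

Function Calling, MCP, Agents, and Skills differ in where the tool list comes from and how results flow back, but the division of labor stays the same.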