HarrisonSec — Long-form writing on agent runtimes, distributed systems, and security

QuantumVault

Thu, 01 May 2025 00:00:00 +0000

QuantumVault: A Bootloader-Level Demo of Cold Storage Control

1. Project Motivation and Personal Backstory

QuantumVault isn’t the result of a corporate spec or a VC pitch deck. It started with something far simpler — a university course and a personal regret.

During an x86 assembly course at the University of the Fraser Valley (UFV), taught by Professor Talia Q, I was reintroduced to low-level system design. That meant working with BIOS interrupts, 16-bit memory segmentation, and even boot sector handoffs — things that most developers today will never touch.

StegoSafe

Thu, 08 May 2025 00:00:00 +0000

StegoSafe: Securing Your Bitcoin Keys Using Advanced Steganography

Executive Summary

StegoSafe represents a breakthrough in digital asset security, combining ancient steganographic principles with cutting-edge cryptography to create an innovative solution for protecting sensitive data, particularly cryptocurrency private keys. Unlike conventional security systems that rely on passwords, hardware devices, or centralized services, StegoSafe embeds encrypted fragments of private keys into ordinary image files, making them invisible to attackers while providing robust recovery mechanisms through distributed redundancy.

Secured VLAN

Fri, 16 May 2025 00:00:00 +0000

Secured VLAN: Enterprise-Grade Network Segmentation & Security

1. Project Overview & Core Objectives

Secured VLAN is a comprehensive blueprint for building a secure, scalable, and auditable enterprise network. This project demonstrates how to implement robust network segmentation, enforce least-privilege access, and centralize security controls using VLANs, ACLs, and firewalls. Designed for real-world deployment, it addresses challenges such as blurred internal-external boundaries, remote access, and evolving security threats.

Key Objectives:

Strict departmental isolation with VLANs
Enforced access policies using ACLs and firewalls
Centralized logging for compliance and future AI-driven analysis
High availability and resilience across all network layers
Scalable, modular design for future expansion

Figure 1: Example of a three-tier enterprise network topology with VLAN segmentation and DMZ.

I Tested Higgsfield's Minecraft 'Prompt-to-Build.' It Generates Shapes, Not Scenes.

Thu, 18 Jun 2026 08:00:00 +0000

Higgsfield shipped a Minecraft “prompt-to-build” feature: a mod that drops a “Supercomputer” block into your world, takes a free-text prompt, and generates a structure in-world a minute later. I spent one session putting real building prompts through it to see what it actually does, not what the landing page says it does. Eight prompts, fixed screenshots, an in-world walkthrough, and a scoring rubric.

The short version: it behaves like a single-cohesive-3D-form generator with strong canonical priors, not an architecture or scene engine.

Agent Architecture Is a Compute Allocation Problem: The Advisor Strategy, Cost-Curve Frame Recursed

Mon, 15 Jun 2026 08:00:00 +0000

In April 2026, Anthropic published a blog post called “The advisor strategy: Give agents an intelligence boost”, naming a pattern they had been A/B-testing in production: a cheaper model runs the agent loop end-to-end, an expensive model is consulted only when the cheap one hits a decision it can’t solve. They reported concrete numbers — Haiku + Opus advisor on BrowseComp at 41.2% (Haiku alone: 19.7%) at 15% of the cost of running Sonnet through the whole task.

Agent Retrieval Above the Crossover: A First-Principles Read of CodeGraph

Mon, 08 Jun 2026 08:00:00 +0000

The prior post in this series, Agent Retrieval Is a Cost Curve Problem, argued that a viable LLM-symbol-graph would need to satisfy six specific conditions — and that no existing tool had hit all six. The post went live on 2026-05-25; seven days earlier, CodeGraph had hit GitHub trending with exactly those six properties satisfied.

That’s the easy version of the update: framework predicted it, someone shipped it, here’s the existence proof. The companion piece (I Tested CodeGraph on Hono. The Tool-Call Savings Reproduce — the Cost Savings Don’t.) handles the empirical half — 40 verified-connected runs, a decision matrix, the install-or-not call. Short version of that post: the tool-call savings reproduce on an independent repo (−55%), the cost savings from the vendor benchmark don’t (+7% at Hono’s size). Fewer steps, not fewer dollars, until your repo is big enough.

I Tested CodeGraph on Hono. The Tool-Call Savings Reproduce — the Cost Savings Don't.

Mon, 01 Jun 2026 08:00:00 +0000

Two weeks ago CodeGraph hit GitHub trending — tree-sitter + SQLite/FTS5 + MCP for Claude Code, 19k+ stars in a week. The team published a benchmark on 7 repos showing 35% cheaper, 57% fewer tokens, 46% faster, 71% fewer tool calls vs. baseline.

Those are big numbers. They’re also numbers from a benchmark designed by the team that built the tool, on repos they chose. Designer bias is the #1 risk in any retrieval benchmark — when you pick the test repos and write the ground truth, you’ll consciously or unconsciously favor your own tool’s strengths.

Agent Memory Is a Cache Coherence Problem

Thu, 28 May 2026 08:00:00 +0000

This post is one half of a pair. The other half — Agent Retrieval Is a Cost Curve Problem — argues that Claude Code’s within-session code retrieval avoids RAG because the cost curve says it should. This piece argues something parallel about cross-session memory: the lossy auto-capture systems being marketed as “AI memory” are, in classical distributed-systems vocabulary, caches. They inherit every problem caches have always had, and the hype around them is mostly arguing for one side of a write-back vs write-through trade as if the other side didn’t exist.

Agent Retrieval Is a Cost Curve Problem: Why Claude Code Doesn't Use RAG

Mon, 25 May 2026 08:00:00 +0000

There’s a popular interview question making the rounds: “Why doesn’t Claude Code use RAG to retrieve code? Why grep?”

The popular answer goes: chunking breaks code structure, vectors approximate when code demands exact, indexes go stale, cold-start is slow, retrieval is a black box. All five are real. None of them are the reason.

They’re symptoms. The reason is older than RAG, older than LLMs, older than the term retrieval. It’s a cost curve.

Channels Aren't Message Passing — How Parked Goroutines OOM-Killed a Pod

Tue, 12 May 2026 08:00:00 +0000

It’s 3am. The Kafka consumer pod that’s been running cleanly for six weeks gets OOM-killed. Kubernetes restarts it. Five minutes later: OOM-killed again. Restart. OOM-killed a third time. By the fourth restart I’ve shelved the dashboard and started reading runtime/chan.go.

The code that died fit on one line:

events := make(chan Event)

I want to tell you that line is the bug. It isn’t. An unbuffered channel will happily backpressure a single producer: every send has to rendezvous with a receiver, so the producer cannot run ahead. The channel did exactly what it was designed to do.

How I Improved an AI Agent from 40% to 60% — With A/B Test Data

Tue, 12 May 2026 00:00:00 +0000

The Setup

I was optimizing an AI agent for a production system — a creator agent that handles user requests like “make this character fiercer” or “rename this entity.” The agent runs a 5-layer pipeline: Perceive → Cognate → Decide → Act → Express, with real LLM calls at each step.

Quality was bad. Not “it doesn’t work” bad — “it works 40% of the time” bad. The remaining 60% were wrong entity targeting, infinite reasoning loops, and silent failures.

Don't Pick One AI. Run Three Against Each Other.

Sun, 03 May 2026 00:00:00 +0000

The Problem Nobody Talks About

AI can write code, generate content, analyze data, design systems, and manage projects. It’s getting better every month. The natural question: what’s left for humans?

The wrong answer: “AI will replace us.” The other wrong answer: “AI is just a tool, nothing changes.”

The right answer is uncomfortable: stop picking the best AI. Run multiple AIs in competition, and become the judge.

The Tournament Model

Three rules, learned the hard way:

Node Turns Waiting Into Events. Go Moves Context Switching Into User Space.

Mon, 27 Apr 2026 08:00:00 +0000

Most discussions of TypeScript/Node vs Go concurrency stop at the surface: Node is async, Go is threaded. That framing isn’t wrong — it just isn’t deep enough to be useful when you’re picking a runtime, debugging a tail-latency problem, or explaining to your team why one of the services keeps falling over under CPU load.

The real difference is not async vs threaded. It’s a question about where, in the system, suspended work lives — and what shape it takes when it’s resumed.

Why Your AI Agent Keeps Failing — The 90% Problem

Mon, 20 Apr 2026 00:00:00 +0000

The model is roughly 10% of what makes an AI agent work in production. The other 90% — context engineering, memory, validation, tool-call reliability — is where every team’s agents quietly stop working at scale.

This video walks through the four-layer failure model and where each kind of bug actually lives.

The 90% AI Agent Problem

Sat, 18 Apr 2026 18:00:00 +0000

Episode Summary

Building an AI agent that works is easy. Building one that keeps working is where most teams fail.

This episode breaks down the hidden 90% of agent engineering: context management, memory, tool execution, state recovery, and loop closure. I use Claude Code as the reference point, compare it with more fragile agent designs, and show how production quality often comes from code around the model, not from model changes themselves.

The 90% AI Agent Problem (Podcast Episode)

Sat, 18 Apr 2026 00:00:00 +0000

Long-form podcast version of the 90% Problem thesis. Goes deeper than the 7-minute video on each of the four layers: context, memory, validation, tool-call reliability.

If you’re scoping AI agent reliability work, this is the layered mental model the readiness review uses.

The 90% Problem: Why Most AI Agents Are Still Broken

Fri, 17 Apr 2026 00:00:00 +0000

Your Agent Works Great. Until It Doesn’t.

You built an AI agent over the weekend. It calls tools, remembers context, follows instructions. You demo it to your team. Everyone’s impressed.

Monday morning, a user types “rename Ember to Infernia.” Your agent loops 15 times, burns through your API budget, and returns a response that doesn’t contain the word “Infernia.” A rename. One entity. One operation.

I’ve been there. I ran an eval suite on a production agent — 5 test cases, 5 runs each. Pass rate: 40%. Not on hard tasks. On things like “update the right character out of six” and “rename one entity.” The model was GPT-4 class. Plenty capable. The problem was everything around the model.

Claude Code + Codex Plugin: Two AI Brains, One Terminal

Tue, 07 Apr 2026 00:00:00 +0000

You’re debugging a gnarly race condition. Claude Code has been going at it for 10 minutes — reading files, forming theories, running tests. Then it hits a wall. Same hypothesis, same failed fix, third attempt.

What if you could call in a second brain — a completely different model with fresh eyes — without leaving your terminal?

That’s what the Codex plugin for Claude Code does. It puts OpenAI’s Codex (powered by GPT-5.4) inside your Claude Code session as a callable rescue agent. Two models. Two reasoning styles. One shared codebase.

Why Claude Code's Agent Loop Is 1,421 Lines

Mon, 06 Apr 2026 00:00:00 +0000

Every AI coding agent runs the same core pattern: send context to an LLM, get back text and tool calls, execute tools, feed results back, repeat. LLM talks, program walks.

Claude Code’s implementation lives in query.ts — a 1,729-line async generator where the while(true) loop spans from line 307 to line 1728. That’s 1,421 lines of production state machine logic handling context compression, streaming tool execution, error recovery, and token budget management for millions of users.

Claude Code Deep Dive Part 4: Why It Uses Markdown Files Instead of Vector DBs

Sun, 05 Apr 2026 00:00:00 +0000

This is Part 4 of our Claude Code Architecture Deep Dive series. Part 1: 5 Hidden Features | Part 2: The 1,421-Line While Loop | Part 3: Context Engineering — 5-Level Compression Pipeline

This article replaces and deepens our earlier analysis, Claude Code’s Memory Is Simpler Than You Think. The original focused on limitations. This one focuses on why — the first-principles tradeoffs behind every design choice.

The Core Principle: Only Record What Cannot Be Derived

This single constraint governs every decision in Claude Code’s memory system:

How Claude Code Compresses Context — The 5-Level Pipeline

Sat, 04 Apr 2026 00:00:00 +0000

This is Part 3 of our Claude Code Architecture Deep Dive series. Part 1: 5 Hidden Features | Part 2: The 1,421-Line While Loop | Part 4: Memory Tradeoffs

Why Context Engineering Is the Real Moat

Every AI agent has the same fundamental constraint: a fixed-size context window. Claude’s is now up to 1M tokens. That sounds massive — until you realize a real coding session can easily generate multiples of that. Dozens of file reads, hundreds of tool calls, thousands of lines of output.

Claude Code Deep Dive Part 2: The 1,421-Line While Loop That Runs Everything

Fri, 03 Apr 2026 00:00:00 +0000

This is Part 2 of our Claude Code Architecture Deep Dive series. Part 1: 5 Hidden Features | Part 3: Context Engineering | Part 4: Memory Tradeoffs

Why Claude Code's Agent Loop Is 1,421 Lines

Observability and Billing for AI API Calls: A T-Shaped Architecture

Wed, 01 Apr 2026 08:00:00 +0000

Adding AI API calls to an existing backend is where most teams’ observability and billing instincts break. The calls look similar to any other RPC: send a JSON request, receive a JSON response. The difference is what happens to the meter. An ordinary RPC costs you deterministic compute: a few milliseconds of CPU, a few KB of network. An LLM API call costs you between $0.0001 and $1.50 depending on which model, which provider, how long the prompt was, how long the completion went, and whether the provider’s prompt cache kicked in. Same endpoint, same code path, roughly four orders of magnitude of price variance per call.
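A small sketch of why the meter swings so widely. The `Price` and `Usage` types and every number in `main` are placeholders chosen for illustration, not any provider's price list. The sketch also assumes `InputTokens` includes the cached tokens, which varies by provider.

```go
package main

import "fmt"

// Price holds per-million-token prices in USD. Values used in main are
// illustrative placeholders, not real provider prices.
type Price struct {
	InputPerM, CachedInputPerM, OutputPerM float64
}

// Usage is the token accounting assumed to come back with each response.
// Assumption: InputTokens includes CachedInputTokens.
type Usage struct {
	InputTokens, CachedInputTokens, OutputTokens int
}

// Cost returns the USD cost of a single call.
func Cost(p Price, u Usage) float64 {
	uncached := u.InputTokens - u.CachedInputTokens
	return (float64(uncached)*p.InputPerM +
		float64(u.CachedInputTokens)*p.CachedInputPerM +
		float64(u.OutputTokens)*p.OutputPerM) / 1e6
}

func main() {
	small := Cost(Price{InputPerM: 1, CachedInputPerM: 0.1, OutputPerM: 5},
		Usage{InputTokens: 100, OutputTokens: 20})
	large := Cost(Price{InputPerM: 15, CachedInputPerM: 1.5, OutputPerM: 75},
		Usage{InputTokens: 50000, OutputTokens: 10000})
	fmt.Printf("small call: $%.4f, large call: $%.2f\n", small, large)
}
```

The same function, called from the same code path, returns values thousands of times apart, which is why per-call cost has to be recorded from actual usage rather than estimated from request counts.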

Claude Code MEMORY.md Spec: The 4 Frontmatter Types Decoded (user / feedback / project / reference)

Wed, 01 Apr 2026 00:00:00 +0000

Updated: This analysis has been superseded by Part 4: Why It Uses Markdown Files Instead of Vector DBs — a deeper first-principles tradeoff analysis. The original article below focused on limitations; Part 4 focuses on why those design choices were made.

The Hype vs. The Source Code

After Claude Code’s source leak, one of the most talked-about discoveries was Kairos — a “permanent memory” system that consolidates your notes while you sleep. The AI community described it as a breakthrough in AI memory.

Claude Code Source Leaked: Kairos, Undercover Mode, Ultraplan — 5 Hidden Features (510K Lines)

Tue, 31 Mar 2026 00:00:00 +0000

This is Part 1 of our Claude Code Architecture Deep Dive series. Part 2: The 1,421-Line While Loop | Part 3: Context Engineering | Part 4: Memory Tradeoffs

What Happened

Anthropic shipped Claude Code v2.1.88 to npm with a 60MB source map still attached. That single file contained 1,906 source files and 510,000 lines of fully readable TypeScript. No minification. No obfuscation. Just the raw codebase, sitting in a public registry for anyone to download.

The AI Stack Explained — Extended Podcast (22 min)

Sun, 29 Mar 2026 00:00:00 +0000

Same first-principles framework as the 15-minute video — LLM talks, program walks — but with deeper exploration of each layer and the seams between them.

For listeners who want the audio-first format.

The Complete AI Architecture Deep Dive — From LLM to Autonomous Agent (48 min)

Sun, 29 Mar 2026 00:00:00 +0000

The longest cut of the AI Stack material. Where the 15-minute video gives you the framework and the 22-minute podcast gives you the talking-points walkthrough, this version goes layer-by-layer through every concept — from how tokens stream out of the LLM to how a multi-agent orchestrator decides who acts next.

For viewers who want the full mental model in one sitting.

The Complete AI Architecture Deep Dive: From LLM to Autonomous Agent (48 min)

Sat, 28 Mar 2026 09:00:00 +0000

Episode Summary

This is the extended version of Episode 1. Same first-principles framework — LLM talks, program walks — but with deeper exploration of each layer, more examples, and discussion of real-world implications.

If Episode 1 is the executive summary, this is the full technical report.

What’s Different From Episode 1

Deeper exploration of tokenization and why it matters for costs
More examples of Function Calling in production systems
Detailed walkthrough of MCP server architecture
Real-world agent implementations (Claude Code, Cursor, Copilot)
Extended discussion on progressive disclosure and token economics
Edge cases and failure modes at each layer

Consistency in Distributed Systems: Scenarios, Trade-offs, and What Actually Works

Sat, 28 Mar 2026 08:00:00 +0000

There’s an impulse, when someone first learns about consistency models in distributed systems, to want to sort the taxonomy into neat drawers. Strong here. Eventual there. Linearizable above it. Read-your-writes below. Study the diagram, pass the interview.

That taxonomy is real, but it’s not useful the way people think. Production systems don’t pick a consistency model and run with it. They pick a different model per feature, often per type of operation within a feature, and spend most of their engineering effort on the gaps between what the model provides and what users actually expect. The taxonomy is the menu. The interesting question is which dish each scenario needs.

The AI Stack Explained: LLM Talks, Program Walks

Sat, 28 Mar 2026 08:00:00 +0000

LLM, Token, Agent — They're All the Same Thing. (AI Stack Explained)

Watch the full 15-minute video walkthrough with animations.

LLM. Token. Context. Prompt. Function Calling. MCP. Agent. Skill.

The AI Stack Explained: LLM, Token, Context, Function Calling, MCP, Agent, Skill — They're All the Same Thing

Sat, 28 Mar 2026 08:00:00 +0000

Episode Summary

LLM. Token. Context. Prompt. Function Calling. MCP. Agent. Skill — 8 concepts that confuse every engineer, until you realize they’re all the same thing.

An LLM can only output text. It can’t browse the web, call APIs, or take any action. The program around it does everything else. LLM talks, program walks. That loop is how the entire AI world runs.

What We Cover

LLM — A word prediction machine. No thinking, no understanding.
Context — A genius with no memory. The program stitches history every time.
Prompt — User Prompt + System Prompt. Just what you say to the LLM.
Function Calling — The LLM outputs JSON text. The program executes it.
MCP — USB-C for AI tools. Build once, run everywhere.
Agent — The talks-and-walks loop on repeat.
Skill — Pre-written rules with progressive disclosure.
Two Questions — The buzzword shield that makes any concept transparent.
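The Function Calling item above can be sketched as follows. The JSON shape (`name`, `arguments`), the `dispatch` helper, and the `rename_entity` tool are assumptions made for this example; real providers each define their own tool-call format.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// ToolCall is the shape assumed here for the JSON text the model emits.
type ToolCall struct {
	Name string            `json:"name"`
	Args map[string]string `json:"arguments"`
}

// handlers maps tool names to the program code that actually does the work.
var handlers = map[string]func(map[string]string) (string, error){
	"rename_entity": renameEntity,
}

// renameEntity is a hypothetical tool; a real one would touch a data store.
func renameEntity(args map[string]string) (string, error) {
	if args["from"] == "" || args["to"] == "" {
		return "", fmt.Errorf("rename_entity needs from and to")
	}
	return "renamed " + args["from"] + " to " + args["to"], nil
}

// dispatch parses the model's text as a tool call and runs the matching handler.
func dispatch(modelOutput string) (string, error) {
	var call ToolCall
	if err := json.Unmarshal([]byte(modelOutput), &call); err != nil {
		return "", fmt.Errorf("model output is not a valid tool call: %w", err)
	}
	h, ok := handlers[call.Name]
	if !ok {
		return "", fmt.Errorf("unknown tool %q", call.Name)
	}
	return h(call.Args)
}

func main() {
	out, err := dispatch(`{"name":"rename_entity","arguments":{"from":"Ember","to":"Infernia"}}`)
	fmt.Println(out, err)
}
```

The model only ever produced a string. Parsing, validation, and execution all happen in ordinary program code, which is also where malformed output and unknown tool names have to be handled.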

The AI Stack Explained: LLM Talks, Program Walks

Sat, 28 Mar 2026 00:00:00 +0000

LLM. Token. Context. Prompt. Function Calling. MCP. Agent. Skill — 8 concepts, 1 pattern.

This video peels back every layer of the AI stack from the bottom up, building a mental model that makes any new buzzword transparent.

gRPC Interceptors in Production: Design Patterns That Survive Real Load

Tue, 24 Mar 2026 08:00:00 +0000

gRPC interceptors are the middleware pattern, specialized for gRPC. If you’ve written HTTP middleware before, the shape is familiar: a function that wraps a call, can observe or modify the request, passes it to the next handler, then can observe or modify the response. The difference: gRPC’s type system makes the flavors (unary, server-stream, client-stream, bidi) explicit, and chain ordering matters more than most people realize.

Most online examples show a single toy interceptor. Production systems stack five to ten of them per service. Getting the composition right — ordering, concern separation, testability — is half of running a gRPC-based microservice well.

Go Generics, One Year In: Which Promises Held, Which Didn't

Wed, 18 Mar 2026 08:00:00 +0000

Go 1.18 shipped generics in March 2022. The two years before that were dominated by hopeful blog posts (“finally, a real type system!”) and the two years after by the predictable backlash (“why did we even bother, Go was simpler”). I’ve written production Go before and after. The honest answer is somewhere in the middle and closer to “useful for a narrower set of problems than we expected.”

This is a look back from someone who has shipped generic code in anger and reviewed a lot more of it. What held up. What didn’t. What habits to adopt and which to avoid.

Go Profiling in Anger: pprof, Escape Analysis, and Inlining Without Magic

Thu, 12 Mar 2026 08:00:00 +0000

Go’s performance culture has a ritual quality. “Use sync.Pool.” “Avoid interface boxing.” “Preallocate slices.” Copy-pasted from blog posts and applied without measurement. Sometimes helpful. Often hollow.

The honest answer is that Go performance work is mostly just profiling. Good profiling tells you what’s actually slow. Bad profiling — or no profiling — leaves you guessing. The toolchain that Go ships with is genuinely excellent; more engineers should use it, and fewer should follow checklist optimizations they haven’t measured.

sync.Pool in Go: When It Actually Helps, and When It Quietly Hurts

Thu, 05 Mar 2026 08:00:00 +0000

sync.Pool is one of those Go features that shows up prominently in “how to write fast Go” blog posts and then gets applied to everything. The result is a codebase sprinkled with pools that don’t help and sometimes hurt. Most Go code I review does not need sync.Pool. The code that does need it often uses it wrong.

This is a working engineer’s take on when pooling actually helps, when it’s wasted effort, and the specific traps it creates.
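As a baseline for that discussion, here is a minimal sketch of the usual `sync.Pool` pattern with a `bytes.Buffer`. The `render` function and `bufPool` variable are made up for this example. Whether pooling pays off in a given service is a question for a profiler, not for this snippet.

```go
package main

import (
	"bytes"
	"fmt"
	"sync"
)

// bufPool hands out reusable buffers; New runs only when the pool is empty.
var bufPool = sync.Pool{
	New: func() any { return new(bytes.Buffer) },
}

// render formats a greeting using a pooled buffer.
func render(name string) string {
	buf := bufPool.Get().(*bytes.Buffer)
	buf.Reset() // a pooled buffer may still hold data from its previous user
	defer bufPool.Put(buf)
	buf.WriteString("hello, ")
	buf.WriteString(name)
	return buf.String() // String copies the bytes, so the result stays valid after Put
}

func main() {
	fmt.Println(render("gopher"))
	fmt.Println(render("pool"))
}
```

Two of the common traps are visible even here: forgetting `Reset`, and returning something that still aliases the buffer's memory after `Put`.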

Why Failing Fast Triggers Cascading Failures in Distributed Systems

Wed, 04 Mar 2026 08:00:00 +0000

Episode Summary

Fail fast is widely accepted as a best practice in software engineering. But in distributed systems, blindly failing fast during infrastructure transitions — like Redis Sentinel failover, NATS leader election, or Kafka partition rebalancing — can turn a 12-second self-healing event into a 12-minute outage.

In this episode, we break down why this happens and walk through a concrete architectural pattern called the Failure Boundary Model that solves it.

Why Your "Fail-Fast" Strategy is Killing Your Distributed System (and How to Fix It)

Wed, 04 Mar 2026 08:00:00 +0000

It’s 2 AM. PagerDuty fires. Redis master is down. Your application, trained to fail fast, dutifully fails — every single request, all at once. By the time Sentinel promotes a new master 12 seconds later, you’ve already generated 40,000 errors and three escalation calls. The system recovered on its own. Your application didn’t let it.

This is the story of how “good engineering” can make a 12-second infrastructure event into a 12-minute outage — and how to design boundaries that prevent it.

RPC vs NATS: It's Not About Sync vs Async — It's About Who Owns Completion

Sat, 28 Feb 2026 08:00:00 +0000

A team I worked with once migrated an order-placement path from gRPC to NATS because “it’s decoupled and faster.” The old flow was simple: the web service called PlaceOrder via gRPC, got back an order ID, rendered success to the user. The new flow: web service publishes order.place to NATS, an order-service consumes it and processes asynchronously.

Within three weeks they had three kinds of incidents on rotation:

Duplicate orders — retry on the publisher side meant the same order was placed twice when the first publish actually succeeded but the ack was slow.
Lost orders — consumer crashed mid-process; no ack meant NATS redelivered, but the consumer had already partially committed state, so redelivery was rejected by a dedup check. The order just… disappeared from the user’s perspective.
Dark-failure support tickets — users reported “I clicked buy and nothing happened.” From the publisher side, everything looked fine. From the consumer side, processing time had drifted from 50 ms to 45 seconds because a downstream DB had a slow query, and the web team had no telemetry on the consumer side.

The retro landed on a single sentence: we thought we were changing the transport; we actually changed who owned the completion of the work.
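The duplicate-order incident in the list above is commonly mitigated with an idempotency key supplied by the publisher. The sketch below is one way to show the idea under stated assumptions: `OrderStore` is an in-memory stand-in for a table with a unique key column, and nothing here is taken from the team's actual fix.

```go
package main

import (
	"fmt"
	"sync"
)

// OrderStore is an in-memory stand-in for a table with a unique
// idempotency-key column.
type OrderStore struct {
	mu     sync.Mutex
	byKey  map[string]string
	nextID int
}

// NewOrderStore returns an empty store.
func NewOrderStore() *OrderStore {
	return &OrderStore{byKey: make(map[string]string)}
}

// PlaceOrder creates an order once per idempotency key. A retry with the
// same key returns the original order ID and created=false.
func (s *OrderStore) PlaceOrder(key string) (orderID string, created bool) {
	s.mu.Lock()
	defer s.mu.Unlock()
	if id, ok := s.byKey[key]; ok {
		return id, false
	}
	s.nextID++
	id := fmt.Sprintf("order-%d", s.nextID)
	s.byKey[key] = id
	return id, true
}

func main() {
	store := NewOrderStore()
	fmt.Println(store.PlaceOrder("checkout-abc")) // first attempt
	fmt.Println(store.PlaceOrder("checkout-abc")) // retry after a slow ack
}
```

The point of returning the original ID on a retry, rather than rejecting it, is that the publisher still learns the outcome. A dedup check that only rejects is how the "lost orders" case in the same list arises.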

NATS vs Kafka vs MQTT: Same Category, Very Different Jobs

Tue, 24 Feb 2026 08:00:00 +0000

The number of times I’ve watched a team pick a message system based on “Company X uses it” is depressing. Right behind it: the team that picks the one they already know, regardless of whether it fits the workload. NATS, Kafka, and MQTT get lumped together because they all pass messages between processes. That’s like lumping trucks, sedans, and motorbikes together because they all have wheels.

They are three different tools for three different shapes of problem. Once you know the axes that matter, the decision is usually easy.

Docker × Kubernetes: What They Really Changed (It's Not What You Think)

Sat, 21 Feb 2026 08:00:00 +0000

“A Docker container is basically a lightweight VM, right?” No. That sentence alone causes more architectural misunderstandings than any other in modern backend engineering. A VM virtualizes hardware. A container is a set of Linux kernel features (namespaces, cgroups, overlay filesystems) wrapped in a nicer CLI. Every container shares the host kernel, so a kernel bug is a shared attack surface. The marketing that says otherwise has cost teams real money in misconfigured production.

Scale-Up vs Scale-Out: Why Every Language Wins Somewhere

Fri, 20 Feb 2026 08:00:00 +0000

I worked with a team that rewrote a critical service from Go to Rust because “performance.” Six months later, the service was 30% faster, the team was miserable, and feature velocity had dropped to a crawl. Meanwhile the competitor team, still on Go, had shipped four new features.

We did the postmortem eventually. The service handled maybe 2,000 requests per second on a 4-core machine. CPU utilization sat around 20%. Rust’s extra speed bought us exactly nothing, because the bottleneck was downstream database latency. What it cost us was every feature we didn’t ship while writing unsafe blocks, fighting the borrow checker, and nursing the team through the learning curve.

Testing Real-World Go Backends Isn't What Many People Think

Wed, 18 Feb 2026 08:00:00 +0000

I’ve reviewed enough Go backend test suites to notice a pattern. The services with the most unit tests are often the ones with the most production incidents. Not because unit tests cause incidents — because the teams writing unit tests and calling it a day weren’t testing the things that actually broke.

Production bugs in distributed Go backends don’t usually look like “function computed wrong value.” They look like:

“The context deadline didn’t propagate into the background goroutine, so under load it leaked.”
“Two services agreed on the happy path, but the error-shape contract diverged six months ago, and now one returns codes.Unavailable where the other expects codes.ResourceExhausted.”
“The retry logic is race-y. With test-scale traffic it works; at 10x production it double-charges.”
“The database migration works on SQLite (our test DB) but not Postgres 15’s stricter planner.”

No unit test catches those. A different set of test shapes does.

Observability and Cost Attribution: Why One Pipeline Isn't Enough

Sun, 15 Feb 2026 08:00:00 +0000

A team I worked with tried to build their billing system on top of their tracing pipeline. The idea was clean: every operation already generates a span; spans already have duration and attributes; adding user_id and billable_units to each span lets finance query the trace store to compute invoices. One pipeline, less infrastructure. Beautiful.

Six weeks before the first billing cycle, the wheels came off. The tracing system was sampling at 10% because full-capture was too expensive. The sampler was head-based, meaning whether a trace got kept was decided at request entry, long before the code knew whether the request was billable. Some users got charged for 10% of their actual usage; others got free service. Nobody’s invoice agreed with the other team’s report.
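A small sketch of the mechanism. The hash-based `keepTrace` function is an assumption standing in for whatever sampler the team ran. The only property that matters is that the decision is made from the trace ID at request entry, before anything about billing is known.

```go
package main

import (
	"fmt"
	"hash/fnv"
)

// keepTrace is a head-based sampling decision: it sees only the trace ID.
// Hashing with FNV-1a is an illustrative choice, not a specific vendor's sampler.
func keepTrace(traceID string, percent uint32) bool {
	h := fnv.New32a()
	h.Write([]byte(traceID))
	return h.Sum32()%100 < percent
}

// billedFromTraces counts the billable requests whose traces survived sampling.
func billedFromTraces(user string, requests int, percent uint32) int {
	kept := 0
	for i := 0; i < requests; i++ {
		if keepTrace(fmt.Sprintf("%s-%d", user, i), percent) {
			kept++
		}
	}
	return kept
}

func main() {
	for _, user := range []string{"alice", "bob"} {
		fmt.Printf("%s: 1000 billable requests, %d visible in the trace store\n",
			user, billedFromTraces(user, 1000, 10))
	}
}
```

Each user made the same number of billable requests, but the trace store holds only a fraction of them, and the surviving fraction differs per user. Scaling the count back up by the sampling rate gives an estimate, which is fine for latency dashboards and unacceptable for invoices.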

Go Context in Distributed Systems: What Actually Works in Production

Fri, 13 Feb 2026 08:00:00 +0000

The bug was alive for three weeks. On a normal day it cost nothing. On the day it activated, it nearly took the service down.

The pattern was simple. An HTTP handler had to fetch data from three downstream gRPC services and merge the results. The team had done the disciplined thing: set a 5-second deadline on the request context, propagate it all the way through to the handler, use errgroup for parallelism. Except — and you’ve probably seen this one — the fan-out looked like this:

Go's Concurrency Is About Structure, Not Speed: chan and context as Lifecycle Primitives

Fri, 21 Nov 2025 08:00:00 +0000

For a while, I thought channels were Go’s way of doing message passing. Something like Erlang processes or actors, except with a simpler syntax. That understanding is fine if you’re writing tutorials. It is not fine when you’ve just OOM-killed a pod for the third time in an hour because your worker pool wasn’t really a pool.

The moment it clicked for me was during a production incident. A Kafka consumer service had been humming along for months at about 1,000 messages per second. Then an upstream team replayed twelve hours of events into the topic at once — roughly 1.2 million messages in two minutes.

IronSys: A Production Blueprint for Modern Concurrency

Wed, 22 Oct 2025 08:00:00 +0000

In the last post I walked through the four concurrency pillars — shared memory + locks, CSP, actors, STM — and argued that real systems mix them on purpose. Someone reasonably asked: okay, but what does that actually look like? Fair question. Abstract taxonomy is less useful than a worked example.

IronSys is that worked example. It’s a composite blueprint — not a real service, but representative of a class of services I’ve designed, helped design, or debugged in production. Let’s say it’s a mid-sized backend system: public API, stateful user sessions, streaming data in, aggregation and reporting out. The kind of thing that appears in the middle of any serious platform.

From Locks to Actors: The Four Pillars of Modern Concurrency

Mon, 20 Oct 2025 08:00:00 +0000

Most working engineers have spent ninety percent of their concurrent-programming life in one model: shared memory protected by locks. Threads that all see the same variables. Mutexes around the critical sections. Hope and care. It’s the model every OS textbook teaches, every mainstream language supports, and every senior engineer has a horror story about.

It’s also not the only option. Or even the best one, for many of the problems it gets used for. Three other models — CSP, actors, and software transactional memory — have been around for decades, mature enough for production, and each solves a class of problems that lock-based designs handle poorly.

Why Go Handles Millions of Connections: User-Space Context Switching, Explained

Mon, 13 Oct 2025 08:00:00 +0000

Somewhere around 40,000 concurrent connections, your Java service falls over. Not from CPU, not from network — from memory, because every connection is a thread and every thread wants its own megabyte of stack. By the time you’ve finished Googling whether this is a -Xss problem or a ulimit problem, Ops has already bumped the box to 64 GB and you’ve pushed the wall back another 20,000 connections. Linear in RAM. It never ends.

From Real Mode to Protected Mode: Building Custom GDT & IDT for x86 Security

Wed, 13 Aug 2025 00:00:00 +0000

The single most consequential transition in x86: 16-bit real mode → 32-bit protected mode. Once you cross it, you get segment privilege levels, a larger address space, and the foundation every OS layer above you takes for granted.

This video walks through the GDT and IDT setup from scratch, then performs the lgdt, sets the PE bit in CR0, and takes the far jump that puts the CPU in the new mode. Hand-coded, no OS underneath.

How to Build & Debug a Custom ARM64 Linux Kernel with Yocto, QEMU, and GDB

Tue, 05 Aug 2025 00:00:00 +0000

The full pipeline for someone who wants to debug ARM64 kernel code without sinking weeks into toolchain setup: Yocto for the build, QEMU as the target, GDB attached over the QEMU GDB stub. By the end you can break on any kernel function and inspect register state.

Companion to the CoreTracer + SentinelEdge projects — the same toolchain that makes those experiments reproducible.

eBPF on Linux: kprobe vs fentry (tracing) — Hooking Internals & Assembly Analysis

Mon, 04 Aug 2025 00:00:00 +0000

Two ways to attach an eBPF program to a kernel function: kprobe (old, INT3-based, more overhead) and fentry (newer, BPF trampoline, lower overhead). This video disassembles both attachment paths so you can see exactly where the hook intercepts the function and why fentry is the production-grade choice.

Foundation work for the SentinelEdge project — anything serious about kernel-level observability ends up here.

Rust vs C: Assembly Comparison — Ownership & Bounds Checking

Mon, 04 Aug 2025 00:00:00 +0000

Two languages, same hardware, different machine code. This video disassembles a small benchmark in both Rust and C and walks through what the compiler actually emits for ownership transfers, array indexing with bounds checks, and immutability.

If you’ve heard “Rust is zero-cost” without seeing the assembly, this is where the claim either holds up or doesn’t.

SentinelEdge: Advanced eBPF Kernel Security & Systems Architecture

Mon, 04 Aug 2025 00:00:00 +0000

A walkthrough of SentinelEdge — the project where I bring eBPF, kernel-level tracing, and production-style observability together into one framework. The video covers the architecture choices and how each component (loader, BPF programs, userspace consumer) fits.

Background work for the AI Operator track too: many of the same observability ideas transfer directly to monitoring LLM call graphs at production scale.

GitHub: SentinelEdge
Companion: eBPF kprobe vs fentry — hooking internals

Cache Miss, TLB Miss & False Sharing — The Performance Killers in 3 Minutes

Sun, 03 Aug 2025 00:00:00 +0000

Same code, different cache layout, 10× performance gap. This video runs three benchmarks back-to-back to show how cache misses, TLB misses, and false sharing each show up in perf counters — and why none of them are visible in a flame graph.

Part of the CoreTracer project: the kind of bottleneck that you can only debug after you know to look for it.

Store→Load Reordering: x86 vs ARM64 Real-World Test

Sun, 03 Aug 2025 00:00:00 +0000

Same C code, two architectures, two different observable behaviors. x86-64’s TSO model permits only one kind of reordering, a later load passing an earlier store that is still sitting in the store buffer; ARM64’s weaker model permits that and more. This video runs a minimal test that demonstrates the difference and prints the reorder count.

If you’ve written lock-free code that “worked on x86 but broke on ARM,” this is the underlying mechanism.

Blog post: store→load reordering — x86 vs ARM64 and what fences do

Debugging Real-Mode Bootloader in GDB (CS Changed, Symbols Broken)

Sat, 02 Aug 2025 00:00:00 +0000

When CS changes mid-execution (for example, on a far jump from your bootloader’s load address into your second-stage code), GDB’s symbol resolution silently breaks. You see addresses but no names, and every step looks like a walk into the void.

This video shows the workflow that keeps symbols valid across segment changes: explicit symbol-file loads at the new segment base, plus the add-symbol-file trick that nobody writes down in tutorials.

Debugging Ubuntu 6.8 x86_64 Kernel with GDB & QEMU — Disable KASLR Without Rebuild

Sat, 02 Aug 2025 00:00:00 +0000

The x86_64 counterpart to the ARM64 walkthrough. Build the Ubuntu 6.8 kernel from source with debug symbols, boot it under QEMU, attach GDB. The interesting part: disabling KASLR via a boot parameter, without recompiling, so your symbol addresses actually match the running kernel.

If you’ve ever had GDB show wildly wrong line numbers in kernel debugging, KASLR is almost certainly why.

Blog post: Ubuntu 6.8 kernel debug — the KASLR-mismatch lesson

How x86 Jumps REALLY Work: EFLAGS Truth with GDB + pwndbg

Fri, 11 Jul 2025 00:00:00 +0000

Real reverse engineers know: every x86 conditional jump is decided by EFLAGS bits — ZF, SF, CF, OF — not by what your source code looked like. This short demonstration uses GDB + pwndbg to make the flags visible while stepping through a hand-crafted instruction sequence.

If you’ve ever wondered why a jge fires when you expected a jl, the answer is always in the flag register.
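As a rough model of that rule, the sketch below computes the four flags the way a `cmp a, b` would (by evaluating a - b at a fixed width) and then applies the standard conditions: `jl` is taken when SF != OF, `jge` when SF == OF, and the unsigned `jb` when CF is set. It is written in Python purely for illustration; the helper names are hypothetical and it is not what GDB or pwndbg runs.

```python
def cmp_flags(a: int, b: int, bits: int = 32) -> dict:
    """Model the flags set by `cmp a, b`, i.e. by computing a - b at `bits` width."""
    mask = (1 << bits) - 1
    sign = 1 << (bits - 1)
    a &= mask
    b &= mask
    result = (a - b) & mask
    return {
        "ZF": int(result == 0),
        "SF": int(bool(result & sign)),
        "CF": int(a < b),  # unsigned borrow
        # signed overflow: operands differ in sign and the result's sign differs from a's
        "OF": int(bool((a ^ b) & (a ^ result) & sign)),
    }


def jl_taken(flags: dict) -> bool:
    """Signed less-than: SF != OF."""
    return flags["SF"] != flags["OF"]


def jge_taken(flags: dict) -> bool:
    """Signed greater-or-equal: SF == OF."""
    return flags["SF"] == flags["OF"]


def jb_taken(flags: dict) -> bool:
    """Unsigned below: CF set."""
    return flags["CF"] == 1


if __name__ == "__main__":
    for a, b in [(5, 3), (1, 2), (-1, 1), (-2**31, 1)]:
        f = cmp_flags(a, b)
        print(a, b, f, "jl" if jl_taken(f) else "jge", "jb" if jb_taken(f) else "jae")
```

The last two pairs are the instructive ones. Comparing -1 with 1 takes `jl` but not `jb`, because the same bit pattern is a huge unsigned value. Comparing INT_MIN with 1 leaves SF clear, and it is only the overflow flag that makes `jl` fire.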

Blog post: x86 jumps and EFLAGS — what GDB shows you

Rust vs C Assembly: Complete Performance and Safety Analysis - Panic or Segfault?

Tue, 08 Jul 2025 10:00:00 +0000

🦀 Rust vs C Assembly Performance Analysis: Panic or Segfault?

Complete Deep Dive into Memory Safety, Performance, and Compiler Behavior

🔥 When your array goes out of bounds, does your program crash with a segfault or gracefully panic?
The answer reveals everything about modern systems programming languages and their runtime behavior.

🚀 The Ultimate Systems Programming Showdown

In the arena of systems-level development, C reigns as the battle-tested veteran with decades of proven performance, while Rust emerges as the ambitious challenger promising memory safety without performance costs. Both languages can build operating systems, device drivers, and embedded firmware. But Rust boldly claims to be “memory-safe by default,” while C is often criticized as “the art of walking through a minefield.”

How to Build a Bootloader from Scratch — Just Assembly, No OS

Wed, 18 Jun 2025 00:00:00 +0000

This is not GRUB. It’s a hand-written Stage-1 bootloader in raw x86 assembly, booting directly from BIOS with no operating system and no standard library underneath it.

Part of the NanoBoot project — building up from BIOS POST to a working protected-mode jump, one register at a time.

Legacy Compatibility Lab: My Full Stack for Reviving Dead Software

Fri, 30 May 2025 00:00:00 +0000

“Fix software that shouldn’t be running anymore — and make it run again.”

Overview

This is my custom-built stack for analyzing, patching, and rebuilding legacy Windows software — covering everything from VC++6/Delphi apps on Windows XP to modern Win10/Win11 binaries that must remain compatible with Win7 or even XP.

Placed 2nd at BSides Vancouver 2025 Blue Team CTF – With Just ChatGPT and Stubbornness

Sun, 25 May 2025 00:00:00 +0000

Summary

I placed 2nd at the BSides Vancouver 2025 Blue Team CTF using nothing but ChatGPT and sheer persistence. This post covers how I tackled the competition with no team, no fancy toolkits, and a whole lot of stubborn curiosity. Here’s how it went.

HarrisonSec Lab Tour | A Journey Through Computing Evolution

Sat, 17 May 2025 00:00:00 +0000

About This Video

In this lab tour, I share my personal journey through computing history and security evolution. From my first encounter with Windows 98 to modern AI-driven security research, this is a story of passion, persistence, and technical mastery.

Need Expert Help with Security or Legacy Systems?

Facing security challenges with legacy systems or need advanced security consulting? I specialize in:

Legacy system security and modernization (Legacy Projects)
Advanced vulnerability research
Custom exploit development
Enterprise security architecture

Complete Secure Company Network Design

Mon, 12 May 2025 00:00:00 +0000

Video Presentation

Why This Video is Worth Your Time

This extensive tutorial presents a complete enterprise network security design and implementation using Cisco Packet Tracer. At over 3 hours long, it’s a comprehensive resource that’s particularly valuable for security professionals, network engineers, and IT students looking to understand how proper network segmentation works in practice.

What makes this video exceptional is that it demonstrates a complete working implementation of several critical security concepts including:

Buffer Overflow Attack Explained

Sat, 10 May 2025 00:00:00 +0000

Why This Video Is Worth Watching

This is one of the clearest and most intuitive explanations of buffer overflow attacks I’ve seen. Dr. Mike Pound not only covers the theoretical foundations but demonstrates the complete attack implementation process, from stack memory manipulation to successfully gaining root privileges on a Linux system. The step-by-step approach makes complex memory security concepts accessible to viewers with various technical backgrounds.

Technical Analysis

1. Buffer Overflow Fundamentals

Dr. Pound explains the essence of buffer overflows with exceptional clarity - when a program attempts to write data beyond pre-allocated memory space, the excess data overwrites adjacent memory regions. In security, this isn’t just a bug; it’s a vulnerability that can be weaponized to execute arbitrary code. The video demonstrates how writing past the end of a buffer can overwrite critical stack data, including the return address.
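The mechanism can be modeled without touching real memory. The sketch below is a toy, not the exploit from the video: it assumes a frame made of an 8-byte buffer followed directly by a 64-bit return address, and leaves out everything a real frame adds (saved frame pointer, alignment, stack canaries). All names and addresses are hypothetical.

```python
import struct

BUF_SIZE = 8  # toy buffer size (assumption)


def make_frame(return_address: int) -> bytearray:
    """Toy frame: BUF_SIZE bytes of buffer, then a little-endian 64-bit return address."""
    return bytearray(BUF_SIZE) + bytearray(struct.pack("<Q", return_address))


def unchecked_copy(frame: bytearray, data: bytes) -> None:
    """Like strcpy: writes every input byte starting at the buffer, with no length check."""
    for i, byte in enumerate(data):
        frame[i] = byte


def checked_copy(frame: bytearray, data: bytes) -> None:
    """The fix: refuse input that does not fit in the buffer."""
    if len(data) > BUF_SIZE:
        raise ValueError("input longer than buffer")
    frame[: len(data)] = data


def saved_return_address(frame: bytearray) -> int:
    """Read back the return address stored right after the buffer."""
    return struct.unpack_from("<Q", frame, BUF_SIZE)[0]


if __name__ == "__main__":
    frame = make_frame(0x401000)
    payload = b"A" * BUF_SIZE + struct.pack("<Q", 0xDEADBEEF)
    unchecked_copy(frame, payload)
    print(hex(saved_return_address(frame)))  # no longer the original address
```

Eight filler bytes use up the buffer, and the next eight land on the return address. That is the whole trick: whoever controls the input decides where the function returns to.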

LegacyOps™ | Expert Legacy Software Recovery & Security

Mon, 05 May 2025 00:00:00 +0000

LegacyOps™ — Fixing What Others Fear to Touch

Legacy systems are everywhere. They're running hospitals, factories, universities, and old ERP suites.
But nobody wants to maintain them — until they break.

