Is Go just 'faster' than Node?

Per-request, no — they're often comparable on I/O-bound workloads. The real difference is what each model degrades into. Node's event loop is single-threaded for JS code, so a CPU-bound task blocks every other request on that process. Go's goroutines parallelize across cores naturally, so the same CPU spike just uses another core.

Is `await` in Node multithreaded?

No. `await` is syntactic sugar for a Promise state machine. The function suspends, the code after `await` is captured as a continuation, and the JS main thread returns to the event loop. When the Promise resolves, the continuation is queued as a microtask and runs on that same thread once the current callback finishes. One thread, many suspended states.

What's the actual unit of scheduling in each runtime?

Node schedules continuations — the captured tail of an async function after `await`. Go schedules goroutines — full call stacks the runtime can suspend at a blocking point and resume later. That single difference cascades into everything else.

Why does Go feel more 'low-level' than Node?

Go isn't closer to the hardware — it's closer to the scheduler. Go pulls the M:N scheduler, goroutine stacks, and netpoller into the language runtime. Node leaves the comparable pieces (libuv, V8, callback queue) loosely coupled in C++. Both are user-space, but Go's runtime owns the coordination.

When should I pick Node over Go?

Node is excellent when CPU work per request is light and the bottleneck is fan-out to slow downstream systems — BFFs, API gateways, websocket hubs, real-time aggregation, SSR. Pick Go when CPU work is sustained, when you need predictable parallelism across cores, or when the service is long-lived infrastructure (gateways, schedulers, message brokers, kubelets).

Node Turns Waiting Into Events. Go Moves Context Switching Into User Space.

Most discussions of Node vs Go concurrency stop at 'async vs threaded.' The real split is deeper — where does context switching happen, and what is the unit of scheduling?

April 27, 2026

Harrison Guo

10 min read

System Design Backend Engineering

Most discussions of TypeScript/Node vs Go concurrency stop at the surface: Node is async, Go is threaded. That framing isn’t wrong — it just isn’t deep enough to be useful when you’re picking a runtime, debugging a tail-latency problem, or explaining to your team why one of the services keeps falling over under CPU load.

The real difference is not async vs threaded. It’s a question about where, in the system, suspended work lives — and what shape it takes when it’s resumed.

tl;dr — Both Node and Go refuse to let the CPU sit idle while a request waits on I/O. They disagree on the unit of scheduling. Node’s unit is the continuation — the tail of an async function captured as a callback. Go’s unit is the goroutine — a full call stack the runtime can suspend and resume in user space. That single decision cascades into every other property of each runtime.

The Wrong Question

“Async vs threaded” is the wrong frame because it makes you think the choice is between paradigms. It isn’t. Both runtimes have already made the same fundamental decision: do not block an OS thread waiting for slow external work. The interesting choice is how they implement that.

The actually useful question is:

When a request is waiting for I/O — for a database, an HTTP call, a Redis round-trip, a file read — what does the CPU do, and where does the suspended state of that request live?

Once you frame it that way, Node and Go aren’t opposites. They’re two answers to the same question.

What Both Models Are Solving

Most production web services don’t bottleneck on CPU. They bottleneck on waiting. Database queries, Redis, downstream HTTP, message queues — every request spends most of its wall-clock time idle, holding a state somewhere, waiting for a packet to come back.

The naive blocking model — one OS thread per request, parked on a syscall until the response lands — collapses around a few thousand concurrent connections. Memory per thread, scheduler overhead, and kernel context-switch cost are all real. By 40,000 connections you’re out of RAM, not CPU.

So both Node and Go answer the same architectural question:

Don’t pin an expensive resource (an OS thread, a CPU core) to a request that’s just waiting.

The disagreement is about which resource gets freed up, and what mechanism is used to capture and later resume the work.

Node’s Answer: Turn Waiting Into an Event

Node’s model can be summarized in one line: the JS main thread only executes code that’s already ready to run.

Look at this:

const user = await db.getUser(id);
return user;

It reads as if the function is paused, blocking on the database. It isn’t. Here’s what actually happens:

1. db.getUser(id) starts the query (non-blocking I/O via libuv)
2. The async function suspends
3. Everything after `await` is captured as a continuation
4. The JS main thread returns to the event loop
5. The thread services other ready callbacks
6. When the DB response arrives, the Promise resolves
7. The continuation is queued
8. The event loop runs it on the main thread

The transformation is roughly:

const user = await db.getUser(id);
sendResponse(user);

becomes:

db.getUser(id).then(user => {
  sendResponse(user);
});

That .then(...) callback is the continuation. It is the unit of scheduling in Node.

The event loop is the dispatcher. It watches for I/O readiness and expired timers via libuv, and between callbacks V8 drains the microtask queue, where resolved Promises place their continuations. Whatever is ready gets pulled onto the JS thread. One thread can manage tens of thousands of concurrent connections, because at any given moment only a handful of them have work that’s actually ready.

So Node’s high concurrency isn’t one thread per request. It’s:

One thread managing a large set of suspended I/O states, each represented as a continuation waiting on an event.

This is event-driven concurrency in its precise sense — the runtime turns “waiting” into a registered event, and only resumes the captured continuation when the event fires.
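The dispatcher idea can be modeled in a few lines. The sketch below is a toy analogy written in Go, not how libuv or V8 are implemented: the `loop` type, its `after` method, and the simulated delays are all hypothetical names and values chosen for illustration. It shows the two properties described above: registering a wait returns immediately, and exactly one consumer ever runs the continuations.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// loop is a toy single-consumer event loop. Continuations are plain
// closures, and only the goroutine that calls run ever executes them.
type loop struct {
	ready   chan func()
	pending sync.WaitGroup
}

func newLoop() *loop {
	return &loop{ready: make(chan func(), 1024)}
}

// after simulates non-blocking I/O: it returns immediately and posts the
// continuation to the ready queue once the simulated wait has elapsed.
func (l *loop) after(d time.Duration, continuation func()) {
	l.pending.Add(1)
	time.AfterFunc(d, func() { l.ready <- continuation })
}

// run executes ready continuations one at a time until nothing is pending.
func (l *loop) run() {
	go func() {
		l.pending.Wait()
		close(l.ready)
	}()
	for k := range l.ready {
		k()
		l.pending.Done()
	}
}

// runDemo registers two simulated queries and reports completion order.
func runDemo() []string {
	l := newLoop()
	var order []string
	l.after(50*time.Millisecond, func() { order = append(order, "slow query") })
	l.after(5*time.Millisecond, func() { order = append(order, "fast query") })
	l.run()
	return order
}

func main() {
	fmt.Println(runDemo())
}
```

Because `order` is only ever touched from the loop's own goroutine, no lock is needed. That is the same property that lets Node code mutate shared state between `await` points without synchronization.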

The hard limit shows up the moment your code stops waiting. A single CPU-bound operation:

for (let i = 0; i < 1e10; i++) { /* heavy work */ }

…holds the JS main thread, and every other request on this process is dead until it returns. The event loop has nowhere else to run. Worker threads, child processes, or splitting CPU work into a separate service are real fixes, but they’re escape hatches — they exist because the core model has only one main thread executing JS.

Go’s Answer: Move Context Switching Into User Space

In Go, you write code that looks synchronous:

user := db.GetUser(id)
sendResponse(user)

There is no await. There is no callback. The function looks like it blocks on the database. And yet the program scales to hundreds of thousands of concurrent operations on modest hardware.

The trick is that the scheduling boundary has been moved. Where Node has the programmer mark the suspension point with await and the runtime captures a continuation, Go lets the programmer write straight-line code and has the runtime suspend the entire goroutine when it hits a blocking I/O call.

This is the central insight, and the cleanest one-line statement of Go’s concurrency model:

Go’s essence is the user-space-ification of context switching.

A goroutine isn’t an OS thread. It’s a small (initially 2 KB) stack and a register snapshot, managed by the Go runtime. The runtime maps a large number of goroutines (G) onto a small number of OS threads (M) using a set of scheduling contexts (P). This is the GMP model:

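G: a goroutine. The unit of scheduling. Cheap to create, cheap to suspend.
M: an OS thread. The runtime creates them as needed; only a few execute Go code at any moment, but the total can exceed GOMAXPROCS when threads are blocked in syscalls.
P: a scheduling context. There are exactly GOMAXPROCS of them. Each P owns a local queue of runnable goroutines, and an M must hold a P to run Go code.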

many G  →  Go scheduler  →  few M  →  CPU cores
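A minimal sketch of how these counts can be observed from inside a program, using only the standard `runtime` package. The helper `parkGoroutines` and the figure of 10,000 goroutines are assumptions made for illustration. `runtime.GOMAXPROCS(0)` reports the current setting without changing it, and `runtime.NumGoroutine` reports how many goroutines currently exist.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// parkGoroutines starts n goroutines that wait on a channel. It returns the
// goroutine count reported by the runtime while they are alive, plus a
// function that releases them. Hypothetical helper for illustration.
func parkGoroutines(n int) (int, func()) {
	release := make(chan struct{})
	var started, done sync.WaitGroup
	started.Add(n)
	done.Add(n)
	for i := 0; i < n; i++ {
		go func() {
			defer done.Done()
			started.Done()
			<-release // waiting here does not hold an OS thread
		}()
	}
	started.Wait()
	count := runtime.NumGoroutine()
	return count, func() {
		close(release)
		done.Wait()
	}
}

func main() {
	count, release := parkGoroutines(10000)
	defer release()
	fmt.Println("GOMAXPROCS:     ", runtime.GOMAXPROCS(0))
	fmt.Println("CPU cores:      ", runtime.NumCPU())
	fmt.Println("live goroutines:", count)
}
```

On a typical machine the goroutine count is in the thousands while GOMAXPROCS stays at the core count, which is the "many G, few running threads" shape in concrete numbers.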

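When a goroutine blocks on a channel operation, a mutex, or network I/O (which the runtime routes through its netpoller), the Go runtime:

1. Suspends the goroutine by saving its program counter and stack pointer; the stack itself stays where it is
2. Detaches it from the current M
3. Schedules another runnable goroutine onto that M
4. When the original goroutine’s wait completes, marks it runnable again
5. Some M, eventually, picks it up and resumes it from the suspension point

The switch itself (the suspend, the resume, the move from one goroutine to another on the same OS thread) happens without asking the kernel to schedule anything. There’s no clone(2), no kernel-mediated thread switch, no scheduler queue in kernel space. The kernel is still involved in the I/O: the netpoller learns about socket readiness through epoll, kqueue, or IOCP. A genuinely blocking syscall, such as a file read on Linux, is handled differently: the M blocks in the kernel along with its goroutine, and the runtime hands that M’s P to another thread so the remaining goroutines keep running.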

That’s the user-space-ification. The CPU still has to switch contexts when work shifts between goroutines, but the cost is roughly a function call plus a stack swap — not a kernel-mediated thread switch.

The programmer-visible consequence is that you write code that looks synchronous, and the runtime makes it concurrent.
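Here is a rough sketch of what that looks like in practice. `fetchUser` is a hypothetical stand-in for a blocking call such as `db.GetUser(id)`, and its 50 ms sleep is an assumed latency rather than a measurement. Each request gets its own goroutine and reads top to bottom with no callbacks.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// fetchUser stands in for a blocking call such as db.GetUser(id).
// The sleep parks the goroutine; it does not tie up an OS thread.
func fetchUser(id int) string {
	time.Sleep(50 * time.Millisecond)
	return fmt.Sprintf("user-%d", id)
}

// handleAll runs one goroutine per request and waits for all of them.
func handleAll(n int) []string {
	results := make([]string, n)
	var wg sync.WaitGroup
	for i := 0; i < n; i++ {
		wg.Add(1)
		go func(i int) {
			defer wg.Done()
			results[i] = fetchUser(i) // straight-line code, no await
		}(i)
	}
	wg.Wait()
	return results
}

func main() {
	start := time.Now()
	users := handleAll(10000)
	fmt.Println(len(users), "requests finished in", time.Since(start).Round(10*time.Millisecond))
}
```

Run sequentially, 10,000 calls at 50 ms each would take more than eight minutes. With one goroutine per call the waits overlap, so the total should land much closer to the latency of a single call, though the exact figure depends on the machine.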

The Unit of Scheduling

The cleanest way to compare the two models is to ask: what does each runtime actually schedule?

	Node / TypeScript	Go
Unit of scheduling	callback / Promise continuation	goroutine
What’s captured at suspension	the tail of an async function	the full call stack + registers
How code looks	explicit `async`/`await`	straight-line synchronous
Suspension marked by	the programmer (`await`)	the runtime (any blocking op)
Suspended work lives in	microtask and callback queues, with captured state on the JS heap	goroutine stack in user-space heap
Kernel involvement	epoll/kqueue/IOCP via libuv	epoll/kqueue/IOCP via netpoller
CPU parallelism	one main JS thread; needs workers/cluster for cores	M:N scheduler runs goroutines across cores natively
What breaks under CPU load	the entire event loop stalls	other goroutines keep running on other cores; latency rises only once every core is saturated

The two columns describe deeply different mental models, but both belong to the same family. They’re both user-space concurrency runtimes that avoid kernel thread-per-request. They differ in where the suspension is captured (the language vs. the call stack) and how broad the scheduler’s mandate is (events vs. execution).

Where the Boundaries Diverge: CPU-Bound Work

Node and Go look interchangeable on I/O-bound workloads. They diverge sharply the moment CPU work enters the picture.

Node’s event loop has one job: dispatch ready callbacks onto a single JS thread. If a callback runs for 200ms doing JSON parsing or hashing or anything CPU-bound, the loop is frozen for those 200ms. Every other suspended continuation has to wait. Throughput collapses.

Go’s runtime has a different mandate. It doesn’t only manage waiting — it also manages execution. If you spawn:

go task1()
go task2()
go task3()

…the scheduler is happy to put each goroutine on a different M, run them on different cores in true parallel, and preempt long-running goroutines (asynchronously since Go 1.14, so even a tight loop with no function calls gets interrupted) so they don’t starve the rest of the runtime. CPU-bound goroutines aren’t a special case to work around. They’re just goroutines.
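As a sketch of CPU-bound work spread across cores with the same primitive: the prime-counting workload, the function names, and the chunking scheme below are illustrative assumptions, chosen only because trial division is easy to verify and keeps a core busy.

```go
package main

import (
	"fmt"
	"runtime"
	"sync"
)

// countPrimes counts primes in [lo, hi) by trial division.
// It is deliberately CPU-bound and does no I/O.
func countPrimes(lo, hi int) int {
	count := 0
	for n := lo; n < hi; n++ {
		if n < 2 {
			continue
		}
		prime := true
		for d := 2; d*d <= n; d++ {
			if n%d == 0 {
				prime = false
				break
			}
		}
		if prime {
			count++
		}
	}
	return count
}

// parallelCountPrimes splits [0, limit) into one chunk per worker goroutine
// and sums the partial counts. workers must be at least 1.
func parallelCountPrimes(limit, workers int) int {
	partial := make([]int, workers)
	chunk := (limit + workers - 1) / workers
	var wg sync.WaitGroup
	for w := 0; w < workers; w++ {
		lo, hi := w*chunk, (w+1)*chunk
		if hi > limit {
			hi = limit
		}
		wg.Add(1)
		go func(w, lo, hi int) {
			defer wg.Done()
			partial[w] = countPrimes(lo, hi) // each worker owns one slot
		}(w, lo, hi)
	}
	wg.Wait()
	total := 0
	for _, c := range partial {
		total += c
	}
	return total
}

func main() {
	workers := runtime.GOMAXPROCS(0)
	fmt.Println("workers:", workers)
	fmt.Println("primes below 2,000,000:", parallelCountPrimes(2_000_000, workers))
}
```

Equal-width chunks are not equal work here, since larger numbers take longer to test, so this split leaves some speedup on the table. It is kept simple to show that nothing beyond `go` and a `sync.WaitGroup` is needed to use every core.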

That’s why Go’s concurrency model covers more ground:

Node’s model mainly solves non-CPU-bound concurrency — network I/O, database waits, downstream API calls. Go’s model solves I/O waiting and CPU parallelism with the same primitive.

This isn’t a knock on Node. The event loop is brilliant at what it’s designed for: lots of slow waits, light per-request CPU. It’s the natural shape of API gateways, BFFs, websocket hubs, real-time aggregation, and most of the JSON-shuffling that makes up modern web backends.

But the shape of the workload determines the shape of the runtime that fits. Sustained CPU work, mixed CPU + I/O pipelines, long-lived infrastructure services — those are workloads where Go’s scheduler-driven model has more headroom built in.

Two Answers to the Same Question

Strip away the implementation details and the two runtimes are answering the same question with different abstractions:

Concurrency at scale is the problem of what to do with the CPU while a request waits on I/O.

Node’s answer:

Turn the wait into an event
Capture the rest of the function as a continuation
Resume the continuation when the event fires
One thread cycling through ready continuations

Go’s answer:

Run the request on a goroutine
When it blocks, suspend the goroutine in user space
Schedule another runnable goroutine onto the OS thread
When the original wait completes, resume the goroutine

Two ways of solving the same waste. One state-machines it. The other lowers the cost of context switching far enough that you can afford to keep one execution flow per request.

Two answers to one question: one is events, implemented as a state machine. The other is low-cost user-space context switching.

Both refuse to let an OS thread sit blocked. They just disagree about whether the right unit of suspension is a continuation or a goroutine.

When to Pick Which

A few useful rules of thumb that fall out naturally from the model differences:

High-fanout, I/O-heavy backends (BFFs, GraphQL aggregators, websocket hubs, SSR, lightweight orchestration): Node is excellent. The event loop’s one weakness is heavy per-request CPU, so in workloads that almost never do any, the model is all upside. The single-thread limit doesn’t bite if no request holds the thread for long.
Long-lived infrastructure (gateways, message brokers, schedulers, control planes, node agents in the style of the kubelet): Go is a more natural fit. These services have sustained CPU work and predictable parallelism requirements, and they benefit from goroutine + channel idioms. It is not a coincidence that Kubernetes, etcd, NATS, Docker, and most of the cloud-native control plane are written in Go.
Mixed CPU + I/O pipelines (anything that does meaningful work per request, such as parsing, transformation, encoding, or light ML, on top of I/O): Go scales more gracefully. Node can do it, but you’ll end up with worker pools and message passing between threads or processes, which means hand-building part of what Go’s scheduler already provides, with worse ergonomics.
Fast iteration on web shapes, type sharing front-to-back, JSON-heavy product surfaces: Node/TypeScript wins on developer velocity. The runtime is good enough for the workload these products actually have.

Neither runtime is a replacement for the other. They live at different abstraction layers — Node packages an event-loop runtime, Go packages a userspace scheduler — and the question is which abstraction matches your workload’s shape.

The Closing Line

If you remember one thing from this:

Node turns waiting into events. Go turns execution flows into schedulable units. Both refuse to let the CPU sit idle while I/O blocks — they just disagree on what the unit of scheduling should be.

That’s the whole story. Everything else — await vs go, libuv vs netpoller, callback queue vs GMP, single-thread bottleneck vs CPU-bound resilience — falls out of that one disagreement.

Tags: golang go nodejs typescript concurrency event-loop goroutines scheduler context-switching user-space-scheduling backend-engineering distributed-systems
