Scale-Up vs Scale-Out: Why Every Language Wins Somewhere
The 'which language is fastest' benchmark wars miss the real question. Rust, Go, Java, and Python aren't competing on the same axis. They're tuned for different scaling strategies — and picking the wrong one costs you years.
I worked with a team that rewrote a critical service from Go to Rust because “performance.” Six months later, the service was 30% faster, the team was miserable, and feature velocity had dropped to a crawl. Meanwhile the competitor team, still on Go, had shipped four new features.
We did the postmortem eventually. The service handled maybe 2,000 requests per second on a 4-core machine. CPU utilization sat around 20%. Rust’s extra speed bought us exactly nothing — the bottleneck was downstream database latency. What it cost us was every feature we didn’t ship while writing unsafe, fighting the borrow checker, and nursing the team through the learning curve.
That incident taught me the question I wish I’d learned earlier: what are you actually scaling, and does the language buy you the right kind of scale?
tl;dr — Language benchmarks optimize for one axis: per-request performance. Real systems have multiple axes — throughput, latency, concurrency, developer velocity, operational complexity, memory efficiency. Rust, Go, Java, Python aren’t competing to be “fastest.” They’re different answers to different bets about what you’re going to scale. Pick by fit, not by leaderboard.
The Two Kinds of Scale
At the top level, two strategies dominate:
- Scale-up: make one machine do more. Vertical scaling. Faster CPUs, more RAM, specialized hardware, lower per-operation cost.
- Scale-out: add more machines. Horizontal scaling. Cheaper commodity hardware, more concurrency, lots of work running in parallel.
These aren’t just infrastructure decisions. They’re reflected in the language and ecosystem you pick. A language optimized for scale-up (Rust, C++) has different priorities than one optimized for scale-out (Go, Elixir) or one optimized for neither but for developer leverage (Python, Ruby).
The big confusion comes from mixing axes. “Rust is faster than Go” is true on per-op microbenchmarks and irrelevant if your workload is I/O-bound service-to-service traffic. “Python is slow” is true in a compute-bound loop and irrelevant for a 500-QPS API that spends 95% of its time waiting on PostgreSQL.
Where Each Language Actually Wins
```mermaid
quadrantChart
    title Language fit by what you're scaling
    x-axis Scale-out (many machines / cheap concurrency) --> Scale-up (one machine, pushed hard)
    y-axis Prototype velocity --> Production rigor
    quadrant-1 "Scale-up + rigor (Rust · C++ · Zig)"
    quadrant-2 "Scale-out + rigor (Go · Java/Kotlin)"
    quadrant-3 "Scale-out + velocity (Python · Ruby · Node)"
    quadrant-4 "Scale-up + velocity (narrow niche)"
    Rust: [0.85, 0.85]
    "C++": [0.92, 0.88]
    Go: [0.25, 0.75]
    "Java/Kotlin": [0.30, 0.80]
    Python: [0.25, 0.25]
    Ruby: [0.25, 0.30]
    Node: [0.30, 0.35]
```
Rough positioning — not a benchmark, a fit map. The language you pick should live near the kind of scaling your system actually demands.
Rust / C++ / Zig — Scale-up champions
These languages dominate when per-machine throughput is the bottleneck and you can afford the engineering cost. That’s a narrower set of problems than Twitter would have you believe, but the problems that exist are real:
- High-frequency trading engines — microseconds matter, GC pauses are unacceptable, every cache line counts.
- Inference engines — llama.cpp, vLLM, mistral.rs. Memory layout, SIMD, custom kernels.
- Databases and storage engines — ScyllaDB, TiKV, FoundationDB internals. State machines that live forever and must not leak.
- Network data planes — Cloudflare’s Pingora, proxies at the edge.
- Game engines, audio/video encoding, embedded.
The pattern: one box, pushed hard, for years. Memory safety matters because bugs compound over time. Performance matters because throughput per core is the product.
The cost: every commit is slower. Refactoring is expensive. Onboarding is measured in months, not weeks. The compile times are what they are. You pay this cost every day the service exists.
Go — Scale-out champion
Go hits a specific sweet spot: cheap concurrency, predictable performance, fast-to-ship code, and easy to hire for. It’s a scale-out language.
- Thousands of goroutines per core, 2KB stacks, user-space context switching. The “cost of one more waiter” is nearly zero.
- Standard library is enough for 80% of backend work — HTTP server, JSON, SQL, crypto.
- Compilation is fast enough to stay in flow. Iteration loop feels similar to a dynamic language.
- Minimalism is aggressive. One person can read the whole language in a weekend. New hires are productive in days.
Where it loses: per-op performance. Go’s GC is fine but not invisible. Zero-copy generic code is harder to write than in Rust. The type system doesn’t prevent the entire class of bugs Rust’s does.
Go’s bet: the problem you’re most likely to have is “I need to handle 10x the concurrent work with 2x the code.” Not “I need this loop to be 5% faster.” For most backend services, that bet is right.
Java / Kotlin — Mature scale-out with runtime depth
The JVM is what you want when the workload is scale-out but you need runtime flexibility Go doesn’t give you:
- A mature JIT that optimizes hot paths beyond what AOT can.
- Rich profiling and monitoring (JFR, async-profiler) that makes post-deploy tuning feasible.
- A library ecosystem that, after 25 years, has a mature library for basically anything.
- Kotlin on top gives you modern syntax and coroutines without leaving the ecosystem.
Where it loses: startup time, memory overhead, operational complexity (GC tuning is a real job), the occasional “it works on my JDK 11 but the prod JDK 17 changed something.” Also: hiring is harder than Go now, at least in my corner of the industry.
Java’s bet: “you’ll still be running this service in ten years, and you want to be able to tune its runtime when that day comes.” For large enterprises with deep infrastructure, that bet pays off. For a startup shipping its first three services, the overhead is not worth it.
Python / Ruby — Developer-velocity champions
The forgotten-but-dominant answer: languages that optimize for neither scale-up nor scale-out, but for scaling the team.
- Fast to write, fast to read, fast to debug.
- Massive libraries for data, ML, scripting, DSLs.
- Easy to onboard anyone — CS students, data scientists, analysts.
- Prototype-to-production path is shorter than anywhere else.
Where they lose: per-core throughput, concurrency (the GIL is real), memory. Python and Ruby are not your language for a 100K QPS service.
But a lot of real companies don’t need a 100K QPS service. They need to get a thing working, put it in front of users, and iterate. If your current problem is “we need to ship the next feature this week,” Python might be the right answer even if a Rust version would technically run faster.
Python’s bet: throughput isn’t the constraint yet. Time-to-shipped-feature is. For most companies most of the time, that’s correct.
The Axes Nobody Talks About
Beyond scale-up/scale-out, a few axes decide more projects than raw performance.
Developer-velocity per week
“I can ship a feature and have it in production by Friday” beats “this service is 2x faster” most of the time. Measure it. If your current stack requires a two-day ceremony to deploy a one-line change, throughput is not your problem. Velocity is.
Operational complexity
Scale-up is operationally cheaper than scale-out. One machine, one process, one log. Scale-out gives you better redundancy but also distributed-systems problems — consistency, ordering, partial failure, chaos engineering. If your team is three people, the operational complexity of a 20-node scale-out cluster may eat more time than the language choice saves.
Memory efficiency per dollar
At cloud scale, memory is expensive. A Rust service that fits in 2GB where a Java service needs 8GB is a 4x savings on every instance. Multiply by thousands of instances and “per-op performance” stops being the interesting number — per-GB cost starts to matter.
Hiring pool
The language with the deepest talent pool in your market is usually the right answer for a new system, all else equal. A marginal technical improvement isn’t worth a six-month hiring pipeline.
Learning curve shape
Some languages have shallow onboarding (Go, Python) and a long tail of depth. Others have steep onboarding (Rust, Haskell) and you’re productive only after the ramp. For a senior team on a long-lived system, steep is fine. For a fast-moving team, steep is expensive.
The Pattern I See Repeated
A company starts small, picks Python or Ruby, builds the thing, ships to production. Ten employees. One codebase. Life is fast.
They grow to fifty engineers. The monolith cracks. Some services get rewritten in Go for concurrency and operational simplicity. A few performance-critical ones get written in Rust. Data infra sits on the JVM (Kafka, Spark, Flink). A few internal tools stay in Python because the team knows it and it works.
Five years in, the stack is polyglot. Nobody regrets it. What they regret is the six months they spent trying to make a single-language stack work past its comfort zone — the Python team pushing for “just async more things,” or the Rust team fighting the borrow checker on code that could have been Go, or the Java team explaining to a new hire why the stack trace is 400 lines long.
The pattern: pick the language that fits the service, not the service that fits the language.
How I Ask the Question Now
When someone proposes “let’s build this new thing in X,” I ask:
- What’s the expected traffic profile, and what’s the per-request work shape?
- Is this scale-up limited (per-machine throughput) or scale-out limited (concurrent work)?
- Who’s going to write this, and how fast do we need them productive?
- Who’s going to operate this, and what’s their tooling comfort?
- Does this interact with an existing ecosystem (JVM data platform, Rust security infra)?
- How long does it have to live?
The answer to those six questions usually lands me on one of three languages for 80% of systems I see: Go, Rust, or (for data-adjacent work) Kotlin on the JVM. Python still shows up for tools and glue. Everything else is contextual.
The benchmarks don’t help. Per-op microbenchmarks answer questions nobody is actually asking. The right question is which axes matter for this system, and which language’s bet lines up with those axes.
The Argument I’ve Stopped Having
I still see engineers argue about whether Rust or Go is “better.” Both are good languages. Both are bad choices for problems they weren’t designed for. The meaningful question is which kind of scale you’re paying for — and the honest answer is almost always a mix, evolving over time.
The Rust rewrite I opened with wasn’t a bad decision because Rust is a bad language. It was a bad decision because we weren’t scale-up limited. We were downstream-database limited. No language could help with that.
Know which scale you’re buying, and buy it on purpose.
Related
- Why Go Handles Millions of Connections: User-Space Context Switching, Explained — the design decision behind Go’s scale-out bet.
- Go’s Concurrency Is About Structure, Not Speed — what you actually get with Go, and what you don’t.
- NATS vs Kafka vs MQTT: Same Category, Very Different Jobs — applying the same fit-vs-benchmark thinking to messaging.