gRPC & Protocol Buffers
A contract-first binary RPC framework — function calls that cross the network
gRPC is a binary remote-procedure-call framework that runs over HTTP/2 and serializes messages with Protocol Buffers, a schema-defined, tag-keyed binary wire format whose payloads are typically 3–10× smaller than the equivalent JSON and faster to parse.
- Transport: HTTP/2
- Encoding: Protobuf binary
- Payload vs JSON: ≈ 3–10× smaller
- Call types: 4 (unary + 3 streaming)
- Field key on wire: varint tag, not name
How gRPC and Protobuf work together
Calling a function on another machine should feel like calling one in your own process. That's the promise of a remote procedure call (RPC): you write balance = account.GetBalance(id) and the runtime quietly serializes the arguments, ships them across the network, runs the function on a remote server, and hands you back the result. gRPC, open-sourced by Google in 2015 and now a CNCF project, is the modern, high-performance take on this idea. Officially the name is a recursive acronym, "gRPC Remote Procedure Calls"; each release also picks a new word for the "g" for fun.
gRPC has two halves that you should keep mentally separate, because people conflate them constantly:
- Protocol Buffers (Protobuf): the data layer. A schema language (.proto files) plus a compact binary encoding for the messages you send.
- gRPC: the transport and dispatch layer. It defines service blocks in the same .proto file, generates client stubs and server skeletons, and moves Protobuf bytes over HTTP/2.
The whole thing is contract-first. You write one schema, run the protoc compiler, and it emits type-safe code in your language of choice — Go, Java, Python, C++, Rust, TypeScript, and a dozen more from one source of truth:
syntax = "proto3";

package bank.v1;

message BalanceRequest {
  string account_id = 1;  // field tag 1
}

message BalanceReply {
  int64 cents = 1;        // field tag 1
  string currency = 2;    // field tag 2
}

service Accounts {
  rpc GetBalance(BalanceRequest) returns (BalanceReply);
  rpc WatchBalance(BalanceRequest) returns (stream BalanceReply);
}
The numbers after each field (= 1, = 2) are the heart of the system. They are the field's permanent identity on the wire. The name account_id exists only in your source code and never travels across the network.
The wire format: tags, varints, and length prefixes
Every field in a Protobuf message is encoded as a key–value pair. The key packs the field tag and a 3-bit wire type into a single varint: key = (field_tag << 3) | wire_type. There are six wire types; the two you'll see most are 0 (varint — for ints, bools, enums) and 2 (length-delimited — for strings, bytes, and nested messages).
A varint (variable-length integer) stores a number using as few bytes as possible. Each byte gives 7 bits of payload; the high bit (the "continuation bit") is set if more bytes follow. So values 0–127 take one byte, 128–16,383 take two, and so on. Encode the integer 300:
300 in binary: 1 0010 1100
split into 7-bit groups (little-endian): 0101100 0000010
add continuation bit: 1 0101100 0 0000010
bytes: 0xAC 0x02
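The same steps can be sketched in a few lines of Python. This is a minimal illustration for unsigned values only, not the code a Protobuf library actually ships; the function names `encode_varint` and `decode_varint` are made up for this example.

```python
from typing import Tuple


def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as a base-128 varint (illustrative helper)."""
    if value < 0:
        raise ValueError("this sketch handles unsigned values only")
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)  # low 7 bits, continuation bit set
        value >>= 7
    out.append(value)  # last byte: continuation bit clear
    return bytes(out)


def decode_varint(buf: bytes, pos: int = 0) -> Tuple[int, int]:
    """Return (value, next_position) for the varint that starts at buf[pos]."""
    result = 0
    shift = 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift  # groups arrive least-significant first
        if not byte & 0x80:               # continuation bit clear: done
            return result, pos
        shift += 7


print(encode_varint(300).hex())             # ac02
print(decode_varint(bytes([0xAC, 0x02])))   # (300, 2)
```

The decoder returns the next read position alongside the value, which is what lets a caller walk a buffer field by field.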
Now encode the message BalanceReply { cents: 300, currency: "USD" }. Field 1 (cents, varint) → key (1<<3)|0 = 0x08, then the varint 0xAC 0x02. Field 2 (currency, length-delimited) → key (2<<3)|2 = 0x12, length 0x03, then the bytes U S D:
08 AC 02 12 03 55 53 44 // 8 bytes total
The equivalent JSON — {"cents":300,"currency":"USD"} — is 30 bytes. The decoder reads a varint key, looks up the tag in the generated schema to learn the field, reads the value by its wire type, and repeats until the buffer ends. Fields can arrive in any order; unknown tags are skipped by reading their wire type and stepping over the bytes. That skip-the-unknown behavior is exactly what makes Protobuf forward-compatible.
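The decode loop described above can be sketched as follows. This is a hand-rolled illustration, not generated code: `BALANCE_REPLY_FIELDS` is a hypothetical stand-in for the tag-to-name mapping that generated classes carry, and group wire types (3 and 4) are simply rejected.

```python
import json
from typing import Dict, Tuple

# Hypothetical stand-in for what generated code knows about BalanceReply.
BALANCE_REPLY_FIELDS = {1: "cents", 2: "currency"}


def read_varint(buf: bytes, pos: int) -> Tuple[int, int]:
    """Return (value, next_position) for the varint starting at buf[pos]."""
    result, shift = 0, 0
    while True:
        byte = buf[pos]
        pos += 1
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result, pos
        shift += 7


def decode_balance_reply(buf: bytes) -> Dict[str, object]:
    fields: Dict[str, object] = {}
    pos = 0
    while pos < len(buf):
        key, pos = read_varint(buf, pos)
        tag, wire_type = key >> 3, key & 0x07
        if wire_type == 0:        # varint
            value, pos = read_varint(buf, pos)
        elif wire_type == 2:      # length-delimited
            length, pos = read_varint(buf, pos)
            value = buf[pos:pos + length]
            pos += length
        elif wire_type == 1:      # fixed 64-bit
            value = buf[pos:pos + 8]
            pos += 8
        elif wire_type == 5:      # fixed 32-bit
            value = buf[pos:pos + 4]
            pos += 4
        else:
            raise ValueError(f"unsupported wire type {wire_type}")
        name = BALANCE_REPLY_FIELDS.get(tag)
        if name is None:
            continue              # unknown tag: its bytes were already stepped over
        if wire_type == 2 and name == "currency":
            value = value.decode("utf-8")
        fields[name] = value
    return fields


wire = bytes.fromhex("08ac021203555344")
print(decode_balance_reply(wire))                  # {'cents': 300, 'currency': 'USD'}
# Same message plus an unknown field 3 (key 0x18, varint 7): still decodes.
print(decode_balance_reply(wire + bytes.fromhex("1807")))
print(len(wire), len(json.dumps({"cents": 300, "currency": "USD"},
                                separators=(",", ":"))))   # 8 30
```

Note that the wire type alone tells the decoder how many bytes to skip, so it never needs the schema entry for a field it does not recognize.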
Parsing is a single linear pass — O(n) in the byte length of the message — with no string scanning, no quote-escaping, and no number-to-text conversion. That's where most of the speed comes from versus JSON, which must tokenize, un-escape strings, and parse decimal text into machine integers.
Why gRPC needs HTTP/2
gRPC doesn't invent its own transport — it leans entirely on HTTP/2, and the streaming call types only exist because of HTTP/2's design:
- Multiplexing. Many independent streams share one TCP connection. A bidirectional gRPC call is just an HTTP/2 stream where both endpoints send DATA frames whenever they like.
- Binary framing. HTTP/2 already speaks in length-prefixed binary frames, so adding a length prefix to each Protobuf message is natural: each gRPC message is a 1-byte compression flag, a 4-byte big-endian length, then the message bytes.
- Header compression (HPACK). Repeated metadata (auth tokens, content-type) is compressed across requests on the same connection.
- Trailers. gRPC reports the final status code in HTTP/2 trailers sent after the body — which is precisely why browsers, lacking trailer access, can't speak raw gRPC.
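The 5-byte message prefix from the binary-framing point is easy to reproduce with Python's `struct` module. The sketch below covers only that prefix (no HTTP/2 frames, headers, or trailers), and the helper names `frame_message` and `unframe_messages` are invented for illustration.

```python
import struct
from typing import List, Tuple


def frame_message(payload: bytes, compressed: bool = False) -> bytes:
    """Prefix a serialized message with a 1-byte flag and a 4-byte big-endian length."""
    return struct.pack(">BI", 1 if compressed else 0, len(payload)) + payload


def unframe_messages(data: bytes) -> List[Tuple[bool, bytes]]:
    """Split a byte string of back-to-back framed messages into (compressed, payload) pairs."""
    messages = []
    pos = 0
    while pos < len(data):
        flag, length = struct.unpack_from(">BI", data, pos)
        pos += 5                                  # 1 flag byte + 4 length bytes
        messages.append((bool(flag), data[pos:pos + length]))
        pos += length
    return messages


payload = bytes.fromhex("08ac021203555344")       # BalanceReply{cents: 300, currency: "USD"}
framed = frame_message(payload)
print(framed.hex())                               # 000000000808ac021203555344
print(unframe_messages(framed + framed) == [(False, payload), (False, payload)])  # True
```

Because every message carries its own length, a stream of responses can be concatenated on one HTTP/2 stream and split apart again without any delimiter.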
The four call types fall straight out of this:
| Call type | Request | Response | Typical use |
|---|---|---|---|
| Unary | one | one | ordinary method call: GetBalance |
| Server streaming | one | stream | tail a log, push price updates |
| Client streaming | stream | one | upload chunks, then get a summary |
| Bidirectional | stream | stream | chat, real-time telemetry |
When to choose gRPC (and when not to)
- Internal service-to-service traffic in a microservice mesh — this is the sweet spot. Low latency, strong contracts, and code generation in every language keep dozens of services in sync.
- Polyglot systems where a Go service must call a Python service must call a Java service. One .proto is the shared truth.
- High-throughput or streaming workloads (telemetry, video pipelines, ML inference) where payload size and parse cost actually move the bill.
- Strict, evolvable contracts where you want the compiler to catch a field-type change before it ships.
Reach for something else when: your API is public and consumed by browsers or third parties (use REST or GraphQL — humans can read JSON, can't read varints); you need ad-hoc, curl-friendly debugging; or you run on lossy mobile networks where HTTP/2's single-connection head-of-line blocking bites and HTTP/3 over QUIC would serve you better.
gRPC vs REST, GraphQL, and plain JSON-RPC
| | gRPC + Protobuf | REST + JSON | GraphQL | JSON-RPC |
|---|---|---|---|---|
| Wire format | binary (varint-keyed) | text (JSON) | text (JSON) | text (JSON) |
| Transport | HTTP/2 only | HTTP/1.1 or 2 | usually HTTP/1.1 | any (HTTP, TCP, pipe) |
| Typical payload size | baseline (1×) | ≈ 3–10× larger | ≈ 3–10× larger | ≈ 3–10× larger |
| Schema / contract | required (.proto) | optional (OpenAPI) | required (SDL) | optional |
| Code generation | first-class, all langs | via OpenAPI tooling | via codegen tooling | rare |
| Streaming | native (4 modes) | SSE / chunked hacks | subscriptions (WS) | none built in |
| Browser-native | no (needs gRPC-Web) | yes | yes | yes |
| Human-debuggable | no (needs reflection) | yes (curl, eyes) | yes | yes |
| Over-/under-fetching | fixed message shape | over-fetches common | client picks fields | fixed |
The headline trade is machine efficiency and type safety versus human ergonomics. gRPC's binary frames and generated stubs are ideal between your own services; REST and GraphQL keep their lead where a human or a browser is the consumer and where being able to curl the endpoint is worth more than shaving 200 bytes.
What the numbers actually say
- Payload size: 3–10× smaller. A representative 220-byte JSON object commonly lands at 30–60 bytes in Protobuf because field names disappear and integers are varint-packed. Over a million calls that's roughly 220 MB of JSON versus ~45 MB of Protobuf — about 175 MB of egress saved per million messages.
- Parse cost. Protobuf decode is a linear key/value scan with no string un-escaping; benchmarks routinely show serialize+deserialize 2–6× faster than the equivalent JSON round-trip for nested objects, and the gap widens with deeper nesting.
- Connection reuse. HTTP/2 multiplexing means one TCP+TLS handshake (about 2 RTT with TLS 1.3, 3 with TLS 1.2; often 50–150 ms on the open internet) is amortized across thousands of calls, instead of paying setup per request as naive HTTP/1.1 clients do.
- Field tags ≤ 15 cost one byte. The tag+wire-type key is a varint, so tags 1–15 fit in a single byte while tags 16–2047 need two. Protobuf's own guidance: reserve tags 1–15 for the fields you send most often. Costing it out, numbering a hot field as tag 16 adds one byte to every message that carries it.
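Two of these figures can be checked with a few lines of arithmetic. The sketch below computes how many bytes the key varint needs for a given tag, then redoes the egress estimate; the 45-byte Protobuf size is an assumed midpoint of the 30–60 byte range quoted above, and `key_size` is a name made up for this example.

```python
def key_size(tag: int, wire_type: int = 0) -> int:
    """Number of bytes needed for the varint key (tag << 3) | wire_type."""
    key = (tag << 3) | wire_type
    size = 1
    while key > 0x7F:   # each extra 7 bits of key costs one more byte
        key >>= 7
        size += 1
    return size


for tag in (1, 15, 16, 2047, 2048):
    print(tag, key_size(tag))
# 1 1
# 15 1
# 16 2
# 2047 2
# 2048 3

# Egress estimate per million messages (45 bytes is an assumed midpoint).
calls = 1_000_000
json_mb = calls * 220 / 1_000_000
proto_mb = calls * 45 / 1_000_000
print(json_mb, proto_mb, json_mb - proto_mb)   # 220.0 45.0 175.0
```

The jump at tag 16 happens because three of the first byte's seven payload bits are taken by the wire type, leaving four bits for the tag.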
JavaScript: encoding a Protobuf message by hand
You normally let generated code do this, but encoding a message by hand makes the wire format concrete. Here is a minimal encoder for the BalanceReply above:
// Append a varint (unsigned) to a byte array.
function writeVarint(bytes, value) {
  while (value > 0x7f) {
    bytes.push((value & 0x7f) | 0x80); // 7 bits + continuation bit
    value = Math.floor(value / 128);   // logical >>> 7, safe past 2^31
  }
  bytes.push(value & 0x7f);
}

// key = (tag << 3) | wireType
function writeKey(bytes, tag, wireType) {
  writeVarint(bytes, (tag << 3) | wireType);
}

function encodeBalanceReply({ cents, currency }) {
  const out = [];
  // field 1: cents, wire type 0 (varint)
  writeKey(out, 1, 0);
  writeVarint(out, cents);
  // field 2: currency, wire type 2 (length-delimited)
  writeKey(out, 2, 2);
  const utf8 = new TextEncoder().encode(currency);
  writeVarint(out, utf8.length);
  out.push(...utf8);
  return Uint8Array.from(out);
}

const buf = encodeBalanceReply({ cents: 300, currency: "USD" });
console.log([...buf].map(b => b.toString(16).padStart(2, "0")).join(" "));
// → 08 ac 02 12 03 55 53 44
Notice the encoder never writes the words "cents" or "currency". The reader recovers them from the schema by tag number, which is also why a schema whose tags have been renumbered or reused silently mislabels fields rather than raising an error.
Python: defining and serving a gRPC service
In real projects you run protoc (or grpcio-tools) on the .proto to generate bank_pb2.py (messages) and bank_pb2_grpc.py (service stubs), then implement the server:
import grpc
from concurrent import futures

import bank_pb2, bank_pb2_grpc


class Accounts(bank_pb2_grpc.AccountsServicer):
    def GetBalance(self, request, context):
        cents = lookup_balance(request.account_id)  # your DB call
        return bank_pb2.BalanceReply(cents=cents, currency="USD")

    def WatchBalance(self, request, context):  # server streaming
        for cents in balance_changes(request.account_id):
            yield bank_pb2.BalanceReply(cents=cents, currency="USD")


def serve():
    server = grpc.server(futures.ThreadPoolExecutor(max_workers=10))
    bank_pb2_grpc.add_AccountsServicer_to_server(Accounts(), server)
    server.add_insecure_port("[::]:50051")
    server.start()
    server.wait_for_termination()


# Client side: the stub turns a network call into a method call.
def fetch():
    with grpc.insecure_channel("localhost:50051") as channel:
        stub = bank_pb2_grpc.AccountsStub(channel)
        reply = stub.GetBalance(bank_pb2.BalanceRequest(account_id="acct-42"))
        print(reply.cents, reply.currency)  # 300 USD
The server returns a generator for the streaming method: each yield becomes one gRPC message (carried in one or more HTTP/2 DATA frames), and the client iterates the responses as they arrive. The stub.GetBalance(...) call looks local; under the hood it serializes the request, opens an HTTP/2 stream, and blocks for the reply.
Variants and the wider ecosystem
gRPC-Web. Browsers can't touch HTTP/2 frames or trailers, so gRPC-Web sends a slightly different framing that a proxy (Envoy, or grpc-web's bundled proxy) translates to/from real gRPC. It supports unary and server-streaming only — no client or bidirectional streaming.
Connect (connectrpc). A newer protocol from Buf that speaks gRPC, gRPC-Web, and a plain-HTTP/JSON mode from one server, so a browser can hit the same handler with a normal POST. It's gaining traction precisely because it removes the gRPC-Web proxy.
proto3 vs proto2. proto3 (released in 2016 and the usual choice for new schemas) dropped required fields and explicit defaults to make schema evolution safer; proto2 still exists where you need presence tracking, though proto3 later re-added optional for the same purpose.
Other serializers. Cap'n Proto and FlatBuffers go further — they're zero-copy: you read fields directly out of the buffer with no parse step at all, trading larger messages for sub-microsecond access. Apache Thrift is the older Facebook contemporary that bundles its own RPC and serialization.
gRPC reflection & grpcurl. Because binary frames aren't human-readable, servers can expose a reflection service so tools like grpcurl fetch the schema at runtime and let you call methods with JSON on the command line — the closest gRPC gets to curl.
Common bugs and edge cases
- Renumbering or reusing a field tag. The cardinal sin. Old data keyed by the old tag will be silently misread as the new field. Always reserve the tags (and names) of deleted fields so they can't be recycled.
- Confusing "field not set" with "field set to default." In proto3, scalar fields have no presence by default: a cents of 0 and an unset cents serialize identically (both omitted). Use optional or a wrapper type when you genuinely need to tell "zero" from "absent."
- Signed integers in plain int64. Negative numbers in a regular varint always take 10 bytes because of sign extension. Use sint32/sint64, which zig-zag encode so small magnitudes stay small.
- Assuming browser support. Wiring a frontend straight to a gRPC backend fails; you need gRPC-Web plus a proxy, or Connect.
- Head-of-line blocking on lossy links. All streams on a gRPC channel share one TCP connection, so a single lost packet stalls every multiplexed call until retransmission. On mobile or satellite links, HTTP/3 over QUIC sidesteps this.
- Unbounded message size. gRPC defaults to a 4 MB receive limit; larger messages fail with RESOURCE_EXHAUSTED unless you raise the limit or stream the payload in chunks.
- Treating deadlines as optional. gRPC has per-call deadlines that propagate across hops. Forgetting to set one means a slow downstream service can pin client threads indefinitely.
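The signed-integer pitfall is easy to see with a small experiment. The sketch below hand-encodes a negative value both ways; it assumes 64-bit fields, and the helper names (`encode_varint`, `encode_int64`, `zigzag64`, `encode_sint64`) are invented for this illustration rather than taken from any Protobuf library.

```python
MASK64 = 0xFFFFFFFFFFFFFFFF


def encode_varint(value: int) -> bytes:
    """Encode an unsigned integer (up to 64 bits) as a base-128 varint."""
    out = bytearray()
    while value > 0x7F:
        out.append((value & 0x7F) | 0x80)
        value >>= 7
    out.append(value)
    return bytes(out)


def encode_int64(value: int) -> bytes:
    """Plain int64: the two's-complement bit pattern is encoded as unsigned."""
    return encode_varint(value & MASK64)


def zigzag64(value: int) -> int:
    """Map 0, -1, 1, -2, 2, ... to 0, 1, 2, 3, 4, ... so small magnitudes stay small."""
    return ((value << 1) ^ (value >> 63)) & MASK64


def encode_sint64(value: int) -> bytes:
    return encode_varint(zigzag64(value))


for n in (-1, -300, 300):
    print(n, len(encode_int64(n)), len(encode_sint64(n)))
# -1 10 1
# -300 10 2
# 300 2 2
```

The trade-off shows in the last row: zig-zag spends one bit on the sign, so large positive values cross each 7-bit boundary slightly sooner than they would as plain int64.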
Frequently asked questions
What is the difference between gRPC and Protocol Buffers?
Protocol Buffers is the serialization format — a schema language plus a compact binary wire encoding for structured messages. gRPC is the RPC framework that uses Protobuf to define service methods and ships those messages over HTTP/2. You can use Protobuf without gRPC (as a storage or messaging format), but gRPC depends on Protobuf by default.
Why is Protobuf smaller than JSON?
Protobuf drops field names from the wire entirely: each field becomes a small integer tag instead of a quoted string like "customer_id". Integers are varint-encoded so small numbers take one byte, and there are no braces, quotes, or whitespace. A message that is 220 bytes of JSON often encodes to 30–60 bytes of Protobuf, roughly 3–10× smaller.
What are the four gRPC call types?
Unary (one request, one response, like a normal function call), server streaming (one request, a stream of responses), client streaming (a stream of requests, one response), and bidirectional streaming (both sides send independently on the same HTTP/2 stream). Streaming is possible because each call maps to its own HTTP/2 stream, and HTTP/2 multiplexes many streams on one TCP connection.
Why can't I call gRPC directly from a browser?
Browsers don't expose the raw HTTP/2 frames and trailers that gRPC relies on, so a browser fetch can't speak the gRPC wire protocol. The workaround is gRPC-Web, a variant that runs through a proxy (Envoy or grpc-web's own) which translates between the browser-friendly framing and real gRPC. gRPC-Web also can't do client or bidirectional streaming.
What does it mean that Protobuf fields are forward and backward compatible?
Because the wire format keys fields by number, not name, you can add a new optional field without breaking old readers — they skip the unknown tag — and old messages still parse in new code, which sees the missing field as its default. The rule is: never reuse or renumber an existing field tag, and reserve tags of deleted fields so they can't be recycled.
Is gRPC always faster than REST?
For service-to-service traffic on a fast network, gRPC usually wins on payload size, parse cost, and connection reuse — measured throughput gains of 2–10× are common. But on tiny payloads the difference shrinks, the binary format is hard to debug by eye, and HTTP/2 head-of-line blocking can hurt on lossy links where HTTP/3 over QUIC would do better. REST over JSON is still simpler for public, human-readable APIs.