Git's Wire Protocol

For a tool one types thirty times a day, Git is exceedingly reticent about what it does on the wire. git fetch blinks back a few lines about counting and resolving deltas; git push reports a tally and exits; and the conversation between client and server, which is the entire point of the exercise, passes invisibly behind a progress bar. I set out to follow what actually happens — to trace the bytes, in order, from the moment one presses Return — and the account that follows is what I found when I did.

The pitch, shorn of ornament, is this. When one fetches, the client and the server perform a brief negotiation about which commits each of them already has, and the server then ships, in a single delta-compressed stream, only the objects the client lacks. The protocol that carries this conversation dates from 2005; it is older than Git’s smart HTTP transport, older than most of the libraries one uses to speak to Git, and (what is more surprising) it has been retrofitted to run over HTTPS without altering its essential shape. It is a framed, line-oriented protocol, of the kind one might have written in 1995 and not been embarrassed about; and it has aged better than most of its contemporaries, for reasons I shall come to.

pkt-line: The Frame the Whole Thing Sits Inside

Before any of the higher conversation is intelligible, one must understand its unit. Git’s wire protocol is composed of pkt-lines, and a pkt-line is a four-character hexadecimal length prefix followed by the bytes the length describes. The length includes itself. That is the whole specification, and one can read it in a sentence; the trouble is that everything else is built on top of it, and reading a real Git transcript requires that one parse pkt-lines in one’s head with no greater effort than one parses sentences in English.

An example. The line

001e# service=git-upload-pack\n

is a pkt-line of total length 0x001e = 30 bytes. The four-byte prefix 001e plus the 26 bytes of payload (# service=git-upload-pack followed by a single \n) sum to exactly 30. One does not need to count; one need only trust the prefix.

There are three special pkt-lines, and a great deal of the protocol’s elegance depends on them. A line beginning 0000 is a flush packet — it has no payload, and its meaning is “I am done with this section.” A line beginning 0001 is a delim packet, introduced with Protocol v2, and it separates capability lines from arguments within a single command. A line beginning 0002 is a response-end packet, also a v2 addition, used to mark the end of a response in a stateless connection where several command-responses share a single transport. Lengths from 0004 (an empty payload, which the spec allows but discourages) up to fff0 (65520 bytes total, of which 65516 are payload) carry data. Everything else is reserved.
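The framing rules above are small enough to sketch in a few lines of Python. What follows is a minimal illustration rather than Git's own implementation: the names encode_pkt_line and decode_pkt_lines are mine, and the decoder assumes the whole stream is already in memory.

```python
from typing import Iterator, Tuple

# Names for the three special packets; the strings are illustrative labels.
_SPECIAL = {0: "flush", 1: "delim", 2: "response-end"}
MAX_PKT_LEN = 0xFFF0  # 65520 bytes, counting the four-byte prefix


def encode_pkt_line(payload: bytes) -> bytes:
    """Frame a payload: four hex digits of total length, then the payload."""
    total = len(payload) + 4  # the length counts its own four bytes
    if total > MAX_PKT_LEN:
        raise ValueError("payload too large for a single pkt-line")
    return b"%04x" % total + payload


def decode_pkt_lines(data: bytes) -> Iterator[Tuple[str, bytes]]:
    """Yield (kind, payload); kind is 'data' or the name of a special packet."""
    pos = 0
    while pos < len(data):
        prefix = data[pos:pos + 4]
        if len(prefix) < 4:
            raise ValueError("truncated length prefix")
        length = int(prefix.decode("ascii"), 16)
        if length in _SPECIAL:
            yield _SPECIAL[length], b""
            pos += 4
        elif length < 4 or length > MAX_PKT_LEN:
            raise ValueError("reserved or oversized length %r" % prefix)
        else:
            end = pos + length
            if end > len(data):
                raise ValueError("truncated payload")
            yield "data", data[pos + 4:end]
            pos = end
```

Feeding the service announcement through encode_pkt_line reproduces the 001e prefix from the example, which is a convenient check that the length really does include itself.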

A typical advertisement opens like this:

001e# service=git-upload-pack\n
0000
0148<sha1> HEAD\0multi_ack_detailed no-done side-band-64k thin-pack...\n
003d<sha1> refs/heads/main\n
0040<sha1> refs/heads/develop\n
0000

The first pkt-line is a service announcement. The first 0000 flushes that section. Then the server emits one pkt-line per advertised ref, with the very first ref carrying a NUL byte and, after it, the server’s capability list. The second 0000 ends the advertisement.

Two observations are worth lingering over. The first is that the capability list is wedged into the first ref’s pkt-line rather than given its own packet, a piece of cheerful 2005 thrift that has survived intact. The second is that the framing is trivial to walk. From any packet boundary, one need only read four bytes, parse a hex length, skip the remaining length-minus-four bytes of payload, and one is exactly at the start of the next pkt-line. There is no escaping to undo and no delimiter to hunt for. The price is that the framing is not self-synchronising: a reader dropped into the middle of a stream cannot find the next boundary by inspection, and must follow the lengths from a known starting point. Even so, this is the kind of property that does not announce itself on the day a protocol is designed, but which one is grateful for every time one writes a packet sniffer.
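To make the shape of the advertisement concrete, here is a small parser for it. It is a sketch under stated assumptions: parse_advertisement and _read_pkts are hypothetical names, the input is a complete v0/v1 advertisement held in memory, and anything other than data packets and flushes is rejected.

```python
from typing import Dict, Iterator, List, Optional, Tuple


def _read_pkts(data: bytes) -> Iterator[Optional[bytes]]:
    """Minimal pkt-line reader: yields payload bytes, or None for a flush."""
    pos = 0
    while pos < len(data):
        length = int(data[pos:pos + 4].decode("ascii"), 16)
        if length == 0:
            yield None
            pos += 4
        elif length < 4:
            raise ValueError("unexpected special packet in advertisement")
        else:
            yield data[pos + 4:pos + length]
            pos += length


def parse_advertisement(data: bytes) -> Tuple[Dict[str, str], List[str]]:
    """Return ({ref name: object id}, [capabilities])."""
    pkts = list(_read_pkts(data))
    # Smart HTTP prepends "# service=..." and a flush; skip them if present.
    if pkts and pkts[0] is not None and pkts[0].startswith(b"# service="):
        pkts = pkts[2:]
    refs: Dict[str, str] = {}
    caps: List[str] = []
    for payload in pkts:
        if payload is None:  # the closing flush ends the advertisement
            break
        line = payload.rstrip(b"\n")
        if b"\0" in line:  # only the first ref carries the capability list
            line, cap_blob = line.split(b"\0", 1)
            caps = cap_blob.decode("ascii").split()
        oid, name = line.decode("ascii").split(" ", 1)
        refs[name] = oid
    return refs, caps
```

The only special case is the NUL byte in the first ref line; everything after it is a space-separated capability list, and everything before it is an ordinary ref.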

The Two Halves of the Conversation: upload-pack and receive-pack

Git’s network operations come in two flavours, and the names are unfortunately back-to-front from the point of view of the user. git fetch and git clone invoke upload-pack on the server — the server is “uploading” the pack to the client. git push invokes receive-pack on the server — the server is “receiving” the pack from the client. The verbs are written from the server’s perspective. One either learns this and is forever calm, or one does not and is occasionally bewildered.

For the rest of this account I shall mostly follow the fetch path, because it is where the negotiation lives, and the negotiation is the most interesting thing about the protocol.

A fetch consists of three phases. First, the server advertises its refs and capabilities — it tells the client what it has. Second, the client and server negotiate which commits they have in common, so that the server need not ship history the client already possesses. Third, the server constructs a packfile containing exactly the objects the client lacks, and streams it back.

The first phase is straightforward. The third phase is mostly file-format work. The interesting phase, in the engineering sense, is the second.

The Negotiation

Consider the problem the negotiation is trying to solve. The client wants some refs — refs/heads/main, say — at a particular SHA-1. It has some history of its own, which may or may not overlap with the server’s. It would be cheap, and dim-witted, for the server simply to ship every commit reachable from the requested ref. It would be expensive, and equally dim-witted, for the client to enumerate every commit it has and ask the server to filter.

Git’s negotiation threads its way between these. The client emits a sequence of want lines (“I want commit X, and Y, and Z”) followed by a flush. It then emits a sequence of have lines (“I have commit A, and B, and C, …”), pausing periodically to let the server respond. The server, on receiving each batch of have lines, checks whether it holds any of those commits itself. For each one it does, the server emits an ACK for that commit and the negotiation has made progress.

The client’s job, at this point, is to walk its own commit graph cleverly. It does not send every commit it has — that would defeat the point. Instead, it sends in waves: first its branch tips, then their parents, then their grandparents, increasing the search radius until either the server acknowledges enough commits to determine a common ancestor, or the client exhausts its history and the server is obliged to send everything from the roots.

A simple fetch in flight looks something like this:

0067want <sha1-of-wanted-commit> multi_ack_detailed side-band-64k thin-pack...\n
0032want <sha1-of-another-wanted-commit>\n
0000
0032have <sha1-of-local-commit-1>\n
0032have <sha1-of-local-commit-2>\n
...
0009done\n

The server responds with ACK <sha1> for each have it recognises, or NAK\n if none of the current batch matched. When the client decides it has nothing more useful to offer — or the server announces it has found a sufficient common base — the client sends done, and the server commits to producing a packfile.
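The client's side of that exchange can be assembled mechanically. The sketch below builds the bytes of a v0/v1 upload-pack request; build_upload_request is a name of my own, and the function assumes 40-character hexadecimal SHA-1 object IDs and a caller who has already chosen the capabilities.

```python
from typing import Iterable, List, Sequence


def _pkt(line: str) -> bytes:
    """Frame one text line as a pkt-line (the length includes the prefix)."""
    data = line.encode("ascii")
    return b"%04x" % (len(data) + 4) + data


def build_upload_request(
    wants: Sequence[str],
    haves: Iterable[str],
    capabilities: Sequence[str] = (),
    done: bool = True,
) -> bytes:
    """Wants, a flush, haves, then either 'done' or a flush to keep talking."""
    if not wants:
        raise ValueError("at least one want is required")
    out: List[bytes] = []
    for i, oid in enumerate(wants):
        line = "want " + oid
        if i == 0 and capabilities:
            # Capabilities ride on the first want line only.
            line += " " + " ".join(capabilities)
        out.append(_pkt(line + "\n"))
    out.append(b"0000")
    for oid in haves:
        out.append(_pkt("have " + oid + "\n"))
    out.append(_pkt("done\n") if done else b"0000")
    return b"".join(out)
```

A bare want or have line with a SHA-1 always frames to 0032, which is why that prefix recurs so monotonously in real transcripts.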

There are three negotiation modes the protocol supports, and the rules differ subtly between them. In the oldest mode, with neither multi_ack capability negotiated, the server sends a single ACK <sha1> for the first common commit it finds and then says nothing more until the client sends done. In multi_ack mode, the server emits ACK <sha1> continue for each common commit it finds, and the client must decide on its own when to stop. In multi_ack_detailed, which is what almost any modern client uses, the server distinguishes between two kinds of ACK: common (we both have this commit) and ready (we have enough; you can send done whenever you like). The detailed mode lets the server hint at when the conversation should end without requiring it. The negotiation, in other words, is a polite exchange in which the server is permitted to say “you may stop now” but not “you must.”
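The server's replies are simple enough to classify with a few string comparisons. This is a minimal sketch: the Ack tuple, the parse_ack function and the status labels "nak" and "final" are my own naming, not terms from the Git documentation.

```python
from typing import NamedTuple, Optional


class Ack(NamedTuple):
    oid: Optional[str]  # None for a NAK
    status: str         # "nak", "final", "continue", "common" or "ready"


def parse_ack(payload: bytes) -> Ack:
    """Classify one server response line from the negotiation phase."""
    parts = payload.decode("ascii").strip().split(" ")
    if parts == ["NAK"]:
        return Ack(None, "nak")
    if parts[0] == "ACK" and len(parts) == 2:
        # A bare ACK: no multi_ack suffix follows the object id.
        return Ack(parts[1], "final")
    if parts[0] == "ACK" and len(parts) == 3 and parts[2] in (
        "continue", "common", "ready"
    ):
        return Ack(parts[1], parts[2])
    raise ValueError("unrecognised negotiation line: %r" % payload)
```

A client loop built on this would keep sending haves while it sees "common", and would feel free to send done once it sees "ready".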

The algorithm that drives the client’s choice of which haves to send is, in practice, a bounded walk over its own commit graph. Git starts from each local ref tip and walks parents newest first, in roughly reverse-chronological order, in batches of 32. The canonical client does not stop and wait between batches; it pipelines them, keeping 32 haves in flight at all times, so that when one batch ends in a flush, the next 32 are already on the wire by the time the server is ready to answer. If the server acknowledges enough of them (typically when it has found common commits beneath every wanted tip) the walk can stop. The client keeps a record of commits it has already visited, to avoid revisiting branch joins, and a set of common commits the server has acknowledged, used to prune the search. None of this is exposed to the user, and rightly so; but it is the difference between a fetch that ships fifty commits and one that ships fifty thousand.
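The walk described above might be sketched as follows. This is a simplified illustration, not Git's negotiator: the graph is a plain dictionary I have invented for the purpose, have_batches is a hypothetical name, and the pruning that a real client performs once the server acknowledges a commit is left out.

```python
import heapq
from typing import Dict, Iterable, Iterator, List, Set, Tuple

# Assumed in-memory graph: object id -> (commit time, parent object ids).
Graph = Dict[str, Tuple[int, List[str]]]


def have_batches(
    graph: Graph, tips: Iterable[str], batch_size: int = 32
) -> Iterator[List[str]]:
    """Yield batches of commits to offer as haves, newest first."""
    seen: Set[str] = set()
    heap: List[Tuple[int, str]] = []

    def push(oid: str) -> None:
        if oid not in seen:  # never offer the same commit twice
            seen.add(oid)
            heapq.heappush(heap, (-graph[oid][0], oid))  # max-heap on time

    for tip in tips:
        push(tip)
    batch: List[str] = []
    while heap:
        _, oid = heapq.heappop(heap)
        batch.append(oid)
        for parent in graph[oid][1]:
            push(parent)
        if len(batch) == batch_size:
            yield batch
            batch = []
    if batch:
        yield batch
```

Because it is a generator, the caller decides when to stop pulling batches, which is exactly where the server's ACKs would come into play.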

The engineering judgment in this design is, to my mind, the most admirable thing about Git’s wire protocol. The negotiation is iterative, not enumerative. The client never has to send its whole graph. The server never has to send its whole graph. Each round trip prunes the space the next round trip must explore, and the conversation converges, in the median case, in two or three round trips. There are pathological cases (branches with no common ancestor, brand-new clones), and the protocol handles them gracefully by falling through to “send everything reachable from the wants,” which is, after all, the worst it could possibly do.

The Packfile

Once the negotiation has determined a set of objects the client lacks, the server’s task is to deliver them. It does so by constructing a packfile — a single binary stream containing every needed blob, tree, commit, and tag, compressed with zlib and, where it helps, expressed as a delta against another object already in the stream.

The packfile format is brief enough to describe in one paragraph. A 12-byte header (PACK magic, four-byte version, four-byte object count), followed by the objects themselves, followed by a 20-byte SHA-1 trailer that checksums the entire pack. Each object starts with a variable-length header that encodes its type — OBJ_COMMIT, OBJ_TREE, OBJ_BLOB, OBJ_TAG, OBJ_OFS_DELTA, or OBJ_REF_DELTA — and its uncompressed size. The body is zlib-compressed.
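The header layout in that paragraph translates almost directly into code. The sketch below reads the 12-byte pack header, decodes one per-object header, and checks the trailer; the function names are mine, and the pack is assumed to be fully loaded into memory.

```python
import hashlib
import struct
from typing import Tuple

# Type numbers as used in the per-object header (5 is unused).
OBJ_TYPES = {
    1: "OBJ_COMMIT", 2: "OBJ_TREE", 3: "OBJ_BLOB", 4: "OBJ_TAG",
    6: "OBJ_OFS_DELTA", 7: "OBJ_REF_DELTA",
}


def parse_pack_header(data: bytes) -> Tuple[int, int]:
    """Return (version, object count) from the 12-byte header."""
    magic, version, count = struct.unpack(">4sII", data[:12])
    if magic != b"PACK":
        raise ValueError("not a packfile")
    return version, count


def parse_object_header(data: bytes, pos: int) -> Tuple[str, int, int]:
    """Return (type name, uncompressed size, offset just past the header)."""
    byte = data[pos]
    pos += 1
    obj_type = (byte >> 4) & 0x07  # three type bits in the first byte
    size = byte & 0x0F             # low four bits of the size
    shift = 4
    while byte & 0x80:             # high bit set: another size byte follows
        byte = data[pos]
        pos += 1
        size |= (byte & 0x7F) << shift
        shift += 7
    return OBJ_TYPES[obj_type], size, pos


def pack_trailer_ok(data: bytes) -> bool:
    """Check that the last 20 bytes are the SHA-1 of everything before them."""
    return hashlib.sha1(data[:-20]).digest() == data[-20:]
```

For a delta object, the base reference (an offset or a 20-byte SHA-1) sits between this header and the zlib stream, so a full reader would need one more step before decompressing.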

Delta objects deserve a separate sentence. A delta object is not a complete object; it is a sequence of instructions for reconstructing one object from another. OBJ_OFS_DELTA says “the base object is N bytes earlier in this packfile”; OBJ_REF_DELTA says “the base object is the one with SHA-1 X, which I trust you can find.” The instructions themselves are tiny: copy these bytes from the base, then insert these literal bytes, then copy these other bytes. For a file that has had one comment edited, the delta is a few dozen bytes; the entire object’s reconstruction is therefore a few dozen bytes plus a pointer to its base. This is how a Linux-kernel-sized repository fits into the gigabyte range rather than the terabyte one.
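The copy and insert instructions are compact enough to interpret in a page of Python. This is a minimal sketch operating on an already-decompressed delta; apply_delta and _read_varint are names I have chosen, and the bounds checks are my own additions.

```python
from typing import Tuple


def _read_varint(data: bytes, pos: int) -> Tuple[int, int]:
    """Read a little-endian base-128 integer; return (value, next position)."""
    value = shift = 0
    while True:
        byte = data[pos]
        pos += 1
        value |= (byte & 0x7F) << shift
        shift += 7
        if not byte & 0x80:
            return value, pos


def apply_delta(base: bytes, delta: bytes) -> bytes:
    """Rebuild an object from its base and a decompressed delta."""
    base_size, pos = _read_varint(delta, 0)
    result_size, pos = _read_varint(delta, pos)
    if base_size != len(base):
        raise ValueError("delta was made against a different base")
    out = bytearray()
    while pos < len(delta):
        cmd = delta[pos]
        pos += 1
        if cmd & 0x80:  # copy: low bits say which offset/size bytes follow
            offset = size = 0
            for bit in range(4):
                if cmd & (0x01 << bit):
                    offset |= delta[pos] << (8 * bit)
                    pos += 1
            for bit in range(3):
                if cmd & (0x10 << bit):
                    size |= delta[pos] << (8 * bit)
                    pos += 1
            if size == 0:
                size = 0x10000  # a zero size is shorthand for 64 KiB
            if offset + size > len(base):
                raise ValueError("copy runs past the end of the base")
            out += base[offset:offset + size]
        elif cmd:       # insert: the next `cmd` bytes are literal data
            out += delta[pos:pos + cmd]
            pos += cmd
        else:
            raise ValueError("delta opcode 0 is reserved")
    if len(out) != result_size:
        raise ValueError("reconstructed size does not match the delta header")
    return bytes(out)
```

As a usage example, turning b"hello world\n" into b"hello, brave world\n" takes a delta of fifteen bytes: two size bytes, a two-byte copy, an eight-byte insert, and a three-byte copy.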

A thin-pack, which is the default for fetches, is permitted to use OBJ_REF_DELTA against objects the client already has but which are not present in the pack itself. The client, on receipt, must fatten the pack, appending copies of the missing base objects from its own object store, before it can index it. This is one of those decisions that looks like a compromise on first reading and turns out, on second reading, to be the obvious thing: there is no point shipping a base object the client already possesses merely to keep the pack self-contained.

After the pack arrives, the client indexes it: it computes a .idx file containing the SHA-1 of every object, its offset in the pack, and a small fanout table keyed by the first byte of the SHA-1 so that lookups can binary-search within a 256th of the index. Modern .idx files — version 2, which has been the default since 2007 — open with a four-byte magic number \377tOc and a version word, and to the fanout and the SHA-1 table they add a CRC32 per object (used to detect corruption when objects are copied between packs) and a split pair of offset tables, four bytes wide for the common case and eight bytes wide for the entries that need them, so that packs larger than four gibibytes can still be addressed. The pack is now usable. Until it is indexed, it is, strictly speaking, an opaque blob with a checksum at the end.
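A lookup against a version 2 index can be sketched as below. It assumes the tables appear in the order just described (fanout, SHA-1s, CRC32s, four-byte offsets, eight-byte offsets) and that the whole .idx is in memory; idx_lookup is a name of my own.

```python
import struct
from typing import Optional


def idx_lookup(idx: bytes, oid: bytes) -> Optional[int]:
    """Return the pack offset of a 20-byte object id, or None if absent."""
    if idx[:4] != b"\377tOc" or struct.unpack(">I", idx[4:8])[0] != 2:
        raise ValueError("not a version 2 pack index")
    fanout = struct.unpack(">256I", idx[8:8 + 1024])
    total = fanout[255]  # the last fanout entry is the object count

    # Table start positions, in the order they are laid out in the file.
    names = 8 + 1024
    crcs = names + 20 * total
    offsets = crcs + 4 * total
    large = offsets + 4 * total

    # The fanout narrows the search to ids sharing the first byte.
    first = oid[0]
    lo = fanout[first - 1] if first else 0
    hi = fanout[first]
    while lo < hi:
        mid = (lo + hi) // 2
        candidate = idx[names + 20 * mid:names + 20 * mid + 20]
        if candidate < oid:
            lo = mid + 1
        elif candidate > oid:
            hi = mid
        else:
            (off,) = struct.unpack(">I", idx[offsets + 4 * mid:offsets + 4 * mid + 4])
            if off & 0x80000000:  # high bit set: index into the 8-byte table
                slot = off & 0x7FFFFFFF
                (off,) = struct.unpack(">Q", idx[large + 8 * slot:large + 8 * slot + 8])
            return off
    return None
```

The high bit of the four-byte entry is what makes the split table work: it costs one bit of range for ordinary offsets and buys a pointer into the wide table for the rare large ones.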

Smart HTTP: Tunnelling an SSH-Era Protocol Through HTTPS

Git was originally spoken over git://, a bespoke protocol on TCP port 9418, and over SSH. Both were perfectly serviceable on the open Internet of 2005; both grew increasingly inconvenient on the corporate networks of the years that followed, where outbound TCP to arbitrary ports is firewalled and the only reliable transport is HTTPS on port 443. Git’s response, which arrived with Git 1.6.6, was smart HTTP, the mechanism by which one can git clone https://github.com/... and have the whole pkt-line conversation take place over what looks, to any firewall, like an ordinary web request.

Smart HTTP uses two endpoints. The first is GET /info/refs?service=git-upload-pack (or ...=git-receive-pack). The server responds with the same ref advertisement one would have seen on the wire of the native protocol, prefixed with a # service=...\n pkt-line and ended with a flush. The Content-Type is application/x-git-upload-pack-advertisement, which is the polite way of telling intermediate caches not to mangle it.

The second endpoint is POST /git-upload-pack (or /git-receive-pack), to which the client sends its want/have conversation as the request body — a sequence of pkt-lines, exactly as one would have written them on a socket — and from which it receives the packfile as the response body. The Content-Types are application/x-git-upload-pack-request and application/x-git-upload-pack-result respectively.
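The two endpoints can be expressed as a pair of request builders. The sketch uses only urllib from the standard library and sends nothing over the network; the function names and the example.com URL in the usage note are assumptions of mine, and the Accept header is an addition the text above does not mention.

```python
import urllib.request


def info_refs_request(
    repo_url: str, service: str = "git-upload-pack"
) -> urllib.request.Request:
    """Build the GET that asks for the ref advertisement."""
    url = "%s/info/refs?service=%s" % (repo_url.rstrip("/"), service)
    return urllib.request.Request(url, method="GET")


def rpc_request(
    repo_url: str, body: bytes, service: str = "git-upload-pack"
) -> urllib.request.Request:
    """Build the POST that carries a pkt-line request body."""
    return urllib.request.Request(
        "%s/%s" % (repo_url.rstrip("/"), service),
        data=body,
        method="POST",
        headers={
            "Content-Type": "application/x-%s-request" % service,
            # Assumed nicety: tell the server which result type we expect.
            "Accept": "application/x-%s-result" % service,
        },
    )
```

Calling info_refs_request("https://example.com/repo.git") yields a request for https://example.com/repo.git/info/refs?service=git-upload-pack; passing either object to urllib.request.urlopen would perform the actual exchange.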

The trick that makes this work, and that took me a moment to appreciate, is that the protocol is stateless from HTTP’s point of view but stateful from Git’s. Each round of the want/have negotiation has to fit inside a single, self-contained HTTP POST, because the server remembers nothing between requests. The client cannot say have <sha1> in one request, get an ACK, and expect the server to recall that exchange in the next. Instead, the client computes its opening offer locally, sends it in one body, and the server responds with whatever it can. If the conversation needs more rounds, the client opens a fresh POST in which it repeats its wants, restates every have the server has already acknowledged as common, and appends its next batch. Each POST is a complete sub-conversation that begins with the client speaking and ends with the server.
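Because the server forgets everything between requests, each POST body has to restate what has been established so far. The following sketch shows one way a client might assemble successive rounds; stateless_round is a hypothetical name, and the ordering (wants, flush, acknowledged haves, new haves) follows the native transcript shown earlier rather than any particular implementation.

```python
from typing import Iterable, List, Sequence


def _pkt(line: str) -> bytes:
    """Frame one text line as a pkt-line."""
    data = line.encode("ascii")
    return b"%04x" % (len(data) + 4) + data


def stateless_round(
    wants: Sequence[str],
    common: Iterable[str],
    new_haves: Iterable[str],
    capabilities: Sequence[str] = (),
    final: bool = False,
) -> bytes:
    """Build one self-contained POST body for a round of negotiation."""
    out: List[bytes] = []
    for i, oid in enumerate(wants):
        caps = " " + " ".join(capabilities) if i == 0 and capabilities else ""
        out.append(_pkt("want " + oid + caps + "\n"))  # wants are repeated
    out.append(b"0000")
    # Replay haves the server already acknowledged, then offer new ones.
    for oid in list(common) + list(new_haves):
        out.append(_pkt("have " + oid + "\n"))
    out.append(_pkt("done\n") if final else b"0000")
    return b"".join(out)
```

The replay is what lets any server behind a load balancer pick up the conversation: the second request is intelligible without the first.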

This is not, technically, an efficient use of HTTP. Each round trip is a fresh request. The TLS session may be reused, but the application-level state is not. Yet it works, because the negotiation typically converges in one or two rounds, and the cost of an extra POST is, in the median case, an extra hundred milliseconds against the seconds of pack streaming that follow. The protocol’s tolerance for inefficient transports turns out to be a feature.

Protocol v2: The Quieter Conversation

In 2018 Git introduced a second version of the wire protocol, and it is worth a section on its own — not because it changes the fundamentals, but because it changes what one might call the politics of the opening exchange.

The trouble with v1 is the ref advertisement. Whenever a client connects, the server sends every ref it has — every branch, every tag, every pull-request pointer that the hosting provider has tucked into a hidden namespace. For a small repository this is invisible. For a repository with three hundred thousand refs — and there are such repositories — it is several megabytes of pkt-lines transmitted before the client has so much as said what it wants. The cost falls hardest on git ls-remote, which throws all of that information away after reading the one line it asked for.

Protocol v2 inverts this. The client, on connecting, declares the protocol version in whatever way the transport allows: over ssh:// and file:// it sets the GIT_PROTOCOL=version=2 environment variable in the remote process; over the native git:// protocol it appends version=2 as an additional NUL-delimited extra-parameter inside the initial pkt-line request; and over HTTP it sends a Git-Protocol: version=2 header. Three transports, three mechanisms, the same message. The server, recognising v2, advertises its capabilities — but not its refs. The client then issues an explicit command: ls-refs (with optional ref-prefix filters), or fetch, or object-info. The server responds with only the information that command requested.
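The three signalling mechanisms are easy to put side by side in code. This is an illustrative sketch: the function names are mine, and the git:// request assumes the usual layout of command, path, a host parameter, then an empty field followed by the extra parameters.

```python
from typing import Dict


def ssh_environment(version: int = 2) -> Dict[str, str]:
    """Environment variable for ssh:// and file:// transports."""
    return {"GIT_PROTOCOL": "version=%d" % version}


def http_headers(version: int = 2) -> Dict[str, str]:
    """Extra header for the smart HTTP transport."""
    return {"Git-Protocol": "version=%d" % version}


def git_daemon_request(
    path: str, host: str, version: int = 2, service: str = "git-upload-pack"
) -> bytes:
    """Initial pkt-line for git://, with the version as an extra parameter."""
    payload = ("%s %s\0host=%s\0" % (service, path, host)).encode("ascii")
    # An extra NUL separates the host parameter from the extra parameters,
    # each of which is itself NUL-terminated.
    payload += b"\0" + ("version=%d\0" % version).encode("ascii")
    return b"%04x" % (len(payload) + 4) + payload
```

An old server that knows nothing of extra parameters simply stops reading at the host field, which is how the same opening line stays compatible with both versions.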

A v2 conversation thus begins something like:

000eversion 2\n
0015agent=git/2.40.0\n
0013ls-refs=unborn\n
0025fetch=shallow filter ref-in-want\n
0012server-option\n
0017object-format=sha1\n
0000

— a capability advertisement, with no refs in sight. The client now sends a command pkt-line and a delim packet (0001) before its arguments:

0014command=ls-refs\n
0015agent=git/2.40.0\n
0001
0009peel\n
000csymrefs\n
001bref-prefix refs/heads/\n
0000

The ref-prefix argument restricts the response to refs under refs/heads/. The server returns only those refs. The advertisement has been narrowed from “everything the server has” to “exactly what the client asked about,” and on a large repository the difference is several orders of magnitude of bandwidth.
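A v2 command request has a regular enough shape that one small function can build any of them. The sketch below is my own illustration; build_v2_command is a hypothetical name, and it emits the delim packet only when there are arguments to separate.

```python
from typing import List, Sequence


def _pkt(line: str) -> bytes:
    """Frame one text line as a pkt-line."""
    data = line.encode("ascii")
    return b"%04x" % (len(data) + 4) + data


def build_v2_command(
    command: str,
    capabilities: Sequence[str] = (),
    arguments: Sequence[str] = (),
) -> bytes:
    """Command line, capability lines, delim, argument lines, flush."""
    out: List[bytes] = [_pkt("command=%s\n" % command)]
    out += [_pkt(cap + "\n") for cap in capabilities]
    if arguments:
        out.append(b"0001")  # delim packet between capabilities and arguments
        out += [_pkt(arg + "\n") for arg in arguments]
    out.append(b"0000")      # flush packet ends the request
    return b"".join(out)
```

Called with "ls-refs", the agent capability, and the three arguments peel, symrefs and ref-prefix refs/heads/, it reproduces the transcript above byte for byte.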

The other notable addition in v2 is ref-in-want. In v1, a client must phrase its wants in terms of object IDs (SHA-1s), which it must have learned from a prior ref advertisement. In v2, the client may say want-ref refs/heads/main directly, and the server resolves the ref to an object ID server-side. This eliminates a race condition that v1 had quietly tolerated — between the moment the client read the ref advertisement and the moment it sent its wants, the ref might have moved — and it eliminates the need to advertise refs the client did not ask about.

Protocol v2 did not change the packfile format. It did not change the negotiation algorithm. It changed only the opening — and that turned out to be the right place to change, because the opening was where the protocol’s age showed most clearly. The fundamentals — pkt-lines, packfiles, the iterative want/have conversation — held up perfectly well; what had aged badly was the unconditional ref advertisement.

What I Actually Learned

Three things seem to me worth writing down.

The first is that pkt-line framing is the unsung hero of this protocol. A four-byte hex length prefix is an exceedingly modest design choice — there are protocols of comparable age that do far cleverer things with framing — and it is exactly that modesty which has allowed every other piece of the protocol to evolve without disturbing the bytes underneath. New commands, new capabilities, new modes of negotiation, even a whole new protocol version, have all been added without changing the framing by a single byte. One does not always notice the parts of a protocol that hold the others up; pkt-lines are such a part.

The second is that the negotiation is the most interesting thing in the entire system, and it is also the thing one is least likely to notice using Git day to day. The fact that git fetch ships fifty kilobytes after a routine pull, rather than the gigabytes one’s repository contains, is the work of a small, careful algorithm that walks the commit graph in batches and trades round trips for bandwidth. It is the kind of cleverness that pays for itself many millions of times a day, and that is invisible to anyone who has not gone looking for it.

The third — and this is the one I should like to leave standing — is that Git’s wire protocol is a study in avoiding premature elegance. It is not RPC. It is not gRPC. It is not Protocol Buffers. It is text-shaped binary, with a hex length prefix and a flush packet, written by people who were, one suspects, less interested in being modern than in being correct. The protocol has been retrofitted to HTTPS, extended with capabilities, given a wholly new version — and the original 2005 pkt-line is still the unit of every message. I do not think one could ask a protocol design to age better than that.

The source, for anyone curious to read further, lives in Documentation/ inside the Git repository, particularly gitprotocol-common.adoc, gitprotocol-pack.adoc, gitprotocol-v2.adoc, gitprotocol-http.adoc, and gitformat-pack.adoc. (These were Documentation/technical/*.txt until Git 2.38 in 2022 rehomed and renamed them; the change of extension from .txt to .adoc came in a later release, and older posts on the protocol still cite the old paths.) They are not, as documentation goes, light reading; but they are the closest thing the protocol has to a specification, and the bytes one sees on the wire match them with a fidelity one wishes were more common.