[TDF infra announce/request] reduce downstream traffic by 50× on git clones with `git config protocol.version 2`
guilhem at libreoffice.org
Thu Jul 16 13:59:57 UTC 2020
On Sat, 04 Jul 2020 at 06:26:05 +0200, Guilhem Moulin wrote:
> TL;DR: run `git config protocol.version 2` in your local clones of the
> core and online repositories. That should reduce downstream traffic by
> over 50x (from 9MiB to 150kiB) in no-op `git fetch` commands.
> Client-side this only applies to git ≥2.18.0.
Quick follow-up to summarize some IRC discussions:
* `git push` commands have been reported to be much slower and occasionally
yield an upload of several MiB (even for single patches) with version 2 of
the wire protocol. AFAICT this is not a regression: the new wire protocol
has little to no effect on pushes. Please make sure to consistently call
`git fetch $REMOTE` immediately before `git push $REMOTE` (on the same
To avoid having to decrypt your SSH key twice, you can set up the OpenSSH
authentication agent, or simply use an anonymous scheme for fetches.
Assuming the remote name to use for pushing is ‘logerrit’, the following
git config remote.logerrit.pushurl ssh://logerrit/core
git config remote.logerrit.url https://git.libreoffice.org/core
Fetching from a different remote name does *not* help (even when it has the
same URL). The problem here is that the client uses git-merge-base(1) to
guess which objects are unknown to the server and thus need to be uploaded.
If someone has pushed to the target branch since you last fetched, and if you
don't run `git fetch` again to update ‘refs/remotes/$REMOTE/’, then git won't
see that only a few objects need to be uploaded. It will instead upload a
large pack (potentially several MiB large) containing everything since the
last matching reference (likely the last tag or branch-point).
This can indeed be worse with the new wire protocol: with the old one
git-merge-base(1) had 300k+ references at hand in order to find a common
ancestor, so in practice the pack could stop before the last tag or
branch-point. But this is just pure luck, hence from my perspective it is
not a regression: please always call `git fetch $REMOTE` before `git push
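The fetch-then-push advice above can be wrapped in a small shell function so the two steps always target the same remote. This is only a sketch: the name `fetch_then_push` is made up for illustration, and any refspec passed to it is up to the caller.

```shell
# Hypothetical helper (not part of git): fetch from a remote, then push
# to that same remote.
# Usage: fetch_then_push REMOTE [git-push arguments...]
fetch_then_push() {
    remote=$1
    shift
    # Update refs/remotes/$remote/ first, so that git knows what the
    # server already has and can upload a small pack.
    git fetch "$remote" || return 1
    git push "$remote" "$@"
}
```

For instance `fetch_then_push logerrit <refspec>`, where `<refspec>` is whatever you normally pass to `git push`.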
* Changing protocol.version on a given repository (i.e., `git config
protocol.version 2` without the --global flag) doesn't appear to affect
submodules. This can be verified by running
GIT_TRACE_PACKET=1 git submodule update --remote $SUBMODULE
from the top level of the superproject and noting the lack of a protocol
version number in the handshake. To change the wire
protocol version in submodules, you'll need to run
git config -f .git/modules/$SUBMODULE/config protocol.version 2
or simply change it globally:
git config --global protocol.version 2
That being said, the new wire protocol really shines for remotes with a huge
number of references, which our submodules don't have, so for these the
improvement will be marginal (it won't hurt though).
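For those who prefer a per-repository setting over --global, the per-submodule step can be looped with `git submodule foreach`. A minimal sketch follows; the function name `set_protocol_v2` is hypothetical, and it only touches submodules that are already initialized.

```shell
# Hypothetical helper: set protocol.version=2 in a superproject and in
# each of its initialized submodules.
# Usage: set_protocol_v2 [PATH-TO-SUPERPROJECT]   (defaults to ".")
set_protocol_v2() {
    repo=${1:-.}
    # The superproject's own .git/config.
    git -C "$repo" config protocol.version 2 || return 1
    # foreach runs the command inside each submodule, so this writes to
    # each submodule's own config file.
    git -C "$repo" submodule --quiet foreach 'git config protocol.version 2'
}
```

Running `git submodule foreach 'git config protocol.version'` afterwards should print 2 for each submodule.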
* git upstream attempted to bump the default version of the wire
protocol. It was bumped to v2 in git 2.26, and demoted back to v0 in git
2.27 as it “turned out to have some remaining rough edges.”
It seems to work fine (and has done so for 2.5 years now) at Google for
Chromium and gerrit though. I guess git upstream will bump the default
version again at some point, but given the huge performance gain I see no
reason *not* to call `git config protocol.version 2` in core and online.
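Since the setting only has an effect client-side with git ≥2.18.0 (see the quoted TL;DR), a script could check the local version first. The following is a rough sketch, assuming `git version` prints the version number as its third field; the function name `git_supports_v2` is made up.

```shell
# Hypothetical helper: succeed if the local git is at least 2.18.
git_supports_v2() {
    # e.g. "git version 2.27.0" -> "2.27.0"
    ver=$(git version | awk '{print $3}')
    major=${ver%%.*}
    rest=${ver#*.}
    minor=${rest%%.*}
    [ "$major" -gt 2 ] || { [ "$major" -eq 2 ] && [ "$minor" -ge 18 ]; }
}
```

It can then guard the config call, e.g. `git_supports_v2 && git config protocol.version 2`.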
PS: Please preserve the recipient list when replying.