[TDF infra] Announcing Gitiles VCS browser (gitweb replacement) and https:// anon git URIs

Lionel Elie Mamane lionel at mamane.lu
Mon Oct 22 15:25:11 UTC 2018


On Mon, Oct 22, 2018 at 04:33:21PM +0200, Guilhem Moulin wrote:
> On Mon, 22 Oct 2018 at 11:51:35 +0200, Lionel Elie Mamane wrote:
>> On Wed, Oct 17, 2018 at 09:03:45PM +0200, Guilhem Moulin wrote:

>>> SSH is only used for transport, a git process is exec()'ed on the
>>> remote just like for git-daemon(1), so the only overhead is
>>> crypto-related.  The handshake is a one-off thing, thus negligible
>>> when you're transferring a large amount of data at once; (...) As
>>> for symmetric crypto overhead, (...) the overhead should be
>>> negligible.

>> All I know is that about one to three years ago (it was, I think,
>> at some coworking space in Brussels, probably a hackfest) I showed
>> Michael Meeks how to have a separate "push" URL (with the ssh:
>> protocol) and "pull" URL (with the git: protocol), and he was very
>> happy with the speed-up.
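
The push/pull URL split described above can be set up with git-remote's
`set-url --push`. A sketch (the gerrit.libreoffice.org and "logerrit"
URLs are illustrative; adapt them to your own remote and ssh config):

```shell
# Sketch: one remote with a fast anonymous fetch URL (git://) and an
# authenticated push URL (ssh://). Uses a throwaway repository so the
# commands can be run as-is; the URLs themselves are illustrative.
mkdir -p /tmp/pushurl-demo && cd /tmp/pushurl-demo
git init -q .
git remote add origin git://gerrit.libreoffice.org/core
# Only pushes need authentication, so override just the push URL:
git remote set-url --push origin ssh://logerrit/core
git remote -v   # fetch uses git://, push uses ssh://
```

After this, `git fetch`/`git pull` go over the unauthenticated (and in
this anecdote, faster) git: transport, while `git push` still goes over
ssh:.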

> Might be orthogonal to the git:// vs. https:// vs. ssh://
> discussion.  Gerrit uses JGit as Git implementation, while
> git-daemon(1) spawns “normal” (C-based) git-upload-pack(1)
> processes.

For us developers of LibreOffice, and thus consumers of the Gerrit /
Git service of freedesktop.org and TDF, whether the difference comes
from the protocol itself or from a different git implementation
serving each protocol on the server is intellectually interesting
(thanks for that!), but materially inconsequential: if using git: is
faster, we will use git:.

> I recall Norbert and I sat down during FOSDEM 2017 to solve perf
> issues with our JGit deployment.  Perhaps you configured your
> ‘remote.<name>.pushurl’ at the same time :-)

I can easily believe it was earlier.

> Anyway, it's easy enough to benchmark no-op `git fetch` on core.  master
> is currently at c99732d59bc6, and I'm fetching from the same datacenter
> to avoid metrics being polluted with network hiccups.
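
A self-contained version of such a no-op fetch benchmark might look
like the sketch below. It times a fetch against a local origin that has
nothing new to transfer, so only the ref advertisement and negotiation
are measured; to compare protocols, substitute the remote's git://,
https:// and ssh:// URLs for the local path:

```shell
set -e
# Sketch: benchmark a no-op `git fetch` (clone already up to date).
d=/tmp/fetch-bench
rm -rf "$d" && mkdir -p "$d"
git init -q "$d/origin"
git -C "$d/origin" -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m init
git clone -q "$d/origin" "$d/clone"
# Nothing new to fetch, so this measures protocol overhead only:
time git -C "$d/clone" fetch origin
```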

Yes, but no. You are also testing in an environment where the network
RTT is probably about one fifth to one third of a millisecond, and
the bandwidth at least 100 Mbps, if not 1000 Mbps. In that case,
everything will be fast, and any time difference will be lost in the
noise. The interesting cases are:

1) Someone's home DSL line out in the woods; fiber hasn't come to
   that village yet, or has come to the town but not to that
   particular street. RTT about 50 ms; bandwidth about 20 Mbps down
   (or less), and much less up.

2) Case 1, plus "on the other side of the world" (South-East Asia?
   South America? New Zealand?): you can easily get RTTs of about
   300 ms. Even if you are on an ultra-fast network (like a
   university network), it is still the other side of the world.

3) A coworking space whose connection is good for typical use, but
   then TDF holds a hackfest there and a bunch of geeks (us)
   overwhelm the connection.

4) I'm at a conference, half listening to the presentation, half
   hacking on LibreOffice. The conference WiFi is overrun by everyone
   doing the same, by people's laptops and pocket computers
   ("smartphones") automatically downloading updates (technical and
   social ones...), etc. How usable will it be? E.g. CCC (the Chaos
   Communication Congress) was known for having a totally overwhelmed
   WiFi; every year a new vendor would "gift" its better solution,
   and *this* year the wireless network would actually be good. But
   every year it wasn't. (Has it actually improved in the years since
   I stopped going?)

Are some of these protocols (or the *implementations* of these
protocols) more sensitive to RTT than others? Do they make more
roundtrips, or not?
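
One way to probe that empirically: GIT_TRACE_PACKET makes git log every
pkt-line it exchanges with the server, which gives a rough measure of
how chatty each protocol (or implementation) is for the same fetch, and
on Linux tc-netem can inject artificial latency to emulate cases 1 and
2 above. A sketch (local origin for reproducibility; "eth0" is an
assumed device name):

```shell
set -e
# Sketch: count the pkt-lines a fetch exchanges, as a rough proxy for
# the number of protocol roundtrips.
d=/tmp/rtt-probe
rm -rf "$d" && mkdir -p "$d"
git init -q "$d/origin"
git -C "$d/origin" -c user.name=demo -c user.email=demo@example.org \
    commit -q --allow-empty -m init
git clone -q "$d/origin" "$d/clone"
# GIT_TRACE_PACKET writes one log line per pkt-line to stderr:
GIT_TRACE_PACKET=1 git -C "$d/clone" fetch origin 2> "$d/packets.log"
wc -l "$d/packets.log"
# To emulate case 2 above on Linux, netem can add ~300 ms to the RTT
# on the outgoing interface (needs root; "eth0" is an assumption):
#   tc qdisc add dev eth0 root netem delay 300ms
#   ... run the fetch again, then remove the delay:
#   tc qdisc del dev eth0 root
```

Comparing the pkt-line counts (and timings under injected delay) for
git://, https:// and ssh:// against the same remote would answer the
roundtrip question directly.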

-- 
Lionel


More information about the LibreOffice mailing list