[Intel-gfx] Maintainer-review fluff (was: Re: [PATCH 01/12] drm/i915: plumb VM into object operations)

Fri Jul 26 22:43:02 CEST 2013

On Fri, Jul 26, 2013 at 10:15 PM, Ben Widawsky <ben at bwidawsk.net> wrote:
> I think the subthread Jesse started had a bunch of good points, but
> concisely I see 3 problems with our current process (and these were
> addressed in my original mail, but I guess you didn't want to air my
> dirty laundry :p):

I've cut out some of the later discussion in my mail (and that thread)
since I've figured it's not the main point I wanted to make. No fear
of dirty laundry ;-)

>
> 1. Delay hurts QA. Balking on patches because they're hard to review
> limits QA on that patch, and reduces QA time on the fixed up patches. I
> agree this is something which is fixable within QA, but it doesn't exist
> at present.

Yeah, I agree that this is an issue for developers without their
private lab ;-) And it's also an issue for those with one, since
running tests without a good fully automated system is a pian.

With discussed this a bit with Jesse yesterday on irc, but my point is
that currentl QA doesn't have a quick enough turn-around even for
testing -nightly that this would be feasible. And I also think that
something like this should be started with userspace (i.e. mesa)
testing first, which is already in progress.

Once QA has infrastructure to test arbitrary branches and once they
have enough horsepower and automation (and people to do all this) we
can take a look again. But imo trying to do this early is just wishful
thinking, we have to deal with what we have, not what we'd like to get
for Xmas.

> 2. We don't have a way to bound review/merge. I tried to do this on this
> series. After your initial review, I gave a list of things I was going
> to fix, and asked you for an ack that if I fixed those, you would merge.
> IMO, you didn't stick to this agreement, and came back with rework
> requests on a patch I had already submitted. I don't know how to fix
> this one because I think you should be entitled to change your mind.
>
> A caveat to this: I did make some mistakes on rebase that needed
> addressing. ie. the ends justified the means.

Yeah, the problem is that for really big stuff like your ppgtt series
the merge process is incremental: We'll do a rough plan and then pull
in parts one-by-one. And then when the sub-series get reviewed new
things pop up. And sometimes the reviewer is simply confused and asks
for stupid things ...

I don't think we can fix this since that's just how it works. But we
can certainly keep this in mind when estimating the effort to get
features in - big stuff will have some uncertainty (and hence need for
time buffers) even after the first review. For the ppgtt work I need
to blame myself too since the original plan was way too optimistic,
but I really wanted to get this in before you get sucked away into the
next big thing lined up (which in this case unfortunately came
attached with a deadline).

> 3a. Reworking code introduces bugs. I feel I am more guilty here than
> most, but, consider even in the best case of those new bugs being
> caught in review. In such a case, you've now introduced at least 2 extra
> revs, and 2 extra lag cycles waiting for review. That assumes further
> work doesn't spiral into more requested fixups, or more bugs. In the
> less ideal case, you've simply introduced a new bug in addition to the
> delay.

I'm trying to address this by sharing rebase BKMs as much as possible.
Since I'm the one on the team doing the most rebasing (with -internal)
that hopefully helps.

> 3b. Patch splitting is art not science.
>
> There is a really delicate balance between splitting patches because
> it's logically a functional split vs. splitting things up to make things
> easier to chew on. Now in my case specifically, I think overall the
> series has improved, and I found some crud that got squashed in which
> shouldn't have been there. I also believe a lot of the splitting really
> doesn't make much sense other than for review purposes and sometimes
> that is okay.

Imo splitting patches has two functions: Make the reviewer's life
easier (not really the developers) and have simple patches in case of
a regression which bisects to it. Ime you get about a 1-in-5
regression rate in dinq, so that chance is very much neglectable. And
for the ugly regressions where we have no clue we can easily blow
through a few man-months of engineer time to track them time.

> In my case, I had a huge patch, but a lot of that patch was actually a
> sed job of "s/obj/obj,vm/."  You came back with, "you're doing a bunch
> of extra lookups." That was exactly the point of the patch; the extra
> lookups should have made the review simpler, and could be cleaned up
> later.
>
> My point is: A larger quantity of small patches is not always easier to
> review than a small quantity of large patches. Large patch series review
> often requires the reviewer to keep a lot of context as they review.

I don't mind big sed jobs or moving functions to new files (well those
quite a bit since they're a pain for rebasing -internal). But such a
big patch needs to be conceptually really simple, my rule of thumb is
that patch size times complexity should follow a constant upper limit.
So a big move stuff patch shouldn't also rename a bunch of functions
(wasn't too happy about Chris' intel_uncore.c extract) since that
makes comparing harder (both in review and in rebasing).

If the patch is really big (like driver-wide sed jobs) the conceptual
change should approach 0. For example if you want to embed an object
you first create an access helper (big sed job, no change, not even in
the struct layout). Then a 2nd patch which changes the access helper,
but would otherwise be very small.

Imo  the big patch I've asked you to split up had lot of sed-like
things, but also a few potentially functional/conceptual changes in
it. The combination was imo too much. But that doesn't mean I won't
accept sed jobs that result in a much larger diff, just that they need
to be really simple.

> *4. The result of all this is I think a lot of the time we (the
> developers) end up writing your patch for you. While I respect your
> opinion very highly, and I think more often than not that your way is
> better, it's just inefficient.

Yeah, I'm aware that sometimes I go overboard with "my way or the
highway" even if I don't state that explicitly. Often though when I
drop random ideas or ask questions I'm ok if the patch author sticks
to his way if it comes with a good explanation attached. That at least
is one of the reason why I want to always update commit messages even
when the reviewer in the end did not ask for a code change.

Todays discussion about the loop in one of your patches in
evict_everything was a prime example: I've read through your code,
decided that it looks funny and dropped a suggestion on irc. But later
on I've read the end result and noticed that my suggestion is much
worse than what you have.

In such cases I expect developers to stand up, explain why something
is like it is and tell me that I'm full of myself ;-)

This will be even more important going forward since with the growing
team and code output I'll be less and less able to keep track of
everything. So the chance that I'll utter complete bs in a review will
only increase. If you don't call me out on it we'll end up with worse
code, which I very much don't want to.

> I'll wrap this all up with, I don't envy you. On a bunch of emails, I've
> seen you be apologetic for putting developers in between a rock, and a
> hard place (you, and program management). I recognize you have the same
> dilemma with Dave/Linus, and the rest of us developers. I think the
> overall strategy should be to improve QA, but then you have to take the
> leap of limiting your requests for reworks, and accepting QAs stamp of
> approval.

Hey, overall it's actually quite a bit of fun.

I do agree that QA is really important for a fastpaced process, but
it's also not the only peace to get something in. Review (both of the
patch itself but also of  the test coverage) catches a lot of issues,
and in many cases not the same ones as QA would. Especially if the
testcoverage of a new feature is less than stellar, which imo is still
the case for gem due to the tons of finickle cornercases.

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch