[Intel-gfx] [PATCH 1/4] introduce intel_ring_buffer structure
Zou, Nanhai
nanhai.zou at intel.com
Fri May 14 03:39:22 CEST 2010
Hi,
Thanks for reviewing the patch.
To clarify,
This patch is used for H.264/VC1 decoding.
Abstract ring buffer is also needed for our later generation GPUs,
Because there are more types of ring buffer to come.
H.264/VC1 HW decoding is a bigger requirement than mepg2.
XvMC VLD can only support mpeg2. It is rendering in server context
and lack of post processing features.
We are not going to support it later, our later media work will be
based on vaapi.
Synchronize between BSD and render ring will be done by VAAPI client.
We have run Tons of video validation upon this patch.
Our QA will do 1-2 days of regression test on different chips each time I rebase and sent out the patch
Let's get simple code works first then we can optimize it.
Thanks
Zou Nan hai
-----Original Message-----
From: Daniel Vetter [mailto:daniel at ffwll.ch]
Sent: 2010年5月12日 6:24
To: Zou, Nanhai
Cc: Anholt, Eric; Intel GFX
Subject: Re: [Intel-gfx] [PATCH 1/4] introduce intel_ring_buffer structure
Hi all,
Ok, here's my try at a review. Sorry for the rather long delay.
First things first, please put on your asbestos suits, this one's gonna be
rough;) Second: Your patch is still a pain to review for the following
reasons:
- IMHO you split-up made things worse: The completely rewritten render
ring implementation is introduced in patch 1, but only really fully used
when all 4 patches are applied. Now instead of hunting around in one
monster patch (which was hard) I have to hunt around in four different
patches to make sense of changes (rather impossible). Similarly other
stuff is splattered all over the series, e.g. I've found definitions for
the bsd ring in patch 1 instead of patch 3.
- Your doing tons of (at least at first glance) superfluously
search&replaces. Yep, inconsistent naming is ugly, but patches bloated
with tons of such changes are even uglier. Some examples:
s/ring/render_ring/
s/drm_i915_ring_buffer_t/intel_ring_buffer/
There are more. If you think these changes really are beneficial, please
do one patch per s&r with nothing else in it.
Now for the more specific issues:
- In my previous review I've proposed to replace the gpu execution
breadcrumb with something like this
struct intel_gpu_breadcrumb {
uint32_t seqno : 30;
uint32_t domain : 2;
};
Compared to your approach this doesn't waste a pointer per gem object.
Further it can be used for execution domains like keeping track of
pageflips (tossed around some ideas with Kristian on irc) or to batch up
gtt chipset flushes (probably needed to make my i855 cache coherency stuff
fast again). So yes, I'd like to see this.
- struct intel_ring_buffer looks massively overdesigned in your patch. I
think a more lightweight approach that leaves as much as possible of the
render ring implementation untouched (and doesn't move it around) is
better for at least two reasons:
- Your abstraction requires quite a few automatic changes all over. This
needlessly bloats the patch and makes finding the interesting stuff
harder.
- Greatly reduced risk of introducing regressions.
There are also a few places that look rather fishy. Mostly I simply
couldn't follow anymore what's going on due to the reasons laid out above.
Anyway, here they are:
- You (seem to) move around kernel_lost_context. For a feature that's
currently only supported for hw that was never supported with ums, this
is either fishy or needs a big comment.
- In i915_gem_flush you simply flush both ringbuffers. Now either these
two have independent caches (and only one needs to be flushed, gem is
not supposed to do unnecessary flushes) or they are coherent and only
one flush is required, period. Current code looks like it tries to paper
over some coherency issues (and I've learned to loathe these buggers).
Related to this: The active list gets split up between the to
ringbuffers. The flushing list is still global. This looks inconsistent.
- I haven't found out how synchronization between the bsd and the render
ringbuffer is handled. And if the answer is "userspace takes care" I'm
not gonna like it.
With this off the table, I still have to do some venting ;) The idea
behind splitting patches up is:
- to make life easier for reviewers. IHMO you've failed in that regard.
- to make bisecting regressions easy. Given that I've found gimmicks like
tatus_page_dmah->vaddr;
- memset(dev_priv->render_ring.status_page.page_addr
+ memset(dev_priv->render_ring.status_page.page_addr,
0, PAGE_SIZE);
}
in your patches (note the re-added semicolon at the end) your patches
definitely fail at that - they won't even compile in between. And it
looks like other people have problems simply applying the patches, too.
So please take a tad bit more care when prepping patch series. Otherwise
the hours (including thinking time) I've just put into this review feel
kinda wasted.
For a _really_ nice patch series that was a joy to read, look no further
than at Zhenyu Wang's recent connector rework (the patch series that got
merged, _not_ the first submission - but that can serve as comparison).
Oh, one last thing: Given that libIntelXvMC seems to support vld (for
mpeg2), why exactly do we need this?
Yours, Daniel
--
Daniel Vetter
Mail: daniel at ffwll.ch
Mobile: +41 (0)79 365 57 48
More information about the Intel-gfx
mailing list