[Bug 55500] [sna gen4] corrupt rendering (and flickering on redraw)

Sun Jul 7 03:36:13 PDT 2013

https://bugs.freedesktop.org/show_bug.cgi?id=55500

--- Comment #99 from Chris Wilson <chris at chris-wilson.co.uk> ---
(In reply to comment #97)
> (In reply to comment #96)
> > (In reply to comment #93)
> > > The bad part is that while before I was getting 3.7Mchar/s in x11perf
> > > -aa10text, now it's about 1.3Mchar/s, so significantly slower.
> > 
> > That's the sacrifice, we have to stop sending commands to the GPU and wait
> > for it to complete those in flight (quite frequently). Or else new rectangles
> > overwrite vertex entries still being used by later entries.
> > 
> > > So the question here would be: isn't the corruption based on triangle
> > > surface size? I.e. the GPU is able to process a lot of small ones but has
> > > a bug with bigger ones?
> > > 
> 
> But as I said before, if that were a plain hw defect, IMHO it would
> simply always appear. Instead it seems to work for a while, then
> 'something' happens and flickering starts to appear with (presumably)
> the same amount of texels/triangles/vertices, and then something may
> happen again, and the problem is gone for a while.

It does. You do not have quite as much control over your tests as you presume.

> > Not really, you have to predict when a VUE being used by the end of the
> > pipeline will be overwritten by a new rectangle at the start of the
> > pipeline. This is completely internal state - the primitive command we want
> > to feed to the GPU can contain thousands of rectangles. Instead of counting
> 
> Well I've tried even 8 max triangles - and the error appeared after a while,
> so far '6' is magic.
> 
> > rectangles, you want to start counting fragments (actually texel reads since
> > that will be the rate-limiting factor) and flush if we queue up too much work
> > for the GPU. If you also model how fast the GPU is retiring fragments so
> 
> But if the same page is rendered both with and without problems, then it
> doesn't look like texel reads are the problem; it rather looks like some
> kind of memory mapping/ordering issue.

No. I did not say the texel reads were the problem, just an indicator as to
how long the EU would execute any particular shader for a fragment. Also there
is only a single sampler and many EUs running many more threads, so contention
will also factor into how long each fragment takes to process, and so
how long buffers will be active for. Look more closely at what is going on:
it is clear that the hardware is not tracking lifetimes of its URB correctly.
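
To make the heuristic quoted above concrete (count texel reads rather than
rectangles, and flush when too much work is queued), here is a rough sketch in
C. It is an illustration only, not code from the SNA driver: the names
flush_budget, rect_cost and budget_add_rect are hypothetical, the threshold is
an assumed tunable, and width x height x samples is only a stand-in for a real
cost estimate. It does not model how fast the GPU retires fragments.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bookkeeping; none of these names exist in the driver. */
struct flush_budget {
    uint64_t queued_texels; /* estimated texel reads queued since the last flush */
    uint64_t max_texels;    /* assumed threshold, would need tuning per GPU */
};

/* Assumed cost model: one fragment per pixel, `samples` texel reads each. */
static uint64_t rect_cost(uint32_t w, uint32_t h, uint32_t samples)
{
    return (uint64_t)w * h * samples;
}

/*
 * Account for one rectangle. Returns true if the caller should flush (and
 * wait for in-flight work) before emitting it, i.e. if adding it to a
 * non-empty queue would exceed the budget. A single oversized rectangle on
 * an empty queue is accepted, since flushing first would not help.
 */
static bool budget_add_rect(struct flush_budget *b,
                            uint32_t w, uint32_t h, uint32_t samples)
{
    uint64_t cost = rect_cost(w, h, samples);
    bool flush = b->queued_texels != 0 &&
                 b->queued_texels + cost > b->max_texels;

    if (flush)
        b->queued_texels = 0; /* the queue is assumed drained by the flush */
    b->queued_texels += cost;
    return flush;
}
```

A caller would check the return value before emitting each rectangle and, when
it is true, submit the batch and wait for it to retire. This only bounds the
amount of queued work; it is a workaround sketch, not a fix for the lifetime
tracking itself.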

> Also, is there some explanation why intel_gpu_top is showing so much higher
> GPU usage when the flickering is visible?

Other than the flickering correlates with GPU activity?

(In reply to comment #98)
> Well, I should have waited a while before posting a comment about magic value 6.
> 
> I'm now observing flickering with value 6 as well.
> 
> So yeah - it's more or less time related - and it takes more or less time
> until the problem becomes visible.
> 
> Also, is there an explanation for why the max value 64 starts to cause
> problems with text rendering in gnome terminal?
> 
> I.e. I'd have expected it if there were heavy load on the GPU, but in this
> case it just appears that random pixels start to be drawn instead of some
> letters; maybe some font cache corruption?

It's still the same bug.
