[Intel-gfx] [RFC 00/22] Gen7 batch buffer command parser

Wed Nov 27 09:10:28 CET 2013

On Wed, Nov 27, 2013 at 09:32:32AM +0800, ykzhao wrote:
> On Tue, 2013-11-26 at 13:24 -0700, Volkin, Bradley D wrote:
> > On Tue, Nov 26, 2013 at 11:35:38AM -0800, Daniel Vetter wrote:
> > > Hi Brad,
> > > 
> > > On Tue, Nov 26, 2013 at 08:51:17AM -0800, bradley.d.volkin at intel.com wrote:
> > > > From: Brad Volkin <bradley.d.volkin at intel.com>
> > > > 
> > > > Certain OpenGL features (e.g. transform feedback, performance monitoring)
> > > > require userspace code to submit batches containing commands such as
> > > > MI_LOAD_REGISTER_IMM to access various registers. Unfortunately, some
> > > > generations of the hardware will noop these commands in "unsecure" batches
> > > > (which includes all userspace batches submitted via i915) even though the
> > > > commands may be safe and represent the intended programming model of the device.
> > > > 
> > > > This series introduces a software command parser similar in operation to the
> > > > command parsing done in hardware for unsecure batches. However, the software
> > > > parser allows some operations that would be noop'd by hardware, if the parser
> > > > determines the operation is safe, and submits the batch as "secure" to prevent
> > > > hardware parsing. Currently the series implements this on IVB and HSW.
> > > > 
> > > > The series is divided into several phases:
> > > > 
> > > > patches 01-09: These implement infrastructure and the command parsing algorithm,
> > > >                all behind a module parameter. I expect some discussion and
> > > >                rework, but hopefully there's nothing too controversial.
> > > > patches 10-17: These define the checks performed by the parser.
> > > >                I expect much discussion :)
> > > > patches 18-20: In a final pass over the command checks, I found some issues with
> > > >                the definitions. They looked painful to rebase in, so I've added
> > > >                them here.
> > > > patches 21-22: These enable the parser by default. It runs on all batches except
> > > >                those that set the I915_EXEC_SECURE flag in the execbuffer2 call.
> > > 
> > > I think long-term we should even scan secure batches. We'd need to allow
> > > some registers which only the drm master (i.e. owner of the display
> > > hardware) is allowed to do, e.g. for scanline waits. But once we have that
> > > we should be able to port all current users of secure batches over to
> > > scanned batches and so enforce this everywhere by default.
> > > 
> > > The other issue is that igt tests assume they can run some evil
> > > tests, so maybe we don't actually want this.
> > 
> > Agreed. I thought we could handle this as a follow-up task once the basic stuff is
> > in place, particularly given that we'd want to modify at least some users to test.
> > I also wasn't sure if we would want the check to be root && master, as in the current
> > secure flag, or just master.
> > 
> > W.r.t. the tests, I suppose we can just turn checking on for secure batches and see
> > what happens.
> > 
> > > 
> > > > There are follow-up patches to libdrm and to i-g-t. The i-g-t tests are very
> > > > basic and do not test all of the commands used by the parser on the assumption
> > > > that I'm likely to make the same mistakes in both the parser and the test.
> > > 
> > > Yeah, I agree that just checking whether commands all go through (or not)
> > > as expected adds very little value on top of the few tests you have done.
> > > I think we should take a look at some corner cases which might trip up
> > > your checker a bit though:
> > > - I think we should check batchbuffer chaining and make sure it works on
> > >   the vcs ring and not anywhere else (we can't ever break shipping libva
> > >   which uses this).
> > > - Some tests to trip up your parser should be done, like 3D commands that
> > >   fall off the end of the batch bo. Or commands that span page boundaries.
> > >   The latter isn't an issue atm since you use vmap, but we should switch to
> > >   per-page kmap since the vmap overhead is fairly horrible.
> > 
> > Good suggestions. I'll look into these.
> Hi, Brad
>       More inputs from libva about the batchbuffer chaining.
> 
>       Batchbuffer chaining is now widely used in the libva driver. This
> is related to how the libva driver processes the image. For encoding,
> the image needs to be handled per macroblock (16x16), and every
> macroblock needs a group of GPU commands. So the GPU commands for
> all the macroblocks are constructed in the second-level batchbuffer.
> Batchbuffer chaining brings the following benefits:
>       a. The size of the second-level batch buffer can be allocated based
> on the size of the image being handled. For example, 1080p/720p/480p
> can use different sizes.
>       b. The GPU commands in the second-level batchbuffer can be
> constructed by the GPU instead of the CPU, which helps improve
> performance.
> 
>       At the same time, both the VCS and Render rings are used in the
> libva driver. For example, encoding uses both the VCS and RCS rings.
> First the RCS ring executes the GPU commands for motion vector/mode
> prediction. Then the VCS ring executes the GPU commands for
> generating the bit-stream. So batchbuffer chaining is used not only on
> the VCS ring but also on the Render ring.

So are these 2nd level batches constructed by the gpu in some cases? That
would be fairly horrible to take into account with the batch checker ...
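
To make the concern concrete, here is a rough sketch of a CPU-side scan
loop. It is not the code from the series: the encoding is a toy one
(opcode in bits 31:23, total dwords minus 2 in bits 7:0), and scan_batch,
cmd_len and the SCAN_* results are made-up names for illustration only.
The scan can catch a command that falls off the end of the buffer, but
when it meets a chained batch start it can only flag it, because the
contents of a GPU-written second-level batch do not exist yet when the
CPU scans at submit time.

```c
#include <stddef.h>
#include <stdint.h>

/* Toy encoding, NOT the real Gen7 layout: opcode in bits 31:23 and
 * (total dwords - 2) in bits 7:0 for multi-dword commands. */
#define OP_SHIFT    23
#define OP(x)       ((uint32_t)(x) << OP_SHIFT)
#define OP_NOOP     0x00
#define OP_BB_END   0x0A
#define OP_LRI      0x22
#define OP_BB_START 0x31

enum scan_result {
	SCAN_OK,      /* reached the batch end command */
	SCAN_NO_END,  /* ran out of dwords without seeing a batch end */
	SCAN_OVERRUN, /* a command's length extends past the buffer */
	SCAN_CHAINED, /* hit a batch start; target not visible to this scan */
};

/* Length in dwords of the command whose header is hdr (toy rule). */
static size_t cmd_len(uint32_t hdr)
{
	uint32_t op = hdr >> OP_SHIFT;

	if (op == OP_NOOP || op == OP_BB_END)
		return 1;
	return (size_t)(hdr & 0xff) + 2;
}

/* Walk the batch one command at a time, validating lengths as we go. */
enum scan_result scan_batch(const uint32_t *batch, size_t ndwords)
{
	size_t i = 0;

	while (i < ndwords) {
		uint32_t hdr = batch[i];
		uint32_t op = hdr >> OP_SHIFT;
		size_t len = cmd_len(hdr);

		if (op == OP_BB_END)
			return SCAN_OK;
		if (len > ndwords - i)
			return SCAN_OVERRUN;
		if (op == OP_BB_START)
			return SCAN_CHAINED;
		i += len;
	}
	return SCAN_NO_END;
}
```

A real checker would also have to decide what to do in the SCAN_CHAINED
case: reject, follow the pointer into a CPU-written buffer, or trust it.
Only the first two are possible without knowing who wrote the target.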
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch