[Intel-gfx] [PATCH] drm/i915: Allow null render state batchbuffers bigger than one page

Thu Oct 5 04:34:02 UTC 2017

On Thu, Aug 24, 2017 at 11:00:27PM +0000, Rodrigo Vivi wrote:
> On Thu, Aug 24, 2017 at 3:39 PM, Oscar Mateo <oscar.mateo at intel.com> wrote:
> >
> >
> > On 08/23/2017 05:01 PM, Rodrigo Vivi wrote:
> >>
> >> On Tue, Jul 18, 2017 at 8:15 AM, Oscar Mateo <oscar.mateo at intel.com>
> >> wrote:
> >>>
> >>>
> >>>
> >>> On 07/14/2017 08:08 AM, Chris Wilson wrote:
> >>>>
> >>>> Quoting Oscar Mateo (2017-07-14 15:52:59)
> >>>>>
> >>>>>
> >>>>>
> >>>>> On 07/13/2017 03:28 PM, Rodrigo Vivi wrote:
> >>>>>>
> >>>>>> On Wed, May 3, 2017 at 9:31 AM, Chris Wilson
> >>>>>> <chris at chris-wilson.co.uk>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>> On Wed, May 03, 2017 at 09:12:18AM +0000, Oscar Mateo wrote:
> >>>>>>>>
> >>>>>>>>       On 05/03/2017 08:52 AM, Mika Kuoppala wrote:
> >>>>>>>>
> >>>>>>>>     Oscar Mateo <oscar.mateo at intel.com> writes:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>     On 05/02/2017 09:17 AM, Mika Kuoppala wrote:
> >>>>>>>>
> >>>>>>>>     Chris Wilson <chris at chris-wilson.co.uk> writes:
> >>>>>>>>
> >>>>>>>>
> >>>>>>>>     On Fri, Apr 28, 2017 at 09:11:06AM +0000, Oscar Mateo wrote:
> >>>>>>>>
> >>>>>>>>     The new batchbuffer for CNL surpasses the 4096 byte mark.
> >>>>>>>>
> >>>>>>>>     Cc: Mika Kuoppala <mika.kuoppala at intel.com>
> >>>>>>>>     Cc: Ben Widawsky <ben at bwidawsk.net>
> >>>>>>>>     Signed-off-by: Oscar Mateo <oscar.mateo at intel.com>
> >>>>>>>>
> >>>>>>>>     Evil, 4k+ of nothing-ness that userspace then has to configure for
> >>>>>>>>     itself for correctness anyway.
> >>>>>>>>
> >>>>>>>>     Patch looks ok, but still question the sanity.
> >>>>>>>>
> >>>>>>>>     Is there a requirement for CNL to init the renderstate?
> >>>>>>>>
> >>>>>>>>     I would like to drop the render state init from CNL if
> >>>>>>>>     we can't find evidence that it needs it. Bspec indicates
> >>>>>>>>     that it doesn't.
> >>>>>>
> >>>>>> I'd like to drop it as well, and I was hearing from people around that
> >>>>>> we didn't need it anymore;
> >>>>>> however, without this I had bad failures during power-on...
> >>>>>>
> >>>>> The best I could get from architecture (+Raf) is that setting valid and
> >>>>> coherent values for the whole render state is required as soon as the
> >>>>> context is created, no matter who does it. If you see failures when the
> >>>>> KMD does not do it, that means the UMD must be missing something,
> >>>>> right?
> >>>>
> >>>> That is my initial response as well. The kernel does load one context,
> >>>> just so that the hardware always has space to write to on power saving.
> >>>> The only batch executed for it is the golden render state. Easy enough
> >>>> to only initialise that kernel context to isolate whether it is
> >>>> self-inflicted or that userspace overlooked something in its state
> >>>> management. (I have the view that even if userspace doesn't think it
> >>>> needs to use a particular bit of state today, tomorrow it will so will
> >>>> need it anyway!)
> >>>> -Chris
> >>>
> >>>
> >>> Rodrigo, you have access to a CNL: can you make this test? The idea is to
> >>> find out if the root cause for the failures you were seeing is the kernel
> >>> default context or in the UMD-created contexts.
> >>
> >> I'm sorry for the delay on this one.
> >>
> >> On the parts I have now I couldn't reproduce the issues I saw during
> >> power-on where null context helped.
> >>
> >> But anyways apparently we need this right?!
> >>
> >> What about the 4k+ sanity that Chris raised? Anything we should address
> >> first?
> >
> >
> > I don't think Chris had any problem with the batchbuffer being bigger than
> > 4k per se. His concern was: "why do we need to send this batchbuffer from
> > the KMD at all if the UMD has to send something very similar anyway?".
> > Even if this was true (I haven't found anybody to confirm or deny it) there
> > is still the question of the kernel context (which would never get
> > initialized to valid values by the UMD).
> 
> so, chris, rv-b? acked-by?

Chris, Mika, Oscar...
What should we do with this?
Just discard it, ignore it, and move on without the null context for gen10+?

> 
> > The test was to only send the
> > golden state for the kernel context (and nothing else) and see if your
> > issues went away.
> >
> > Since your issues went away on their own without any golden state
> > whatsoever... does that mean Mesa fixed something they were missing during
> > the PO?
> 
> not sure what it was anymore
> 
> >
> >
> 
> 
> 
> -- 
> Rodrigo Vivi
> Blog: http://blog.vivi.eng.br