[Nouveau] "enable dri3 support without glamor" causes gnome-shell regression on nv4x

Hans de Goede hdegoede at redhat.com
Mon Aug 10 05:47:22 PDT 2015


Hi,

On 03-08-15 20:09, Ilia Mirkin wrote:
> On Mon, Aug 3, 2015 at 1:31 PM, Hans de Goede <hdegoede at redhat.com> wrote:
>> Hi,
>>
>>
>> On 03-08-15 17:36, Ilia Mirkin wrote:
>>>
>>> On Mon, Aug 3, 2015 at 9:02 AM, Hans de Goede <hdegoede at redhat.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> On 30-07-15 16:09, Ilia Mirkin wrote:
>>>>>
>>>>>
>>>>> FWIW this is a fail on nv50+ as well. See for example
>>>>> https://bugs.freedesktop.org/show_bug.cgi?id=91445
>>>>>
>>>>> My suspicion is that this is due to the lack of PUSH_KICK in the *Done
>>>>> exa handlers -- works fine with DRI2, but DRI3 has no synchronization
>>>>> and so the commands never get flushed out. Easily verified by sticking
>>>>> PUSH_KICK's everywhere.
>>>>
>>>>
>>>>
>>>> I do not believe that that is the problem, in my case it clearly
>>>> seems to be a pitch / swizzle problem rather then a synchronizarion
>>>> problem, here is what my desktop with gnome shell looks like when
>>>> using DRI2:
>>>>
>>>> https://fedorapeople.org/~jwrdegoede/nv46-gnome-shell-good.jpg
>>>>
>>>> And this is what it looks like when using DRI3:
>>>>
>>>> https://fedorapeople.org/~jwrdegoede/nv46-gnome-shell-bad.jpg
>>>>
>>>> The DRI2 screenshot is made with Mario's 2 patches on top of
>>>> current master:
>>>>
>>>> http://lists.freedesktop.org/archives/nouveau/2015-July/021740.html
>>>> http://lists.freedesktop.org/archives/nouveau/2015-July/021741.html
>>>>
>>>> And then adding Option "DRI" "2" to xorg.conf.
>>>
>>>
>>> His patches should have defaulted it to DRI 2 I think, so this is
>>> unnecessary. In fact you should have had to say "DRI" "3" to get DRI3
>>> with his patches.
>>>    --
>>>>
>>>>
>>>> I've also tried disabling EXA using Option "AccelMethod" "none",
>>>> but that seems to also automatically disable all DRI, leading to
>>>> software rendering.
>>>>
>>>> I discussed this with Ben this morning and he suggested that this
>>>> is likely a Mesa issue since with DRI3 mesa rather then the ddx
>>>> allocs the surfaces. I've tried disabling swizzling in the
>>>> mesa code by forcing nv30_miptree_create() to always take
>>>> the code path for linear textures, but that leads to the exact
>>>> same result as before that change.
>>>
>>>
>>> Ah yes. Very different problem indeed. I actually suspect it has to do
>>> with swizzling. Look at the white pattern of the moon -- it's all in a
>>> line. That means that it expected some locality and instead it got
>>> drawn all on a line. If it were merely a stride problem, I'd expect to
>>> see strips of the moon below and offset from one another.
>>>
>>> So... take a look at nv30_miptree_from_handle -- I wonder if it can
>>> now receive swizzled textures where it couldn't before.
>>
>>
>> Ok, that does go in the direction I am expecting the problem to be,
>> but I'm afraid I'm going to need a bit more guidance, what exactly
>> am I looking for in that function / which "knobs" should I try to
>> vary / play with to maybe fix this ?
>
> Unfortunately this is playing near (or past) the limits of my
> knowledge as well. My understanding is that DRI3 passes pixmaps around
> with dma-buf, aka "bo_from_handle". DRI2 uses some other mechanism
> which is not that (I think it just copies stuff around).
>
> Now on nv50+, bo's have "tile flags" (and memtype and probably other
> annoyances). The tile flags indicate the specific tiling mechanism
> used on that bo (i.e. do you do 32x32 tiles? 32x64? etc). Take a look
> at the nouveau_bo_new() call in nv50_miptree.c -- note how it takes a
> "bo config" argument. This bo config can then later be retrieved using
> some other syscall.
>
> However on nv30 there appears to not be any such thing. The
> nouveau_bo_new call just passes in NULL for creating the bo, which
> means that there's no way to recover the "are you swizzled"
> information after-the-fact.
>
> Presumably you should create a "nv04" bo config section in the union,

That already exists, and indeed gets set by the nouveau_allocate_surface
function from src/nv_accel_common.c from the ddx,

> and just pass the single "swizzled" bit through. I'm not sure what, if
> anything, is required on the kernel side for that. I don't think
> there's any optionality in how the swizzling is done for pre-nv50.
>
> Note that in the nv30_miptree logic, mt->swizzled implies that
> mt->uniform_pitch = 0, but the level pitch is set "properly" (again,
> see nv30_miptree_create).
>
> Hope this sheds some light and doesn't cause you to go in the wrong
> direction -- please take everything I say with a grain of salt -- I'm
> often a bit off on some of the details.

Thanks this was helpful, I do feel we are getting somewhere, but I do
need a bit more help.

I've added some debug printf's to nv30_miptree.c, nv30_miptree_create
and nv30_miptree_from_handle, where the latter is only used when using
dri2 (e.g. in the working case).

Doing a diff between a log of starting gnome-shell with dri vs dri3
results in this:

--- mesa.log.dri2       2015-08-10 14:18:03.182712022 +0200
+++ mesa.log.dri3       2015-08-10 14:18:33.233336338 +0200
@@ -1,8 +1,8 @@
  nv30_miptree_create 512x32 uniform_pitch 0 usage 0 flags 0
-nv30_miptree_from_handle 1x1 uniform_pitch 1024 usage 0 flags 0
+nv30_miptree_create 1x1 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 1x1 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 512x32 uniform_pitch 0 usage 0 flags 0
-nv30_miptree_from_handle 1x1 uniform_pitch 1024 usage 0 flags 0
+nv30_miptree_create 1x1 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 1x1 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 1x1 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 1x1 uniform_pitch 0 usage 0 flags 0
@@ -12,6 +12,7 @@
  nv30_miptree_create 256x256 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 512x512 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 48x48 uniform_pitch 192 usage 0 flags 0
+nv30_miptree_create 16x16 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 24x24 uniform_pitch 128 usage 0 flags 0
  nv30_miptree_create 24x24 uniform_pitch 128 usage 0 flags 0
  nv30_miptree_create 24x24 uniform_pitch 128 usage 0 flags 0
@@ -20,29 +21,24 @@
  nv30_miptree_create 24x24 uniform_pitch 128 usage 0 flags 0
  nv30_miptree_create 24x24 uniform_pitch 128 usage 0 flags 0
  nv30_miptree_create 24x24 uniform_pitch 128 usage 0 flags 0
-nv30_miptree_create 16x16 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 24x24 uniform_pitch 128 usage 0 flags 0
  nv30_miptree_create 16x16 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 1920x1200 uniform_pitch 7680 usage 0 flags 0
+nv30_miptree_create 24x24 uniform_pitch 128 usage 0 flags 0
  nv30_miptree_create 48x48 uniform_pitch 192 usage 0 flags 0
  nv30_miptree_create 16x16 uniform_pitch 0 usage 0 flags 0
-nv30_miptree_create 24x24 uniform_pitch 128 usage 0 flags 0
-nv30_miptree_create 18x18 uniform_pitch 64 usage 0 flags 0
-nv30_miptree_from_handle 1920x1080 uniform_pitch 8192 usage 0 flags 0
+nv30_miptree_create 1920x1080 uniform_pitch 7680 usage 0 flags 0
  nv30_miptree_create 1920x1080 uniform_pitch 7680 usage 0 flags 0
  nv30_miptree_create 256x256 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 256x256 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 1x1 uniform_pitch 0 usage 0 flags 0
-nv30_miptree_from_handle 1920x1080 uniform_pitch 8192 usage 0 flags 0
+nv30_miptree_create 18x18 uniform_pitch 64 usage 0 flags 0
+nv30_miptree_create 1920x1080 uniform_pitch 7680 usage 0 flags 0
  nv30_miptree_create 48x48 uniform_pitch 192 usage 0 flags 0
-nv30_miptree_from_handle 1920x1080 uniform_pitch 8192 usage 0 flags 0
  nv30_miptree_create 16x16 uniform_pitch 0 usage 0 flags 0
-nv30_miptree_from_handle 1920x1080 uniform_pitch 8192 usage 0 flags 0
  nv30_miptree_create 1920x1080 uniform_pitch 7680 usage 0 flags 0
  nv30_miptree_create 1920x1080 uniform_pitch 7680 usage 0 flags 0
  nv30_miptree_create 1x1 uniform_pitch 0 usage 0 flags 0
  nv30_miptree_create 6x8 uniform_pitch 64 usage 0 flags 0
  nv30_miptree_create 6x8 uniform_pitch 64 usage 0 flags 0
-nv30_miptree_from_handle 1920x1080 uniform_pitch 8192 usage 0 flags 0
  nv30_miptree_create 24x24 uniform_pitch 128 usage 0 flags 0
-nv30_miptree_from_handle 1920x1080 uniform_pitch 8192 usage 0 flags 0

I've also added logging of the surface creation in the ddx, here
is one such log line:

  Surface alloc 1920x1080 at 32 pitch 8192, cfg.nv04.surf_pitch 8192 .surf_flags 2

What stands out to me is:

1) In the dri3 case there are no nv30_miptree_from_handle calls, these are
  replaced by nv30_miptree_create calls

2) These replacement calls use a different pitch for the 1920x... surfaces,
8192 vs 7680

3) The surface creation in the ddx passes in cfg.nv04.surf_... when calling
nouveau_bo_new, where as nv30_miptree_create passes in NULL

4) nv30_miptree_from_handle always sets uniform_pitch in the nv30_miptree, iow
it always assumes non-swizzled ?

The thing I would like to investigate first is 2), it seems to me that
nv30_miptree_create should not be doing pitch alignment itself, so if we
want to have the same pitch for these surfaces as in the dri2 case, we
need to fix this in the caller. Which leads to the question where is
this code being called from in the dri3 case. And this is where I'm
stuck and could use your or Benjamin's input. So do you know where the
caller is and/or how to figure out where the caller is?

Regards,

Hans





More information about the Nouveau mailing list