[PATCH] drm/amdgpu: fix drm leases being broken on radv

Thu Apr 18 09:46:48 UTC 2019

On Thu, Apr 18, 2019 at 11:22 AM Michel Dänzer <michel at daenzer.net> wrote:
>
> On 2019-04-18 11:11 a.m., Daniel Vetter wrote:
> > On Thu, Apr 18, 2019 at 10:52 AM Michel Dänzer <michel at daenzer.net> wrote:
> >> On 2019-04-18 10:26 a.m., Daniel Vetter wrote:
> >>>
> >>> Ok correction: amd has stuck out in the past too, there was some vblank vs
> >>> pageflip stuff where we needed to do some pretty clever tricks to both
> >>> have the new stricter semantics for everyone, while keeping the amd ddx
> >>> happy still. This aint new at all, we fixed up the regressions and moved
> >>> on.
> >>
> >> That sounds like something I would have been involved in, but I'm not
> >> sure what you mean...
> >>
> >> I fixed the amdgpu/radeon kernel drivers to obey the general KMS rules
> >> for vblank vs page flips, and added DRM_MODE_PAGE_FLIP_TARGET to allow
> >> userspace to take advantage of our hardware's capabilities to avoid
> >> delaying a page flip by one refresh cycle in some cases. That's all.
> >
> > That's the one I meant. At least I remember quite a pile of
> > discussions around why i915 gets to define how this stuff works, and
> > amdgpu/radeon having to follow.
> >
> > [...]
> >
> > - the vblank vs flip clarifiation that required new uapi and upgraded
> > amd ddx (really not pretty if we have to somewhat break a performance
> > feature for some users)
>
> You're totally mis-representing that IMO.
>
> The amdgpu/radeon kernel drivers were simply broken WRT established UAPI
> rules, in a way which affected our DDX drivers as well. The kernel
> driver fixes didn't require any corresponding userspace changes.

It wasn't ever defined when we merged the page-flip ioctl, and the
vblank ioctl was retrofitted/abused to also work with kms. i915
happened to guarantee that a flip immediately after a vblank wait will
hit the next vblank, radeon (this stuff is really old) had more wiggle
room. And for unthrottled vblank, this mattered since it would allow
you to update frames still, even when the rendering was really late.

The a lot of people started writing kms compositors, and due to random
reasons they mostly developed those on i915. Relying on the
i915-specific guarantee between vblank waits and page-flips.

Then the entire atomic saga happened, and since I spent way too much
time reading other kms drivers already I wanted to standardize a bunch
of the uapi grey areas. This was starting with 2014/2015. Stuff I
still remember:
- dpms behaviour, standardized on what i915 did in it's private
modeset code (the crtc helper behaviour had a pile of duct-tape
patches, but still ill-defined corner cases).
- vblank semantics around crtc on/off (atomic helpers fully relied on
drivers calling drm_vblank_on/off, which thus far only i915 did)
- and later on vblank vs timestamp, when the nonblocking helpers landed

As a result, even more people started relying on this stuff.
Eventually someone noticed that amdgpu works differently. There was a
pretty heated irc debate between you and me about why exactly amdgpu
needs to follow the i915 standard, and why intel gets to set all these
standards and a few others things. This was around 2016 or so, per git
blame of relevant at least.

We ended up adjusting amd drivers to follow i915 semantics, which
fixed scheduled flips on lots of compositors, and broke unthrottled
flipping (very slightly, but it did) on amd. page_flip_target remedied
that.

One of the lessons I took from all this that retrofitting uapi is best
done across the board, and not just for i915. It's occasionally not
possible.

> New UAPI was added to allow taking advantage of hardware capabilities in
> some cases without breaking other cases.
>
> It's a pretty text-book example of how to manage UAPI.

Well I guess we remember different aspects of that story, but to me it
was a textbook case of how retrofitting uapi has challenges. One of
the options you proposed was that we'd have a flag to select the i915
behaviour, and leave the default undefined. I think that would have
handled the vblank vs pageflip issue in a way Christian is advertising
for here. It's actually what we've done, by adding distinct
rendernodes. Unfortunate reality is that adoption of those is very
slow, and bad examples using primary nodes get copypasted all around
still. Iirc I similarly argued in the vblank vs pageflip case that
even with an explicit flag, the implicit uapi as defined by what most
people run out there wouldn't change, and amgpu/radeon alone wouldn't
be enough by far to change that alone.
-Daniel
-- 
Daniel Vetter
Software Engineer, Intel Corporation
+41 (0) 79 365 57 48 - http://blog.ffwll.ch