[RFC] Using DC in amdgpu for upcoming GPU

Mon Dec 12 07:54:54 UTC 2016

Yep, good point. We have tended to stay a bit behind bleeding edge because our primary tasks so far have been:

1. Support enterprise distros (with old kernels) via the hybrid driver (AMDGPU-PRO), where the closer to upstream we get the more of a gap we have to paper over with KCL code

2. Push architecturally simple code (new GPU support) upstream, where being closer to upstream makes the upstreaming task simpler, but not by that much

So 4.7 isn't as bad a compromise as it might seem.

That said, in the case of DAL/DC it's a different story as you say... architecturally complex code needing to be woven into a fast-moving subsystem of the kernel. So for DAL/DC anything other than upstream is going to be a big pain.

OK, need to think that through.

Thanks!

________________________________
From: dri-devel <dri-devel-bounces at lists.freedesktop.org> on behalf of Daniel Vetter <daniel at ffwll.ch>
Sent: December 12, 2016 2:22 AM
To: Wentland, Harry
Cc: Grodzovsky, Andrey; amd-gfx at lists.freedesktop.org; dri-devel at lists.freedesktop.org; Deucher, Alexander; Cheng, Tony
Subject: Re: [RFC] Using DC in amdgpu for upcoming GPU

On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> Current version of DC:
>
>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>
> Once Alex pulls in the latest patches:
>
>  * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7

One more: That 4.7 here is going to be unbelievable amounts of pain for
you. Yes it's a totally sensible idea to just freeze your baseline kernel
because then Linux looks a lot more like Windows where the driver ABI is
frozen. But it makes following upstream entirely impossible, because
rebasing is always a pain and hence postponed. Which means you can't just
use the latest stuff in upstream drm, which means collaboration with
others and sharing bugfixes in core is a lot more pain, which then means
you do more than necessary in your own code and results in HALs like DAL,
perpetuating the entire mess.

So I think you don't just need to demidlayer DAL/DC, you also need to
demidlayer your development process. In our experience here at Intel that
needs continuous integration testing (in drm-tip), because even 1 month of
not resyncing with drm-next is sometimes way too long. See e.g. the
controlD regression we just had. And DAL is stuck on a 1 year old kernel,
so pretty much only of historical significance and otherwise dead code.

And then for any stuff which isn't upstream yet (like your internal
enabling, or DAL here, or our own internal enabling) you need continuous
rebasing and re-validation. When we started doing this years ago it was still
manual, but we still rebased like every few days to keep the pain down
and adjust continuously to upstream evolution. But then going to a
continuous rebase bot that sends you mail when something goes wrong was
again a massive improvement.
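
A minimal sketch of what such a rebase bot could look like, assuming a local
git checkout and a scheduler (cron or similar) that calls it. The function
names, the "[rebase-bot]" subject tag, and the branch names are hypothetical
and not taken from any existing tooling; actual mail delivery is left out.

```python
import subprocess
from email.message import EmailMessage


def try_rebase(repo_dir, upstream_ref):
    """Rebase the checked-out branch in repo_dir onto upstream_ref.

    Returns (ok, log). On failure the rebase is aborted so the tree is
    left as it was before the attempt.
    """
    result = subprocess.run(
        ["git", "-C", repo_dir, "rebase", upstream_ref],
        capture_output=True, text=True,
    )
    if result.returncode != 0:
        # Undo the half-applied rebase so the next run starts clean.
        subprocess.run(
            ["git", "-C", repo_dir, "rebase", "--abort"],
            capture_output=True, text=True,
        )
    return result.returncode == 0, result.stdout + result.stderr


def build_failure_mail(branch, upstream_ref, log, sender, recipient):
    """Build the mail a bot would send when a rebase stops applying."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = (
        f"[rebase-bot] {branch} no longer rebases onto {upstream_ref}"
    )
    msg.set_content(
        f"Rebase of {branch} onto {upstream_ref} failed:\n\n{log}"
    )
    return msg
```

Run frequently (for example after every upstream update), this only mails
when try_rebase() reports a failure, so conflicts surface while they are
still small.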

I guess in the end Conway's law, that your software architecture
necessarily reflects how you organize your teams, applies again. Fix your
process and it'll become glaringly obvious to everyone involved that
DC-the-design as-is is entirely unworkable and how it needs to be fixed.

From my own experience over the past few years: Doing that is a fun
journey ;-)

Cheers, Daniel
--
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch