[RFC] Using DC in amdgpu for upcoming GPU
Alex Deucher
alexdeucher at gmail.com
Thu Dec 8 17:40:19 UTC 2016
On Thu, Dec 8, 2016 at 10:34 AM, Daniel Vetter <daniel at ffwll.ch> wrote:
> On Thu, Dec 08, 2016 at 09:33:25AM -0500, Harry Wentland wrote:
>> Hi Daniel,
>>
>> just a quick clarification in-line about "validation" inside atomic_commit.
>>
>> On 2016-12-08 04:59 AM, Daniel Vetter wrote:
>> > Hi Harry,
>> >
>> > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
>> > > We propose to use the Display Core (DC) driver for display support on
>> > > AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
>> > > avoid a flag day the plan is to only support uGPU initially and transition
>> > > to older ASICs gradually.
>> > >
>> > > The DC component has received extensive testing within AMD for DCE8, 10, and
>> > > 11 GPUs and is being prepared for uGPU. Support should be better than
>> > > amdgpu's current display support.
>> > >
>> > > * All of our QA effort is focused on DC
>> > > * All of our CQE effort is focused on DC
>> > > * All of our OEM preloads and custom engagements use DC
>> > > * DC behavior mirrors what we do for other OSes
>> > >
>> > > The new ASIC uses a completely redesigned atom interface, so we cannot
>> > > easily leverage much of the existing atom-based code.
>> > >
>> > > We introduced DC to the community earlier in 2016 and received a fair
>> > > amount of feedback. Some of what we've addressed so far:
>> > >
>> > > * Made ASIC-specific code self-contained. We did a bunch of work to
>> > >   pull common sequences into dc/dce and leave ASIC-specific code in
>> > >   separate folders.
>> > > * Started to expose AUX and I2C through generic kernel/drm
>> > >   functionality and are mostly using that. Some of that code is still
>> > >   needlessly convoluted. This cleanup is in progress.
>> > > * Integrated Dave and Jerome's work on removing abstraction in the
>> > >   bios parser.
>> > > * Retired the adapter service and ASIC capability code.
>> > > * Removed some abstraction in GPIO.
>> > >
>> > > Since a lot of our code is shared with pre- and post-silicon validation
>> > > suites, changes need to be made gradually to prevent breakage from a
>> > > major flag day. This, coupled with adding support for new ASICs and
>> > > many new feature introductions, means progress has not been as quick
>> > > as we would have liked. We have nonetheless made a lot of progress.
>> > >
>> > > The remaining concerns that were brought up during the last review that we
>> > > are working on addressing:
>> > >
>> > > * Continue to cleanup and reduce the abstractions in DC where it
>> > > makes sense.
>> > > * Removing duplicate code in I2C and AUX as we transition to using the
>> > > DRM core interfaces. We can't fully transition until we've helped
>> > > fill in the gaps in the drm core that we need for certain features.
>> > > * Making sure Atomic API support is correct. Some of the semantics of
>> > > the Atomic API were not particularly clear when we started this,
>> > > however, that is improving a lot as the core drm documentation
>> > > improves. Getting this code upstream and in the hands of more
>> > > atomic users will further help us identify and rectify any gaps we
>> > > have.
>> > >
>> > > Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
>> > > work on DC is public. We're currently transitioning to a public patch
>> > > review. You can follow our progress on the amd-gfx mailing list. We value
>> > > community feedback on our work.
>> > >
>> > > As an appendix I've included a brief overview of how the code currently
>> > > works, to make understanding and reviewing the code easier.
>> > >
>> > > Prior discussions on DC:
>> > >
>> > > * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
>> > > *
>> > > https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
>> > >
>> > > Current version of DC:
>> > >
>> > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>> > >
>> > > Once Alex pulls in the latest patches:
>> > >
>> > > * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
>> > >
>> > > Best Regards,
>> > > Harry
>> > >
>> > >
>> > > ************************************************
>> > > *** Appendix: A Day in the Life of a Modeset ***
>> > > ************************************************
>> > >
>> > > Below is a high-level overview of a modeset with dc. Some of this might be a
>> > > little out-of-date since it's based on my XDC presentation but it should be
>> > > more-or-less the same.
>> > >
>> > > amdgpu_dm_atomic_commit()
>> > > {
>> > >     /* setup atomic state */
>> > >     drm_atomic_helper_prepare_planes(dev, state);
>> > >     drm_atomic_helper_swap_state(dev, state);
>> > >     drm_atomic_helper_update_legacy_modeset_state(dev, state);
>> > >
>> > >     /* create or remove targets */
>> > >
>> > >     /********************************************************************
>> > >      * *** Call into DC to commit targets with list of all known targets
>> > >      ********************************************************************/
>> > >     /* DC is optimized not to do anything if 'targets' didn't change. */
>> > >     dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
>> > >     {
>> > >         /******************************************************************
>> > >          * *** Build context (function also used for validation)
>> > >          ******************************************************************/
>> > >         result = core_dc->res_pool->funcs->validate_with_context(
>> > >             core_dc, set, target_count, context);
>> >
>> > I can't dig into details of DC, so this is not a 100% assessment, but if
>> > you call a function called "validate" in atomic_commit, you're very, very
>> > likely breaking atomic. _All_ validation must happen in ->atomic_check,
>> > if that's not the case TEST_ONLY mode is broken. And atomic userspace is
>> > relying on that working.
>> >
>>
>> This function is not named correctly. It builds a context and validates
>> at the same time. In commit we only care that it builds the context;
>> validation should never fail here, since the state was already validated
>> in atomic_check.
>>
>> We call the same function at atomic_check
>>
>> amdgpu_dm_atomic_check ->
>> dc_validate_resources ->
>> core_dc->res_pool->funcs->validate_with_context
>
> Ah right, iirc you told me this the last time around too ;-) I guess a
> great example for what I mean with rolling your own world: existing atomic
> drivers put their derived/computed/validated state into their subclassed
> state structures, which means it doesn't need to be re-computed in
> atomic_commit. It also makes sure that the validation code/state
> computation code between check and commit doesn't get out of sync.
>
>> As for the rest, I hear you and appreciate your feedback. Let me get back to
>> you on that later.
>
> Just an added note on that: I do think that there's some driver teams
> who've managed to pull a shared codebase between validation and upstream
> linux (iirc some of the intel wireless drivers work like that). But it
> requires careful aligning of everything, and with something fast-moving
> like drm it might become real painful and not really worth it. So not
> outright rejecting DC (and the code sharing you want to achieve with it)
> as an idea here.
I think we have to make it work. We don't have the resources to have
separate validation and Linux core teams. It's not just the coding.
Much of our validation and compliance testing on Linux leverages this
as well. From our perspective, I think the pain is probably worth it
at this point. Display is starting to eclipse other blocks in
complexity, and not just the complexity of lighting up complex
topologies. The really tough part is that display is basically a
real-time service and the hw is designed with very little margin for
error with respect to timing and bandwidth. That's where much of the
value of sharing resources with validation teams comes from. For us,
that makes the potential pain of dealing with fast-moving drm worth
it. This is not to say that we won't adopt more use of drm
infrastructure; we are working on it within the bounds of our resource
constraints.
Alex