[RFC] Using DC in amdgpu for upcoming GPU

Thu Dec 8 15:46:05 UTC 2016

On Thu, Dec 08, 2016 at 04:41:52PM +0100, Christian König wrote:
> Am 08.12.2016 um 16:34 schrieb Daniel Vetter:
> > On Thu, Dec 08, 2016 at 09:33:25AM -0500, Harry Wentland wrote:
> > > Hi Daniel,
> > > 
> > > just a quick clarification in-line about "validation" inside atomic_commit.
> > > 
> > > On 2016-12-08 04:59 AM, Daniel Vetter wrote:
> > > > Hi Harry,
> > > > 
> > > > On Wed, Dec 07, 2016 at 09:02:13PM -0500, Harry Wentland wrote:
> > > > > We propose to use the Display Core (DC) driver for display support on
> > > > > AMD's upcoming GPU (referred to by uGPU in the rest of the doc). In order to
> > > > > avoid a flag day the plan is to only support uGPU initially and transition
> > > > > to older ASICs gradually.
> > > > > 
> > > > > The DC component has received extensive testing within AMD for DCE8, 10, and
> > > > > 11 GPUs and is being prepared for uGPU. Support should be better than
> > > > > amdgpu's current display support.
> > > > > 
> > > > >   * All of our QA effort is focused on DC
> > > > >   * All of our CQE effort is focused on DC
> > > > >   * All of our OEM preloads and custom engagements use DC
> > > > >   * DC behavior mirrors what we do for other OSes
> > > > > 
> > > > > The new asic utilizes a completely re-designed atom interface, so we cannot
> > > > > easily leverage much of the existing atom-based code.
> > > > > 
> > > > > We've introduced DC to the community earlier in 2016 and received a fair
> > > > > amount of feedback. Some of what we've addressed so far are:
> > > > > 
> > > > >   * Self-contain ASIC specific code. We did a bunch of work to pull
> > > > >     common sequences into dc/dce and leave ASIC specific code in
> > > > >     separate folders.
> > > > >   * Started to expose AUX and I2C through generic kernel/drm
> > > > >     functionality and are mostly using that. Some of that code is still
> > > > >     needlessly convoluted. This cleanup is in progress.
> > > > >   * Integrated Dave and Jerome’s work on removing abstraction in bios
> > > > >     parser.
> > > > >   * Retire adapter service and asic capability
> > > > >   * Remove some abstraction in GPIO
> > > > > 
> > > > > Since a lot of our code is shared with pre- and post-silicon validation
> > > > > suites changes need to be done gradually to prevent breakages due to a major
> > > > > flag day.  This, coupled with adding support for new asics and lots of new
> > > > > feature introductions means progress has not been as quick as we would have
> > > > > liked. We have made a lot of progress none the less.
> > > > > 
> > > > > The remaining concerns that were brought up during the last review that we
> > > > > are working on addressing:
> > > > > 
> > > > >   * Continue to cleanup and reduce the abstractions in DC where it
> > > > >     makes sense.
> > > > >   * Removing duplicate code in I2C and AUX as we transition to using the
> > > > >     DRM core interfaces.  We can't fully transition until we've helped
> > > > >     fill in the gaps in the drm core that we need for certain features.
> > > > >   * Making sure Atomic API support is correct.  Some of the semantics of
> > > > >     the Atomic API were not particularly clear when we started this,
> > > > >     however, that is improving a lot as the core drm documentation
> > > > >     improves.  Getting this code upstream and in the hands of more
> > > > >     atomic users will further help us identify and rectify any gaps we
> > > > >     have.
> > > > > 
> > > > > Unfortunately we cannot expose code for uGPU yet. However refactor / cleanup
> > > > > work on DC is public.  We're currently transitioning to a public patch
> > > > > review. You can follow our progress on the amd-gfx mailing list. We value
> > > > > community feedback on our work.
> > > > > 
> > > > > As an appendix I've included a brief overview of the how the code currently
> > > > > works to make understanding and reviewing the code easier.
> > > > > 
> > > > > Prior discussions on DC:
> > > > > 
> > > > >   * https://lists.freedesktop.org/archives/dri-devel/2016-March/103398.html
> > > > >   *
> > > > > https://lists.freedesktop.org/archives/dri-devel/2016-February/100524.html
> > > > > 
> > > > > Current version of DC:
> > > > > 
> > > > >   * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> > > > > 
> > > > > Once Alex pulls in the latest patches:
> > > > > 
> > > > >   * https://cgit.freedesktop.org/~agd5f/linux/tree/drivers/gpu/drm/amd/display?h=amd-staging-4.7
> > > > > 
> > > > > Best Regards,
> > > > > Harry
> > > > > 
> > > > > 
> > > > > ************************************************
> > > > > *** Appendix: A Day in the Life of a Modeset ***
> > > > > ************************************************
> > > > > 
> > > > > Below is a high-level overview of a modeset with dc. Some of this might be a
> > > > > little out-of-date since it's based on my XDC presentation but it should be
> > > > > more-or-less the same.
> > > > > 
> > > > > amdgpu_dm_atomic_commit()
> > > > > {
> > > > >    /* setup atomic state */
> > > > >    drm_atomic_helper_prepare_planes(dev, state);
> > > > >    drm_atomic_helper_swap_state(dev, state);
> > > > >    drm_atomic_helper_update_legacy_modeset_state(dev, state);
> > > > > 
> > > > >    /* create or remove targets */
> > > > > 
> > > > >    /********************************************************************
> > > > >     * *** Call into DC to commit targets with list of all known targets
> > > > >     ********************************************************************/
> > > > >    /* DC is optimized not to do anything if 'targets' didn't change. */
> > > > >    dc_commit_targets(dm->dc, commit_targets, commit_targets_count)
> > > > >    {
> > > > >      /******************************************************************
> > > > >       * *** Build context (function also used for validation)
> > > > >       ******************************************************************/
> > > > >      result = core_dc->res_pool->funcs->validate_with_context(
> > > > >                                 core_dc,set,target_count,context);
> > > > I can't dig into details of DC, so this is not a 100% assessment, but if
> > > > you call a function called "validate" in atomic_commit, you're very, very
> > > > likely breaking atomic. _All_ validation must happen in ->atomic_check,
> > > > if that's not the case TEST_ONLY mode is broken. And atomic userspace is
> > > > relying on that working.
> > > > 
> > > This function is not really named correctly. What it does is it builds a
> > > context and validates at the same time. In commit we simply care that it
> > > builds the context. Validate should never fail here (since this was already
> > > validated in atomic_check).
> > > 
> > > We call the same function at atomic_check
> > > 
> > > amdgpu_dm_atomic_check ->
> > > 	dc_validate_resources ->
> > > 		core_dc->res_pool->funcs->validate_with_context
> > Ah right, iirc you told me this the last time around too ;-) I guess a
> > great example for what I mean with rolling your own world: Existing atomic
> > drivers put their derived/computed/validated check into their subclasses
> > state structures, which means they don't need to be re-computed in
> > atomic_check. It also makes sure that the validation code/state
> > computation code between check and commit doesn't get out of sync.
> > 
> > > As for the rest, I hear you and appreciate your feedback. Let me get back to
> > > you on that later.
> > Just an added note on that: I do think that there's some driver teams
> > who've managed to pull a shared codebase between validation and upstream
> > linux (iirc some of the intel wireless drivers work like that). But it
> > requires careful aligning of everything, and with something fast-moving
> > like drm it might become real painful and not really worth it. So not
> > outright rejecting DC (and the code sharing you want to achieve with it)
> > as an idea here.
> 
> I used to have examples of such a things for other network drivers as well,
> but right now I can't find them of hand. Leave me a note if you need more
> info on existing things.
> 
> A good idea might as well be to take a look at drivers shared between Linux
> and BSD as well, cause both code bases are usually public available and you
> can see what changes during porting and what stays the same.

bsd and linux might not be a good example anymore, at least in the gfx
space - upstream linux has so massively outpaced bsd kernels that they
stopped porting and switched over to implement a shim in the bsd drm
subsystem to fully emulate the linux interfaces. I think on the networking
and storage side things are a bit better aligned still, and not quite
moving as fast, to make a more native approach on each OS feasible.
-Daniel

> 
> Regards,
> Christian.
> 
> > -Daniel
> > 
> > > Thanks,
> > > Harry
> > > 
> > > 
> > > > The only thing that you're allowed to return from ->atomic_commit is
> > > > out-of-memory, hw-on-fire and similar unforeseen and catastrophic issues.
> > > > Kerneldoc expklains this.
> > > > 
> > > > Now the reason I bring this up (and we've discussed it at length in
> > > > private) is that DC still suffers from a massive abstraction midlayer. A
> > > > lot of the back-end stuff (dp aux, i2c, abstractions for allocation,
> > > > timers, irq, ...) have been cleaned up, but the midlayer is still there.
> > > > And I understand why you have it, and why it's there - without some OS
> > > > abstraction your grand plan of a unified driver across everything doesn't
> > > > work out so well.
> > > > 
> > > > But in a way the backend stuff isn't such a big deal. It's annoying since
> > > > lots of code, and bugfixes have to be duplicated and all that, but it's
> > > > fairly easy to fix case-by-case, and as long as AMD folks stick around
> > > > (which I fully expect) not a maintainance issue. It makes it harder for
> > > > others to contribute, but then since it's mostly the leaf it's generally
> > > > easy to just improve the part you want to change (as an outsider). And if
> > > > you want to improve shared code the only downside is that you can't also
> > > > improve amd, but that's not so much a problem for non-amd folks ;-)
> > > > 
> > > > The problem otoh with the abstraction layer between drm core and the amd
> > > > driver is that you can't ignore if you want to refactor shared code. And
> > > > because it's an entire world of its own, it's much harder to understand
> > > > what the driver is doing (without reading it all). Some examples of what I
> > > > mean:
> > > > 
> > > > - All other drm drivers subclass drm objects (by embedding them) into the
> > > >    corresponding hw part that most closely matches the drm object's
> > > >    semantics. That means even when you have 0 clue about how a given piece
> > > >    of hw works, you have a reasonable chance of understanding code. If it's
> > > >    all your own stuff you always have to keep in minde the special amd
> > > >    naming conventions. That gets old real fast if you trying to figure out
> > > >    what 20+ (or are we at 30 already?) drivers are doing.
> > > > 
> > > > - This is even more true for atomic. Atomic has a pretty complicated
> > > >    check/commmit transactional model for updating display state. It's a
> > > >    standardized interface, and it's extensible, and we want generic
> > > >    userspace to be able to run on any driver. Fairly often we realize that
> > > >    semantics of existing or newly proposed properties and state isn't
> > > >    well-defined enough, and then we need to go&read all the drivers and
> > > >    figure out how to fix up the mess. DC has it's entirely separate state
> > > >    structures which again don't subclass the atomic core structures (afaik
> > > >    at least). Again the same problems apply that you can't find things, and
> > > >    that figuring out the exact semantics and spotting differences in
> > > >    behaviour is almost impossible.
> > > > 
> > > > - The trouble isn't just in reading code and understanding it correctly,
> > > >    it's also in finding it. If you have your own completely different world
> > > >    then just finding the right code is hard - cscope and grep fail to work.
> > > > 
> > > > - Another issue is that very often we unify semantics in drivers by adding
> > > >    some new helpers that at least dtrt for most of the drivers. If you have
> > > >    your own world then the impendance mismatch will make sure that amd
> > > >    drivers will have slightly different semantics, and I think that's not
> > > >    good for the ecosystem and kms - people want to run a lot more than just
> > > >    a boot splash with generic kms userspace, stuff like xf86-video-$vendor
> > > >    is going out of favour heavily.
> > > > 
> > > > Note that all this isn't about amd walking away and leaving an
> > > > unmaintainable mess behind. Like I've said I don't think this is a big
> > > > risk. The trouble is that having your own world makes it harder for
> > > > everyone else to understand the amd driver, and understanding all drivers
> > > > is very often step 1 in some big refactoring or feature addition effort.
> > > > Because starting to refactor without understanding the problem generally
> > > > doesn't work ;_) And you can't make this step 1 easier for others by
> > > > promising to always maintain DC and update it to all the core changes,
> > > > because that's only step 2.
> > > > 
> > > > In all the DC discussions we've had thus far I haven't seen anyone address
> > > > this issue. And this isn't just an issue in drm, it's pretty much
> > > > established across all linux subsystems with the "no midlayer or OS
> > > > abstraction layers in drivers" rule. There's some real solid reasons why
> > > > such a HAl is extremely unpopular with upstream. And I haven't yet seen
> > > > any good reason why amd needs to be different, thus far it looks like a
> > > > textbook case, and there's been lots of vendors in lots of subsystems who
> > > > tried to push their HAL.
> > > > 
> > > > Thanks, Daniel
> > > > 
> > > > >      /******************************************************************
> > > > >       * *** Apply safe power state
> > > > >       ******************************************************************/
> > > > >      pplib_apply_safe_state(core_dc);
> > > > > 
> > > > >      /****************************************************************
> > > > >       * *** Apply the context to HW (program HW)
> > > > >       ****************************************************************/
> > > > >      result = core_dc->hwss.apply_ctx_to_hw(core_dc,context)
> > > > >      {
> > > > >        /* reset pipes that need reprogramming */
> > > > >        /* disable pipe power gating */
> > > > >        /* set safe watermarks */
> > > > > 
> > > > >        /* for all pipes with an attached stream */
> > > > >          /************************************************************
> > > > >           * *** Programming all per-pipe contexts
> > > > >           ************************************************************/
> > > > >          status = apply_single_controller_ctx_to_hw(...)
> > > > >          {
> > > > >            pipe_ctx->tg->funcs->set_blank(...);
> > > > >            pipe_ctx->clock_source->funcs->program_pix_clk(...);
> > > > >            pipe_ctx->tg->funcs->program_timing(...);
> > > > >            pipe_ctx->mi->funcs->allocate_mem_input(...);
> > > > >            pipe_ctx->tg->funcs->enable_crtc(...);
> > > > >            bios_parser_crtc_source_select(...);
> > > > > 
> > > > >            pipe_ctx->opp->funcs->opp_set_dyn_expansion(...);
> > > > >            pipe_ctx->opp->funcs->opp_program_fmt(...);
> > > > > 
> > > > >            stream->sink->link->link_enc->funcs->setup(...);
> > > > >            pipe_ctx->stream_enc->funcs->dp_set_stream_attribute(...);
> > > > >            pipe_ctx->tg->funcs->set_blank_color(...);
> > > > > 
> > > > >            core_link_enable_stream(pipe_ctx);
> > > > >            unblank_stream(pipe_ctx,
> > > > > 
> > > > >            program_scaler(dc, pipe_ctx);
> > > > >          }
> > > > >        /* program audio for all pipes */
> > > > >        /* update watermarks */
> > > > >      }
> > > > > 
> > > > >      program_timing_sync(core_dc, context);
> > > > >      /* for all targets */
> > > > >        target_enable_memory_requests(...);
> > > > > 
> > > > >      /* Update ASIC power states */
> > > > >      pplib_apply_display_requirements(...);
> > > > > 
> > > > >      /* update surface or page flip */
> > > > >    }
> > > > > }
> > > > > 
> > > > > 
> > > > > _______________________________________________
> > > > > dri-devel mailing list
> > > > > dri-devel at lists.freedesktop.org
> > > > > https://lists.freedesktop.org/mailman/listinfo/dri-devel
> 
> 

-- 
Daniel Vetter
Software Engineer, Intel Corporation
http://blog.ffwll.ch