[RFC] Mechanism for high priority scheduling in amdgpu
Christian König
christian.koenig at amd.com
Mon Jan 2 15:43:40 UTC 2017
Indeed a couple of nice numbers.
> but everything already committed
> to the HW queue is executed in strict FIFO order.
Well, actually, if we get a high priority submission we could in theory
preempt/abort everything ahead of it on the ring buffer.
Probably not at as fine a granularity as the hardware scheduler, but it
might be easier to get working.
Regards,
Christian.
On 26.12.2016 at 03:26, zhoucm1 wrote:
> Nice experiment, which is exactly what the SW scheduler can provide.
> And as you said "I.e. your context can be scheduled into the
> HW queue ahead of any other context, but everything already committed
> to the HW queue is executed in strict FIFO order."
>
> If you want to keep consistent latency, you will need to enable the hw
> priority queue feature.
>
> Regards,
> David Zhou
>
> On December 24, 2016 06:20, Andres Rodriguez wrote:
>> Hey John,
>>
>> I've collected a bit of data using high priority SW scheduler queues,
>> thought you might be interested.
>>
>> Implementation as per the patch above.
>>
>> Control test 1
>> ==============
>>
>> Sascha Willems mesh sample running on its own at regular priority
>>
>> Results
>> -------
>>
>> Mesh: ~0.14ms per-frame latency
>>
>> Control test 2
>> ==============
>>
>> Two Sascha Willems mesh samples running simultaneously at regular priority
>>
>> Results
>> -------
>>
>> Mesh 1: ~0.26ms per-frame latency
>> Mesh 2: ~0.26ms per-frame latency
>>
>> Test 1
>> ======
>>
>> Two Sascha Willems mesh samples running simultaneously. One at high
>> priority and the other running in a regular priority graphics context.
>>
>> Results
>> -------
>>
>> Mesh High: 0.14 - 0.24ms per-frame latency
>> Mesh Regular: 0.24 - 0.40ms per-frame latency
>>
>> Test 2
>> ======
>>
>> Ten Sascha Willems mesh samples running simultaneously. One at high
>> priority and the others running in a regular priority graphics context.
>>
>> Results
>> -------
>>
>> Mesh High: 0.14 - 0.8ms per-frame latency
>> Mesh Regular: 1.10 - 2.05ms per-frame latency
>>
>> Test 3
>> ======
>>
>> Two Sascha Willems mesh samples running simultaneously. One at high
>> priority and the other running in a regular priority graphics context.
>>
>> Also running Unigine Heaven at Extreme preset @ 2560x1600
>>
>> Results
>> -------
>>
>> Mesh High: 7 - 100ms per-frame latency (Lots of fluctuation)
>> Mesh Regular: 40 - 130ms per-frame latency (Lots of fluctuation)
>> Unigine Heaven: 20-40 fps
>>
>>
>> Test 4
>> ======
>>
>> Two Sascha Willems mesh samples running simultaneously. One at high
>> priority and the other running in a regular priority graphics context.
>>
>> Also running Talos Principle @ 4K
>>
>> Results
>> -------
>>
>> Mesh High: 0.14 - 3.97ms per-frame latency (Mostly floats ~0.4ms)
>> Mesh Regular: 0.43 - 8.11ms per-frame latency (Lots of fluctuation)
>> Talos: 24.8 fps AVG
>>
>> Observations
>> ============
>>
>> The high priority queue based on the SW scheduler provides significant
>> gains when paired with tasks that submit short duration commands into
>> the queue. This can be observed in tests 1 and 2.
>>
>> When the pipe is full of long running commands, the effects are dampened.
>> As observed in test 3, the per-frame latency suffers very large spikes,
>> and the latencies are very inconsistent.
>>
>> Talos seems to be a better behaved game. It may be submitting shorter
>> draw commands and the SW scheduler is able to interleave the rest of
>> the work.
>>
>> The results seem consistent with the hypothetical advantages the SW
>> scheduler should provide. I.e. your context can be scheduled into the
>> HW queue ahead of any other context, but everything already committed
>> to the HW queue is executed in strict FIFO order.
>>
>> In order to deal with cases similar to Test 3, we will need to take
>> advantage of further features.
>>
>> Notes
>> =====
>>
>> - Tests were run multiple times, and reboots were performed during tests.
>> - The mesh sample isn't really designed for benchmarking, but it should
>> be decent for ballpark figures
>> - The high priority mesh app was run with default niceness and also
>> niceness at -20. This had no effect on the results, so it was not
>> added above.
>> - CPU usage was not saturated while running the tests
>>
>> Regards,
>> Andres
>>
>>
>> On Fri, Dec 23, 2016 at 1:18 PM, Pierre-Loup A. Griffais
>> <pgriffais at valvesoftware.com <mailto:pgriffais at valvesoftware.com>> wrote:
>>
>> I hate to keep bringing up display topics in an unrelated
>> conversation, but I'm not sure where you got "Application -> X
>> server -> compositor -> X server" from. As I was saying before,
>> we need to be presenting directly to the HMD display as no
>> display server can be in the way, both for latency but also
>> quality of service reasons (a buggy application cannot be allowed
>> to accidentally display undistorted rendering into the HMD); we
>> intend to do the necessary work for this, and the extent of X's
>> (or a Wayland implementation, or any other display server)
>> involvement will be to participate enough to know that the HMD
>> display is off-limits. If you have more questions on the display
>> aspect, or VR rendering in general, I'm happy to try to address
>> them out-of-band from this conversation.
>>
>>
>> On 12/23/2016 02:54 AM, Christian König wrote:
>>
>> But yes, in general you don't want another compositor in the way, so
>> we'll be acquiring the HMD display directly, separate from any desktop
>> or display server.
>>
>> Assuming that the HMD is attached to the rendering device in some way,
>> you have the X server and the compositor which both try to be DRM
>> master at the same time.
>>
>> Please correct me if that was fixed in the meantime, but that sounds
>> like it will simply not work. Or is this what Andres mentions below
>> that Dave is working on?
>>
>> Additionally, a compositor in combination with X is a bit
>> counterproductive when you want to keep the latency low.
>>
>> E.g. the "normal" flow of a GL or Vulkan surface filled with rendered
>> data to be displayed is from the Application -> X server -> compositor
>> -> X server.
>>
>> The extra step between X server and compositor just means extra latency
>> and for this use case you probably don't want that.
>>
>> Targeting something like Wayland, and XWayland when you need X
>> compatibility, sounds like the much better idea.
>>
>> Regards,
>> Christian.
>>
>> On 22.12.2016 at 20:54, Pierre-Loup A. Griffais wrote:
>>
>> Display concerns are a separate issue, and as Andres said we have
>> other plans to address them. But yes, in general you don't want
>> another compositor in the way, so we'll be acquiring the HMD display
>> directly, separate from any desktop or display server. Same with
>> security, we can have a separate conversation about that when the
>> time comes.
>>
>> On 12/22/2016 08:41 AM, Serguei Sagalovitch wrote:
>>
>> Andres,
>>
>> Did you measure latency, etc. impact of __any__
>> compositor?
>>
>> My understanding is that VR has pretty strict
>> requirements related to
>> QoS.
>>
>> Sincerely yours,
>> Serguei Sagalovitch
>>
>>
>> On 2016-12-22 11:35 AM, Andres Rodriguez wrote:
>>
>> Hey Christian,
>>
>> We are currently interested in X, but with some
>> distros switching to
>> other compositors by default, we also need to
>> consider those.
>>
>> We agree, running the full vrcompositor in root
>> isn't something that
>> we want to do. Too many security concerns. Having
>> a small root helper
>> that does the privilege escalation for us is the
>> initial idea.
>>
>> For a long term approach, Pierre-Loup and Dave
>> are working on dealing
>> with the "two compositors" scenario a little
>> better in DRM+X.
>> Fullscreen isn't really a sufficient approach,
>> since we don't want the
>> HMD to be used as part of the Desktop environment
>> when a VR app is not
>> in use (this is extremely annoying).
>>
>> When the above is settled, we should have an auth
>> mechanism besides
>> DRM_MASTER or DRM_AUTH that allows the
>> vrcompositor to take over the
>> HMD permanently away from X. Re-using that auth
>> method to gate this
>> IOCTL is probably going to be the final solution.
>>
>> I propose to start with ROOT_ONLY since it should
>> allow us to respect
>> kernel IOCTL compatibility guidelines with the
>> most flexibility. Going
>> from a restrictive to a more flexible permission
>> model would be
>> inclusive, but going from a general to a
>> restrictive model may exclude
>> some apps that used to work.
>>
>> Regards,
>> Andres
>>
>> On 12/22/2016 6:42 AM, Christian König wrote:
>>
>> Hi Andres,
>>
>> well using root might cause stability and
>> security problems as well.
>> We worked quite hard to avoid exactly this for X.
>>
>> We could make this feature depend on the
>> compositor being DRM master,
>> but for example with X the X server is master
>> (and e.g. can change
>> resolutions etc..) and not the compositor.
>>
>> So another question is also what windowing
>> system (if any) are you
>> planning to use? X, Wayland, Flinger or
>> something completely
>> different ?
>>
>> Regards,
>> Christian.
>>
>> On 20.12.2016 at 16:51, Andres Rodriguez wrote:
>>
>> Hi Christian,
>>
>> That is definitely a concern. What we are
>> currently thinking is to
>> make the high priority queues accessible
>> to root only.
>>
>> Therefore, if a non-root user attempts to set the high priority flag
>> on context allocation, we would fail the call and return -EPERM.
>>
>> Regards,
>> Andres
>>
>>
>> On 12/20/2016 7:56 AM, Christian König wrote:
>>
>> BTW: If there is non-VR
>> application which will use
>> high-priority
>> h/w queue then VR application
>> will suffer. Any ideas how
>> to solve it?
>>
>> Yeah, that problem came to my mind as
>> well.
>>
>> Basically we need to restrict those
>> high priority submissions to
>> the VR compositor or otherwise any
>> malfunctioning application could
>> use it.
>>
>> Just think about some WebGL suddenly
>> taking all our rendering away
>> and we won't get anything drawn any more.
>>
>> Alex or Michel any ideas on that?
>>
>> Regards,
>> Christian.
>>
>> On 19.12.2016 at 15:48, Serguei Sagalovitch wrote:
>>
>> > If compute queue is occupied only by you, the efficiency
>> > is equal with setting job queue to high priority I think.
>> The only risk is the situation when graphics will take all
>> needed CUs. But in any case it should be a very good test.
>>
>> Andres/Pierre-Loup,
>>
>> Did you try to do it, or is it a lot of work for you?
>>
>> BTW: If there is non-VR application which will use high-priority
>> h/w queue then VR application will suffer. Any ideas how to solve it?
>>
>> Sincerely yours,
>> Serguei Sagalovitch
>>
>> On 2016-12-19 12:50 AM, zhoucm1 wrote:
>>
>> Do you encounter the priority issue for the compute queue with the
>> current driver?
>>
>> If the compute queue is occupied only by you, the efficiency is equal
>> to setting the job queue to high priority, I think.
>>
>> Regards,
>> David Zhou
>>
>> On December 19, 2016 13:29, Andres Rodriguez wrote:
>>
>> Yes, vulkan is available on all-open through the mesa radv UMD.
>>
>> I'm not sure if I'm asking for too much, but if we can coordinate a
>> similar interface in radv and amdgpu-pro at the vulkan level that
>> would be great.
>>
>> I'm not sure what that's going to be yet.
>>
>> - Andres
>>
>> On 12/19/2016 12:11 AM, zhoucm1 wrote:
>>
>>
>>
>> On December 19, 2016 11:33, Pierre-Loup A. Griffais wrote:
>>
>> We're currently working with the open stack; I assume that a mechanism
>> could be exposed by both open and Pro Vulkan userspace drivers and that
>> the amdgpu kernel interface improvements we would pursue following this
>> discussion would let both drivers take advantage of the feature,
>> correct?
>>
>> Of course.
>> Does the open stack have Vulkan support?
>>
>> Regards,
>> David Zhou
>>
>> On 12/18/2016 07:26 PM, zhoucm1 wrote:
>>
>> By the way, are you using the all-open driver or the amdgpu-pro driver?
>>
>> +David Mao, who is working on our Vulkan driver.
>>
>> Regards,
>> David Zhou
>>
>> On December 18, 2016 06:05, Pierre-Loup A. Griffais wrote:
>>
>> Hi Serguei,
>>
>> I'm also working on bringing up our VR runtime on top of amdgpu; see
>> replies inline.
>>
>> On 12/16/2016 09:05 PM, Sagalovitch, Serguei wrote:
>>
>> > Andres,
>> >
>> > > For current VR workloads we have 3 separate processes running
>> > > actually:
>> >
>> > So we could have a potential memory overcommit case, or do you do
>> > partitioning on your own? I would think that there is a need to avoid
>> > overcommit in the VR case to prevent any BO migration.
>>
>> You're entirely correct; currently the VR runtime is setting up
>> prioritized CPU scheduling for its VR compositor, we're working on
>> prioritized GPU scheduling and pre-emption (eg. this thread), and in
>> the future it will make sense to do work in order to make sure that its
>> memory allocations do not get evicted, to prevent any unwelcome
>> additional latency in the event of needing to perform just-in-time
>> reprojection.
>>
>> > BTW: Do you mean __real__ processes or threads? Based on my
>> > understanding sharing BOs between different processes could introduce
>> > additional synchronization constraints. btw: I am not sure if we are
>> > able to share Vulkan sync. objects across a process boundary.
>>
>> They are different processes; it is important for the compositor that
>> is responsible for quality-of-service features such as consistently
>> presenting distorted frames with the right latency, reprojection, etc,
>> to be separate from the main application.
>>
>> Currently we are using unreleased cross-process memory and semaphore
>> extensions to fetch updated eye images from the client application, but
>> the just-in-time reprojection discussed here does not actually have any
>> direct interactions with cross-process resource sharing, since it's
>> achieved by using whatever is the latest, most up-to-date eye images
>> that have already been sent by the client application, which are
>> already available to use without additional synchronization.
>>
>> > > 3) System compositor (we are looking at approaches to remove this
>> > > overhead)
>> >
>> > Yes, IMHO the best is to run in "full screen mode".
>>
>> Yes, we are working on mechanisms to present directly to the headset
>> display without any intermediaries as a separate effort.
>>
>> > > The latency is our main concern,
>> >
>> > I would assume that this is the known problem (at least for compute
>> > usage). It looks like amdgpu / kernel submission is rather CPU
>> > intensive (at least in the default configuration).
>>
>> As long as it's a consistent cost, it shouldn't be an issue. However,
>> if there's a high degree of variance then that would be troublesome and
>> we would need to account for the worst case.
>>
>> Hopefully the requirements and approach we described make sense, we're
>> looking forward to your feedback and suggestions.
>>
>> Thanks!
>>  - Pierre-Loup
>>
>> > Sincerely yours,
>> > Serguei Sagalovitch
>>
>>
>> From: Andres Rodriguez <andresr at valvesoftware.com>
>> Sent: December 16, 2016 10:00 PM
>> To: Sagalovitch, Serguei; amd-gfx at lists.freedesktop.org
>> Subject: RE: [RFC] Mechanism for high priority scheduling in amdgpu
>>
>> Hey Serguei,
>>
>> > [Serguei] No. I mean pipe :-) as the MEC defines it. As far as I
>> > understand (by simplifying) some scheduling is per pipe. I know about
>> > the current allocation scheme but I do not think that it is ideal. I
>> > would assume that we need to switch to dynamic partitioning of
>> > resources based on the workload, otherwise we will have resource
>> > conflicts between Vulkan compute and OpenCL.
>>
>> I agree the partitioning isn't ideal. I'm hoping we can start with a
>> solution that assumes that only pipe0 has any work and the other pipes
>> are idle (no HSA/ROCm running on the system).
>>
>> This should be more or less the use case we expect from VR users.
>>
>> I agree the split is currently not ideal, but I'd like to consider that
>> a separate task, because making it dynamic is not straightforward :P
>>
>> > [Serguei] Vulkan works via amdgpu (kernel submissions) so amdkfd will
>> > not be involved. I would assume that in the case of VR we will have
>> > one main application ("console" mode(?)) so we could temporarily
>> > "ignore" OpenCL/ROCm needs when VR is running.
>>
>> Correct, this is why we want to enable the high priority compute queue
>> through libdrm-amdgpu, so that we can expose it through Vulkan later.
>>
>> For current VR workloads we have 3 separate processes running actually:
>>
>> 1) Game process
>>
>> 2) VR Compositor (this is the process that will require the high
>> priority queue)
>>
>> 3) System compositor (we are looking at approaches to remove this
>> overhead)
>>
>> For now I think it is okay to assume no OpenCL/ROCm running
>> simultaneously, but I would also like to be able to address this case
>> in the future (cross-pipe priorities).
>>
>> > [Serguei] The problem with pre-emption of a graphics task: (a) it may
>> > take time so latency may suffer
>>
>> The latency is our main concern, we want something that is predictable.
>> A good illustration of what the reprojection scheduling looks like can
>> be found here:
>> https://community.amd.com/servlet/JiveServlet/showImage/38-1310-104754/pastedImage_3.png
>>
>> > (b) to preempt we need to have a different "context" - we want to
>> > guarantee that submissions from the same context will be executed in
>> > order.
>>
>> This is okay, as the reprojection work doesn't have dependencies on the
>> game context, and it even happens in a separate process.
>>
>> > BTW: (a) Do you want "preempt" and later resume or do you want
>> > "preempt" and "cancel/abort"?
>>
>> Preempt the game with the compositor task and then resume it.
>>
>> > (b) Vulkan is a generic API and could be used for graphics as well as
>> > for plain compute tasks (VK_QUEUE_COMPUTE_BIT).
>>
>> Yeah, the plan is to use vulkan compute. But if you figure out a way
>> for us to get a guaranteed execution time using vulkan graphics, then
>> I'll take you out for a beer :)
>>
>> Regards,
>> Andres
>> ________________________________________
>> From: Sagalovitch, Serguei [Serguei.Sagalovitch at amd.com]
>> Sent: Friday, December 16, 2016 9:13 PM
>> To: Andres Rodriguez; amd-gfx at lists.freedesktop.org
>> Subject: Re: [RFC] Mechanism for high priority scheduling in amdgpu
>>
>> Hi Andres,
>>
>> Please see inline (as [Serguei])
>>
>> Sincerely yours,
>> Serguei Sagalovitch
>>
>> From: Andres Rodriguez <andresr at valvesoftware.com>
>> Sent: December 16, 2016 8:29 PM
>> To: Sagalovitch, Serguei; amd-gfx at lists.freedesktop.org
>> Subject: RE: [RFC] Mechanism for high priority scheduling in amdgpu
>>
>> Hi Serguei,
>>
>> Thanks for the feedback. Answers inline as [AR].
>>
>> Regards,
>> Andres
>>
>> ________________________________________
>> From: Sagalovitch, Serguei [Serguei.Sagalovitch at amd.com]
>> Sent: Friday, December 16, 2016 8:15 PM
>> To: Andres Rodriguez; amd-gfx at lists.freedesktop.org
>> Subject: Re: [RFC] Mechanism for high priority scheduling in amdgpu
>>
>> Andres,
>>
>> Quick comments:
>>
>> 1) To minimize "bubbles", etc. we need to "force" CU assignments/binding
>> to the high-priority queue when it will be in use and "free" them later
>> (we do not want to take CUs away from e.g. the graphics task forever and
>> degrade graphics performance).
>>
>> Otherwise we could have a scenario where a long graphics task (or
>> low-priority compute) takes all (extra) CUs and the high-priority work
>> will wait for the needed resources. It will not be visible with "NOP"
>> but only when you submit a "real" compute task, so I would recommend not
>> to use "NOP" packets at all for testing.
>>
>> It (CU assignment) could be done relatively easily when everything is
>> going via the kernel (e.g. as part of frame submission) but I must admit
>> that I am not sure about the best way for user level submissions
>> (amdkfd).
>>
>> [AR] I wasn't aware of this part of the programming sequence. Thanks for
>> the heads up! Is this similar to the CU masking programming?
>>
>> [Serguei] Yes. To simplify: the problem is that the "scheduler", when
>> deciding which queue to run, will check if there are enough resources
>> and if not then it will begin to check other queues with lower priority.
>>
>> 2) I would recommend to dedicate the whole pipe to the high-priority
>> queue and have nothing there except it.
>>
>> [AR] I'm guessing in this context you mean pipe = queue? (as opposed to
>> the MEC definition of pipe, which is a grouping of queues). I say this
>> because amdgpu only has access to 1 pipe, and the rest are statically
>> partitioned for amdkfd usage.
>>
>> [Serguei] No. I mean pipe :-) as the MEC defines it. As far as I
>> understand (by simplifying) some scheduling is per pipe. I know about
>> the current allocation scheme but I do not think that it is ideal. I
>> would assume that we need to switch to dynamic partitioning of resources
>> based on the workload, otherwise we will have resource conflicts between
>> Vulkan compute and OpenCL.
>>
>> BTW: Which user level API do you want to use for compute: Vulkan or
>> OpenCL?
>>
>> [AR] Vulkan
>>
>> [Serguei] Vulkan works via amdgpu (kernel submissions) so amdkfd will
>> not be involved. I would assume that in the case of VR we will have one
>> main application ("console" mode(?)) so we could temporarily "ignore"
>> OpenCL/ROCm needs when VR is running.
>>
>> > we will not be able to provide a solution compatible with GFX
>> > workloads.
>>
>> I assume that you are talking about graphics? Am I right?
>>
>> [AR] Yeah, my understanding is that pre-empting the currently running
>> graphics job and scheduling in something else using mid-buffer
>> pre-emption has some cases where it doesn't work well. But if with
>> polaris10 it starts working well, it might be a better solution for us
>> (because the whole reprojection work uses the vulkan graphics stack at
>> the moment, and porting it to compute is not trivial).
>>
>> [Serguei] The problem with pre-emption of a graphics task: (a) it may
>> take time so latency may suffer (b) to preempt we need to have a
>> different "context" - we want to guarantee that submissions from the
>> same context will be executed in order.
>> BTW: (a) Do you want "preempt" and later resume or do you want "preempt"
>> and "cancel/abort"? (b) Vulkan is a generic API and could be used for
>> graphics as well as for plain compute tasks (VK_QUEUE_COMPUTE_BIT).
>>
>> Sincerely yours,
>> Serguei Sagalovitch
>>
>>
>>
>> From: amd-gfx <amd-gfx-bounces at lists.freedesktop.org> on behalf of
>> Andres Rodriguez <andresr at valvesoftware.com>
>> Sent: December 16, 2016 6:15 PM
>> To: amd-gfx at lists.freedesktop.org
>> Subject: [RFC] Mechanism for high priority scheduling in amdgpu
>>
>> Hi Everyone,
>>
>> This RFC is also available as a gist here:
>> https://gist.github.com/lostgoat/7000432cd6864265dbc2c3ab93204249
>>
>> We are interested in feedback for a mechanism to effectively schedule
>> high priority VR reprojection tasks (also referred to as time-warping)
>> for Polaris10 running on the amdgpu kernel driver.
>>
>> Brief context:
>> --------------
>>
>> The main objective of reprojection is to avoid motion sickness for VR
>> users in scenarios where the game or application would fail to finish
>> rendering a new frame in time for the next VBLANK. When this happens,
>> the user's head movements are not reflected on the Head Mounted Display
>> (HMD) for the duration of an extra frame. This extended mismatch
>> between the inner ear and the eyes may cause the user to experience
>> motion sickness.
>>
>> The VR compositor deals with this problem by fabricating a new frame
>> using the user's updated head position in combination with the previous
>> frames. This avoids a prolonged mismatch between the HMD output and the
>> inner ear.
>>
>> Because of the adverse effects on the user, we require high confidence
>> that the reprojection task will complete before the VBLANK interval,
>> even if the GFX pipe is currently full of work from the
>> game/application (which is most likely the case).
>>
>> For more details and illustrations, please refer to the following
>> document:
>> https://community.amd.com/community/gaming/blog/2016/03/28/asynchronous-shaders-evolved
>> Requirements:
>> -------------
>>
>> The mechanism must expose the following functionality:
>>
>> * Job round trip time must be predictable, from submission to fence
>>   signal
>>
>> * The mechanism must support compute workloads.
>>
>> Goals:
>> ------
>>
>> * The mechanism should provide low submission latencies
>>
>> Test: submitting a NOP packet through the mechanism on busy hardware
>> should be equivalent to submitting a NOP on idle hardware.
>>
>> Nice to have:
>> -------------
>>
>> * The mechanism should also support GFX workloads.
>>
>> My understanding is that with the current hardware capabilities in
>> Polaris10 we will not be able to provide a solution compatible with GFX
>> workloads.
>>
>> But I would love to hear otherwise. So if anyone has an idea, approach
>> or suggestion that will also be compatible with the GFX ring, please
>> let us know about it.
>>
>> * The above guarantees should also be respected by amdkfd workloads
>>
>> Would be good to have for consistency, but not strictly necessary as
>> users running games are not traditionally running HPC workloads in the
>> background.
>>
>> Proposed approach:
>> ------------------
>>
>> Similar to the windows driver, we could expose a high priority compute
>> queue to userspace.
>>
>> Submissions to this compute queue will be scheduled with high priority,
>> and may acquire hardware resources previously in use by other queues.
>>
>> This can be achieved by taking advantage of the 'priority' field in the
>> HQDs and could be programmed by amdgpu or the amdgpu scheduler. The
>> relevant register fields are:
>>
>> * mmCP_HQD_PIPE_PRIORITY
>> * mmCP_HQD_QUEUE_PRIORITY
>>
>> Implementation approach 1 - static partitioning:
>> ------------------------------------------------
>>
>> The amdgpu driver currently controls 8 compute queues from pipe0. We
>> can statically partition these as follows:
>>
>> * 7x regular
>> * 1x high priority
>>
>> The relevant priorities can be set so that submissions to the high
>> priority ring will starve the other compute rings and the GFX ring.
>>
>> The amdgpu scheduler will only place jobs into the high priority rings
>> if the context is marked as high priority. And a corresponding priority
>> should be added to keep track of this information:
>>
>> * AMD_SCHED_PRIORITY_KERNEL
>> * -> AMD_SCHED_PRIORITY_HIGH
>> * AMD_SCHED_PRIORITY_NORMAL
>>
>> The user will request a high priority context by setting an appropriate
>> flag in drm_amdgpu_ctx_in (AMDGPU_CTX_HIGH_PRIORITY or similar):
>> https://github.com/torvalds/linux/blob/master/include/uapi/drm/amdgpu_drm.h#L163
>>
>> The setting is at a per context level so that we can:
>> * Maintain a consistent FIFO ordering of all submissions to a context
>> * Create high priority and non-high priority contexts in the same
>>   process
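>>
>> As a rough sketch of what the uapi usage could look like (illustrative
>> only: AMDGPU_CTX_HIGH_PRIORITY is the flag proposed above and does not
>> exist in amdgpu_drm.h yet, so its name and value are placeholders):
>>
>>     #include <string.h>
>>     #include <stdint.h>
>>     #include <xf86drm.h>
>>     #include <amdgpu_drm.h>
>>
>>     /* Proposed flag from this RFC, not yet in the uapi header. */
>>     #define AMDGPU_CTX_HIGH_PRIORITY (1u << 0)
>>
>>     /* Allocate an amdgpu context and ask for high priority scheduling.
>>      * With the ROOT_ONLY policy discussed above, this would fail with
>>      * -EPERM for unprivileged callers. */
>>     static int alloc_high_priority_ctx(int drm_fd, uint32_t *ctx_id)
>>     {
>>         union drm_amdgpu_ctx args;
>>         int r;
>>
>>         memset(&args, 0, sizeof(args));
>>         args.in.op = AMDGPU_CTX_OP_ALLOC_CTX;
>>         args.in.flags = AMDGPU_CTX_HIGH_PRIORITY;  /* proposed flag */
>>
>>         r = drmCommandWriteRead(drm_fd, DRM_AMDGPU_CTX,
>>                                 &args, sizeof(args));
>>         if (r)
>>             return r;
>>
>>         *ctx_id = args.out.alloc.ctx_id;
>>         return 0;
>>     }
>>
>> libdrm-amdgpu could then wrap this the same way amdgpu_cs_ctx_create()
>> wraps the regular allocation path today.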
>>
>> Implementation approach 2 - dynamic priority programming:
>> ---------------------------------------------------------
>>
>> Similar to the above, but instead of programming the priorities at
>> amdgpu_init() time, the SW scheduler will reprogram the queue
>> priorities dynamically when scheduling a task.
>>
>> This would involve having a hardware specific callback from the
>> scheduler to set the appropriate queue priority:
>>     set_priority(int ring, int index, int priority)
>>
>> During this callback we would have to grab the SRBM mutex to perform
>> the appropriate HW programming, and I'm not really sure if that is
>> something we should be doing from the scheduler.
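>>
>> A minimal sketch of what such a callback might look like for gfx v8
>> (illustrative only: the hook name, its exact signature and the priority
>> encoding written to the registers are placeholders, not a final design):
>>
>>     /* Reprogram the HQD priority of a compute ring.  Intended to be
>>      * called by the SW scheduler right before it hands work to a ring
>>      * that should temporarily run at elevated priority. */
>>     static void gfx_v8_0_ring_set_priority(struct amdgpu_ring *ring,
>>                                            u32 priority)
>>     {
>>         struct amdgpu_device *adev = ring->adev;
>>
>>         mutex_lock(&adev->srbm_mutex);
>>         vi_srbm_select(adev, ring->me, ring->pipe, ring->queue, 0);
>>
>>         /* Placeholder encoding: the valid values for these fields
>>          * would need to be confirmed against the register spec. */
>>         WREG32(mmCP_HQD_PIPE_PRIORITY, priority);
>>         WREG32(mmCP_HQD_QUEUE_PRIORITY, priority);
>>
>>         vi_srbm_select(adev, 0, 0, 0, 0);
>>         mutex_unlock(&adev->srbm_mutex);
>>     }
>>
>> Whether it is acceptable for the scheduler to take the SRBM mutex like
>> this is exactly the open question above.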
>>
>> On the positive side, this approach would allow us to program a range
>> of priorities for jobs instead of a single "high priority" value,
>> achieving something similar to the niceness API available for CPU
>> scheduling.
>>
>> I'm not sure if this flexibility is something that we would need for
>> our use case, but it might be useful in other scenarios (multiple users
>> sharing compute time on a server).
>>
>> This approach would require a new int field in drm_amdgpu_ctx_in, or
>> repurposing of the flags field.
>>
>> Known current obstacles:
>> ------------------------
>>
>> The SQ is currently programmed to disregard the HQD priorities, and
>> instead it picks jobs at random. Settings from the shader itself are
>> also disregarded as this is considered a privileged field.
>>
>> Effectively we can get our compute wavefront launched ASAP, but we
>> might not get the time we need on the SQ.
>>
>> The current programming would have to be changed to allow priority
>> propagation from the HQD into the SQ.
>>
>> Generic approach for all HW IPs:
>> --------------------------------
>>
>> For consistency purposes, the high priority context can be enabled for
>> all HW IPs with support of the SW scheduler. This will function
>> similarly to the current AMD_SCHED_PRIORITY_KERNEL priority, where the
>> job can jump ahead of anything not committed to the HW queue.
>>
>> The benefits of requesting a high priority context for a non-compute
>> queue will be lesser (e.g. up to 10s of wait time if a GFX command is
>> stuck in front of you), but having the API in place will allow us to
>> easily improve the implementation in the future as new features become
>> available in new hardware.
>>
>> Future steps:
>> -------------
>>
>> Once we have an approach settled, I can take care of the
>> implementation.
>>
>> Also, once the interface is mostly decided, we can start thinking about
>> exposing the high priority queue through radv.
>>
>> Request for feedback:
>> ---------------------
>>
>> We aren't married to any of the approaches outlined above. Our goal is
>> to obtain a mechanism that will allow us to complete the reprojection
>> job within a predictable amount of time. So if anyone has any
>> suggestions for improvements or alternative strategies, we are more
>> than happy to hear them.
>>
>> If any of the technical information above is also incorrect, feel free
>> to point out my misunderstandings.
>>
>> Looking forward to hearing from you.
>>
>> Regards,
>> Andres
>>
>> Sincerely yours,
>> Serguei Sagalovitch
>>
>