[PATCH] drm/radeon: Adding UVD handle basis fps estimation v2

Grigori Goronzy greg at chown.ath.cx
Fri Aug 15 08:20:16 PDT 2014


On 15.08.2014 16:11, Christian König wrote:
> Hi Marco,
> 
> the problem with an CS ioctl flag is that we sometimes don't know how
> much SCLK/MCLK boost is needed, for example when we do post processing
> in the player using OpenGL and UVD decoding with VDPAU. In this case
> VDPAU don't has the slightest idea how high SCLK/MCLK must be and so
> can't give that info to the kernel either.
>

Maybe it's an acceptable workaround to simply disable dynamic UVD state
selection in case the UVD states only have a single power level. That
will avoid the performance issues on affected systems, while still
allowing dynamic UVD states on systems that have a saner DPM table
setup. I think it is mosly older systems that suffer from this.

Best regards
Grigori

> Regards,
> Christian.
> 
> Am 15.08.2014 um 15:21 schrieb Marco Benatto:
>> Hey all,
>>
>> I also had a talk with Alex yesterday about post-processing issues
>> when using dynamic UVD profiles and a chamge on CS ioctl
>> including a flag to let user mode driver tell to the kernel which
>> performance requirement it wants for post processing. A commom
>> point for both discussion is to stablish the default values for these
>> profiles, but probably this ioctl change would be more impacting/complex
>> to implement than a sysfs entry.
>>
>> If a sysfs entry is anough for now I can handle the code to create it
>> and, with your help, the code to setup the UVD profile requested
>> through it.
>>
>> Is there any suggestion?
>>
>> Thanks all for your help,
>>
>>
>> On Fri, Aug 15, 2014 at 5:48 AM, Christian König
>> <christian.koenig at amd.com <mailto:christian.koenig at amd.com>> wrote:
>>
>>     Hi guys,
>>
>>     to make a long story short every time I watch a movie my laptop
>>     start to heat up because we always select the standard UVD power
>>     profile without actually measuring if that is necessary.
>>
>>     Marco came up with a patch that seems to reliable measure the fps
>>     send down to the kernel and so together with knowing the frame
>>     size of the video should allow us to select the right UVD power
>>     profile.
>>
>>     The problem is that Alex (unnoticed by me) completely disabled
>>     selecting the UVD profiles because of some issues with advanced
>>     post processing discussed on IRC. The problem seems to be that the
>>     lower UVD profiles have a to low SCLK/MCLK to handle the 3D load
>>     that comes with scaling, deinterlacing etc...
>>
>>     I unfortunately don't have time for it, cause this only affects
>>     the hardware generations R600-SI and not the newest one CIK. So
>>     could you guys stick together and come up with a solution?
>>     Something like a sysfs entry that let's us select the minimum UVD
>>     power level allowed?
>>
>>     I think Marco is happy to come up with a patch, we just need to
>>     know what's really needed and what should be the default values.
>>     I'm happy to review everything that comes out of it, just don't
>>     have time to do it myself.
>>
>>     Happy discussion and thanks in advance,
>>     Christian.
>>
>>     Am 12.08.2014 um 15:05 schrieb Alex Deucher:
>>
>>         On Tue, Aug 12, 2014 at 6:00 AM, Christian König
>>         <deathsimple at vodafone.de <mailto:deathsimple at vodafone.de>> wrote:
>>
>>             Am 11.08.2014 um 16:52 schrieb Alex Deucher:
>>
>>                 On Mon, Aug 11, 2014 at 5:08 AM, Christian König
>>                 <deathsimple at vodafone.de
>>                 <mailto:deathsimple at vodafone.de>> wrote:
>>
>>                     Am 07.08.2014 um 21:43 schrieb Alex Deucher:
>>
>>                         On Thu, Aug 7, 2014 at 11:32 AM, Christian König
>>                         <deathsimple at vodafone.de
>>                         <mailto:deathsimple at vodafone.de>> wrote:
>>
>>                             Am 07.08.2014 um 16:32 schrieb Alex Deucher:
>>
>>                                 On Thu, Aug 7, 2014 at 7:33 AM,
>>                                 Christian König
>>                                 <deathsimple at vodafone.de
>>                                 <mailto:deathsimple at vodafone.de>>
>>                                 wrote:
>>
>>                                     From: Marco A Benatto
>>                                     <marco.antonio.780 at gmail.com
>>                                     <mailto:marco.antonio.780 at gmail.com>>
>>
>>                                     Adding a Frames Per Second
>>                                     estimation logic on UVD handles
>>                                     when it has being used. This
>>                                     estimation is per handle basis
>>                                     and will help on DPM profile
>>                                     calculation.
>>
>>                                     v2 (chk): fix timestamp type, move
>>                                     functions around and
>>                                                   cleanup code a bit.
>>
>>                                 Will this really help much?  I thought
>>                                 the problem was mainly due to
>>                                 sclk and mclk for post processing.
>>
>>
>>                             It should at least handle the UVD side for
>>                             upclocking when you get a
>>                             lot
>>                             of
>>                             streams / fps. And at on my NI the patch
>>                             seems to do exactly that.
>>
>>                             Switching sclk and mclk for post
>>                             processing is a different task, and I
>>                             actually have no idea what to do with them.
>>
>>                         At this point we always choose the plain UVD
>>                         state anyway so this
>>                         patch would only take effect if we re-enabled
>>                         the dynamic UVD state
>>                         selection.
>>
>>
>>                     Hui? I thought we already re-enabled the dynamic
>>                     UVD state selection, but
>>                     double checking this I found it disabled again.
>>
>>                     What was the problem with that? Looks like I
>>                     somehow missed the
>>                     discussion
>>                     around it.
>>
>>                 We did, but after doing so a number of people
>>                 complained about a
>>                 regression on IRC because when apps like xmbc enabled
>>                 post processing,
>>                 performance went down.
>>
>>
>>             That's strange, from my experience the different UVD
>>             performance states only
>>             affect UVDs dclk/vclk, not sclk/mclk. I need to get the
>>             DPM dumps to
>>             confirms this.
>>
>>         The sclks and mclks are usually different as well, especially
>>         on APUs.
>>         I can send you some examples.
>>
>>             You not off hand remember who complained on IRC? Finding
>>             something in the
>>             IRC logs is like searching for a needle in a haystack.
>>
>>         I don't remember off hand.  I think zgreg was involved in some
>>         of the
>>         discussions.
>>
>>         Alex
>>
>>             Thanks,
>>             Christian.
>>
>>
>>                 Alex
>>
>>
>>                     Christian.
>>
>>
>>                         For the post processing, we probably need a
>>                         hint we can
>>                         pass to the driver in the CS ioctl to denote
>>                         what state we need.
>>                         Although if we did that, this could would
>>                         largely be moot.  That said,
>>                         newer asics support dynamic UVD clocks so we
>>                         really only need
>>                         something like that for older asics and I
>>                         guess VCE.
>>
>>                         Alex
>>
>>                             Christian.
>>
>>
>>                                 Alex
>>
>>                                     Signed-off-by: Marco A Benatto
>>                                     <marco.antonio.780 at gmail.com
>>                                     <mailto:marco.antonio.780 at gmail.com>>
>>                                     Signed-off-by: Christian König
>>                                     <christian.koenig at amd.com
>>                                     <mailto:christian.koenig at amd.com>>
>>                                     ---
>>                                        
>>                                      drivers/gpu/drm/radeon/radeon.h 
>>                                        | 10 ++++++
>>                                        
>>                                      drivers/gpu/drm/radeon/radeon_uvd.c
>>                                     | 64
>>                                     +++++++++++++++++++++++++++++++++----
>>                                          2 files changed, 68
>>                                     insertions(+), 6 deletions(-)
>>
>>                                     diff --git
>>                                     a/drivers/gpu/drm/radeon/radeon.h
>>                                     b/drivers/gpu/drm/radeon/radeon.h
>>                                     index 9e1732e..e92f6cb 100644
>>                                     --- a/drivers/gpu/drm/radeon/radeon.h
>>                                     +++ b/drivers/gpu/drm/radeon/radeon.h
>>                                     @@ -1617,6 +1617,15 @@ int
>>                                     radeon_pm_get_type_index(struct
>>                                     radeon_device
>>                                     *rdev,
>>                                          #define
>>                                     RADEON_UVD_STACK_SIZE  (1024*1024)
>>                                          #define RADEON_UVD_HEAP_SIZE 
>>                                      (1024*1024)
>>
>>                                     +#define RADEON_UVD_FPS_EVENTS_MAX 8
>>                                     +#define RADEON_UVD_DEFAULT_FPS 60
>>                                     +
>>                                     +struct radeon_uvd_fps {
>>                                     +       uint64_t        timestamp;
>>                                     +       uint8_t         event_index;
>>                                     +       uint8_t       
>>                                      events[RADEON_UVD_FPS_EVENTS_MAX];
>>                                     +};
>>                                     +
>>                                          struct radeon_uvd {
>>                                                 struct radeon_bo     
>>                                       *vcpu_bo;
>>                                                 void                 
>>                                       *cpu_addr;
>>                                     @@ -1626,6 +1635,7 @@ struct
>>                                     radeon_uvd {
>>                                                 struct drm_file       
>>                                      *filp[RADEON_MAX_UVD_HANDLES];
>>                                                 unsigned             
>>                                       img_size[RADEON_MAX_UVD_HANDLES];
>>                                                 struct delayed_work   
>>                                      idle_work;
>>                                     +       struct radeon_uvd_fps 
>>                                      fps_info[RADEON_MAX_UVD_HANDLES];
>>                                          };
>>
>>                                          int radeon_uvd_init(struct
>>                                     radeon_device *rdev);
>>                                     diff --git
>>                                     a/drivers/gpu/drm/radeon/radeon_uvd.c
>>                                     b/drivers/gpu/drm/radeon/radeon_uvd.c
>>                                     index 6bf55ec..ef5667a 100644
>>                                     ---
>>                                     a/drivers/gpu/drm/radeon/radeon_uvd.c
>>                                     +++
>>                                     b/drivers/gpu/drm/radeon/radeon_uvd.c
>>                                     @@ -237,6 +237,51 @@ void
>>                                     radeon_uvd_force_into_uvd_segment(struct
>>                                     radeon_bo *rbo)
>>                                                 rbo->placement.lpfn =
>>                                     (256 * 1024 * 1024) >> PAGE_SHIFT;
>>                                          }
>>
>>                                     +static void
>>                                     radeon_uvd_fps_clear_events(struct
>>                                     radeon_device *rdev,
>>                                     int
>>                                     idx)
>>                                     +{
>>                                     +       struct radeon_uvd_fps *fps
>>                                     = &rdev->uvd.fps_info[idx];
>>                                     +       unsigned i;
>>                                     +
>>                                     +       fps->timestamp = jiffies_64;
>>                                     +       fps->event_index = 0;
>>                                     +       for (i = 0; i <
>>                                     RADEON_UVD_FPS_EVENTS_MAX; i++)
>>                                     +               fps->events[i] = 0;
>>                                     +}
>>                                     +
>>                                     +static void
>>                                     radeon_uvd_fps_note_event(struct
>>                                     radeon_device *rdev,
>>                                     int
>>                                     idx)
>>                                     +{
>>                                     +       struct radeon_uvd_fps *fps
>>                                     = &rdev->uvd.fps_info[idx];
>>                                     +       uint64_t timestamp =
>>                                     jiffies_64;
>>                                     +       unsigned rate = 0;
>>                                     +
>>                                     +       uint8_t index =
>>                                     fps->event_index++;
>>                                     +       fps->event_index %=
>>                                     RADEON_UVD_FPS_EVENTS_MAX;
>>                                     +
>>                                     +       rate = div64_u64(HZ,
>>                                     max(timestamp - fps->timestamp,
>>                                     1ULL));
>>                                     +
>>                                     +       fps->timestamp = timestamp;
>>                                     +       fps->events[index] =
>>                                     min(rate, 120u);
>>                                     +}
>>                                     +
>>                                     +static unsigned
>>                                     radeon_uvd_estimate_fps(struct
>>                                     radeon_device *rdev,
>>                                     int
>>                                     idx)
>>                                     +{
>>                                     +       struct radeon_uvd_fps *fps
>>                                     = &rdev->uvd.fps_info[idx];
>>                                     +       unsigned i, valid = 0,
>>                                     count = 0;
>>                                     +
>>                                     +       for (i = 0; i <
>>                                     RADEON_UVD_FPS_EVENTS_MAX; i++) {
>>                                     +               /* We should
>>                                     ignore zero values */
>>                                     +               if (fps->events[i]
>>                                     != 0) {
>>                                     +                       count +=
>>                                     fps->events[i];
>>                                     +                       valid++;
>>                                     +               }
>>                                     +       }
>>                                     +
>>                                     +       if (valid > 0)
>>                                     +               return count / valid;
>>                                     +       else
>>                                     +               return
>>                                     RADEON_UVD_DEFAULT_FPS;
>>                                     +}
>>                                     +
>>                                          void
>>                                     radeon_uvd_free_handles(struct
>>                                     radeon_device *rdev, struct
>>                                     drm_file *filp)
>>                                          {
>>                                                 int i, r;
>>                                     @@ -419,8 +464,10 @@ static int
>>                                     radeon_uvd_cs_msg(struct
>>                                     radeon_cs_parser
>>                                     *p, struct radeon_bo *bo,
>>
>>                                                 /* create or decode,
>>                                     validate the handle */
>>                                                 for (i = 0; i <
>>                                     RADEON_MAX_UVD_HANDLES; ++i) {
>>                                     -               if
>>                                     (atomic_read(&p->rdev->uvd.handles[i])
>>                                     == handle)
>>                                     +               if
>>                                     (atomic_read(&p->rdev->uvd.handles[i])
>>                                     == handle)
>>                                     {
>>                                     +                     
>>                                      radeon_uvd_fps_note_event(p->rdev, i);
>>                                                                 return 0;
>>                                     +               }
>>                                                 }
>>
>>                                                 /* handle not found
>>                                     try to alloc a new one */
>>                                     @@ -428,6 +475,7 @@ static int
>>                                     radeon_uvd_cs_msg(struct
>>                                     radeon_cs_parser
>>                                     *p, struct radeon_bo *bo,
>>                                                         if
>>                                     (!atomic_cmpxchg(&p->rdev->uvd.handles[i],
>>                                     0,
>>                                     handle)) {
>>                                                                
>>                                     p->rdev->uvd.filp[i] = p->filp;
>>                                                                
>>                                     p->rdev->uvd.img_size[i] = img_size;
>>                                     +                     
>>                                      radeon_uvd_fps_clear_events(p->rdev,
>>                                     i);
>>                                                                 return 0;
>>                                                         }
>>                                                 }
>>                                     @@ -763,7 +811,7 @@ int
>>                                     radeon_uvd_get_destroy_msg(struct
>>                                     radeon_device
>>                                     *rdev, int ring,
>>                                          static void
>>                                     radeon_uvd_count_handles(struct
>>                                     radeon_device *rdev,
>>                                                                      
>>                                            unsigned *sd, unsigned *hd)
>>                                          {
>>                                     -       unsigned i;
>>                                     +       unsigned i, fps_rate = 0;
>>
>>                                                 *sd = 0;
>>                                                 *hd = 0;
>>                                     @@ -772,10 +820,13 @@ static void
>>                                     radeon_uvd_count_handles(struct
>>                                     radeon_device *rdev,
>>                                                         if
>>                                     (!atomic_read(&rdev->uvd.handles[i]))
>>                                                                 continue;
>>
>>                                     -               if
>>                                     (rdev->uvd.img_size[i] >= 720*576)
>>                                     -                       ++(*hd);
>>                                     -               else
>>                                     -                       ++(*sd);
>>                                     +               fps_rate =
>>                                     radeon_uvd_estimate_fps(rdev, i);
>>                                     +
>>                                     +               if
>>                                     (rdev->uvd.img_size[i] >= 720*576) {
>>                                     +                       (*hd) +=
>>                                     fps_rate > 30 ? 1 : 2;
>>                                     +               } else {
>>                                     +                       (*sd) +=
>>                                     fps_rate > 30 ? 1 : 2;
>>                                     +               }
>>                                                 }
>>                                          }
>>
>>                                     @@ -805,6 +856,7 @@ void
>>                                     radeon_uvd_note_usage(struct
>>                                     radeon_device
>>                                     *rdev)
>>                                                 set_clocks &=
>>                                     schedule_delayed_work(&rdev->uvd.idle_work,
>>
>>                                     msecs_to_jiffies(UVD_IDLE_TIMEOUT_MS));
>>
>>                                     +
>>                                                 if
>>                                     ((rdev->pm.pm_method ==
>>                                     PM_METHOD_DPM) &&
>>                                     rdev->pm.dpm_enabled) {
>>                                                         unsigned hd =
>>                                     0, sd = 0;
>>                                                        
>>                                     radeon_uvd_count_handles(rdev,
>>                                     &sd, &hd);
>>                                     --
>>                                     1.9.1
>>
>>
>>
>>
>>
>> -- 
>> Marco Antonio Benatto
>> Linux user ID:#506236
> 


-------------- next part --------------
A non-text attachment was scrubbed...
Name: signature.asc
Type: application/pgp-signature
Size: 819 bytes
Desc: OpenPGP digital signature
URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20140815/e12cfa06/attachment.sig>


More information about the dri-devel mailing list