<html>
  <head>
    <meta content="text/html; charset=utf-8" http-equiv="Content-Type">
  </head>
  <body text="#000000" bgcolor="#FFFFFF">
    <div class="moz-cite-prefix">Hi Marco,<br>
      <br>
      the problem with an CS ioctl flag is that we sometimes don't know
      how much SCLK/MCLK boost is needed, for example when we do post
      processing in the player using OpenGL and UVD decoding with VDPAU.
      In this case VDPAU don't has the slightest idea how high SCLK/MCLK
      must be and so can't give that info to the kernel either.<br>
      <br>
      Regards,<br>
      Christian.<br>
      <br>
      Am 15.08.2014 um 15:21 schrieb Marco Benatto:<br>
    </div>
    <blockquote
cite="mid:CADHkTqVsnoG-99g1ZWSZca_wu9CWvB0YRnBSmps=PHyL-+wWzw@mail.gmail.com"
      type="cite">
      <meta http-equiv="Content-Type" content="text/html; charset=utf-8">
      <div dir="ltr">
        <div>
          <div>Hey all,<br>
            <br>
          </div>
          I also had a talk with Alex yesterday about post-processing
          issues when using dynamic UVD profiles and a chamge on CS
          ioctl<br>
        </div>
        <div>including a flag to let user mode driver tell to the kernel
          which performance requirement it wants for post processing. A
          commom<br>
        </div>
        <div>point for both discussion is to stablish the default values
          for these profiles, but probably this ioctl change would be
          more impacting/complex<br>
        </div>
        <div>to implement than a sysfs entry.<br>
          <br>
        </div>
        <div>If a sysfs entry is anough for now I can handle the code to
          create it and, with your help, the code to setup the UVD
          profile requested through it.<br>
          <br>
        </div>
        <div>Is there any suggestion? <br>
          <br>
        </div>
        <div>Thanks all for your help,<br>
        </div>
      </div>
      <div class="gmail_extra"><br>
        <br>
        <div class="gmail_quote">On Fri, Aug 15, 2014 at 5:48 AM,
          Christian König <span dir="ltr"><<a moz-do-not-send="true"
              href="mailto:christian.koenig@amd.com" target="_blank">christian.koenig@amd.com</a>></span>
          wrote:<br>
          <blockquote class="gmail_quote" style="margin:0 0 0
            .8ex;border-left:1px #ccc solid;padding-left:1ex">Hi guys,<br>
            <br>
            to make a long story short every time I watch a movie my
            laptop start to heat up because we always select the
            standard UVD power profile without actually measuring if
            that is necessary.<br>
            <br>
            Marco came up with a patch that seems to reliable measure
            the fps send down to the kernel and so together with knowing
            the frame size of the video should allow us to select the
            right UVD power profile.<br>
            <br>
            The problem is that Alex (unnoticed by me) completely
            disabled selecting the UVD profiles because of some issues
            with advanced post processing discussed on IRC. The problem
            seems to be that the lower UVD profiles have a to low
            SCLK/MCLK to handle the 3D load that comes with scaling,
            deinterlacing etc...<br>
            <br>
            I unfortunately don't have time for it, cause this only
            affects the hardware generations R600-SI and not the newest
            one CIK. So could you guys stick together and come up with a
            solution? Something like a sysfs entry that let's us select
            the minimum UVD power level allowed?<br>
            <br>
            I think Marco is happy to come up with a patch, we just need
            to know what's really needed and what should be the default
            values. I'm happy to review everything that comes out of it,
            just don't have time to do it myself.<br>
            <br>
            Happy discussion and thanks in advance,<br>
            Christian.<br>
            <br>
            Am 12.08.2014 um 15:05 schrieb Alex Deucher:
            <div class="HOEnZb">
              <div class="h5"><br>
                <blockquote class="gmail_quote" style="margin:0 0 0
                  .8ex;border-left:1px #ccc solid;padding-left:1ex">
                  On Tue, Aug 12, 2014 at 6:00 AM, Christian König<br>
                  <<a moz-do-not-send="true"
                    href="mailto:deathsimple@vodafone.de"
                    target="_blank">deathsimple@vodafone.de</a>>
                  wrote:<br>
                  <blockquote class="gmail_quote" style="margin:0 0 0
                    .8ex;border-left:1px #ccc solid;padding-left:1ex">
                    Am 11.08.2014 um 16:52 schrieb Alex Deucher:<br>
                    <br>
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      On Mon, Aug 11, 2014 at 5:08 AM, Christian König<br>
                      <<a moz-do-not-send="true"
                        href="mailto:deathsimple@vodafone.de"
                        target="_blank">deathsimple@vodafone.de</a>>
                      wrote:<br>
                      <blockquote class="gmail_quote" style="margin:0 0
                        0 .8ex;border-left:1px #ccc
                        solid;padding-left:1ex">
                        Am 07.08.2014 um 21:43 schrieb Alex Deucher:<br>
                        <br>
                        <blockquote class="gmail_quote" style="margin:0
                          0 0 .8ex;border-left:1px #ccc
                          solid;padding-left:1ex">
                          On Thu, Aug 7, 2014 at 11:32 AM, Christian
                          König<br>
                          <<a moz-do-not-send="true"
                            href="mailto:deathsimple@vodafone.de"
                            target="_blank">deathsimple@vodafone.de</a>>
                          wrote:<br>
                          <blockquote class="gmail_quote"
                            style="margin:0 0 0 .8ex;border-left:1px
                            #ccc solid;padding-left:1ex">
                            Am 07.08.2014 um 16:32 schrieb Alex Deucher:<br>
                            <br>
                            <blockquote class="gmail_quote"
                              style="margin:0 0 0 .8ex;border-left:1px
                              #ccc solid;padding-left:1ex">
                              On Thu, Aug 7, 2014 at 7:33 AM, Christian
                              König<br>
                              <<a moz-do-not-send="true"
                                href="mailto:deathsimple@vodafone.de"
                                target="_blank">deathsimple@vodafone.de</a>><br>
                              wrote:<br>
                              <blockquote class="gmail_quote"
                                style="margin:0 0 0 .8ex;border-left:1px
                                #ccc solid;padding-left:1ex">
                                From: Marco A Benatto <<a
                                  moz-do-not-send="true"
                                  href="mailto:marco.antonio.780@gmail.com"
                                  target="_blank">marco.antonio.780@gmail.com</a>><br>
                                <br>
                                Adding a Frames Per Second estimation
                                logic on UVD handles<br>
                                when it has being used. This estimation
                                is per handle basis<br>
                                and will help on DPM profile
                                calculation.<br>
                                <br>
                                v2 (chk): fix timestamp type, move
                                functions around and<br>
                                              cleanup code a bit.<br>
                              </blockquote>
                              Will this really help much?  I thought the
                              problem was mainly due to<br>
                              sclk and mclk for post processing.<br>
                            </blockquote>
                            <br>
                            It should at least handle the UVD side for
                            upclocking when you get a<br>
                            lot<br>
                            of<br>
                            streams / fps. And at on my NI the patch
                            seems to do exactly that.<br>
                            <br>
                            Switching sclk and mclk for post processing
                            is a different task, and I<br>
                            actually have no idea what to do with them.<br>
                          </blockquote>
                          At this point we always choose the plain UVD
                          state anyway so this<br>
                          patch would only take effect if we re-enabled
                          the dynamic UVD state<br>
                          selection.<br>
                        </blockquote>
                        <br>
                        Hui? I thought we already re-enabled the dynamic
                        UVD state selection, but<br>
                        double checking this I found it disabled again.<br>
                        <br>
                        What was the problem with that? Looks like I
                        somehow missed the<br>
                        discussion<br>
                        around it.<br>
                      </blockquote>
                      We did, but after doing so a number of people
                      complained about a<br>
                      regression on IRC because when apps like xmbc
                      enabled post processing,<br>
                      performance went down.<br>
                    </blockquote>
                    <br>
                    That's strange, from my experience the different UVD
                    performance states only<br>
                    affect UVDs dclk/vclk, not sclk/mclk. I need to get
                    the DPM dumps to<br>
                    confirms this.<br>
                    <br>
                  </blockquote>
                  The sclks and mclks are usually different as well,
                  especially on APUs.<br>
                  I can send you some examples.<br>
                  <br>
                  <blockquote class="gmail_quote" style="margin:0 0 0
                    .8ex;border-left:1px #ccc solid;padding-left:1ex">
                    You not off hand remember who complained on IRC?
                    Finding something in the<br>
                    IRC logs is like searching for a needle in a
                    haystack.<br>
                  </blockquote>
                  I don't remember off hand.  I think zgreg was involved
                  in some of the<br>
                  discussions.<br>
                  <br>
                  Alex<br>
                  <br>
                  <blockquote class="gmail_quote" style="margin:0 0 0
                    .8ex;border-left:1px #ccc solid;padding-left:1ex">
                    Thanks,<br>
                    Christian.<br>
                    <br>
                    <br>
                    <blockquote class="gmail_quote" style="margin:0 0 0
                      .8ex;border-left:1px #ccc solid;padding-left:1ex">
                      Alex<br>
                      <br>
                      <br>
                      <blockquote class="gmail_quote" style="margin:0 0
                        0 .8ex;border-left:1px #ccc
                        solid;padding-left:1ex">
                        Christian.<br>
                        <br>
                        <br>
                        <blockquote class="gmail_quote" style="margin:0
                          0 0 .8ex;border-left:1px #ccc
                          solid;padding-left:1ex">
                          For the post processing, we probably need a
                          hint we can<br>
                          pass to the driver in the CS ioctl to denote
                          what state we need.<br>
                          Although if we did that, this could would
                          largely be moot.  That said,<br>
                          newer asics support dynamic UVD clocks so we
                          really only need<br>
                          something like that for older asics and I
                          guess VCE.<br>
                          <br>
                          Alex<br>
                          <br>
                          <blockquote class="gmail_quote"
                            style="margin:0 0 0 .8ex;border-left:1px
                            #ccc solid;padding-left:1ex">
                            Christian.<br>
                            <br>
                            <br>
                            <blockquote class="gmail_quote"
                              style="margin:0 0 0 .8ex;border-left:1px
                              #ccc solid;padding-left:1ex">
                              Alex<br>
                              <br>
                              <blockquote class="gmail_quote"
                                style="margin:0 0 0 .8ex;border-left:1px
                                #ccc solid;padding-left:1ex">
                                Signed-off-by: Marco A Benatto <<a
                                  moz-do-not-send="true"
                                  href="mailto:marco.antonio.780@gmail.com"
                                  target="_blank">marco.antonio.780@gmail.com</a>><br>
                                Signed-off-by: Christian König <<a
                                  moz-do-not-send="true"
                                  href="mailto:christian.koenig@amd.com"
                                  target="_blank">christian.koenig@amd.com</a>><br>
                                ---<br>
                                     drivers/gpu/drm/radeon/radeon.h   
                                 | 10 ++++++<br>
                                     drivers/gpu/drm/radeon/radeon_uvd.c
                                | 64<br>
                                +++++++++++++++++++++++++++++++++----<br>
                                     2 files changed, 68 insertions(+),
                                6 deletions(-)<br>
                                <br>
                                diff --git a/drivers/gpu/drm/radeon/radeon.h<br>
                                b/drivers/gpu/drm/radeon/radeon.h<br>
                                index 9e1732e..e92f6cb 100644<br>
                                --- a/drivers/gpu/drm/radeon/radeon.h<br>
                                +++ b/drivers/gpu/drm/radeon/radeon.h<br>
                                @@ -1617,6 +1617,15 @@ int
                                radeon_pm_get_type_index(struct<br>
                                radeon_device<br>
                                *rdev,<br>
                                     #define RADEON_UVD_STACK_SIZE 
                                (1024*1024)<br>
                                     #define RADEON_UVD_HEAP_SIZE 
                                 (1024*1024)<br>
                                <br>
                                +#define RADEON_UVD_FPS_EVENTS_MAX 8<br>
                                +#define RADEON_UVD_DEFAULT_FPS 60<br>
                                +<br>
                                +struct radeon_uvd_fps {<br>
                                +       uint64_t        timestamp;<br>
                                +       uint8_t         event_index;<br>
                                +       uint8_t       
                                 events[RADEON_UVD_FPS_EVENTS_MAX];<br>
                                +};<br>
                                +<br>
                                     struct radeon_uvd {<br>
                                            struct radeon_bo       
                                *vcpu_bo;<br>
                                            void                   
                                *cpu_addr;<br>
                                @@ -1626,6 +1635,7 @@ struct radeon_uvd
                                {<br>
                                            struct drm_file       
                                 *filp[RADEON_MAX_UVD_HANDLES];<br>
                                            unsigned               
                                img_size[RADEON_MAX_UVD_HANDLES];<br>
                                            struct delayed_work   
                                 idle_work;<br>
                                +       struct radeon_uvd_fps 
                                 fps_info[RADEON_MAX_UVD_HANDLES];<br>
                                     };<br>
                                <br>
                                     int radeon_uvd_init(struct
                                radeon_device *rdev);<br>
                                diff --git a/drivers/gpu/drm/radeon/radeon_uvd.c<br>
                                b/drivers/gpu/drm/radeon/radeon_uvd.c<br>
                                index 6bf55ec..ef5667a 100644<br>
                                --- a/drivers/gpu/drm/radeon/radeon_uvd.c<br>
                                +++ b/drivers/gpu/drm/radeon/radeon_uvd.c<br>
                                @@ -237,6 +237,51 @@ void
                                radeon_uvd_force_into_uvd_segment(struct<br>
                                radeon_bo *rbo)<br>
                                            rbo->placement.lpfn =
                                (256 * 1024 * 1024) >> PAGE_SHIFT;<br>
                                     }<br>
                                <br>
                                +static void
                                radeon_uvd_fps_clear_events(struct
                                radeon_device *rdev,<br>
                                int<br>
                                idx)<br>
                                +{<br>
                                +       struct radeon_uvd_fps *fps =
                                &rdev->uvd.fps_info[idx];<br>
                                +       unsigned i;<br>
                                +<br>
                                +       fps->timestamp = jiffies_64;<br>
                                +       fps->event_index = 0;<br>
                                +       for (i = 0; i <
                                RADEON_UVD_FPS_EVENTS_MAX; i++)<br>
                                +               fps->events[i] = 0;<br>
                                +}<br>
                                +<br>
                                +static void radeon_uvd_fps_note_event(struct
                                radeon_device *rdev,<br>
                                int<br>
                                idx)<br>
                                +{<br>
                                +       struct radeon_uvd_fps *fps =
                                &rdev->uvd.fps_info[idx];<br>
                                +       uint64_t timestamp = jiffies_64;<br>
                                +       unsigned rate = 0;<br>
                                +<br>
                                +       uint8_t index =
                                fps->event_index++;<br>
                                +       fps->event_index %=
                                RADEON_UVD_FPS_EVENTS_MAX;<br>
                                +<br>
                                +       rate = div64_u64(HZ,
                                max(timestamp - fps->timestamp,
                                1ULL));<br>
                                +<br>
                                +       fps->timestamp = timestamp;<br>
                                +       fps->events[index] =
                                min(rate, 120u);<br>
                                +}<br>
                                +<br>
                                +static unsigned
                                radeon_uvd_estimate_fps(struct
                                radeon_device *rdev,<br>
                                int<br>
                                idx)<br>
                                +{<br>
                                +       struct radeon_uvd_fps *fps =
                                &rdev->uvd.fps_info[idx];<br>
                                +       unsigned i, valid = 0, count =
                                0;<br>
                                +<br>
                                +       for (i = 0; i <
                                RADEON_UVD_FPS_EVENTS_MAX; i++) {<br>
                                +               /* We should ignore zero
                                values */<br>
                                +               if (fps->events[i] !=
                                0) {<br>
                                +                       count +=
                                fps->events[i];<br>
                                +                       valid++;<br>
                                +               }<br>
                                +       }<br>
                                +<br>
                                +       if (valid > 0)<br>
                                +               return count / valid;<br>
                                +       else<br>
                                +               return
                                RADEON_UVD_DEFAULT_FPS;<br>
                                +}<br>
                                +<br>
                                     void radeon_uvd_free_handles(struct
                                radeon_device *rdev, struct<br>
                                drm_file *filp)<br>
                                     {<br>
                                            int i, r;<br>
                                @@ -419,8 +464,10 @@ static int
                                radeon_uvd_cs_msg(struct<br>
                                radeon_cs_parser<br>
                                *p, struct radeon_bo *bo,<br>
                                <br>
                                            /* create or decode,
                                validate the handle */<br>
                                            for (i = 0; i <
                                RADEON_MAX_UVD_HANDLES; ++i) {<br>
                                -               if
                                (atomic_read(&p->rdev->uvd.handles[i])
                                == handle)<br>
                                +               if
                                (atomic_read(&p->rdev->uvd.handles[i])
                                == handle)<br>
                                {<br>
                                +                     
                                 radeon_uvd_fps_note_event(p->rdev,
                                i);<br>
                                                            return 0;<br>
                                +               }<br>
                                            }<br>
                                <br>
                                            /* handle not found try to
                                alloc a new one */<br>
                                @@ -428,6 +475,7 @@ static int
                                radeon_uvd_cs_msg(struct<br>
                                radeon_cs_parser<br>
                                *p, struct radeon_bo *bo,<br>
                                                    if
                                (!atomic_cmpxchg(&p->rdev->uvd.handles[i],
                                0,<br>
                                handle)) {<br>
                                                           
                                p->rdev->uvd.filp[i] = p->filp;<br>
                                                           
                                p->rdev->uvd.img_size[i] =
                                img_size;<br>
                                +                     
                                 radeon_uvd_fps_clear_events(p->rdev,
                                i);<br>
                                                            return 0;<br>
                                                    }<br>
                                            }<br>
                                @@ -763,7 +811,7 @@ int
                                radeon_uvd_get_destroy_msg(struct<br>
                                radeon_device<br>
                                *rdev, int ring,<br>
                                     static void
                                radeon_uvd_count_handles(struct
                                radeon_device *rdev,<br>
                                                                       
                                 unsigned *sd, unsigned *hd)<br>
                                     {<br>
                                -       unsigned i;<br>
                                +       unsigned i, fps_rate = 0;<br>
                                <br>
                                            *sd = 0;<br>
                                            *hd = 0;<br>
                                @@ -772,10 +820,13 @@ static void
                                radeon_uvd_count_handles(struct<br>
                                radeon_device *rdev,<br>
                                                    if
                                (!atomic_read(&rdev->uvd.handles[i]))<br>
                                                            continue;<br>
                                <br>
                                -               if
                                (rdev->uvd.img_size[i] >= 720*576)<br>
                                -                       ++(*hd);<br>
                                -               else<br>
                                -                       ++(*sd);<br>
                                +               fps_rate =
                                radeon_uvd_estimate_fps(rdev, i);<br>
                                +<br>
                                +               if
                                (rdev->uvd.img_size[i] >= 720*576)
                                {<br>
                                +                       (*hd) +=
                                fps_rate > 30 ? 1 : 2;<br>
                                +               } else {<br>
                                +                       (*sd) +=
                                fps_rate > 30 ? 1 : 2;<br>
                                +               }<br>
                                            }<br>
                                     }<br>
                                <br>
                                @@ -805,6 +856,7 @@ void
                                radeon_uvd_note_usage(struct
                                radeon_device<br>
                                *rdev)<br>
                                            set_clocks &=
                                schedule_delayed_work(&rdev->uvd.idle_work,<br>
                                <br>
                                msecs_to_jiffies(UVD_IDLE_TIMEOUT_MS));<br>
                                <br>
                                +<br>
                                            if ((rdev->pm.pm_method
                                == PM_METHOD_DPM) &&<br>
                                rdev->pm.dpm_enabled) {<br>
                                                    unsigned hd = 0, sd
                                = 0;<br>
                                                   
                                radeon_uvd_count_handles(rdev, &sd,
                                &hd);<br>
                                --<br>
                                1.9.1<br>
                                <br>
                              </blockquote>
                            </blockquote>
                          </blockquote>
                        </blockquote>
                      </blockquote>
                    </blockquote>
                  </blockquote>
                </blockquote>
                <br>
              </div>
            </div>
          </blockquote>
        </div>
        <br>
        <br clear="all">
        <br>
        -- <br>
        Marco Antonio Benatto<br>
        Linux user ID:<font> #506236</font>
      </div>
    </blockquote>
    <br>
  </body>
</html>