<div dir="ltr"><div><div><div><div><div>Looking at radeon_uvd source code and AFAIK it's currently disable<br></div>since commit <a href="http://cgit.freedesktop.org/%7Eagd5f/linux/commit/drivers/gpu/drm/radeon/radeon_uvd.c?h=drm-next-3.18-wip&id=dca5086a90c9ec64f4e0de801a659508202b0640">dca5086a90c9ec64f4e0de801a659508202b0640</a><br>
</div>Please, correct me if I'm wrong.<br><br></div>Maybe we can check for which system we're running and take a decision<br></div>in what we use (if sysfs entry or UVD streams) to determine which UVD state can be set, like Grigori suggested.<br>
<br></div><div>What do you think?<br></div><div><br><br></div> </div><div class="gmail_extra"><br><br><div class="gmail_quote">On Fri, Aug 15, 2014 at 12:32 PM, Grigori Goronzy <span dir="ltr"><<a href="mailto:greg@chown.ath.cx" target="_blank">greg@chown.ath.cx</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class="">On 15.08.2014 17:26, Alex Deucher wrote:<br>
> On Fri, Aug 15, 2014 at 11:20 AM, Grigori Goronzy <<a href="mailto:greg@chown.ath.cx">greg@chown.ath.cx</a>> wrote:<br>
>> On 15.08.2014 16:11, Christian König wrote:<br>
>>> Hi Marco,<br>
>>><br>
>>> the problem with an CS ioctl flag is that we sometimes don't know how<br>
>>> much SCLK/MCLK boost is needed, for example when we do post processing<br>
>>> in the player using OpenGL and UVD decoding with VDPAU. In this case<br>
>>> VDPAU don't has the slightest idea how high SCLK/MCLK must be and so<br>
>>> can't give that info to the kernel either.<br>
>>><br>
>><br>
>> Maybe it's an acceptable workaround to simply disable dynamic UVD state<br>
>> selection in case the UVD states only have a single power level. That<br>
>> will avoid the performance issues on affected systems, while still<br>
>> allowing dynamic UVD states on systems that have a saner DPM table<br>
>> setup. I think it is mosly older systems that suffer from this.<br>
>><br>
><br>
> That is exactly what we do now.<br>
><br>
<br>
</div>Is it? In 3.17-wip, dynamic UVD state selection (according to active<br>
streams) is still completely disabled. It will always use the generic<br>
UVD state. In fact wasn't it reverted again because of the performance<br>
issues on some systems?<br>
<span class="HOEnZb"><font color="#888888"><br>
Grigori<br>
</font></span><div class="HOEnZb"><div class="h5"><br>
> Alex<br>
><br>
><br>
>> Best regards<br>
>> Grigori<br>
>><br>
>>> Regards,<br>
>>> Christian.<br>
>>><br>
>>> Am 15.08.2014 um 15:21 schrieb Marco Benatto:<br>
>>>> Hey all,<br>
>>>><br>
>>>> I also had a talk with Alex yesterday about post-processing issues<br>
>>>> when using dynamic UVD profiles and a chamge on CS ioctl<br>
>>>> including a flag to let user mode driver tell to the kernel which<br>
>>>> performance requirement it wants for post processing. A commom<br>
>>>> point for both discussion is to stablish the default values for these<br>
>>>> profiles, but probably this ioctl change would be more impacting/complex<br>
>>>> to implement than a sysfs entry.<br>
>>>><br>
>>>> If a sysfs entry is anough for now I can handle the code to create it<br>
>>>> and, with your help, the code to setup the UVD profile requested<br>
>>>> through it.<br>
>>>><br>
>>>> Is there any suggestion?<br>
>>>><br>
>>>> Thanks all for your help,<br>
>>>><br>
>>>><br>
>>>> On Fri, Aug 15, 2014 at 5:48 AM, Christian König<br>
>>>> <<a href="mailto:christian.koenig@amd.com">christian.koenig@amd.com</a> <mailto:<a href="mailto:christian.koenig@amd.com">christian.koenig@amd.com</a>>> wrote:<br>
>>>><br>
>>>> Hi guys,<br>
>>>><br>
>>>> to make a long story short every time I watch a movie my laptop<br>
>>>> start to heat up because we always select the standard UVD power<br>
>>>> profile without actually measuring if that is necessary.<br>
>>>><br>
>>>> Marco came up with a patch that seems to reliable measure the fps<br>
>>>> send down to the kernel and so together with knowing the frame<br>
>>>> size of the video should allow us to select the right UVD power<br>
>>>> profile.<br>
>>>><br>
>>>> The problem is that Alex (unnoticed by me) completely disabled<br>
>>>> selecting the UVD profiles because of some issues with advanced<br>
>>>> post processing discussed on IRC. The problem seems to be that the<br>
>>>> lower UVD profiles have a to low SCLK/MCLK to handle the 3D load<br>
>>>> that comes with scaling, deinterlacing etc...<br>
>>>><br>
>>>> I unfortunately don't have time for it, cause this only affects<br>
>>>> the hardware generations R600-SI and not the newest one CIK. So<br>
>>>> could you guys stick together and come up with a solution?<br>
>>>> Something like a sysfs entry that let's us select the minimum UVD<br>
>>>> power level allowed?<br>
>>>><br>
>>>> I think Marco is happy to come up with a patch, we just need to<br>
>>>> know what's really needed and what should be the default values.<br>
>>>> I'm happy to review everything that comes out of it, just don't<br>
>>>> have time to do it myself.<br>
>>>><br>
>>>> Happy discussion and thanks in advance,<br>
>>>> Christian.<br>
>>>><br>
>>>> Am 12.08.2014 um 15:05 schrieb Alex Deucher:<br>
>>>><br>
>>>> On Tue, Aug 12, 2014 at 6:00 AM, Christian König<br>
>>>> <<a href="mailto:deathsimple@vodafone.de">deathsimple@vodafone.de</a> <mailto:<a href="mailto:deathsimple@vodafone.de">deathsimple@vodafone.de</a>>> wrote:<br>
>>>><br>
>>>> Am 11.08.2014 um 16:52 schrieb Alex Deucher:<br>
>>>><br>
>>>> On Mon, Aug 11, 2014 at 5:08 AM, Christian König<br>
>>>> <<a href="mailto:deathsimple@vodafone.de">deathsimple@vodafone.de</a><br>
>>>> <mailto:<a href="mailto:deathsimple@vodafone.de">deathsimple@vodafone.de</a>>> wrote:<br>
>>>><br>
>>>> Am 07.08.2014 um 21:43 schrieb Alex Deucher:<br>
>>>><br>
>>>> On Thu, Aug 7, 2014 at 11:32 AM, Christian König<br>
>>>> <<a href="mailto:deathsimple@vodafone.de">deathsimple@vodafone.de</a><br>
>>>> <mailto:<a href="mailto:deathsimple@vodafone.de">deathsimple@vodafone.de</a>>> wrote:<br>
>>>><br>
>>>> Am 07.08.2014 um 16:32 schrieb Alex Deucher:<br>
>>>><br>
>>>> On Thu, Aug 7, 2014 at 7:33 AM,<br>
>>>> Christian König<br>
>>>> <<a href="mailto:deathsimple@vodafone.de">deathsimple@vodafone.de</a><br>
>>>> <mailto:<a href="mailto:deathsimple@vodafone.de">deathsimple@vodafone.de</a>>><br>
>>>> wrote:<br>
>>>><br>
>>>> From: Marco A Benatto<br>
>>>> <<a href="mailto:marco.antonio.780@gmail.com">marco.antonio.780@gmail.com</a><br>
>>>> <mailto:<a href="mailto:marco.antonio.780@gmail.com">marco.antonio.780@gmail.com</a>>><br>
>>>><br>
>>>> Adding a Frames Per Second<br>
>>>> estimation logic on UVD handles<br>
>>>> when it has being used. This<br>
>>>> estimation is per handle basis<br>
>>>> and will help on DPM profile<br>
>>>> calculation.<br>
>>>><br>
>>>> v2 (chk): fix timestamp type, move<br>
>>>> functions around and<br>
>>>> cleanup code a bit.<br>
>>>><br>
>>>> Will this really help much? I thought<br>
>>>> the problem was mainly due to<br>
>>>> sclk and mclk for post processing.<br>
>>>><br>
>>>><br>
>>>> It should at least handle the UVD side for<br>
>>>> upclocking when you get a<br>
>>>> lot<br>
>>>> of<br>
>>>> streams / fps. And at on my NI the patch<br>
>>>> seems to do exactly that.<br>
>>>><br>
>>>> Switching sclk and mclk for post<br>
>>>> processing is a different task, and I<br>
>>>> actually have no idea what to do with them.<br>
>>>><br>
>>>> At this point we always choose the plain UVD<br>
>>>> state anyway so this<br>
>>>> patch would only take effect if we re-enabled<br>
>>>> the dynamic UVD state<br>
>>>> selection.<br>
>>>><br>
>>>><br>
>>>> Hui? I thought we already re-enabled the dynamic<br>
>>>> UVD state selection, but<br>
>>>> double checking this I found it disabled again.<br>
>>>><br>
>>>> What was the problem with that? Looks like I<br>
>>>> somehow missed the<br>
>>>> discussion<br>
>>>> around it.<br>
>>>><br>
>>>> We did, but after doing so a number of people<br>
>>>> complained about a<br>
>>>> regression on IRC because when apps like xmbc enabled<br>
>>>> post processing,<br>
>>>> performance went down.<br>
>>>><br>
>>>><br>
>>>> That's strange, from my experience the different UVD<br>
>>>> performance states only<br>
>>>> affect UVDs dclk/vclk, not sclk/mclk. I need to get the<br>
>>>> DPM dumps to<br>
>>>> confirms this.<br>
>>>><br>
>>>> The sclks and mclks are usually different as well, especially<br>
>>>> on APUs.<br>
>>>> I can send you some examples.<br>
>>>><br>
>>>> You not off hand remember who complained on IRC? Finding<br>
>>>> something in the<br>
>>>> IRC logs is like searching for a needle in a haystack.<br>
>>>><br>
>>>> I don't remember off hand. I think zgreg was involved in some<br>
>>>> of the<br>
>>>> discussions.<br>
>>>><br>
>>>> Alex<br>
>>>><br>
>>>> Thanks,<br>
>>>> Christian.<br>
>>>><br>
>>>><br>
>>>> Alex<br>
>>>><br>
>>>><br>
>>>> Christian.<br>
>>>><br>
>>>><br>
>>>> For the post processing, we probably need a<br>
>>>> hint we can<br>
>>>> pass to the driver in the CS ioctl to denote<br>
>>>> what state we need.<br>
>>>> Although if we did that, this could would<br>
>>>> largely be moot. That said,<br>
>>>> newer asics support dynamic UVD clocks so we<br>
>>>> really only need<br>
>>>> something like that for older asics and I<br>
>>>> guess VCE.<br>
>>>><br>
>>>> Alex<br>
>>>><br>
>>>> Christian.<br>
>>>><br>
>>>><br>
>>>> Alex<br>
>>>><br>
>>>> Signed-off-by: Marco A Benatto<br>
>>>> <<a href="mailto:marco.antonio.780@gmail.com">marco.antonio.780@gmail.com</a><br>
>>>> <mailto:<a href="mailto:marco.antonio.780@gmail.com">marco.antonio.780@gmail.com</a>>><br>
>>>> Signed-off-by: Christian König<br>
>>>> <<a href="mailto:christian.koenig@amd.com">christian.koenig@amd.com</a><br>
>>>> <mailto:<a href="mailto:christian.koenig@amd.com">christian.koenig@amd.com</a>>><br>
>>>> ---<br>
>>>><br>
>>>> drivers/gpu/drm/radeon/radeon.h<br>
>>>> | 10 ++++++<br>
>>>><br>
>>>> drivers/gpu/drm/radeon/radeon_uvd.c<br>
>>>> | 64<br>
>>>> +++++++++++++++++++++++++++++++++----<br>
>>>> 2 files changed, 68<br>
>>>> insertions(+), 6 deletions(-)<br>
>>>><br>
>>>> diff --git<br>
>>>> a/drivers/gpu/drm/radeon/radeon.h<br>
>>>> b/drivers/gpu/drm/radeon/radeon.h<br>
>>>> index 9e1732e..e92f6cb 100644<br>
>>>> --- a/drivers/gpu/drm/radeon/radeon.h<br>
>>>> +++ b/drivers/gpu/drm/radeon/radeon.h<br>
>>>> @@ -1617,6 +1617,15 @@ int<br>
>>>> radeon_pm_get_type_index(struct<br>
>>>> radeon_device<br>
>>>> *rdev,<br>
>>>> #define<br>
>>>> RADEON_UVD_STACK_SIZE (1024*1024)<br>
>>>> #define RADEON_UVD_HEAP_SIZE<br>
>>>> (1024*1024)<br>
>>>><br>
>>>> +#define RADEON_UVD_FPS_EVENTS_MAX 8<br>
>>>> +#define RADEON_UVD_DEFAULT_FPS 60<br>
>>>> +<br>
>>>> +struct radeon_uvd_fps {<br>
>>>> + uint64_t timestamp;<br>
>>>> + uint8_t event_index;<br>
>>>> + uint8_t<br>
>>>> events[RADEON_UVD_FPS_EVENTS_MAX];<br>
>>>> +};<br>
>>>> +<br>
>>>> struct radeon_uvd {<br>
>>>> struct radeon_bo<br>
>>>> *vcpu_bo;<br>
>>>> void<br>
>>>> *cpu_addr;<br>
>>>> @@ -1626,6 +1635,7 @@ struct<br>
>>>> radeon_uvd {<br>
>>>> struct drm_file<br>
>>>> *filp[RADEON_MAX_UVD_HANDLES];<br>
>>>> unsigned<br>
>>>> img_size[RADEON_MAX_UVD_HANDLES];<br>
>>>> struct delayed_work<br>
>>>> idle_work;<br>
>>>> + struct radeon_uvd_fps<br>
>>>> fps_info[RADEON_MAX_UVD_HANDLES];<br>
>>>> };<br>
>>>><br>
>>>> int radeon_uvd_init(struct<br>
>>>> radeon_device *rdev);<br>
>>>> diff --git<br>
>>>> a/drivers/gpu/drm/radeon/radeon_uvd.c<br>
>>>> b/drivers/gpu/drm/radeon/radeon_uvd.c<br>
>>>> index 6bf55ec..ef5667a 100644<br>
>>>> ---<br>
>>>> a/drivers/gpu/drm/radeon/radeon_uvd.c<br>
>>>> +++<br>
>>>> b/drivers/gpu/drm/radeon/radeon_uvd.c<br>
>>>> @@ -237,6 +237,51 @@ void<br>
>>>> radeon_uvd_force_into_uvd_segment(struct<br>
>>>> radeon_bo *rbo)<br>
>>>> rbo->placement.lpfn =<br>
>>>> (256 * 1024 * 1024) >> PAGE_SHIFT;<br>
>>>> }<br>
>>>><br>
>>>> +static void<br>
>>>> radeon_uvd_fps_clear_events(struct<br>
>>>> radeon_device *rdev,<br>
>>>> int<br>
>>>> idx)<br>
>>>> +{<br>
>>>> + struct radeon_uvd_fps *fps<br>
>>>> = &rdev->uvd.fps_info[idx];<br>
>>>> + unsigned i;<br>
>>>> +<br>
>>>> + fps->timestamp = jiffies_64;<br>
>>>> + fps->event_index = 0;<br>
>>>> + for (i = 0; i <<br>
>>>> RADEON_UVD_FPS_EVENTS_MAX; i++)<br>
>>>> + fps->events[i] = 0;<br>
>>>> +}<br>
>>>> +<br>
>>>> +static void<br>
>>>> radeon_uvd_fps_note_event(struct<br>
>>>> radeon_device *rdev,<br>
>>>> int<br>
>>>> idx)<br>
>>>> +{<br>
>>>> + struct radeon_uvd_fps *fps<br>
>>>> = &rdev->uvd.fps_info[idx];<br>
>>>> + uint64_t timestamp =<br>
>>>> jiffies_64;<br>
>>>> + unsigned rate = 0;<br>
>>>> +<br>
>>>> + uint8_t index =<br>
>>>> fps->event_index++;<br>
>>>> + fps->event_index %=<br>
>>>> RADEON_UVD_FPS_EVENTS_MAX;<br>
>>>> +<br>
>>>> + rate = div64_u64(HZ,<br>
>>>> max(timestamp - fps->timestamp,<br>
>>>> 1ULL));<br>
>>>> +<br>
>>>> + fps->timestamp = timestamp;<br>
>>>> + fps->events[index] =<br>
>>>> min(rate, 120u);<br>
>>>> +}<br>
>>>> +<br>
>>>> +static unsigned<br>
>>>> radeon_uvd_estimate_fps(struct<br>
>>>> radeon_device *rdev,<br>
>>>> int<br>
>>>> idx)<br>
>>>> +{<br>
>>>> + struct radeon_uvd_fps *fps<br>
>>>> = &rdev->uvd.fps_info[idx];<br>
>>>> + unsigned i, valid = 0,<br>
>>>> count = 0;<br>
>>>> +<br>
>>>> + for (i = 0; i <<br>
>>>> RADEON_UVD_FPS_EVENTS_MAX; i++) {<br>
>>>> + /* We should<br>
>>>> ignore zero values */<br>
>>>> + if (fps->events[i]<br>
>>>> != 0) {<br>
>>>> + count +=<br>
>>>> fps->events[i];<br>
>>>> + valid++;<br>
>>>> + }<br>
>>>> + }<br>
>>>> +<br>
>>>> + if (valid > 0)<br>
>>>> + return count / valid;<br>
>>>> + else<br>
>>>> + return<br>
>>>> RADEON_UVD_DEFAULT_FPS;<br>
>>>> +}<br>
>>>> +<br>
>>>> void<br>
>>>> radeon_uvd_free_handles(struct<br>
>>>> radeon_device *rdev, struct<br>
>>>> drm_file *filp)<br>
>>>> {<br>
>>>> int i, r;<br>
>>>> @@ -419,8 +464,10 @@ static int<br>
>>>> radeon_uvd_cs_msg(struct<br>
>>>> radeon_cs_parser<br>
>>>> *p, struct radeon_bo *bo,<br>
>>>><br>
>>>> /* create or decode,<br>
>>>> validate the handle */<br>
>>>> for (i = 0; i <<br>
>>>> RADEON_MAX_UVD_HANDLES; ++i) {<br>
>>>> - if<br>
>>>> (atomic_read(&p->rdev->uvd.handles[i])<br>
>>>> == handle)<br>
>>>> + if<br>
>>>> (atomic_read(&p->rdev->uvd.handles[i])<br>
>>>> == handle)<br>
>>>> {<br>
>>>> +<br>
>>>> radeon_uvd_fps_note_event(p->rdev, i);<br>
>>>> return 0;<br>
>>>> + }<br>
>>>> }<br>
>>>><br>
>>>> /* handle not found<br>
>>>> try to alloc a new one */<br>
>>>> @@ -428,6 +475,7 @@ static int<br>
>>>> radeon_uvd_cs_msg(struct<br>
>>>> radeon_cs_parser<br>
>>>> *p, struct radeon_bo *bo,<br>
>>>> if<br>
>>>> (!atomic_cmpxchg(&p->rdev->uvd.handles[i],<br>
>>>> 0,<br>
>>>> handle)) {<br>
>>>><br>
>>>> p->rdev->uvd.filp[i] = p->filp;<br>
>>>><br>
>>>> p->rdev->uvd.img_size[i] = img_size;<br>
>>>> +<br>
>>>> radeon_uvd_fps_clear_events(p->rdev,<br>
>>>> i);<br>
>>>> return 0;<br>
>>>> }<br>
>>>> }<br>
>>>> @@ -763,7 +811,7 @@ int<br>
>>>> radeon_uvd_get_destroy_msg(struct<br>
>>>> radeon_device<br>
>>>> *rdev, int ring,<br>
>>>> static void<br>
>>>> radeon_uvd_count_handles(struct<br>
>>>> radeon_device *rdev,<br>
>>>><br>
>>>> unsigned *sd, unsigned *hd)<br>
>>>> {<br>
>>>> - unsigned i;<br>
>>>> + unsigned i, fps_rate = 0;<br>
>>>><br>
>>>> *sd = 0;<br>
>>>> *hd = 0;<br>
>>>> @@ -772,10 +820,13 @@ static void<br>
>>>> radeon_uvd_count_handles(struct<br>
>>>> radeon_device *rdev,<br>
>>>> if<br>
>>>> (!atomic_read(&rdev->uvd.handles[i]))<br>
>>>> continue;<br>
>>>><br>
>>>> - if<br>
>>>> (rdev->uvd.img_size[i] >= 720*576)<br>
>>>> - ++(*hd);<br>
>>>> - else<br>
>>>> - ++(*sd);<br>
>>>> + fps_rate =<br>
>>>> radeon_uvd_estimate_fps(rdev, i);<br>
>>>> +<br>
>>>> + if<br>
>>>> (rdev->uvd.img_size[i] >= 720*576) {<br>
>>>> + (*hd) +=<br>
>>>> fps_rate > 30 ? 1 : 2;<br>
>>>> + } else {<br>
>>>> + (*sd) +=<br>
>>>> fps_rate > 30 ? 1 : 2;<br>
>>>> + }<br>
>>>> }<br>
>>>> }<br>
>>>><br>
>>>> @@ -805,6 +856,7 @@ void<br>
>>>> radeon_uvd_note_usage(struct<br>
>>>> radeon_device<br>
>>>> *rdev)<br>
>>>> set_clocks &=<br>
>>>> schedule_delayed_work(&rdev->uvd.idle_work,<br>
>>>><br>
>>>> msecs_to_jiffies(UVD_IDLE_TIMEOUT_MS));<br>
>>>><br>
>>>> +<br>
>>>> if<br>
>>>> ((rdev->pm.pm_method ==<br>
>>>> PM_METHOD_DPM) &&<br>
>>>> rdev->pm.dpm_enabled) {<br>
>>>> unsigned hd =<br>
>>>> 0, sd = 0;<br>
>>>><br>
>>>> radeon_uvd_count_handles(rdev,<br>
>>>> &sd, &hd);<br>
>>>> --<br>
>>>> 1.9.1<br>
>>>><br>
>>>><br>
>>>><br>
>>>><br>
>>>><br>
>>>> --<br>
>>>> Marco Antonio Benatto<br>
>>>> Linux user ID:#506236<br>
>>><br>
>><br>
>><br>
<br>
<br>
</div></div></blockquote></div><br><br clear="all"><br>-- <br>Marco Antonio Benatto<br>Linux user ID:<font> #506236</font>
</div>