[PATCH] drm/radeon: Inline r100_mm_rreg

Christian König deathsimple at vodafone.de
Fri Apr 11 01:33:08 PDT 2014


On 11.04.2014 09:52, Lauri Kasanen wrote:
> On Thu, 10 Apr 2014 21:30:03 +0200
> Christian König <deathsimple at vodafone.de> wrote:
>
>>>>> Quick thought from someone entirely unfamiliar with the hardware:
>>>>> perhaps you can get the performance benefit without the size increase
>>>>> by moving the else portion into a non-inline function? I'm guessing
>>>>> that most accesses happen in the "if" branch.
>>>> The function call overhead is about equal to branching overhead, so
>>>> splitting it would only help about half that. It's called from many
>>>> places, and a lot of calls per sec.
>> Actually direct register access shouldn't be necessary so often. Apart
>> from page flips, write/read pointer updates and irq processing there
>> shouldn't be so many of them. Could you clarify a bit more what issue
>> you are seeing here?
> Too much cpu usage for such a simple function. 2% makes it #2 in top-10
> radeon.ko functions, right after evergreen_cs_parse. For reference, #3
> (radeon_cs_packet_parse) is only 0.5%, one fourth of this function's
> usage.

I think you misunderstood me here. I do believe your numbers, and that
inlining makes a noticeable difference.

But I did a couple of perf tests recently on SI and CIK while hacking
on VM support, and IIRC r100_mm_rreg didn't show up in the top 10 on
those systems.

So what puzzles me is who the heck is calling r100_mm_rreg so often that
it makes a noticeable difference on evergreen/NI?

Christian.

>
> As proved by the perf increase, it's called often enough that getting
> rid of the function call overhead (and compiling the if out at
> compile time) helps measurably.
>
> - Lauri
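
For reference, the split suggested earlier in the thread (inline the "if"
branch, keep the "else" branch out of line) could be sketched roughly as
below. This is a user-space illustration only, not the radeon code: struct
fake_dev, fake_rreg, fake_rreg_slow and the FAKE_* sizes are hypothetical
names, a plain array stands in for the register space, and the locking and
index/data register access of the real slow path are only hinted at in a
comment. It assumes a GCC- or Clang-style noinline attribute.

```c
#include <stdint.h>

#define FAKE_MMIO_WORDS 16u  /* hypothetical: directly reachable window, in 32-bit words */
#define FAKE_REG_WORDS  64u  /* hypothetical: whole register space, in 32-bit words */

struct fake_dev {
    uint32_t regs[FAKE_REG_WORDS]; /* stands in for the hardware registers */
    uint32_t mmio_size;            /* bytes reachable directly; must not exceed 4 * FAKE_REG_WORDS */
    unsigned slow_calls;           /* counts trips through the slow path */
};

/* Slow path, deliberately kept out of line so it is not duplicated at every
 * call site. A real driver would take a lock here and go through an
 * index/data register pair; this sketch just reads the backing array. */
__attribute__((noinline))
uint32_t fake_rreg_slow(struct fake_dev *dev, uint32_t reg)
{
    dev->slow_calls++;
    return dev->regs[(reg / 4) % FAKE_REG_WORDS];
}

/* Fast path, inlined at each call site: one bounds check plus one load.
 * When reg is a compile-time constant and the bound is known, the compiler
 * may be able to drop the check entirely. */
static inline uint32_t fake_rreg(struct fake_dev *dev, uint32_t reg)
{
    if (reg < dev->mmio_size)
        return dev->regs[reg / 4];
    return fake_rreg_slow(dev, reg);
}
```

With this layout each call site grows only by the compare, the load and a
call instruction, while the larger slow path exists once. Whether that
recovers most of the measured gain would still have to be checked with perf.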


