[PATCH] drm/radeon: Inline r100_mm_rreg

Christian König deathsimple at vodafone.de
Fri Apr 11 05:32:20 PDT 2014


On 11.04.2014 11:54, Lauri Kasanen wrote:
> On Fri, 11 Apr 2014 10:33:08 +0200
> Christian König <deathsimple at vodafone.de> wrote:
>
>>>> Actually direct register access shouldn't be necessary so often. Apart
>>>> from page flips, write/read pointer updates and irq processing there
>>>> shouldn't be so many of them. Could you clarify a bit more what issue
>>>> you are seeing here?
>>> Too much cpu usage for such a simple function. 2% makes it #2 in top-10
>>> radeon.ko functions, right after evergreen_cs_parse. For reference, #3
>>> (radeon_cs_packet_parse) is only 0.5%, one fourth of this function's
>>> usage.
>> I think you misunderstood me here. I do believe your numbers that it
>> makes a noticeable difference.
>>
>> But I did a couple of perf tests recently on SI and CIK while hacking
>> on VM support, and IIRC r100_mm_rreg didn't show up in the top 10 on
>> those systems.
>>
>> So what puzzles me is who the heck is calling r100_mm_rreg so often that
>> it makes a noticeable difference on evergreen/NI?
> The biggest caller is cayman_cp_int_cntl_setup. Before inlining it took
> 0.0013%, after it takes 1%.

Sounds like somebody is constantly turning interrupts on and off.

> This is on a Richland APU, so Aruba/Cayman. Urban Terror is an ioq3
> game with a lot of cpu-side vertex submissions.

That will probably be the difference, I only tested lightsmark.

Anyway, I would do as Ilia suggested and move only the else branch into
a separate, non-inlined function.

BTW: It's probably a good idea to do the same for the write function as 
well.
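
To make the suggestion concrete, here is a rough userspace sketch of the
pattern: the common in-range access stays in a small inline function, and
only the rare out-of-range branch calls an out-of-line helper. All names
here (fake_dev, fake_rreg, fake_wreg, the *_slow helpers, the FAKE_* sizes)
are made up for illustration and are not the radeon code; a plain array
stands in for the MMIO window, and the real slow path would go through the
indexed register access instead.

```c
#include <assert.h>
#include <stdint.h>

/* Hypothetical sizes, chosen only for this sketch. */
#define FAKE_MMIO_SIZE 16u /* bytes reachable through the direct window */
#define FAKE_REG_SPACE 64u /* total register space in bytes */

/* Hypothetical stand-in for the device state, not struct radeon_device. */
struct fake_dev {
    uint32_t regs[FAKE_REG_SPACE / 4]; /* backing store for all registers */
    uint32_t mmio_size;                /* size of the direct window */
    unsigned slow_calls;               /* how often a slow path was taken */
};

#if defined(__GNUC__)
#define FAKE_NOINLINE __attribute__((noinline))
#else
#define FAKE_NOINLINE
#endif

/* Rare case: kept out of line so it does not bloat every call site. */
static FAKE_NOINLINE uint32_t fake_rreg_slow(struct fake_dev *dev, uint32_t reg)
{
    assert(reg < FAKE_REG_SPACE);
    dev->slow_calls++;
    return dev->regs[reg / 4];
}

static FAKE_NOINLINE void fake_wreg_slow(struct fake_dev *dev, uint32_t reg,
                                         uint32_t v)
{
    assert(reg < FAKE_REG_SPACE);
    dev->slow_calls++;
    dev->regs[reg / 4] = v;
}

/* Common case: one compare plus one load, cheap enough to inline. */
static inline uint32_t fake_rreg(struct fake_dev *dev, uint32_t reg)
{
    if (reg < dev->mmio_size)
        return dev->regs[reg / 4];
    return fake_rreg_slow(dev, reg);
}

/* Same split for the write side. */
static inline void fake_wreg(struct fake_dev *dev, uint32_t reg, uint32_t v)
{
    if (reg < dev->mmio_size)
        dev->regs[reg / 4] = v;
    else
        fake_wreg_slow(dev, reg, v);
}
```

The point of the split is that each call site only grows by the compare
and the direct access, while the code for the rare case exists once.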

Christian.

> - Lauri


