[PATCH] drm/radeon: Inline r100_mm_rreg

Thu Apr 10 11:52:33 PDT 2014

On Thu, Apr 10, 2014 at 2:46 PM, Lauri Kasanen <cand at gmx.com> wrote:
> On Thu, 10 Apr 2014 12:19:10 -0400
> Ilia Mirkin <imirkin at alum.mit.edu> wrote:
>
>> > +static inline uint32_t r100_mm_rreg(struct radeon_device *rdev, uint32_t reg,
>> > +                                   bool always_indirect)
>> > +{
>> > +       if (reg < rdev->rmmio_size && !always_indirect)
>> > +               return readl(((void __iomem *)rdev->rmmio) + reg);
>>
>> Quick thought from someone entirely unfamiliar with the hardware:
>> perhaps you can get the performance benefit without the size increase
>> by moving the else portion into a non-inline function? I'm guessing
>> that most accesses happen in the "if" branch.
>
> The function call overhead is about equal to branching overhead, so
> splitting it would only help about half that. It's called from many
> places, and a lot of calls per sec.

Is that really true? I haven't profiled it, but a function call has to
save/restore registers, set up new stack, etc. And the jump is to some
far-away code, so perhaps less likely to be in i-cache? (But maybe
not, not sure on the details of how all that works.) And I'm guessing
most of the size increase is coming from the spinlock/unlock, which,
I'm guessing again, is the much-rarer case.

And the branch would happen either way... so that's a sunk cost.
(Except I bet that the always_indirect param is always constant and
that bit of the if can be resolved at compile time with that part
being inlined.)

Anyways, it was just a thought.

  -ilia