[Mesa-dev] [PATCH 3/4] winsys/radeon: Keep bo statistics
Marek Olšák
maraeo at gmail.com
Wed Jan 8 10:03:53 PST 2014
On Wed, Jan 8, 2014 at 6:21 PM, Lauri Kasanen <cand at gmx.com> wrote:
> On Wed, 8 Jan 2014 15:54:04 +0100
> Marek Olšák <maraeo at gmail.com> wrote:
>
>> > On Wed, 8 Jan 2014 12:03:12 +0100
>> > Marek Olšák <maraeo at gmail.com> wrote:
>> >> Why don't you just set the statistics once per CS in
>> >> radeon_drm_cs_flush? I don't see a value in doing it in every function
>> >> that sets the resources.
>> >
>> > It's the only way to get accurate statistics that I can see. Doing it
>> > per-cs could be off by big amounts (100x even?). Being off by that much
>> > could lead to rather worse decisions.
>>
>> It's not accurate at all, it's actually pretty random. The stats
>> should not be called "num_reads" and "num_writes", they should be
>> called "num_state_changes", and the number of resource state changes
>> has nothing to do with how the resources affect GPU performance. You
>> might get a pretty high score for unimportant resources with your
>> approach. It's as useful as assigning a random number to each
>> resource.
>
> Yes, more accurate names would be "times_bound_for_reads" and
> "times_bound_for_writes", but those are too long names for my taste ;)

Yeah, and those 2 are completely useless, because the number of times
a resource is bound is irrelevant and won't help us in any way. Like I
said, you could just assign a random number to "times_bound_for_*" in
radeon_drm_cs_flush and it would perform the same or better, because
random statistics are better than bad statistics.

The only 2 pieces of information that could help are:
1) how much bandwidth each resource needs per frame
2) how well the GPU can hide latency for memory reads and writes
   from/to that resource.

Based on the two, you could make a pretty good estimate which
resources could be moved to RAM such that it would have the smallest
impact on performance. Unfortunately, we have no way to get that info.

In other words, I won't accept your "num_reads/writes" counters in the
current form.
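
To make the two criteria concrete: if both quantities were available (they currently are not), ranking resources for a move to RAM could look roughly like the sketch below. Everything in it is an assumption for illustration only: the struct `sketch_res`, the fields `bytes_per_frame` and `latency_hidden`, and the cost formula are hypothetical and do not correspond to any existing Mesa or kernel interface.

```c
#include <stddef.h>

/* Hypothetical per-resource inputs; the driver has no way to measure these today. */
struct sketch_res {
    double bytes_per_frame; /* assumed: bandwidth the resource needs per frame */
    double latency_hidden;  /* assumed: 0.0 = no latency hidden, 1.0 = fully hidden */
};

/* Assumed cost model: bandwidth weighted by the share of latency
 * the GPU cannot hide. Lower cost = cheaper to place in RAM. */
static double sketch_move_cost(const struct sketch_res *r)
{
    return r->bytes_per_frame * (1.0 - r->latency_hidden);
}

/* Returns the index of the resource with the lowest estimated cost,
 * or (size_t)-1 if the array is empty. */
static size_t sketch_pick_eviction(const struct sketch_res *res, size_t n)
{
    size_t best = (size_t)-1;

    for (size_t i = 0; i < n; i++) {
        if (best == (size_t)-1 ||
            sketch_move_cost(&res[i]) < sketch_move_cost(&res[best]))
            best = i;
    }
    return best;
}
```

Note that a resource with high bandwidth can still be the best candidate if most of its latency is hidden, which is exactly what a bind counter cannot tell you.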
>
>> Another issue is that you record times when resource state changes
>> happen, but rendering actually starts after radeon_drm_cs_flush is
>> called. Your recorded times actually only tell you when the user
>> changed states, which may be useful for CPU measurements, but it's
>> useless for everything else.
>
> The timing accuracy is intended to determine "recently", i.e. "within
> this frame" or "within a couple of frames". It achieves that as far
> as I can see.

You are completely wrong. You can only determine "within this frame"
and "within a couple frames", in other words "the frame number", by
checking PIPE_FLUSH_END_OF_FRAME (in the pipe driver), which is passed
down to the winsys as RADEON_FLUSH_END_OF_FRAME, which is passed down
to the kernel. Ignoring these flags will not prevent RAM<->VRAM buffer
ping-pong within a frame. Keep in mind that one frame can take 2
seconds or 1 millisecond, therefore recording the time alone is not
good enough.

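
A minimal sketch of frame-based "recently used" tracking, assuming a frame counter that advances only when a flush carries the end-of-frame flag. All names here (`sketch_winsys`, `sketch_bo`, `SKETCH_FLUSH_END_OF_FRAME`) are hypothetical stand-ins, not the actual winsys structures; the flag only mimics the role of RADEON_FLUSH_END_OF_FRAME.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical stand-in for a flag like RADEON_FLUSH_END_OF_FRAME. */
#define SKETCH_FLUSH_END_OF_FRAME (1u << 0)

struct sketch_winsys {
    uint64_t frame; /* advanced once per end-of-frame flush */
};

struct sketch_bo {
    uint64_t last_used_frame; /* frame in which the bo was last referenced */
};

/* Called whenever a bo is referenced by the command stream being built. */
static void sketch_bo_mark_used(const struct sketch_winsys *ws,
                                struct sketch_bo *bo)
{
    bo->last_used_frame = ws->frame;
}

/* Called on every CS flush; only end-of-frame flushes advance the counter. */
static void sketch_cs_flush(struct sketch_winsys *ws, unsigned flags)
{
    if (flags & SKETCH_FLUSH_END_OF_FRAME)
        ws->frame++;
}

/* "Recently" is measured in frames, not in wall-clock time. */
static bool sketch_bo_used_recently(const struct sketch_winsys *ws,
                                    const struct sketch_bo *bo,
                                    uint64_t max_age_frames)
{
    return ws->frame - bo->last_used_frame <= max_age_frames;
}
```

With this, "within this frame" is `max_age_frames == 0` and "within a couple frames" is a small positive value, regardless of whether a frame takes 2 seconds or 1 millisecond.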
Marek