[Intel-gfx] [PATCH v3] drm/i915: Use exponential backoff for wait_for()

Thu Nov 30 03:04:14 UTC 2017

On 11/24/2017 6:12 AM, Chris Wilson wrote:
> Quoting Michał Winiarski (2017-11-24 12:37:56)
>> Since we see the effects for GuC preeption, let's gather some evidence.
>>
>> (SKL)
>> intel_guc_send_mmio latency: 100 rounds of gem_exec_latency --r '*-preemption'
>>
>> drm-tip:
>>       usecs               : count     distribution
>>           0 -> 1          : 0        |                                        |
>>           2 -> 3          : 0        |                                        |
>>           4 -> 7          : 0        |                                        |
>>           8 -> 15         : 44       |                                        |
>>          16 -> 31         : 1088     |                                        |
>>          32 -> 63         : 832      |                                        |
>>          64 -> 127        : 0        |                                        |
>>         128 -> 255        : 0        |                                        |
>>         256 -> 511        : 12       |                                        |
>>         512 -> 1023       : 0        |                                        |
>>        1024 -> 2047       : 29899    |*********                               |
>>        2048 -> 4095       : 131033   |****************************************|
> Such pretty graphs. Reminds me of the bpf hist output, I wonder if we
> could create a tracepoint/kprobe that would output a histogram for each
> waiter (filterable ofc). Benefit? Just thinking of tuning the
> spin/sleep, in which case overall metrics are best
> (intel_eait_for_register needs to be optimised for the typical case). I
> am wondering if we could tune the spin period down to 5us, 2us? And then
> have the 10us sleep.
>
> We would also need a typical workload to run, it's profile-guided
> optimisation after all. Hmm.
> -Chris

It took me a while to get back to this but I've now had chance to run 
with this exponential backoff scheme on the original system that showed 
the problem. It was a slightly messy back port due to the customer tree 
being much older than current nightly. I'm pretty sure I got it correct 
though. However, I'm not sure what the recommendation is for the two 
timeout values. Using the default of '10, 10' in the patch, I still get 
lots of very long delays. I have to up the Wmin value to at least 140 to 
get a stall free result. Which is plausible given that the big spike in 
the results of any fast version is at 110-150us. Also of note is that a 
Wmin between 10 and 110 actually makes things worse. Changing Wmax has 
no effect.

In the following table, 'original' is the original driver before any 
changes and 'retry loop' is the version using the first workaround of 
just running the busy poll wait in a 10x loop. The other columns are 
using the backoff patch with the given Wmin/Wmax values. Note that the 
times are bucketed to 10us up to 500us and then in 500us lumps 
thereafter. The value listed is the lower limit, i.e. there were no 
times of <10us measured. Each case was run for 1000 samples.

     Time        Original    10/10 50/10    100/10    110/10    
130/10    140/10  RetryLoop
     10us:          2         2         2         2 2         2         
2         2
     30us:                              1         1 1         1         1
     50us:                              1
     70us:                             14        63 56        64        
63        61
     80us:                              8        41 52        44        
46        41
     90us:                              6        24 10        28        
12        17
    100us:                    2         4        20 16        17        
17        22
    110us:                                       13 21        14        
13        11
    120us:                              6       366 633       636       
660       650
    130us:                    2         2        46 125        95        
86        95
    140us:                    3         2        16 18        32        
46        48
    150us:                  210         3        12 13        37        
32        31
    160us:                  322         1        18 10        14        
12        17
    170us:                  157         4         5 5         3         
5         2
    180us:                   62        11         3 1         2         
1         1
    190us:                   32       212 1                   1         2
    200us:                   27       266 1                   1
    210us:                   16 
181                                                 1
    220us:                   16 51                                       1
    230us:                   10        43         4
    240us:                   12        22        62         1
    250us:                    4        12       112         3
    260us:                    3        13        73         8
    270us:                    5        12        12 8         2
    280us:                    4         7        12 5         1
    290us:                              9         4
    300us:                    1         3         9 1         1
    310us:                    2         3         5 1         1
    320us:                    1         4         2         3
    330us:                    1         5         1
    340us:                    1 2                   1
    350us:                              2         1
    360us:                              2         1
    370us:                    2                   2
    380us:                                        1
    390us:                    2         1         2         1
    410us:                    1
    420us:                    3
    430us:                    2         2         1
    440us:                    2         1
    450us:                              4
    460us:                    3         1
    470us:                              3         1
    480us:                    2                   2
    490us:                                        1
    500us:                   19        13        17
   1000us:        249        22        30        11
   1500us:        393         4         4         2         1
   2000us:        132         7         8         8 2         
1                   1
   2500us:         63         4         4         6 1         1         1
   3000us:         59         9         7         6         1
   3500us:         34         2 1                             1
   4000us:         17         9         4         1
   4500us:          8         2         1         1
   5000us:          7         1         2
   5500us:          7         2                   1
   6000us:          4         2         1         1
   6500us:          3                             1
   7000us:          6         2                   1
   7500us:          4         1                             1
   8000us:          5                             1
   8500us:                    1         1
   9000us:          2
   9500us:          2         1
 >10000us:          3                             1

John.

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <https://lists.freedesktop.org/archives/intel-gfx/attachments/20171129/53639068/attachment.html>