[LKP] [drm/mgag200] 90f479ae51: vm-scalability.median -18.8% regression

Rong Chen rong.a.chen at intel.com
Fri Aug 2 07:11:12 UTC 2019


Hi,

On 8/1/19 7:58 PM, Thomas Zimmermann wrote:
> Hi
>
> Am 01.08.19 um 13:25 schrieb Feng Tang:
>> Hi Thomas,
>>
>> On Thu, Aug 01, 2019 at 11:59:28AM +0200, Thomas Zimmermann wrote:
>>> Hi
>>>
>>> Am 01.08.19 um 10:37 schrieb Feng Tang:
>>>> On Thu, Aug 01, 2019 at 02:19:53PM +0800, Rong Chen wrote:
>>>>>>>>>>>>>> commit: 90f479ae51afa45efab97afdde9b94b9660dd3e4 ("drm/mgag200: Replace struct mga_fbdev with generic framebuffer emulation")
>>>>>>>>>>>>>> https://kernel.googlesource.com/pub/scm/linux/kernel/git/next/linux-next.git master
>>>>>>>>>>>>> Daniel, Noralf, we may have to revert this patch.
>>>>>>>>>>>>>
>>>>>>>>>>>>> I expected some change in display performance, but not in VM. Since it's
>>>>>>>>>>>>> a server chipset, probably no one cares much about display performance.
>>>>>>>>>>>>> So that seemed like a good trade-off for re-using shared code.
>>>>>>>>>>>>>
>>>>>>>>>>>>> Part of the patch set is that the generic fb emulation now maps and
>>>>>>>>>>>>> unmaps the fbdev BO when updating the screen. I guess that's the cause
>>>>>>>>>>>>> of the performance regression. And it should be visible with other
>>>>>>>>>>>>> drivers as well if they use a shadow FB for fbdev emulation.
>>>>>>>>>>>> For fbcon we shouldn't need to do any maps/unmaps at all, this is for the
>>>>>>>>>>>> fbdev mmap support only. If the testcase mentioned here tests fbdev
>>>>>>>>>>>> mmap handling it's pretty badly misnamed :-) And as long as you don't
>>>>>>>>>>>> have an fbdev mmap there shouldn't be any impact at all.
>>>>>>>>>>> The ast and mgag200 have only a few MiB of VRAM, so we have to get the
>>>>>>>>>>> fbdev BO out if it's not being displayed. If not being mapped, it can be
>>>>>>>>>>> evicted and make room for X, etc.
>>>>>>>>>>>
>>>>>>>>>>> To make this work, the BO's memory is mapped and unmapped in
>>>>>>>>>>> drm_fb_helper_dirty_work() before being updated from the shadow FB. [1]
>>>>>>>>>>> That fbdev mapping is established on each screen update, more or less.
>>>>>>>>>>>  From my (yet unverified) understanding, this causes the performance
>>>>>>>>>>> regression in the VM code.
>>>>>>>>>>>
>>>>>>>>>>> The original code in mgag200 used to kmap the fbdev BO while it's being
>>>>>>>>>>> displayed; [2] and the drawing code only mapped it when necessary (i.e.,
>>>>>>>>>>> not being displayed). [3]
>>>>>>>>>> Hm yeah, this vmap/vunmap is going to be pretty bad. We indeed should
>>>>>>>>>> cache this.
>>>>>>>>>>
>>>>>>>>>>> I think this could be added for VRAM helpers as well, but it's still a
>>>>>>>>>>> workaround and non-VRAM drivers might also run into such a performance
>>>>>>>>>>> regression if they use the fbdev's shadow fb.
>>>>>>>>>> Yeah agreed, fbdev emulation should try to cache the vmap.
>>>>>>>>>>
>>>>>>>>>>> Noralf mentioned that there are plans for other DRM clients besides the
>>>>>>>>>>> console. They would as well run into similar problems.
>>>>>>>>>>>
>>>>>>>>>>>>> The thing is that we'd need another generic fbdev emulation for ast and
>>>>>>>>>>>>> mgag200 that handles this issue properly.
>>>>>>>>>>>> Yeah I don't think we want to jump the gun here.  If you can try to
>>>>>>>>>>>> repro locally and profile where we're wasting cpu time I hope that
>>>>>>>>>>>> should shed light on what's going wrong here.
>>>>>>>>>>> I don't have much time ATM and I'm not even officially at work until
>>>>>>>>>>> late Aug. I'd send you the revert and investigate later. I agree that
>>>>>>>>>>> using generic fbdev emulation would be preferable.
>>>>>>>>>> Still not sure that's the right thing to do really. Yes it's a
>>>>>>>>>> regression, but vm testcases shouldn't run a single line of fbcon or drm
>>>>>>>>>> code. So why this is impacted so heavily by a silly drm change is very
>>>>>>>>>> confusing to me. We might be papering over a deeper and much more
>>>>>>>>>> serious issue ...
>>>>>>>>> It's a regression, the right thing is to revert first and then work
>>>>>>>>> out the right thing to do.
>>>>>>>> Sure, but I have no idea whether the testcase is doing something
>>>>>>>> reasonable. If it's accidentally testing vm scalability of fbdev and
>>>>>>>> there's no one else doing something this pointless, then it's not a
>>>>>>>> real bug. Plus I think we're shooting the messenger here.
>>>>>>>>
>>>>>>>>> It's likely the test runs on the console and printfs stuff out while running.
>>>>>>>> But why did we not regress the world if a few prints on the console
>>>>>>>> have such a huge impact? We didn't get an entire stream of mails about
>>>>>>>> breaking stuff ...
>>>>>>> The regression seems not related to the commit.  But we have retested
>>>>>>> and confirmed the regression.  Hard to understand what happens.
>>>>>> Does the regressed test cause any output on console while it's
>>>>>> measuring? If so, it's probably accidentally measuring fbcon/DRM code in
>>>>>> addition to the workload it's trying to measure.
>>>>>>
>>>>> Sorry, I'm not familiar with DRM, we enabled the console to output logs, and
>>>>> attached please find the log file.
>>>>>
>>>>> "Command line: ... console=tty0 earlyprintk=ttyS0,115200
>>>>> console=ttyS0,115200 vga=normal rw"
>>>> We did more checks and found that this test machine does use the
>>>> mgag200 driver.
>>>>
>>>> And we are suspecting the regression is caused by
>>>>
>>>> commit cf1ca9aeb930df074bb5bbcde55f935fec04e529
>>>> Author: Thomas Zimmermann <tzimmermann at suse.de>
>>>> Date:   Wed Jul 3 09:58:24 2019 +0200
>>> Yes, that's the commit. Unfortunately reverting it would require
>>> reverting a handful of other patches as well.
>>>
>>> I have a potential fix for the problem. Could you run and verify that it
>>> resolves the problem?
>> Sure, please send it to us. Rong and I will try it.
> Fantastic, thank you! The patch set is available on dri-devel at
>
>    https://lists.freedesktop.org/archives/dri-devel/2019-August/228950.html

The patch set improves the performance slightly, but the change is
small.

$ git log --oneline 8f7ec6bcc7 -5
8f7ec6bcc75a9 drm/mgag200: Map fbdev framebuffer while it's being displayed
abcb1cf24033a drm/ast: Map fbdev framebuffer while it's being displayed
a92f80044c623 drm/vram-helpers: Add kmap ref-counting to GEM VRAM objects
90f479ae51afa drm/mgag200: Replace struct mga_fbdev with generic framebuffer emulation
f1f8555dfb9a7 drm/bochs: Use shadow buffer for bochs framebuffer console

commit:
   f1f8555dfb ("drm/bochs: Use shadow buffer for bochs framebuffer console")
   90f479ae51 ("drm/mgag200: Replace struct mga_fbdev with generic framebuffer emulation")
   8f7ec6bcc7 ("drm/mgag200: Map fbdev framebuffer while it's being displayed")

f1f8555dfb9a70a2  90f479ae51afa45efab97afdde  8f7ec6bcc75a996f5c6b39a9cf  testcase/testparams/testbox
----------------  --------------------------  --------------------------  ---------------------------
         %stddev      change         %stddev      change         %stddev
             \          |                \          |                \
      43921             -18%      35884             -17%      36629        vm-scalability/performance-300s-8T-anon-cow-seq-hugetlb/lkp-knm01
      43921             -18%      35884             -17%      36629        GEO-MEAN vm-scalability.median
Best Regards,
Rong Chen

>
> Best regards
> Thomas
>
>> Thanks,
>> Feng
>>
>>
>>> Best regards
>>> Thomas
>>>
>>>>      drm/fb-helper: Map DRM client buffer only when required
>>>>      
>>>>      This patch changes DRM clients to not map the buffer by default. The
>>>>      buffer, like any buffer object, should be mapped and unmapped when
>>>>      needed.
>>>>      
>>>>      An unmapped buffer object can be evicted to system memory and does
>>>>      not consume video ram until displayed. This allows to use generic fbdev
>>>>      emulation with drivers for low-memory devices, such as ast and mgag200.
>>>>      
>>>>      This change affects the generic framebuffer console. HW-based consoles
>>>>      map their console buffer once and keep it mapped. Userspace can mmap this
>>>>      buffer into its address space. The shadow-buffered framebuffer console
>>>>      only needs the buffer object to be mapped during updates. While not being
>>>>      updated from the shadow buffer, the buffer object can remain unmapped.
>>>>      Userspace will always mmap the shadow buffer.
>>>>   
>>>> which may add more load when fbcon is busy printing out messages.
>>>>
>>>> We are doing more tests inside 0day to confirm.
>>>>
>>>> Thanks,
>>>> Feng
>>>> _______________________________________________
>>>> dri-devel mailing list
>>>> dri-devel at lists.freedesktop.org
>>>> https://lists.freedesktop.org/mailman/listinfo/dri-devel
>>>>
>>> -- 
>>> Thomas Zimmermann
>>> Graphics Driver Developer
>>> SUSE Linux GmbH, Maxfeldstrasse 5, 90409 Nuernberg, Germany
>>> GF: Felix Imendörffer, Mary Higgins, Sri Rasiah
>>> HRB 21284 (AG Nürnberg)
>>>
>>
>>

-------------- next part --------------
A non-text attachment was scrubbed...
Name: kmsg.xz
Type: application/x-xz
Size: 82932 bytes
Desc: not available
URL: <https://lists.freedesktop.org/archives/dri-devel/attachments/20190802/b0e180ca/attachment-0001.xz>

