[Mesa-dev] [PATCH 00/26] RadeonSI: Primitive culling with async compute
Dieter Nützel
Dieter at nuetzel-hh.de
Thu Feb 14 18:43:38 UTC 2019
For the whole series (with the updated branch merged in):
Tested-by: Dieter Nützel <Dieter at nuetzel-hh.de>
on Polaris 20, with
FreeCAD, Blender, UH, UV, US, and some VTK apps.
No surprising speedup, but also NO slowdown.
The Tested-by also applies to
[Mesa-dev] [PATCH 0/4] RadeonSI: Follow-up for the primitive culling series
(but no SI hardware here).
mplayer / mpv work like a charm again.
ParaView-5.6.0-MPI-Linux-64bit
1920x1080
pd off ~18 fps
pd on ~24 fps ! ;-)
2560x1440
pd off ~14 fps
pd on ~16 fps
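
For reference, the relative gain behind these rough figures can be worked out as below. This is only a sketch: the fps values are the approximate readings listed above, and the helper name `relative_gain_percent` is made up for illustration.

```python
# Approximate ParaView "Many Spheres" readings from this mail (fps), keyed by
# resolution. "off"/"on" mean primitive discard (pd) disabled/enabled.
results = {
    "1920x1080": {"off": 18.0, "on": 24.0},
    "2560x1440": {"off": 14.0, "on": 16.0},
}


def relative_gain_percent(off_fps, on_fps):
    """Percentage change in frame rate when going from pd off to pd on."""
    return (on_fps - off_fps) / off_fps * 100.0


for resolution, fps in results.items():
    gain = relative_gain_percent(fps["off"], fps["on"])
    print(f"{resolution}: {gain:.1f}% faster with pd on")
```

That is roughly a third more frames at 1920x1080 and about 14% more at 2560x1440, bearing in mind that the inputs are only approximate.
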
./pvbatch
../lib/python2.7/site-packages/paraview/benchmark/manyspheres.py -s 100
-r 726 -v 1920,1080 -f 30
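
A minimal sketch of how the pd on/off comparison could be scripted, assuming the AMD_DEBUG=pd / AMD_DEBUG=nopd switches Marek describes in the cover letter quoted below. The helper names and the `dry_run` flag are hypothetical; the pvbatch arguments are the ones from the command above.

```python
import os
import subprocess

# Benchmark invocation as used in this mail; the paths are relative to the
# directory that contains pvbatch.
PVBATCH = "./pvbatch"
SCRIPT = "../lib/python2.7/site-packages/paraview/benchmark/manyspheres.py"
ARGS = ["-s", "100", "-r", "726", "-v", "1920,1080", "-f", "30"]


def benchmark_command():
    """Full argument list for one benchmark run."""
    return [PVBATCH, SCRIPT] + ARGS


def pd_environment(enable_pd, base_env=None):
    """Copy of the environment with AMD_DEBUG set to 'pd' or 'nopd'."""
    env = dict(os.environ if base_env is None else base_env)
    env["AMD_DEBUG"] = "pd" if enable_pd else "nopd"
    return env


def run_benchmark(enable_pd, dry_run=True):
    """Run the benchmark once; with dry_run=True only print the command."""
    cmd = benchmark_command()
    env = pd_environment(enable_pd)
    if dry_run:
        print(f"AMD_DEBUG={env['AMD_DEBUG']} {' '.join(cmd)}")
        return None
    return subprocess.run(cmd, env=env, check=True)


if __name__ == "__main__":
    for enable_pd in (False, True):
        run_benchmark(enable_pd)
```

Note that this sketch overwrites any AMD_DEBUG value already set in the environment, and that it only prints the commands unless `dry_run=False` is passed.
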
Is this right?
Poor
Intel Xeon X3470, 2.93 GHz, 3.2 GHz turbo, 4c/8t
24 GB
Polaris 20, 8 GB
PCIe 2.1 only (NO PCIe atomics)
Dieter
On 14.02.2019 03:07, Marek Olšák wrote:
> I just updated the branch, fixing video players.
>
> Marek
>
> On Wed, Feb 13, 2019 at 8:28 PM Dieter Nützel <Dieter at nuetzel-hh.de>
> wrote:
>
>> Now with LLVM 9.0 git;-)
>>
>> Running, except mplayer/mpv (same as before).
>>
>> mplayer: ../src/gallium/drivers/radeon/radeon_winsys.h:866:
>> radeon_get_heap_index: Assertion `!"32BIT without WC is disallowed"'
>> failed.
>> Aborted (core dumped)
>>
>> mpv: ../src/gallium/drivers/radeon/radeon_winsys.h:866:
>> radeon_get_heap_index: Assertion `!"32BIT without WC is disallowed"'
>> failed.
>> Aborted (core dumped)
>>
>> And this after glxgears, Blender, FreeCAD, UH and UV:
>>
>> [38939.440950] [drm:amdgpu_ctx_mgr_entity_fini [amdgpu]] *ERROR* ctx
>> 00000000679c61fd is still alive
>> [38939.440993] [drm:amdgpu_ctx_mgr_fini [amdgpu]] *ERROR* ctx
>> 00000000679c61fd is still alive
>> [38964.901076] [drm:amdgpu_ctx_mgr_entity_fini [amdgpu]] *ERROR* ctx
>> 000000009c4b659b is still alive
>> [38964.901130] [drm:amdgpu_ctx_mgr_fini [amdgpu]] *ERROR* ctx
>> 000000009c4b659b is still alive
>> [38980.844577] [drm:amdgpu_ctx_mgr_entity_fini [amdgpu]] *ERROR* ctx
>> 000000001bee3a35 is still alive
>> [38980.844642] [drm:amdgpu_ctx_mgr_fini [amdgpu]] *ERROR* ctx
>> 000000001bee3a35 is still alive
>>
>> Is a newer 'amd-staging-drm-next' needed? I am currently on #0bf64b0a9f78.
>>
>> If I only had some big triangle apps...;-)
>>
>> Dieter
>>
>> On 13.02.2019 17:36, Marek Olšák wrote:
>>> Dieter, you need final LLVM 8.0.
>>>
>>> Marek
>>>
>>> On Wed, Feb 13, 2019 at 11:02 AM Dieter Nützel
>>> <Dieter at nuetzel-hh.de> wrote:
>>>
>>>> GREAT stuff, Marek!
>>>>
>>>> But sadly some crashes.
>>>> Is my LLVM git version too old?
>>>> It is from 7 Jan 2019 (shortly before the 8.0 cut).
>>>>
>>>> LLVM (http://llvm.org/):
>>>> LLVM version 8.0.0svn
>>>> Optimized build.
>>>> Default target: x86_64-unknown-linux-gnu
>>>> Host CPU: nehalem
>>>>
>>>> Registered Targets:
>>>> amdgcn - AMD GCN GPUs
>>>> r600 - AMD GPUs HD2XXX-HD6XXX
>>>> x86 - 32-bit X86: Pentium-Pro and above
>>>> x86-64 - 64-bit X86: EM64T and AMD64
>>>>
>>>> Please have a look at my post @Phoronix:
>>>>
>>>> https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1079916-radeonsi-picks-up-primitive-culling-with-async-compute-for-performance-wins?p=1079984#post1079984
>>>>
>>>> Thanks,
>>>> Dieter
>>>>
>>>> On 13.02.2019 06:15, Marek Olšák wrote:
>>>>> Hi,
>>>>>
>>>>> This patch series uses async compute to do primitive culling before
>>>>> the vertex shader. It significantly improves performance for
>>>>> applications that use a lot of geometry that is invisible because
>>>>> primitives don't intersect sample points or there are a lot of back
>>>>> faces, etc.
>>>>>
>>>>> It passes 99.9999% of all tests (GL CTS, dEQP, piglit) and is 100%
>>>>> stable.
>>>>> It supports all chips all the way from Sea Islands to Radeon VII.
>>>>>
>>>>> As you can see in the results marked (ENABLED) in the picture below,
>>>>> it destroys our competition (The GeForce results are from a Phoronix
>>>>> article from 2017, the latest ones I could find):
>>>>>
>>>>> Benchmark: ParaView - Many Spheres - 2560x1440
>>>>> https://people.freedesktop.org/~mareko/prim-discard-cs-results.png
>>>>>
>>>>>
>>>>> The last patch describes the implementation and functional limitations
>>>>> if you can find the huge code comment, so I'm not gonna do that here.
>>>>>
>>>>> I decided to enable this optimization on all Pro graphics cards.
>>>>> The reason is that I haven't had time to benchmark games.
>>>>> This decision may be changed based on community feedback, etc.
>>>>>
>>>>> People using the Pro graphics cards can disable this by setting
>>>>> AMD_DEBUG=nopd, and people using consumer graphics cards can enable
>>>>> this by setting AMD_DEBUG=pd. So you always have a choice.
>>>>>
>>>>> Eventually we might also enable this on consumer graphics cards for
>>>>> those games that benefit. It might decrease performance if there is
>>>>> not enough invisible geometry.
>>>>>
>>>>> Branch:
>>>>> https://cgit.freedesktop.org/~mareko/mesa/log/?h=prim-discard-cs
>>>>>
>>>>> Please review.
>>>>>
>>>>> Thanks,
>>>>> Marek