[PATCH RFC 00/24] Lima DRM driver
Qiang Yu
yuq825 at gmail.com
Wed May 23 14:13:13 UTC 2018
On Wed, May 23, 2018 at 9:59 PM, Christian König
<christian.koenig at amd.com> wrote:
> Am 23.05.2018 um 15:52 schrieb Qiang Yu:
>>
>> On Wed, May 23, 2018 at 5:29 PM, Christian König
>> <ckoenig.leichtzumerken at gmail.com> wrote:
>>>
>>> Am 18.05.2018 um 11:27 schrieb Qiang Yu:
>>>>
>>>> Kernel DRM driver for ARM Mali 400/450 GPUs.
>>>>
>>>> This implementation mainly takes the amdgpu DRM driver as reference.
>>>>
>>>> - Mali 4xx GPUs have two kinds of processors, GP and PP. GP is for
>>>> OpenGL vertex shader processing and PP is for fragment shader
>>>> processing. Each processor has its own MMU, so the processors work
>>>> in a virtual address space.
>>>> - There's only one GP but multiple PPs (max 4 for mali 400 and 8
>>>> for mali 450) in the same mali 4xx GPU. All PPs are grouped
>>>> together to handle a single fragment shader task divided by
>>>> FB output tiled pixels. The Mali 400 user space driver is
>>>> responsible for assigning target tiled pixels to each PP, but mali
>>>> 450 has a HW module called DLBU to dynamically balance each
>>>> PP's load.
>>>> - The user space driver allocates a buffer object and maps it into
>>>> the GPU virtual address space, uploads the command stream and draw
>>>> data through a CPU mmap of the buffer object, then submits a task to
>>>> GP/PP with a register frame indicating where the command stream is,
>>>> plus misc settings.
>>>> - There's no command stream validation/relocation because each user
>>>> process has its own GPU virtual address space. The GP/PP MMU switches
>>>> virtual address space before running two tasks from different
>>>> user processes. Erroneous or evil user space code just gets an MMU
>>>> fault or a GP/PP error IRQ, then the HW/SW will be recovered.
>>>> - Use TTM as MM. TTM_PL_TT type memory is used as the content of
>>>> a lima buffer object, which is allocated from the TTM page pool. All
>>>> lima buffer objects get pinned with TTM_PL_FLAG_NO_EVICT at
>>>> allocation, so there's no buffer eviction and swap for now. We
>>>> need reverse engineering to see if and how GP/PP support MMU
>>>> fault recovery (continue execution). Otherwise we have to
>>>> pin/unpin each involved buffer at task creation/deletion.
>>>
>>>
>>> Well pinning all memory is usually a no-go for upstreaming. But since you
>>> are already using the drm_sched for GPU task scheduling, why do you
>>> actually need this?
>>>
>>> The scheduler should take care of signaling all fences when the hardware
>>> is done with its magic, and that is enough for TTM to note that a buffer
>>> object is movable again (e.g. unpin them).
>>
>> Please correct me if I'm wrong.
>
>
> Well, you are wrong :)
>
>> One way to implement eviction/swap is like this:
>> call validation on each buffer involved in a task, but this won't
>> prevent it from being evicted/swapped while executing, so a GPU MMU
>> fault may happen and in the handler we need to recover the
>> evicted/swapped buffer.
>>
>> Another way is to pin/unpin the buffers involved when a task is
>> created/freed.
>>
>> The first way is better when memory load is low and the second way is
>> better when memory load is high. The first way also needs less memory.
>>
>> So I'd prefer the first way, but because the GPU MMU fault
>> HW op needs reverse engineering, I have to pin all buffers for now.
>> After the HW op is clear, I can choose one way to implement.
>
>
> The general approach is:
> 1.) Lock all BOs
> 2.) Validate all BOs
> 3.) Add the fence
> 4.) Unlock the BOs
This is the task prepare process, right?
>
> BOs can't be evicted while they are locked
During the task prepare stage they're locked, but after the task is
queued they get unlocked and become evictable?
> and since you already add the
> fence that should be perfectly sufficient to prevent it from being evicted
> until your operation is completed.
You mean I have to explicitly pin it with TTM_PL_FLAG_NO_EVICT
at task creation, or that TTM will check the buffer's reservation object
and won't evict it if it sees a fence?
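
To check that I read the four steps correctly, here is a minimal
user-space sketch of them. It is only a toy model, not lima or TTM code:
struct toy_bo and every toy_* function are names I made up for
illustration, and the rule "an unsignaled fence blocks eviction" is my
assumption about the behavior you describe (real TTM may, for example,
wait on the fence instead of refusing).

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical toy buffer object; this is NOT struct ttm_buffer_object. */
struct toy_bo {
	bool locked;        /* reserved by a submitter */
	bool resident;      /* backing pages are in place for the GPU */
	bool fence_pending; /* an unsignaled fence is attached */
};

/* Steps 1-4 from the mail: lock all, validate all, add the fence, unlock. */
static bool toy_submit(struct toy_bo *bos, size_t n)
{
	size_t i;

	for (i = 0; i < n; i++) {          /* 1.) lock all BOs */
		if (bos[i].locked)
			goto unwind;
		bos[i].locked = true;
	}
	for (i = 0; i < n; i++)            /* 2.) validate all BOs */
		bos[i].resident = true;
	for (i = 0; i < n; i++)            /* 3.) add the fence */
		bos[i].fence_pending = true;
	for (i = 0; i < n; i++)            /* 4.) unlock the BOs */
		bos[i].locked = false;
	return true;

unwind:
	while (i-- > 0)                    /* drop only the locks we took */
		bos[i].locked = false;
	return false;
}

/* Assumption of this model: eviction is refused while the BO is locked or
 * still has an unsignaled fence. A real memory manager might wait instead. */
static bool toy_try_evict(struct toy_bo *bo)
{
	if (bo->locked || bo->fence_pending)
		return false;
	bo->resident = false;
	return true;
}

/* Called when the (simulated) hardware finishes the task. */
static void toy_fence_signal(struct toy_bo *bo)
{
	bo->fence_pending = false;
}

/* Walks one BO through submit, a failed eviction, completion, eviction. */
int toy_demo(void)
{
	struct toy_bo bo = { false, false, false };

	assert(toy_submit(&bo, 1));
	assert(!bo.locked);           /* unlocked after submit ... */
	assert(!toy_try_evict(&bo));  /* ... but the fence still protects it */
	toy_fence_signal(&bo);
	assert(toy_try_evict(&bo));   /* evictable once the task is done */
	return 0;
}
```

If this model is right, no explicit pin is needed: the buffer is unlocked
as soon as the task is queued, and only the fence keeps it in place until
the hardware is done.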
Regards,
Qiang
>
> Using the MMU is certainly better in general, but it is usually only
> optional and a pain in the ass to get working. We have had that in amdgpu
> for quite a while now as well and still don't use it because of that.
>
> Regards,
> Christian.
>
>
>>
>> Regards,
>> Qiang
>>
>>> Christian.
>>>
>>>
>>>> - Use drm_sched for GPU task scheduling. Each OpenGL context should
>>>> have a lima context object in the kernel to distinguish tasks
>>>> from different users. drm_sched gets tasks from each lima context
>>>> in a fair way.
>>>>
>>>> Not implemented:
>>>> - Dump buffer support
>>>> - Power management
>>>> - Performance counter
>>>>
>>>> This patch series just packs a pair of .c/.h files in each patch.
>>>> For the whole history of this driver's development, see:
>>>> https://github.com/yuq/linux-lima/commits/lima-4.17-rc4
>>>>
>>>> The Mesa driver is still in development and not ready for daily usage,
>>>> but it can run some simple tests like kmscube and glmark2, see:
>>>> https://github.com/yuq/mesa-lima
>>>>
>>>> Andrei Paulau (1):
>>>> arm64/dts: add switch-delay for meson mali
>>>>
>>>> Lima Project Developers (10):
>>>> drm/lima: add mali 4xx GPU hardware regs
>>>> drm/lima: add lima core driver
>>>> drm/lima: add GPU device functions
>>>> drm/lima: add PMU related functions
>>>> drm/lima: add PP related functions
>>>> drm/lima: add MMU related functions
>>>> drm/lima: add GPU virtual memory space handing
>>>> drm/lima: add GEM related functions
>>>> drm/lima: add GEM Prime related functions
>>>> drm/lima: add makefile and kconfig
>>>>
>>>> Qiang Yu (12):
>>>> dt-bindings: add switch-delay property for mali-utgard
>>>> arm64/dts: add switch-delay for meson mali
>>>> Revert "drm: Nerf the preclose callback for modern drivers"
>>>> drm/lima: add lima uapi header
>>>> drm/lima: add L2 cache functions
>>>> drm/lima: add GP related functions
>>>> drm/lima: add BCAST related function
>>>> drm/lima: add DLBU related functions
>>>> drm/lima: add TTM subsystem functions
>>>> drm/lima: add buffer object functions
>>>> drm/lima: add GPU schedule using DRM_SCHED
>>>> drm/lima: add context related functions
>>>>
>>>> Simon Shields (1):
>>>> ARM: dts: add gpu node to exynos4
>>>>
>>>> .../bindings/gpu/arm,mali-utgard.txt | 4 +
>>>> arch/arm/boot/dts/exynos4.dtsi | 33 ++
>>>> arch/arm64/boot/dts/amlogic/meson-gxbb.dtsi | 1 +
>>>> .../boot/dts/amlogic/meson-gxl-mali.dtsi | 1 +
>>>> drivers/gpu/drm/Kconfig | 2 +
>>>> drivers/gpu/drm/Makefile | 1 +
>>>> drivers/gpu/drm/drm_file.c | 8 +-
>>>> drivers/gpu/drm/lima/Kconfig | 9 +
>>>> drivers/gpu/drm/lima/Makefile | 19 +
>>>> drivers/gpu/drm/lima/lima_bcast.c | 65 +++
>>>> drivers/gpu/drm/lima/lima_bcast.h | 34 ++
>>>> drivers/gpu/drm/lima/lima_ctx.c | 143 +++++
>>>> drivers/gpu/drm/lima/lima_ctx.h | 51 ++
>>>> drivers/gpu/drm/lima/lima_device.c | 407 ++++++++++++++
>>>> drivers/gpu/drm/lima/lima_device.h | 136 +++++
>>>> drivers/gpu/drm/lima/lima_dlbu.c | 75 +++
>>>> drivers/gpu/drm/lima/lima_dlbu.h | 37 ++
>>>> drivers/gpu/drm/lima/lima_drv.c | 466 ++++++++++++++++
>>>> drivers/gpu/drm/lima/lima_drv.h | 77 +++
>>>> drivers/gpu/drm/lima/lima_gem.c | 459 ++++++++++++++++
>>>> drivers/gpu/drm/lima/lima_gem.h | 41 ++
>>>> drivers/gpu/drm/lima/lima_gem_prime.c | 66 +++
>>>> drivers/gpu/drm/lima/lima_gem_prime.h | 31 ++
>>>> drivers/gpu/drm/lima/lima_gp.c | 293 +++++++++++
>>>> drivers/gpu/drm/lima/lima_gp.h | 34 ++
>>>> drivers/gpu/drm/lima/lima_l2_cache.c | 98 ++++
>>>> drivers/gpu/drm/lima/lima_l2_cache.h | 32 ++
>>>> drivers/gpu/drm/lima/lima_mmu.c | 154 ++++++
>>>> drivers/gpu/drm/lima/lima_mmu.h | 34 ++
>>>> drivers/gpu/drm/lima/lima_object.c | 120 +++++
>>>> drivers/gpu/drm/lima/lima_object.h | 87 +++
>>>> drivers/gpu/drm/lima/lima_pmu.c | 85 +++
>>>> drivers/gpu/drm/lima/lima_pmu.h | 30 ++
>>>> drivers/gpu/drm/lima/lima_pp.c | 418 +++++++++++++++
>>>> drivers/gpu/drm/lima/lima_pp.h | 37 ++
>>>> drivers/gpu/drm/lima/lima_regs.h | 304 +++++++++++
>>>> drivers/gpu/drm/lima/lima_sched.c | 497 ++++++++++++++++++
>>>> drivers/gpu/drm/lima/lima_sched.h | 126 +++++
>>>> drivers/gpu/drm/lima/lima_ttm.c | 409 ++++++++++++++
>>>> drivers/gpu/drm/lima/lima_ttm.h | 44 ++
>>>> drivers/gpu/drm/lima/lima_vm.c | 312 +++++++++++
>>>> drivers/gpu/drm/lima/lima_vm.h | 73 +++
>>>> include/drm/drm_drv.h | 23 +-
>>>> include/uapi/drm/lima_drm.h | 195 +++++++
>>>> 44 files changed, 5565 insertions(+), 6 deletions(-)
>>>> create mode 100644 drivers/gpu/drm/lima/Kconfig
>>>> create mode 100644 drivers/gpu/drm/lima/Makefile
>>>> create mode 100644 drivers/gpu/drm/lima/lima_bcast.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_bcast.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_ctx.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_ctx.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_device.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_device.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_dlbu.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_dlbu.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_drv.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_drv.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_gem.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_gem.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_gem_prime.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_gp.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_gp.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_l2_cache.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_mmu.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_mmu.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_object.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_object.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_pmu.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_pmu.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_pp.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_pp.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_regs.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_sched.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_sched.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_ttm.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_ttm.h
>>>> create mode 100644 drivers/gpu/drm/lima/lima_vm.c
>>>> create mode 100644 drivers/gpu/drm/lima/lima_vm.h
>>>> create mode 100644 include/uapi/drm/lima_drm.h
>>>>
>