[Mesa-dev] [Mesa-stable] [PATCH 3/3] amd: Apply elf relocations and allow code with relocations
Dieter Nützel
Dieter at nuetzel-hh.de
Thu Jun 13 19:20:29 UTC 2019
On 13.06.2019 07:10, Marek Olšák wrote:
> FYI, I just pushed the new linker.
>
> Marek
Thank you very much, Marek and _Nicolai_, for this GREAT stuff.
It brings back some speed after the recent 1/8 drop with glmark2.
Maybe my amd-staging-drm-next tree (5.2-rc1) didn't honor the kernel
mitigation parameter correctly.
@Jan
Go ahead with your nice relocation and image work.
Send me what you have in the works.
Latest Mesa git (with Nicolai's new linker) lets all 3 LuxMark versions
run.
Only 'Hotel lobby' (with v3.0 and v3.1) shows some corruption, but it does
NOT crash any longer. Numbers for 'Neumann TLM-102 SE' (medium) show ~43000K
(!!!).
https://www.phoronix.com/forums/forum/phoronix/latest-phoronix-articles/1106085-linux-kernel-set-to-expose-hidden-nvidia-hda-controllers-helping-laptop-users?p=1106199#post1106199
Blender crashes as expected ;-)
/home/dieter> trying to save userpref at
/home/dieter/.config/blender/2.79/config/userpref.blend ok
Read blend: /data/Blender/barbershop_interior_gpu.blend
scripts disabled for "/data/Blender/barbershop_interior_gpu.blend",
skipping 'generate_customprops.py'
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
skipping driver 'var', automatic scripts are disabled
Device init success
Compiling OpenCL program split
Kernel compilation of split finished in 8.41s.
Compiling OpenCL program base
Kernel compilation of base finished in 4.55s.
Compiling OpenCL program denoising
Kernel compilation of denoising finished in 2.08s.
blender: ../src/gallium/drivers/radeonsi/si_compute.c:319:
si_set_global_binding: Assertion `first + n <= MAX_GLOBAL_BUFFERS'
failed.
[1] Abbruch blender (core dumped)
Greetings,
Dieter
> On Mon, Jun 3, 2019 at 10:39 PM Jan Vesely <jan.vesely at rutgers.edu>
> wrote:
>
>> Fixes piglits:
>> call.cl [1]
>> calls-large-struct.cl [2]
>> calls-struct.cl [3]
>> calls-workitem-id.cl [4]
>> realign-stack.cl [5]
>> tail-calls.cl [6]
>>
>> Cc: mesa-stable at lists.freedesktop.org
>> Signed-off-by: Jan Vesely <jan.vesely at rutgers.edu>
>> ---
>> The piglit tests now pass using llvm-7, 8, and git.
>> ImageMagick works on my raven, but some tests still fail on
>> carrizo/iceland.
>> Other workloads (like shoc) that use function calls also work ok.
>> ocltoys works after removing the static keyword from .cl files.
>> src/amd/common/ac_binary.c | 30 +++++++++++++++++++++++
>> src/gallium/drivers/radeonsi/si_compute.c | 6 -----
>> 2 files changed, 30 insertions(+), 6 deletions(-)
>>
>> diff --git a/src/amd/common/ac_binary.c b/src/amd/common/ac_binary.c
>> index 18dc72c61f0..4d152fcf1be 100644
>> --- a/src/amd/common/ac_binary.c
>> +++ b/src/amd/common/ac_binary.c
>> @@ -178,6 +178,36 @@ bool ac_elf_read(const char *elf_data, unsigned elf_size,
>>
>> parse_relocs(elf, relocs, symbols, symbol_sh_link, binary);
>>
>> + // Apply relocations
>> + for (int i = 0; i < binary->reloc_count; ++i) {
>> + struct ac_shader_reloc *r = &binary->relocs[i];
>> + uint32_t *loc = (uint32_t*)(binary->code + r->offset);
>> + /* Section target relocations store symbol offsets as
>> + * values in reloc location. We're expected to adjust it for
>> + * start of the section. However, R_AMDGPU_REL32 are
>> + * PC relative relocations, so we need to recompute the
>> + * delta between reloc location and the target address.
>> + */
>> + if (r->target_type == 0x3) { // section relocation
>> + uint32_t target_offset = *loc; // already adjusted
>> + int64_t diff = target_offset - r->offset;
>> + if (r->type == 0xa) { // R_AMDGPU_REL32_LO
>> + // address of the 'lo' instruction is 4B below
>> + // the relocation point, but the target has
>> + // already been adjusted.
>> + *loc = (diff & 0xffffffff);
>> + } else if (r->type == 0xb) { // R_AMDGPU_REL32_HI
>> + // 'hi' relocation is 8B above 'lo' relocation
>> + *loc = ((diff - 8) >> 32);
>> + } else {
>> + success = false;
>> + fprintf(stderr, "Unsupported section relocation: type: %d, offset: %lx, value: %x\n",
>> + r->type, r->offset, *loc);
>> + }
>> + } else
>> + success = false;
>> + }
>> +
>> if (elf){
>> elf_end(elf);
>> }
>> diff --git a/src/gallium/drivers/radeonsi/si_compute.c b/src/gallium/drivers/radeonsi/si_compute.c
>> index b9cea00eeeb..88631369a62 100644
>> --- a/src/gallium/drivers/radeonsi/si_compute.c
>> +++ b/src/gallium/drivers/radeonsi/si_compute.c
>> @@ -246,12 +246,6 @@ static void *si_create_compute_state(
>> const amd_kernel_code_t *code_object =
>> si_compute_get_code_object(program, 0);
>> code_object_to_config(code_object, &program->shader.config);
>> - if (program->shader.binary.reloc_count != 0) {
>> - fprintf(stderr, "Error: %d unsupported relocations\n",
>> - program->shader.binary.reloc_count);
>> - FREE(program);
>> - return NULL;
>> - }
>> } else {
>>
>> si_shader_binary_read_config(&program->shader.binary,
>> &program->shader.config, 0);
>> --
>> 2.21.0
>>
>> _______________________________________________
>> mesa-stable mailing list
>> mesa-stable at lists.freedesktop.org
>> https://lists.freedesktop.org/mailman/listinfo/mesa-stable
>
>
> Links:
> ------
> [1] http://call.cl
> [2] http://calls-larget-struct.cl
> [3] http://calls-struct.cl
> [4] http://calls-workitem-id.cl
> [5] http://realign-stack.cl
> [6] http://tail-calls.cl
> _______________________________________________
> mesa-dev mailing list
> mesa-dev at lists.freedesktop.org
> https://lists.freedesktop.org/mailman/listinfo/mesa-dev
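
For readers following the relocation comment in the quoted patch: the
PC-relative idea can be sketched in isolation as below. This is only a
minimal illustration, not the code from the patch. It assumes the generic
ELF rule for R_AMDGPU_REL32_LO/HI (low and high 32 bits of target minus
relocation address) and leaves out the patch's own section-offset
adjustments. The helper name rel32_value and its parameters are
hypothetical; the type numbers 0xa and 0xb are the ones used in the patch.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Relocation type numbers, matching the 0xa / 0xb checks in the quoted patch. */
enum {
	R_AMDGPU_REL32_LO = 0xa,
	R_AMDGPU_REL32_HI = 0xb
};

/* Hypothetical helper (not part of Mesa): compute the 32-bit word that a
 * PC-relative relocation would store.
 *
 *   target - address of the symbol plus addend
 *   place  - address of the word being relocated
 *
 * Returns 0 on success, -1 for a relocation type this sketch does not handle.
 */
static int rel32_value(unsigned type, uint64_t target, uint64_t place,
                       uint32_t *out)
{
	assert(out != NULL);

	/* Unsigned subtraction wraps, so a backward reference yields the
	 * two's-complement encoding of the negative distance. */
	uint64_t diff = target - place;

	switch (type) {
	case R_AMDGPU_REL32_LO:
		*out = (uint32_t)(diff & 0xffffffffu); /* low 32 bits */
		return 0;
	case R_AMDGPU_REL32_HI:
		*out = (uint32_t)(diff >> 32);         /* high 32 bits */
		return 0;
	default:
		return -1;                             /* unsupported type */
	}
}
```

A caller would invoke rel32_value() once per relocation entry and write
the result to the relocated word, treating a -1 return the same way the
patch treats an unsupported type (report it and fail the load).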