[PATCH v4 1/2] intel: 48b ppgtt support (EXEC_OBJECT_SUPPORTS_48B_ADDRESS flag)

Michel Thierry michel.thierry at intel.com
Tue Oct 13 07:55:41 PDT 2015


On 10/13/2015 3:13 PM, Emil Velikov wrote:
> On 13 October 2015 at 13:16, Michel Thierry <michel.thierry at intel.com> wrote:
>> On 10/6/2015 2:12 PM, Michel Thierry wrote:
>>>
>>> On 10/5/2015 7:06 PM, Kristian Høgsberg wrote:
>>>>
>>>> On Mon, Oct 5, 2015 at 7:03 AM, Michel Thierry
>>>> <michel.thierry at intel.com> wrote:
>>>>>
>>>>> On 9/14/2015 2:54 PM, Michał Winiarski wrote:
>>>>>>
>>>>>>
>>>>>> On Thu, Sep 03, 2015 at 03:23:58PM +0100, Michel Thierry wrote:
>>>>>>>
>>>>>>>
>>>>>>> Gen8+ supports 48-bit virtual addresses, but some objects must
>>>>>>> always be
>>>>>>> allocated inside the 32-bit address range.
>>>>>>>
>>>>>>> In specific, any resource used with flat/heapless
>>>>>>> (0x00000000-0xfffff000)
>>>>>>> General State Heap (GSH) or Instruction State Heap (ISH) must be in a
>>>>>>> 32-bit range, because the General State Offset and Instruction State
>>>>>>> Offset
>>>>>>> are limited to 32-bits.
>>>>>>>
>>>>>>> The i915 driver has been modified to provide a flag to set when the
>>>>>>> 4GB
>>>>>>> limit is not necessary in a given bo
>>>>>>> (EXEC_OBJECT_SUPPORTS_48B_ADDRESS).
>>>>>>> 48-bit range will only be used when explicitly requested.
>>>>>>>
>>>>>>> Callers to the existing drm_intel_bo_emit_reloc function should set
>>>>>>> the
>>>>>>> use_48b_address_range flag beforehand, in order to use full ppgtt
>>>>>>> range.
>>>>>>>
>>>>>>> v2: Make set/clear functions nops on pre-gen8 platforms, and use them
>>>>>>>        internally in emit_reloc functions (Ben)
>>>>>>>        s/48BADDRESS/48B_ADDRESS/ (Dave)
>>>>>>> v3: Keep set/clear functions internal, no-one needs to use them
>>>>>>> directly.
>>>>>>> v4: Don't set 48bit-support flag in emit reloc, check for ppgtt type
>>>>>>>        before enabling set/clear function, print full offsets in debug
>>>>>>>        statements, using port of lower_32_bits and upper_32_bits
>>>>>>> from linux
>>>>>>>        kernel (Michał)
>>>>>>>
>>>>>>> References:
>>>>>>> http://lists.freedesktop.org/archives/intel-gfx/2015-July/072612.html
>>>>>>> Cc: Ben Widawsky <ben at bwidawsk.net>
>>>>>>> Cc: Michał Winiarski <michal.winiarski at intel.com>
>>>>>>
>>>>>>
>>>>>>
>>>>>> +Kristian
>>>>>>
>>>>>> LGTM.
>>>>>> Acked-by: Michał Winiarski <michal.winiarski at intel.com>
>>>>>>
>>>>>>> Signed-off-by: Michel Thierry <michel.thierry at intel.com>
>>>>>
>>>>>
>>>>>
>>>>>
>>>>> Hi Kristian,
>>>>>
>>>>> More comments on this?
>>>>> I've resent the mesa patch with the last changes
>>>>>
>>>>> (http://lists.freedesktop.org/archives/dri-devel/2015-October/091752.html).
>>>>>
>>>>>
>>>>> Michał has agreed with the interface and will use it for his work.
>>>>> If mesa doesn't plan to use this for now, it's ok. The kernel changes
>>>>> have
>>>>> been done to prevent any regressions when the support 48-bit flag is not
>>>>> set.
>>>>
>>>>
>>>> I didn't get any replies to my last comments on this:
>>>>
>>>> http://lists.freedesktop.org/archives/mesa-dev/2015-August/091398.html
>>>>
>>>> Basically, I tried explicitly placing buffers above 8G and didn't see
>>>> the HW problem described (ie it all worked fine).  I still think
>>>> there's some confusion as to what the W/A is about.
>>>>
>>>> Here's my take: the W/A is a SW-only workaround to handle the cases
>>>> where a driver uses heapless and 48-bit ppgtt. The problem is that the
>>>> heaps are limited to 4G but with 48bit ppgtt a buffer can be placed
>>>> anywhere it the 48 bit address space. If that happens it's consideredd
>>>> out-of-bounds for the heap and access fails. To prevent this we need
>>>> to make sure the bos we address in a heapless fashion are placed below
>>>> 4g.
>>>>
>>>> For mesa, we only configure general state base as heap-less, which
>>>> affects scratch bos. What this boils down to is that we need the 4G
>>>> restriction only for the scratch bos set up on 3DSTATE_VS, 3DSTATE_GS
>>>> and 3DSTATE_PS (and tesselation stage eventually). Look for the
>>>> OUT_RELOC64 for stage->scratch_bo in gen8_vs_state.c, gen8_gs_state.c
>>>> and gen8_ps_state.c
>>>
>>>
>>> I think it also affects _3DSTATE_VIEWPORT_STATE_POINTERS_CC, maybe it
>>> isn't exclusive to the scratch bos (the error state points to that
>>> instruction).
>>>
>>>
>>
>> Hi,
>>
>> Anymore inputs about this patch?
>> AFAIK, if changes are required based on further comments from Kristian,
>> these will be in the mesa patch[1], not libdrm. Also, Michał will use this
>> flag in another project.
>>
> The comment seems quite brief and I'm not sure it fully addresses
> Kristian's concern. I'd suspect that providing reference to the HW
> documentation (confirming your assumption) might be beneficial.
>

Sure, attached is the hang I get if I don't set the restriction in 
gen8_misc_state.c and try to use the full 48-bit range 
(i915_error_state_nowa.txt). This is just running gfxbench t-rex.

I see the same hang signature when it is only applied to the scratch bos 
(in gen8_vs_state.c, gen8_gs_state.c and gen8_ps_state.c - 
i915_error_state_scratchbo.txt).

3DSTATE_VIEWPORT_STATE_POINTERS_CC (0x78230000) is defined here:
https://01.org/sites/default/files/documentation/intel-gfx-prm-osrc-bdw-vol02a-commandreference-instructions_0.pdf 
(page 256)

Thanks,

-Michel
-------------- next part --------------
A non-text attachment was scrubbed...
Name: i915_error_states.7z
Type: application/octet-stream
Size: 7118 bytes
Desc: not available
URL: <http://lists.freedesktop.org/archives/dri-devel/attachments/20151013/0f833f62/attachment.obj>


More information about the dri-devel mailing list