[Mesa-dev] [PATCH 08/10] i965: Introduce an OUT_RELOC64 macro.

Kenneth Graunke kenneth at whitecape.org
Mon Jan 13 13:42:51 PST 2014


On 01/13/2014 01:04 PM, Eric Anholt wrote:
> Kenneth Graunke <kenneth at whitecape.org> writes:
> 
>> On 01/09/2014 09:31 PM, Eric Anholt wrote:
>>> Kenneth Graunke <kenneth at whitecape.org> writes:
>>>
>>>> On 12/13/2013 09:28 AM, Daniel Vetter wrote:
>>>>> On Thu, Dec 12, 2013 at 01:26:40AM -0800, Kenneth Graunke wrote:
>>>>>> Broadwell uses 48-bit addresses.  The first DWord is the low 32 bits,
>>>>>> and the second DWord is the high 16 bits.
>>>>>>
>>>>>> Since individual buffers shouldn't be larger than 4GB in size, any
>>>>>> offsets into those buffers (buffer->offset + delta) should fit in the
>>>>>> low 32 bits.  So I believe we can simply emit 0 for the high 16-bits,
>>>>>> and drm_intel_bo_emit_reloc() should patch it up.
>>>>>>
>>>>>> Signed-off-by: Kenneth Graunke <kenneth at whitecape.org>
>>>>>> ---
>>>>>>  src/mesa/drivers/dri/i965/intel_batchbuffer.h | 5 +++++
>>>>>>  1 file changed, 5 insertions(+)
>>>>>>
>>>>>> diff --git a/src/mesa/drivers/dri/i965/intel_batchbuffer.h b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
>>>>>> index 159f928..128eed9 100644
>>>>>> --- a/src/mesa/drivers/dri/i965/intel_batchbuffer.h
>>>>>> +++ b/src/mesa/drivers/dri/i965/intel_batchbuffer.h
>>>>>> @@ -178,6 +178,11 @@ void intel_batchbuffer_cached_advance(struct brw_context *brw);
>>>>>>  				read_domains, write_domain, delta);	\
>>>>>>  } while (0)
>>>>>>  
>>>>>> +/* Handle 48-bit address relocations for Gen8+ */
>>>>>> +#define OUT_RELOC64(buf, read_domains, write_domain, delta) \
>>>>>> +   OUT_RELOC(buf, read_domains, write_domain, delta);       \
>>>>>> +   OUT_BATCH(0);
>>>>>
>>>>> Please not. The presumed_offset that libdrm uses is 64bits, and you need
>>>>> to emit the full presumed address (and correctly shifted). Atm the kernel
>>>>> never gives you a presumed reloc offset with the high bits set so it
>>>>> doesn't matter. But I'd prefer if we don't need to make this opt-in
>>>>> behaviour once we enable address spaces with more than 4G.
>>>>>
>>>>> i-g-t gets away with the cheap hack since we're allowed to break igt.
>>>>> Let me check ddx and libva whether I've lost this fight already ...
>>>>> -Daniel
>>>>
>>>> I'm more than happy to do the right thing, I just don't know what that
>>>> is.  I don't see any uint64_t values in the interface we use at all:
>>>>
>>>> OUT_RELOC becomes
>>>>    ret = drm_intel_bo_emit_reloc(brw->batch.bo, 4*brw->batch.used,
>>>>                                  buffer, delta,
>>>>                                  read_domains, write_domain);
>>>
>>> The libdrm ABI is a disaster.  bo->offset is a long, so we're keeping 32
>>> bits of the kernel's returned value on 32 bit userspace, and 64 bits on
>>> 64 bit userspace.  This means that on 32-bit we'll write in an
>>> expected-incorrect offset in the presumed offset for a >4g-located BO,
>>> which the kernel will map and fix up at exec time.  On 64-bit, your
>>> patch would write an expected-incorrect 32-bit value into the batch, but
>>> libdrm would tell the kernel the full expected 64 bit value in the
>>> presumed_offset field, and you'll get brokenness for >4g buffers.
>>>
>>> So, I think you do need a drm_intel_bo_emit_reloc64 that returns a
>>> uint64_t value that the kernel wrote into the presumed offset, which you
>>> then plug into your batchbuffer.
>>>
>>> (In other news, while thinking about this, there are some obscure races
>>> with buffer migration due to presumed_offset being read at a separate
>>> time from when we look up bo->offset to actually write the offset into
>>> the batch, in the presence of context sharing in GL).
>>
>> I'd really like to land this patch as-is, since I need it to land the
>> rest of my Broadwell code.  I would update the commit message to note
>> that it's broken for >4G currently.
> 
> I don't like landing known-broken code that will give you mysterious
> hangs under memory pressure.  I could possibly ack this if there was a
> WARN_ON_ONCE or just having it be a stub or something, but "kind of
> works except when you start running a big app or run something for a
> long time" is not cool.

Well, hooray for double standards, given that every other userspace
component has landed this code, but didn't bother to even consolidate it
into one easily fixable place...

It's been over a year since I wrote most of this code, and I would
REALLY like to actually land some things.

But fine, I'll go write some libdrm patches...

--Ken


More information about the mesa-dev mailing list