[Mesa-dev] [PATCH 31/33] intel: decoder: decouple decoding from memory pointers

Lionel Landwerlin lionel.g.landwerlin at intel.com
Wed Nov 1 15:49:34 UTC 2017


On 01/11/17 15:09, Scott D Phillips wrote:
> Lionel Landwerlin <lionel.g.landwerlin at intel.com> writes:
>
>> On 31/10/17 23:04, Scott D Phillips wrote:
>>> Lionel Landwerlin <lionel.g.landwerlin at intel.com> writes:
>>>
>>>> On 31/10/17 20:54, Scott D Phillips wrote:
>>>>> Lionel Landwerlin <lionel.g.landwerlin at intel.com> writes:
>>>>>
>>>>>> We want to introduce a reader interface for accessing memory, so that
>>>>>> later on we can use different ways of storing the content of the GTT
>>>>>> address space that don't involve a pointer to a linear buffer.
>>>>> I'm kinda sceptical that this is the best way to achieve what you want
>>>>> here. It strikes me as code that we'll look at in a year and wonder
>>>>> what's going on.
>>>>>
>>>>> If I'm understanding, it seems like the essence of what you're going for
>>>>> here is in the one place where you're using the sub_struct_reader. Maybe
>>>>> instead of plumbing the reader object through everywhere, you can add a
>>>>> callback just in gen_print_group for fixing up offsets to pointers, and
>>>>> then leave everywhere else assuming contiguous memory blocks as today.
>>>> First, thanks for your time reviewing this!
>>>>
>>>> I should have stated that in patch 33 I introduce a sparse memory object
>>>> that isn't contiguous.
>>>> It's based on the data structure described here:
>>>> https://en.wikipedia.org/wiki/Hash_array_mapped_trie
>>>>
>>>> The idea is to split the memory into 4KiB chunks but still make it
>>>> look like a 64-bit address space.
>>>> The trie structure allows pages to be reused at different points in
>>>> time without keeping an actual copy of the whole address space.
>>> What I meant was that most dword reads will really be adjacent in a
>>> piece of memory, and leaving the simple pointer math there is
>>> clearer. You will only need to call back for indirection when you're
>>> chasing an offset or an address.
>>>
>>>> For example, a couple of pages might have been written by relocations
>>>> associated with the first batch buffer, then 10 batches later you
>>>> overwrite them.
>>>> The amount of memory we need to allocate for storing 2 snapshots is just
>>>> the modified pages (+ ~12 nodes in the trie, but those are less than
>>>> 300 bytes).
>>>> That allows the UI to decode 2 batches at the same time as well as all
>>>> the associated memory with a small cost.
>>> Really there's no need to manage any memory for the buffers themselves,
>>> they're immutably stored in the aub file. If you mmap the entire file
>>> then you would just need to have a map of gfx addrs to file addrs that
>>> would help direct your decoding.
>>>
>> Thanks, I'll try that.
> Thinking more about it, I remember that intel_aubdump will break up
> buffers into 32KiB chunks. So that would cause problems for this idea for
> buffers bigger than 32KiB. We could try just not doing that splitting in
> aubdump and see if it has any other adverse effects.
>
I gave your approach a try and it seems to work, but I'm still dealing
with bugs everywhere :(
Still, I like the idea of the trie :)
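
For reference, the "map of gfx addrs to file addrs" idea discussed above
could be sketched roughly as below. This is only an illustration under
stated assumptions, not code from the series: struct addr_range,
sort_ranges() and gfx_to_ptr() are hypothetical names, the ranges are
assumed not to overlap, and file_map is assumed to be the start of the
mmap'ed aub file. Because a buffer may be stored as several separate
chunks, the lookup also reports how many contiguous bytes are readable
from the returned pointer.

```c
#include <stddef.h>
#include <stdint.h>
#include <stdlib.h>

/* Hypothetical type: one contiguous piece of GTT memory whose contents
 * live at a known offset inside the mmap'ed file. */
struct addr_range {
   uint64_t gfx_addr;    /* start address in the GTT address space */
   uint64_t size;        /* length of the range in bytes */
   uint64_t file_offset; /* offset of the data within the file mapping */
};

static int
cmp_range(const void *a, const void *b)
{
   const struct addr_range *ra = a, *rb = b;
   return (ra->gfx_addr > rb->gfx_addr) - (ra->gfx_addr < rb->gfx_addr);
}

/* Sort once after parsing the file so lookups can binary search. */
static void
sort_ranges(struct addr_range *ranges, size_t count)
{
   qsort(ranges, count, sizeof(ranges[0]), cmp_range);
}

/* Translate a GTT address into a pointer inside the file mapping.
 * On success, *avail is the number of contiguous bytes readable from the
 * returned pointer. Returns NULL if no range covers gfx_addr. */
static const uint8_t *
gfx_to_ptr(const uint8_t *file_map,
           const struct addr_range *ranges, size_t count,
           uint64_t gfx_addr, uint64_t *avail)
{
   size_t lo = 0, hi = count;

   while (lo < hi) {
      size_t mid = lo + (hi - lo) / 2;
      const struct addr_range *r = &ranges[mid];

      if (gfx_addr < r->gfx_addr) {
         hi = mid;
      } else if (gfx_addr - r->gfx_addr >= r->size) {
         lo = mid + 1;
      } else {
         uint64_t delta = gfx_addr - r->gfx_addr;
         *avail = r->size - delta;
         return file_map + r->file_offset + delta;
      }
   }

   return NULL;
}
```

A decoder built this way would keep plain pointer math within a range
and only call gfx_to_ptr() when chasing an address or when *avail runs
out, which is where the chunk splitting mentioned above would show up.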
