GEM-related desktop sluggishness due to linear-time arch_get_unmapped_area_topdown()

Jerome Glisse j.glisse at gmail.com
Tue Mar 29 12:45:34 PDT 2011


On Tue, Mar 29, 2011 at 2:01 PM, Lucas Stach <dev at lynxeye.de> wrote:
> On Tuesday, 29.03.2011, at 11:23 -0400, Jerome Glisse wrote:
>> 2011/3/29 r6144 <rainy6144 at gmail.com>:
>> > On Tue, 2011-03-29 at 10:22 -0400, Jerome Glisse wrote:
>> >
>> >> The killer solution would be to have no mapping and a decent
>> >> upload/download ioctl that can take user pages.
>> >
>> > Doesn't this sound like GEM's read/write interface implemented by e.g.
>> > the i915 driver?  But if I understand correctly, a mmap-like interface
>> > should still be necessary if we want to implement e.g. glMapBuffer()
>> > without extra copying.
>> >
>> > r6144
>> >
>> >
>> glMapBuffer should not be used; it's really not a good way to do things.
>> Anyway, the extra copy might be unavoidable, given that sometimes the
>> front/back might either be in unmappable VRAM or have a memory
>> layout that is not the one specified at buffer creation (this is very
>> common when using tiling, for instance). So even considering MapBuffer
>> or a similar function, I believe it's a lot better not to allow buffer
>> mapping in userspace but to provide upload/download hooks that can use
>> user pages to avoid extra copies as much as possible.
>>
>> Cheers,
>> Jerome
>>
>
> Wouldn't this give us a performance penalty for short-lived resources
> like VBOs which are located in GART memory? Mmap allows us to write
> directly to this DRM-controlled portion of sysram. With a copy-based
> implementation we would have to allocate the buffer in sysram just to
> copy it over to another portion of sysram, which seems a little insane to
> me, but I'm not an expert here.
>
> -- Lucas

Short-lived and small bos definitely wouldn't work well with this kind
of API; it would all be a function of the ioctl cost. But I am not
sure the drawback would be that big. Intel tested with pread/pwrite
and gave up; I don't remember why. For the vbo case you describe, the
scheme I was thinking of would be: allocate a bo and, on the buffer data
call, upload to the allocated bo using the bind-user-page feature, which
would mean zero extra copy operations. For the fire-and-forget case of
vbos, some kind of transient buffer would likely be more appropriate.
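
To make the vbo scheme a bit more concrete, here is a rough userspace
sketch of what the buffer data path could look like. Nothing here is a
real DRM or libdrm interface: struct fake_bo, struct fake_bo_upload,
fake_bo_upload() and fake_buffer_data() are all hypothetical names made
up for illustration, and the memcpy() only stands in for the transfer
that a real driver would presumably do by binding the user pages and
letting the GPU read them directly.

```c
/* Hypothetical sketch: none of these names exist in the kernel or libdrm. */
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>

/* Stand-in for a buffer object; a real bo would live in VRAM or GART. */
struct fake_bo {
    uint32_t handle;
    size_t size;
    unsigned char *storage;   /* simulated backing store */
};

/* Hypothetical argument block for an upload call that takes user pages. */
struct fake_bo_upload {
    uint32_t handle;          /* destination bo */
    uint64_t offset;          /* byte offset inside the bo */
    uint64_t size;            /* number of bytes to upload */
    const void *user_ptr;     /* caller's pages, never mapped into the bo */
};

static struct fake_bo *fake_bo_create(uint32_t handle, size_t size)
{
    struct fake_bo *bo = malloc(sizeof(*bo));
    if (bo == NULL)
        return NULL;
    bo->handle = handle;
    bo->size = size;
    bo->storage = calloc(1, size ? size : 1);
    if (bo->storage == NULL) {
        free(bo);
        return NULL;
    }
    return bo;
}

static void fake_bo_destroy(struct fake_bo *bo)
{
    if (bo != NULL) {
        free(bo->storage);
        free(bo);
    }
}

/* Simulated upload: validates the range, then transfers the data.
 * The memcpy() is a placeholder for a transfer out of bound user pages. */
static int fake_bo_upload(struct fake_bo *bo, const struct fake_bo_upload *args)
{
    assert(bo != NULL && args != NULL);
    if (args->handle != bo->handle)
        return -1;
    if (args->offset > bo->size || args->size > bo->size - args->offset)
        return -1;
    memcpy(bo->storage + args->offset, args->user_ptr, (size_t)args->size);
    return 0;
}

/* Buffer data path: allocate a bo, then upload the caller's data into it. */
static struct fake_bo *fake_buffer_data(uint32_t handle, const void *data,
                                        size_t size)
{
    struct fake_bo *bo = fake_bo_create(handle, size);
    if (bo == NULL)
        return NULL;
    struct fake_bo_upload args = {
        .handle = handle, .offset = 0, .size = size, .user_ptr = data,
    };
    if (fake_bo_upload(bo, &args) != 0) {
        fake_bo_destroy(bo);
        return NULL;
    }
    return bo;
}
```

The point of the sketch is only the shape of the call: userspace never
gets a pointer into the bo, so the placement and layout of the bo stay
entirely under the driver's control.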

Cheers,
Jerome

