[Intel-gfx] [RFC] drm/i915 : Reduce the shmem page allocation time by using blitter engines for clearing pages.

Wed May 7 12:02:00 CEST 2014

On Tue, 2014-05-06 at 13:12 +0000, Chris Wilson wrote:
> On Tue, May 06, 2014 at 12:59:37PM +0000, Gupta, Sourab wrote:
> > On Tue, 2014-05-06 at 11:34 +0000, Chris Wilson wrote:
> > > On Tue, May 06, 2014 at 04:40:58PM +0530, sourab.gupta at intel.com wrote:
> > > > From: Sourab Gupta <sourab.gupta at intel.com>
> > > > 
> > > > This patch continues, and depends on, the earlier patch series
> > > > to 'reduce the time for which device mutex is kept locked'.
> > > > (http://lists.freedesktop.org/archives/intel-gfx/2014-May/044596.html)
> > > > 
> > > > This patch aims to reduce the allocation time of pages from shmem
> > > > by using blitter engines to clear the freshly allocated pages.
> > > > This is used only for fresh pages allocated in the shmem_preallocate
> > > > routines in the execbuffer path and the page_fault path.
> > > > 
> > > > Even though the CPU memset routine is optimized, the time consumed
> > > > in clearing the pages of a large buffer is sometimes on the order
> > > > of milliseconds.
> > > > We intend to make this operation asynchronous by using the blitter
> > > > engine, so that, irrespective of the size of the buffer to be
> > > > cleared, the execbuffer ioctl processing time is not affected. Using
> > > > the blitter engine will reduce the overall execution time of the
> > > > 'execbuffer' ioctl for large buffers.
> > > > 
> > > > There may be power implications in using the blitter engines, and
> > > > we have to evaluate this. As a next step, we can selectively enable
> > > > this HW-based memset only for large buffers, where the overhead of
> > > > adding commands to a blitter ring (which would otherwise be idle)
> > > > and of cross-ring synchronization will be negligible compared to
> > > > using the CPU to clear the buffer.
> > > 
> > > You leave a lot of holes by which you leak the uncleared pages to
> > > userspace.
> > > -Chris
> > > 
> > Hi Chris,
> > 
> > Are you ok with the overall design as such, and the
> > shmem_read_mapping_page_gfp_noclear interface?
> > Is the leak of uncleared pages happening due to implementation issues?
> > If so, we'll try to mitigate them.
> 
> Actually, along similar lines there is an even more fundamental issue.
> You should only clear the objects if the pages have not been
> prepopulated.
> -Chris
> 
Hi Chris,
This patch continues the shmem preallocate patch sent by
Akash earlier.
(http://lists.freedesktop.org/archives/intel-gfx/2014-May/044597.html)
We employ this method only in the preallocate routine, which is
called on the first page fault of the object, resulting in a fresh
allocation of pages.
This is controlled by a flag, 'require_clear', which is set in the
preallocate routine (which comes into the picture only in case of a
fresh allocation). If pages are already populated for the object, this
does not come into the picture.
Also, we'll try to fix the leak of uncleared pages due to any
implementation issues.
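
To illustrate the gating described above, here is a rough standalone
sketch. It is not the code from the patch: the struct and function names
(sketch_obj, sketch_preallocate, sketch_needs_blit_clear) are made up
for illustration, and only the 'require_clear' flag name comes from the
patch. It assumes the flag is consumed once the clear has been queued.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a GEM object; not the real i915 structure. */
struct sketch_obj {
	size_t npages;
	bool pages_populated;	/* backing pages already exist */
	bool require_clear;	/* set only when fresh pages were allocated */
};

/* Models the preallocate step: only a fresh allocation sets the flag. */
static void sketch_preallocate(struct sketch_obj *obj)
{
	if (obj->pages_populated)
		return;		/* prepopulated: nothing to allocate or clear */

	obj->pages_populated = true;	/* pretend uncleared pages were allocated */
	obj->require_clear = true;
}

/*
 * Returns true if a blitter clear has to be queued before the object is
 * exposed to userspace. The flag is consumed (assumption), so the clear
 * is requested at most once per fresh allocation.
 */
static bool sketch_needs_blit_clear(struct sketch_obj *obj)
{
	bool clear = obj->require_clear;

	/* The flag must never be set on an object without pages. */
	assert(!clear || obj->pages_populated);

	obj->require_clear = false;
	return clear;
}
```

With this shape, an object whose pages were populated by some other
path never has the flag set, so it is never cleared by the blitter.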

Regards,
Sourab