[PATCH 05/13] drm/ttm: overhaul memory accounting

Jerome Glisse j.glisse at gmail.com
Fri Nov 11 07:47:42 PST 2011


On Fri, Nov 11, 2011 at 08:49:39AM +0100, Thomas Hellstrom wrote:
> On 11/11/2011 12:33 AM, Jerome Glisse wrote:
> >On Thu, Nov 10, 2011 at 09:05:22PM +0100, Thomas Hellstrom wrote:
> >>On 11/10/2011 07:05 PM, Jerome Glisse wrote:
> >>>On Thu, Nov 10, 2011 at 11:27:33AM +0100, Thomas Hellstrom wrote:
> >>>>On 11/09/2011 09:22 PM, j.glisse at gmail.com wrote:
> >>>>>From: Jerome Glisse<jglisse at redhat.com>
> >>>>>
> >>>>>This is an overhaul of the ttm memory accounting. This tries to keep
> >>>>>the same global behavior while removing the whole zone concept. It
> >>>>>keeps a distinction for dma32 so that we make sure that ttm doesn't
> >>>>>starve the dma32 zone.
> >>>>>
> >>>>>There are 4 thresholds for memory allocation :
> >>>>>- max_mem is the maximum memory the whole ttm infrastructure is
> >>>>>   going to allow allocations for (with the exception of system
> >>>>>   processes, see below)
> >>>>>- emer_mem is the maximum memory allowed for system processes, this
> >>>>>   limit is higher than max_mem
> >>>>>- swap_limit is the threshold at which ttm will start trying to
> >>>>>   swap objects out because ttm is getting close to the max_mem
> >>>>>   limit
> >>>>>- swap_dma32_limit is the threshold at which ttm will start
> >>>>>   swapping objects out to try to reduce the pressure on the dma32
> >>>>>   zone. Note that we don't specifically target which objects to
> >>>>>   swap, so it might very well free more memory from highmem rather
> >>>>>   than from dma32
> >>>>>
> >>>>>Accounting is done through used_mem & used_dma32_mem, whose sum gives
> >>>>>the total amount of memory actually accounted for by ttm.
> >>>>>
> >>>>>The idea is that an allocation will fail if
> >>>>>(used_mem + used_dma32_mem) > max_mem and if swapping fails to make
> >>>>>enough room.
> >>>>>
> >>>>>The used_dma32_mem can be updated at a later stage, allowing the
> >>>>>accounting test to be performed before allocating a whole batch of
> >>>>>pages.
> >>>>>
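
To make the quoted description concrete, here is a rough sketch of the
accounting check it describes. The struct and function names are only
illustrative, not the actual patch code:

#include <stdint.h>
#include <stdbool.h>
#include <errno.h>

/* Illustrative accounting state; the fields follow the description. */
struct ttm_mem_accounting {
	uint64_t max_mem;          /* hard limit for normal allocations */
	uint64_t emer_mem;         /* higher limit for system processes */
	uint64_t swap_limit;       /* start swapping when usage crosses this */
	uint64_t swap_dma32_limit; /* start swapping to relieve dma32 */
	uint64_t used_mem;
	uint64_t used_dma32_mem;
};

/* Returns 0 if the allocation fits, -ENOMEM if the caller should try to
 * swap objects out and retry (and finally fail if that frees nothing). */
static int ttm_mem_try_account(struct ttm_mem_accounting *a,
			       uint64_t size, bool system_process)
{
	uint64_t limit = system_process ? a->emer_mem : a->max_mem;

	if (a->used_mem + a->used_dma32_mem + size > limit)
		return -ENOMEM;
	a->used_mem += size;
	return 0;
}

/* Swapping is kicked independently once used_mem + used_dma32_mem goes
 * above swap_limit, or used_dma32_mem goes above swap_dma32_limit. */
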
> >>>>Jerome, you're removing a fair amount of functionality here, without
> >>>>justifying why it could be removed.
> >>>All this code was overkill.
> >>[1] I don't agree, and since it's well tested, thought through and
> >>working, I see no obvious reason to alter it within the context of
> >>this patch series, unless it's absolutely required for the
> >>functionality.
> >Well, one thing I can tell is that it doesn't work on radeon. I pushed
> >a test to libdrm, and here it's the OOM killer that starts doing its
> >beating. Anyway, I won't alter it. I was just trying to make it work,
> >i.e. be useful while also being simpler.
> 
> Well if it doesn't work it should of course be fixed.
> 
> I'm not against fixing it nor making it simpler, but I think that
> requires a detailed understanding of what's going wrong and how it
> needs to be fixed. Not as part of a patch series that really tries
> to accomplish something else.
> 
> The current code was tested extensively with psb and unichrome.
> One good test for drivers with bo-backed textures is to continuously
> create fairly large texture images. The end result should be the
> swap space starting to fill up, and once there is no more swap space,
> the OOM killer should kill your app, and kmalloc failures should be
> avoided. It should be tricky to get a failure from the global alloc
> system, but a huge number of small buffer objects or fence objects
> should probably do it.
> 
> Naturally, that requires that all persistent drm objects created
> from user-space are registered with their correct sizes, or at least
> a really good size approximation. That includes things like gem
> flinks, which could otherwise easily be exploited to bring a system
> down, simply by guessing a gem name and creating flinks to that name
> in an infinite loop.
> 
> What are the symptoms of the failure you're seeing with Radeon? Any
> suggestions on why it happens?
> 
> Thanks,
> Thomas

I pushed my test case to libdrm yesterday. I basically allocate ttm
objects of 1 page each in a loop and expect the allocation to eventually
fail. I modified the kernel to account 2 pages for the ttm_buffer_object
struct size so that the kernel area should be exhausted long before I run
out of memory on an 8G config. What happens is that the OOM killer starts
killing everything except my app; even the kernel logger daemon got
killed before my app ...
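
For reference, the test does roughly the following (just a sketch of the
idea, not a copy of the test I pushed to libdrm). The outcome I expect is
a clean create failure once the accounting limit is hit, instead of the
OOM killer going on a rampage:

#include <stdio.h>
#include <xf86drm.h>
#include <radeon_drm.h>

int main(void)
{
	int fd = drmOpen("radeon", NULL);
	unsigned long i;

	if (fd < 0)
		return 1;

	for (i = 0; ; i++) {
		struct drm_radeon_gem_create req = {
			.size = 4096,		/* 1 page */
			.alignment = 4096,
			.initial_domain = RADEON_GEM_DOMAIN_GTT,
		};

		/* handles are deliberately never closed so the ttm
		 * accounting keeps growing until it refuses */
		if (drmIoctl(fd, DRM_IOCTL_RADEON_GEM_CREATE, &req)) {
			printf("create failed after %lu objects\n", i);
			break;
		}
	}
	return 0;
}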

I think the ttm_memory accounting for kernel objects is not the right
approach. What we want here is to avoid starving kernel memory. The thing
is, we are unique in the kernel AFAIK (I took a look at a bunch of other
drivers in net, audio, ... to check) in that we allocate a lot of kernel
objects on behalf of userspace.

I believe a better solution would be either to put a limit on the number
of objects, for instance 16 million objects (whether fence or bo structs),
or to start talking with kernel folks about a dev_alloc function that
could be used to allocate memory and account that memory against a
device. The kernel would then be able to decide on a policy for this.
There is already a netdev_alloc_page but it's not used in any meaningful
way AFAICT.
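
Something along these lines is what I have in mind for dev_alloc. This is
purely hypothetical, no such API exists today and every name below is
made up; the point is only that the allocation gets charged to a
per-device budget the core kernel sets the policy for:

#include <linux/slab.h>
#include <linux/atomic.h>

/* Hypothetical per-device budget, policy decided by the core kernel. */
struct dev_mem_account {
	atomic_long_t	used;	/* bytes currently charged to the device */
	long		limit;
};

static void *dev_alloc(struct dev_mem_account *acct, size_t size, gfp_t gfp)
{
	void *p;

	if (atomic_long_add_return(size, &acct->used) > acct->limit) {
		atomic_long_sub(size, &acct->used);
		return NULL;	/* caller would try to swap/reclaim, then fail */
	}

	p = kzalloc(size, gfp);
	if (!p)
		atomic_long_sub(size, &acct->used);
	return p;
}

static void dev_free(struct dev_mem_account *acct, void *p, size_t size)
{
	kfree(p);
	atomic_long_sub(size, &acct->used);
}

ttm would then allocate its bo and fence structs through something like
that instead of its own global accounting, and the policy would live in
one place.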

Then when it comes to the memory of the buffer objects themselves, it
would be a lot better if we could somehow make it accountable against the
user process, for instance by hooking into memory cgroups (memcg). This
would also allow us to directly reuse kernel infrastructure, things like
watermarks for when to start either swapping, reclaiming memory, or
simply refusing to allocate more memory.

Cheers,
Jerome

