[PATCH] drm/nouveau: fix ttm move notify callback

Jerome Glisse j.glisse at gmail.com
Fri Jan 6 10:22:20 PST 2012


On Fri, Jan 06, 2012 at 11:53:35AM -0500, Konrad Rzeszutek Wilk wrote:
> On Fri, Jan 06, 2012 at 11:51:03AM -0500, Jerome Glisse wrote:
> > On Fri, Jan 6, 2012 at 9:57 AM, Konrad Rzeszutek Wilk
> > <konrad.wilk at oracle.com> wrote:
> > > On Thu, Jan 05, 2012 at 09:14:10PM -0500, Konrad Rzeszutek Wilk wrote:
> > >> On Fri, Jan 06, 2012 at 07:53:13AM +1000, Ben Skeggs wrote:
> > >> > On Thu, 2012-01-05 at 13:31 -0500, j.glisse at gmail.com wrote:
> > >> > > From: Jerome Glisse <jglisse at redhat.com>
> > >> > >
> > >> > > ttm might call the move notify with null new mem placement,
> > >> > > properly handle this case inside nouveau move notify callback.
> > >> > This has been fixed already in a -next tree I sent to Dave.
> > >>
> > >> I just tried -next with your patch (and two other fixes that I had sent):
> > >>
> > >> drm/ttm/dma: Only call set_pages_array_wb when the page is not in WB pool
> > >> drm/ttm/dma: Fix accounting error when calling ttm_mem_global_free_page and don't try to free freed pages
> > >>
> > >> and Jerome's AGP fix:
> > >> ttm: fix agp since ttm tt rework
> > >>
> > >> and got the crash (but only with NVidia cards) after swapping between Xorg and the VCs.
> > >> Look in drm-next.jpg
> > >
> > > http://darnok.org/vga/drm-next.jpg
> > >
> > >>
> > >> With your patch removed ("drm/nouveau/ttm: fix crash as a result of a recent ttm change")
> > >> and the patch below by Jerome I still get it to crash (see drm-next-with-Jerome-fix-revert-Ben.jpg)..
> > >
> > > http://darnok.org/vga/drm-next-with-Jerome-fix-revert-Ben.jpg
> > >
> > 
> > Anything special to trigger it ? I can't trigger it with simple gnome3
> > session (firefox evince ...)
> 
> I ran etracer, then switched over to a framebuffer console (Alt-F2), logged in.
> Then ran perf record and switched back to etracer. Ran a couple of laps and when finished
> quit the perf top. On the PCI-e it took a while (so I had to run a couple of laps).
> 
> On the AGP one it happened immediately, which is no surprise since the code looks
> to be activated when we do garbage collection and the machine only had 2GB. The
> PCIe one has 8GB. Perhaps a better way would be to force the workqueue by setting the
> pool limits to smaller values.
>

I'm still having difficulty reproducing this. Can you reproduce it with the
attached printk debugging patch and provide the log? Only the few printk
lines preceding the oops or segfault are interesting.

Cheers,
Jerome

