[Nouveau] [PATCH/TESTING(all hw)/DISCUSSION] FIFO (minor) create and (major) destroy instabilities on nv50+

Maarten Maathuis madman2003 at gmail.com
Mon Jan 4 14:54:43 PST 2010


I forgot to mention that you should run the NOP test app from fbcon,
without X running, to reproduce the lockups reliably.

On Mon, Jan 4, 2010 at 11:39 PM, Ben Skeggs <skeggsb at gmail.com> wrote:
> On Mon, 2010-01-04 at 20:29 +0100, Maarten Maathuis wrote:
>> I've narrowed it down further: the "pgraph->fifo_access" bit is still
>> just cleanup (register 0x400500 represents pgraph fifo access), while the
>> rest appears to be needed for the desired effect. The reordering of pfifo
>> and pgraph destroy is needed. As usual, feedback is appreciated.
> I played a bit yesterday and have the gr/fifoctx unload ordering swap
> queued up already, as well as unconditionally waiting on a fence at
> channel destroy (not really needed, but served as a bit of a cleanup
> anyway).
>
> I'll try and look at the rest of the changes.
>
> Ben.
>>
>> Maarten.
>>
>> On Sat, Jan 2, 2010 at 4:36 PM, Maarten Maathuis <madman2003 at gmail.com> wrote:
>> > Many people using nv50+ hardware are aware of GPU lockups when a fifo
>> > closes under certain conditions. Based on an mmio-trace and some trial
>> > and error testing, I've come up with a patch that improves the
>> > situation on my NV96.
>> >
>> > This patch needs testing on NV50+ hardware and regression testing on
>> > older hardware, since I did change some of the common codepaths. This
>> > is very much a work in progress, and if you have anything to
>> > add/correct, please share it.
>> >
>> > I've also attached 2 test apps. One is bitscan-fail from mwk; use
>> > it like ./bitscan-fail 0x200 to trigger PGRAPH errors. The other is a
>> > modified version that only emits NOPs (method 0x100) and represents
>> > the no-error situation.
>> >
>> > For me, the NOP program runs in loops of 10000 iterations with no
>> > problems (I've done so several times). bitscan-fail sometimes survives
>> > 10000 iterations, but can also fail after a few thousand. In
>> > comparison, a single run of bitscan-fail could cause a GPU lockup for
>> > me in the past.
>> >
>> > Please try the gallium driver, the test apps, and suspend to ram. Suspend
>> > to ram isn't 100% reliable yet for me (this was always the case after
>> > strange experiments/hammering/etc), but should not regress. This goes
>> > for older hw as well: whatever worked should still work, but I
>> > wouldn't expect serious improvements there.
>> >
>> > As always, feedback is appreciated, especially since this is a touchy subject.
>> >
>> > Maarten.
>> >

