[Nouveau] [PATCH] drm/nouveau: fix __nouveau_fence_wait performance regression

Ben Skeggs bskeggs at redhat.com
Mon Mar 7 16:34:37 PST 2011


On Tue, 2011-03-08 at 00:27 +0100, Marcin Slusarz wrote:
> On Tue, Mar 08, 2011 at 12:22:56AM +0100, Marcin Slusarz wrote:
> > On Tue, Mar 08, 2011 at 08:24:26AM +1000, Ben Skeggs wrote:
> > > On Mon, 2011-03-07 at 18:18 +0000, Maarten Maathuis wrote:
> > > > On Fri, Mar 4, 2011 at 4:49 PM, Marcin Slusarz <marcin.slusarz at gmail.com> wrote:
> > > > > On Sun, Feb 13, 2011 at 09:38:04PM +0100, Marcin Slusarz wrote:
> > > > >> Combination of locking and interchannel synchronization changes
> > > > >> uncovered poor behaviour of nouveau_fence_wait, which on HZ=100
> > > > >> configuration could waste up to 10 ms per call.
> > > > >> Depending on application, it lead to 10-30% FPS regression.
> > > > >> To fix it, shorten thread sleep time to 0.1 ms and ensure
> > > > >> spinning happens for at least one *full* tick.
> > > > >>
> > > > >> Signed-off-by: Marcin Slusarz <marcin.slusarz at gmail.com>
> > > > >> ---
> > > > >>  drivers/gpu/drm/nouveau/nouveau_fence.c |   10 ++++++++--
> > > > >>  1 files changed, 8 insertions(+), 2 deletions(-)
> > > > >>
> > > > >> diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
> > > > >> index 221b846..75ba5e2 100644
> > > > >> --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> > > > >> +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> > > > >> @@ -27,6 +27,9 @@
> > > > >>  #include "drmP.h"
> > > > >>  #include "drm.h"
> > > > >>
> > > > >> +#include <linux/ktime.h>
> > > > >> +#include <linux/hrtimer.h>
> > > > >> +
> > > > >>  #include "nouveau_drv.h"
> > > > >>  #include "nouveau_ramht.h"
> > > > >>  #include "nouveau_dma.h"
> > > > >> @@ -230,9 +233,12 @@ int
> > > > >>  __nouveau_fence_wait(void *sync_obj, void *sync_arg, bool lazy, bool intr)
> > > > >>  {
> > > > >>       unsigned long timeout = jiffies + (3 * DRM_HZ);
> > > > >> -     unsigned long sleep_time = jiffies + 1;
> > > > >> +     unsigned long sleep_time = jiffies + 2;
> > > > >> +     ktime_t t;
> > > > >>       int ret = 0;
> > > > >>
> > > > >> +     t = ktime_set(0, NSEC_PER_MSEC / 10);
> > > > >> +
> > > > >>       while (1) {
> > > > >>               if (__nouveau_fence_signalled(sync_obj, sync_arg))
> > > > >>                       break;
> > > > >> @@ -245,7 +251,7 @@ __nouveau_fence_wait(void *sync_obj, void *sync_arg, bool lazy, bool intr)
> > > > >>               __set_current_state(intr ? TASK_INTERRUPTIBLE
> > > > >>                       : TASK_UNINTERRUPTIBLE);
> > > > >>               if (lazy && time_after_eq(jiffies, sleep_time))
> > > > >> -                     schedule_timeout(1);
> > > > >> +                     schedule_hrtimeout(&t, HRTIMER_MODE_REL);
> > > > >>
> > > > >>               if (intr && signal_pending(current)) {
> > > > >>                       ret = -ERESTARTSYS;
> > > > >> --
> > > > >> 1.7.4.rc3
> > > > >>
> > > > >
> > > > > ping again
> > > > > _______________________________________________
> > > > > Nouveau mailing list
> > > > > Nouveau at lists.freedesktop.org
> > > > > http://lists.freedesktop.org/mailman/listinfo/nouveau
> > > > >
> > > > 
> > > > This looks ok to me, but I would like to get Ben Skeggs ok on this one
> > > > as well. So i've CC'ed him, hopefully he'll notice :-)
> > > Ah sorry, I have actually looked at this quite a while back but came to
> > > no solid conclusion.
> > > 
> > > While yes, I did see some minor performance improvement from it, I also
> > > notice that now we once again get 100% CPU usage while an app is waiting
> > > for the GPU a lot..
> > 
> > It's not "minor" performance improvement:
> > 
> > without this patch (FPS):
> > nexuiz:    53
> > wop:       181
> > tremulous: 157
> > wsw0.5:    89
> > glxgears:  730
> > 
> > with:
> > nexuiz:    63   (+18%)
> > wop:       248  (+37%)
> > tremulous: 156  (-0.6%)
> > wsw0.5:    91   (+2%)
> > glxgears:  1054 (+44%)
> > 
> > 
> > Ok, so you are worried about CPU usage... Let's see what will happen if
> > I remove spinning added by "drm/nouveau: Spin for a bit in 
> > nouveau_fence_wait() before yielding the CPU".
> > 
> > reduced version (attached):
> > nexuiz:    62
> > wop:       248
> > trem:      157
> > wsw0.5:    90
> > glxgears:  1055
> > 
> > Good enough?
> 
> Just for comparison:
> kernel 2.6.37 + pre nvc0fied mesa had:
> nexuiz:    34
> wop:       139
> tremulous: 82
> wsw0.5:    53
> glxgears:  1056
Here's what I see on NVA8 with nv50fiednvc0 (mesa master and nouveau
master), with all cpu cores set to "performance":

Before:
gears: 998 (25.0% cpu)
OA: 197.2
nexuiz: 34.81

After:
gears: 980 (21.0% cpu)
OA: 195.7
nexuiz: 35.30

I also tried with all cores set to "powersave" for comparison.  It's
pretty much the same deal there with little to no difference between
before/after, *except* glxgears gains 100fps here.

Ben.

> 
> > ---
> > From: Marcin Slusarz <marcin.slusarz at gmail.com>
> > Subject: [PATCH] drm/nouveau: fix __nouveau_fence_wait performance regression
> > 
> > Combination of locking and interchannel synchronization changes
> > uncovered poor behaviour of nouveau_fence_wait, which on HZ=100
> > configuration could waste up to 10 ms per call.
> > Depending on application, it lead to 10-30% FPS regression.
> > 
> > To fix it, shorten thread sleep time to 0.1 ms.
> > 
> > Additionally, remove spinning (added by "drm/nouveau: Spin for
> > a bit in nouveau_fence_wait() before yielding the CPU"), because
> > it's not needed anymore.
> > 
> > Signed-off-by: Marcin Slusarz <marcin.slusarz at gmail.com>
> > ---
> >  drivers/gpu/drm/nouveau/nouveau_fence.c |   11 ++++++++---
> >  1 files changed, 8 insertions(+), 3 deletions(-)
> > 
> > diff --git a/drivers/gpu/drm/nouveau/nouveau_fence.c b/drivers/gpu/drm/nouveau/nouveau_fence.c
> > index a244702..010243b 100644
> > --- a/drivers/gpu/drm/nouveau/nouveau_fence.c
> > +++ b/drivers/gpu/drm/nouveau/nouveau_fence.c
> > @@ -27,6 +27,9 @@
> >  #include "drmP.h"
> >  #include "drm.h"
> >  
> > +#include <linux/ktime.h>
> > +#include <linux/hrtimer.h>
> > +
> >  #include "nouveau_drv.h"
> >  #include "nouveau_ramht.h"
> >  #include "nouveau_dma.h"
> > @@ -229,9 +232,11 @@ int
> >  __nouveau_fence_wait(void *sync_obj, void *sync_arg, bool lazy, bool intr)
> >  {
> >  	unsigned long timeout = jiffies + (3 * DRM_HZ);
> > -	unsigned long sleep_time = jiffies + 1;
> > +	ktime_t t;
> >  	int ret = 0;
> >  
> > +	t = ktime_set(0, NSEC_PER_MSEC / 10);
> > +
> >  	while (1) {
> >  		if (__nouveau_fence_signalled(sync_obj, sync_arg))
> >  			break;
> > @@ -243,8 +248,8 @@ __nouveau_fence_wait(void *sync_obj, void *sync_arg, bool lazy, bool intr)
> >  
> >  		__set_current_state(intr ? TASK_INTERRUPTIBLE
> >  			: TASK_UNINTERRUPTIBLE);
> > -		if (lazy && time_after_eq(jiffies, sleep_time))
> > -			schedule_timeout(1);
> > +		if (lazy)
> > +			schedule_hrtimeout(&t, HRTIMER_MODE_REL);
> >  
> >  		if (intr && signal_pending(current)) {
> >  			ret = -ERESTARTSYS;
> > -- 
> > 1.7.4.rc3
> > 
> _______________________________________________
> Nouveau mailing list
> Nouveau at lists.freedesktop.org
> http://lists.freedesktop.org/mailman/listinfo/nouveau




More information about the Nouveau mailing list