[Intel-gfx] [PATCH]DRM i915: IGD big FIFO support
Shaohua Li
shaohua.li at intel.com
Wed Jun 10 08:52:55 CEST 2009
On Wed, Jun 10, 2009 at 06:05:19AM +0800, Jesse Barnes wrote:
> On Tue, 2 Jun 2009 15:49:27 +0800
> Shaohua Li <shaohua.li at intel.com> wrote:
>
> > On Mon, Jun 01, 2009 at 05:26:01PM +0800, Jesse Barnes wrote:
> > > On Mon, 25 May 2009 10:28:36 +0800
> > > Shaohua Li <shaohua.li at intel.com> wrote:
> > >
> > > > On Sat, May 23, 2009 at 03:39:51AM +0800, Eric Anholt wrote:
> > > > > On Mon, 2009-05-18 at 10:44 +0800, Shaohua Li wrote:
> > > > > > + addr = ioremap(mch_bar, HOSTBAR_SIZE);
> > > > > > + if (!addr)
> > > > > > + return -ENOMEM;
> > > > > > + tmp = *(u32 *)(addr + 0xc00);
> > > > > > + iounmap(addr);
> > > > >
> > > > > MCHBAR mapping is tricky. I think we'll need to use the same
> > > > > code here as i915_gem_tiling.c, and that's changing in 2.6.31 to
> > > > > successfully enable it (safely) more often.
> > > > Makes sense, I fixed the issues you pointed out.
> > > >
> > > >
> > > > > Big FIFO is a feature that puts memory into self-refresh mode when
> > > > > the CPU enters a C3+ state. Gfx has a FIFO to buffer memory accesses;
> > > > > when the watermark of the FIFO is under the threshold, Gfx doesn't
> > > > > need to access memory, so memory can be put into self-refresh mode
> > > > > at that time.
> > > > >
> > > > > The watermark calculation is based on the CPU C3 state exit latency.
> > > > > If the watermark is wrong, the display will break or flicker when
> > > > > the CPU enters C3+.
> > > > >
> > > > > I measured the power impact of the feature.
> > > > > Environment: 1920x1400 display, Atom CPU with C4 enabled, system
> > > > > FSB is 667 and memory is DDR2 667. I launched X, gave the system
> > > > > several minutes to settle down, then measured whole-system power
> > > > > at idle:
> > > > > without big FIFO, idle power is 19.8 W
> > > > > with it, idle power is 18.6 W
> > > > >
> > > > > The patch doesn't enable HPLL off for CxSR. Last I heard, it's
> > > > > broken on the current IGD chip; if it works, I'll add it later.
> > >
> > > Shaohua, can you take a look at this patch and see if it makes
> > > sense to include in yours? I think we could probably share the
> > > latency tables & math (haven't checked mine in this patch yet, it's
> > > untested on platforms needing FIFO adjustment) at the very least...
> >
> > The latency table might be OK, but the math looks different. My patch
> > uses clock*pixel_size*latency/cacheline_size for the SR WM, but yours
> > seems to take a different approach. Does the math depend on the
> > chipset? The register writes are different too. Most fields of FW_BLC
> > and FW_BLC_SELF are reserved in IGD, so my patch uses FW1-FW3.
>
> No, the SR WM should probably be shared; you have both docs right?
> It's quite possible I have the wrong formula.
>
> A few comments to help integrate things:
> - the DPMS callout should be like mine, just call it
> update_watermarks or something instead
> - the new IGD_* constants should be in i915_reg.h probably
> - the read frequency function should use new defines for all the
> registers read/written
> - see anholt's comment about MCHBAR; we probably want to put this in
> the tiling.c file and run it at load time rather than mode set time
> - the update_watermarks should either be a function pointer to set
> the chip specific watermark regs or should handle all cases along
> the lines of my patch
>
> I'd like to get both bits of code into 2.6.31 though, so we should get
> them integrated soon.
I did some cleanup and tried to make it (hopefully) generic. Please check
how yours can be integrated.
Thanks,
Shaohua
Big FIFO is a feature that puts memory into self-refresh mode when the CPU
enters a C3+ state. Gfx has a FIFO to buffer memory accesses; when the
watermark of the FIFO is under the threshold, Gfx doesn't need to access
memory, so memory can be put into self-refresh mode at that time.
The watermark calculation is based on the CPU C3 state exit latency. If
the watermark is wrong, the display will break or flicker when the CPU
enters C3+.
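For reference, the calculation described above can be sketched as a
stand-alone C function. This is only an illustration, not the code in the
patch: the names wm_params and calc_wm are made up here, the constants are
copied from the IGD_* defines in the diff, and the 150000 kHz dot clock used
in the note after the code is an arbitrary example value. Unlike
intel_calculate_wm() in the diff, the sketch does the multiplication in 64
bits and falls back to the default watermark when the latency needs more
lines than the FIFO holds, instead of letting the unsigned subtraction wrap.

```c
#include <assert.h>

/* Hypothetical stand-alone mirror of the patch's watermark parameters. */
struct wm_params {
	unsigned long fifo_size;   /* FIFO depth, in cache lines */
	unsigned long max_wm;      /* largest value the register field holds */
	unsigned long default_wm;  /* fallback when no usable value exists */
	unsigned long guard_size;  /* safety margin, in cache lines */
};

/* Values copied from the IGD_* defines in the diff. */
static const struct wm_params display_wm = { 512, 0x1ff, 0x3f, 10 };
static const struct wm_params cursor_wm  = { 64, 0x3f, 0, 5 };

/*
 * clock_khz * latency_ns / 1000000 is the number of pixels scanned out
 * during the exit latency; 4 bytes per pixel assumes 32bpp, as the patch
 * does. The watermark is what remains of the FIFO after those bytes
 * (rounded down to whole lines) and the guard band are subtracted.
 */
static unsigned long calc_wm(unsigned long clock_khz,
			     const struct wm_params *p,
			     unsigned long latency_ns,
			     unsigned long line_size)
{
	unsigned long long bytes;
	unsigned long long lines;
	unsigned long wm;

	assert(line_size > 0);

	/* 64-bit product: the HPLL-off latencies overflow 32 bits. */
	bytes = (unsigned long long)clock_khz * 4 * latency_ns / 1000000;
	lines = bytes / line_size;

	/* Not enough FIFO to cover the latency: use the default. */
	if (lines + p->guard_size >= p->fifo_size)
		return p->default_wm;

	wm = p->fifo_size - (unsigned long)lines - p->guard_size;
	if (wm > p->max_wm)
		wm = p->max_wm;
	return wm;
}
```

With a 150000 kHz clock and the {1, 667, 667} table row,
calc_wm(150000, &display_wm, 3372, 64) gives 512 - 31 - 10 = 471. The cursor
HPLL-off case (33845 ns) needs 317 lines, more than the 64-line cursor FIFO,
so the sketch returns the default of 0.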
I measured the power impact of the feature.
Environment: 1920x1400 display, Atom CPU with C4 enabled, system FSB
is 667 and memory is DDR2 667. I launched X, gave the system several
minutes to settle down, then measured whole-system power at idle:
without big FIFO, idle power is 19.8 W
with it, idle power is 18.6 W
The patch doesn't enable HPLL off for CxSR. Last I heard, it's broken
on the current IGD chip; if it works, I'll add it later.
Signed-off-by: Shaohua Li <shaohua.li at intel.com>
---
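Note on the register fields: the masks and shifts applied to DSPFW1 and
DSPFW3 in the diff below imply the layout sketched here. The layout is read
off the patch itself, not from hardware documentation, and the macro and
function names (set_field, pack_dspfw3, the *_SHIFT and *_WIDTH defines) are
hypothetical, added only for illustration.

```c
#include <assert.h>
#include <stdint.h>

/* Field positions inferred from the masks in the patch (assumption). */
#define DSPFW1_DISPLAY_SR_SHIFT       23  /* 9 bits, 31:23 */
#define DSPFW1_DISPLAY_SR_WIDTH       9
#define DSPFW3_CXSR_ENABLE            (1u << 30)
#define DSPFW3_CURSOR_SR_SHIFT        24  /* 6 bits, 29:24 */
#define DSPFW3_CURSOR_SR_WIDTH        6
#define DSPFW3_CURSOR_HPLLOFF_SHIFT   16  /* 6 bits, 21:16 */
#define DSPFW3_CURSOR_HPLLOFF_WIDTH   6
#define DSPFW3_DISPLAY_HPLLOFF_SHIFT  0   /* 9 bits, 8:0 */
#define DSPFW3_DISPLAY_HPLLOFF_WIDTH  9

/* Replace the width-bit field at shift in reg with val (truncated). */
static uint32_t set_field(uint32_t reg, unsigned int shift,
			  unsigned int width, uint32_t val)
{
	uint32_t mask;

	assert(width >= 1 && width <= 32 && shift + width <= 32);
	mask = (width == 32 ? 0xffffffffu : ((1u << width) - 1)) << shift;
	return (reg & ~mask) | ((val << shift) & mask);
}

/* Update the three watermark fields of DSPFW3, leaving other bits alone. */
static uint32_t pack_dspfw3(uint32_t reg, uint32_t cursor_sr,
			    uint32_t display_hplloff, uint32_t cursor_hplloff)
{
	reg = set_field(reg, DSPFW3_CURSOR_SR_SHIFT,
			DSPFW3_CURSOR_SR_WIDTH, cursor_sr);
	reg = set_field(reg, DSPFW3_DISPLAY_HPLLOFF_SHIFT,
			DSPFW3_DISPLAY_HPLLOFF_WIDTH, display_hplloff);
	reg = set_field(reg, DSPFW3_CURSOR_HPLLOFF_SHIFT,
			DSPFW3_CURSOR_HPLLOFF_WIDTH, cursor_hplloff);
	return reg;
}
```

Doing all three field updates on one value would let the driver read and
write DSPFW3 once instead of three times; whether that matters on this
hardware is not something the sketch can answer.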
drivers/gpu/drm/i915/i915_dma.c | 40 +++++++
drivers/gpu/drm/i915/i915_drv.h | 15 ++
drivers/gpu/drm/i915/i915_reg.h | 5
drivers/gpu/drm/i915/intel_display.c | 196 ++++++++++++++++++++++++++++++++++-
4 files changed, 255 insertions(+), 1 deletion(-)
Index: linux/drivers/gpu/drm/i915/i915_drv.h
===================================================================
--- linux.orig/drivers/gpu/drm/i915/i915_drv.h 2009-06-04 15:23:47.000000000 +0800
+++ linux/drivers/gpu/drm/i915/i915_drv.h 2009-06-10 13:49:13.000000000 +0800
@@ -195,6 +195,9 @@ typedef struct drm_i915_private {
int fence_reg_start; /* 4 if userland hasn't ioctl'd us yet */
int num_fence_regs; /* 8 on pre-965, 16 otherwise */
+ bool cxsr_initialized;
+ unsigned int fsb_freq, mem_freq;
+
/* Register state */
u8 saveLBB;
u32 saveDSPACNTR;
@@ -825,4 +828,16 @@ extern int i915_wait_ring(struct drm_dev
#define PRIMARY_RINGBUFFER_SIZE (128*1024)
+/* FIFO watermark */
+#define I915_FIFO_LINE_SIZE 64
+#define IGD_DISPLAY_FIFO 512 /* in 64byte unit */
+#define IGD_MAX_WM 0x1ff
+#define IGD_DFT_WM 0x3f
+#define IGD_DFT_HPLLOFF_WM 0
+#define IGD_GUARD_WM 10
+#define IGD_CURSOR_FIFO 64
+#define IGD_CURSOR_MAX_WM 0x3f
+#define IGD_CURSOR_DFT_WM 0
+#define IGD_CURSOR_GUARD_WM 5
+
#endif
Index: linux/drivers/gpu/drm/i915/intel_display.c
===================================================================
--- linux.orig/drivers/gpu/drm/i915/intel_display.c 2009-06-04 15:23:47.000000000 +0800
+++ linux/drivers/gpu/drm/i915/intel_display.c 2009-06-10 14:03:53.000000000 +0800
@@ -786,7 +786,7 @@ intel_pipe_set_base(struct drm_crtc *crt
}
-
+static void intel_set_cxsr(struct drm_device *dev, bool on);
/**
* Sets the power management mode of the pipe and plane.
*
@@ -848,8 +848,12 @@ static void intel_crtc_dpms(struct drm_c
/* Give the overlay scaler a chance to enable if it's on this pipe */
//intel_crtc_dpms_video(crtc, true); TODO
+ if (crtc->enabled)
+ intel_set_cxsr(dev, true);
break;
case DRM_MODE_DPMS_OFF:
+ if (crtc->enabled)
+ intel_set_cxsr(dev, false);
/* Give the overlay scaler a chance to disable if it's on this pipe */
//intel_crtc_dpms_video(crtc, FALSE); TODO
@@ -1030,6 +1034,194 @@ static int intel_panel_fitter_pipe (stru
return 1;
}
+struct intel_watermark_parameter {
+ unsigned long fifo_size;
+ unsigned long max_wm;
+ unsigned long default_wm;
+ unsigned long guard_size;
+};
+
+static unsigned long intel_calculate_wm(unsigned long clock_in_khz,
+ struct intel_watermark_parameter *wm, unsigned long latency,
+ int line_size)
+{
+ unsigned long bytes_required, wm_size;
+
+ /* always use 32bpp */
+ bytes_required = clock_in_khz * 4 * latency/1000000;
+ bytes_required /= line_size;
+ wm_size = wm->fifo_size - bytes_required - wm->guard_size;
+
+ if (wm_size > wm->max_wm)
+ wm_size = wm->max_wm;
+ if (wm_size == 0)
+ wm_size = wm->default_wm;
+ return wm_size;
+}
+
+struct cxsr_latency {
+ int is_desktop;
+ unsigned long fsb_freq;
+ unsigned long mem_freq;
+ unsigned long display_sr;
+ unsigned long display_hpll_disable;
+ unsigned long cursor_sr;
+ unsigned long cursor_hpll_disable;
+};
+
+static struct cxsr_latency cxsr_latency_table[] = {
+ {1, 800, 400, 3382, 33382, 3983, 33983}, /* DDR2-400 SC */
+ {1, 800, 667, 3354, 33354, 3807, 33807}, /* DDR2-667 SC */
+ {1, 800, 800, 3347, 33347, 3763, 33763}, /* DDR2-800 SC */
+
+ {1, 667, 400, 3400, 33400, 4021, 34021}, /* DDR2-400 SC */
+ {1, 667, 667, 3372, 33372, 3845, 33845}, /* DDR2-667 SC */
+ {1, 667, 800, 3386, 33386, 3822, 33822}, /* DDR2-800 SC */
+
+ {1, 400, 400, 3472, 33472, 4173, 34173}, /* DDR2-400 SC */
+ {1, 400, 667, 3443, 33443, 3996, 33996}, /* DDR2-667 SC */
+ {1, 400, 800, 3430, 33430, 3946, 33946}, /* DDR2-800 SC */
+
+ {0, 800, 400, 3438, 33438, 4065, 34065}, /* DDR2-400 SC */
+ {0, 800, 667, 3410, 33410, 3889, 33889}, /* DDR2-667 SC */
+ {0, 800, 800, 3403, 33403, 3845, 33845}, /* DDR2-800 SC */
+
+ {0, 667, 400, 3456, 33456, 4103, 34106}, /* DDR2-400 SC */
+ {0, 667, 667, 3428, 33428, 3927, 33927}, /* DDR2-667 SC */
+ {0, 667, 800, 3443, 33443, 3905, 33905}, /* DDR2-800 SC */
+
+ {0, 400, 400, 3528, 33528, 4255, 34255}, /* DDR2-400 SC */
+ {0, 400, 667, 3500, 33500, 4079, 34079}, /* DDR2-667 SC */
+ {0, 400, 800, 3487, 33487, 4029, 34029}, /* DDR2-800 SC */
+};
+
+static struct cxsr_latency *intel_get_cxsr_latency(int is_desktop,
+ int fsb, int mem)
+{
+ int i;
+ struct cxsr_latency *latency;
+
+ if (fsb == 0 || mem == 0)
+ return NULL;
+
+ for (i = 0; i < ARRAY_SIZE(cxsr_latency_table); i++) {
+ latency = &cxsr_latency_table[i];
+ if (is_desktop == latency->is_desktop &&
+ fsb == latency->fsb_freq && mem == latency->mem_freq)
+ break;
+ }
+ if (i >= ARRAY_SIZE(cxsr_latency_table)) {
+ DRM_DEBUG("Unknown FSB/MEM found, disable CxSR\n");
+ return NULL;
+ }
+ return latency;
+}
+
+static void intel_set_cxsr(struct drm_device *dev, bool on)
+{
+ u32 reg;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+
+ if (!IS_IGD(dev) || !dev_priv->cxsr_initialized)
+ return;
+ if (on) {
+ /* activate cxsr */
+ reg = I915_READ(DSPFW3);
+ reg |= 1 << 30;
+ I915_WRITE(DSPFW3, reg);
+ } else {
+ /* deactivate cxsr */
+ reg = I915_READ(DSPFW3);
+ reg &= ~(1 << 30);
+ I915_WRITE(DSPFW3, reg);
+ }
+}
+
+static struct intel_watermark_parameter igd_display_wm =
+{IGD_DISPLAY_FIFO, IGD_MAX_WM, IGD_DFT_WM, IGD_GUARD_WM};
+static struct intel_watermark_parameter igd_display_hplloff_wm =
+{IGD_DISPLAY_FIFO, IGD_MAX_WM, IGD_DFT_HPLLOFF_WM, IGD_GUARD_WM};
+static struct intel_watermark_parameter igd_cursor_wm =
+{IGD_CURSOR_FIFO, IGD_CURSOR_MAX_WM, IGD_CURSOR_DFT_WM, IGD_CURSOR_GUARD_WM};
+static struct intel_watermark_parameter igd_cursor_hplloff_wm =
+{IGD_CURSOR_FIFO, IGD_CURSOR_MAX_WM, IGD_CURSOR_DFT_WM, IGD_CURSOR_GUARD_WM};
+
+static void igd_update_watermarks(struct drm_device *dev, unsigned long clock)
+{
+ u32 reg;
+ unsigned long wm;
+ struct drm_i915_private *dev_priv = dev->dev_private;
+ struct cxsr_latency *latency;
+ int pipes_enabled = 0;
+ struct drm_crtc *crtc;
+
+ mutex_lock(&dev->mode_config.mutex);
+ list_for_each_entry(crtc, &dev->mode_config.crtc_list, head) {
+ if (crtc->enabled)
+ pipes_enabled++;
+ }
+ mutex_unlock(&dev->mode_config.mutex);
+ if (pipes_enabled != 1) {
+ DRM_DEBUG("CxSR needs exactly one enabled pipe, disable CxSR\n");
+ goto disable_cxsr;
+ }
+
+ latency = intel_get_cxsr_latency(IS_IGDG(dev), dev_priv->fsb_freq,
+ dev_priv->mem_freq);
+ if (!latency) {
+ DRM_DEBUG("Unknown FSB/MEM found, disable CxSR\n");
+ goto disable_cxsr;
+ }
+
+ /* Display SR */
+ wm = intel_calculate_wm(clock, &igd_display_wm, latency->display_sr,
+ I915_FIFO_LINE_SIZE);
+ reg = I915_READ(DSPFW1);
+ reg &= 0x7fffff;
+ reg |= wm << 23;
+ I915_WRITE(DSPFW1, reg);
+ DRM_DEBUG("DSPFW1 register is %x\n", reg);
+
+ /* cursor SR */
+ wm = intel_calculate_wm(clock, &igd_cursor_wm, latency->cursor_sr,
+ I915_FIFO_LINE_SIZE);
+ reg = I915_READ(DSPFW3);
+ reg &= ~(0x3f << 24);
+ reg |= (wm & 0x3f) << 24;
+ I915_WRITE(DSPFW3, reg);
+
+ /* Display HPLL off SR */
+ wm = intel_calculate_wm(clock, &igd_display_hplloff_wm,
+ latency->display_hpll_disable, I915_FIFO_LINE_SIZE);
+ reg = I915_READ(DSPFW3);
+ reg &= 0xfffffe00;
+ reg |= wm & 0x1ff;
+ I915_WRITE(DSPFW3, reg);
+
+ /* cursor HPLL off SR */
+ wm = intel_calculate_wm(clock, &igd_cursor_hplloff_wm,
+ latency->cursor_hpll_disable, I915_FIFO_LINE_SIZE);
+ reg = I915_READ(DSPFW3);
+ reg &= ~(0x3f << 16);
+ reg |= (wm & 0x3f) << 16;
+ I915_WRITE(DSPFW3, reg);
+ DRM_DEBUG("DSPFW3 register is %x\n", reg);
+
+ dev_priv->cxsr_initialized = true;
+
+ DRM_INFO("Big FIFO is enabled\n");
+ return;
+disable_cxsr:
+ dev_priv->cxsr_initialized = false;
+ DRM_INFO("Big FIFO is disabled\n");
+}
+
+static void intel_update_watermarks(struct drm_device *dev, unsigned long clock)
+{
+ if (IS_IGD(dev))
+ igd_update_watermarks(dev, clock);
+}
+
static int intel_crtc_mode_set(struct drm_crtc *crtc,
struct drm_display_mode *mode,
struct drm_display_mode *adjusted_mode,
@@ -1318,6 +1510,8 @@ static int intel_crtc_mode_set(struct dr
if (ret != 0)
return ret;
+ intel_update_watermarks(dev, clock.dot);
+
drm_vblank_post_modeset(dev, pipe);
return 0;
Index: linux/drivers/gpu/drm/i915/i915_reg.h
===================================================================
--- linux.orig/drivers/gpu/drm/i915/i915_reg.h 2009-06-04 15:23:47.000000000 +0800
+++ linux/drivers/gpu/drm/i915/i915_reg.h 2009-06-10 11:15:19.000000000 +0800
@@ -562,6 +562,7 @@
#define C0DRB3 0x10206
#define C1DRB3 0x10606
+#define CLKCFG 0x10c00
/** GM965 GM45 render standby register */
#define MCHBAR_RENDER_STANDBY 0x111B8
@@ -1382,6 +1383,10 @@
#define DSPARB_CSTART_SHIFT 7
#define DSPARB_BSTART_MASK (0x7f)
#define DSPARB_BSTART_SHIFT 0
+
+#define DSPFW1 0x70034
+#define DSPFW2 0x70038
+#define DSPFW3 0x7003c
/*
* The two pipe frame counter registers are not synchronized, so
* reading a stable value is somewhat tricky. The following code
Index: linux/drivers/gpu/drm/i915/i915_dma.c
===================================================================
--- linux.orig/drivers/gpu/drm/i915/i915_dma.c 2009-06-10 11:18:24.000000000 +0800
+++ linux/drivers/gpu/drm/i915/i915_dma.c 2009-06-10 11:26:08.000000000 +0800
@@ -1078,6 +1078,44 @@ void i915_master_destroy(struct drm_devi
master->driver_priv = NULL;
}
+static void i915_get_mem_freq(struct drm_device *dev)
+{
+ drm_i915_private_t *dev_priv = dev->dev_private;
+ u32 tmp;
+
+ if (!IS_IGD(dev))
+ return;
+
+ tmp = I915_READ(CLKCFG);
+
+ switch (tmp & 0x7) {
+ case 1:
+ dev_priv->fsb_freq = 533; /* 133*4 */
+ break;
+ case 2:
+ dev_priv->fsb_freq = 800; /* 200*4 */
+ break;
+ case 3:
+ dev_priv->fsb_freq = 667; /* 167*4 */
+ break;
+ case 5:
+ dev_priv->fsb_freq = 400; /* 100*4 */
+ break;
+ }
+
+ switch ((tmp >> 4) & 0x7) {
+ case 1:
+ dev_priv->mem_freq = 533;
+ break;
+ case 2:
+ dev_priv->mem_freq = 667;
+ break;
+ case 3:
+ dev_priv->mem_freq = 800;
+ break;
+ }
+}
+
/**
* i915_driver_load - setup chip and create an initial config
* @dev: DRM device
@@ -1165,6 +1203,8 @@ int i915_driver_load(struct drm_device *
goto out_iomapfree;
}
+ i915_get_mem_freq(dev);
+
/* On the 945G/GM, the chipset reports the MSI capability on the
* integrated graphics even though the support isn't actually there
* according to the published specs. It doesn't appear to function