Hello,
Here are 2 fixes and one improvement for the page fault handling. Those bugs were found while working on indirect draw support, which requires the allocation of a big heap buffer for varyings, and the vertex/tiler shaders seem to have access patterns that trigger those issues. I remember discussing the first issue with Steve or Robin a while back, but we never hit it before (now we do :)).
The last patch is a perf improvement: no need to re-enable hardware interrupts if we know the threaded irq handler will be woken up right away.
Regards,
Boris
Boris Brezillon (3):
  drm/panfrost: Clear MMU irqs before handling the fault
  drm/panfrost: Don't try to map pages that are already mapped
  drm/panfrost: Stay in the threaded MMU IRQ handler until we've handled all IRQs
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 20 +++++++++++++++++---
 1 file changed, 17 insertions(+), 3 deletions(-)
When a fault is handled it will unblock the GPU which will continue executing its shader and might fault almost immediately on a different page. If we clear interrupts after handling the fault we might miss new faults, so clear them before.
Cc: stable@vger.kernel.org
Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 7c1b3481b785..904d63450862 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -593,6 +593,8 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
 		access_type = (fault_status >> 8) & 0x3;
 		source_id = (fault_status >> 16);

+		mmu_write(pfdev, MMU_INT_CLEAR, mask);
+
 		/* Page fault only */
 		ret = -1;
 		if ((status & mask) == BIT(i) && (exception_type & 0xF8) == 0xC0)
@@ -616,8 +618,6 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
 			access_type, access_type_name(pfdev, fault_status),
 			source_id);

-		mmu_write(pfdev, MMU_INT_CLEAR, mask);
-
 		status &= ~mask;
 	}
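The ordering bug the patch fixes can be illustrated with a small user-space model (all names here are illustrative stand-ins, not the real driver API): the "GPU" latches fault bits into a raw status register, and the handler acknowledges them with a clear write.

```c
#include <assert.h>
#include <stdint.h>

/* Simulated MMU_INT_RAWSTAT register. */
static uint32_t rawstat;

static void gpu_raise_fault(int bit) { rawstat |= 1u << bit; }
static void int_clear(uint32_t mask) { rawstat &= ~mask; }

/* Clearing AFTER resolving the fault: the unblocked GPU may fault
 * again while we are still in the handler, and the late clear wipes
 * the freshly latched bit, so the new fault is never serviced. */
static uint32_t handle_clear_after(uint32_t mask)
{
    /* ... resolve the fault; the GPU resumes and faults again ... */
    gpu_raise_fault(0);
    int_clear(mask);          /* oops: also clears the new fault */
    return rawstat;           /* 0: the new fault was lost */
}

/* Clearing BEFORE resolving: the new fault stays latched in the raw
 * status register and is picked up on the next handler pass. */
static uint32_t handle_clear_before(uint32_t mask)
{
    int_clear(mask);          /* ack only what we have already seen */
    gpu_raise_fault(0);
    return rawstat;           /* the new fault is still pending */
}
```

This is only a sketch of the race; in the real handler the "resolve" step is the heap-growth mapping that unblocks the shader.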
On 01/02/2021 08:21, Boris Brezillon wrote:
> When a fault is handled it will unblock the GPU which will continue
> executing its shader and might fault almost immediately on a
> different page. If we clear interrupts after handling the fault we
> might miss new faults, so clear them before.
>
> Cc: stable@vger.kernel.org
> Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations")
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Good catch (although this oddly rings a bell - so perhaps it was me you discussed it with before)
Reviewed-by: Steven Price steven.price@arm.com
We allocate 2MB chunks at a time, so it might appear that a page fault has already been handled by a previous page fault when we reach panfrost_mmu_map_fault_addr(). Bail out in that case to avoid mapping the same area twice.
Cc: stable@vger.kernel.org
Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations")
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 9 ++++++++-
 1 file changed, 8 insertions(+), 1 deletion(-)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 904d63450862..21e552d1ac71 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -488,8 +488,14 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,
 		}
 		bo->base.pages = pages;
 		bo->base.pages_use_count = 1;
-	} else
+	} else {
 		pages = bo->base.pages;
+		if (pages[page_offset]) {
+			/* Pages are already mapped, bail out. */
+			mutex_unlock(&bo->base.pages_lock);
+			goto out;
+		}
+	}

 	mapping = bo->base.base.filp->f_mapping;
 	mapping_set_unevictable(mapping);
@@ -522,6 +528,7 @@ static int panfrost_mmu_map_fault_addr(struct panfrost_device *pfdev, int as,

 	dev_dbg(pfdev->dev, "mapped page fault @ AS%d %llx", as, addr);

+out:
 	panfrost_gem_mapping_put(bomapping);

 	return 0;
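The "already mapped" situation comes from the 2MB chunk granularity: one fault populates 512 pages at once, so a second fault landing anywhere in that chunk finds its page already present. A minimal user-space sketch of the check (the `pages` array is a hypothetical stand-in for `bo->base.pages`):

```c
#include <assert.h>
#include <stddef.h>

#define PAGES_PER_CHUNK 512                 /* 2MB of 4K pages */

/* A NULL entry means that page has not been allocated/mapped yet.
 * The array covers two 2MB chunks for the example. */
static void *pages[2 * PAGES_PER_CHUNK];

/* Mirrors the check added by this patch: if an earlier fault already
 * populated this page (because whole 2MB chunks are mapped at a
 * time), bail out instead of mapping the same area twice.
 * Returns 1 if a chunk was mapped, 0 if we bailed out. */
static int map_fault(size_t page_offset)
{
    if (pages[page_offset])
        return 0;                           /* already mapped: bail */

    /* Map the whole 2MB chunk containing page_offset. */
    size_t base = page_offset & ~(size_t)(PAGES_PER_CHUNK - 1);
    for (size_t i = 0; i < PAGES_PER_CHUNK; i++)
        pages[base + i] = &pages[base + i]; /* mark as populated */
    return 1;
}
```

In the driver the bail-out path also has to drop `pages_lock` and the GEM mapping reference, which is what the `mutex_unlock()` and `goto out` in the diff take care of.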
On 01/02/2021 08:21, Boris Brezillon wrote:
> We allocate 2MB chunks at a time, so it might appear that a page
> fault has already been handled by a previous page fault when we
> reach panfrost_mmu_map_fault_addr(). Bail out in that case to avoid
> mapping the same area twice.
>
> Cc: stable@vger.kernel.org
> Fixes: 187d2929206e ("drm/panfrost: Add support for GPU heap allocations")
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Reviewed-by: Steven Price steven.price@arm.com
Doing a hw-irq -> threaded-irq round-trip is counter-productive, stay in the threaded irq handler as long as we can.
Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
---
 drivers/gpu/drm/panfrost/panfrost_mmu.c | 7 +++++++
 1 file changed, 7 insertions(+)
diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
index 21e552d1ac71..65bc20628c4e 100644
--- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
+++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
@@ -580,6 +580,8 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
 	u32 status = mmu_read(pfdev, MMU_INT_RAWSTAT);
 	int i, ret;

+again:
+
 	for (i = 0; status; i++) {
 		u32 mask = BIT(i) | BIT(i + 16);
 		u64 addr;
@@ -628,6 +630,11 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
 		status &= ~mask;
 	}

+	/* If we received new MMU interrupts, process them before returning. */
+	status = mmu_read(pfdev, MMU_INT_RAWSTAT);
+	if (status)
+		goto again;
+
 	mmu_write(pfdev, MMU_INT_MASK, ~0);
 	return IRQ_HANDLED;
 };
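The control flow of the patched handler can be modeled in user space. Here the simulated RAWSTAT register is just an array of values returned by successive reads (a hypothetical stand-in for `mmu_read(pfdev, MMU_INT_RAWSTAT)`), which makes it easy to model faults that arrive while the thread is still running:

```c
#include <assert.h>
#include <stdint.h>

/* Each call returns the next simulated RAWSTAT value. */
static uint32_t mmu_read_rawstat(const uint32_t *reads, int *idx)
{
    return reads[(*idx)++];
}

/* Returns the number of faults handled before the handler would
 * re-enable the hardware interrupt and return IRQ_HANDLED. */
static int irq_thread(const uint32_t *reads)
{
    int idx = 0, handled = 0;
    uint32_t status = mmu_read_rawstat(reads, &idx);

again:
    for (int i = 0; status; i++) {
        uint32_t mask = (1u << i) | (1u << (i + 16));

        if (!(status & mask))
            continue;

        handled++;                 /* resolve the fault on AS i */
        status &= ~mask;
    }

    /* If new interrupts arrived meanwhile, process them before
     * returning, exactly like the goto in the patch. */
    status = mmu_read_rawstat(reads, &idx);
    if (status)
        goto again;

    return handled;
}
```

With reads `{0x3, 0x4, 0x0}` the thread handles two faults, re-reads the register, loops once more for the late third fault, and only then would unmask the hardware interrupt.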
On 01/02/2021 08:21, Boris Brezillon wrote:
> Doing a hw-irq -> threaded-irq round-trip is counter-productive, stay
> in the threaded irq handler as long as we can.
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
Looks fine to me, but I'm interested to know if you actually saw a performance improvement. Back-to-back MMU faults should (hopefully) be fairly uncommon.
Regardless:
Reviewed-by: Steven Price steven.price@arm.com
On Mon, 1 Feb 2021 12:13:49 +0000 Steven Price steven.price@arm.com wrote:
> On 01/02/2021 08:21, Boris Brezillon wrote:
>> Doing a hw-irq -> threaded-irq round-trip is counter-productive, stay
>> in the threaded irq handler as long as we can.
>>
>> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
>
> Looks fine to me, but I'm interested to know if you actually saw a
> performance improvement. Back-to-back MMU faults should (hopefully)
> be fairly uncommon.
I actually didn't check the perf improvement or the actual number of back-to-back MMU faults, but dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_1000x1000_drawcount_5000 seemed to generate a few of those, so I thought it'd be good to optimize that case given how trivial it is.
On 01/02/2021 12:59, Boris Brezillon wrote:
> On Mon, 1 Feb 2021 12:13:49 +0000
> Steven Price <steven.price@arm.com> wrote:
>> On 01/02/2021 08:21, Boris Brezillon wrote:
>>> Doing a hw-irq -> threaded-irq round-trip is counter-productive, stay
>>> in the threaded irq handler as long as we can.
>>>
>>> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
>>
>> Looks fine to me, but I'm interested to know if you actually saw a
>> performance improvement. Back-to-back MMU faults should (hopefully)
>> be fairly uncommon.
>
> I actually didn't check the perf improvement or the actual number of
> back-to-back MMU faults, but
> dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_1000x1000_drawcount_5000
> seemed to generate a few of those, so I thought it'd be good to
> optimize that case given how trivial it is.
Fair enough! I was just a little concerned that Panfrost was somehow provoking enough interrupts that this was a measurable performance improvement.
I assume you'll push these to drm-misc-next (/fixes) as appropriate.
Thanks,
Steve
On Mon, 1 Feb 2021 13:24:00 +0000 Steven Price steven.price@arm.com wrote:
> On 01/02/2021 12:59, Boris Brezillon wrote:
>> On Mon, 1 Feb 2021 12:13:49 +0000
>> Steven Price <steven.price@arm.com> wrote:
>>> On 01/02/2021 08:21, Boris Brezillon wrote:
>>>> Doing a hw-irq -> threaded-irq round-trip is counter-productive, stay
>>>> in the threaded irq handler as long as we can.
>>>>
>>>> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
>>>
>>> Looks fine to me, but I'm interested to know if you actually saw a
>>> performance improvement. Back-to-back MMU faults should (hopefully)
>>> be fairly uncommon.
>>
>> I actually didn't check the perf improvement or the actual number of
>> back-to-back MMU faults, but
>> dEQP-GLES31.functional.draw_indirect.compute_interop.large.drawelements_combined_grid_1000x1000_drawcount_5000
>> seemed to generate a few of those, so I thought it'd be good to
>> optimize that case given how trivial it is.
>
> Fair enough! I was just a little concerned that Panfrost was somehow
> provoking enough interrupts that this was a measurable performance
> improvement.
>
> I assume you'll push these to drm-misc-next (/fixes) as appropriate.
I think I'll just queue everything to misc-next and let the stable maintainers backport the fixes (no one complained about this issue so far).
On Mon, Feb 1, 2021 at 2:21 AM Boris Brezillon boris.brezillon@collabora.com wrote:
> Doing a hw-irq -> threaded-irq round-trip is counter-productive, stay
> in the threaded irq handler as long as we can.
>
> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
> ---
>  drivers/gpu/drm/panfrost/panfrost_mmu.c | 7 +++++++
>  1 file changed, 7 insertions(+)
>
> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> index 21e552d1ac71..65bc20628c4e 100644
> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
> @@ -580,6 +580,8 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
>  	u32 status = mmu_read(pfdev, MMU_INT_RAWSTAT);
>  	int i, ret;
>
> +again:
> +
>  	for (i = 0; status; i++) {
>  		u32 mask = BIT(i) | BIT(i + 16);
>  		u64 addr;
>
> @@ -628,6 +630,11 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
>  		status &= ~mask;
>  	}
>
> +	/* If we received new MMU interrupts, process them before returning. */
> +	status = mmu_read(pfdev, MMU_INT_RAWSTAT);
> +	if (status)
> +		goto again;
Can't we avoid the goto? Change the for loop like this:

i = 0;
while (status = mmu_read(pfdev, MMU_INT_RAWSTAT)) {
	...
	i++;
	if (i == 16)
		i = 0;
}
> +
>  	mmu_write(pfdev, MMU_INT_MASK, ~0);
>  	return IRQ_HANDLED;
>  };
>
> --
> 2.26.2
On 03/02/2021 14:45, Rob Herring wrote:
> On Mon, Feb 1, 2021 at 2:21 AM Boris Brezillon
> <boris.brezillon@collabora.com> wrote:
>> Doing a hw-irq -> threaded-irq round-trip is counter-productive, stay
>> in the threaded irq handler as long as we can.
>>
>> Signed-off-by: Boris Brezillon <boris.brezillon@collabora.com>
>> ---
>>  drivers/gpu/drm/panfrost/panfrost_mmu.c | 7 +++++++
>>  1 file changed, 7 insertions(+)
>>
>> diff --git a/drivers/gpu/drm/panfrost/panfrost_mmu.c b/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> index 21e552d1ac71..65bc20628c4e 100644
>> --- a/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> +++ b/drivers/gpu/drm/panfrost/panfrost_mmu.c
>> @@ -580,6 +580,8 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
>>  	u32 status = mmu_read(pfdev, MMU_INT_RAWSTAT);
>>  	int i, ret;
>>
>> +again:
>> +
>>  	for (i = 0; status; i++) {
>>  		u32 mask = BIT(i) | BIT(i + 16);
>>  		u64 addr;
>>
>> @@ -628,6 +630,11 @@ static irqreturn_t panfrost_mmu_irq_handler_thread(int irq, void *data)
>>  		status &= ~mask;
>>  	}
>>
>> +	/* If we received new MMU interrupts, process them before returning. */
>> +	status = mmu_read(pfdev, MMU_INT_RAWSTAT);
>> +	if (status)
>> +		goto again;
> Can't we avoid the goto? Change the for loop like this:
>
> i = 0;
> while (status = mmu_read(pfdev, MMU_INT_RAWSTAT)) {
> 	...
> 	i++;
> 	if (i == 16)
> 		i = 0;
> }
Well that reads from the RAWSTAT register excessively (which could be
expensive at low GPU clock speeds), but we could do:

	for (i = 0; status; i++) {
		...
		if (!status) {
			i = 0;
			status = mmu_read(pfdev, MMU_INT_RAWSTAT);
		}
	}

(or similar with a while() if you prefer). Of course we could even get
rid of the 'i' loop altogether:

	status = mmu_read(pfdev, MMU_INT_RAWSTAT);
	while (status) {
		int i = ffs(status | (status >> 16)) - 1;

		... existing code ...

		status &= ~mask;
		if (!status)
			status = mmu_read(pfdev, MMU_INT_RAWSTAT);
	}
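The ffs()-driven variant sketched above can be checked in user space. This is only a model of the bit-walking, with fault handling reduced to a counter; the `(status | (status >> 16))` fold matches the hardware convention that bit i is the page-fault bit and bit i+16 the bus-error bit for address space i:

```c
#include <assert.h>
#include <stdint.h>
#include <strings.h>   /* ffs() */

/* Walk only the set bits of a simulated RAWSTAT value instead of
 * iterating i over every possible address space.
 * Returns the number of distinct AS faults handled. */
static int count_faults_ffs(uint32_t status)
{
    int handled = 0;

    while (status) {
        int i = ffs((int)(status | (status >> 16))) - 1;
        uint32_t mask = (1u << i) | (1u << (i + 16));

        handled++;         /* handle the fault on AS i */
        status &= ~mask;   /* clear both bits for that AS */
    }
    return handled;
}
```

Note that a status with both the fault and bus-error bit set for the same AS (e.g. 0x10001) is consumed in a single iteration, since the mask clears the low and high halves together.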
Steve