[Mesa-dev] [PATCH v2 12/14] nv50/ir: use 1024 threads/block for variable local size

Samuel Pitoiset samuel.pitoiset at gmail.com
Sun Sep 11 18:45:32 UTC 2016


When a variable local size is used, as specified by
ARB_compute_variable_group_size, the fixed local size is set to 0.
Computing the maximum number of regs then divides by zero and
raises SIGFPE. Assume 1024 threads per block in that case.

Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
---
 src/gallium/drivers/nouveau/codegen/nv50_ir_target.h | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)
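
For illustration only, a minimal standalone sketch of the failure mode and
the fallback. The constants and the helper names (kRegsPerSM,
kMaxRegsPerThread, effectiveThreads, maxRegsPerThread) are hypothetical and
are not taken from the nouveau codegen; the real limits depend on the
chipset. The sketch only assumes that the per-thread register budget is
obtained by dividing a register file size by the thread count.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical limits, chosen only to make the arithmetic concrete.
constexpr uint32_t kRegsPerSM = 65536;
constexpr uint32_t kMaxRegsPerThread = 255;
constexpr uint32_t kMaxThreadsPerBlock = 1024;

// Mirrors the fallback in the patch: a thread count of 0 means the local
// size is variable, so assume the largest possible block.
uint32_t effectiveThreads(uint32_t numThreads)
{
   return numThreads == 0 ? kMaxThreadsPerBlock : numThreads;
}

// Assumed shape of the register budget computation. Without the fallback,
// threads == 0 would make the integer division below trap (SIGFPE).
uint32_t maxRegsPerThread(uint32_t threads)
{
   assert(threads != 0);
   return std::min(kMaxRegsPerThread, kRegsPerSM / threads);
}
```

With these assumed numbers, a variable local size gets a budget of
65536 / 1024 = 64 regs per thread, while a fixed local size of 256 threads
is still capped at 255.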

diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
index 4a701f7..0bb14ec 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
@@ -174,7 +174,8 @@ public:
    virtual void getBuiltinCode(const uint32_t **code, uint32_t *size) const = 0;
 
    virtual void parseDriverInfo(const struct nv50_ir_prog_info *info) {
-      threads = info->prop.cp.numThreads;
+      threads =
+         info->prop.cp.numThreads == 0 ? 1024 : info->prop.cp.numThreads;
    }
 
    virtual bool runLegalizePass(Program *, CGStage stage) const = 0;
-- 
2.9.3


