Mesa (master): nv50/ir: set number of threads/block for variable local size
Samuel Pitoiset
hakzsam at kemper.freedesktop.org
Thu Oct 6 22:25:22 UTC 2016
Module: Mesa
Branch: master
Commit: 11e75fffeb4afc5be0021477f11e5a18a6ff6abf
URL: http://cgit.freedesktop.org/mesa/mesa/commit/?id=11e75fffeb4afc5be0021477f11e5a18a6ff6abf
Author: Samuel Pitoiset <samuel.pitoiset at gmail.com>
Date: Wed Sep 7 00:12:51 2016 +0200
nv50/ir: set number of threads/block for variable local size
When a variable local size is defined as specified by
ARB_compute_variable_group_size, the fixed local size is set to 0,
and dividing by that thread count raises a SIGFPE when we compute
the maximum number of regs.
Falling back to a non-zero default avoids the division by zero and
allows the use of 64 GPRs/thread.
v4: - use 512 threads on Fermi, 1024 on Kepler+
Signed-off-by: Samuel Pitoiset <samuel.pitoiset at gmail.com>
---
src/gallium/drivers/nouveau/codegen/nv50_ir_target.h | 2 ++
1 file changed, 2 insertions(+)
diff --git a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
index 4a701f7..eaf50cc 100644
--- a/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
+++ b/src/gallium/drivers/nouveau/codegen/nv50_ir_target.h
@@ -175,6 +175,8 @@ public:
    virtual void parseDriverInfo(const struct nv50_ir_prog_info *info) {
       threads = info->prop.cp.numThreads;
+      if (threads == 0)
+         threads = info->target >= NVISA_GK104_CHIPSET ? 1024 : 512;
    }
    virtual bool runLegalizePass(Program *, CGStage stage) const = 0;
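For illustration only, here is a minimal standalone sketch of why the fallback matters. It is not the actual Mesa code path: kGK104Chipset, effectiveThreads() and gprsPerThread() are hypothetical names, and the chipset value and register file sizes in the usage note are assumptions chosen to match the 64 GPRs/thread figure in the commit message.

```cpp
#include <cstdint>

// Hypothetical stand-in for NVISA_GK104_CHIPSET; the value is an assumption.
constexpr uint32_t kGK104Chipset = 0xe0;

// Mirrors the patched fallback: a thread count of 0 means the local size
// is variable, so substitute 1024 on Kepler+ and 512 on older chips.
constexpr uint32_t effectiveThreads(uint32_t numThreads, uint32_t chipset)
{
   return numThreads != 0 ? numThreads
                          : (chipset >= kGK104Chipset ? 1024u : 512u);
}

// Hypothetical register budget: register file size divided by threads per
// block. Calling this with threads == 0 is an integer division by zero,
// which is what surfaces as a SIGFPE.
constexpr uint32_t gprsPerThread(uint32_t regFileSize, uint32_t threads)
{
   return regFileSize / threads;
}
```

Under these assumptions, a 65536-entry register file with the 1024-thread fallback and a 32768-entry register file with the 512-thread fallback both give 64 GPRs per thread, while an explicit non-zero local size passes through unchanged.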