[PATCH 3/3] accel/habanalabs: dump temperature threshold boot error

Oded Gabbay ogabbay at kernel.org
Mon Jun 12 12:07:33 UTC 2023


From: Ofir Bitton <obitton at habana.ai>

Add dump of an error reported from f/w during boot time.
This error indicates a failure with setting temperature threshold.

Signed-off-by: Ofir Bitton <obitton at habana.ai>
Reviewed-by: Oded Gabbay <ogabbay at kernel.org>
Signed-off-by: Oded Gabbay <ogabbay at kernel.org>
---
 drivers/accel/habanalabs/common/firmware_if.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/drivers/accel/habanalabs/common/firmware_if.c b/drivers/accel/habanalabs/common/firmware_if.c
index 370508e98854..c7da69dbfa0a 100644
--- a/drivers/accel/habanalabs/common/firmware_if.c
+++ b/drivers/accel/habanalabs/common/firmware_if.c
@@ -724,6 +724,11 @@ static bool fw_report_boot_dev0(struct hl_device *hdev, u32 err_val,
 		err_exists = true;
 	}
 
+	if (err_val & CPU_BOOT_ERR0_TMP_THRESH_INIT_FAIL) {
+		dev_err(hdev->dev, "Device boot error - Failed to set threshold for temperature sensor\n");
+		err_exists = true;
+	}
+
 	if (err_val & CPU_BOOT_ERR0_DEVICE_UNUSABLE_FAIL) {
 		/* Ignore this bit, don't prevent driver loading */
 		dev_dbg(hdev->dev, "device unusable status is set\n");
-- 
2.40.1



More information about the dri-devel mailing list