[git pull] habanalabs for drm-next-6.5
Oded Gabbay
ogabbay at kernel.org
Thu Jun 8 10:30:43 UTC 2023
Hi Dave, Daniel.
Habanalabs pull request for 6.5.
As Gaudi2 is pretty much stable, this contains mostly bug fixes and small
optimizations and improvements.
Full details are in the signed tag.
Thanks,
Oded
The following changes since commit 2e1492835e439fceba57a5b0f9b17da8e78ffa3d:
Merge tag 'drm-misc-next-2023-06-01' of git://anongit.freedesktop.org/drm/drm-misc into drm-next (2023-06-02 13:39:00 +1000)
are available in the Git repository at:
https://git.kernel.org/pub/scm/linux/kernel/git/ogabbay/linux.git tags/drm-habanalabs-next-2023-06-08
for you to fetch changes up to e6f49e96bc57d34fc0f617f37bfdf62a9b58d2c2:
accel/habanalabs: refactor error info reset (2023-06-08 12:35:56 +0300)
----------------------------------------------------------------
This tag contains additional habanalabs driver changes for v6.5:
- uAPI changes:
- Return 0 when user queries if there was a h/w or f/w error and no such error happened.
Previously we returned an error in such case.
- New features and improvements:
- Add pci health check when we lose connection with the firmware. This can be used to
distinguish between pci link down and firmware getting stuck.
- Add more info to the error print when TPC interrupt occur.
- Reduce amount of code under mutex in the command submission of signal event.
- Firmware related fixes:
- Fixes to the handshake protocol during f/w initialization.
- Display information that the f/w sends us when encountering a DMA error.
- Do soft-reset using a message sent to firmware instead of writing to MMIO.
- Prepare generic code to extract f/w version numbers.
- Bug fixes and code cleanups. Notable fixes are:
- Unsecure certain TPC registers that the user should access.
- Fix handling of QMAN errors
- Fix memory leak when recording errors (to later pass them to the user)
- Multiple fixes to razwi interrupt handling code
----------------------------------------------------------------
Dafna Hirschfeld (6):
accel/habanalabs: add helper to extract the FW major/minor
accel/habanalabs: rename fw_{major/minor}_version to fw_inner_{major/minor}_ver
accel/habanalabs: extract and save the FW's SW major/minor/sub-minor
accel/habanalabs: check fw version using sw version
accel/habanalabs: do soft-reset using cpucp packet
accel/habanalabs: add missing tpc interrupt info
Dan Carpenter (1):
accel/habanalabs: fix gaudi2_get_tpc_idle_status() return
Dani Liberman (4):
accel/habanalabs: use binning info when handling razwi
accel/habanalabs: mask part of hmmu page fault captured address
accel/habanalabs: add description to several info ioctls
accel/habanalabs: refactor error info reset
Koby Elbaz (8):
accel/habanalabs: remove commented code that won't be used
accel/habanalabs: minimize encapsulation signal mutex lock time
accel/habanalabs: refactor abort of completions and waits
accel/habanalabs: poll for device status update following WFE cmd
accel/habanalabs: fix a static warning - 'dubious: x & !y'
accel/habanalabs: rename security functions related arguments
accel/habanalabs: upon DMA errors, use FW-extracted error cause
accel/habanalabs: update state when loading boot fit
Moti Haimovski (3):
accel/habanalabs: fix bug in free scratchpad memory
accel/habanalabs: call to HW/FW err returns 0 when no events exist
accel/habanalabs: fix mem leak in capture user mappings
Oded Gabbay (5):
accel/habanalabs: set unused bit as reserved
accel/habanalabs: align to latest firmware specs
accel/habanalabs: print max timeout value on CS stuck
accel/habanalabs: remove sim code
accel/habanalabs: move ioctl error print to debug level
Ofir Bitton (7):
accel/habanalabs: unsecure TPC bias registers
accel/habanalabs: add pci health check during heartbeat
accel/habanalabs: always fetch pci addr_dec error info
accel/habanalabs: remove support for mmu disable
accel/habanalabs: fix bug of not fetching addr_dec info
accel/habanalabs: unsecure TSB_CFG_MTRR regs
accel/habanalabs: add event queue extra validation
Rakesh Ughreja (1):
accel/habanalabs: allow user to modify EDMA RL register
Tal Cohen (1):
accel/habanalabs: ignore false positive razwi
Tom Rix (1):
accel/habanalabs: remove variable gaudi_irq_name
Tomer Tayar (3):
accel/habanalabs: expose debugfs files later
accel/habanalabs: use lower QM in QM errors handling
accel/habanalabs: print qman data on error only for lower qman
Yang Li (1):
accel/habanalabs: Fix some kernel-doc comments
drivers/accel/habanalabs/common/command_buffer.c | 6 -
.../accel/habanalabs/common/command_submission.c | 61 ++--
drivers/accel/habanalabs/common/debugfs.c | 60 ++--
drivers/accel/habanalabs/common/device.c | 112 ++++---
drivers/accel/habanalabs/common/firmware_if.c | 212 ++++++++++---
drivers/accel/habanalabs/common/habanalabs.h | 77 ++---
drivers/accel/habanalabs/common/habanalabs_drv.c | 9 +-
drivers/accel/habanalabs/common/habanalabs_ioctl.c | 35 +--
drivers/accel/habanalabs/common/irq.c | 2 +-
drivers/accel/habanalabs/common/memory.c | 104 +------
drivers/accel/habanalabs/common/mmu/mmu.c | 56 +---
drivers/accel/habanalabs/common/security.c | 57 ++--
drivers/accel/habanalabs/gaudi/gaudi.c | 13 +-
drivers/accel/habanalabs/gaudi2/gaudi2.c | 334 ++++++++-------------
drivers/accel/habanalabs/gaudi2/gaudi2P.h | 2 +-
drivers/accel/habanalabs/gaudi2/gaudi2_security.c | 15 +-
drivers/accel/habanalabs/goya/goya.c | 3 -
drivers/accel/habanalabs/goya/goya_coresight.c | 9 +-
drivers/accel/habanalabs/include/common/cpucp_if.h | 22 +-
.../accel/habanalabs/include/common/hl_boot_if.h | 41 +--
.../include/gaudi2/asic_reg/gaudi2_regs.h | 11 +
.../accel/habanalabs/include/gaudi2/gaudi2_fw_if.h | 2 +-
include/uapi/drm/habanalabs_accel.h | 10 +
23 files changed, 557 insertions(+), 696 deletions(-)
More information about the dri-devel
mailing list