hang check triggered at low temperatures

Sven Schmitt Sven.Schmitt at mixed-mode.de
Mon Jun 4 06:31:53 UTC 2018


Hello,

we have discovered a subtle, temperature-dependent problem in the etnaviv driver on i.MX6.

If the chip temperature falls below 10 degrees Celsius and the display update interval rises above 200 ms, the etnaviv GPU hang check is triggered reproducibly [1].
We suspect that the runtime PM autosuspend mechanism (configured to 200 ms [2]) is the root of the problem.
Disabling CONFIG_PM makes the problem disappear.
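
The autosuspend suspicion could probably also be checked without rebuilding the kernel, by pinning the GPU to the active state through the generic runtime PM sysfs attributes (power/control, power/runtime_status, power/autosuspend_delay_ms). The following is only a minimal sketch under assumptions: the helper names readAttribute, writeAttribute and disableRuntimeSuspend are hypothetical, and the sysfs directory of the GPU device is not given here and has to be looked up on the target.

```cpp
#include <fstream>
#include <string>

// Read the first line of a sysfs-style attribute file.
// Returns an empty string if the file cannot be opened.
std::string readAttribute(const std::string &path) {
  std::ifstream in(path);
  std::string value;
  if (in)
    std::getline(in, value);
  return value;
}

// Write a value to a sysfs-style attribute file.
// Returns true only if the file could be opened and written.
bool writeAttribute(const std::string &path, const std::string &value) {
  std::ofstream out(path);
  if (!out)
    return false;
  out << value << '\n';
  out.flush();
  return out.good();
}

// Writing "on" to power/control keeps a device runtime-active;
// writing "auto" would re-enable runtime suspend.
// deviceDir is an assumption: the GPU's device directory below /sys.
bool disableRuntimeSuspend(const std::string &deviceDir) {
  return writeAttribute(deviceDir + "/power/control", "on");
}
```

Called as root with the GPU's device directory, disableRuntimeSuspend() should keep the GPU from autosuspending, and readAttribute() on power/runtime_status in the same directory should then report "active". If the hang check no longer triggers at low temperature with a 250 ms update interval, that would support the autosuspend theory.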

Our test environment:

- temperature below 10 degrees Celsius
- phyFLEX-i.MX6 Board
- kernel 4.16.3
- CONFIG_PM=y
- Qt 5.9.5 using EGLFS (no wayland or X) 
- libdrm 2.4.90
- mesalib 17.3.9

The attached minimal Qt example can trigger the bug [3].

Any ideas what's going wrong here?

Thanks for your replies and best regards

Sven

[1] https://elixir.bootlin.com/linux/v4.16.3/source/drivers/gpu/drm/etnaviv/etnaviv_gpu.c#L918
[2] https://elixir.bootlin.com/linux/v4.16.3/source/drivers/gpu/drm/etnaviv/etnaviv_gpu.c#L1825 
[3]
#ifndef MAINWINDOW_H
#define MAINWINDOW_H

#include <QLabel>
#include <QMainWindow>
#include <QTimer>

#include "ui_mainwindow.h"

namespace Ui {
class MainWindow;
}

class MainWindow : public QMainWindow {
  Q_OBJECT

public:
  explicit MainWindow(QWidget *parent = nullptr)
      : QMainWindow(parent), ui(new Ui::MainWindow) {
    ui->setupUi(this);

    blinkingLabel = new QLabel("blink", this);
    QTimer *timer = new QTimer(this);
    // 250 ms exceeds the 200 ms autosuspend delay: the GPU suspends between
    // updates and the hang check triggers at temperatures below 10°C
    timer->setInterval(250);
    // timer->setInterval(190); // GPU stays active => no GPU hang
    this->connect(timer, &QTimer::timeout, this, &MainWindow::blink);
    timer->start();
  }
  ~MainWindow() { delete ui; }

private slots:
  void blink(void) {
    if (blinkingLabel->isVisible())
      blinkingLabel->hide();
    else
      blinkingLabel->show();
  }

private:
  Ui::MainWindow *ui;
  QLabel *blinkingLabel;
};

#endif // MAINWINDOW_H

