imxdrm issue on SABRE Lite

Russell King - ARM Linux linux at armlinux.org.uk
Mon Feb 13 09:38:14 UTC 2017


On Mon, Feb 13, 2017 at 08:55:53AM +0000, Chris Wilson wrote:
> On Mon, Feb 13, 2017 at 09:05:33AM +0100, Thierry Reding wrote:
> > On Sun, Feb 12, 2017 at 12:15:46AM +0000, Russell King - ARM Linux wrote:
> > > On Sat, Feb 11, 2017 at 09:09:34PM +0000, Dan MacDonald wrote:
> > > diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
> > > index 21f992605541..46668d071d6a 100644
> > > --- a/drivers/gpu/drm/drm_atomic_helper.c
> > > +++ b/drivers/gpu/drm/drm_atomic_helper.c
> > > @@ -1201,9 +1201,8 @@ static void commit_tail(struct drm_atomic_state *state)
> > >  	else
> > >  		drm_atomic_helper_commit_tail(state);
> > >  
> > > -	drm_atomic_helper_commit_cleanup_done(state);
> > > -
> > > -	drm_atomic_state_free(state);
> > > +	if (drm_atomic_helper_commit_cleanup_done(state) == 0)
> > > +		drm_atomic_state_free(state);
> > 
> > Chris (Cc'ed) added reference counting to atomic state for v4.10, maybe
> > that already fixes the issue?
> 
> I'm not confident it will, as there is not an independent ref on the
> state for the phases, and so a forced timeout still leaves a dangling
> pointer. The above chunk goes the opposite way and leaks the state to
> avoid the invalid deref, what we need is a ref around its existence on
> the dependency queue if that is outside the lifetime of the commit.

I said as much in my email - unfortunately, Thierry cut all that context.

Right now, we oops the kernel, which causes:

(a) the death of the calling process
(b) leaking of all memory associated with the modeset

What I'm proposing for the -stable kernels is to _improve_ the situation
by eliminating part of the problem, so it's possible to get a better
idea of which bit went wrong and which outputs have failed.

Fixing it properly is likely to be very invasive, since you'll need to
add reference counting to the drm_crtc_commit structure, a pointer
to that in the drm_pending_event structure, and ensure that the
reference count gets incremented at the appropriate time.  Incrementing
the reference count in drm_atomic_helper_setup_commit() certainly isn't
the right place, that would be at the sites which queue the event, but
they are scattered amongst all the atomic modeset drivers.

For reference, here's my complete patch I posted yesterday:

 drivers/gpu/drm/drm_atomic_helper.c | 15 +++++++++------
 include/drm/drm_atomic_helper.h     |  2 +-
 2 files changed, 10 insertions(+), 7 deletions(-)

diff --git a/drivers/gpu/drm/drm_atomic_helper.c b/drivers/gpu/drm/drm_atomic_helper.c
index 21f992605541..46668d071d6a 100644
--- a/drivers/gpu/drm/drm_atomic_helper.c
+++ b/drivers/gpu/drm/drm_atomic_helper.c
@@ -1201,9 +1201,8 @@ static void commit_tail(struct drm_atomic_state *state)
 	else
 		drm_atomic_helper_commit_tail(state);
 
-	drm_atomic_helper_commit_cleanup_done(state);
-
-	drm_atomic_state_free(state);
+	if (drm_atomic_helper_commit_cleanup_done(state) == 0)
+		drm_atomic_state_free(state);
 }
 
 static void commit_work(struct work_struct *work)
@@ -1591,12 +1590,12 @@ EXPORT_SYMBOL(drm_atomic_helper_commit_hw_done);
  * This is part of the atomic helper support for nonblocking commits, see
  * drm_atomic_helper_setup_commit() for an overview.
  */
-void drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state)
+int drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state)
 {
 	struct drm_crtc *crtc;
 	struct drm_crtc_state *crtc_state;
 	struct drm_crtc_commit *commit;
-	int i;
+	int i, failed = 0;
 	long ret;
 
 	for_each_crtc_in_state(state, crtc, crtc_state, i) {
@@ -1621,15 +1620,19 @@ void drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state)
 		 * not hold a reference of its own. */
 		ret = wait_for_completion_timeout(&commit->flip_done,
 						  10*HZ);
-		if (ret == 0)
-			DRM_ERROR("[CRTC:%d:%s] flip_done timed out\n",
+		if (ret == 0) {
+			DRM_ERROR("[CRTC:%d:%s] flip_done timed out, memory leaked\n",
 				  crtc->base.id, crtc->name);
+			failed = -ETIMEDOUT;
+		}
 
 		spin_lock(&crtc->commit_lock);
 del_commit:
 		list_del(&commit->commit_entry);
 		spin_unlock(&crtc->commit_lock);
 	}
+
+	return failed;
 }
 EXPORT_SYMBOL(drm_atomic_helper_commit_cleanup_done);
 
diff --git a/include/drm/drm_atomic_helper.h b/include/drm/drm_atomic_helper.h
index 7ff92b09fd9c..ee3d642c1feb 100644
--- a/include/drm/drm_atomic_helper.h
+++ b/include/drm/drm_atomic_helper.h
@@ -88,7 +88,7 @@ int drm_atomic_helper_setup_commit(struct drm_atomic_state *state,
 				   bool nonblock);
 void drm_atomic_helper_wait_for_dependencies(struct drm_atomic_state *state);
 void drm_atomic_helper_commit_hw_done(struct drm_atomic_state *state);
-void drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state);
+int drm_atomic_helper_commit_cleanup_done(struct drm_atomic_state *state);
 
 /* implementations for legacy interfaces */
 int drm_atomic_helper_update_plane(struct drm_plane *plane,


-- 
RMK's Patch system: http://www.armlinux.org.uk/developer/patches/
FTTC broadband for 0.8mile line: currently at 9.6Mbps down 400kbps up
according to speedtest.net.


More information about the dri-devel mailing list