[Bug 33867] New: [bisected] Graphics corruption related to pageflip ioctl support in 2.6.38-rc*

bugzilla-daemon at freedesktop.org
Thu Feb 3 01:39:49 PST 2011


https://bugs.freedesktop.org/show_bug.cgi?id=33867

           Summary: [bisected] Graphics corruption related to pageflip
                    ioctl support in 2.6.38-rc*
           Product: DRI
           Version: DRI CVS
          Platform: x86-64 (AMD64)
        OS/Version: Linux (All)
            Status: NEW
          Severity: normal
          Priority: medium
         Component: DRM/Radeon
        AssignedTo: dri-devel at lists.freedesktop.org
        ReportedBy: dawitbro at sbcglobal.net


Created an attachment (id=42887)
 --> (https://bugs.freedesktop.org/attachment.cgi?id=42887)
Commits not present in 2.6.37 applied to local branch

I am troubleshooting some graphics corruption I noticed when testing the
post-2.6.37 commits from drm-airlied git (drm-fixes in this case).  I was
trying to produce a kernel as close as possible to the latest stable release
(since 2.6.38 is very early in the rc stage) with all of the newest
radeon-related features.

This is a preliminary report, and may turn out to be invalid, because the
kernel I am using is actually a local branch from v2.6.37 with only the commits
from drm-airlied relevant to my hardware individually cherry-picked.  I have
bisected the problem down to a specific commit, but if I made any errors during
the cherry-pick process then this report is useless.  I plan to confirm that
the problem is real tomorrow, by building directly from the
drm-airlied/drm-fixes tree (which was the most up-to-date tree I could find
today).

As I will report below, there is some similarity with

https://bugs.freedesktop.org/show_bug.cgi?id=33515

and that is the only reason I decided to report my findings before I am really
sure there is a problem.  Hopefully this will help Michel and other devs save
some time and trouble if my guess (that my bug is related) turns out to be
correct.
----------------------

OK, here is what I have done so far:

1.  I made a local git branch based on v2.6.37.
2.  I identified commits I wanted from drm-airlied/drm-fixes (Feb. 2, 2011).
3.  Because of GPU lockups recently cured by Alex Deucher, I picked one
particular commit first (1e644d6d, "drm/radeon/kms: re-emit full context state
for evergreen blits"); without this, testing would be pointless since my GPU
would just lock up when trying any 3D app.
4.  I then cherry-picked the rest of the commits I had chosen, in order from
oldest to newest (according to 'git log').  See attached file,
'applied-cherry-picks.txt', if interested in specifics.

Everything actually seemed to be working fine; I only happened to notice a
small glitch.  I use a locally-built game called 'prboom-plus' with my original
Ultimate DOOM WAD file to test Radeon support (in kernels, Mesa,
xf86-video-ati, etc.).  When a game begins, there is a melt-down animation
which transitions into the new game; there is a similar (I would have guessed
identical before) melting effect when the player is killed but hits the space
bar to restart the game.  In this second melting effect, after being killed,
the "melted" part of the screen was all black, a clear regression.

Before today's testing, I had been using a straight 2.6.37 kernel with the
patch from Alex I mentioned in item 3 above.  With that kernel, both melting
effects work fine.  That seems to rule out a problem with xorg-server, mesa,
libdrm, or xf86-video-ati.

I decided to try bisecting the issue.  I built kernels from the first and last
commits listed in the attached file:  1e644d6d and dca0d612.  The former was
"good" and the latter "bad".  The bisect jumped to 204663c4 and 18007401 next
-- both were "bad".  The next jump, 65705962, caused 'prboom-plus' to hang
during the second kind of melting; I was able to SSH into the machine and 'kill
-9' the game.

Since that range of commits included a series of TTM-related changes, I used
'git bisect skip' until I was offered a commit before or after the TTM series.
I did build and test those kernels (all hung on the second melt), but told git
to skip them; the ones I pretended to skip were eba67093, 95762c2b, and
702adba2.  I was then offered 147666fb, which was "bad" but did not hang during
the second kind of melt.

Continuing, the bisect took me back to 2357cbe5, which hung X so that it could
not be killed with 'kill -9'; I had to reboot via SSH.

I skipped ecf7ace9 and 68c4fa31 without building kernels.  The bisect then went
to d6ea8886 and b6724405, which hung (so I pretended to skip them).  I skipped
96726fe50 without building.  The bisect moved to 3e4ea742, which hung, and
6f34be50, which also hung.

Interestingly, f5a80209 was perfectly OK.  This meant that the commit I had
tested just before it, 6f34be50, was the first bad commit.  I had been skipping
the commits causing hangs because I expected the hang to be a temporary problem
resolved somewhere in the middle of the list; I believed this because the first
and last commits in my list did _not_ produce hanging kernels.  Of course, I was
actually building most of the kernels that I "skipped", so I knew they were
"bad" even if git did not.

To sum up, the first three cherry-picks (applied to v2.6.37) were fine:

  1e644d6dce366a7bae22484f60133b61ba322911
  drm/radeon/kms: re-emit full context state for evergreen blits

  27641c3f003e7f3b6585c01d8a788883603eb262
  drm/vblank: Add support for precise vblank timestamping.

  f5a8020903932624cf020dc72455a10a3e005087
  drm/kms/radeon: Add support for precise vblank timestamping.

The commit which introduced the hangs -- and these happened entirely
predictably and consistently: when the second kind of "melt" in 'prboom-plus'
occurs -- was

  6f34be50bd1bdd2ff3c955940e033a80d05f248a
  drm/radeon/kms: add pageflip ioctl support (v3)

After a series of several TTM-related commits, the hanging was resolved and the
second kind of melt would display in all black instead, beginning at this
commit:

  147666fb3b93b8c484f562da33a37f886ddff768
  drm/radeon: Use the ttm execbuf utilities
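
A rough model of the bisect logic described above, written as a short Python
sketch rather than anything git actually runs.  It assumes what the report
relies on: the first commit is good, the last is bad, and once the regression
appears it never returns to fully good.  Only the commits whose position in the
cherry-pick order is stated above are listed (intermediate commits are
omitted); the function name `first_bad` and the outcome labels "good", "hang"
and "black" are illustrative.

```python
from typing import Callable, Sequence

# Outcomes as reported above, oldest to newest.  Intermediate commits whose
# exact position in the list is not stated in the report are left out.
RESULTS = [
    ("1e644d6d", "good"),   # re-emit full context state for evergreen blits
    ("27641c3f", "good"),   # drm/vblank: precise vblank timestamping
    ("f5a80209", "good"),   # drm/kms/radeon: precise vblank timestamping
    ("6f34be50", "hang"),   # add pageflip ioctl support (v3)
    ("147666fb", "black"),  # Use the ttm execbuf utilities
    ("dca0d612", "black"),  # last commit in the cherry-pick list
]
OUTCOME = dict(RESULTS)
COMMITS = [sha for sha, _ in RESULTS]


def first_bad(commits: Sequence[str], is_good: Callable[[str], bool]) -> str:
    """Binary search for the first commit for which is_good() is False.

    Assumes commits[0] passes, commits[-1] fails, and that every commit
    after the first failure also fails (hypothetical helper, not git's code).
    """
    lo, hi = 0, len(commits) - 1      # lo: known to pass, hi: known to fail
    while hi - lo > 1:
        mid = (lo + hi) // 2
        if is_good(commits[mid]):
            lo = mid
        else:
            hi = mid
    return commits[hi]


# A hang counts as "bad" here, exactly like the black melt.
culprit = first_bad(COMMITS, lambda c: OUTCOME[c] == "good")
print(culprit)       # 6f34be50

# Second search, restricted to the bad range: where does the hang stop?
hang_range = COMMITS[COMMITS.index(culprit):]
hang_ends = first_bad(hang_range, lambda c: OUTCOME[c] == "hang")
print(hang_ends)     # 147666fb
```

Treating a hang as "bad" rather than skipping it is what lets the search land
on 6f34be50; the second search over the bad range locates 147666fb, where the
hang turns into the black melt.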


As I mentioned above, these results are preliminary; I will try builds directly
from drm-airlied tomorrow since we are snowed-in here in Michigan, and I don't
have to work tomorrow.

My problems with hangs sound very similar to f.d.o. bug #33515.  This may also
be related to #33418, since that user (Erdem) is using 2.6.38-rc1.


Hardware:  Radeon HD 5750 (Evergreen JUNIPER)

Software:
  kernel 2.6.37 + commits as described above
  libdrm 2.4.23
  xorg-server 1.9.3.902
  xf86-video-ati 6.13.99 (git commit 3dc28c86 of Jan. 27)
  mesa 7.11.0 (git commit 11b15c4d of Jan 30), r600g
