Created attachment 63356: Xorg.0.log

This regression appeared in early May on drm-next, somewhere between 4f256e8 and d3029b4, and is still present in 3.5-rc3 (and in current drm-next). On-screen content is smeared out vertically. The desktop background does not appear to be corrupted. Turning off "EXABitmaps" reduces the corruption.

I haven't done a git bisect, only a bisect by downloading prebuilt kernels from http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-next/. v3.4-rc6-295-g4f256e8 from May 8th was good and v3.4-rc6-315-gd3029b4 from May 10th was bad. Unfortunately the build from May 9th has been deleted in the meantime, so I cannot narrow it down further this way.

So the commits in question should be:

d3029b4 drm/radeon/kms: fix warning on 32-bit in atomic fence printing
f2e3922 drm/radeon: make the ib an inline object
f237750 drm/radeon: remove r600 blit mutex v2
68470ae drm/radeon: move the semaphore from the fence into the ib
7c0d409 drm/radeon: immediately free ttm-move semaphore
c507f7e drm/radeon: rip out the ib pool
a8c0594 drm/radeon: simplify semaphore handling v2
c3b7fe8 drm/radeon: multiple ring allocator v3
0085c950 drm/radeon: use one wait queue for all rings add fence_wait_any v2
557017a drm/radeon: define new SA interface v3
2e0d991 drm/radeon: make sa bo a stand alone object
e6661a9 drm/radeon: keep start and end offset in the SA
711a972 drm/radeon: add sub allocator debugfs file
a651c55 drm/radeon: add proper locking to the SA v3
dd8bea2 drm/radeon: use inline functions to calc sa_bo addr
8a47cc9 drm/radeon: rework locking ring emission mutex in fence deadlock detection v2
3b7a2b2 drm/radeon: rework fence handling, drop fence list v7
bb63556 drm/radeon: convert fence to uint64_t v4
d6999bc drm/radeon: replace the per ring mutex with a global one
133f4cb drm/radeon: fix possible lack of synchronization btw ttm and other ring

01:00.0 VGA compatible controller [0300]: Advanced Micro Devices [AMD] nee ATI Radeon Mobility X700 (PCIE) [1002:5653]
Created attachment 63357: dmesg output
Created attachment 63359: screenshot (no xorg.conf options)
Can you try to bisect this using git bisect and find the first bad commit?
Sorry, I don't know when I will have time to do that. I'll try harder if other people can confirm the bug too. Maybe the right developer can make an educated guess as to whether it's limited to this card.
Hi guys, could this be related to https://bugs.freedesktop.org/show_bug.cgi?id=54129? I ended up in the same area of the git log.
Also, can you test whether booting with radeon.no_wb=1 fixes the issue?
Thanks, will test this later. BTW I already tried http://people.freedesktop.org/~glisse/0001-drm-radeon-extra-type-safe-for-fence-emission.patch which came up on the dri-devel list, but that did not fix it.
No, booting with radeon.no_wb=1 didn't help.
Created attachment 66942: backport of Christian's patch

I tried backporting Christian's patch from https://bugs.freedesktop.org/show_bug.cgi?id=54129#c11 but it did not help either. I suppose the following /sys/kernel/debug/dri/0/radeon_fence_info output indicates that the patch took effect, since the emitted numbers are above 0x100000000LL?

--- ring 0 ---
Last signaled fence 0x000000020000149f
Last emitted 0x0000000100001a9a

--- ring 0 ---
Last signaled fence 0x000000020000149f
Last emitted 0x0000000100002041

--- ring 0 ---
Last signaled fence 0x000000020000149f
Last emitted 0x000000010000294a
Created attachment 66986: backport of Christian's v2 patch

I tried backporting the v2 patch from http://lists.freedesktop.org/archives/dri-devel/2012-September/027608.html to kernel 3.5.2 (see attached), but it did not help either. Maybe my card has another issue?

Output from /sys/kernel/debug/dri/0/radeon_fence_info

--- ring 0 ---
Last signaled fence 0x00000000deadbeef
Last emitted 0x0000000000000670

--- ring 0 ---
Last signaled fence 0x00000000deadbeef
Last emitted 0x0000000000000c44
(In reply to comment #10)
> Output from /sys/kernel/debug/dri/0/radeon_fence_info
>
> --- ring 0 ---
> Last signaled fence 0x00000000deadbeef
> Last emitted 0x0000000000000670
>
> --- ring 0 ---
> Last signaled fence 0x00000000deadbeef
> Last emitted 0x0000000000000c44

WTF?

Well, that's very interesting information you've got for us here, thanks a lot. "deadbeef" is a pattern we usually use for ring and IB tests, and I have no idea how it ended up as the last signaled fence value.

Could you try Jerome's debugging patch (http://people.freedesktop.org/~glisse/0001-debug-fence-emission-reception.patch) and attach the resulting output?

Thx,
Christian.
Created attachment 67047: Possible fix
Please give the attached V3 version of the patch a try. It adds the last emitted fence as an upper limit, so it should be able to handle even "deadbeef" values.

Cheers,
Christian.
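For readers following along, the idea of using the last emitted fence as an upper limit can be sketched in plain C. This is only an illustration under stated assumptions, not the attached patch: `extend_fence_seq` and its parameter names are hypothetical, and the real driver code also handles locking, retries and waking up waiters. The sketch widens a 32-bit value read back from the hardware to 64 bits, and rejects any candidate that is not newer than the last signaled value or that lies beyond the last emitted one.

```c
#include <stdint.h>

/*
 * Hypothetical helper, not the kernel's function: extend the 32-bit fence
 * value read from hardware (hw_seq) to 64 bits and validate it.
 *
 * last_seq     - last 64-bit sequence number known to be signaled
 * last_emitted - last 64-bit sequence number handed to the hardware
 *
 * Returns the new last-signaled value, or last_seq unchanged when the
 * candidate is stale or exceeds last_emitted (for example a stray
 * 0xdeadbeef test pattern).
 */
static uint64_t extend_fence_seq(uint32_t hw_seq, uint64_t last_seq,
                                 uint64_t last_emitted)
{
    /* First guess: reuse the upper 32 bits of the last signaled value. */
    uint64_t seq = (uint64_t)hw_seq | (last_seq & 0xffffffff00000000ULL);

    if (seq < last_seq) {
        /* The low word may have wrapped; take the upper bits from the
         * last emitted value instead. */
        seq = (uint64_t)hw_seq | (last_emitted & 0xffffffff00000000ULL);
    }

    /* Accept only values in the window (last_seq, last_emitted]. */
    if (seq <= last_seq || seq > last_emitted)
        return last_seq;

    return seq;
}
```

With the numbers from the earlier output, a hardware value of 0xdeadbeef against a last emitted value of 0x670 falls outside the window and is ignored, so the signaled counter can no longer run ahead of what was actually emitted.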
Yes, v3 works! I applied it to 3.5.2 by replacing rdev->fence_drv[ring].sync_seq[ring] with rdev->fence_drv[ring].seq and there is no more corruption. The values in /sys/kernel/debug/dri/0/radeon_fence_info are now in sync, or off by one:

--- ring 0 ---
Last signaled fence 0x0000000000002651
Last emitted 0x0000000000002652

--- ring 0 ---
Last signaled fence 0x0000000000002703
Last emitted 0x0000000000002704

Do you still want me to run the debug patch? It seems you are not sure about the 0xdeadbeef value, and there could be other bugs?
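The three radeon_fence_info dumps in this report differ in one checkable property: in the broken ones the signaled value is larger than the emitted value, in the working one it is not. A small C sketch of that check follows. It assumes the record layout is exactly as pasted in this thread ("Last signaled fence 0x..." followed by "Last emitted 0x..."); `parse_fence_record` and `fence_record_is_sane` are hypothetical names, and the debugfs format may differ between kernel versions.

```c
#include <inttypes.h>
#include <stdint.h>
#include <stdio.h>
#include <string.h>

/*
 * Hypothetical helper: pull the two hex values out of one
 * "--- ring N ---" record, assuming the text layout shown in this thread.
 * Returns 0 on success, -1 if either field is missing or malformed.
 */
static int parse_fence_record(const char *text, uint64_t *signaled,
                              uint64_t *emitted)
{
    static const char sig_key[] = "Last signaled fence";
    static const char emit_key[] = "Last emitted";
    const char *s = strstr(text, sig_key);
    const char *e = strstr(text, emit_key);

    if (s == NULL || e == NULL)
        return -1;
    /* %x conversions skip leading whitespace and accept a 0x prefix. */
    if (sscanf(s + sizeof(sig_key) - 1, "%" SCNx64, signaled) != 1)
        return -1;
    if (sscanf(e + sizeof(emit_key) - 1, "%" SCNx64, emitted) != 1)
        return -1;
    return 0;
}

/* A fence cannot have signaled before it was emitted. */
static int fence_record_is_sane(uint64_t signaled, uint64_t emitted)
{
    return signaled <= emitted;
}
```

Applied to the records above, the pair 0x2651 / 0x2652 passes, while 0xdeadbeef / 0x670 and 0x20000149f / 0x100001a9a both fail.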
Applied in 3.5.5.
Great, sounds like we can close the bug now.
A patch referencing this bug report has been merged in Linux v3.6-rc6:

commit f492c171a38d77fc13a8998a0721f2da50835224
Author: Christian König <deathsimple@vodafone.de>
Date: Thu Sep 13 10:33:47 2012 +0200

    drm/radeon: make 64bit fences more robust v3