Bug 58659 - With latest kernel 3.8-rc1, compiz crashes after boot
Status: RESOLVED FIXED
Alias: None
Product: DRI
Classification: Unclassified
Component: DRM/Radeon
Version: DRI git
Hardware: Other All
Importance: medium normal
Assignee: Default DRI bug account
Reported: 2012-12-22 20:24 UTC by Bryan Quigley
Modified: 2013-01-28 17:21 UTC

Attachments
syslog (15.47 KB, text/plain)
2012-12-22 20:24 UTC, Bryan Quigley
tail -f output of Xorg, from starting big picture (2.92 KB, text/plain)
2013-01-14 19:29 UTC, Bryan Quigley
possible fix (2.76 KB, patch)
2013-01-15 14:05 UTC, Alex Deucher
Exclude system placement (1.96 KB, patch)
2013-01-16 22:32 UTC, Jerome Glisse

Description Bryan Quigley 2012-12-22 20:24:02 UTC
Created attachment 71998 [details]
syslog

The bug was introduced in drm-next between 12/06 and 12/11, according to Ubuntu's kernel builds here: http://kernel.ubuntu.com/~kernel-ppa/mainline/drm-next/2012-12-11-raring/

Error messages from syslog attached.

01:00.0 VGA compatible controller: Advanced Micro Devices [AMD] nee ATI RV670 [Radeon HD 3870]

I am using Xorg-Edgers mesa/Xorg, but the crash also appears when I go back to stable mesa/Xorg.
Comment 2 Bryan Quigley 2012-12-23 17:44:02 UTC
My kernel build does have that patch.

Unfortunately, I am having trouble bisecting because something that appears unrelated makes my system fail to boot.  That boot failure has already been fixed in more recent builds, though :/.
Comment 3 Alex Deucher 2012-12-23 18:17:57 UTC
Looks like this is probably a duplicate of this issue:
http://lists.freedesktop.org/archives/dri-devel/2012-December/032547.html
See if reverting commit 2d6cc729 fixes the problem.
Comment 4 Bryan Quigley 2012-12-24 04:54:35 UTC
Reverting patch 2d6cc729 does indeed fix the problem.
Comment 5 Bryan Quigley 2013-01-14 19:03:01 UTC
In 3.8-rc3 this crash no longer happens.  However, there is new graphical corruption when launching Steam's Big Picture.  This corruption stays until logoff.

It does not affect openbox/lxde, only compiz/unity, which is why I thought it might be related to this bug or its fix. It does not happen in 3.7.
Comment 6 Bryan Quigley 2013-01-14 19:22:52 UTC
https://docs.google.com/file/d/0B9PdLrdrtm1wQVRma3JNdzFNa0k/edit
Video of Steam creating the issue, and then the issue disappearing (this doesn't usually happen)

https://docs.google.com/file/d/0B9PdLrdrtm1wS2RyV2g1SmRSbnc/edit
What usually happens to the rest of the desktop.
Comment 7 Bryan Quigley 2013-01-14 19:29:10 UTC
Created attachment 73018 [details]
tail -f output of Xorg, from starting big picture

Syslog doesn't report anything when this happens; Xorg.0.log shows the output in the attachment (captured using tail -f).
Comment 8 Alex Deucher 2013-01-15 14:05:58 UTC
Created attachment 73088 [details] [review]
possible fix

Does the attached kernel patch help?
Comment 9 Alex Deucher 2013-01-15 17:39:23 UTC
Does reverting the following commit fix the corruption issue?

commit d025e9e2b890db679f1246037bf65bd4be512627
Author: Jerome Glisse <jglisse@redhat.com>
Date:   Thu Nov 29 10:35:41 2012 -0500

    drm/radeon: do not move bo to different placement at each cs

    The bo creation placement is where the bo will be. Instead of trying
    to move bo at each command stream let this work to another worker
    thread that will use more advance heuristic.

    agd5f: remove leftover unused variable

    Signed-off-by: Jerome Glisse <jglisse@redhat.com>
    Reviewed-by: Alex Deucher <alexander.deucher@amd.com>
Comment 10 Bryan Quigley 2013-01-15 20:46:38 UTC
Reverting commit d025e9e2b890db679f1246037bf65bd4be512627 does indeed fix the Big Picture issue.  Will test the patch now..
Comment 11 Alex Deucher 2013-01-15 20:57:22 UTC
Same issue as:
https://bugzilla.kernel.org/show_bug.cgi?id=52491
Comment 12 Bryan Quigley 2013-01-15 22:41:50 UTC
Adding the patch from comment #8 does not help.
Comment 13 Jerome Glisse 2013-01-16 22:32:08 UTC
Created attachment 73168 [details] [review]
Exclude system placement

Does applying this patch without reverting anything fix the issue?
Comment 14 Bryan Quigley 2013-01-17 00:03:32 UTC
I tested with just this second patch and it did not help.

Do you want me to test with both patches applied?
Comment 15 Jerome Glisse 2013-01-17 00:22:20 UTC
Are you sure you are using the module with the patch? Did you rebuild your initrd and all?

Other users who pointed to the same commit had the issue fixed by this patch. A better version of this patch is also at:
http://people.freedesktop.org/~glisse/0001-drm-radeon-exclude-system-placement-when-validating-.patch
Comment 16 Bryan Quigley 2013-01-17 00:31:17 UTC
I'm building with https://wiki.ubuntu.com/KernelTeam/GitKernelBuild#Kernel_Build_and_Installation
I'm running step 9 after doing git apply patch and then git commit.

I'll try the latest patch.
Comment 17 Jerome Glisse 2013-01-17 00:37:35 UTC
You also need step 12 of that page, and please add:

printk(KERN_INFO "TITITOTOTUTUPOPO\n");

to

radeon_device.c line 992 after DRM_INFO("initializing ke ....

Then, when testing, dmesg | grep TITITOTO should tell you whether you are using the patched module.
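
The marker check above can be wrapped in a small shell helper so the result is explicit either way. This is only a sketch: the function names log_has_marker and report_marker and the messages they print are hypothetical, not part of the kernel or of the GitKernelBuild instructions. It assumes the printk marker was added as described, and a missing marker only suggests (it does not prove) that the unpatched module is still loaded.

```shell
#!/bin/sh
# Hypothetical helper (not from the thread): succeeds if the marker string
# given as $1 appears anywhere in the log text read from standard input.
log_has_marker() {
    grep -q -- "$1"
}

# Print which case applies for the log stream on standard input.
report_marker() {
    if log_has_marker "$1"; then
        echo "marker found: patched module is loaded"
    else
        echo "marker missing: the old module is probably still in use"
    fi
}

# Typical use on the test machine (reading the kernel log may need root):
#   dmesg | report_marker TITITOTO
```

Reading the log from standard input keeps the helper usable on a saved kern.log as well as on live dmesg output.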
Comment 18 Bryan Quigley 2013-01-17 05:53:50 UTC
I didn't need to do step 12 to get the TITITOTO message printed.  I did what I've been doing all along.  (I'm also trying to update the GitKernelBuild page as I go)  If the message was printed that means it was loaded correctly, right?

The better version of the patch (drm-radeon-exclude-system-placement-when-validating) was tested and it still didn't work.
Comment 19 Jerome Glisse 2013-01-17 15:45:54 UTC
Updated patch

http://people.freedesktop.org/~glisse/0001-drm-radeon-exclude-system-placement-when-validating-.patch


Still weird that you point to the same commit and the first patch did not solve it.
Comment 20 Bryan Quigley 2013-01-17 19:20:04 UTC
This last patch (0001-drm-radeon-exclude-system-placement-when-validating) creates a full Xorg freeze:
From xorg.log:
[   207.780] (WW) RADEON(0): flip queue failed: Invalid argument
[   207.781] (WW) RADEON(0): Page flip failed: Invalid argument
Repeated many times

From kern.log:
[  207.595082] radeon 0000:01:00.0: efaa7000 pin failed
[  207.595096] [drm:radeon_crtc_page_flip] *ERROR* failed to pin new rbo buffer before flip
[  207.595434] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -22!
[  207.601745] [drm:radeon_cs_ioctl] *ERROR* Failed to parse relocation -22!
[  207.602094] radeon 0000:01:00.0: efaa7000 pin failed
Repeated many times


I'm going to git pull to latest and try building again... Does it depend on some other patch?
Comment 21 Florian Mickler 2013-01-26 10:51:31 UTC
A patch referencing this bug report has been merged in Linux v3.8-rc5:

commit 20707874fd4fd37e09513f508e642fa8bd06365a
Author: Alex Deucher <alexander.deucher@amd.com>
Date:   Thu Jan 17 13:10:50 2013 -0500

    Revert "drm/radeon: do not move bo to different placement at each cs"
Comment 22 Jerome Glisse 2013-01-28 17:21:23 UTC
Revert merged

