Bug 54786

Summary: random crashes of X with failed to idle channel 0xcccc0000
Product: xorg
Component: Driver/nouveau
Version: 7.6 (2010.12)
Hardware: x86-64 (AMD64)
OS: Linux (All)
Status: RESOLVED FIXED
Severity: major
Priority: medium
Reporter: Alin M Elena <alinm.elena>
Assignee: Nouveau Project <nouveau>
QA Contact: Xorg Project Team <xorg-team>
CC: alexey.brodkin, bigbeerjr, rsalvaterra, s_j_newbury
Attachments:
Xorg log for the crash (flags: none)
nouveau dmesg log (flags: none)

Description Alin M Elena 2012-09-11 18:44:52 UTC
Created attachment 66988 [details]
Xorg log for the crash

I am using daily builds of the nouveau driver; this is the 10/09/2012 build on 64-bit openSUSE.

[alin@abbaton:~]: uname -a
Linux abbaton.ucd.ie 3.6.0-rc4-3-desktop #1 SMP PREEMPT Fri Sep 7 20:12:44 UTC 2012 (23f3e67). x86_64 x86_64 x86_64 GNU/Linux
xorg version 7.6_1
mesa 8.0.4

X segfaults at random times with the message "failed to idle channel 0xcccc0000".

I have attached both the output of dmesg | grep nouveau and the Xorg.0.log.old.

I did not manage to find a reason that triggers the failure.

Alin
Comment 1 Alin M Elena 2012-09-11 18:45:47 UTC
Created attachment 66989 [details]
nouveau dmesg log
Comment 2 Mark Einon 2013-02-15 10:53:17 UTC
I've run a bisect on Linus' master branch and narrowed the issue down to this one-line change, which isn't much to go on. It looks like the change of struct type has caused issues where the struct is used elsewhere:

commit 7707b701ebfea64afa6bfb23aa318fd687892754
Author: Marcin Slusarz <marcin.slusarz@gmail.com>
Commit: Ben Skeggs <bskeggs@redhat.com>

    drm/nv40/mpeg: fix context handling

    It slipped in thanks to typeless API.

    Signed-off-by: Marcin Slusarz <marcin.slusarz@gmail.com>
    Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

diff --git a/drivers/gpu/drm/nouveau/core/engine/mpeg/nv40.c b/drivers/gpu/drm/n
index 1241857..f7c581a 100644
--- a/drivers/gpu/drm/nouveau/core/engine/mpeg/nv40.c
+++ b/drivers/gpu/drm/nouveau/core/engine/mpeg/nv40.c
@@ -38,7 +38,7 @@ struct nv40_mpeg_priv {
 };

 struct nv40_mpeg_chan {
- struct nouveau_mpeg base;
+ struct nouveau_mpeg_chan base;
 };
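
For readers unfamiliar with the "typeless API" remark in the commit message, here is a minimal sketch of how such a mistake can compile cleanly. All struct and function names below are hypothetical stand-ins, not the real nouveau definitions, and this is only an assumption about the general class of problem, not an analysis of the actual driver code.

```c
#include <assert.h>
#include <stddef.h>
#include <stdlib.h>

/* Hypothetical stand-ins for illustration; NOT the real nouveau structs. */
struct engine_base { int engine_id; char engine_state[64]; };
struct chan_base   { int chan_id; };

struct chan_wrong { struct engine_base base; };  /* mistake: engine base type */
struct chan_right { struct chan_base base; };    /* intended: channel base type */

/* "Typeless" constructor: it takes a size and returns void *, so the
 * compiler cannot tell which base type the caller meant to embed. */
static void *object_create(size_t size)
{
    void *obj = calloc(1, size);
    assert(obj != NULL);
    return obj;
}

/* Generic code that assumes every channel object starts with chan_base. */
static int generic_chan_id(const void *obj)
{
    return ((const struct chan_base *)obj)->chan_id;
}

/* Returns what the generic code reads when the wrong base was embedded. */
static int demo_wrong_base(void)
{
    struct chan_wrong *c = object_create(sizeof(*c));
    int seen;

    c->base.engine_id = 42;       /* author believes this is engine data */
    seen = generic_chan_id(c);    /* generic code interprets it as chan_id */
    free(c);
    return seen;
}
```

Because everything passes through void pointers and sizes, neither the oversized allocation nor the misinterpreted first field produces a compiler diagnostic; only the one-line type change makes the layout match what the generic code expects.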
Comment 3 Alin M Elena 2013-02-15 10:58:40 UTC
Thanks for fixing... Unfortunately the bug seems to be gone here, so I am afraid there is no way for me to test it now.

Alin
Comment 4 Rui Salvaterra 2013-02-21 09:47:58 UTC
This also happens on my NVAC/MCP79 running Ubuntu 12.10 with the xorg-edgers PPA and Linux 3.8. Unity fails to start (only the background image and the mouse cursor appear) and, after a while, dmesg shows a couple of "failed to idle channel 0xcccc0000" messages.
Comment 5 Steven Newbury 2013-02-22 17:20:01 UTC
Still happens here.  Both with linus/master and nouveau-drm-next.  I'm going to try reverting the above mentioned commit, although it's not obvious to me how it would be the cause...
Comment 6 Steven Newbury 2013-02-22 18:03:45 UTC
(In reply to comment #5)
> Still happens here.  Both with linus/master and nouveau-drm-next.  I'm going
> to try reverting the above mentioned commit, although it's not obvious to me
> how it would be the cause...
As expected, reverting 7707b701ebfea64afa6bfb23aa318fd687892754 made no difference. :(
Comment 7 Steven Newbury 2013-02-22 18:05:36 UTC
Hardware is: "NVIDIA Corporation NV35 [GeForce FX 5900] (rev a1)"
Comment 8 bigbeerjr 2013-02-26 01:40:58 UTC
The same random X crashes with the same "failed to idle channel 0xcccc0000" messages are seen on an NVC0-family card, NVIDIA Corporation GF119 [Quadro NVS 4200M], also on kernel 3.8.

The only way to have a stable system is to disable accel on boot.
Comment 9 bigbeerjr 2013-03-18 16:27:14 UTC
Still present w/ fedora kernel 3.8.3-201.fc18.x86_64
Comment 10 Rui Salvaterra 2013-03-18 16:33:23 UTC
Works for me since Linux 3.9-rc3. Thanks!
Comment 11 Jérôme Carretero 2013-03-31 20:31:54 UTC
The last comment made me update my kernel (I had hit this on 3.8.4), but I also hit it on 3.9-rc4; I have been trying the latest nouveau patches since then, just in case.
Hardware: GF106GL (Quadro 2000) (nvc0).
I wish I could reproduce it simply.
Comment 12 Ilia Mirkin 2013-08-30 21:21:54 UTC
"failed to idle channel xxx" means "the gpu appears to not have hit a sync point". This can happen for a variety of reasons, differing for different hardware. It has a special propensity for showing up on a resume where operations were going on while the gpu was suspending, but can also happen in other circumstances.

I'm going to close this bug because the issue is reportedly fixed for the original bug reporter, and every other commenter has vastly different hardware. I would advise each of you to update your software (kernel, xf86-video-nouveau, mesa) to the latest, and if problems persist, file your own separate bugs, indicating what hardware you have and the steps to reproduce, and including logs. See http://nouveau.freedesktop.org/wiki/Bugs/
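
As a rough illustration of the "has not hit a sync point" condition described in this comment, here is a toy model of an idle wait. It is a sketch under stated assumptions: `fake_channel`, its fields, and `channel_idle` are hypothetical names, the "GPU" is simulated by a counter, and sequence-number wraparound is ignored. It is not how nouveau implements the wait.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical model of a channel; NOT nouveau code. */
struct fake_channel {
    uint32_t emitted_seq;        /* last sync point the CPU queued */
    uint32_t reached_seq;        /* last sync point the fake GPU reached */
    uint32_t progress_per_poll;  /* simulated GPU progress; 0 models a hang */
};

/* Poll until the fake GPU reaches the last emitted sync point or the
 * poll budget runs out. Returning false corresponds to the situation a
 * driver would report as "failed to idle". */
static bool channel_idle(struct fake_channel *ch, unsigned max_polls)
{
    unsigned i;

    assert(ch != NULL);
    for (i = 0; i < max_polls; i++) {
        if (ch->reached_seq >= ch->emitted_seq)
            return true;
        ch->reached_seq += ch->progress_per_poll;  /* simulate GPU progress */
    }
    return ch->reached_seq >= ch->emitted_seq;
}
```

In this model, a channel with `progress_per_poll` set to 0 never catches up, so the wait times out regardless of why the progress stopped. That matches the point above: the message names a symptom, and the underlying cause can differ from one piece of hardware to another.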
Comment 13 Jakub 2014-05-02 07:53:58 UTC
Guys, this issue seems to be present in Trusty Tahr now (https://bugs.launchpad.net/ubuntu/+source/linux-lts-quantal/+bug/1097178). This is a critical bug for tons of people. I have to restart the system 20 times a day. Unfortunately, I'm having a similar issue with the nvidia driver, so I'm kind of out of options.
Comment 14 Ilia Mirkin 2014-05-02 16:10:46 UTC
All the original reasons for closing the bug still apply. Re-closing. Please read comment 12 in its entirety.
