Bug 99289

Summary: Display freezes after "gr: TRAP ch 6"
Product: xorg
Reporter: afn2
Component: Driver/nouveau
Assignee: Nouveau Project <nouveau>
Status: NEW
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: medium
CC: pierre.morrow
Version: unspecified
Hardware: Other
OS: All
Whiteboard:
Attachments:
  dmesg showing error (flags: none)

Description afn2 2017-01-05 19:48:02 UTC
Created attachment 128783: dmesg showing error

Intermittently (a few times a day), my display freezes completely and does not recover. The kernel doesn't hang and I can still ssh in, but I can't chvt to a non-graphical VT.

Whenever this occurs, I see a message like this in dmesg:

[12535.260195] nouveau 0000:01:00.0: gr: TRAP ch 6 [007f778000 Xwayland[854]]
[12535.260211] nouveau 0000:01:00.0: gr: GPC0/TPC1/MP trap: global 00000000 [] warp 3d0001 [STACK_ERROR]
[12539.595312] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
[12539.595318] nouveau 0000:01:00.0: fifo: gr engine fault on channel 5, recovering...

Despite the "recovering..." message, it never actually recovers, and only a reboot solves the problem.
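
To count how often this happens and which process owned the faulting channel, a saved dmesg can be scanned for the "gr: TRAP" lines. The following is only a minimal sketch: the function name find_gr_traps and the regular expression are assumptions derived from the log lines quoted above, not part of nouveau or any official tool, and other kernel versions may format these messages differently.

```python
import re

# Pattern modelled on the one TRAP line quoted in this report, e.g.
# [12535.260195] nouveau 0000:01:00.0: gr: TRAP ch 6 [007f778000 Xwayland[854]]
# Assumption: other nouveau versions may print this message differently.
TRAP_RE = re.compile(
    r"\[\s*(?P<time>\d+\.\d+)\]\s+nouveau\s+(?P<pci>\S+):\s+gr:\s+TRAP ch (?P<chan>\d+)"
    r"\s+\[(?P<inst>[0-9a-fA-F]+)\s+(?P<proc>[^\[\]]+)\[(?P<pid>\d+)\]\]"
)


def find_gr_traps(dmesg_text):
    """Return one dict per 'gr: TRAP' line found in the given dmesg text."""
    traps = []
    for line in dmesg_text.splitlines():
        match = TRAP_RE.search(line)
        if match:
            traps.append({
                "time": float(match.group("time")),    # seconds since boot
                "pci": match.group("pci"),             # PCI address of the GPU
                "channel": int(match.group("chan")),   # channel named in the trap
                "process": match.group("proc"),        # owning process name
                "pid": int(match.group("pid")),        # owning process id
            })
    return traps


# The four lines quoted in this report, used as sample input.
SAMPLE = """\
[12535.260195] nouveau 0000:01:00.0: gr: TRAP ch 6 [007f778000 Xwayland[854]]
[12535.260211] nouveau 0000:01:00.0: gr: GPC0/TPC1/MP trap: global 00000000 [] warp 3d0001 [STACK_ERROR]
[12539.595312] nouveau 0000:01:00.0: fifo: SCHED_ERROR 0a [CTXSW_TIMEOUT]
[12539.595318] nouveau 0000:01:00.0: fifo: gr engine fault on channel 5, recovering...
"""

if __name__ == "__main__":
    for trap in find_gr_traps(SAMPLE):
        print(trap)
```

To run it on real output, save the log first (for example, dmesg > dmesg.txt) and pass the file contents to find_gr_traps.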

I'm using GNOME on Wayland, and I'm typically running gnome-terminal, Firefox, and/or Chromium. So far I haven't identified any specific action that triggers this failure.

My computer is a mid-2014 MacBook Pro with a GeForce GT 750M (GK107). I'm running nouveau with "nouveau.nofbaccel=1". I've tried adding "nouveau.config=NvGrUseFW=1", but the driver then complains that it cannot find /lib/firmware/nvidia/gk107/fecs_inst.bin. Is there an external firmware blob available for this card?

I recently updated my kernel and several relevant packages, and have seen no difference in behavior. I'm running Linux 4.9.0 and the latest version of Mesa from git (36b5f1d200).

See my attached dmesg output. Is there any debug flag I can enable to shed more light on the situation?

Thanks!
Tony

Comment 1 Ilia Mirkin 2017-01-05 20:00:45 UTC
To answer your immediate question about firmware, you can get blob firmware by following the instructions at

https://nouveau.freedesktop.org/wiki/VideoAcceleration/#firmware

to the letter. Unfortunately, I think Linux 4.9 wants some of these files to be renamed, but there's a patch that makes it look for the "old" names as well:

https://github.com/skeggsb/linux/commit/e137040e0d0376b404fc5155eba44ea07126e3bd.patch

It should be included in a later 4.9.x release.

However I'm only aware of the blob firmware fixing issues for some GTX 660 owners.

nofbaccel=1 is unlikely to be of much help - that disables acceleration of the fbdev device for your terminals.

Note that there are additional patches in Linux 4.10-rc1+ which are likely to improve stability, such as

https://github.com/skeggsb/linux/commit/b27add13f500469127afdf011dbcc9c649e16e54.patch

and you might also want this one, although it'll be a little annoying to apply to an upstream tree:

https://github.com/skeggsb/nouveau/commit/b3816f34944ad4824d345b98c323a30710f492d4.patch

Comment 2 afn2 2017-01-05 20:28:48 UTC
Thanks for the quick reply! That's all very informative, especially the bit about the firmware filenames. (I was wondering why the filenames in /lib/firmware/nouveau were so different from the ones in /lib/firmware/nvidia!)

I'll give those patches a shot and follow up with an update soon.
