Summary: | [nvc1/Quadro1000M|nvc3/Quadro2000M] graphic garbage/corruption/noise on resume | ||
---|---|---|---|
Product: | xorg | Reporter: | michael.weirauch |
Component: | Driver/nouveau | Assignee: | Nouveau Project <nouveau> |
Status: | RESOLVED FIXED | QA Contact: | Xorg Project Team <xorg-team> |
Severity: | major | ||
Priority: | high | CC: | andrew, anssi, bogdan_radulescu99, cdep.illabout+freedesktop, doug.a.brunner, freedesktop, gotobox, jbeh, jglotzer, oschwald, phomes, rcoe, renault, rolf.offermanns, rubyjedi, tcallawa, tomi.orava |
Version: | git | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
Description
michael.weirauch
2013-01-09 14:36:39 UTC
Created attachment 72726 [details]
gdm 3.8.0-rc2 resume garbage/noise nvc3
Issue still present on 3.8.0_rc3 nouveau as of 2013-01-17 and updated drm:
S | Name                          | Type    | Version                          | Arch
--+-------------------------------+---------+----------------------------------+-------
i | Mesa                          | package | 9.0.1-202.7                      | x86_64
i | kernel                        | package | 3.8.0_rc3_1_desktop_nouveau01+-7 | x86_64
i | libdrm_nouveau2               | package | 2.4.41-105.1                     | x86_64
i | xorg-x11-driver-video-nouveau | package | 1.0.6-58.2                       | x86_64
i | xorg-x11-server               | package | 7.6_1.13.1-218.5                 | x86_64
Does anybody have an idea about this issue, or some hints on what I should try?
By the way, this is with the ThinkPad running closed in the dock and an external monitor attached via DVI to DP-3. Opening the lid does not turn on the laptop panel and also disables the external signal to DP-3. That is another story, I think.
$ xrandr | grep "connect"
LVDS-1 unknown connection (normal left inverted right x axis y axis)
VGA-1 disconnected (normal left inverted right x axis y axis)
DP-1 disconnected (normal left inverted right x axis y axis)
DP-2 disconnected (normal left inverted right x axis y axis)
DP-3 connected 1920x1200+0+0 (normal left inverted right x axis y axis) 518mm x 324mm
Created attachment 74037 [details]
dmesg 3.8.0-rc5 resume garbage/noise nvc3
Still the same on 3.8.0-rc5.
dmesg excerpt:
[ 124.859583] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 3 [0x007fd18000 gnome-shell[1472]]
[ 124.859603] nouveau E[ PGRAPH][0000:01:00.0] GPC0/TPC2/MP: 0x001beff2 0x0000000f
[ 124.860408] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 3 [0x007fd18000 gnome-shell[1472]]
[ 124.860414] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa204021e
[ 125.169451] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 3 [0x007fd18000 gnome-shell[1472]]
[ 125.169460] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa204021e
[ 125.169536] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 3 [0x007fd18000 gnome-shell[1472]]
[ 125.169544] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa204021e
Components:
Name                          | Type    | Version
------------------------------+---------+-----------------------------------
Mesa                          | package | 9.0.2-210.1
kernel                        | package | 3.8.0_rc5_1_desktop_nouveau01+-15
libdrm_nouveau2               | package | 2.4.41-105.1
xorg-x11-driver-video-nouveau | package | 1.0.6+git@2013-01-29
xorg-x11-server               | package | 7.6_1.13.2-223.1
*** Bug 59858 has been marked as a duplicate of this bug. ***
Created attachment 74325 [details]
dmesg 3.8.0-rc5 resume garbage/noise nvc3
Regular report with updated companion components:
* Still same garbage as before. System luckily didn't lock up when killing X.
* Can use system after restarting X.
* Garbage/Noise switched to white after "Tab"ing and moving mouse a bit on gdm screen.
dmesg excerpt:
[ 197.939969] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 3 [0x007fd18000 gnome-shell[1476]]
[ 197.939992] nouveau E[ PGRAPH][0000:01:00.0] GPC0/TPC1/MP: 0x001beff2 0x0000000f
[ 197.940851] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 3 [0x007fd18000 gnome-shell[1476]]
[ 197.940858] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa204021e
[ 197.952710] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 3 [0x007fd18000 gnome-shell[1476]]
[ 197.952722] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa204021e
[ 197.952805] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 3 [0x007fd18000 gnome-shell[1476]]
[ 197.952816] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa204021e
Components:
Name                          | Type    | Version
------------------------------+---------+-----------------------------------
Mesa                          | package | 9.0.98.82-221.1
kernel                        | package | 3.8.0_rc5_1_desktop_nouveau01+-15
libdrm_nouveau2               | package | 2.4.42-109.1
xorg-x11-driver-video-nouveau | package | 1.0.6+git@2013-02-07
xorg-x11-server               | package | 7.6_1.13.99.901-226.1
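Since the dmesg excerpts in this report consist of the same PGRAPH messages repeated many times, a small tally can make different logs easier to compare. The following is only a sketch: it assumes the line format seen in the excerpts above, and the function name `summarize_pgraph` is made up for illustration. It makes no claim about what the hex value after SHADER means.

```python
import re
from collections import Counter

# Matches the two message kinds seen in the excerpts above, e.g.
#   nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 3 [0x007fd18000 gnome-shell[1472]]
#   nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa204021e
# Other PGRAPH lines (such as the GPC0/TPC2/MP one) are ignored.
PGRAPH_RE = re.compile(
    r"nouveau E\[\s*PGRAPH\]\[[0-9a-f:.]+\]\s+"
    r"(?:(?P<trap>TRAP)\b|SHADER\s+(?P<code>0x[0-9a-fA-F]+))"
)


def summarize_pgraph(lines):
    """Count TRAP reports and tally the hex value that follows SHADER."""
    traps = 0
    shader_codes = Counter()
    for line in lines:
        match = PGRAPH_RE.search(line)
        if match is None:
            continue
        if match.group("trap"):
            traps += 1
        else:
            shader_codes[match.group("code")] += 1
    return traps, shader_codes


if __name__ == "__main__":
    # Sample lines copied from the excerpt in this report.
    sample = [
        "[ 197.939969] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 3 [0x007fd18000 gnome-shell[1476]]",
        "[ 197.939992] nouveau E[ PGRAPH][0000:01:00.0] GPC0/TPC1/MP: 0x001beff2 0x0000000f",
        "[ 197.940851] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 3 [0x007fd18000 gnome-shell[1476]]",
        "[ 197.940858] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa204021e",
    ]
    traps, codes = summarize_pgraph(sample)
    print("TRAP lines:", traps)
    for code, count in codes.most_common():
        print(code, count)
```

To run it on a real log, pass the lines of a saved dmesg file (for example one of the attachments) instead of the sample list.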
For what it is worth, I can reproduce this as well on Fedora 18, ThinkPad W520 4270CTO - NVIDIA Corporation GF108GLM [Quadro 1000M]

Still present on nouveau 3.8.0-rc7 as of 2013-02-13.
Name                          | Type    | Version
------------------------------+---------+----------------------------------
Mesa                          | package | 9.0.98.83-222.1
kernel                        | package | 3.8.0_rc7_1_desktop_nouveau01+-19
libdrm_nouveau2               | package | 2.4.42-109.1
xorg-x11-driver-video-nouveau | package | 1.0.6-58.5
xorg-x11-server               | package | 7.6_1.13.99.901-226.2

I have switched to using the proprietary nvidia drivers as this issue was too debilitating for daily use. I can confirm:
1 - Both suspend and resume are now flawless.
2 - The issue I had been having with gnome-shell repeatedly crashing has now apparently stopped.
Sorry to be the bearer of bad news.
Dell Latitude 6530
kernel-3.7.6-201.fc18.x86_64
xorg-x11-drv-nouveau-1.0.6-1.fc18.x86_64
mesa-dri-drivers-9.0.1-4.fc18.i686

I am experiencing the same problem on my w520 with Nvidia GF106 [Quadro 2000M]. I get the same symptoms / messages. I noticed that suspend / resume works when using only libdrm-nouveau1a (without libdrm-nouveau2); however, the mouse cursor gets lost on resume. With libdrm-nouveau2, I get the problems. Maybe the problem is somehow related to dri2?

Michael Weirauch, I noticed your witty comment on bug #50121 and thought that I should report my debug findings on issues like these. I have a comparable bug with my 7300GT (NV4B) at bug #23223. The card suspends fine, but resumes with a lot of garbage.
- Although both cards are not the same, in fact the differences are probably huge.
- Although I'm not a developer or expert in any of this.
I'm just sharing my knowledge here so you could hopefully aid the developers in doing their magic.
First:
- Marcin Slusarz mentions a script limiter in bug #23223 comment 18. Maybe you can try it. I report my results in comment 19 of that same bug. Please note there is an off-by-one error mentioned in comment 20 that you should take into account if you start to use it. But it works fine. This method will probably help you, as "nouveau.config=DEVINIT=NvForcePost=1" also gave garbage in your case, which means that these scripts are doing something wrong or unexpected/unknown. Do this with the latest git kernel plus the latest nouveau tree.
Second:
- Comment 8 in this bug mentioned that switching to the proprietary driver fixed his issue. This is not bad news per se, as it gives hope of gaining more intel by doing an mmiotrace of a working state: http://nouveau.freedesktop.org/wiki/MmioTrace Also check bug #23223 for my findings while doing this. It's best not to use X or Wayland, but only to enable udev. You have to run mmiotrace across a suspend/resume cycle; I think doing it without X is the best way. Try to compress the resulting file with 'xz --best'. I did that, and was able to upload it to the bug itself. Do this with the latest git kernel plus the latest nouveau tree.
Third and final:
- I made a VBIOS dump to aid in debugging, which was also requested by Marcin Slusarz. https://bugs.freedesktop.org/show_bug.cgi?id=23223#c14 You have to use a v3.6(.9) kernel for the mechanism I used, since that piece of infrastructure was not ported to later kernel versions during the big rework in v3.7. It is also said that you can retrieve a VBIOS from an mmiotrace, but I'm not sure, so it might be a good idea to post it separately.
The above procedures should keep you occupied on a nice rainy Sunday afternoon. If you are going to do this, please mention it on IRC. Someone might get interested as more information becomes available.

Suspend/resume is working on my w520 with NV30 using kernel 3.4.32! After trying many different vanilla kernel versions (3.2.x/3.4.x/3.6.x/3.7.x/3.8.x) I finally found out that with 3.4.32 suspend/resume works flawlessly!
As a side note: from kernel versions 3.2.x up to 3.4.32 there is a dedicated nv30x.c code file (along with the nv40 one) in the kernel. Starting with 3.7 the code file for nv30 disappeared, while the nv40 one is still present. This is somewhat mysterious to me, as the nv30 family is a bit different from nv40. Does anybody have a clue about this?

same problem here, really annoying.
NVC1 (GT540M)
kernel 3.7.7-1-ARCH
mesa 9.0.2-1
nouveau-dri 9.0.2-1
xf86-video-nouveau 1.0.6-1
xorg-server-1.13.2-1

Created attachment 75129 [details] [review]
fix suspend bug in nvc0 fence implementation
Guys, try this patch by Maarten Lankhorst.

Marcin, do you think the patch could also help on nv30?

GF106 is nvc3, not nv30 - no idea where you got that from... so: no, it won't help for nv30, and yes, it will help for nvc3

My post #11 is about my NV30 - so I thought that it may help too. What can I do about my NV30? Do you have some suggestions / recommendations?

You cannot have an nv30 card (produced ~8 years ago) in a 1-2 year old laptop. That would be insane. If "w520" (which I assume is ThinkPad W520) was mentioned by mistake, then please open a new bug report.
http://nouveau.freedesktop.org/wiki/Bugs
http://nouveau.freedesktop.org/wiki/CodeNames

Ok, that must be my mistake. lspci says:
01:00.0 VGA compatible controller: NVIDIA Corporation GF106GLM [Quadro 2000M] (rev a1)
yes, I have a Thinkpad W520
(In reply to comment #17)
> You cannot have nv30 card (produced ~8 years ago) in 1-2 years old laptop.
> That would be insane. If "w520" (which I assume is ThinkPad W520) was
> mentioned as a mistake, then please open new bug report.
>
> http://nouveau.freedesktop.org/wiki/Bugs
> http://nouveau.freedesktop.org/wiki/CodeNames

Created attachment 75164 [details]
dmesg 3.7.9, suspend / resume, garbage/noise NVidia GF106GLM [Quadro 2000M]
Comment on attachment 75164 [details]
dmesg 3.7.9, suspend / resume, garbage/noise NVidia GF106GLM [Quadro 2000M]
I patched kernel 3.7.9 with the patch recommended by Marcin yesterday and tried the suspend / resume cycle with the patched kernel. Unfortunately, the garbage is still there and the error messages
[ 100.755954] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x007fce1000]
[ 100.758599] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
keep on repeating until I stop X. One difference though - the error messages seem to repeat faster now than before the patch..
Any ideas?
Same problem occurs for me, yet another person, on resume (ArchLinux)
01:00.0 VGA compatible controller: NVIDIA Corporation GF116 [GeForce GTX 550 Ti] (rev a1)
- linux 3.7.9-1
- xorg-server 1.13.2-1
- mesa 9.0.2-1
- nouveau-dri 9.0.2-1
- xf86-video-nouveau 1.0.6-1
- gnome 3.6

I'm seeing what I think is the same problem on my Thinkpad T530. When I resume the system, xscreensaver's lock screen always seems to look and behave normally, but when I unlock the screen, either the whole screen is corrupted, or all the previously opened windows will be drawn incorrectly. I can even lock the screen again and xscreensaver still looks fine. When unlocked, the system can still be used semi-blindly, and if I can feel my way to a terminal, I can start some SDL and OpenGL apps that work fine, while newly-opened GTK/etc apps are corrupted. Logging out or otherwise restarting X gets everything back to normal. Much like others that have tested, the proprietary nvidia driver works fine.
dmesg:
[85334.271612] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x001fcfa000]
[85334.271621] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
*repeats numerous times*
lspci: (Thought I would chime in as I have a chip that hasn't been mentioned yet.)
01:00.0 VGA compatible controller: NVIDIA Corporation GF108 [Quadro NVS 5400M] (rev a1)
Packages: (Gentoo/~amd64)
CFLAGS="-O2 -pipe -march=native"
gentoo-sources 3.7.8 USE="-build -deblob -symlink"
mesa 9.1_rc2 USE="classic egl gallium gles1 gles2 llvm nptl openvg osmesa pax_kernel shared-glapi xa xorg xvmc -bindist -debug -gbm -pic -r600-llvm-compiler (-selinux) -vdpau -wayland"
libdrm 2.4.42 USE="libkms -static-libs"
xorg-server 1.13.2 USE="ipv6 kdrive nptl suid udev xorg -dmx -doc -minimal (-selinux) -static-libs -tslib -xnest -xvfb"
xf86-video-nouveau 1.0.6

(In reply to comment #13)
> Created attachment 75129 [details] [review]
> fix suspend bug in nvc0 fence implementation
>
> Guys, try this patch by Maarten Lankhorst.
Tried that patch with git head 3.9. Suspend still comes back to garbage + mouse cursor, and the dmesg logs are filled with this, repeating endlessly:
[ 302.118755] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040a1e
[ 302.119858] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 302.120869] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040a1e
[ 302.121951] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 302.122961] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
[ 302.126800] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 302.127434] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040a1e
[ 302.128113] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 302.128726] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
[ 314.138071] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.145020] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040a1e
[ 314.146127] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.147142] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040a1e
[ 314.148232] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.149266] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
[ 314.153112] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.153780] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040a1e
[ 314.154518] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.155163] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
[ 314.155873] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.156526] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
[ 314.158712] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.159432] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040a1e
[ 314.160207] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.160919] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
[ 314.160995] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.161001] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
[ 314.164003] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.164622] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040a1e
[ 314.165791] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.166415] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
[ 314.167099] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.167713] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
[ 314.168395] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000]
[ 314.169017] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e

I compiled the nouveau kernel yesterday in the hope that the newest nouveau source would fix the suspend / resume issue and help with hibernation. Nevertheless, it does not help. It is even a bit worse, since normal work is impossible with that kernel - the card seems to be very slow and locks up. However, I was able to use 3.4.32 for normal suspend / resume without a glitch (coming from hibernate results in a black screen, though) for the last 5 days. So it seems to me the card was already working, but stopped working, probably in the course of the 3.5.x refactoring.

I was digging in the nouveau code a bit, looking for where the endlessly repeating messages come from:
[ 66.090322] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x007fce1000]
[ 66.090358] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
[ 66.090457] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x007fce1000]
[ 66.090485] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
It looks to me like something should be initialized but is not.
I got to the source code file nouveau/core/engine/graph/nvc0.c
switch (nvc0_graph_class(priv)) {
case 0x9097:
        nv_engine(priv)->sclass = nvc0_graph_sclass;
        break;
case 0x9197:
        nv_engine(priv)->sclass = nvc1_graph_sclass;
        break;
case 0x9297:
        nv_engine(priv)->sclass = nvc8_graph_sclass;
        break;
}
What I am curious about - I do not see any implementation for nvc3 here - is this maybe the problem, or is nvc3 already handled by one of the 3 cases listed here? When I find some time, I could debug this more. Can somebody point me in the right direction? Is there any documentation for the implementation? I would like to fix the problem :)
BTW: the code above is from 3.8-rc7

Just compiled 3.8.0 and tested suspend / resume. Although resume is still not working, there are some changes. The error messages do not repeat endlessly anymore; instead I get:
[ 97.941502] PM: Finishing wakeup.
[ 97.942484] usb 2-1.4: USB disconnect, device number 3
[ 97.941503] Restarting tasks ... done.
[ 97.942949] video LNXVIDEO:01: Restoring backlight state
[ 97.943075] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x007fe00000]
[ 97.943120] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040800
[ 97.944405] cdc_ncm 2-1.4:1.6 wwan0: unregister 'cdc_ncm' usb-0000:00:1d.0-1.4, Mobile Broadband Network Device
[ 97.944885] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x007fe00000]
[ 97.944897] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040800
[ 97.944976] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x007fe00000]
[ 97.944985] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040800
[ 97.945063] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x007fe00000]
[ 97.945071] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040800
[ 97.945155] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x007fe00000]
[ 97.945164] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040800
[ 97.945237] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x007fe00000]
[ 97.945246] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040800
[ 97.945429] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x007fe00000]
[ 97.945435] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa0040800
[ 98.168304] usb 2-1.4: new high-speed USB device number 4 using ehci-pci
[ 98.303689] cdc_acm 2-1.4:1.1: ttyACM0: USB ACM device
[ 98.307171] cdc_acm 2-1.4:1.3: ttyACM1: USB ACM device
[ 98.315544] cdc_wdm 2-1.4:1.5: cdc-wdm0: USB WDM device
[ 98.331090] usb 2-1.4: MAC-Address: 02:80:37:ec:02:00
[ 98.331590] cdc_ncm 2-1.4:1.6 wwan0: register 'cdc_ncm' at usb-0000:00:1d.0-1.4, Mobile Broadband Network Device, 02:80:37:ec:02:00
[ 98.332666] cdc_wdm 2-1.4:1.8: cdc-wdm1: USB WDM device
[ 98.333153] cdc_acm 2-1.4:1.9: ttyACM2: USB ACM device
[ 98.500834] e1000e 0000:00:19.0: irq 53 for MSI/MSI-X
[ 98.603498] e1000e 0000:00:19.0: irq 53 for MSI/MSI-X
[ 98.604001] IPv6: ADDRCONF(NETDEV_UP): eth0: link is not ready
[ 99.289242] IPv6: ADDRCONF(NETDEV_UP): wwan0: link is not ready
[ 99.291621] cdc_ncm: wwan0: network connection: disconnected
[ 104.358920] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 104.359024] IPv6: ADDRCONF(NETDEV_CHANGE): eth0: link becomes ready
[ 134.407906] nouveau E[ 1268] failed to idle channel 0xcccc0000
Notice the "failed to idle channel". Restarting X after this does not work - nouveau complains about some non-existent pages. I do not have the dmesg for this now; I will try to get it later.
Coming from hibernate I get:
[ 365.858983] nouveau E[ PDISP][0000:01:00.0][0xc000857b][ffff88022c468e00] timeout1: 0x00000000
[ 365.858985] nouveau E[ PDISP][0000:01:00.0][0xc000857b][ffff88022c468e00] init failed, -16
[ 365.858993] nouveau E[ DRM] 0xdddddddd:0xd1500000 init failed with -16
[ 365.859383] nouveau E[ DRM] 0xffffffff:0xdddddddd init failed with -16
[ 365.859682] nouveau E[ DRM] 0xffffffff:0xffffffff init failed with -16
[ 365.859695] nouveau [ VBIOS][0000:01:00.0] running init tables
[ 365.865846] e1000e: eth0 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: Rx/Tx
[ 367.937496] nouveau [ DRM] resuming display...
Should I go and try with the current nouveau from git?
Created attachment 75581 [details]
Complete dmesg from suspend/resume on 3.8.0
Just found out that suspend and resume work on vanilla 3.7.9 and 3.8 when nouveau.noaccel=1 is set. So it seems the problem with resume is directly related to acceleration. When hibernating on 3.7.9 the computer reboots automatically; hibernate on 3.8 works, but resuming from hibernation results in a black screen.

My results from 3.9.0rc1:
Suspend
-------
Mar 11 18:07:46 rof-lap kernel: [ 222.304276] nouveau [ DRM] suspending fbcon...
Mar 11 18:07:46 rof-lap kernel: [ 222.304298] nouveau [ DRM] suspending display...
Mar 11 18:07:46 rof-lap kernel: [ 222.304310] nouveau [ DRM] unpinning framebuffer(s)...
Mar 11 18:07:46 rof-lap kernel: [ 222.304409] nouveau [ DRM] evicting buffers...
Mar 11 18:07:46 rof-lap kernel: [ 222.517459] sd 0:0:0:0: [sda] Stopping disk
Mar 11 18:07:46 rof-lap kernel: [ 222.652940] nouveau [ DRM] suspending client object trees...
Mar 11 18:07:46 rof-lap kernel: [ 222.653078] nouveau W[ PFIFO][0000:01:00.0] INTR 0x00000001: 0x00000004
Mar 11 18:07:46 rof-lap kernel: [ 222.653254] nouveau W[ PFIFO][0000:01:00.0] INTR 0x00000001: 0x00000004
[...]
Resume
------
Mar 11 18:07:46 rof-lap kernel: [ 224.980347] nouveau [ DRM] re-enabling device...
Mar 11 18:07:46 rof-lap kernel: [ 224.980367] nouveau [ DRM] resuming client object trees...
Mar 11 18:07:46 rof-lap kernel: [ 224.980374] nouveau [ VBIOS][0000:01:00.0] running init tables
Mar 11 18:07:46 rof-lap kernel: [ 225.167093] nouveau [ PTHERM][0000:01:00.0] programmed thresholds [ 90(3), 95(3), 105(5), 135(5) ]
Mar 11 18:07:46 rof-lap kernel: [ 225.168112] nouveau [ DRM] resuming display...
[...]
Mar 11 18:07:47 rof-lap kernel: [ 227.558681] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000 X[920]]
Mar 11 18:07:47 rof-lap kernel: [ 227.558690] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Mar 11 18:07:47 rof-lap kernel: [ 227.559484] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000 X[920]]
Mar 11 18:07:47 rof-lap kernel: [ 227.559491] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Mar 11 18:07:47 rof-lap kernel: [ 227.559645] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000 X[920]]
Mar 11 18:07:47 rof-lap kernel: [ 227.559651] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Mar 11 18:07:47 rof-lap kernel: [ 227.559795] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000 X[920]]
Mar 11 18:07:47 rof-lap kernel: [ 227.559801] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Mar 11 18:07:47 rof-lap kernel: [ 227.559946] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000 X[920]]
Mar 11 18:07:47 rof-lap kernel: [ 227.559952] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Mar 11 18:07:47 rof-lap kernel: [ 227.560097] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000 X[920]]
Mar 11 18:07:47 rof-lap kernel: [ 227.560102] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Mar 11 18:07:47 rof-lap kernel: [ 227.560247] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000 X[920]]
Mar 11 18:07:47 rof-lap kernel: [ 227.560253] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Mar 11 18:07:47 rof-lap kernel: [ 227.560403] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000 X[920]]
Mar 11 18:07:47 rof-lap kernel: [ 227.560409] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Mar 11 18:07:47 rof-lap kernel: [ 227.561045] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000 X[920]]
Mar 11 18:07:47 rof-lap kernel: [ 227.561052] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Mar 11 18:07:47 rof-lap kernel: [ 227.561201] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x003fe10000 X[920]]
[...]
Restarting Xorg resolves the problem until the next suspend.
01:00.0 VGA compatible controller: NVIDIA Corporation GF108 [GeForce GT 540M] (rev a1)
Created attachment 76393 [details]
kernel log with nouveau.debug=trace
Boot -> suspend -> resume -> restart Xorg
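Several comments in this report mention the nouveau.noaccel=1 kernel parameter as a workaround. When attaching logs it helps to state which nouveau options were actually in effect. The following sketch extracts them from a kernel command line string; the helper names `nouveau_options` and `accel_disabled` are hypothetical, and the command line in the example is made up.

```python
def nouveau_options(cmdline):
    """Return the nouveau.* parameters found in a kernel command line.

    The key is the text between "nouveau." and the first "=", so
    "nouveau.config=DEVINIT=NvForcePost=1" yields
    {"config": "DEVINIT=NvForcePost=1"}.
    """
    options = {}
    for token in cmdline.split():
        if token.startswith("nouveau."):
            key, _, value = token[len("nouveau."):].partition("=")
            options[key] = value
    return options


def accel_disabled(cmdline):
    """True if the command line contains nouveau.noaccel=1."""
    return nouveau_options(cmdline).get("noaccel") == "1"


if __name__ == "__main__":
    # Made-up command line for illustration only.
    example = "root=/dev/sda2 quiet nouveau.noaccel=1"
    print(nouveau_options(example))
    print(accel_disabled(example))
```

On a running Linux system the string would typically come from reading /proc/cmdline.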
Also happens here with Linux 3.8.3. It used to happen with all the 3.7 versions I tested. The card as reported by lspci is:
01:00.0 VGA compatible controller: nVidia Corporation GF108 [GeForce GT 540M] (rev a1)
Subsystem: Sony Corporation Device 9089
Kernel driver in use: nouveau
My messages in dmesg are:
[ 487.986709] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 3 [0x003fb8c000]
[ 487.986718] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa2040a04
Hope this nouveau bug will get fixed soon as it's really annoying.

I have this problem also. After suspend/resume, firefox, gvim, and other graphical applications do not display. The window only displays the border and no graphics inside the window. I tried the patch without success. The workaround 'nouveau.noaccel=1' makes suspend/resume work. Also, some screen corruption in a graphic application no longer occurs.
lspci
01:00.0 VGA compatible controller: NVIDIA Corporation Device 0dfc (rev a1)
01:00.0 0300: 10de:0dfc (rev a1)
[ 237.125651] nouveau [ DEVICE][0000:01:00.0] BOOT0 : 0x0c1e00a1
[ 237.125659] nouveau [ DEVICE][0000:01:00.0] Chipset: GF108 (NVC1)
[ 237.125664] nouveau [ DEVICE][0000:01:00.0] Family : NVC0
libdrm_nouveau2-2.4.42-1.1.1.x86_64
xorg-x11-driver-video-nouveau-1.0.6-2.1.1.x86_64
kernel 3.7.10

Still present on nouveau 3.9.0-rc4 (master@git) as of 2013-04-03.
Name                          | Type    | Version
------------------------------+---------+---------------------------------
Mesa                          | package | 9.1.1-247.1
kernel                        | package | 3.9.0_rc4_1_desktop_nouveau01+-3
libdrm_nouveau2               | package | 2.4.43-118.1
xorg-x11-driver-video-nouveau | package | 1.0.7@git-2013-04-03
xorg-x11-server               | package | 7.6_1.14.0-243.8
@Marcin Slusarz, if you are reading this: Do you think it's worthwhile trying your script limiter patch from bug 23223 comment 18 mentioned by Ronald in conjunction with "nouveau.config=DEVINIT=NvForcePost=1"?
Tested the bios script limiting patch a bit:
Plain excerpt:
[ 2.601923] nouveau [ VBIOS][0000:01:00.0] checking PRAMIN for image...
[ 2.684360] nouveau [ VBIOS][0000:01:00.0] ... appears to be valid
[ 2.684362] nouveau [ VBIOS][0000:01:00.0] using image from PRAMIN
[ 2.684476] nouveau [ VBIOS][0000:01:00.0] BIT signature found
[ 2.684478] nouveau [ VBIOS][0000:01:00.0] version 70.06.33.00.04
[ 2.684480] nouveau D[ VBIOS][0000:01:00.0] created
[ 2.684665] nouveau D[ VBIOS][0000:01:00.0] reset
[ 2.684666] nouveau D[ VBIOS][0000:01:00.0] initialised
[ 2.684671] nouveau [ VBIOS][0000:01:00.0] executing script 0, offset: 54347
[ 2.704681] nouveau [ VBIOS][0000:01:00.0] executing script 1, offset: 56127
[ 2.704701] nouveau [ VBIOS][0000:01:00.0] executing script 2, offset: 61021
[ 2.704702] nouveau [ VBIOS][0000:01:00.0] executing script 3, offset: 61031
[ 2.704705] nouveau [ VBIOS][0000:01:00.0] executing script 4, offset: 61500
[ 2.704706] nouveau [ VBIOS][0000:01:00.0] executing special script, offset: 61601
Booting with nouveau.config=DEVINIG=NvForcePost=1 and nouveau.minscript=0, and with maxscript set in turn to each of [999,4,3,2,1,0], I always get a black screen where the plymouth dm_crypt unlock passphrase splash should appear after loading the initrd. The system seems "fine", though. (Can't access the box via ssh at that stage. No network set up.)

There might be good news ahead. At least for me. Since the rebase on 3.10-rc2 some days ago (2013-05-24) I can suspend and resume fine from within a gnome-session and gdm-login. No graphics distortion whatsoever. I had downgraded (a while before, though) to the stable X11/Mesa repo instead of the bleeding-edge git versions. This shouldn't be what made the difference, because I had only switched to the git variants in order to see their effect on the "bug" here.
Name                          | Type    | Version
------------------------------+---------+----------------------------
Mesa                          | package | 9.1.3-240.1
kernel                        | package | 3.10.0_rc2_2.24_desktop+-22
libdrm_nouveau2               | package | 2.4.45-110.1
xorg-x11-driver-video-nouveau | package | 1.0.7-60.3
xorg-x11-server               | package | 7.6_1.14.1-234.3
It's even frightening: I can safely take the laptop (Thinkpad W520) out of the dock (it would usually freeze with high system load before) and the display (LVDS-1) turns on automagically. On putting it back in, it switches back to the DVI-attached LCD and the Thinkpad display goes off. I haven't tested the behaviour when taking the laptop out with the lid closed and opening it afterwards, though. Can somebody confirm their issues are gone, too?

Confirming that suspend works on Quadro 1000M. This is on current rawhide (kernel 3.10.0-0.rc3)

Still works with the recent 3.10-rc4 merge on nouveau-master. Just to answer myself: taking the ThinkPad out of the dock with the lid closed and opening it afterwards also works as expected. Many issues seem to be gone now, apart from this resume-blocker here. I'd consider this bug to have been resolved/fixed for some time now.

I hate to resurrect a zombie thread, but I am still affected by this suspend/resume issue. My system also reports "SHADER 0xa004021e" as a common point of failure.
On Fedora 18 (kernel 3.10.10-100.fc18.x86_64), my screen resumes from hibernation with garbage on-screen as described in this ticket, and messages like these are repeated in the logs:
Sep 2 14:23:38 ufo-laptop kernel: [ 66.074252] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 1 [0x005fbb1000 X[846]]
Sep 2 14:23:38 ufo-laptop kernel: [ 66.074264] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
lspci -nnv:
01:00.0 VGA compatible controller [0300]: NVIDIA Corporation Device [10de:1251] (rev a1) (prog-if 00 [VGA controller])
        Subsystem: CLEVO/KAPOK Computer Device [1558:5102]
        Flags: bus master, fast devsel, latency 0, IRQ 16
        Memory at f4000000 (32-bit, non-prefetchable) [size=32M]
        Memory at e8000000 (64-bit, prefetchable) [size=128M]
        Memory at f0000000 (64-bit, prefetchable) [size=64M]
        I/O ports at e000 [size=128]
        Expansion ROM at f6000000 [disabled] [size=512K]
        Capabilities: [60] Power Management version 3
        Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
        Capabilities: [78] Express Endpoint, MSI 00
        Capabilities: [b4] Vendor Specific Information: Len=14 <?>
        Capabilities: [100] Virtual Channel
        Capabilities: [128] Power Budgeting <?>
        Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
        Kernel driver in use: nouveau
The only thing that works at the moment is setting nouveau.noaccel=1. Setting nouveau.config=DEVINIG=NvForcePost=1 causes bootup to Plymouth with a black, unlit screen.
I would really like to have a long-term solution to this, and am willing to act upon any suggestions or patches you may have to reach that goal. Thanks!

Try and see if 3.11 helps; a bunch of init stuff was changed for nvcx. If not, please attach the relevant logs and reopen the bug. However, if it's sufficiently different from the original, it may be less confusing to just open a fresh one.

Unfortunately still present in 3.11. On Ubuntu Saucy with 3.11.0-15-generic on x86_64, suspend/resume results in the same corrupted screen.
I see vague shapes of windows that can be moved with alt-drag, but most visual elements including text are absent. Console and syslog have messages:

Jan 9 00:28:24 codex kernel: [ 80.883153] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Jan 9 00:28:24 codex kernel: [ 80.883195] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 2 [0x005fba0000 Xorg[4341]]

repeating ad infinitum until I log into a console and restart lightdm (and thus X server). The machine then operates normally (can use X as expected) until the next reboot. Setting noaccel=1 works around the problem, but obviously with reduced graphics performance.

I also tried kernel 3.12 from the Ubuntu kernel PPA - same behavior.

From lspci -nnv:

01:00.0 VGA compatible controller [0300]: NVIDIA Corporation GF116M [GeForce GT 560M] [10de:1251] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: CLEVO/KAPOK Computer Device [1558:7100]
    Flags: bus master, fast devsel, latency 0, IRQ 16
    Memory at f4000000 (32-bit, non-prefetchable) [size=32M]
    Memory at e8000000 (64-bit, prefetchable) [size=128M]
    Memory at f0000000 (64-bit, prefetchable) [size=64M]
    I/O ports at e000 [size=128]
    Expansion ROM at f6000000 [disabled] [size=512K]
    Capabilities: [60] Power Management version 3
    Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
    Capabilities: [78] Express Endpoint, MSI 00
    Capabilities: [b4] Vendor Specific Information: Len=14 <?>
    Capabilities: [100] Virtual Channel
    Capabilities: [128] Power Budgeting <?>
    Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Kernel driver in use: nouveau

From grep nouveau /var/log/syslog:

Jan 9 00:27:19 codex kernel: [ 18.225269] fb: conflicting fb hw usage nouveaufb vs simple - removing generic driver
Jan 9 00:27:19 codex kernel: [ 18.227186] nouveau [ DEVICE][0000:01:00.0] BOOT0 : 0x0cf880a1
Jan 9 00:27:19 codex kernel: [ 18.227189] nouveau [ DEVICE][0000:01:00.0] Chipset: GF116 (NVCF)
Jan 9 00:27:19 codex kernel: [ 18.227192] nouveau [ DEVICE][0000:01:00.0] Family : NVC0
Jan 9 00:27:19 codex kernel: [ 18.231456] nouveau [ VBIOS][0000:01:00.0] checking PRAMIN for image...
Jan 9 00:27:20 codex kernel: [ 18.334067] nouveau [ VBIOS][0000:01:00.0] ... appears to be valid
Jan 9 00:27:20 codex kernel: [ 18.334071] nouveau [ VBIOS][0000:01:00.0] using image from PRAMIN
Jan 9 00:27:20 codex kernel: [ 18.334222] nouveau [ VBIOS][0000:01:00.0] BIT signature found
Jan 9 00:27:20 codex kernel: [ 18.334227] nouveau [ VBIOS][0000:01:00.0] version 70.26.29.00.06
Jan 9 00:27:20 codex kernel: [ 18.358154] nouveau [ MXM][0000:01:00.0] BIOS version 3.0
Jan 9 00:27:20 codex kernel: [ 18.361036] nouveau [ MXM][0000:01:00.0] MXMS Version 3.0
Jan 9 00:27:20 codex kernel: [ 18.361084] nouveau [ PFB][0000:01:00.0] RAM type: GDDR5
Jan 9 00:27:20 codex kernel: [ 18.361087] nouveau [ PFB][0000:01:00.0] RAM size: 1536 MiB
Jan 9 00:27:20 codex kernel: [ 18.361089] nouveau [ PFB][0000:01:00.0] ZCOMP: 0 tags
Jan 9 00:27:20 codex kernel: [ 18.402251] nouveau [ PTHERM][0000:01:00.0] FAN control: none / external
Jan 9 00:27:20 codex kernel: [ 18.402262] nouveau [ PTHERM][0000:01:00.0] fan management: disabled
Jan 9 00:27:20 codex kernel: [ 18.402268] nouveau [ PTHERM][0000:01:00.0] internal sensor: yes
Jan 9 00:27:20 codex kernel: [ 18.438029] nouveau [ DRM] VRAM: 1536 MiB
Jan 9 00:27:20 codex kernel: [ 18.438030] nouveau [ DRM] GART: 1048576 MiB
Jan 9 00:27:20 codex kernel: [ 18.438034] nouveau [ DRM] TMDS table version 2.0
Jan 9 00:27:20 codex kernel: [ 18.438036] nouveau [ DRM] DCB version 4.0
Jan 9 00:27:20 codex kernel: [ 18.438038] nouveau [ DRM] DCB outp 00: 01000313 00010034
Jan 9 00:27:20 codex kernel: [ 18.438040] nouveau [ DRM] DCB outp 07: 08013382 00020030
Jan 9 00:27:20 codex kernel: [ 18.438041] nouveau [ DRM] DCB outp 08: 040383b6 0f220014
Jan 9 00:27:20 codex kernel: [ 18.438043] nouveau [ DRM] DCB outp 11: 02027362 00020010
Jan 9 00:27:20 codex kernel: [ 18.438044] nouveau [ DRM] DCB outp 13: 02013380 00000000
Jan 9 00:27:20 codex kernel: [ 18.438046] nouveau [ DRM] DCB conn 00: 00000040
Jan 9 00:27:20 codex kernel: [ 18.438047] nouveau [ DRM] DCB conn 01: 00001161
Jan 9 00:27:20 codex kernel: [ 18.438049] nouveau [ DRM] DCB conn 02: 00001231
Jan 9 00:27:20 codex kernel: [ 18.438050] nouveau [ DRM] DCB conn 03: 01000330
Jan 9 00:27:20 codex kernel: [ 18.438052] nouveau [ DRM] DCB conn 04: 01000446
Jan 9 00:27:20 codex kernel: [ 18.438053] nouveau [ DRM] DCB conn 05: 02000546
Jan 9 00:27:20 codex kernel: [ 18.438054] nouveau [ DRM] DCB conn 06: 00010661
Jan 9 00:27:20 codex kernel: [ 18.438055] nouveau [ DRM] DCB conn 07: 00010761
Jan 9 00:27:20 codex kernel: [ 18.438057] nouveau [ DRM] DCB conn 08: 00020847
Jan 9 00:27:20 codex kernel: [ 18.438059] nouveau [ DRM] DCB conn 09: 00000900
Jan 9 00:27:20 codex kernel: [ 18.439274] nouveau [ DRM] ACPI backlight interface available, not registering our own
Jan 9 00:27:20 codex kernel: [ 18.439495] nouveau [ DRM] 3 available performance level(s)
Jan 9 00:27:20 codex kernel: [ 18.439500] nouveau [ DRM] 0: core 50MHz shader 101MHz memory 135MHz voltage 820mV
Jan 9 00:27:20 codex kernel: [ 18.439504] nouveau [ DRM] 1: core 202MHz shader 405MHz memory 324MHz voltage 820mV
Jan 9 00:27:20 codex kernel: [ 18.439507] nouveau [ DRM] 3: core 775MHz shader 1550MHz memory 1250MHz voltage 1000mV
Jan 9 00:27:20 codex kernel: [ 18.439511] nouveau [ DRM] c: core 202MHz shader 405MHz memory 324MHz voltage 1000mV
Jan 9 00:27:20 codex kernel: [ 18.445776] nouveau [ DRM] MM: using COPY1 for buffer copies
Jan 9 00:27:20 codex kernel: [ 18.705449] nouveau [ DRM] allocated 1920x1080 fb: 0x60000, bo ffff88060a4e8000
Jan 9 00:27:20 codex kernel: [ 18.706813] fbcon: nouveaufb (fb0) is primary device
Jan 9 00:27:21 codex kernel: [ 20.089762] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
Jan 9 00:27:21 codex kernel: [ 20.089768] nouveau 0000:01:00.0: registered panic notifier
Jan 9 00:27:21 codex kernel: [ 20.089773] [drm] Initialized nouveau 1.1.1 20120801 for 0000:01:00.0 on minor 0
Jan 9 00:28:24 codex kernel: [ 74.287528] nouveau [ DRM] suspending display...
Jan 9 00:28:24 codex kernel: [ 74.287549] nouveau [ DRM] unpinning framebuffer(s)...
Jan 9 00:28:24 codex kernel: [ 74.287678] nouveau [ DRM] evicting buffers...
Jan 9 00:28:24 codex kernel: [ 74.669560] nouveau [ DRM] waiting for kernel channels to go idle...
Jan 9 00:28:24 codex kernel: [ 74.669589] nouveau [ DRM] suspending client object trees...
Jan 9 00:28:24 codex kernel: [ 74.670070] nouveau [ DRM] suspending kernel object tree...
Jan 9 00:28:24 codex kernel: [ 78.372083] nouveau [ DRM] re-enabling device...
Jan 9 00:28:24 codex kernel: [ 78.372094] nouveau [ DRM] resuming kernel object tree...
Jan 9 00:28:24 codex kernel: [ 78.372100] nouveau [ VBIOS][0000:01:00.0] running init tables
Jan 9 00:28:24 codex kernel: [ 78.599853] nouveau [ DRM] resuming client object trees...
Jan 9 00:28:24 codex kernel: [ 78.600083] nouveau [ DRM] resuming display...
Jan 9 00:28:24 codex kernel: [ 80.727990] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 2 [0x005fba0000 Xorg[4341]]
Jan 9 00:28:24 codex kernel: [ 80.728043] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Jan 9 00:28:24 codex kernel: [ 80.728120] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 2 [0x005fba0000 Xorg[4341]]
Jan 9 00:28:24 codex kernel: [ 80.728168] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Jan 9 00:28:24 codex kernel: [ 80.728251] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 2 [0x005fba0000 Xorg[4341]]
Jan 9 00:28:24 codex kernel: [ 80.728297] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
Jan 9 00:28:24 codex kernel: [ 80.728434] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 2 [0x005fba0000 Xorg[4341]]
Jan 9 00:28:24 codex kernel: [ 80.728476] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
<and so on>

Created attachment 94996 [details]
Remove NVDEV_ENGINE_COPY1 from GF116 to mirror GF106
Since Michael Weirauch reports this issue has been resolved for his card (0xc3, according to his dmesg uploads), and the issue remains on Doug Brunner and my card (0xcf), and that both cards once used the same code-base and functioned correctly in the Linux 3.4 Kernels, I went on a hunch.
By changing the 0xc3 definition to 0xcf (forcing my card to be recognized as a GF106), I was able to reproduce Michael's suspend/resume success with my own card.
Having discovered that THAT works, it became a mission to figure out what was different between the 0xc3 (GF106, working) and 0xcf (GF116, non-working).
By removing the declaration of NVDEV_ENGINE_COPY1 from the 0xcf case, as this patch does, the suspend/resume issue no longer affects my card.
I have submitted this patch for your review, and sincerely hope it is accepted. Thanks!
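
The comparison described in this comment amounts to a set difference between the engines declared for the two chipsets. A minimal Python sketch of that idea follows. The engine sets are illustrative placeholders, not copied from nvc0.c: the thread only establishes that the 0xcf (GF116) case declares COPY1 and that the patch makes it mirror 0xc3 (GF106). The name COPY0 for the primary copy engine and the helper `extra_engines` are assumptions made for the example.

```python
# Illustrative only: these engine sets are placeholders, not the real
# contents of nvc0.c. From the thread we know only that the 0xcf (GF116)
# case declares COPY1 and that the patch makes it mirror 0xc3 (GF106).
# "COPY0" is an assumed name for the primary copy engine.
ENGINES = {
    0xC3: {"COPY0"},           # GF106: suspend/resume reported working
    0xCF: {"COPY0", "COPY1"},  # GF116: corruption after resume
}


def extra_engines(broken: int, working: int) -> set:
    """Return the engines declared for `broken` but not for `working`."""
    return ENGINES[broken] - ENGINES[working]


if __name__ == "__main__":
    # Candidates to disable first when narrowing down the regression.
    print(sorted(extra_engines(0xCF, 0xC3)))
```

Anything this difference returns is a candidate to disable first when narrowing down why one chipset resumes cleanly and the other does not.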
(In reply to comment #42)
> Created attachment 94996 [details]
> Remove NVDEV_ENGINE_COPY1 from GF116 to mirror GF106
>
> Since Michael Weirauch reports this issue has been resolved for his card
> (0xc3, according to his dmesg uploads), and the issue remains on Doug
> Brunner and my card (0xcf), and that both cards once used the same code-base
> and functioned correctly in the Linux 3.4 Kernels, I went on a hunch.
>
> By changing the 0xc3 definition to 0xcf (forcing my card to be recognized as
> a GF106), I was able to reproduce Michael's suspend/resume success with my
> own card.
>
> Having discovered that THAT works, it became a mission to figure out what
> was different between the 0xc3 (GF106, working) and 0xcf (GF116,
> non-working).
>
> By removing the declaration of NVDEV_ENGINE_COPY1 from the 0xcf case, as
> this patch does, the suspend/resume issue no longer affects my card.
>
> I have submitted this patch for your review, and sincerely hope it is
> accepted. Thanks!

Can you check if the issue still occurs with 3.14-rcX (but without your patch)? Some changes were made to better respect the engine disables in register 22500.

(Speaking of which, can you grab envytools and do a "nvapeek 22500"... and also 22580)

Configured and compiled linux-3.14-rc4.tar.xz as found on kernel.org, and the suspend/resume issue still exists.

"nvapeek 22500" and "nvapeek 22580" just yielded "...", so I'm posting the contents of "nvapeek 22400 400" for a wider view of that area:

00022400: 00000000 00000000 00000000 00000002
00022410: 00000000 00000000 30000000 00000000
00022420: 00000000 00000000 00000800 00000000
00022430: 00000001 00000004 00000003 00000000
...
00022600: 0000001c 00000000 00000000 00000000
...
00022680: 8000001c 00000000 00000000 00000000
...

(In reply to comment #44)
> Configured and compiled linux-3.14-rc4.tar.xz as found on kernel.org, and
> the suspend/resume issue still exists.
>
>
> "nvapeek 22500" and "nvapeek 22580" just yielded "...", so I'm posting the

OK, well "..." means "0" -- a little confusing, but oh well. So none of the DISABLE bits are set, which means that 3.14-rcX will not help you. If you want, you can achieve the same effect with nouveau.config=PCE1=0 . Not sure why enabling it causes a resume issue.

Thanks for the command-line tip, that does indeed have the desired effect on standard-built Fedora kernels, and is much better than setting nouveau.noaccel=1 (which, when activated, is now leading to some font-rendering glitches).

Actually, it does kind of make sense that an invalid copy engine would cause screen garbage to be displayed, and that this would be triggered by the rapid screen repainting that happens on Resume.

I'll leave it up to more knowledgeable minds whether to act upon this patch or not; but it's worth noting that only 4 other devices out of the 9 declared in nvc0.c have an entry for a secondary, "COPY1" engine -- the GF100, GF104, GF110, and GF114.

I'm just thankful to have a usable solution on my rig in the meantime. Thanks!

Setting config=PCE1=0 in my modprobe .conf file also fixes the issue for me; I was able to remove noaccel=1 and still suspend and resume without problems.

I updated to Ubuntu's backported kernel 3.12.2 since my last post (to fix an unrelated Ethernet issue). I haven't tried that kernel with no nouveau options, can do so if it would be helpful; I suspect not though, since Laurence Lee found the issue still existed in 3.14.

(In reply to comment #46)
> Thanks for the command-line tip, that does indeed have the desired effect on
> standard-built Fedora kernels, and is much better than setting
> nouveau.noaccel=1 (which, when activated, is now leading to some
> font-rendering glitches).
>
> Actually, it does kind of make sense that an invalid copy engine would cause
> screen garbage to be displayed, and that this would be triggered by the
> rapid screen repainting that happens on Resume.
>
> I'll leave it up to more knowledgeable minds whether to act upon this patch
> or not; but it's worth noting that only 4 other devices out of the 9
> declared in nvc0.c have an entry for a secondary, "COPY1" engine -- the
> GF100, GF104, GF110, and GF114.
>
> I'm just thankful to have a usable solution on my rig in the meantime.
> Thanks!

I am also still experiencing the same issue on NVCF (nouveau for-next from earlier today), but it is _NOT_ limited to resume at least in my case. I can also confirm that nouveau.config=PCE1=0 seems to work around/fix the issue.

The corruption does not happen immediately, but after using the system for a long time (days) and/or after running some graphics-intensive games corruptions start to slowly appear (old random data in various windows).

I was also experiencing the problem of corruption after a suspend/resume. Adding "nouveau.config=PCE1=0" seems to have fixed it. I am on Arch Linux.

Here is my lspci -v:

01:00.0 VGA compatible controller: NVIDIA Corporation GF116M [GeForce GT 560M] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: CLEVO/KAPOK Computer Device 5102
    Flags: bus master, fast devsel, latency 0, IRQ 54
    Memory at f4000000 (32-bit, non-prefetchable) [size=32M]
    Memory at e8000000 (64-bit, prefetchable) [size=128M]
    Memory at f0000000 (64-bit, prefetchable) [size=64M]
    I/O ports at e000 [size=128]
    Expansion ROM at f6000000 [disabled] [size=512K]
    Capabilities: [60] Power Management version 3
    Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [78] Express Endpoint, MSI 00
    Capabilities: [b4] Vendor Specific Information: Len=14 <?>
    Capabilities: [100] Virtual Channel
    Capabilities: [128] Power Budgeting <?>
    Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Kernel driver in use: nouveau
    Kernel modules: nouveau

I got the same errors in my dmesg as everyone else. These two lines basically repeat forever:

[ 74.912769] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 2 [0x005fba0000 X[2244]]
[ 74.912788] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e

Here are my Arch Linux package versions:

extra/nouveau-dri 10.1.0-4
local/nouveau-fw 325.15-1
extra/xf86-video-nouveau 1.0.10-2
extra/libdrm 2.4.52-1
core/linux 3.14-4 (base)
core/linux-api-headers 3.13.2-1
core/linux-firmware 20140316.dec41bc-1
core/linux-headers 3.14-4
extra/mesa 10.1.0-4
extra/mesa-demos 8.1.0-1
extra/mesa-libgl 10.1.0-4

Xorg.0.log doesn't have anything particularly interesting in it.

Same problem here after suspend/resume. The console is flooded with the messages below for a short period, then a white screen with slight noise appears. Ctrl-Alt-F1 does not work, so I have to reboot.

[ 392.485883] nouveau E[ PGRAPH][0000:01:00.0] SHADER 0xa004021e
[ 392.489922] nouveau E[ PGRAPH][0000:01:00.0] TRAP ch 2 [0x00bfa80000 Xorg[1154]]

I created a file in /etc/modprobe.d with the line "nouveau.config=PCE1=0" but that does not help. Is this correct? Or do I need a more recent kernel? I use Ubuntu 14.04 64-bit.

Here is my system information:

lspci:

01:00.0 VGA compatible controller: NVIDIA Corporation GF116M [GeForce GT 560M] (rev a1) (prog-if 00 [VGA controller])
    Subsystem: ASUSTeK Computer Inc. Device 204a
    Flags: bus master, fast devsel, latency 0, IRQ 47
    Memory at f2000000 (32-bit, non-prefetchable) [size=32M]
    Memory at e0000000 (64-bit, prefetchable) [size=128M]
    Memory at e8000000 (64-bit, prefetchable) [size=64M]
    I/O ports at d000 [size=128]
    Expansion ROM at f4000000 [disabled] [size=512K]
    Capabilities: [60] Power Management version 3
    Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
    Capabilities: [78] Express Endpoint, MSI 00
    Capabilities: [b4] Vendor Specific Information: Len=14 <?>
    Capabilities: [100] Virtual Channel
    Capabilities: [128] Power Budgeting <?>
    Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
    Kernel driver in use: nouveau

kernel: 3.13.0-34-generic
libdrm: 2.4.52-1
mesa: 10.1.3-0ubuntu0.1 (same with 10.2.5 compiled from git)
xorg: 1.15.1-0ubuntu2.1
xorg-video-nouveau: 1.0.10-1ubuntu2

Well, while updating initramfs, the command complained about bad syntax of "nouveau.config=PCE1=0" in conf files. I just realized that it was for kernel options so I used "options nouveau config=PCE1=0" in my conf file instead. With that, the flood of nouveau messages and the white screen are indeed gone. The login screen is shown and I can move the mouse cursor, but I cannot interact at all and Ctrl-Alt-F1 does not work. The mouse cursor disappears after a while. I cannot do anything apart from rebooting. I also tried kernel 3.16.1 with the same result.

This bug covered a lot of different issues over its lifetime. The last one of them is a NVCF issue where the second copy engine does not appear to be there. We've disabled nouveau attempting to use it on any NVCF's and the patch is in 3.18 (and being backported to stable trees). If you feel like you still have an issue related to this bug, open a new one, do not under any circumstances reopen this one, as it has been too polluted by unrelated issues and comments.
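
For anyone on a kernel older than 3.18 who still needs the PCE1 workaround from earlier in this thread: the same setting is written differently on the kernel command line (nouveau.config=PCE1=0) and in a file under /etc/modprobe.d (options nouveau config=PCE1=0), which is what tripped up one commenter above. The small Python sketch below converts the first form into the second. The helper name `cmdline_to_modprobe` is hypothetical and the function only rewrites the string; it does not check that the module or parameter exists.

```python
def cmdline_to_modprobe(param: str) -> str:
    """Rewrite a kernel command-line module parameter ('module.name=value')
    as an 'options module name=value' line for a modprobe.d file.

    Hypothetical helper for illustration; it only reshapes the string.
    """
    # Split at the first dot only, so dots inside the value are preserved.
    module, sep, rest = param.partition(".")
    if not sep or not module or "=" not in rest:
        raise ValueError(f"expected module.name=value, got {param!r}")
    return f"options {module} {rest}"


if __name__ == "__main__":
    print(cmdline_to_modprobe("nouveau.config=PCE1=0"))
    # prints: options nouveau config=PCE1=0
```

As the comment above notes, initramfs needed updating after the modprobe.d file changed; the exact command depends on the distribution.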