| Summary: | [NV94] INVALID_STATE error, X fails to start on GeForce 9600 GT with dual monitors, kernels 3.18.0-0.rc0.git8.2.fc22.1 onwards | ||
|---|---|---|---|
| Product: | xorg | Reporter: | Adam Williamson <adamw> | 
| Component: | Driver/nouveau | Assignee: | Roy <nouveau> | 
| Status: | RESOLVED FIXED | QA Contact: | Nouveau Project <nouveau> | 
| Severity: | critical | ||
| Priority: | medium | CC: | michael, zcalusic | 
| Version: | unspecified | ||
| Hardware: | x86-64 (AMD64) | ||
| OS: | Linux (All) | ||
| Whiteboard: | |||
| i915 platform: | i915 features: | ||
| Attachments: | |||
| 
 
        
          Description
          Adam Williamson, 2014-10-17 23:13:05 UTC
        
       
    Created attachment 108011 [details]
journalctl from an affected boot with drm.debug=15
Downstream reported case that resembles this one: Nouveau display(DVI) broken - kernel 3.18
https://bugzilla.redhat.com/show_bug.cgi?id=1157191

A user on IRC bisected a failure that resulted in PDISP getting very unhappy to:

commit 1dce6264045cd23e9c07574ed0bb31c7dce9354f
Author: Roy Spliet <rspliet@eclipso.eu>
Date:   Fri Sep 12 18:00:13 2014 +0200

    drm/nv50/kms: Set VBLANK time in modeset script

    Solves blinking on reclocking memory. The value set is an underestimate,
    but with non-reduced vblanking this should give us plenty of time

    Signed-off-by: Roy Spliet <rspliet@eclipso.eu>
    Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

You should safely be able to revert this commit, see if that helps.

Thanks for the pointer, I'll try and do a Fedora kernel build with the patch reverted sometime soon.

Created attachment 108425 [details] [review]
Revert "drm/nv50/kms: Set VBLANK time in modeset script" - git 1dce626

Ilia and Mr. user, thanks. :)

I have the same problem in 3.18-rc1, and can confirm that the issue is fixed if I revert 1dce62640. Also GF 9600 GT and also dual monitor setup. In my case, my primary display (DVI) would go blank right after switching to FB and the monitor would go to powersave. Reverting the mentioned commit fixes the issue completely. If there's an updated patch, I'm willing to test it before it goes mainstream.

Created attachment 108457 [details] [review]
Fix vblank period setting on G94

Instead of reverting said patch, please test the attached fix.

(In reply to Roy from comment #7)
> Created attachment 108457 [details] [review]
> Fix vblank period setting on G94

Only for G94?
What about the rest of Family : NV50 - G98, GT215, GT216, MCP79/MCP7A, etc.

> Instead of reverting said patch, please test the attached fix.

Chipset: G98 (NV98)
Family : NV50

All tests PASSED.

Tested with: 3.18.0-rc1.git-2fd5b07-drm-fixes+

I also patched kernel-3.18.0-0.rc1.git4.1.fc22
http://koji.fedoraproject.org/koji/buildinfo?buildID=587854
and tested:

- suspend(S3) core debug
  # echo core > /sys/power/pm_test
  # echo mem > /sys/power/state & RESUME
- hibernate(S4) core debug
  # echo core > /sys/power/pm_test
  # echo disk > /sys/power/state & THAW
- suspend(S3) none debug (systemctl suspend)
  # echo none > /sys/power/pm_test
  # echo mem > /sys/power/state & RESUME
- hibernate(S4) none debug (systemctl hibernate)
  # echo none > /sys/power/pm_test
  # echo disk > /sys/power/state & THAW
- soft-off(S5)
  # systemctl poweroff/reboot & BOOT

Display is powered on and stays powered on.

kernel-3.18.0-0.rc1.git4.NV50.fc21.x86_64

All tests PASSED.

(In reply to poma from comment #9)
> (In reply to Roy from comment #7)
> > Created attachment 108457 [details] [review]
> > Fix vblank period setting on G94
>
> Only for G94?
> What about the rest of Family : NV50 - G98, GT215, GT216, MCP79/MCP7A, etc.

Bug reports I've seen only mentioned G94 as problematic - 3.18rc1 works fine on NV92, NVA3, NVA5, NVA8 and NVAC as I observed myself. This patch changes behaviour across all boards ranging from NV50 to NVD9, but should have no visible effects on most chips.

> > Instead of reverting said patch, please test the attached fix.
>
> Chipset: G98 (NV98)
> Family : NV50
>
> All tests PASSED.
>
> Tested with: 3.18.0-rc1.git-2fd5b07-drm-fixes+

Thanks. Adam Williamson: does this patch fix your issues as well?

Sorry, I didn't have time to test yet. I'll try and do it today.

The patch works on v3.18-rc1 and v3.18-rc2. My X.org didn't start then, though. I had to update X.org (from 1.12 -> 1.16) and xf86-driver-nouveau (1.0.1 -> 1.0.11). Finally, I ended up upgrading Debian Wheezy to Jessie. After a few quirks with missing Gnome Shell packages everything worked fine.

Tested-by: Michael Riesch <michael@riesch.at>

(In reply to Roy from comment #7)
> Created attachment 108457 [details] [review]
> Fix vblank period setting on G94
>
> Instead of reverting said patch, please test the attached fix.

The patch fixes the issue for me. Now 2 days running, no problems at all. Roy, when do you intend to merge this patch?

Fix looks good here too, thanks very much. System boots and X starts.

Created attachment 108710 [details] [review]
debug patch

Can you guys please revert the fixes you're using, apply this debugging patch, and then send me your kernel logs of the issue occurring.

Thanks,
Ben.

Created attachment 108713 [details]
dmesg-3.18.0-rc2.git-d34d4d8+a7e3f94-drm-fixes+nouveau-NV50
Besides, with 0001-evo-debug.patch, occasionally (about every second time) the machine does not boot up at all - stuck in the middle of nowhere.

Created attachment 108735 [details]
0001-evo-debug dmesg
Hi

Distro: Gentoo ~amd64
VGA compatible controller: NVIDIA Corporation GT218 [GeForce 210] (rev a2)
2 physical displays (1x 1680x1050, 1x 1440x900)

on 3.18-rc1, without the vblank fix, logged http://codepad.org/cAjYm8BO (although the number of successful boots is still a lot higher than without the vblank fix)

still on 3.18-rc1, with the vblank fix, logged http://codepad.org/bSWQeg3N (lots of corruption on the framebuffer - normally the fb is black, this time I've found my screens with a lot of blueish artefacts, idk what to call them).

Created attachment 109023 [details]
dmesg 3.18.0-0.rc3.git2.1.fc22 & darktama nouveau git b6dc8ef
Ben, Roy, is there a new patch, will the fix land in 3.18 mix?
    Created attachment 109248 [details]
dmesg-3.18.0-0.rc4.git0.1.fc22.x86_64-NV50
(In reply to poma from comment #23)
> Created attachment 109248 [details]
> dmesg-3.18.0-0.rc4.git0.1.fc22.x86_64-NV50

That's not surprising given this fix was not merged in that tree. Please be patient, we'll get the fix in (or a different one if new data pops up) before 3.18 gets released.

Created attachment 109287 [details]
dmesg-3.18.0-rc3.git-03dca70-drm-fixes+NV50
            == ALL TESTS PASSED ==
Combination also tested, works OK:
http://koji.fedoraproject.org/koji/buildinfo?buildID=592269
&
http://cgit.freedesktop.org/~darktama/nouveau/commit/?id=ae69cfb

$ modinfo nouveau -n
/lib/modules/3.18.0-0.rc4.git0.2.fc22.x86_64/updates/nouveau.ko

Thanks guys.

The fix for this bug was merged in kernel 3.18 RC5. Thank you all for your feedback. If your problem persists with kernel 3.18 RC5 or newer, please re-open this bug.

I have a similar problem, but with a different card. I believe my errors start the same way as the original poster.

modinfo nouveau -n
/lib/modules/4.2.3-300.fc23.x86_64/kernel/drivers/gpu/drm/nouveau/nouveau.ko.xz

lspci -v | grep -i vga
01:00.0 VGA compatible controller: NVIDIA Corporation G92GLM [Quadro FX 2800M] (rev a2) (prog-if 00 [VGA controller])

I'm using a laptop (dell m6500) and get a bunch of nouveau errors printed when I connect my 2nd display via the display port. So long as I stay running in init 3, the primary laptop LCD continues to function, but once I go init 5, X appears to start, and then just hangs. I've attached dmesg output for when I connect the display port, and when I disconnect the display port. Please let me know if I've posted to the wrong bug or can help in any way.

Created attachment 118804 [details]
dmesg boot
    Created attachment 118805 [details]
dmesg.connected_display_port.4.2.3-300.fc23.x86_64
    Created attachment 118806 [details]
dmesg.disconnect_display_port.4.2.3-300.fc23.x86_64
(In reply to J from comment #28)
> I have a similar problem, but with a different card. I believe my errors
> start the same way as the original poster.

Similar problem = new bug. |
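
The suspend and hibernate checks listed in the comments above all follow the same two-step pattern: write a test level to `/sys/power/pm_test`, then write a sleep state to `/sys/power/state`. A minimal Python sketch of that loop follows. The function names and the `power_dir` parameter are illustrative assumptions and do not come from any tool mentioned in this report; `power_dir` exists only so the sketch can be tried against a scratch directory. Pointed at the real `/sys/power`, it needs root and a kernel that exposes `pm_test`, and it may really suspend or hibernate the machine.

```python
from pathlib import Path

# Values taken from the test list in this report.
PM_TEST_MODES = ("core", "none")
SLEEP_STATES = ("mem", "disk")  # mem = suspend (S3), disk = hibernate (S4)


def run_pm_cycle(mode, state, power_dir="/sys/power"):
    """Write `mode` to pm_test, then `state` to state, as in the listed steps.

    `power_dir` is a hypothetical parameter added for this sketch so it can
    be aimed at a scratch directory instead of the real sysfs tree.
    """
    if mode not in PM_TEST_MODES:
        raise ValueError(f"unsupported pm_test mode: {mode!r}")
    if state not in SLEEP_STATES:
        raise ValueError(f"unsupported sleep state: {state!r}")
    base = Path(power_dir)
    (base / "pm_test").write_text(mode + "\n")
    # On a real system this write returns only after the resume or thaw.
    (base / "state").write_text(state + "\n")


def run_all_cycles(power_dir="/sys/power"):
    """Run the four combinations in the order the comment lists them."""
    done = []
    for mode in PM_TEST_MODES:
        for state in SLEEP_STATES:
            run_pm_cycle(mode, state, power_dir)
            done.append((mode, state))
    return done
```

Whether the display comes back after each cycle still has to be checked by eye, as in the original test; the sketch only automates the sysfs writes.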