I have my nv43 back in action, and unfortunately I have new bugs to report. When I compiled a new 3.7-rc kernel, I noticed that anything 3D-related, even the trivial/tri example from mesa/demos, started to hang my X server. Usually I was able to switch consoles and reboot cleanly, but not always. Strangely, _indirect_ rendering worked OK! Kernels 3.4.6, 3.5.7 and 3.6.x are all OK. I will add kernel/X logs after rebooting into a freshly compiled nouveau kernel; for now I want to attach partial bisection logs, done with the mainline kernel.
Created attachment 71920 [details] nv43 bisection log [partial]
Created attachment 71921 [details] possible bad commits
Unfortunately, all of those commits resulted in a BUG or simply a black screen during boot, with completely non-working DRM. So I can't test them.
Created attachment 71922 [details] Kernel .config
This is my minimal kernel config, but the bug also happens with a bigger one (SMP, SLUB, tons of modules, etc.). What I haven't checked is whether changing ARCH from i486 to something more modern will help; I will try this, too.
Created attachment 71923 [details] working X log
This is on kernel 3.6.11.
bash-4.2$ glxgears
Running synchronized to the vertical refresh. The framerate should be
approximately the same as the monitor refresh rate.
809 frames in 5.0 seconds = 161.695 FPS
1080 frames in 5.0 seconds = 215.978 FPS
997 frames in 5.0 seconds = 199.132 FPS
951 frames in 5.0 seconds = 189.950 FPS
988 frames in 5.0 seconds = 197.346 FPS
XIO: fatal IO error 11 (Resource temporarily unavailable) on X server ":0"
after 16815 requests (16815 known processed) with 0 events remaining.
bash-4.2$
Created attachment 71927 [details] X log for affected kernel
Created attachment 71928 [details] dmesg after X start and launching gears
Note: I switched away from X's VT shortly after the black glxgears window appeared, and captured this dmesg from another VT.
Interesting: if I disable ACPI completely, at least glxgears starts to work! (I need to disable it in the kernel config, because simply booting with acpi=off stops the ACPI-aware nouveau.ko from loading :( )
Created attachment 74851 [details] Kernel config [working glxgears with direct rendering]
The kernel was built from the nouveau tree at this commit:
commit f253235ed48aca9ebf008952d8484e59e64bebae
Author: Ben Skeggs <bskeggs@redhat.com>
Date: Thu Feb 14 13:43:21 2013 +1000
drm/nv84-/fence: prepare for emit/sync support of sysram sequences
---------------
Sadly, launching seamonkey still caused a GPU lockup, but I was able to switch back to the console and reboot. That was a different bug, though; this one doesn't show any 'GPU lockup' messages.
Created attachment 83509 [details] Nouveau git kernel + experimental pmpeg path also hangs
I also tried many new kernels, up to "drm/nouveau/vm: make vm refcount into a kref" (3.11-rc3 based); they all hang with ACPI enabled. This dmesg was captured after applying the mesa patch from this thread [for pmpeg testing]: http://lists.freedesktop.org/archives/mesa-dev/2013-July/042473.html As you can hopefully see, XvMC also hangs like DRI clients.
The PMPEG hangs are expected... pre-NV44 doesn't have context switching. I have a patch (which I sent to the ML) but it needs some work. So don't worry about that not working. Although it shouldn't hang, it just shouldn't do any actual decoding (and generate all those errors). Some of the warnings in your dmesg should have been fixed as of the latest nouveau/master or 3.11-rc7. Are you saying that simply running trivial/tri will hang X? I don't see that on an NV42 (PCIe), which is rather similar to your NV43 (AGP). Could you show a dmesg log of trivial/tri hanging things? (Are you using mesa 9.2 or mesa-git?)
Created attachment 85161 [details] dmesg from nouveau/master
Using mesa git (9.3.0-6b5c802), trivial/tri also hangs the X server. (I switched away from it to a working VT and captured this dmesg.)
I have lockups on NV43 too. Affected kernels are at least 3.10 and 3.9. More info: https://bugzilla.redhat.com/show_bug.cgi?id=979537
It seems I can avoid (work around) this bug by simply changing CONFIG_PREEMPT_NONE=y to CONFIG_PREEMPT_VOLUNTARY=y in my .config. I made this discovery while trying the official Slackware kernel from Slackware-current. Maybe this bug has something in common with the bug described in https://www.kernel.org/pub/linux/kernel/v3.x/ChangeLog-3.18.3 as "mm, vmscan: prevent kswapd livelock due to pfmemalloc-throttled process being killed". The bug seems to hit only the nv40 (and older?) code path, because my nv50 works fine without preemption. Should I leave this bug report open, or resolve it as WORKSFORME? (I'm currently compiling an additional test kernel based on the nouveau/linux-2.6 linux-3.19 branch, just to test my workaround one more time.) Reclocking still seems to work only partially (lost display and another kind of hang after just a few seconds of glxgears on the highest perf level, lost display after I switch clocks back to lower frequencies), but this is another issue.
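For anyone comparing a hanging and a working kernel build, here is a minimal sketch (not part of the original report) that reads the text of a kernel .config and reports which preemption model is selected and whether CONFIG_ACPI is set, the two options mentioned as workarounds in this thread. The function names parse_kconfig and preemption_model are hypothetical helpers, and the sample config text is made up for illustration.

```python
import re

# Preemption choices offered by 3.x-era kernels (exactly one is normally "y").
PREEMPT_OPTIONS = (
    "CONFIG_PREEMPT_NONE",
    "CONFIG_PREEMPT_VOLUNTARY",
    "CONFIG_PREEMPT",
)


def parse_kconfig(text):
    """Return a dict mapping CONFIG_* symbols to their values.

    Symbols listed as '# CONFIG_FOO is not set' are recorded as "n".
    """
    options = {}
    for line in text.splitlines():
        line = line.strip()
        disabled = re.fullmatch(r"# (CONFIG_\w+) is not set", line)
        if disabled:
            options[disabled.group(1)] = "n"
            continue
        if line.startswith("CONFIG_") and "=" in line:
            key, _, value = line.partition("=")
            options[key] = value
    return options


def preemption_model(options):
    """Return the name of the enabled preemption option, or None."""
    for name in PREEMPT_OPTIONS:
        if options.get(name) == "y":
            return name
    return None


if __name__ == "__main__":
    # Made-up sample; in practice read the real file, e.g. open(".config").read()
    sample = (
        "CONFIG_PREEMPT_NONE=y\n"
        "# CONFIG_PREEMPT_VOLUNTARY is not set\n"
        "# CONFIG_PREEMPT is not set\n"
        "# CONFIG_ACPI is not set\n"
    )
    opts = parse_kconfig(sample)
    print("preemption model:", preemption_model(opts))
    print("ACPI enabled:", opts.get("CONFIG_ACPI") == "y")
```

Running it against the .config of each test kernel makes it easy to confirm which combination was actually built before attributing a hang (or its absence) to the setting.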
I think I can close this one. I'm currently running 4.10.0-rc5-i486 #2 and it seems stable (after I tweaked the BIOS settings: enabled "PnP OS installed" and disabled APIC).