Created attachment 62357 [details]
Kernel messages - starting gdm3

Original report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=675302

The system hangs as soon as Xorg starts when using the nouveau module. The non-free nvidia driver and plain fbdev both work fine.

I'm including logs of what happens when starting gdm3 under Debian with a 3.3.6 kernel, and when starting Xorg under partedmagic, also with Linux 3.3.6.
Created attachment 62358 [details] Kernel messages - partedmagic - starting Xorg
More details from http://bugs.debian.org/675302:

1. Starting and using X with the fbdev driver works fine.

2. "startx /usr/bin/xterm" to start a minimal X session using the nouveau driver freezes, too.

 | Now tried running startx /usr/bin/xterm with nouveau,
 |
 | [ 82.427553] [drm] nouveau 0000:01:00.0: PDISP: DCB for 6/0xbad00103 not found
 | [ 82.428536] [drm] nouveau 0000:01:00.0: PDISP: DCB for 0/0xbad00103 not found
 | [ 82.429483] [drm] nouveau 0000:01:00.0: Table 0x0103 not found for 0/2, using first
 |
 | I kept a previously opened ssh connection.
 | When starting X, the screen went black, but didn't totally lock up until
 | I killed the X process from SSH. No further netconsole output, the
 | machine went totally dead.

3. This is a regression.

 | A while back (like around linux 3.0.0) nouveau and gnome 2.30 did work
 | on this same machine
The kernel crashes on this line in nvd0_display.c/evo_wait:

    disp->evo[id].ptr[put] = 0x20000000;

because "put" has an unexpectedly large value: 2eb40040 / 2eb7c400 / 2eb40040. I bet the evo id is wrong and we read the wrong register. I'm not sure how to proceed from here.

For now, please test a 3.4 kernel and attach a new dmesg and the vbios [1] (the last messages indicate it might be related to vbios parsing).

[1] http://nouveau.freedesktop.org/wiki/DumpingVideoBios
Instructions for testing a more recent kernel:

    # prerequisites
    apt-get install git build-essential

    # get a copy of the kernel history, if you do not already have it
    git clone \
      git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

    # get latest rev from nouveau tree
    git remote add nouveau \
      git://anongit.freedesktop.org/git/nouveau/linux-2.6
    git fetch nouveau
    git checkout nouveau/master

    # configure
    cp /boot/config-$(uname -r) .config; # current configuration
    scripts/config --disable DEBUG_INFO

    # optional: minimize configuration
    # This works by disabling all modules that are not loaded,
    # which are generally drivers for hardware you don't have.
    # Make sure nouveau is loaded before doing this.
    make localmodconfig

    # build, test
    make deb-pkg; # optionally with -j<num> for parallel build
    dpkg -i ../<name of package>; # as root
    reboot
Created attachment 62981 [details] ROM of my "ASUS ENGT520 Silent/DI/1GD3(LP) GeForce GT 520" Sorry for such a long delay. This was dumped using vbtracetool. NVIDIA non-free drivers were in use at the time, hope it doesn't matter.
On 6/1/2012 3:55 PM, bugzilla-daemon@freedesktop.org wrote:
> # get latest rev from nouveau tree
> git remote add nouveau \
>   git://anongit.freedesktop.org/git/nouveau/linux-2.6

~$ git remote add nouveau git://anongit.freedesktop.org/git/nouveau/linux-2.6
fatal: Not a git repository (or any parent up to mount parent )
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

What am I doing wrong?
> ~$ git remote add nouveau git://anongit.freedesktop.org/git/nouveau/linux-2.6
> fatal: Not a git repository (or any parent up to mount parent )
> Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
>
> What am I doing wrong?

My bad. The step I left out is "cd linux".
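The "Not a git repository" error above can be caught before running "git remote add". A minimal sketch, assuming a git new enough to support "git -C" (the helper name in_git_tree is made up for illustration):

```shell
#!/bin/sh
# in_git_tree DIR: succeed only if DIR is inside a git working tree.
# (Hypothetical helper name, not part of git.)
in_git_tree() {
    [ "$(git -C "$1" rev-parse --is-inside-work-tree 2>/dev/null)" = "true" ]
}

# Usage: only add the remote when run from inside the kernel checkout.
if in_git_tree .; then
    echo "inside a git work tree: safe to run 'git remote add'"
else
    echo "not a git repository: run 'cd linux' first"
fi
```

Run from the home directory, as in the failed attempt, this takes the second branch; run from inside the cloned "linux" directory, it takes the first.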
Created attachment 62998 [details] Linux 3.5 - works OK sorry that was really obvious.. :D gdm3 starts fine with linux 3.5 and I can log in and use the desktop, so far I see no obvious issue. Guess I should try the 3.4.1 currently in experimental..
Created attachment 62999 [details] Linux 3.4.1 - works Yea, that works too. OK, anything else I can do to help?
Gedalya <gedalya@gedalya.net> wrote:
> OK, anything else I can do to help?

If you can find the fix by bisecting, that would make me very happy.

Note that this will be a little confusing. When you say "git bisect bad", that means the kernel did _not_ exhibit the problem. When you say "git bisect good", that means the kernel _did_ exhibit the problem. The "first bad commit" is the patch that fixed it. :)

It works like this:

    cd linux
    git bisect start v3.5-rc1 v3.3 -- drivers/gpu/drm/nouveau
    make deb-pkg; # maybe with -j4

That builds a version half-way between to test. So:

    dpkg -i ../<name of package>; # as root
    reboot

    cd linux
    git bisect good; # if it crashed
    git bisect bad; # if it worked fine
    git bisect skip; # if some other bug makes it hard to test

Then another version to test will be automatically checked out and the process continues. Again, remember: good = crashy. Finding the patch should take about 8 rounds.

If the gitk package is installed, you can see the range with the fix narrowing by running "git bisect visualize" at any step. If you get bored before the process finishes, "git bisect log" will list the results discovered so far. Even a few rounds can help a lot in narrowing down which patch is responsible.

Then we should be better prepared to fix this in stable kernels.

Thanks,
Jonathan
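For readers repeating this kind of "find the fix" bisection today: Git 2.7 and later accept custom terms, which avoids the inverted good/bad wording. The sketch below is not what was run in this report. It builds a throwaway repository (the file names and commit messages are invented for the demo) in which commits 1-3 are "broken" and commits 4-6 are "fixed", then bisects with "broken"/"fixed" as the terms.

```shell
#!/bin/sh
# Toy repository: commits 1-3 are "broken", commits 4-6 are "fixed".
set -e
repo=$(mktemp -d)
cd "$repo"
git init -q .
git config user.email "demo@example.invalid"
git config user.name "Demo User"

for i in 1 2 3 4 5 6; do
    if [ "$i" -ge 4 ]; then echo fixed > state; else echo broken > state; fi
    echo "$i" > counter            # guarantees every commit changes something
    git add state counter
    git commit -q -m "commit $i"
done

first=$(git rev-list --max-parents=0 HEAD)   # the root commit (commit 1)

# Custom terms: "broken" plays the role of good/old, "fixed" of bad/new.
git bisect start --term-old broken --term-new fixed >/dev/null
git bisect fixed HEAD >/dev/null
git bisect broken "$first" >/dev/null

# Stand-in for "build, install, reboot, test": read the state file.
round=0
while [ "$round" -lt 10 ]; do
    round=$((round + 1))
    verdict=$(cat state)           # "broken" or "fixed" doubles as the term
    out=$(git bisect "$verdict")
    case "$out" in
        *"is the first fixed commit"*) break ;;
    esac
done

first_fixed_subject=$(git log -1 --format=%s refs/bisect/fixed)
echo "$first_fixed_subject"
git bisect reset >/dev/null 2>&1
```

In the real kernel case, the state-file check is replaced by building, booting, and trying to start X, and the verdict is typed by hand as "git bisect broken" or "git bisect fixed".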
On 6/13/2012 9:43 PM, bugzilla-daemon@freedesktop.org wrote:
> If you can find the fix by bisecting, that would make me very happy.
>

I hope I got it all right, I tried to be very careful...

$ git bisect bad
4cbb0f8d2b06c72aae3552ff1a0a57814c6ce7d2 is the first bad commit
commit 4cbb0f8d2b06c72aae3552ff1a0a57814c6ce7d2
Author: Ben Skeggs <bskeggs@redhat.com>
Date:   Mon Mar 12 15:23:44 2012 +1000

    drm/nvd0/disp: disconnect encoders before reprogramming them

    Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

:040000 040000 4648bcf9d08c294fdb7ef76343e6f89e3cc2fe24 2c31aa85ebf6c6f3b96fc6e91934a895bb42c0a3 M drivers

$ git bisect log
# bad: [f8f5701bdaf9134b1f90e5044a82c66324d2073f] Linux 3.5-rc1
# good: [c16fa4f2ad19908a47c63d8fa436a1178438c7e7] Linux 3.3
git bisect start 'v3.5-rc1' 'v3.3' '--' 'drivers/gpu/drm/nouveau'
# bad: [4a206ffc0bfe8e8c3fc0468a052f5b0bb625a57b] drm/nouveau: oops, create m2mf for nvd9 too
git bisect bad 4a206ffc0bfe8e8c3fc0468a052f5b0bb625a57b
# good: [7d3a766b6aa4e293e72bfd6add477f05ac7fdf5a] drm/nouveau/pm: init only after display subsystem has been created
git bisect good 7d3a766b6aa4e293e72bfd6add477f05ac7fdf5a
# bad: [f1377998eede7a8caa124fcf6a589b02c9e2bac7] drm/nouveau: add userspace fallback hints.
git bisect bad f1377998eede7a8caa124fcf6a589b02c9e2bac7
# good: [c11dd0da5277596d0ccdccb745b273d69a94f2d7] drm/nouveau/pm: fix oops if chipset has no pm support at all
git bisect good c11dd0da5277596d0ccdccb745b273d69a94f2d7
# good: [6e83fda2c055f17780b2feef404f06803a49a261] drm/nvd0/disp: initial implementation of displayport
git bisect good 6e83fda2c055f17780b2feef404f06803a49a261
# good: [3488c57b983546e6bf4c9e0bfd0f7f2a1292267a] drm/nvd0/disp: move syncs/magic setup to or mode_set
git bisect good 3488c57b983546e6bf4c9e0bfd0f7f2a1292267a
# bad: [2f5394c3ed573de2ab18cdac503b8045cd16ac5e] drm/nouveau: map first page of mmio early and determine chipset earlier
git bisect bad 2f5394c3ed573de2ab18cdac503b8045cd16ac5e
# bad: [4cbb0f8d2b06c72aae3552ff1a0a57814c6ce7d2] drm/nvd0/disp: disconnect encoders before reprogramming them
git bisect bad 4cbb0f8d2b06c72aae3552ff1a0a57814c6ce7d2
Adding Ben Skeggs to cc. Ben, can you think of any reason not to include

    commit 4cbb0f8d2b06c72aae3552ff1a0a57814c6ce7d2
    Author: Ben Skeggs <bskeggs@redhat.com>
    Date:   Mon Mar 12 15:23:44 2012 +1000

        drm/nvd0/disp: disconnect encoders before reprogramming them

        Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

in stable kernels? Gedalya is finding it fixes a hang that can also be experienced with 3.2.y and 3.3.y.
Created attachment 63122 [details] [review]
drm/nvd0/disp: disconnect encoders before reprogramming them

Thanks again for your hard work tracking this down.

3.3.y (unlike 3.2.y) is not maintained any more, but the backport to 3.3 is easier, so let's start there. Please test the attached patch against the 3.3.y tree, like so:

    cd linux

    # fetch point releases
    git remote add stable \
      git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
    git fetch stable

    # 3.3.y
    git checkout stable/linux-3.3.y
    make silentoldconfig
    make deb-pkg; # maybe with -j4
    dpkg -i ../<name of package>
    reboot

    # hopefully it reproduces the problem. So try the patch:
    cd linux
    git am -3sc /path/to/patch
    make deb-pkg; # maybe with -j4
    dpkg -i ../<name of package>
    reboot
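For reference, "git am -3sc" is shorthand for three separate options. The sketch below spells them out and exercises them in a throwaway repository; the repository, file name, and commit messages are invented for the demo and have nothing to do with the kernel tree.

```shell
#!/bin/sh
set -e
work=$(mktemp -d)
cd "$work"
git init -q repo
cd repo
git config user.email "demo@example.invalid"
git config user.name "Demo User"

# Build a base commit plus one "fix", export the fix as a mail-format
# patch, then rewind so the patch can be applied again with "git am".
echo base > file.txt
git add file.txt
git commit -q -m "base"
echo change >> file.txt
git commit -q -am "example fix"
git format-patch -1 -o "$work" >/dev/null   # writes 0001-example-fix.patch
git reset -q --hard HEAD~1

# Long-option spelling of "git am -3sc":
#   --3way      fall back to a three-way merge if the patch does not apply cleanly
#   --signoff   append a Signed-off-by trailer with the committer identity
#   --scissors  drop everything in the mail body before a "-- >8 --" line
git am -q --3way --signoff --scissors "$work"/0001-example-fix.patch

trailer=$(git log -1 --format=%B | grep '^Signed-off-by:')
echo "$trailer"
```

The three-way fallback is the option that matters most for a backport like this one, since the surrounding code in 3.3.y may differ slightly from the tree the patch was written against.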
On 6/16/2012 8:54 PM, bugzilla-daemon@freedesktop.org wrote:
> Thanks again for your hard work tracking this down.

Thank you!! You've been very helpful and this is a great learning experience!

> Please test the attached patch against the 3.3.y tree, like so:
> # hopefully it reproduces the problem. So try the patch:

It did indeed.

> cd linux
> git am -3sc /path/to/patch
> make deb-pkg; # maybe with -j4
> dpkg -i ../<name of package>
> reboot
>

So yea, it applied with no issues, compiled, and the computer seems to be quite usable, installed chromium etc. and so far so good.

This thing showed up, as before -

[ 60.936501] colord[2769]: segfault at 8 ip 000000000040bc6d sp 00007fff020a9110 error 4 in colord[400000+20000]

Not sure what it is, perhaps something else is wrong somewhere, perhaps unrelated.
What about crashes on Linux 3.5.1 with applications that use 3D acceleration? I'm booting with nouveau.noaccel=0. Gears sometimes works, but if I enable KDE effects the driver freezes.
Sounds like the original issue is fixed. Vova, if you're still having issues, file a new bug describing your symptoms; take a look at http://nouveau.freedesktop.org/wiki/Bugs/ for how to do that effectively.
Yes, bug is fixed for me now.