Bug 50571 - nouveau crashes with GeForce GT 520
Summary: nouveau crashes with GeForce GT 520
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium normal
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2012-06-01 00:10 UTC by Gedalya
Modified: 2013-08-22 15:49 UTC (History)

Attachments
Kernel messages - starting gdm3 (64.83 KB, text/plain)
2012-06-01 00:10 UTC, Gedalya
Kernel messages - partedmagic - starting Xorg (39.21 KB, text/plain)
2012-06-01 00:11 UTC, Gedalya
ROM of my "ASUS ENGT520 Silent/DI/1GD3(LP) GeForce GT 520" (60.50 KB, application/octet-stream)
2012-06-13 10:36 UTC, Gedalya
Linux 3.5 - works (66.24 KB, text/plain)
2012-06-13 17:08 UTC, Gedalya
Linux 3.4.1 - works (62.10 KB, text/plain)
2012-06-13 17:16 UTC, Gedalya
drm/nvd0/disp: disconnect encoders before reprogramming them (2.21 KB, patch)
2012-06-16 17:54 UTC, Jonathan Nieder

Description Gedalya 2012-06-01 00:10:33 UTC
Created attachment 62357 [details]
Kernel messages - starting gdm3

Original report: http://bugs.debian.org/cgi-bin/bugreport.cgi?bug=675302

The system hangs as soon as Xorg starts when using the nouveau module. The non-free nvidia driver and the plain fbdev driver both work fine.

I'm including logs of what happens when starting gdm3 under Debian with a 3.3.6 kernel, and when starting Xorg under partedmagic, also with Linux 3.3.6.
Comment 1 Gedalya 2012-06-01 00:11:45 UTC
Created attachment 62358 [details]
Kernel messages - partedmagic - starting Xorg
Comment 2 Jonathan Nieder 2012-06-01 00:27:50 UTC
More details from http://bugs.debian.org/675302:

1. Starting and using X with the fbdev driver works fine.

2. Running "startx /usr/bin/xterm" to start a minimal X session with the nouveau driver freezes, too.

| Now tried running startx /usr/bin/xterm with nouveau,
| 
| [   82.427553] [drm] nouveau 0000:01:00.0: PDISP: DCB for 6/0xbad00103 not found
| [   82.428536] [drm] nouveau 0000:01:00.0: PDISP: DCB for 0/0xbad00103 not found
| [   82.429483] [drm] nouveau 0000:01:00.0: Table 0x0103 not found for 0/2, using first
|
| I kept a previously opened ssh connection.
| When starting X, the screen went black, but didn't totally lock up until
| I killed the X process from SSH. No further netconsole output, the
| machine went totally dead.

3. This is a regression.

| A while back (like around linux 3.0.0) nouveau and gnome 2.30 did work 
| on this same machine
Comment 3 Marcin Slusarz 2012-06-01 12:42:40 UTC
The kernel crashes on this line in evo_wait() in nvd0_display.c:
disp->evo[id].ptr[put] = 0x20000000;
because "put" has an unexpectedly large value: 2eb40040 / 2eb7c400 / 2eb40040.
I bet the evo id is wrong and we are reading the wrong register.

I'm not sure how to proceed from here. For now, please test a 3.4 kernel and attach a new dmesg and the vbios [1] (the last messages indicate it might be related to vbios parsing).

[1] http://nouveau.freedesktop.org/wiki/DumpingVideoBios
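Before attaching a new dmesg, it can help to pull out just the messages that hint at VBIOS table lookups going wrong. The following is a minimal sketch, assuming the kernel log has been saved to a file. The function name vbios_parse_warnings and the file name dmesg.txt are hypothetical. The patterns are taken from the "not found" messages quoted in comment 2.

```shell
# Hypothetical helper: print nouveau messages that hint at VBIOS table
# lookup problems. Reads a kernel log from stdin.
vbios_parse_warnings() {
    grep -E 'nouveau .*(DCB for .* not found|Table .* not found)'
}

# Example (the file name is an assumption):
#   vbios_parse_warnings < dmesg.txt
```

This only filters text; an empty result does not prove the VBIOS was parsed correctly.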
Comment 4 Jonathan Nieder 2012-06-01 12:55:45 UTC
Instructions for testing a more recent kernel:

# prerequisites
apt-get install git build-essential

# get a copy of the kernel history, if you do not already have it
git clone \
  git://git.kernel.org/pub/scm/linux/kernel/git/torvalds/linux.git

# get latest rev from nouveau tree
git remote add nouveau \
  git://anongit.freedesktop.org/git/nouveau/linux-2.6
git fetch nouveau
git checkout nouveau/master

# configure
cp /boot/config-$(uname -r) .config; # current configuration
scripts/config --disable DEBUG_INFO

# optional: minimize configuration
# This works by disabling all modules that are not loaded,
# which are generally drivers for hardware you don't have.
# Make sure nouveau is loaded before doing this.
make localmodconfig

# build, test
make deb-pkg; # optionally with -j<num> for parallel build
dpkg -i ../<name of package>; # as root
reboot
Comment 5 Gedalya 2012-06-13 10:36:23 UTC
Created attachment 62981 [details]
ROM of my "ASUS ENGT520 Silent/DI/1GD3(LP) GeForce GT 520"

Sorry for such a long delay.
This was dumped using vbtracetool.
NVIDIA non-free drivers were in use at the time, hope it doesn't matter.
Comment 6 Gedalya 2012-06-13 15:42:15 UTC
On 6/1/2012 3:55 PM, bugzilla-daemon@freedesktop.org wrote:
> # get latest rev from nouveau tree
> git remote add nouveau \
>    git://anongit.freedesktop.org/git/nouveau/linux-2.6
~$ git remote add nouveau git://anongit.freedesktop.org/git/nouveau/linux-2.6
fatal: Not a git repository (or any parent up to mount parent )
Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).

What am I doing wrong?
Comment 7 Jonathan Nieder 2012-06-13 15:49:56 UTC
> ~$ git remote add nouveau git://anongit.freedesktop.org/git/nouveau/linux-2.6
> fatal: Not a git repository (or any parent up to mount parent )
> Stopping at filesystem boundary (GIT_DISCOVERY_ACROSS_FILESYSTEM not set).
>
> What am I doing wrong?

My bad.  The step I left out is "cd linux".
Comment 8 Gedalya 2012-06-13 17:08:19 UTC
Created attachment 62998 [details]
Linux 3.5 - works

OK sorry that was really obvious.. :D

gdm3 starts fine with Linux 3.5 and I can log in and use the desktop. So far I see no obvious issues.

Guess I should try the 3.4.1 currently in experimental..
Comment 9 Gedalya 2012-06-13 17:16:18 UTC
Created attachment 62999 [details]
Linux 3.4.1 - works

Yea, that works too.

OK, anything else I can do to help?
Comment 10 Jonathan Nieder 2012-06-13 18:43:24 UTC
Gedalya <gedalya@gedalya.net> wrote:

> OK, anything else I can do to help?

If you can find the fix by bisecting, that would make me very happy.

Note that this will be a little confusing.  When you say "git bisect
bad", that means the kernel did _not_ exhibit the problem.  When you
say "git bisect good", that means the kernel _did_ exhibit the
problem.  The "first bad commit" is the patch that fixed it. :)

It works like this:

	cd linux
	git bisect start v3.5-rc1 v3.3 -- drivers/gpu/drm/nouveau
	make deb-pkg; # maybe with -j4

That builds a version half-way between to test.  So:

	dpkg -i ../<name of package>; # as root
	reboot

	cd linux
	git bisect good; # if it crashed
	git bisect bad; # if it worked fine
	git bisect skip; # if some other bug makes it hard to test

Then another version to test will be automatically checked out
and the process continues.

Again, remember: good = crashy.

Finding the patch should take about 8 rounds.  If the gitk package is
installed, you can see the range with the fix narrowing by running
"git bisect visualize" at any step.  If you get bored before the
process finishes, "git bisect log" will list the results discovered so
far.  Even a few rounds can help a lot in narrowing down which patch
is responsible.

Then we should be better prepared to fix this in stable kernels.

Thanks,
Jonathan
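The "about 8 rounds" estimate follows from bisection halving the candidate range each round, so roughly log2(N) rounds are needed for N commits. Here is a rough sketch of that arithmetic. The helper name bisect_rounds is hypothetical, and the count git itself reports may differ by a round.

```shell
# Hypothetical helper: print the smallest r such that 2^r >= n, i.e. an
# upper estimate of the bisection rounds needed for n candidate commits.
bisect_rounds() {
    n=$1
    r=0
    p=1
    while [ "$p" -lt "$n" ]; do
        p=$((p * 2))
        r=$((r + 1))
    done
    echo "$r"
}

# Example, run inside the kernel tree (same range as the bisect above):
#   bisect_rounds "$(git rev-list --count v3.3..v3.5-rc1 -- drivers/gpu/drm/nouveau)"
```

For example, any range of 129 to 256 commits gives an estimate of 8.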
Comment 11 Gedalya 2012-06-16 16:10:08 UTC
On 6/13/2012 9:43 PM, bugzilla-daemon@freedesktop.org wrote:
> If you can find the fix by bisecting, that would make me very happy.
>

I hope I got it all right, I tried to be very careful...


$ git bisect bad
4cbb0f8d2b06c72aae3552ff1a0a57814c6ce7d2 is the first bad commit
commit 4cbb0f8d2b06c72aae3552ff1a0a57814c6ce7d2
Author: Ben Skeggs <bskeggs@redhat.com>
Date:   Mon Mar 12 15:23:44 2012 +1000

     drm/nvd0/disp: disconnect encoders before reprogramming them

     Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

:040000 040000 4648bcf9d08c294fdb7ef76343e6f89e3cc2fe24 2c31aa85ebf6c6f3b96fc6e91934a895bb42c0a3 M      drivers


$ git bisect log
# bad: [f8f5701bdaf9134b1f90e5044a82c66324d2073f] Linux 3.5-rc1
# good: [c16fa4f2ad19908a47c63d8fa436a1178438c7e7] Linux 3.3
git bisect start 'v3.5-rc1' 'v3.3' '--' 'drivers/gpu/drm/nouveau'
# bad: [4a206ffc0bfe8e8c3fc0468a052f5b0bb625a57b] drm/nouveau: oops, create m2mf for nvd9 too
git bisect bad 4a206ffc0bfe8e8c3fc0468a052f5b0bb625a57b
# good: [7d3a766b6aa4e293e72bfd6add477f05ac7fdf5a] drm/nouveau/pm: init only after display subsystem has been created
git bisect good 7d3a766b6aa4e293e72bfd6add477f05ac7fdf5a
# bad: [f1377998eede7a8caa124fcf6a589b02c9e2bac7] drm/nouveau: add userspace fallback hints.
git bisect bad f1377998eede7a8caa124fcf6a589b02c9e2bac7
# good: [c11dd0da5277596d0ccdccb745b273d69a94f2d7] drm/nouveau/pm: fix oops if chipset has no pm support at all
git bisect good c11dd0da5277596d0ccdccb745b273d69a94f2d7
# good: [6e83fda2c055f17780b2feef404f06803a49a261] drm/nvd0/disp: initial implementation of displayport
git bisect good 6e83fda2c055f17780b2feef404f06803a49a261
# good: [3488c57b983546e6bf4c9e0bfd0f7f2a1292267a] drm/nvd0/disp: move syncs/magic setup to or mode_set
git bisect good 3488c57b983546e6bf4c9e0bfd0f7f2a1292267a
# bad: [2f5394c3ed573de2ab18cdac503b8045cd16ac5e] drm/nouveau: map first page of mmio early and determine chipset earlier
git bisect bad 2f5394c3ed573de2ab18cdac503b8045cd16ac5e
# bad: [4cbb0f8d2b06c72aae3552ff1a0a57814c6ce7d2] drm/nvd0/disp: disconnect encoders before reprogramming them
git bisect bad 4cbb0f8d2b06c72aae3552ff1a0a57814c6ce7d2
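Because good and bad were swapped for this bisection, the commit reported as "first bad" is the one that fixed the hang. A small sketch for pulling that hash out of saved bisect output, so it can be passed to other commands, might look like this. The helper name first_bad_commit and the file name are hypothetical.

```shell
# Hypothetical helper: read `git bisect` output on stdin and print the
# 40-character hash from the "is the first bad commit" line.
first_bad_commit() {
    sed -n 's/^\([0-9a-f]\{40\}\) is the first bad commit$/\1/p'
}

# Example (the file name is an assumption), run inside the kernel tree:
#   git show --stat "$(first_bad_commit < bisect-output.txt)"
```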
Comment 12 Jonathan Nieder 2012-06-16 16:41:44 UTC
Adding Ben Skeggs to cc.  Ben, can you think of any reason not to include

 commit 4cbb0f8d2b06c72aae3552ff1a0a57814c6ce7d2
 Author: Ben Skeggs <bskeggs@redhat.com>
 Date:   Mon Mar 12 15:23:44 2012 +1000

      drm/nvd0/disp: disconnect encoders before reprogramming them

      Signed-off-by: Ben Skeggs <bskeggs@redhat.com>

in stable kernels?  Gedalya is finding it fixes a hang that can also
be experienced with 3.2.y and 3.3.y.
Comment 13 Jonathan Nieder 2012-06-16 17:54:10 UTC
Created attachment 63122 [details] [review]
drm/nvd0/disp: disconnect encoders before reprogramming them

Thanks again for your hard work tracking this down.

3.3.y (unlike 3.2.y) is not maintained any more, but the backport
to 3.3 is easier, so let's start there.

Please test the attached patch against the 3.3.y tree, like so:

  cd linux

  # fetch point releases
  git remote add stable \
    git://git.kernel.org/pub/scm/linux/kernel/git/stable/linux-stable.git
  git fetch stable

  # 3.3.y
  git checkout stable/linux-3.3.y
  make silentoldconfig
  make deb-pkg; # maybe with -j4
  dpkg -i ../<name of package>
  reboot

  # hopefully it reproduces the problem. So try the patch:
  cd linux
  git am -3sc /path/to/patch
  make deb-pkg; # maybe with -j4
  dpkg -i ../<name of package>
  reboot
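After each reboot it may be worth confirming that the machine really booted the freshly installed 3.3.y kernel before judging whether the hang reproduces. A minimal sketch, assuming the release string reported by uname -r starts with the upstream version; the helper name is_3_3_series is hypothetical:

```shell
# Hypothetical helper: succeed if a kernel release string belongs to the
# 3.3.y series (for example "3.3.8" or "3.3.8+"), fail otherwise.
is_3_3_series() {
    case "$1" in
        3.3|3.3.*|3.3-*|3.3+*) return 0 ;;
        *) return 1 ;;
    esac
}

# After rebooting:
#   is_3_3_series "$(uname -r)" && echo "running a 3.3.y kernel"
```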
Comment 14 Gedalya 2012-06-16 20:01:39 UTC
On 6/16/2012 8:54 PM, bugzilla-daemon@freedesktop.org wrote:
> Thanks again for your hard work tracking this down.
Thank you!! You've been very helpful and this is a great learning experience!

> Please test the attached patch against the 3.3.y tree, like so:
>    # hopefully it reproduces the problem. So try the patch:
It did indeed.
>    cd linux
>    git am -3sc /path/to/patch
>    make deb-pkg; # maybe with -j4
>    dpkg -i ../<name of package>
>    reboot
>
So yea, it applied with no issues, compiled, and the computer seems to be quite usable. I installed chromium etc. and so far so good.

This thing showed up, as before:
[   60.936501] colord[2769]: segfault at 8 ip 000000000040bc6d sp 00007fff020a9110 error 4 in colord[400000+20000]
Not sure what it is; perhaps something else is wrong somewhere, perhaps unrelated.
Comment 15 Vova 2012-08-18 17:42:16 UTC
What about crashes on Linux 3.5.1 with applications that use 3D acceleration? I'm booting with nouveau.noaccel=0. Gears sometimes works, but if I enable KDE effects, the driver freezes.
Comment 16 Ilia Mirkin 2013-08-19 17:41:14 UTC
Sounds like the original issue is fixed. Vova, if you're still having issues, file a new bug describing your symptoms; take a look at http://nouveau.freedesktop.org/wiki/Bugs/ for how to do that effectively.
Comment 17 Vova 2013-08-22 15:49:18 UTC
Yes, the bug is fixed for me now.

