Bug 77371 - [NVA3] gpu lockup on boot unless noaccel=1
Summary: [NVA3] gpu lockup on boot unless noaccel=1
Status: RESOLVED DUPLICATE of bug 33165
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium critical
Assignee: Nouveau Project
QA Contact: Xorg Project Team
 
Reported: 2014-04-12 19:14 UTC by jw.hendy
Modified: 2014-07-04 08:03 UTC


Attachments
dmesg after normal startup on Arch x86_64, prior to startx (69.15 KB, text/plain)
2014-04-12 19:14 UTC, jw.hendy
dmesg after normal startup on Arch x86_64, after startx (83.49 KB, text/plain)
2014-04-12 19:17 UTC, jw.hendy
journalctl after normal startup on Arch x86_64, after startx (104.21 KB, text/plain)
2014-04-12 19:19 UTC, jw.hendy
dmesg after noaccel=1 on Arch x86_64, prior to startx (69.20 KB, text/plain)
2014-04-12 19:20 UTC, jw.hendy
dmesg after noaccel=1 on Arch x86_64, after startx (69.20 KB, text/plain)
2014-04-12 19:21 UTC, jw.hendy
journalctl after noaccel=1 on Arch x86_64, after startx (104.88 KB, text/plain)
2014-04-12 19:22 UTC, jw.hendy
screen after startx with acceleration (1.02 MB, text/plain)
2014-04-12 19:37 UTC, jw.hendy
xorg.0.log on Arch x86_64, acceleration enabled (30.36 KB, text/plain)
2014-04-12 21:34 UTC, jw.hendy
xorg.0.log on Arch x86_64, acceleration disabled with noaccel=1 (29.75 KB, text/plain)
2014-04-12 21:35 UTC, jw.hendy
mmiotrace using `xinit "sleep 10"` with acceleration enabled (2.45 MB, application/x-xz)
2014-04-12 21:44 UTC, jw.hendy
mmiotrace using `xinit -e sh -c "glxgears & sleep 10"` with acceleration enabled (2.87 MB, text/plain)
2014-04-12 21:49 UTC, jw.hendy

Description jw.hendy 2014-04-12 19:14:13 UTC
Created attachment 97262
dmesg after normal startup on Arch x86_64, prior to startx

System details:

$ uname -a
Linux bigBang 3.14.0-4-ARCH #1 SMP PREEMPT Wed Apr 9 21:11:25 CEST 2014 x86_64 GNU/Linux

$ dmesg | grep -i chipset
[   19.583102] nouveau  [  DEVICE][0000:01:00.0] Chipset: GT215 (NVA3)

$ lspci |grep -i vga
01:00.0 VGA compatible controller: NVIDIA Corporation GT215GLM [Quadro FX 1800M] (rev a2)

Description:

I can only successfully `startx` if I modprobe nouveau with no acceleration (`nouveau.noaccel=1` appended to kernel line or `options nouveau noaccel=1` in /etc/modprobe.d/nouveau.conf).

With noaccel=0 (default), I get a GPU lockup message and borked X session (see attached photo of screen).
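
When comparing the two boots, it can help to confirm which setting the kernel line actually carried. The following is a minimal sketch, assuming a POSIX shell; `noaccel_from_cmdline` is a hypothetical helper name, and it only looks at the kernel command line, not at options set through /etc/modprobe.d/nouveau.conf.

```shell
#!/bin/sh
# Hypothetical helper (not part of nouveau): print the value given to
# nouveau.noaccel in a kernel command line string; fail if it is absent.
noaccel_from_cmdline() {
  for word in $1; do
    case "$word" in
      nouveau.noaccel=*)
        printf '%s\n' "${word#nouveau.noaccel=}"
        return 0
        ;;
    esac
  done
  return 1
}

# Inspect the running kernel's command line when it is available.
if [ -r /proc/cmdline ]; then
  noaccel_from_cmdline "$(cat /proc/cmdline)" \
    || echo "nouveau.noaccel not set on the kernel line"
fi
```

If the parameter is absent, the driver default (acceleration enabled) applies unless a modprobe.d option overrides it.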

Similar bugs:

- https://bugs.freedesktop.org/show_bug.cgi?id=33165 (seems most similar)
- https://bugs.freedesktop.org/show_bug.cgi?id=69488
- https://bugs.freedesktop.org/show_bug.cgi?id=73373 (?, can startx)
- https://bugs.freedesktop.org/show_bug.cgi?id=69203 (?, no gpu lockup)
- https://bugs.freedesktop.org/show_bug.cgi?id=69465 (?, no gpu lockup)

Things tried:
--- firmware: I tried the nouveau-fw packages from Arch Linux's AUR, which simply packages the firmware instructions here: http://nouveau.freedesktop.org/wiki/VideoAcceleration/. I have diff'ed the contents of /lib/firmware/nouveau/ and /tmp/nouveau/vuc-* + /tmp/nouveau/nv*, and they are identical.

From what I can tell, NVA3 (NV50 Tesla family) shouldn't need 3D Accel firmware (http://nouveau.freedesktop.org/wiki/InstallDRM/, 3D-accel firmware section):

"NV40 - NV50: No firmware needed. The Nouveau driver generates the ctxprogs and related state internally (commits 48c6dfb8 and 266229a5)."

With firmware in /lib/firmware/nouveau, I've tried with and without `nouveau.config=NvGrUseFW=1`, though per the Kernel Parameters page (http://nouveau.freedesktop.org/wiki/KernelModuleParameters/), this only seems relevant for NVC0.
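
The firmware comparison mentioned above could be repeated with something like the sketch below. It assumes both directories contain only the firmware files, so it compares the whole trees rather than just the `vuc-*` and `nv*` files; `same_tree` is a hypothetical helper name.

```shell
#!/bin/sh
# Hypothetical helper: succeed only if two directory trees have identical
# contents (same file names, same bytes).
same_tree() {
  diff -rq "$1" "$2" > /dev/null
}

# Paths taken from the report; adjust if the firmware was extracted elsewhere.
if [ -d /lib/firmware/nouveau ] && [ -d /tmp/nouveau ]; then
  if same_tree /lib/firmware/nouveau /tmp/nouveau; then
    echo "firmware trees are identical"
  else
    echo "firmware trees differ"
  fi
fi
```

Any extra file in either directory counts as a difference here, which is stricter than the glob-based comparison described in the report.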

--- Per bug 69488, I tried with `nouveau.runpm=0` and still get the lockup. That report also suggests the problem was fixed in 3.13, and I'm on 3.14.
Comment 1 jw.hendy 2014-04-12 19:17:59 UTC
Created attachment 97264
dmesg after normal startup on Arch x86_64, after startx

Booted as normal with acceleration enabled (default), did `startx`, waited for the borked X session to start, then switched to TTY2 and piped dmesg to a file.
Comment 2 jw.hendy 2014-04-12 19:19:12 UTC
Created attachment 97265
journalctl after normal startup on Arch x86_64, after startx

Booted with acceleration enabled (default), did `startx`, waited for the borked X session, switched to TTY2, and piped `journalctl -b` to a file.
Comment 3 jw.hendy 2014-04-12 19:20:24 UTC
Created attachment 97266
dmesg after noaccel=1 on Arch x86_64, prior to startx

Booted with `nouveau.noaccel=1` appended to kernel line, piped dmesg prior to starting X
Comment 4 jw.hendy 2014-04-12 19:21:06 UTC
Created attachment 97267
dmesg after noaccel=1 on Arch x86_64, after startx

Booted with `nouveau.noaccel=1` appended to kernel line, then started X successfully with `startx`
Comment 5 jw.hendy 2014-04-12 19:22:12 UTC
Created attachment 97268
journalctl after noaccel=1 on Arch x86_64, after startx

Booted with `nouveau.noaccel=1` appended to kernel line, started X, then piped `journalctl -b`
Comment 6 jw.hendy 2014-04-12 19:37:25 UTC
Created attachment 97269
screen after startx with acceleration

Boot normally (default is acceleration on), run `startx`, wait a couple minutes for X, take a picture.
Comment 7 jw.hendy 2014-04-12 21:34:27 UTC
Created attachment 97282
xorg.0.log on Arch x86_64, acceleration enabled

Booted normally (acceleration enabled by default), ran `startx`, waited for the borked X session, switched to a new TTY, and copied /var/log/Xorg.0.log to a file.
Comment 8 jw.hendy 2014-04-12 21:35:17 UTC
Created attachment 97283
xorg.0.log on Arch x86_64, acceleration disabled with noaccel=1

Booted with `nouveau.noaccel=1` appended to kernel line, started X with `startx` successfully, piped Xorg.0.log to file.
Comment 9 jw.hendy 2014-04-12 21:42:37 UTC
I just sent two traces to mmio [dot] dumps [at] gmail [dot] com, following the instructions here:
- Setup: http://nouveau.freedesktop.org/wiki/MmioTrace/
- Running xinit/xinit + glxgears: https://wiki.ubuntu.com/X/MMIOTracing

Specific process:
- install nvidia drivers

$ sudo pacman -Q | grep nvidia
nvidia 334.21-4
nvidia-libgl 334.21-7
nvidia-utils 334.21-7

- boot system into recovery by adding `s` to kernel line

- unload nvidia and drm

# rmmod nvidia drm

- verify nvidia not loaded

# lsmod | grep nvidia   # no results returned

- start trace

(/sys/kernel/debug was already mounted)
# echo mmiotrace > /sys/kernel/debug/tracing/current_tracer
# cat /sys/kernel/debug/tracing/trace_pipe > /home/user/file.txt

- run test

1) ran `xinit "sleep 10"`
2) ran `xinit -e sh -c "glxgears & sleep 10"`

- stop trace

# echo nop > /sys/kernel/debug/tracing/current_tracer

- compressed files

$ xz -z file.txt
$ mv file.txt.xz trace_quadro-fx-1800m_nv50-family-nva3_[xinit/glxgears].xz

- Send files and upload here as well (will come shortly)
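
The start and stop steps above can be wrapped in two small functions. This is only a sketch under stated assumptions: `TRACING_DIR`, `trace_start`, and `trace_stop` are hypothetical names, it must run as root with debugfs mounted, and it assumes the `cat` reading trace_pipe exits once the tracer is set back to `nop`.

```shell
#!/bin/sh
# Sketch of the mmiotrace start/stop steps from the procedure above.
# TRACING_DIR and both function names are illustrative, not a kernel API.
TRACING_DIR="${TRACING_DIR:-/sys/kernel/debug/tracing}"

# trace_start OUTPUT_FILE: enable mmiotrace and stream the pipe to a file.
trace_start() {
  echo mmiotrace > "$TRACING_DIR/current_tracer" || return 1
  cat "$TRACING_DIR/trace_pipe" > "$1" &
  TRACE_CAT_PID=$!
}

# trace_stop: switch back to the nop tracer and wait for the reader to end.
# Assumption: the background cat exits on its own after the switch.
trace_stop() {
  echo nop > "$TRACING_DIR/current_tracer" || return 1
  if [ -n "${TRACE_CAT_PID:-}" ]; then
    wait "$TRACE_CAT_PID" 2>/dev/null
  fi
  return 0
}

# Example use (as root), mirroring test 1 above:
#   trace_start /home/user/file.txt
#   xinit "sleep 10"
#   trace_stop
```

If the reader does not exit after `trace_stop`, it would need to be killed by hand before compressing the output.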

This also prompted me to provide some more system details in case that is of assistance:

- Computer: HP EliteBook 8540W (laptop)

- Running openbox 3.5.2 with following ~/.xinitrc contents:
#!/bin/sh
#
# ~/.xinitrc
#
# Executed by startx (run your window manager from here)

if [ -d /etc/X11/xinit/xinitrc.d ]; then
  for f in /etc/X11/xinit/xinitrc.d/*; do
    [ -x "$f" ] && . "$f"
  done
  unset f
fi

exec dbus-launch openbox-session

- various package information:

$ sudo pacman -Q | grep nouveau
nouveau-dri 10.1.0-4
xf86-video-nouveau 1.0.10-2

$ sudo pacman -Q |grep mesa
mesa 10.1.0-4
mesa-libgl 10.1.0-4

$ sudo pacman -Q | grep libtxc_dxtn
libtxc_dxtn 1.0.1-5

$ sudo pacman -Q | grep xorg-server
xorg-server 1.15.0-5
xorg-server-common 1.15.0-5
xorg-server-utils 7.6-3

- screen information:

$ xrandr
Screen 0: minimum 320 x 200, current 1600 x 900, maximum 8192 x 8192
LVDS-1 connected 1600x900+0+0 (normal left inverted right x axis y axis) 344mm x 193mm
   1600x900      60.04*+
   1152x864      59.96  
   1024x768      59.92  
   800x600       59.86  
   640x480       59.38  
   720x400       59.55  
   640x400       59.95  
   640x350       59.77  
DP-1 disconnected (normal left inverted right x axis y axis)
DP-2 disconnected (normal left inverted right x axis y axis)
DP-3 disconnected (normal left inverted right x axis y axis)
VGA-1 disconnected (normal left inverted right x axis y axis)

(No external monitors connected, if that wasn't apparent.)
Comment 10 jw.hendy 2014-04-12 21:44:43 UTC
Created attachment 97284
mmiotrace using `xinit "sleep 10"` with acceleration enabled

See comment here for description of trace procedure:
- https://bugs.freedesktop.org/show_bug.cgi?id=77371#c9
Comment 11 jw.hendy 2014-04-12 21:49:57 UTC
Created attachment 97285
mmiotrace using `xinit -e sh -c "glxgears & sleep 10"` with acceleration enabled

Ran trace per procedure here (this one for 3D acceleration by running glxgears):
- https://bugs.freedesktop.org/show_bug.cgi?id=77371#c9
Comment 12 Ilia Mirkin 2014-07-04 08:03:07 UTC
It appears that all GDDR5 NVA3s hang on start. I've acquired such a card, so hopefully I'll be able to make it work... we'll see.

*** This bug has been marked as a duplicate of bug 33165 ***

