Bug 33165

Summary: [NVA3] GDDR5 vram -> flickering screen (previously GPU lockup)
Product: xorg
Reporter: s47 <shulenkov>
Component: Driver/nouveau
Assignee: Nouveau Project <nouveau>
Status: RESOLVED MOVED
QA Contact: Xorg Project Team <xorg-team>
Severity: normal
Priority: medium
CC: a.j.buxton, AlesSvoboda, andrew, andrius, eugene, justusranvier, jw.hendy, kricsek, ktmdms, nemesis, oleg, phix.nay, raymondmeester, tom.winterhalder, travneff, xomachiner
Version: 7.5 (2009.10)
Hardware: x86 (IA32)
OS: Linux (All)
Attachments (description, flags):
  VBIOS for NVIDIA GT ® 240 (flags: none)
  kernel_log.txt (flags: none)
  kernel_log_drm_debug_14.txt (flags: none)
  kernel_log_nouveau_drm_debug.txt (flags: none)
  initialize stuff for gddr5 ram (flags: none)
  another patch for gddr5 ram (flags: none)
  noise example (flags: none)
  kernel log for weston freeze (flags: none)
  Firefox and glxgears at the same time (flags: none)

Description s47 2011-01-15 11:06:04 UTC
Most modern distros load nouveau by default for NVIDIA cards. When it loads, the machine hangs completely and does not respond to any known key combination (for example, Alt+SysRq+B). The logs were empty (0 bytes). I worked around the problem by swapping the card for an older NVIDIA 9x00 and installing the proprietary driver; after the swap everything was fine. The card itself was checked with several tests and passed them all.
Comment 1 Daniel Wyatt 2011-01-17 08:53:41 UTC
Yes, this is quite annoying.
I recall I upgraded Ubuntu and rebooted only to encounter this hang.
I can't even boot up some of the live CDs (Ubuntu, Linux Mint, ...) without bypassing nouveau.
I've switched back to the proprietary driver and everything is fine.
But we need to get to the bottom of this.
Comment 2 Jonathan Heaney 2011-01-31 10:17:45 UTC
The nouveau driver does not appear to provide acceleration (at least for the moment) with GT240 cards.  You can get X to start, however, by using

Option "NoAccel" "true"

In your X config.

It is painfully slow though, so the only real option is to use the proprietary NVidia driver.

The KMS/Nouveau framebuffer works as well, but again without acceleration. I have a copy of the dmesg output from when it loads the driver and will upload it if that would be useful; I guess the devs already know of this issue with GT240s, though.
Comment 3 Eduard Gotwig 2012-06-19 11:25:46 UTC
Still present :(
Comment 4 Eduard Gotwig 2012-06-19 11:31:08 UTC
Created attachment 63234 [details]
VBIOS for NVIDIA GT ® 240
Comment 5 Maarten Maathuis 2012-06-19 11:56:16 UTC
You might want to actually post a recent kernel log showing your problem, because the developers won't get anywhere with vague descriptions.
Comment 6 Lucas Stach 2012-06-19 14:05:20 UTC
Also following up on your mailinglist post:
There is no general problem with nouveau and nv92, which is one of the most common chipsets and works without problems for most of us.

So you have to be more specific about what problems you do encounter in order to get real help.
Comment 7 Dinny Wu 2012-11-22 05:29:00 UTC
I'm trying to customize a live system based on Debian, and it has encountered the same issue with a GT240 card. It seems to work fine with other nVidia cards; only with the GT240 does it freeze when booting into X.

If you need any logs, please let me know how to get them.
Comment 8 Marcin Slusarz 2012-11-22 19:22:27 UTC
Please follow http://nouveau.freedesktop.org/wiki/Bugs and open a new bug report.
Comment 10 Ilia Mirkin 2013-08-24 03:14:26 UTC
*** Bug 57211 has been marked as a duplicate of this bug. ***
Comment 11 Ilia Mirkin 2013-08-24 03:14:40 UTC
*** Bug 49057 has been marked as a duplicate of this bug. ***
Comment 12 Ilia Mirkin 2013-08-24 03:15:04 UTC
*** Bug 52509 has been marked as a duplicate of this bug. ***
Comment 13 Ilia Mirkin 2013-08-24 03:15:34 UTC
*** Bug 41333 has been marked as a duplicate of this bug. ***
Comment 14 Ilia Mirkin 2013-08-24 03:17:20 UTC
Please re-test with the latest kernel. There are reports of all nva3+ cards working fine for most people with current software.
Comment 15 Andrius Štikonas 2013-08-25 20:11:21 UTC
I can still reproduce it with kernel 3.11 rc6. Should I run different kernel?
Comment 16 Ilia Mirkin 2013-08-25 22:23:40 UTC
(In reply to comment #15)
> I can still reproduce it with kernel 3.11 rc6. Should I run different kernel?

That is the right kernel. Can you describe exactly what it is that you can reproduce? Logs? etc. See http://nouveau.freedesktop.org/wiki/Bugs/ for what we need.

[I know some of the bugs I marked as a dup of this one had that for old versions, but I'd still like to see updated info.]
Comment 17 Andrius Štikonas 2013-08-26 09:16:20 UTC
Created attachment 84637 [details]
kernel_log.txt

GPU locks up when nouveau module is started with KMS enabled. It is no longer possible to start X in this case but the ttys are still usable.
Comment 18 Andrius Štikonas 2013-08-26 09:17:57 UTC
Created attachment 84638 [details]
kernel_log_drm_debug_14.txt

Kernel log with drm.debug=14
Comment 19 Ilia Mirkin 2013-08-26 14:02:37 UTC
Wow, so right on boot:

[   15.639931] nouveau E[     DRM] GPU lockup - switching to software fbcon
[   15.643599] [drm] Initialized nouveau 1.1.1 20120801 for 0000:04:00.0 on minor 0

It locks up before the probe is even done! One thing to check -- do you have any old firmware in /lib/firmware/nouveau? You shouldn't need any firmware to run nva3 (except the video firmware if you want vdpau).

Also, could you produce a log with

drm.debug=14 nouveau.debug=trace

That might provide more info as to what's going on.
Comment 20 Andrius Štikonas 2013-08-26 14:49:55 UTC
Created attachment 84652 [details]
kernel_log_nouveau_drm_debug.txt

There is certainly no firmware in /lib/firmware/nouveau.
I've now attached a new dmesg output.
Comment 21 Ilia Mirkin 2013-08-26 15:46:08 UTC
*** Bug 61460 has been marked as a duplicate of this bug. ***
Comment 22 Kevin Martin 2013-08-26 17:27:35 UTC
I no longer have accel turned off but I have had to add:

        Option     "ShadowFB" "1"           	# [<bool>]

to get mine to work.  

Xorg.0.log:
[   140.225] Current Operating System: Linux ktmtoshiba 3.11.0-0.rc6.git2.1.fc21.x86_64 #1 SMP Thu Aug 22 21:02:34 UTC 2013 x86_64
[   140.225] Kernel command line: BOOT_IMAGE=/vmlinuz-3.11.0-0.rc6.git2.1.fc21.x86_64 root=UUID=895ff4c3-db90-4071-aeab-a0ec2d6b94cf ro rd.md=0 rd.lvm=0 rd.dm=0 rd.luks=0 ipv6.disable=1 rhgb quiet acpi_backlight=vendor acpi_osi=Linux
[   140.225] Build Date: 30 July 2013  06:10:29AM
[   140.225] Build ID: xorg-x11-server 1.14.2-9.fc20 
[   140.225] Current version of pixman: 0.30.0

[   140.567] (II) Module nouveau: vendor="X.Org Foundation"
[   140.567] 	compiled for 1.14.2, module version = 1.0.9
[   140.567] 	Module class: X.Org Video Driver
[   140.567] 	ABI class: X.Org Video Driver, version 14.1

dmesg:
[    4.516476] fbcon: nouveaufb (fb0) is primary device
[    6.059440] nouveau E[     DRM] GPU lockup - switching to software fbcon
[    6.072773] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[    6.072776] nouveau 0000:01:00.0: registered panic notifier
[    6.072852] [drm] Initialized nouveau 1.1.1 20120801 for 0000:01:00.0 on minor 0
Comment 23 Martin Peres 2013-08-27 14:41:32 UTC
So, Stikonas's bug is due to PGRAPH not processing any commands .... ever.

I pushed his mmiotrace in the vbios repo for anyone willing to check what is wrong in nouveau's way of setting up pgraph.
Comment 24 Andrius Štikonas 2013-08-27 22:15:46 UTC
*** Bug 52244 has been marked as a duplicate of this bug. ***
Comment 25 Oleg Bulatov 2014-02-17 20:52:32 UTC
Still present in 3.13.2-gentoo. I found that nv50_fbcon_imageblit returns -EBUSY.

> what is wrong in nouveau's way of setting up pgraph.
Does it mean that the bug is somewhere in nv50_grctx_generate? To confirm that the bug is in this function, is it possible to replace it with data from an mmiotrace of the nvidia driver? It is not obvious to me how to modify nv50_grctx_fill in that case, or what nv50_grctx_init should write to *size. And would that help at all?
Comment 26 Ilia Mirkin 2014-03-09 11:22:05 UTC
Created attachment 95408 [details] [review]
initialize stuff for gddr5 ram

This patch is based on Andrew's mmiotrace. I saw a sequence of init that I didn't see in other traces for working cards. Andrew -- would be interesting if you could check it out and see if it helps. The algorithm I came up with is insane -- there has to be some sort of data-driven thing going on there, but I couldn't work it out.

For people having issues, please do a mmiotrace of the blob loading on your card (see https://wiki.ubuntu.com/X/MMIOTracing for instructions), and send it (xz -9'd) to mmio.dumps@gmail.com.
Comment 27 Andrius Štikonas 2014-03-09 11:25:03 UTC
(In reply to comment #26)
> Created attachment 95408 [details] [review]
> initialize stuff for gddr5 ram
> 
> This patch is based on Andrew's mmiotrace. I saw a sequence of init that I
> didn't see in other traces for working cards. Andrew -- would be interesting
> if you could check it out and see if it helps. The algorithm I came up with
> is insane -- there has to be some sort of data-driven thing going on there,
> but I couldn't work it out.
> 
> For people having issues, please do a mmiotrace of the blob loading on your
> card (see https://wiki.ubuntu.com/X/MMIOTracing for instructions), and send
> it (xz -9'd) to mmio.dumps@gmail.com.

I might be able to test this patch in the middle of April. Sorry, I can't access that computer now.
Comment 28 Oleg Bulatov 2014-03-09 13:19:39 UTC
For me, with this patch the problem is still present. I sent my mmiotrace (NVIDIA Corporation GT215 [GeForce GT 240]).
Comment 29 Oleg Bulatov 2014-03-09 16:27:28 UTC
Created attachment 95418 [details] [review]
another patch for gddr5 ram

I made this patch using my mmiotrace, but it doesn't help either.
Comment 30 Oleg Bulatov 2014-03-09 16:34:36 UTC
Comment on attachment 95408 [details] [review]
initialize stuff for gddr5 ram

Review of attachment 95408 [details] [review]:
-----------------------------------------------------------------

::: drivers/gpu/drm/nouveau/core/subdev/fb/ramnva3.c
@@ +365,5 @@
> +		/* XXX this algorithm is insane, find some sanity to it. */
> +		/* [1] MMIO32 R 0x100268 0x30030200 PFB.SUBPART_CONFIG => { SELECT_MASK = 0x2 | UNK16 = 0x3 | ENABLE_MASK = 0x3 } */
> +		nv_wr32(pfb, 0x10fcac, 0x00001f01);
> +		for (o = 0; o < 4; o++) {
> +			int off = offsets[i];

maybe int off = offsets[o]; ?

@@ +387,5 @@
> +			for (i = 0x20, idx = 0; i < 0x30; i++, idx++) {
> +				int pat = pattern[2 + (idx % 2)];
> +				if (i == 0x26)
> +					pat = 0;
> +				if (i == 0x1f)

looks like this condition is always false in this for-loop
Comment 31 Ilia Mirkin 2014-03-09 22:53:42 UTC
(In reply to comment #29)
> Created attachment 95418 [details] [review]
> another patch for gddr5 ram
> 
> I made this patch using my mmiotrace, but it doesn't help either.

Oh well. Indeed the values in your mmiotrace are different, but your and Andrew's mmiotrace are the only traces I've seen that actually have the writes to those 0x10fxyz registers. (A few other traces have writes to the similar-but-different DDR3 registers.)

And yes, I definitely did want to use offsets[o] in my patch.

Well, perhaps this 0x10f stuff has nothing to do with the issue. My observation was that everyone it was broken for seemed to have GDDR5 RAM, which is why I went in this direction.
Comment 32 jw.hendy 2014-04-12 02:11:33 UTC
I'm trying like crazy to figure out why I have GPU lockups, but all the bugs seem to dead end. What is the current status of this? More information, not going to fix, unconfirmed? I have an NVA3 card (Quadro FX 1800M, GT215) which cannot startx with acceleration, even with firmware.

Let me know if that's applicable to this bug and how I can provide information, if so. For reference, I'm on Arch 64bit. Trying to track down similar bugs to decide if I'm a duplicate issue or if I should start a new report.
Comment 33 Ilia Mirkin 2014-07-04 08:03:07 UTC
*** Bug 77371 has been marked as a duplicate of this bug. ***
Comment 34 Ben Skeggs 2014-10-15 03:50:28 UTC
Could you guys give this patch a try?

http://cgit.freedesktop.org/~darktama/nouveau/commit/?id=1205e927a835cf9d707bd558c067f4c00ed31ec5

Thanks,
Ben.
Comment 35 Oleg Bulatov 2014-10-16 20:03:02 UTC
Awesome, it works for me.

Linux 3.14.14-gentoo
OpenGL vendor string: nouveau
OpenGL renderer string: Gallium 0.4 on NVA3
OpenGL core profile version string: 3.3 (Core Profile) Mesa 10.2.7
OpenGL core profile shading language version string: 3.30
Comment 36 Oleg Bulatov 2014-10-17 08:06:08 UTC
Created attachment 107979 [details]
noise example

In glxgears, color noise appears on some frames, and it disappears or changes on the next frame.
The same kind of noise (shuffled pixels) may appear while scrolling in Firefox or switching tabs.
Comment 37 Oleg Bulatov 2014-10-17 08:15:24 UTC
Created attachment 107980 [details]
kernel log for weston freeze

In weston, at some point the computer stops responding to keyboard and mouse events.
Comment 38 cybjit 2014-10-18 16:55:18 UTC
I applied that patch to Ubuntu Trusty 3.13.0-38.65, and it gives a huge improvement. X no longer crashes on startup, and 3D acceleration works.

However I also get the glitchy noise a few times a second, except when showing a static image. It is all over the screen, and sometimes appears as triangles.

Running glxgears and Firefox at the same time also starts giving errors in the kernel log pretty quickly.
Comment 39 cybjit 2014-10-18 16:58:04 UTC
Created attachment 108030 [details]
Firefox and glxgears at the same time
Comment 40 Ilia Mirkin 2014-10-20 00:35:38 UTC
*** Bug 78570 has been marked as a duplicate of this bug. ***
Comment 41 Ilia Mirkin 2015-04-14 17:03:56 UTC
*** Bug 89991 has been marked as a duplicate of this bug. ***
Comment 42 Martin Peres 2019-12-04 08:25:22 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/12.
