Bug 20780

Summary: nouveau corrupts at start, then crashes after a few drawing operations. 7800gt when NoAccel=false
Product: xorg Reporter: Andy Matteson <xt.knight>
Component: Driver/nouveau Assignee: Nouveau Project <nouveau>
Status: RESOLVED INVALID QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium CC: xt.knight
Version: unspecified   
Hardware: x86-64 (AMD64)   
OS: Linux (All)   
Whiteboard:
Attachments:
Description | Flags
Attachment described below. | none
excerpt from tar.gz file: annotated dmesg of failing nouveau, NoAccel=false | none
~git200801xx xorg.log | none
~git200801xx xorg.conf | none
~git200801xx dmesg | none

Description Andy Matteson 2009-03-20 15:12:51 UTC
Created attachment 24097 [details]
Attachment described below.

Distro: Ubuntu Jaunty development version

Attached are three test cases containing the configuration used for each one and the corresponding logs.

In cases 411pm and 426pm, the video card shows corruption and then crashes, and Xorg exits with a GPU lockup message.  In case 431pm, Xorg starts successfully with NoAccel=true.

In all cases, Xorg was started with:

"sudo Xorg -logverbose 6 :0"

over ssh from another machine.  I couldn't get a backtrace from gdb because Xorg did not crash: it exited cleanly with code 1 (I think that was the number).
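
For anyone reproducing this, the exit status can be recorded rather than recalled from memory. The following is a minimal Python sketch and not part of the original report: `run_and_report` is a hypothetical helper name, and the commented invocation only mirrors the command quoted above.

```python
import subprocess
import sys


def run_and_report(argv):
    """Run a command, wait for it to finish, and return its exit status.

    On POSIX systems a negative value means the process was killed by a
    signal (a real crash); a small positive value such as 1 means the
    process chose to exit on its own.
    """
    completed = subprocess.run(argv)
    return completed.returncode


# Assumed usage, mirroring the command in this report (needs root and a
# free display, so it is left commented out):
#   status = run_and_report(["sudo", "Xorg", "-logverbose", "6", ":0"])
#   print("Xorg exit status:", status)
```

A positive status here would be consistent with the server shutting itself down after a fatal error instead of crashing, which is why gdb had nothing to catch.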

Using:
	git xf86-video-nouveau  ~ Thu Mar 19 2009 03:40 AM EDT
	git drm			~ Thu Mar 19 2009 03:36 AM EDT

$ dpkg -s xserver-xorg|grep Version
Version: 1:7.4~5ubuntu16

$ uname -a
Linux ubuntu 2.6.28.8 #1 SMP Thu Mar 19 22:31:12 EDT 2009 x86_64 GNU/Linux

( Custom compiled Jaunty kernel )

$ sudo lspci -vv

03:00.0 VGA compatible controller: nVidia Corporation G70 [GeForce 7800 GT] (rev a1)
	Subsystem: eVga.com. Corp. Device c518
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR+ FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 30
	Region 0: Memory at fa000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at d0000000 (64-bit, prefetchable) [size=256M]
	Region 3: Memory at f9000000 (64-bit, non-prefetchable) [size=16M]
	Region 5: I/O ports at bc00 [size=128]
	Expansion ROM at fbbe0000 [disabled] [size=128K]
	Capabilities: [60] Power Management version 2
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [68] Message Signalled Interrupts: Mask- 64bit+ Queue=0/0 Enable-
		Address: 0000000000000000  Data: 0000
	Capabilities: [78] Express (v1) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s <512ns, L1 <4us
			ExtTag- AttnBtn- AttnInd- PwrInd- RBE- FLReset-
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag- PhantFunc- AuxPwr- NoSnoop+
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr+ FatalErr- UnsuppReq+ AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Latency L0 <1us, L1 <4us
			ClockPM- Suprise- LLActRep- BwNot-
		LnkCtl:	ASPM Disabled; RCB 128 bytes Disabled- Retrain- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x16, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
	Capabilities: [100] Virtual Channel <?>
	Capabilities: [128] Power Budgeting <?>
	Kernel driver in use: nouveau
	Kernel modules: nvidiafb

$ lsmod

Module                  Size  Used by
nouveau               344256  1 
drm                   198624  2 nouveau
binfmt_misc            14476  1 
bridge                 59424  0 
stp                     7300  1 bridge
bnep                   18944  2 
vmnet                  50052  13 
ppdev                  12872  0 
parport_pc             40936  0 
vmblock                19664  3 
vmci                   61224  0 
vmmon                  81040  0 
input_polldev           8848  0 
video                  25108  0 
output                  7808  1 video
lp                     15620  0 
parport                45744  3 ppdev,parport_pc,lp
mt2131                 10500  1 
s5h1409                14404  1 
snd_hda_intel         537396  1 
snd_cmipci             45152  4 
snd_pcm_oss            47936  0 
gameport               17488  1 snd_cmipci
snd_mixer_oss          20928  1 snd_pcm_oss
snd_opl3_lib           16512  1 snd_cmipci
snd_pcm                93704  4 snd_hda_intel,snd_cmipci,snd_pcm_oss
snd_hwdep              12936  1 snd_opl3_lib
snd_mpu401_uart        12800  1 snd_cmipci
snd_seq_dummy           7620  0 
cx25840                33712  0 
snd_seq_oss            38016  0 
snd_seq_midi           11904  0 
cx23885               104204  0 
snd_rawmidi            29952  2 snd_mpu401_uart,snd_seq_midi
compat_ioctl32         14336  1 cx23885
videodev               41280  2 cx23885,compat_ioctl32
v4l1_compat            19972  1 videodev
cx2341x                18308  1 cx23885
videobuf_dma_sg        18756  1 cx23885
snd_seq_midi_event     12736  2 snd_seq_oss,snd_seq_midi
videobuf_dvb           12612  1 cx23885
dvb_core              102124  1 videobuf_dvb
snd_seq                62304  6 snd_seq_dummy,snd_seq_oss,snd_seq_midi,snd_seq_midi_event
videobuf_core          25732  3 cx23885,videobuf_dma_sg,videobuf_dvb
snd_timer              29200  3 snd_opl3_lib,snd_pcm,snd_seq
snd_seq_device         12372  6 snd_opl3_lib,snd_seq_dummy,snd_seq_oss,snd_seq_midi,snd_rawmidi,snd_seq
v4l2_common            19456  3 cx25840,cx23885,cx2341x
btcx_risc               9992  1 cx23885
tveeprom               19524  1 cx23885
snd                    74184  22 snd_hda_intel,snd_cmipci,snd_pcm_oss,snd_mixer_oss,snd_opl3_lib,snd_pcm,snd_hwdep,snd_mpu401_uart,snd_seq_oss,snd_rawmidi,snd_seq,snd_timer,snd_seq_device
soundcore              12896  1 snd
snd_page_alloc         14864  2 snd_hda_intel,snd_pcm
iTCO_wdt               17680  0 
iTCO_vendor_support     8580  1 iTCO_wdt
usbhid                 42688  0 
pcspkr                  7296  0 
ohci1394               37620  0 
ieee1394              103232  1 ohci1394
sky2                   57796  0 
ehci_hcd               44940  0 
uhci_hcd               30368  0
Comment 1 Andy Matteson 2009-03-20 15:14:53 UTC
Created attachment 24098 [details]
excerpt from tar.gz file:  annotated dmesg of failing nouveau, NoAccel=false
Comment 2 Andy Matteson 2009-03-20 15:16:47 UTC
To be clear, Xorg crashes after I run "DISPLAY=:0 xterm" and type a few commands in xterm to trigger a few drawing operations.

It exits with this message in the Xorg log:


Fatal server error:
Detected GPU lockup


Please consult the The X.Org Foundation support 
	 at http://wiki.x.org
 for help. 
Please also check the log file at "/var/log/Xorg.0.log" for additional information.

[   42.661694] (II) AT Translated Set 2 keyboard: Close
[   42.661732] (II) UnloadModule: "evdev"
[   42.669699] (II) Macintosh mouse button emulation: Close
[   42.669721] (II) UnloadModule: "evdev"
[   42.677687] (II) Microsoft Microsoft Optical Mouse with Tilt Wheel: Close
[   42.677709] (II) UnloadModule: "evdev"
[   42.677732] (II) NOUVEAU(0): NVLeaveVT is called.
[   44.677756] (II) NOUVEAU(0): Restoring encoders
[   44.677771] (II) NOUVEAU(0): 0xD417: Parsing digital output script table
[   44.678111] (II) NOUVEAU(0): 0xD467: Parsing digital output script table
[   44.698112] (II) NOUVEAU(0): Restoring crtcs
[   44.698534] (II) NOUVEAU(0): Restoring CRTC_OWNER to 3.
 ddxSigGiveUp: Closing log
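
A log like the one above can be scanned for the fatal message automatically. This is a minimal Python sketch under stated assumptions: `find_fatal_error` is a hypothetical helper, and it assumes the "Fatal server error:" marker and the message that follows carry no timestamp prefix, as in this excerpt.

```python
def find_fatal_error(log_text):
    """Return the message that follows 'Fatal server error:', or None.

    The message is taken to be the first run of non-empty lines after
    the marker, joined with spaces.
    """
    lines = log_text.splitlines()
    for i, line in enumerate(lines):
        if "Fatal server error:" not in line:
            continue
        message = []
        for follow in lines[i + 1:]:
            stripped = follow.strip()
            if not stripped:
                if message:
                    break      # blank line ends the message
                continue       # skip blank lines before the message
            message.append(stripped)
        return " ".join(message) if message else None
    return None


# Assumed usage with the default log path named in the message above:
#   with open("/var/log/Xorg.0.log") as f:
#       print(find_fatal_error(f.read()))
```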
Comment 3 Andy Matteson 2009-03-20 15:31:28 UTC
Also very important: I think that accelerated EXA used to work on my card a few months back.  (I am not sure about this.  To my knowledge I never changed the NoAccel parameter, and I don't know what the default was back then or at the time it was working.)  I do recall being able to play videos, and my desktop was definitely not this slow.
Comment 4 Andy Matteson 2009-03-20 22:30:40 UTC
Very sure it's related to this bug: https://bugs.launchpad.net/ubuntu/+source/xserver-xorg-video-nv/+bug/62230

and

http://www.nvnews.net/vbulletin/showthread.php?t=31858&pp=15&highlight=loop&page=27

The 'nv' driver freezes using XAA acceleration.

Backtrace of that:
(gdb) backtrace
#0  0x00007f8f71f58dc7 in NVDmaWait () from /usr/lib/xorg/modules/drivers//nv_drv.so
#1  0x00007f8f71f59e90 in ?? () from /usr/lib/xorg/modules/drivers//nv_drv.so
#2  0x00007f8f706d25c2 in XAATEGlyphRendererScanlineLSBFirst (pScrn=0x187c990, x=63, y=250, w=228, h=7, skipleft=0, startline=5, glyphs=0x18a6a90, glyphWidth=6, 
    fg=16777215, bg=0, rop=3, planemask=4294967295) at ../../../../hw/xfree86/xaa/./xaaTEGlyph.c:408
#3  0x00007f8f706acd6d in XAAGlyphBltTEColorExpansion (pScrn=0x187c990, xInit=63, yInit=<value optimized out>, font=<value optimized out>, fg=16777215, bg=0, 
    rop=3, planemask=4294967295, cclip=0x1b22d00, nglyph=38, gBase=0x0, ppci=0x18a4230) at ../../../../hw/xfree86/xaa/xaaTEText.c:297
#4  0x00007f8f706ad123 in XAAImageText16TEColorExpansion (pDraw=0x1b22cb0, pGC=0x1b1ba80, x=62, y=260, count=<value optimized out>, chars=0x1b8443c)
    at ../../../../hw/xfree86/xaa/xaaTEText.c:145
#5  0x00007f8f706e9068 in cwImageText16 (pDst=<value optimized out>, pGC=0x1b1ba80, x=62, y=260, count=38, chars=0x1b8443c) at ../../../miext/cw/cw_ops.c:425
#6  0x000000000053d3d0 in damageImageText16 (pDrawable=0x1b22cb0, pGC=0x1b1ba80, x=62, y=260, count=38, chars=0x1b8443c) at ../../../miext/damage/damage.c:1618
#7  0x0000000000450194 in doImageText (client=0x1b283b0, c=0x7fff7e1807e0) at ../../dix/dixfonts.c:1576
#8  0x00000000004503ac in ImageText (client=0x187d000, pDraw=<value optimized out>, pGC=0xa28, nChars=0, data=0x14 <Address 0x14 out of bounds>, xorg=1899069440, 
    yorg=260, reqType=<value optimized out>, did=2097197) at ../../dix/dixfonts.c:1627
#9  0x000000000044bce4 in ProcImageText16 (client=0x1b283b0) at ../../dix/dispatch.c:2205
#10 0x000000000044e354 in Dispatch () at ../../dix/dispatch.c:437
#11 0x0000000000433ddd in main (argc=4, argv=0x7fff7e180a18, envp=<value optimized out>) at ../../dix/main.c:397
(gdb) backtrace full
#0  0x00007f8f71f58dc7 in NVDmaWait () from /usr/lib/xorg/modules/drivers//nv_drv.so
No symbol table info available.
#1  0x00007f8f71f59e90 in ?? () from /usr/lib/xorg/modules/drivers//nv_drv.so
No symbol table info available.
#2  0x00007f8f706d25c2 in XAATEGlyphRendererScanlineLSBFirst (pScrn=0x187c990, x=63, y=250, w=228, h=7, skipleft=0, startline=5, glyphs=0x18a6a90, glyphWidth=6, 
    fg=16777215, bg=0, rop=3, planemask=4294967295) at ../../../../hw/xfree86/xaa/./xaaTEGlyph.c:408
	infoRec = (XAAInfoRecPtr) 0x18a38f0
	bufferNo = 1
	GlyphFunc = (GlyphScanlineFuncPtr) 0x7f8f706d1190 <DrawTETextScanlineWidth6>
#3  0x00007f8f706acd6d in XAAGlyphBltTEColorExpansion (pScrn=0x187c990, xInit=63, yInit=<value optimized out>, font=<value optimized out>, fg=16777215, bg=0, 
    rop=3, planemask=4294967295, cclip=0x1b22d00, nglyph=38, gBase=0x0, ppci=0x18a4230) at ../../../../hw/xfree86/xaa/xaaTEText.c:297
	fallbackBits = <value optimized out>
	infoRec = (XAAInfoRecPtr) 0x18a38f0
	skippix = <value optimized out>
	skipglyphs = <value optimized out>
	Right = 291
	Top = 250
	Bottom = 263
	LeftEdge = 63
	RightEdge = 228
	ytop = 250
	ybot = 263
	nbox = <value optimized out>
	pbox = (BoxPtr) 0x1b22d00
	glyphs = (unsigned int **) 0x18a6a90
	glyphWidth = 6
#4  0x00007f8f706ad123 in XAAImageText16TEColorExpansion (pDraw=0x1b22cb0, pGC=0x1b1ba80, x=62, y=260, count=<value optimized out>, chars=0x1b8443c)
    at ../../../../hw/xfree86/xaa/xaaTEText.c:145
	infoRec = (XAAInfoRecPtr) 0x18a38f0
	n = 38
#5  0x00007f8f706e9068 in cwImageText16 (pDst=<value optimized out>, pGC=0x1b1ba80, x=62, y=260, count=38, chars=0x1b8443c) at ../../../miext/cw/cw_ops.c:425
	pGCPrivate = (cwGCPtr) 0x1b1b9c0
	dst_off_x = 0
	dst_off_y = 0
	pBackingDst = (DrawablePtr) 0x1b22cb0
	pBackingGC = (GCPtr) 0x1b1ba80
#6  0x000000000053d3d0 in damageImageText16 (pDrawable=0x1b22cb0, pGC=0x1b1ba80, x=62, y=260, count=38, chars=0x1b8443c) at ../../../miext/damage/damage.c:1618
	pGCPriv = (DamageGCPrivPtr) 0x1b22a40
	oldFuncs = (GCFuncs *) 0x7cf3a0
#7  0x0000000000450194 in doImageText (client=0x1b283b0, c=0x7fff7e1807e0) at ../../dix/dixfonts.c:1576
	err = <value optimized out>
	lgerr = 2
	fpe = <value optimized out>
#8  0x00000000004503ac in ImageText (client=0x187d000, pDraw=<value optimized out>, pGC=0xa28, nChars=0, data=0x14 <Address 0x14 out of bounds>, xorg=1899069440, 
    yorg=260, reqType=<value optimized out>, did=2097197) at ../../dix/dixfonts.c:1627
	local_closure = {client = 0x1b283b0, pDraw = 0x1b22cb0, pGC = 0x1b1ba80, nChars = 38 '&', data = 0x1b8443c "", xorg = 62, yorg = 260, reqType = 77 'M', 
  imageText = 0x53d290 <damageImageText16>, itemSize = 2, did = 2097197, slept = 0}
#9  0x000000000044bce4 in ProcImageText16 (client=0x1b283b0) at ../../dix/dispatch.c:2205
	err = 2
	pDraw = (DrawablePtr) 0x1b22cb0
	pGC = (GC *) 0xa28
#10 0x000000000044e354 in Dispatch () at ../../dix/dispatch.c:437
	result = <value optimized out>
	client = (ClientPtr) 0x1b283b0
	nready = 0
	start_tick = 40
#11 0x0000000000433ddd in main (argc=4, argv=0x7fff7e180a18, envp=<value optimized out>) at ../../dix/main.c:397
	i = 1
	alwaysCheckForInput = {0, 1}
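
To read the call chain at a glance, the frame lines of a gdb backtrace like the one above can be reduced to function names. This is a minimal Python sketch, not something used in the original report: `frame_functions` and `FRAME_RE` are hypothetical names, and the pattern only covers the frame-line layout seen in this trace.

```python
import re

# Matches lines such as:
#   "#0  0x00007f8f71f58dc7 in NVDmaWait () from ..."
#   "#10 0x000000000044e354 in Dispatch () at ..."
# The address part is optional because gdb omits it for some frames.
FRAME_RE = re.compile(r"^#(\d+)\s+(?:0x[0-9a-fA-F]+ in )?(\S+) \(")


def frame_functions(backtrace_text):
    """Return (frame number, function name) pairs in the order printed."""
    frames = []
    for line in backtrace_text.splitlines():
        match = FRAME_RE.match(line)
        if match:
            frames.append((int(match.group(1)), match.group(2)))
    return frames
```

Applied to the plain "backtrace" output only (the "backtrace full" output repeats the same frames), the first pair is frame 0, NVDmaWait, which is where the nv driver was waiting when the trace was taken.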
Comment 5 Andy Matteson 2009-03-21 02:19:30 UTC
Created attachment 24112 [details]
~git200801xx xorg.log

Older version of nouveau gives more details about the crash and more fifo info.
Comment 6 Andy Matteson 2009-03-21 02:19:53 UTC
Created attachment 24113 [details]
~git200801xx xorg.conf
Comment 7 Andy Matteson 2009-03-21 02:20:20 UTC
Created attachment 24114 [details]
~git200801xx dmesg
Comment 8 Andy Matteson 2009-03-21 02:21:42 UTC
Actually, I don't know that EXA ever worked on this card specifically.  I tried different revisions of nouveau, all the way back to when ctx voodoo was first added for the nv47 (my card), and nothing worked right with NoAccel=false.  With NoAccel=true, the newer nouveau versions also work much better.

I attached error logs from a 2008-01-xx nouveau that exhibited the same symptoms as the most recent nouveau (corruption first, then a crash later) and gave more fifo info.
Comment 9 Andy Matteson 2009-03-21 13:41:17 UTC
An mmiotrace of nvidia 180 was sent to mmio.dumps@gmail.com at Sat Mar 21 4:40PM from xt.knight@gmail.com
Comment 10 Ilia Mirkin 2013-08-18 18:10:33 UTC
It appears that this bug report has lain dormant for quite a while. Sorry we haven't gotten to it. Since we fix bugs all the time, chances are pretty good that your issue has been fixed with the latest software. Please give it a shot. (Linux kernel 3.10.7, xf86-video-nouveau 1.0.9, mesa 9.1.6, or their git versions.) If upgrading to the latest isn't an option for you, your distro's bugzilla is probably the right destination for your bug report.

In an effort to clean up our bug list, we're pre-emptively closing all bugs that haven't seen updates since 2011. If the original issue remains, please make sure to provide fresh info, see http://nouveau.freedesktop.org/wiki/Bugs/ for what we need to see, and re-open this one.

Thanks,

The Nouveau Team
