Bug 103467

Summary: [NVC0/GF108] fifo: read fault at cd90f4d000 engine 07 [PFIFO] client 07 [BAR_READ] reason 00 [PT_NOT_PRESENT] on channel 1 [003fe11000 DRM]
Product: xorg Reporter: erhard_f
Component: Driver/nouveau Assignee: Nouveau Project <nouveau>
Status: RESOLVED MOVED QA Contact: Xorg Project Team <xorg-team>
Severity: normal    
Priority: medium    
Version: 7.7 (2012.06)   
Hardware: PowerPC   
OS: All   
Whiteboard:
Attachments:
Description                                                      Flags
dmesg output                                                     none
4.13.9 kernel .config                                            none
dmesg output                                                     none
dmesg output with nouveau.debug=trace                            none
journalctl -b output with nouveau.debug=trace                    none
journalctl -b output with nouveau.debug=trace (kernel 4.15-rc2)  none
Xorg.log                                                         none

Description erhard_f 2017-10-26 14:36:54 UTC
Created attachment 135065 [details]
dmesg output

In my PowerMac G5 11,2 I am running a GeForce 6600 Mac Edition (which works), and I added a GeForce GT 430 as a second card. The new card does not seem to get initialized properly:

[...]
Okt 26 15:46:33 T801 kernel: nouveau 0001:06:00.0: fifo: PBDMA0: 00004000 [] ch 1 [003fe11000 DRM] subc 0 mthd 0000 data 44dc3569
Okt 26 15:46:33 T801 kernel: nouveau 0001:06:00.0: fifo: PBDMA0: 00204000 [ILLEGAL_MTHD] ch 1 [003fe11000 DRM] subc 0 mthd 0004 data 39e65d54
Okt 26 15:46:33 T801 kernel: nouveau 0001:06:00.0: fifo: PBDMA0: 00204000 [ILLEGAL_MTHD] ch 1 [003fe11000 DRM] subc 0 mthd 000c data 2bd98280
Okt 26 15:46:33 T801 kernel: nouveau 0001:06:00.0: fifo: PBDMA0: 00004000 [] ch 1 [003fe11000 DRM] subc 0 mthd 0018 data e384a768
Okt 26 15:46:33 T801 kernel: nouveau 0001:06:00.0: fifo: PBDMA0: 00004000 [] ch 1 [003fe11000 DRM] subc 0 mthd 001c data 8e86bf68
Okt 26 15:46:33 T801 kernel: nouveau 0001:06:00.0: fifo: read fault at cd90f4d000 engine 07 [PFIFO] client 07 [BAR_READ] reason 00 [PT_NOT_PRESENT] on channel 1 [003fe11000 DRM]
Okt 26 15:46:33 T801 kernel: nouveau 0001:06:00.0: fifo: fifo engine fault on channel 1, recovering...
Okt 26 15:46:33 T801 kernel: nouveau 0001:06:00.0: DRM: channel 1 killed!
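
For anyone comparing several such logs, the read-fault line above can be split into its fields with a small script. This is only a minimal sketch: the regular expression is inferred from the single fault line quoted in this report, not from the nouveau source, and the `write` alternative and the `parse_fault` helper name are assumptions made for illustration.

```python
import re

# Pattern inferred from the one fault line quoted in this report.
# It is NOT taken from the nouveau source; the "write" alternative is an
# assumption, and other kernel versions may format the line differently.
FAULT_RE = re.compile(
    r"fifo: (?P<access>read|write) fault at (?P<addr>[0-9a-f]+) "
    r"engine (?P<engine_id>[0-9a-f]+) \[(?P<engine>[^\]]*)\] "
    r"client (?P<client_id>[0-9a-f]+) \[(?P<client>[^\]]*)\] "
    r"reason (?P<reason_id>[0-9a-f]+) \[(?P<reason>[^\]]*)\] "
    r"on channel (?P<channel>\d+) \[(?P<inst>[0-9a-f]+) (?P<name>[^\]]*)\]"
)


def parse_fault(line):
    """Return the fault fields as a dict, or None if the line does not match."""
    m = FAULT_RE.search(line)
    if m is None:
        return None
    fields = m.groupdict()
    fields["addr"] = int(fields["addr"], 16)   # faulting address as an integer
    fields["channel"] = int(fields["channel"])  # channel number as an integer
    return fields


# The fault line exactly as quoted in the description above.
line = (
    "Okt 26 15:46:33 T801 kernel: nouveau 0001:06:00.0: fifo: read fault at "
    "cd90f4d000 engine 07 [PFIFO] client 07 [BAR_READ] reason 00 "
    "[PT_NOT_PRESENT] on channel 1 [003fe11000 DRM]"
)
fault = parse_fault(line)
print(fault["engine"], fault["client"], fault["reason"], hex(fault["addr"]))
```

The dict keeps the bracketed names (`PFIFO`, `BAR_READ`, `PT_NOT_PRESENT`) next to their numeric IDs, which makes it easy to check whether the fault is the same across boots.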

# lspci -vv -s 0001:06:00.0
0001:06:00.0 VGA compatible controller: NVIDIA Corporation GF108 [GeForce GT 430] (rev a1) (prog-if 00 [VGA controller])
	Control: I/O+ Mem+ BusMaster+ SpecCycle- MemWINV- VGASnoop- ParErr- Stepping- SERR- FastB2B- DisINTx-
	Status: Cap+ 66MHz- UDF- FastB2B- ParErr- DEVSEL=fast >TAbort- <TAbort- <MAbort- >SERR- <PERR- INTx-
	Latency: 0, Cache Line Size: 64 bytes
	Interrupt: pin A routed to IRQ 65
	Region 0: Memory at 81000000 (32-bit, non-prefetchable) [size=16M]
	Region 1: Memory at 88000000 (64-bit, prefetchable) [size=128M]
	Region 3: Memory at 82000000 (64-bit, prefetchable) [size=32M]
	Region 5: I/O ports at 0000 [size=128]
	Expansion ROM at 80180000 [disabled] [size=512K]
	Capabilities: [60] Power Management version 3
		Flags: PMEClk- DSI- D1- D2- AuxCurrent=0mA PME(D0-,D1-,D2-,D3hot-,D3cold-)
		Status: D0 NoSoftRst+ PME-Enable- DSel=0 DScale=0 PME-
	Capabilities: [68] MSI: Enable- Count=1/1 Maskable- 64bit+
		Address: 0000000000000000  Data: 0000
	Capabilities: [78] Express (v2) Endpoint, MSI 00
		DevCap:	MaxPayload 128 bytes, PhantFunc 0, Latency L0s unlimited, L1 <64us
			ExtTag+ AttnBtn- AttnInd- PwrInd- RBE+ FLReset- SlotPowerLimit 0.000W
		DevCtl:	Report errors: Correctable- Non-Fatal- Fatal- Unsupported-
			RlxdOrd+ ExtTag+ PhantFunc- AuxPwr- NoSnoop-
			MaxPayload 128 bytes, MaxReadReq 512 bytes
		DevSta:	CorrErr- UncorrErr+ FatalErr- UnsuppReq- AuxPwr- TransPend-
		LnkCap:	Port #0, Speed 2.5GT/s, Width x16, ASPM L0s L1, Exit Latency L0s <1us, L1 <4us
			ClockPM+ Surprise- LLActRep- BwNot- ASPMOptComp-
		LnkCtl:	ASPM Disabled; RCB 128 bytes Disabled- CommClk-
			ExtSynch- ClockPM- AutWidDis- BWInt- AutBWInt-
		LnkSta:	Speed 2.5GT/s, Width x8, TrErr- Train- SlotClk+ DLActive- BWMgmt- ABWMgmt-
		DevCap2: Completion Timeout: Not Supported, TimeoutDis+, LTR-, OBFF Not Supported
		DevCtl2: Completion Timeout: 50us to 50ms, TimeoutDis-, LTR-, OBFF Disabled
		LnkCtl2: Target Link Speed: 2.5GT/s, EnterCompliance- SpeedDis-
			 Transmit Margin: Normal Operating Range, EnterModifiedCompliance- ComplianceSOS-
			 Compliance De-emphasis: -6dB
		LnkSta2: Current De-emphasis Level: -6dB, EqualizationComplete-, EqualizationPhase1-
			 EqualizationPhase2-, EqualizationPhase3-, LinkEqualizationRequest-
	Capabilities: [b4] Vendor Specific Information: Len=14 <?>
	Kernel driver in use: nouveau
	Kernel modules: nouveau
Comment 1 erhard_f 2017-10-26 14:38:09 UTC
Created attachment 135066 [details]
4.13.9 kernel .config
Comment 2 Ilia Mirkin 2017-10-26 14:48:49 UTC
Can you grab dmesg from the beginning, and boot with nouveau.debug=trace? Try increasing the kernel log buffer, or send it over netconsole or something.

Note that no one has gotten this to work in the past (although few have tried with any level of persistence).

I believe at last report, the issue was something to do with the color LUT being set up wrong, although that's clearly not your present issue.
Comment 3 erhard_f 2017-10-26 15:29:41 UTC
Thanks for your fast reply! Yes, it seems I messed this up... Here is the dmesg from the beginning, and another one with nouveau.debug=trace.
Comment 4 erhard_f 2017-10-26 15:30:19 UTC
Created attachment 135071 [details]
dmesg output
Comment 5 Ilia Mirkin 2017-10-26 15:31:16 UTC
(In reply to erhard_f from comment #4)
> Created attachment 135071 [details]
> dmesg output

This log appears to only capture kernel messages logged at the 'error' level.
Comment 6 erhard_f 2017-10-26 15:31:42 UTC
Created attachment 135072 [details]
dmesg output with nouveau.debug=trace
Comment 7 Ilia Mirkin 2017-10-26 15:40:27 UTC
(In reply to erhard_f from comment #6)
> Created attachment 135072 [details]
> dmesg output with nouveau.debug=trace

And this one appears to *only* have logs at the debug level or something else odd. I'm not sure what you're doing ... but I just want the same output that 'dmesg' might output. I think you're fighting some logging system which thinks it knows better... I wish you the best of luck with that.
Comment 8 erhard_f 2017-10-26 16:21:58 UTC
Sorry, but that's the best I can do at the moment... When booting up the console gets flooded with debug messages, logging in is only possible afterwards. The dmesg > kernel_debug-log.txt I create at that time does not show the boot process from the beginning but only repeats:
[   18.488344] nouveau: DRM:00000000:00000000: ioctl: size 48
[   18.488346] nouveau: DRM:00000000:00000000: ioctl: vers 0 type 04 object c000000459bba070 owner ff
[   18.488347] nouveau: DRM:00000000:00000080: ioctl: mthd size 24
[   18.488348] nouveau: DRM:00000000:00000080: ioctl: mthd vers 0 mthd 01
[   18.488350] nouveau: DRM:00000000:00000080: ioctl: device mthd 00000001
[   18.488351] nouveau: DRM:00000000:00000080: ioctl: device time size 16
[   18.488352] nouveau: DRM:00000000:00000080: ioctl: device time vers 0
[   18.488356] nouveau: DRM:00000000:00000000: ioctl: return 0

even if the log buffer is set to 4M. As an alternative I can only offer the output of systemd's journalctl -k.
Comment 9 Ilia Mirkin 2017-10-26 16:23:59 UTC
(In reply to erhard_f from comment #8)
> Sorry, but that's the best I can do at the moment... When booting up the
> console gets flooded with debug messages, logging in is only possible
> afterwards. The dmesg > kernel_debug-log.txt I create at that time does not
> show the boot process from the beginning but only repeats:
> [   18.488344] nouveau: DRM:00000000:00000000: ioctl: size 48
> [   18.488346] nouveau: DRM:00000000:00000000: ioctl: vers 0 type 04 object
> c000000459bba070 owner ff
> [   18.488347] nouveau: DRM:00000000:00000080: ioctl: mthd size 24
> [   18.488348] nouveau: DRM:00000000:00000080: ioctl: mthd vers 0 mthd 01
> [   18.488350] nouveau: DRM:00000000:00000080: ioctl: device mthd 00000001
> [   18.488351] nouveau: DRM:00000000:00000080: ioctl: device time size 16
> [   18.488352] nouveau: DRM:00000000:00000080: ioctl: device time vers 0
> [   18.488356] nouveau: DRM:00000000:00000000: ioctl: return 0
> 
> even if the log buffer is set to 4M. As an alternative I can only offer the
> output of systemd's journalctl -k.

OK, let's try it with nouveau.debug=debug -- that should generate considerably less junk.
Comment 10 erhard_f 2017-10-26 17:12:41 UTC
Hm, unfortunately nouveau.debug=debug did not improve things enough to get a log from the beginning. But after reading through https://freedesktop.org/wiki/Software/systemd/Debugging/ and configuring journald with:

#ForwardToWall=no
#MaxFileSec=1h
#MaxRetentionSec=1month
#RateLimitIntervalSec=0
#RateLimitBurst=0
#Storage=persistent
#SyncIntervalSec=20
#SystemMaxUse=1G
#RuntimeMaxUse=1G

I managed to get what I think is a full log from the beginning with nouveau.debug=trace, hopefully with all the information needed. There's other stuff in it, but if you filter for the "kernel" lines only, this should be the same output as dmesg.

Please have a look at it. If this log is still not helpful I hope I will manage to get this netconsole thingy running...
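
The "kernel"-line filtering mentioned above can be sketched as follows. This is a rough illustration only: it assumes journalctl's default short output format (`<month> <day> <time> <host> <identifier>: <message>`), the `kernel_lines` helper name is made up here, and the non-kernel sample line is hypothetical rather than taken from the attached log.

```python
def kernel_lines(journal_lines):
    """Yield the message part of journal lines logged by the kernel.

    Assumes the short journalctl format:
    "<month> <day> <time> <host> <identifier>: <message>"
    """
    prefix = "kernel: "
    for line in journal_lines:
        # Split off month, day, time and host; the rest is "<ident>: <msg>".
        parts = line.split(None, 4)
        if len(parts) == 5 and parts[4].startswith(prefix):
            yield parts[4][len(prefix):]


sample = [
    # Kernel line as quoted in the description of this bug.
    "Okt 26 15:46:33 T801 kernel: nouveau 0001:06:00.0: DRM: channel 1 killed!",
    # Hypothetical non-kernel line, included only to show it being dropped.
    "Okt 26 15:46:33 T801 someservice[123]: hypothetical non-kernel message",
]
filtered = list(kernel_lines(sample))
for message in filtered:
    print(message)
```

On a live system, journalctl -k (mentioned in comment 8) restricts the output to kernel messages directly; the script is only useful for an already saved journalctl -b dump.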
Comment 11 erhard_f 2017-10-26 17:13:59 UTC
Created attachment 135083 [details]
journalctl -b output with nouveau.debug=trace
Comment 12 Ilia Mirkin 2017-10-26 17:39:27 UTC
(In reply to erhard_f from comment #11)
> Created attachment 135083 [details]
> journalctl -b output with nouveau.debug=trace

Success! (I think. At least much more success than previous times.)
Comment 13 erhard_f 2017-12-06 21:21:07 UTC
It seems kernel 4.15-rc2 changed things a bit. The card now appears to get initialized, and I can seemingly start X successfully.

But I still only get to see a black screen ('No signal' on the monitor).
Comment 14 erhard_f 2017-12-06 21:22:33 UTC
Created attachment 136016 [details]
journalctl -b output with nouveau.debug=trace (kernel 4.15-rc2)
Comment 15 erhard_f 2017-12-06 21:25:29 UTC
Created attachment 136017 [details]
Xorg.log
Comment 16 Martin Peres 2019-12-04 09:32:44 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/379.
