Bug 108982

Summary: GM206: MMIO write of 800000ec FAULT at 10eb14 [ IBUS ]
Product: xorg
Component: Driver/nouveau
Status: RESOLVED MOVED
Severity: normal
Priority: medium
Version: unspecified
Hardware: x86-64 (AMD64)
OS: Linux (All)
Reporter: Johnny B. Goode <adamos>
Assignee: Nouveau Project <nouveau>
QA Contact: Xorg Project Team <xorg-team>
CC: adamos
Whiteboard:
Attachments:
Description                   Flags
GM206 nouveau.debug trace     none
GM206 nouveau.debug trace2    none
Rom                           none

Description Johnny B. Goode 2018-12-08 16:07:26 UTC
i7-6700, GM206 [GTX 950]
No touch screen, no hybrid graphics; a plain desktop box. The on-chip Intel 530 is disabled in the BIOS. The only graphics card is a single Asus card.
The problem started with kernel 4.10.x and persists in later kernels. Kernel 4.9.x is free of the FAULT. It occurs on different distributions (Fedora, Gentoo).

dmesg | grep nouveau
[   11.511078] fb: switching to nouveaufb from EFI VGA
[   11.511602] nouveau 0000:01:00.0: NVIDIA GM206 (126020a1)
[   11.601757] nouveau 0000:01:00.0: bios: version 84.06.3d.00.6f
[   11.885711] nouveau 0000:01:00.0: fb: 2048 MiB GDDR5
[   11.885746] nouveau 0000:01:00.0: bus: MMIO write of 800000ec FAULT at 10eb14 [ IBUS ]
[   11.979550] nouveau 0000:01:00.0: DRM: VRAM: 2048 MiB
[   11.979551] nouveau 0000:01:00.0: DRM: GART: 1048576 MiB
[   11.979553] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
[   11.979554] nouveau 0000:01:00.0: DRM: DCB version 4.1
[   11.979556] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f02 00020030
[   11.979558] nouveau 0000:01:00.0: DRM: DCB outp 01: 02000f00 00000000
[   11.979559] nouveau 0000:01:00.0: DRM: DCB outp 02: 02811f66 04400010
[   11.979560] nouveau 0000:01:00.0: DRM: DCB outp 03: 02011f62 00020010
[   11.979562] nouveau 0000:01:00.0: DRM: DCB outp 04: 02022f72 00020020
[   11.979563] nouveau 0000:01:00.0: DRM: DCB outp 05: 04033f82 00020030
[   11.979564] nouveau 0000:01:00.0: DRM: DCB outp 15: 01df4ff8 00000000
[   11.979566] nouveau 0000:01:00.0: DRM: DCB conn 00: 00001030
[   11.979567] nouveau 0000:01:00.0: DRM: DCB conn 01: 00010146
[   11.979568] nouveau 0000:01:00.0: DRM: DCB conn 02: 00020261
[   11.979569] nouveau 0000:01:00.0: DRM: DCB conn 03: 02000331
[   11.979570] nouveau 0000:01:00.0: DRM: DCB conn 04: 00000470
[   11.982170] nouveau 0000:01:00.0: DRM: failed to create encoder 1/8/0: -19
[   11.982172] nouveau 0000:01:00.0: DRM: Virtual-1 has no encoders, removing
[   12.233674] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
[   12.429065] nouveau 0000:01:00.0: DRM: allocated 2560x1440 fb: 0xa0000, bo 000000000065bfd7
[   12.429187] fbcon: nouveaufb (fb0) is primary device
[   12.564503] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[   12.587799] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0

01:00.0 VGA compatible controller: NVIDIA Corporation GM206 [GeForce GTX 950] (rev a1) (prog-if 00 [VGA controller])
	Subsystem: ASUSTeK Computer Inc. GM206 [GeForce GTX 950]
	Flags: bus master, fast devsel, latency 0, IRQ 136
	Memory at de000000 (32-bit, non-prefetchable) [size=16M]
	Memory at c0000000 (64-bit, prefetchable) [size=256M]
	Memory at d0000000 (64-bit, prefetchable) [size=32M]
	I/O ports at e000 [size=128]
	Expansion ROM at 000c0000 [disabled] [size=128K]
	Capabilities: [60] Power Management version 3
	Capabilities: [68] MSI: Enable+ Count=1/1 Maskable- 64bit+
	Capabilities: [78] Express Legacy Endpoint, MSI 00
	Capabilities: [100] Virtual Channel
	Capabilities: [258] L1 PM Substates
	Capabilities: [128] Power Budgeting <?>
	Capabilities: [420] Advanced Error Reporting
	Capabilities: [600] Vendor Specific Information: ID=0001 Rev=1 Len=024 <?>
	Capabilities: [900] #19
	Kernel driver in use: nouveau
	Kernel modules: nouveau


xf86-video-nouveau-1.0.15-r1
libdrm-2.4.96
kernel-4.19.6
Comment 1 Ilia Mirkin 2018-12-08 17:20:31 UTC
Interesting. The only write I see to that register is in the GM107 fan control logic. This is not used for GM200+, as controlling the fan can't be done anymore, except via secure firmware means.

So ... my *guess* is that this is a write failure that happens as a result of the firmware that gets loaded, which is not supplied by nouveau. (Always preferable to blame someone else...)

Can you get a dmesg with nouveau.debug=trace? That should place the error more exactly in the initialization flow.
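
For anyone following along, here is a minimal sketch of one way to confirm the option is active before capturing the log. It assumes the option was passed on the kernel command line in exactly the form nouveau.debug=trace; other forms, or setting it through module options, are not covered. The helper name has_nouveau_trace is made up for illustration.

```shell
# Hypothetical helper: succeed if the given kernel command line string
# contains nouveau.debug=trace as a separate, space-delimited word.
has_nouveau_trace() {
    case " $1 " in
        *" nouveau.debug=trace "*) return 0 ;;
        *) return 1 ;;
    esac
}
```

On a live system this could be used as `has_nouveau_trace "$(cat /proc/cmdline)" && dmesg | grep nouveau > nouveau-trace.log`, where the output file name is only an example.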
Comment 2 Johnny B. Goode 2018-12-09 05:04:02 UTC
Created attachment 142762 [details]
GM206 nouveau.debug trace

journalctl -b -1 --no-hostname -o short-monotonic | grep nouveau
Comment 3 Johnny B. Goode 2018-12-09 05:11:17 UTC
Created attachment 142763 [details]
GM206 nouveau.debug trace2

Attaching the trace again, because the first one was captured without the -b -1 parameter and so did not include nouveau closing.
journalctl -b -1 --no-hostname -o short-monotonic | grep nouveau
Comment 4 Ilia Mirkin 2018-12-19 02:24:03 UTC
(In reply to Johnny B. Goode from comment #3)
> Created attachment 142763 [details]
> GM206 nouveau.debug trace2
> 
> Once again, because the first one was without -b -1 parameter, so without
> closing nouveau.
> journalctl -b -1 --no-hostname -o short-monotonic | grep nouveau

I could be mistaken, but there are no MMIO errors in there... which means it's intermittent, perhaps based on boot state or something else. I do think it's the secure firmware doing this.
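
A rough way to check a saved log for this kind of error is sketched below. The pattern is modelled on the single fault line in the original report; matching "read" as well as "write" is an assumption, and the helper name find_mmio_faults is hypothetical.

```shell
# Hypothetical helper: print the nouveau MMIO fault lines found in a log file.
# Returns non-zero (grep's exit status) when no fault line is present.
find_mmio_faults() {
    grep -E 'nouveau .*bus: MMIO (read|write) of [0-9a-f]+ FAULT at [0-9a-f]+' "$1"
}
```

For example, save the previous boot with `journalctl -b -1 --no-hostname -o short-monotonic > boot.log` and then run `find_mmio_faults boot.log`; no output and a non-zero exit status mean that boot logged no such fault.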
Comment 5 Johnny B. Goode 2018-12-20 16:45:08 UTC
Created attachment 142870 [details]
Rom

Attaching the ROM BIOS because Asus made several GTX 950 cards with different BIOSes: hxxps://www.techpowerup.com/vgabios/?architecture=NVIDIA&manufacturer=Asus&model=GTX+950&interface=&memType=&memSize=&since=
Mine is 84.06.3d.00.6f
Comment 6 Ilia Mirkin 2018-12-20 16:54:13 UTC
(In reply to Johnny B. Goode from comment #5)
> Created attachment 142870 [details]
> Rom
> 
> Rom bios because Asus made few cards GTX 950 with different bioses
> hxxps://www.techpowerup.com/vgabios/?architecture=NVIDIA&manufacturer=Asus&model=GTX+950&interface=&memType=&memSize=&since=
> Mine is 84.06.3d.00.6f

I was referring to the firmware that's supplied by NVIDIA in linux-firmware, which has to be loaded to operate any of the acceleration components of the board.
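
To see which of those firmware files are installed, something like the sketch below may help. It assumes the files sit under an nvidia/<chipset> directory inside the firmware root (commonly /lib/firmware, with gm206 as the chipset directory for this card); that layout and the helper name list_nvidia_firmware are assumptions, not something confirmed in this report.

```shell
# Hypothetical helper: list firmware files for one chipset directory.
# $1: firmware root (assumed /lib/firmware), $2: chipset directory (assumed gm206).
list_nvidia_firmware() {
    if [ -d "$1/nvidia/$2" ]; then
        # Include symlinks, since some firmware entries may be links.
        find "$1/nvidia/$2" \( -type f -o -type l \) | sort
    else
        echo "no firmware directory at $1/nvidia/$2" >&2
        return 1
    fi
}
```

Usage would be `list_nvidia_firmware /lib/firmware gm206`; an error here only means the assumed directory is absent, not necessarily that the firmware is missing.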
Comment 7 Martin Peres 2019-12-04 09:47:35 UTC
-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.

You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/472.
