| Summary: | GM108/NV118: 0 MiB DDR3 and boot crash in gf100_ltc_oneinit_tag_ram | | |
|---|---|---|---|
| Product: | xorg | Reporter: | Daniel Drake <dan> |
| Component: | Driver/nouveau | Assignee: | Nouveau Project <nouveau> |
| Status: | RESOLVED MOVED | QA Contact: | Xorg Project Team <xorg-team> |
| Severity: | normal | | |
| Priority: | medium | | |
| Version: | unspecified | | |
| Hardware: | Other | | |
| OS: | All | | |

(In reply to Daniel Drake from comment #0)
> Created attachment 131742 [details]
> full dmesg log
>
> On the Acer Z20-730 laptop, the nouveau driver crashes during boot with:
>
> [ 4.041108] nouveau 0000:01:00.0: pci: failed to adjust cap speed
> [ 4.041167] nouveau 0000:01:00.0: pci: failed to adjust lnkctl speed
> [ 9.633613] nouveau 0000:01:00.0: fb: 0 MiB DDR3
> [ 20.811768] divide error: 0000 [#1] SMP
> [ 20.813654] Modules linked in: hid_generic usbmouse usbkbd usbhid i915 nouveau(+) mxm_wmi i2c_algo_bit drm_kms_helper sdhci_pci syscopyarea sysfillrect sysimgblt fb_sys_fops ttm sdhci drm ahci libahci wmi i2c_hid hid video
> [ 20.815697] CPU: 3 PID: 200 Comm: systemd-udevd Not tainted 4.11.0-2-generic #7+dev143.9f9ecd2beos3.2.2-Endless
> [ 20.817684] Hardware name: Acer Aspire Z20-730/IPMAL-BR3, BIOS D01 07/07/2016
> [ 20.819711] task: ffff8a3070288000 task.stack: ffffa3eb4103c000
> [ 20.821762] RIP: 0010:gf100_ltc_oneinit_tag_ram+0xba/0x100 [nouveau]
> [ 20.823789] RSP: 0018:ffffa3eb4103f6b8 EFLAGS: 00010206
> [ 20.825773] RAX: 00001000ffefdfff RBX: ffff8a306f915000 RCX: ffff8a3075570030
> [ 20.827820] RDX: 0000000000000000 RSI: dead000000000200 RDI: ffff8a307fd9b700
> [ 20.829797] RBP: ffffa3eb4103f6d0 R08: 000000000001e980 R09: ffff8a3077003900
> [ 20.831825] R10: ffffa3eb40cdbda0 R11: ffff8a307fd986a4 R12: 0000000000000000
> [ 20.833814] R13: 0000000100005fff R14: ffff8a306fa2e400 R15: ffff8a306f914e00
> [ 20.835882] FS: 00007f456d052900(0000) GS:ffff8a307fd80000(0000) knlGS:0000000000000000
> [ 20.837874] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [ 20.839935] CR2: 00007fefceb1c020 CR3: 000000026fbc5000 CR4: 00000000003406e0
> [ 20.841918] Call Trace:
> [ 20.843972] gm107_ltc_oneinit+0x7c/0x90 [nouveau]
> [ 20.845952] nvkm_ltc_oneinit+0x13/0x20 [nouveau]
> [ 20.847991] nvkm_subdev_init+0x50/0x210 [nouveau]
> [ 20.849977] nvkm_device_init+0x151/0x270 [nouveau]
> [ 20.851997] nvkm_udevice_init+0x48/0x60 [nouveau]
> [ 20.853944] nvkm_object_init+0x40/0x190 [nouveau]
> [ 20.855924] nvkm_ioctl_new+0x179/0x290 [nouveau]
> [ 20.857838] ? nvkm_client_notify+0x30/0x30 [nouveau]
> [ 20.859794] ? nvkm_udevice_rd08+0x30/0x30 [nouveau]
> [ 20.861674] nvkm_ioctl+0x168/0x240 [nouveau]
> [ 20.863576] ? nvif_client_init+0x42/0x110 [nouveau]
> [ 20.865449] nvkm_client_ioctl+0x12/0x20 [nouveau]
> [ 20.867368] nvif_object_ioctl+0x42/0x50 [nouveau]
> [ 20.869237] nvif_object_init+0xc2/0x130 [nouveau]
> [ 20.871141] nvif_device_init+0x12/0x30 [nouveau]
> [ 20.872994] nouveau_cli_init+0x15e/0x1d0 [nouveau]
> [ 20.874873] nouveau_drm_load+0x67/0x8b0 [nouveau]
> [ 20.876674] ? sysfs_do_create_link_sd.isra.2+0x70/0xb0
> [ 20.878451] drm_dev_register+0x148/0x1e0 [drm]
> [ 20.880302] drm_get_pci_dev+0xa0/0x160 [drm]
> [ 20.882166] nouveau_drm_probe+0x1d9/0x260 [nouveau]
>
> This has been reproduced on 4.12-rc3. Please let us know how we can help
> with further debugging.

Does this error appear on older kernels? Can you get a dmesg booted with "nouveau.debug=debug"? Thanks.

Created attachment 131772 [details]
dmesg with nouveau.debug=debug

Here is the debug log. By the way, this is exactly the same report as was posted on the nouveau list "Kernel panic on nouveau during boot on NVIDIA NV118 (GM108)" - I just duplicated it here after no response on the list (plus I think here is the more appropriate place?)
Thanks for your help so far!
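
For anyone triaging a trace like the one quoted above, it can help to pull out the faulting symbol and the call chain mechanically. The following is only a minimal sketch, assuming the log is plain dmesg text in the format shown in this report; `parse_oops`, `RIP_RE`, `FRAME_RE` and `SAMPLE` are hypothetical names, not part of nouveau or any kernel tool. Frames with a leading `?` are skipped, since the kernel marks those stack entries as unreliable.

```python
import re

# Faulting instruction pointer, e.g.
# "RIP: 0010:gf100_ltc_oneinit_tag_ram+0xba/0x100 [nouveau]"
RIP_RE = re.compile(r"RIP: [0-9a-f]{4}:(\w[\w.]*)\+0x[0-9a-f]+/0x[0-9a-f]+")
# Call-trace frame after the timestamp: optional "? " marker, then
# symbol+offset/size.
FRAME_RE = re.compile(r"\]\s+(\? )?([\w.]+)\+0x[0-9a-f]+/0x[0-9a-f]+")


def parse_oops(text):
    """Return (faulting_symbol, frames) extracted from dmesg text.

    frames lists only the entries without a leading "?", in the order
    they appear under "Call Trace:".
    """
    rip = RIP_RE.search(text)
    fault = rip.group(1) if rip else None

    frames = []
    in_trace = False
    for line in text.splitlines():
        if "Call Trace:" in line:
            in_trace = True
            continue
        if not in_trace:
            continue
        match = FRAME_RE.search(line)
        if not match:
            break  # first non-frame line ends the trace
        if match.group(1) is None:
            frames.append(match.group(2))
    return fault, frames


# A few lines copied from the report above, used as sample input.
SAMPLE = """\
[ 20.811768] divide error: 0000 [#1] SMP
[ 20.821762] RIP: 0010:gf100_ltc_oneinit_tag_ram+0xba/0x100 [nouveau]
[ 20.841918] Call Trace:
[ 20.843972] gm107_ltc_oneinit+0x7c/0x90 [nouveau]
[ 20.845952] nvkm_ltc_oneinit+0x13/0x20 [nouveau]
[ 20.857838] ? nvkm_client_notify+0x30/0x30 [nouveau]
[ 20.861674] nvkm_ioctl+0x168/0x240 [nouveau]
"""

fault, frames = parse_oops(SAMPLE)
print("faulting symbol:", fault)
print("call chain:", " <- ".join(frames))
```

Running this against the full attachment instead of `SAMPLE` would need only `parse_oops(open(path).read())`, where `path` is wherever the dmesg log was saved.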
And yes, this problem has been reproduced on v4.8, v4.11 and v4.12-rc3. We don't know of any working kernels.

Can you also upload your vbios.rom file, located at /sys/kernel/debug/dri/0/vbios.rom? And if you are up for it, install envytools and run nvapeek 101000 as root? The second is optional, but may help us even more.

vbios.rom is empty. We will try to get envytools running now.

The nvapeek 101000 output is "PCI init failure".

Created attachment 131939 [details] [review]
test disabling pci link config twiddling

Can you give this patch a try? It looks like things are working normally up until we try to fiddle with the PCIE link configuration, and I'd like to rule this in/out as a culprit.

Sorry for the slow response. We tested the patch against 4.13-rc5 and the issue is still there.

Problem still present on 4.14-rc4.

I don't suppose you'd be able to grab a mmiotrace[1] of the proprietary driver for me? One of Nouveau might be useful also.

[1] https://nouveau.freedesktop.org/wiki/MmioTrace/

Sent a partial dump from the nvidia proprietary driver to the mmio.dumps address: I loaded the module and then started an empty X session. Unfortunately, with tracing enabled, this results in an instant hard hang upon starting X, before it has rendered the black screen. However, I managed to capture these messages over the network up until the point of the hang. Captured on Linux 4.13. No external displays connected. The all-in-one PC has one internal LCD display.

This patch should (hopefully) help with the issues faced while tracing the binary driver. It applies to their kernel shim layer source code, which can be found if you extract the installer package with --extract-only (and run ./nvidia-installer inside the extracted directory to build the patched kernel module).

https://paste.fedoraproject.org/paste/LkiC1cJdfPGOLc~NlKWkcA

That worked. Sent an updated dump.

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/352.
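
The vbios.rom check requested earlier in the thread can be scripted when collecting the same information from other machines. This is a minimal sketch, not a nouveau tool: `vbios_status` and `VBIOS_PATH` are hypothetical names, the path is the one quoted in the thread (the card index `0` may differ elsewhere), and debugfs normally has to be read as root. It reads the file instead of trusting the size reported by `stat`, because virtual filesystems may report a size of 0 regardless of content.

```python
# Path quoted in the thread; the "0" card index is an assumption that
# holds for the reporter's machine and may differ on other systems.
VBIOS_PATH = "/sys/kernel/debug/dri/0/vbios.rom"


def vbios_status(path=VBIOS_PATH):
    """Describe the VBIOS image at path: 'missing', 'empty' or '<n> bytes'."""
    try:
        # Read the content rather than calling os.stat(): virtual
        # filesystems may report st_size == 0 even for non-empty files.
        with open(path, "rb") as f:
            data = f.read()
    except FileNotFoundError:
        return "missing"
    if not data:
        return "empty"
    return f"{len(data)} bytes"


print("vbios.rom:", vbios_status())
```

On the machine in this report the expected result would be `empty`, matching what the reporter observed.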