Summary: | Regularly System Crash: (ca. 1 hour) nouveau 0000:08:00.0: gr: PGRAPH TLB flush idle timeout fail and nouveau 0000:08:00.0: mmu: ce0 mmu invalidate timeout | ||
---|---|---|---|
Product: | Mesa | Reporter: | Linux Freak <Linuxfreak> |
Component: | Drivers/DRI/nouveau | Assignee: | Nouveau Project <nouveau> |
Status: | RESOLVED MOVED | QA Contact: | Nouveau Project <nouveau> |
Severity: | critical | ||
Priority: | high | CC: | Linuxfreak |
Version: | 19.0 | ||
Hardware: | x86-64 (AMD64) | ||
OS: | Linux (All) | ||
See Also: | https://bugs.freedesktop.org/show_bug.cgi?id=105940 | ||
Attachments:
a huge: journalctl -p 4 -b -1 of the last crash
Crash from 14.05.2019
crash 6
crash 6 While it crashed, the music of VLC was playing until the end of the song.
crash 7, VLC Media Player running continuously playing music
crash nouveau 0000:08:00.0: timeout
crash 9
crash 10
crash 12: a huge journalctl -p 4 -b -1 of the last crash
Description
Linux Freak
2019-05-01 07:52:23 UTC
Enabling warnings as well gives a lot more useful information:
journalctl -p 4 -b -0
See: https://forum.manjaro.org/t/system-crash-nouveau-000000-0-gr-pgraph-tlb-flush-idle-timeout-fail-and-nouveau-000000-0-mmu-ce0-mmu-invalidate-timeout/85154
From https://nouveau.freedesktop.org/wiki/TroubleShooting/
Diagnosing hang: 12
Completely dead: display, keyboard and other input devices, network, serial port, IEEE1394. Have to press reset button.
The Basic Questions
1) I do not have drivers that break nouveau.
2) No Nvidia proprietary driver is installed.
Created attachment 144236 [details]
a huge: journalctl -p 4 -b -1 of the last crash
A lot of hang up messages...
e.g.
Call Trace:
Mai 11 19:58:16 kernel: ? nvkm_ioctl_new+0x1a0/0x200 [nouveau]
Call Trace:
Mai 11 19:58:18 kernel: nvkm_vmm_ptes_get_map+0x246/0x3f0 [nouveau]
Call Trace:
Mai 11 19:58:22 kernel: nv50_vmm_flush+0x1f2/0x220 [nouveau]
Call Trace:
Mai 11 19:58:24 kernel: nvkm_vmm_ptes_get_map+0x246/0x3f0 [nouveau]
Call Trace:
Mai 11 19:58:28 kernel: nv50_vmm_flush+0x1f2/0x220 [nouveau]
INFO: task kworker/u8:2:1002 blocked for more than 122 seconds.
Mai 11 20:01:38 kernel: Tainted: G W 5.1.0-1-MANJARO #1
Mai 11 20:01:38 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 11 20:01:38 kernel: Call Trace:
Mai 11 20:01:38 kernel: ? __schedule+0x30b/0x8b0
Mai 11 20:01:38 kernel: ? g84_fifo_uevent_init+0x1a/0x40 [nouveau]
Mai 11 20:01:38 kernel: schedule+0x32/0x80
Mai 11 20:01:38 kernel: schedule_timeout+0x311/0x4a0
Mai 11 20:01:38 kernel: ? nouveau_fence_is_signaled+0x39/0x40 [nouveau]
Mai 11 20:01:38 kernel: dma_fence_default_wait+0x204/0x280
Mai 11 20:01:38 kernel: ? dma_fence_wait_timeout+0x120/0x120
Mai 11 20:01:38 kernel: dma_fence_wait_timeout+0x105/0x120
Mai 11 20:01:38 kernel: drm_atomic_helper_wait_for_fences+0x38/0xc0 [drm_kms_helper]
Mai 11 20:01:38 kernel: nv50_disp_atomic_commit_tail+0x72/0x690 [nouveau]
Mai 11 20:01:38 kernel: ? finish_task_switch+0x84/0x2d0
Mai 11 20:01:38 kernel: ? __switch_to_asm+0x40/0x70
Mai 11 20:01:38 kernel: process_one_work+0x1eb/0x410
Mai 11 20:01:38 kernel: worker_thread+0x2d/0x3d0
Mai 11 20:01:38 kernel: ? process_one_work+0x410/0x410
Mai 11 20:01:38 kernel: kthread+0x112/0x130
Mai 11 20:01:38 kernel: ? kthread_park+0x80/0x80
Mai 11 20:01:38 kernel: ret_from_fork+0x35/0x40
The basic Questions
3) Did you compile fbcon into the kernel, or compile it as a module and load it?
First make sure you have CONFIG_FRAMEBUFFER_CONSOLE enabled in your kernel configuration. If it is a module (it is called fbcon.ko), make sure it is loaded. Otherwise activating KMS will make your console screen unusable, but your system should still work otherwise, including X.
-> Maybe not; I use the kernel as I got it. So:
sudo modprobe fbcon
modprobe: FATAL: Module fbcon not found in directory /lib/modules/5.1.0-1-MANJARO
So I added fbcon to /etc/modules-load.d/modules.conf and rebooted. Hmm, do I need to install fbcon?
journalctl -p 3 -b -0
Mai 11 20:38:44 systemd-modules-load[273]: Failed to find module 'fbcon'
Mai 11 20:38:49 systemd-modules-load[358]: Failed to find module 'fbcon'
Mai 11 20:38:49 systemd[1]: Failed to start Load Kernel Modules.
Mai 11 20:38:49 systemd-modules-load[364]: Failed to find module 'fbcon'
Mai 11 20:38:49 systemd[1]: Failed to start Load Kernel Modules.
Mai 11 20:38:50 kernel: nouveau 0000:08:00.0: bios: OOB 1 d7500086 d7500086
Mai 11 20:39:16 colord-sane[464]: io/hpmud/pp.c 627: unable to read device-i
The basic Questions
4) I do use nouveau. The Xorg log has a lot of entries for it, but none for a working nv driver:
cat /var/log/Xorg.0.log | grep nv
[ 30.677] (==) Matched nv as autoconfigured driver 1
[ 31.051] (II) LoadModule: "nv"
[ 31.052] (WW) Warning, couldn't open module nv
[ 31.052] (EE) Failed to load module "nv" (module does not exist, 0)
The basic Questions
6) The version I use:
pacman -Si xf86-video-nouveau
Repository : extra
Name : xf86-video-nouveau
Version : 1.0.16-1
Description : Open Source 3D acceleration driver for nVidia cards
Architecture : x86_64
URL : https://nouveau.freedesktop.org/
Licenses : GPL
Groups : xorg-drivers
Provides : None
Depends On : libsystemd mesa
Optional Deps : None
Conflicts With : xorg-server<1.20 X-ABI-VIDEODRV_VERSION<24 X-ABI-VIDEODRV_VERSION>=25
Replaces : None
Download Size : 84.94 KiB
Installed Size : 259.00 KiB
Packager : Andreas Radke <andyrtrNOSPAMarchlinux.org>
Build Date : Tue Jan 29 17:13:41 2019
Validated By : MD5 Sum SHA-256 Sum Signature
QUESTION: fbcon
Is nouveaufb the fbcon module?
sudo dmesg | grep nouveaufb
[ 20.165256] fb0: switching to nouveaufb from VESA VGA
[ 20.468352] fbcon: nouveaufb (fb0) is primary device
[ 20.510301] nouveau 0000:08:00.0: fb0: nouveaufb frame buffer device
The error in journalctl -p 3 -b -0
kernel: nouveau 0000:08:00.0: bios: OOB 1 d7500086 d7500086
has the same address as nouveaufb...!
*** Maybe the nouveau error is due to the missing fbcon? ***
MANJARO has the following config in linux50:
CONFIG_FRAMEBUFFER_CONSOLE=y
CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y
CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y
CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER=y
In older kernels a different config is available, but all of them certainly have CONFIG_FRAMEBUFFER_CONSOLE enabled.
another crash...
journalctl -p 3 -b -1
Mai 14 15:48:48 kernel: nouveau 0000:08:00.0: bios: OOB 1 d7500086 d7500086
Mai 14 15:49:16 colord-sane[456]: io/hpmud/pp.c 627: unable to read device-id ret=-1
Mai 14 15:56:10 kernel: nouveau 0000:08:00.0: mmu: ce0 mmu invalidate timeout
Mai 14 15:56:10 kernel: nouveau 0000:08:00.0: gr: PGRAPH TLB flush idle timeout fail
Mai 14 15:56:10 kernel: nouveau 0000:08:00.0: gr: PGRAPH_STATUS ffffffff [BUSY DISPATCH UNK2 UNK3 UNK4 UNK5 M2MF UNK7 CTXPROG VFETCH CCACHE_PREGEOM STRMOUT_VATTR_POSTGEOM VCLIP RATTR>
Mai 14 15:56:10 kernel: nouveau 0000:08:00.0: gr: PGRAPH_VSTATUS0: ffffffff []
Mai 14 15:56:10 kernel: nouveau 0000:08:00.0: gr: PGRAPH_VSTATUS1: 0000106d [TPC_TEX TPC_MP]
Mai 14 15:56:10 kernel: nouveau 0000:08:00.0: gr: PGRAPH_VSTATUS2: 00148000 [ENG2D]
Created attachment 144262 [details]
Crash from 14.05.2019
Call Trace:
Mai 14 15:55:47 kernel: ? _raw_spin_lock+0x13/0x30
Call Trace:
Mai 14 15:55:55 kernel: <IRQ>
Mai 14 15:55:55 kernel: nvkm_pci_intr+0x4c/0x90 [nouveau]
Call Trace:
Mai 14 15:56:00 kernel: nv50_instobj_release+0x27/0x90 [nouveau]
Call Trace:
Mai 14 15:56:10 kernel: nvkm_vmm_ptes_get_map+0x246/0x3f0 [nouveau]
Call Trace:
Mai 14 15:56:10 kernel: nv50_vmm_flush+0x1f2/0x220 [nouveau]
Created attachment 144263 [details]
crash 6
While it crashed, the music of VLC was playing until the end of the song.
Created attachment 144264 [details]
crash 6 While it crashed, the music of VLC was playing until the end of the song.
Created attachment 144279 [details]
crash 7, VLC Media Player running continuously playing music
crash 7: VLC Media Player was set to play music continuously. After the crash it kept playing until I pressed the hardware reset button.
Created attachment 144284 [details]
crash nouveau 0000:08:00.0: timeout
Created attachment 144299 [details]
crash 9
nouveau 0000:08:00.0: timeout
RIP: 0010:g84_gr_tlb_flush+0x2ec/0x300 [nouveau]
Call Trace:
Mai 17 08:07:29 kernel: ? _raw_spin_lock+0x13/0x30
Mai 17 08:07:29 kernel: ? gv100_fb_new+0x20/0x20 [nouveau]
Created attachment 144339 [details]
crash 10
Mai 19 15:13:49 kernel: nouveau 0000:08:00.0: mmu: ce0 mmu invalidate timeout
Mai 19 15:13:53 kernel: nouveau 0000:08:00.0: gr: PGRAPH TLB flush idle timeout fail
Mai 19 15:13:53 kernel: nouveau 0000:08:00.0: gr: PGRAPH_STATUS 00b00103 [BUSY DISPATCH CTXPROG TPC_PROP TPC_TEX TPC_MP]
Mai 19 15:13:53 kernel: nouveau 0000:08:00.0: gr: PGRAPH_VSTATUS0: 00000000 []
Mai 19 15:13:53 kernel: nouveau 0000:08:00.0: gr: PGRAPH_VSTATUS1: 00005068 [TPC_TEX]
Mai 19 15:13:53 kernel: nouveau 0000:08:00.0: gr: PGRAPH_VSTATUS2: 00000000 []
Mai 19 15:16:18 kernel: INFO: task kworker/u8:7:219 blocked for more than 120 seconds.
Mai 19 15:16:18 kernel: Tainted: G W 5.0.15-1-MANJARO #1
Mai 19 15:16:18 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 19 15:18:21 kernel: INFO: task kworker/u8:7:219 blocked for more than 120 seconds.
Mai 19 15:18:21 kernel: Tainted: G W 5.0.15-1-MANJARO #1
Mai 19 15:18:21 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 19 15:20:24 kernel: INFO: task kworker/u8:7:219 blocked for more than 120 seconds.
Mai 19 15:20:24 kernel: Tainted: G W 5.0.15-1-MANJARO #1
Mai 19 15:20:24 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 19 15:22:26 kernel: INFO: task kworker/u8:7:219 blocked for more than 120 seconds.
Mai 19 15:22:27 kernel: Tainted: G W 5.0.15-1-MANJARO #1
Mai 19 15:22:27 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 19 15:24:29 kernel: INFO: task kworker/u8:7:219 blocked for more than 120 seconds.
Mai 19 15:24:29 kernel: Tainted: G W 5.0.15-1-MANJARO #1
Mai 19 15:24:29 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 19 15:26:32 kernel: INFO: task kworker/u8:7:219 blocked for more than 120 seconds.
Mai 19 15:26:32 kernel: Tainted: G W 5.0.15-1-MANJARO #1
Mai 19 15:26:32 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 19 15:28:35 kernel: INFO: task kworker/u8:7:219 blocked for more than 120 seconds.
Mai 19 15:28:35 kernel: Tainted: G W 5.0.15-1-MANJARO #1
Mai 19 15:28:35 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 19 15:30:38 kernel: INFO: task kworker/u8:7:219 blocked for more than 120 seconds.
Mai 19 15:30:38 kernel: Tainted: G W 5.0.15-1-MANJARO #1
Mai 19 15:30:38 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 19 15:32:41 kernel: INFO: task kworker/u8:7:219 blocked for more than 120 seconds.
Mai 19 15:32:41 kernel: Tainted: G W 5.0.15-1-MANJARO #1
Mai 19 15:32:41 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 19 15:34:44 kernel: INFO: task kworker/u8:7:219 blocked for more than 120 seconds.
Mai 19 15:34:44 kernel: Tainted: G W 5.0.15-1-MANJARO #1
Mai 19 15:34:44 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
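The repeated hung-task warnings above are easier to compare across crashes when they are counted per task. The following is a minimal sketch, assuming the journal output has been saved as plain text lines; the function name `count_hung_tasks` and the sample list are illustrative and not part of any tool mentioned in this report.

```python
import re
from collections import Counter

# Matches the kernel hung-task warning in the form it takes in the excerpts above.
# The task name itself may contain colons (e.g. "kworker/u8:7"), so the PID is
# taken as the last colon-separated number before " blocked".
HUNG_TASK = re.compile(
    r"INFO: task (?P<task>\S+):(?P<pid>\d+) blocked for more than (?P<secs>\d+) seconds"
)


def count_hung_tasks(lines):
    """Return a Counter mapping (task name, pid) to the number of warnings seen."""
    counts = Counter()
    for line in lines:
        match = HUNG_TASK.search(line)
        if match:
            counts[(match.group("task"), int(match.group("pid")))] += 1
    return counts


# Three lines copied from the crash 10 excerpt; only two are hung-task warnings.
sample = [
    "Mai 19 15:16:18 kernel: INFO: task kworker/u8:7:219 blocked for more than 120 seconds.",
    "Mai 19 15:16:18 kernel: Tainted: G W 5.0.15-1-MANJARO #1",
    "Mai 19 15:18:21 kernel: INFO: task kworker/u8:7:219 blocked for more than 120 seconds.",
]

for (task, pid), count in count_hung_tasks(sample).items():
    print(f"{task} (pid {pid}): {count} hung-task warnings")
```

In the crash 10 excerpt every warning names the same worker, kworker/u8:7 with PID 219, which fits a single display work item waiting indefinitely rather than many independent stalls.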
Created attachment 144455 [details]
crash 12: a huge journalctl -p 4 -b -1 of the last crash
Mai 29 11:43:58 kernel: nouveau 0000:08:00.0: gr: PGRAPH TLB flush idle timeout fail
Mai 29 11:43:59 kernel: nouveau 0000:08:00.0: gr: PGRAPH_STATUS 01bfe101 [BUSY CTXPROG RATTR_APLANE TRAST CLIPID ZCULL ENG2D RMASK TPC_RAST TPC_PROP TPC_TEX TPC_MP ROP]
Mai 29 11:43:59 kernel: nouveau 0000:08:00.0: gr: PGRAPH_VSTATUS0: 00000000 []
Mai 29 11:43:59 kernel: nouveau 0000:08:00.0: gr: PGRAPH_VSTATUS1: 0000106d [TPC_TEX TPC_MP]
Mai 29 11:43:59 kernel: nouveau 0000:08:00.0: gr: PGRAPH_VSTATUS2: 0034da43 [TRAST ENG2D ROP]
...
INFO: task kworker/u8:2:1186 blocked for more than 505 seconds.
Mai 29 11:53:24 kernel: Tainted: G W 5.1.4-1-MANJARO #1
Mai 29 11:53:24 kernel: "echo 0 > /proc/sys/kernel/hung_task_timeout_secs" disables this message.
Mai 29 11:53:24 kernel: Call Trace:
Mai 29 11:53:24 kernel: ? __schedule+0x30b/0x8b0
Mai 29 11:53:24 kernel: schedule+0x32/0x80
Mai 29 11:53:24 kernel: schedule_timeout+0x311/0x4a0
Mai 29 11:53:24 kernel: ? nouveau_fence_is_signaled+0x39/0x40 [nouveau]
Mai 29 11:53:24 kernel: dma_fence_default_wait+0x204/0x280
Mai 29 11:53:24 kernel: ? dma_fence_wait_timeout+0x120/0x120
Mai 29 11:53:24 kernel: dma_fence_wait_timeout+0x105/0x120
Mai 29 11:53:24 kernel: drm_atomic_helper_wait_for_fences+0x38/0xc0 [drm_kms_helper]
Mai 29 11:53:24 kernel: nv50_disp_atomic_commit_tail+0x72/0x690 [nouveau]
Mai 29 11:53:24 kernel: ? finish_task_switch+0x84/0x2d0
Mai 29 11:53:24 kernel: ? __switch_to_asm+0x35/0x70
Mai 29 11:53:24 kernel: process_one_work+0x1eb/0x410
Mai 29 11:53:24 kernel: worker_thread+0x2d/0x3d0
Mai 29 11:53:24 kernel: ? process_one_work+0x410/0x410
Mai 29 11:53:24 kernel: kthread+0x112/0x130
Mai 29 11:53:24 kernel: ? kthread_park+0x80/0x80
Mai 29 11:53:24 kernel: ret_from_fork+0x35/0x40
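The PGRAPH_STATUS values in the logs above are bit fields, and the driver prints the names of the set bits in brackets. The sketch below decodes such a value. The bit table is an assumption: it is inferred only from the three PGRAPH_STATUS lines in this report (ffffffff, 00b00103 and 01bfe101), taking the names to be printed in ascending bit order, and it is not the driver's complete table. Bits the report does not reveal are shown as placeholders.

```python
# Bit positions inferred from the PGRAPH_STATUS lines quoted in this report.
# ASSUMPTION: names are printed in ascending bit order; bit 22 and bits 25-31
# cannot be recovered from the report and are left out.
STATUS_BITS = {
    0: "BUSY", 1: "DISPATCH", 2: "UNK2", 3: "UNK3", 4: "UNK4", 5: "UNK5",
    6: "M2MF", 7: "UNK7", 8: "CTXPROG", 9: "VFETCH", 10: "CCACHE_PREGEOM",
    11: "STRMOUT_VATTR_POSTGEOM", 12: "VCLIP", 13: "RATTR_APLANE",
    14: "TRAST", 15: "CLIPID", 16: "ZCULL", 17: "ENG2D", 18: "RMASK",
    19: "TPC_RAST", 20: "TPC_PROP", 21: "TPC_TEX", 23: "TPC_MP", 24: "ROP",
}


def decode_status(value):
    """Return the names of all set bits, lowest bit first."""
    names = []
    for bit in range(32):
        if value & (1 << bit):
            # Unknown bits get a visible placeholder instead of a guessed name.
            names.append(STATUS_BITS.get(bit, f"bit{bit}?"))
    return names


print(" ".join(decode_status(0x00B00103)))
# prints: BUSY DISPATCH CTXPROG TPC_PROP TPC_TEX TPC_MP
```

Decoding 00b00103 and 01bfe101 this way reproduces the bracketed lists from crash 10 and crash 12, which is the only check available for the inferred table.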
Hi,
how can I find out whether I have the correct fbcon module in the kernel?
Is this the correct fbcon?
[ 20.468352] fbcon: nouveaufb (fb0) is primary device
[ 20.510301] nouveau 0000:08:00.0: fb0: nouveaufb frame buffer device
[ 20.527782] [drm] Initialized nouveau 1.3.1 20120801 for 0000:08:00.0 on minor 0
See:
https://forum.manjaro.org/t/system-crash-nouveau-000000-0-gr-pgraph-tlb-flush-idle-timeout-fail-and-nouveau-000000-0-mmu-ce0-mmu-invalidate-timeout/85154/23
+
https://forum.manjaro.org/t/system-crash-nouveau-000000-0-gr-pgraph-tlb-flush-idle-timeout-fail-and-nouveau-000000-0-mmu-ce0-mmu-invalidate-timeout/85154/24
LF
journalctl -p 7 -b -1 | grep fbcon
Jun 05 09:59:17 kernel: fbcon: Deferring console take-over
Jun 05 09:59:17 kernel: fbcon: Taking over console
Jun 05 09:59:25 kernel: fbcon: nouveaufb (fb0) is primary device
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/mesa/mesa/issues/1176.
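On the fbcon question raised in this report: whether the framebuffer console is compiled in can be read from the kernel configuration. The sketch below is a minimal example, assuming the running kernel exposes its configuration at /proc/config.gz (this file exists only when the kernel was built with CONFIG_IKCONFIG_PROC). The helper names `config_value` and `read_running_config` are illustrative.

```python
import gzip
from pathlib import Path


def config_value(config_text, option):
    """Return the value of a kernel config option ('y', 'm', ...) or None if unset."""
    prefix = option + "="
    for line in config_text.splitlines():
        line = line.strip()
        # Unset options appear as comments ("# CONFIG_X is not set") and never match.
        if line.startswith(prefix):
            return line[len(prefix):]
    return None


def read_running_config(path="/proc/config.gz"):
    """Return the running kernel's config text, or None if it is not exposed."""
    config_path = Path(path)
    if not config_path.exists():
        return None
    with gzip.open(config_path, "rt") as handle:
        return handle.read()


# Sample taken from the linux50 options quoted in this report.
sample = (
    "CONFIG_FRAMEBUFFER_CONSOLE=y\n"
    "CONFIG_FRAMEBUFFER_CONSOLE_DETECT_PRIMARY=y\n"
    "CONFIG_FRAMEBUFFER_CONSOLE_ROTATION=y\n"
    "CONFIG_FRAMEBUFFER_CONSOLE_DEFERRED_TAKEOVER=y\n"
)
print(config_value(sample, "CONFIG_FRAMEBUFFER_CONSOLE"))  # prints: y
```

A value of y means the console code is compiled into the kernel rather than shipped as a loadable module, and the "fbcon: nouveaufb (fb0) is primary device" lines quoted above suggest it is already active on this system.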