Bug 108857 - display becomes unresponsive and keyboard input fails
Status: NEW
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/nouveau
Version: unspecified
Hardware: x86-64 (AMD64) Linux (All)
Importance: medium major
Assignee: Nouveau Project
QA Contact: Xorg Project Team
Reported: 2018-11-25 12:20 UTC by tla2k20
Modified: 2019-04-02 18:16 UTC

Attachments
dmesg (71.79 KB, text/plain)
2018-11-25 12:20 UTC, tla2k20
lspci -vvv (38.30 KB, text/plain)
2018-11-25 12:21 UTC, tla2k20
dmesg from 4.19.4-300.fc29.x86_64 (75.24 KB, text/plain)
2018-11-28 16:37 UTC, tla2k20

Description tla2k20 2018-11-25 12:20:11 UTC
Created attachment 142607 [details]
dmesg

Fedora release 29 (Twenty Nine)
Linux s0.home 4.19.3-300.fc29.x86_64 #1 SMP Wed Nov 21 15:27:25 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

The system boots and works normally for a few minutes, then the display stops responding to mouse movement and keyboard input. Access over an SSH session still works.

The issue started about three kernel updates ago (4.19.*).
 
top shows the following when the problem is in progress:

top - 12:10:25 up  1:10,  2 users,  load average: 0.23, 0.25, 0.27
Tasks: 356 total,   2 running, 354 sleeping,   0 stopped,   0 zombie
%Cpu(s):  0.8 us,  6.6 sy,  0.0 ni, 91.6 id,  0.0 wa,  1.0 hi,  0.0 si,  0.0 st
MiB Mem :  15982.9 total,   9198.9 free,   4552.3 used,   2231.7 buff/cache
MiB Swap:   8008.0 total,   8008.0 free,      0.0 used.  11035.1 avail Mem

  PID USER      PR  NI    VIRT    RES    SHR S  %CPU  %MEM     TIME+ COMMAND
 7144 root      20   0       0      0      0 R  53.8   0.0   0:02.26 kworker/u16:0+events_unbound
 2325 user1     20   0 3867708 195040 118048 S   2.0   1.2   0:28.43 gnome-shell

kworker/u16:2+events_unbound seems to eat CPU and dmesg reports:

nouveau 0000:01:00.0: DRM: base-0: timeout
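
For anyone triaging a similar log, here is a minimal sketch that counts how often this timeout message appears per device and channel in saved dmesg text. The function name `count_timeouts` and the regular expression are illustrative assumptions, not part of nouveau or any existing tool; the pattern is based only on the message format quoted above.

```python
import re
from collections import Counter

# Assumed message shape, taken from the line quoted in this report:
#   nouveau 0000:01:00.0: DRM: base-0: timeout
# An optional leading "[ 42.935037]" dmesg timestamp is tolerated
# because the pattern is searched for anywhere in the line.
TIMEOUT_RE = re.compile(
    r"nouveau (?P<dev>[0-9a-f]{4}:[0-9a-f]{2}:[0-9a-f]{2}\.[0-7]): "
    r"DRM: (?P<chan>\S+): timeout$"
)


def count_timeouts(dmesg_text):
    """Count timeout messages per (PCI address, channel name)."""
    counts = Counter()
    for line in dmesg_text.splitlines():
        match = TIMEOUT_RE.search(line.strip())
        if match:
            counts[(match.group("dev"), match.group("chan"))] += 1
    return counts


if __name__ == "__main__":
    sample = "nouveau 0000:01:00.0: DRM: base-0: timeout\n"
    for (dev, chan), n in count_timeouts(sample).items():
        print(f"{dev} {chan}: {n} timeout(s)")
```

To use it on a live system, the output of `dmesg` can be saved to a file and its contents passed to `count_timeouts`.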
Comment 1 tla2k20 2018-11-25 12:21:19 UTC
Created attachment 142608 [details]
lspci -vvv
Comment 2 Rhys Kidd 2018-11-26 21:36:37 UTC
Comparing the dmesg output, this user hit a similar timeout fault with a GP104 (their dmesg is linked): https://bugs.launchpad.net/ubuntu/+source/linux/+bug/1799180/comments/5
Comment 3 tla2k20 2018-11-28 16:35:58 UTC
Updated to the latest Fedora 29 kernel today and the problem is still present.

Linux s0.home 4.19.4-300.fc29.x86_64 #1 SMP Fri Nov 23 13:03:11 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Comment 4 tla2k20 2018-11-28 16:37:00 UTC
Created attachment 142650 [details]
dmesg from 4.19.4-300.fc29.x86_64
Comment 5 tla2k20 2018-11-29 16:43:20 UTC
Switched to the proprietary nvidia driver today and have had no issues since, so it looks like the problem is probably nouveau related. The commands I used:

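# Add the negativo17 repository that packages the nvidia driver.
sudo dnf config-manager --add-repo=https://negativo17.org/repos/fedora-nvidia.repo
# Remove any nvidia packages that are already installed.
sudo dnf -y remove \*nvidia\*
# Install the driver, the settings tool, kernel headers, and the 32-bit libraries.
sudo dnf -y install nvidia-driver nvidia-settings kernel-devel nvidia-driver-libs.i686
# Wait for the kernel driver to finish building in the background (watch it in top), then reboot.
sudo reboot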
Comment 6 Victor Costan 2018-12-01 07:04:05 UTC
I have been running into the same problem since I upgraded to 4.19, and I've been using the 4.18 kernel to get my work done. I just tried the vanilla 4.19 and 4.20 kernels packaged for Fedora, and the problem is still there.

I had suspected the Spectre mitigations until I found this bug. I have now tried the proprietary nvidia driver, and it seems to have made the problem go away.

In case it helps, I have a GTX 1080 Founders Edition.

Relevant lines from dmesg:
[    1.745599] nouveau 0000:01:00.0: NVIDIA GP104 (134000a1)
[    1.852957] nouveau 0000:01:00.0: bios: version 86.04.17.00.01
[    1.853407] nouveau 0000:01:00.0: bios: M0203E type 08
[    1.853440] nouveau 0000:01:00.0: fb: 8192 MiB of unknown memory type
[    1.893737] [TTM] Zone  kernel: Available graphics memory: 16445892 kiB
[    1.893738] [TTM] Zone   dma32: Available graphics memory: 2097152 kiB
[    1.893738] [TTM] Initializing pool allocator
[    1.893741] [TTM] Initializing DMA pool allocator
[    1.893750] nouveau 0000:01:00.0: DRM: VRAM: 8192 MiB
[    1.893751] nouveau 0000:01:00.0: DRM: GART: 536870912 MiB
[    1.893752] nouveau 0000:01:00.0: DRM: BIT table 'A' not found
[    1.893753] nouveau 0000:01:00.0: DRM: BIT table 'L' not found
[    1.893754] nouveau 0000:01:00.0: DRM: TMDS table version 2.0
[    1.893755] nouveau 0000:01:00.0: DRM: DCB version 4.1
[    1.893756] nouveau 0000:01:00.0: DRM: DCB outp 00: 01000f42 00020030
[    1.893757] nouveau 0000:01:00.0: DRM: DCB outp 01: 04811f96 04600020
[    1.893757] nouveau 0000:01:00.0: DRM: DCB outp 02: 04011f92 00020020
[    1.893758] nouveau 0000:01:00.0: DRM: DCB outp 03: 04822f86 04600010
[    1.893759] nouveau 0000:01:00.0: DRM: DCB outp 04: 04022f82 00020010
[    1.893760] nouveau 0000:01:00.0: DRM: DCB outp 06: 02033f62 00020010
[    1.893761] nouveau 0000:01:00.0: DRM: DCB outp 07: 02844f76 04600020
[    1.893762] nouveau 0000:01:00.0: DRM: DCB outp 08: 02044f72 00020020
[    1.893762] nouveau 0000:01:00.0: DRM: DCB conn 00: 00001031
[    1.893763] nouveau 0000:01:00.0: DRM: DCB conn 01: 02000146
[    1.893764] nouveau 0000:01:00.0: DRM: DCB conn 02: 01000246
[    1.893765] nouveau 0000:01:00.0: DRM: DCB conn 03: 00010361
[    1.893765] nouveau 0000:01:00.0: DRM: DCB conn 04: 00020446
[    1.998241] [drm] Supports vblank timestamp caching Rev 2 (21.10.2013).
[    1.998243] [drm] Driver supports precise vblank timestamp query.
[    2.043058] nouveau 0000:01:00.0: DRM: MM: using COPY for buffer copies
[    2.374601] nouveau 0000:01:00.0: DRM: allocated 3840x2160 fb: 0x200000, bo 00000000ae0f5db6
[    2.374656] fbcon: nouveaufb (fb0) is primary device
[    2.374657] fbcon: Deferring console take-over
[    2.374658] nouveau 0000:01:00.0: fb0: nouveaufb frame buffer device
[    2.393355] [drm] Initialized nouveau 1.3.1 20120801 for 0000:01:00.0 on minor 0
[    2.470633] nouveau 0000:01:00.0: disp: 0x000064f7[0]: INIT_GENERIC_CONDITON: unknown 0x07
[   40.934912] nouveau 0000:01:00.0: disp: chid 1 mthd 0000 data 00000000 00003000 00000000
[   40.934931] nouveau 0000:01:00.0: disp: chid 1 mthd 0004 data 08700f00 10003004 00000000
[   40.934948] nouveau 0000:01:00.0: disp: chid 1 mthd 0008 data 0000f004 10003008 00000000
[   40.934966] nouveau 0000:01:00.0: disp: chid 1 mthd 000c data 0000cf00 1000300c 00000000
[   40.934977] nouveau 0000:01:00.0: disp: chid 1 mthd 0010 data 20000000 10003010 00000000
[   40.934992] nouveau 0000:01:00.0: disp: chid 1 mthd 0014 data 00000000 10003014 00000000
[   40.935002] nouveau 0000:01:00.0: disp: chid 1 mthd 0018 data 00000000 10003018 00000000
[   40.935015] nouveau 0000:01:00.0: disp: chid 1 mthd 001c data 00000000 1000301c 00000000
[   40.935024] nouveau 0000:01:00.0: disp: chid 1 mthd 0020 data 00000000 10003020 00000000
[   40.935037] nouveau 0000:01:00.0: disp: chid 1 mthd 0000 data 00000400 10001000 00000002
[   42.935037] nouveau 0000:01:00.0: DRM: base-0: timeout
[   44.937059] nouveau 0000:01:00.0: DRM: base-0: timeout
[   46.940112] nouveau 0000:01:00.0: DRM: base-0: timeout
[   48.942221] nouveau 0000:01:00.0: DRM: base-0: timeout
[   73.444493] nouveau 0000:01:00.0: DRM: base-0: timeout
[   91.300747] nouveau 0000:01:00.0: DRM: base-0: timeout
[   93.306492] nouveau 0000:01:00.0: DRM: base-0: timeout
[   95.310609] nouveau 0000:01:00.0: DRM: base-0: timeout
[  111.646135] nouveau 0000:01:00.0: DRM: base-0: timeout
[  113.648187] nouveau 0000:01:00.0: DRM: base-0: timeout
[  115.649874] nouveau 0000:01:00.0: DRM: base-0: timeout
[  117.650629] nouveau 0000:01:00.0: DRM: base-0: timeout
[  120.902222] nouveau 0000:01:00.0: DRM: base-0: timeout
[  122.902900] nouveau 0000:01:00.0: DRM: base-0: timeout
[  124.903509] nouveau 0000:01:00.0: DRM: base-0: timeout
[  126.904112] nouveau 0000:01:00.0: DRM: base-0: timeout
[  154.470730] nouveau 0000:01:00.0: DRM: base-0: timeout
[  214.470998] nouveau 0000:01:00.0: DRM: base-0: timeout
[  274.471269] nouveau 0000:01:00.0: DRM: base-0: timeout
[  334.470469] nouveau 0000:01:00.0: DRM: base-0: timeout
[  394.470680] nouveau 0000:01:00.0: DRM: base-0: timeout
[  454.470846] nouveau 0000:01:00.0: DRM: base-0: timeout
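
The timestamps above show the timeouts arriving in bursts about 2 seconds apart and, from 154 s onward, about 60 seconds apart. A minimal sketch for extracting those gaps from a saved dmesg log follows; `timeout_gaps` and its regular expression are hypothetical helpers written for this illustration and assume the exact line format shown in this comment.

```python
import re

# Assumed line shape, copied from the log above:
#   [   42.935037] nouveau 0000:01:00.0: DRM: base-0: timeout
STAMP_RE = re.compile(r"^\[\s*(\d+\.\d+)\]\s+nouveau \S+ DRM: \S+ timeout$")


def timeout_gaps(dmesg_text):
    """Return the seconds between consecutive timeout messages."""
    stamps = []
    for line in dmesg_text.splitlines():
        match = STAMP_RE.match(line.strip())
        if match:
            stamps.append(float(match.group(1)))
    # Pair each timestamp with the next one and take the difference.
    return [round(later - earlier, 3) for earlier, later in zip(stamps, stamps[1:])]


if __name__ == "__main__":
    sample = """\
[   42.935037] nouveau 0000:01:00.0: DRM: base-0: timeout
[   44.937059] nouveau 0000:01:00.0: DRM: base-0: timeout
[   46.940112] nouveau 0000:01:00.0: DRM: base-0: timeout
"""
    print(timeout_gaps(sample))
```

The gaps only describe when the messages were logged; they do not by themselves say what the driver was waiting on.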
Comment 7 c_norman 2019-04-02 18:15:09 UTC
I'm hitting this as well on my newly installed ThinkPad T480.

[  290.639896] nouveau 0000:01:00.0: timeout
[  290.639954] WARNING: CPU: 0 PID: 500 at drivers/gpu/drm/nouveau/nvkm/subdev/pmu/base.c:86 nvkm_pmu_reset+0x148/0x160 [nouveau]
[  290.639954] Modules linked in: snd_hda_codec_hdmi arc4 mei_wdt iTCO_wdt iTCO_vendor_support snd_soc_skl snd_soc_hdac_hda iwlmvm snd_hda_ext_core snd_soc_skl_ipc intel_cstate(+) snd_soc_sst_ipc intel_uncore snd_soc_sst_dsp snd_soc_acpi_intel_match snd_hda_codec_realtek snd_soc_acpi snd_hda_codec_generic mac80211 snd_soc_core snd_compress ac97_bus snd_pcm_dmaengine snd_hda_intel intel_rapl_perf snd_hda_codec btusb uvcvideo snd_hda_core btrtl snd_hwdep btbcm snd_seq btintel videobuf2_vmalloc snd_seq_device bluetooth videobuf2_memops iwlwifi snd_pcm videobuf2_v4l2 videobuf2_common intel_wmi_thunderbolt wmi_bmof videodev cfg80211 thunderbolt joydev thinkpad_acpi i2c_i801 snd_timer media mei_me ecdh_generic ledtrig_audio ucsi_acpi typec_ucsi intel_xhci_usb_role_switch roles snd processor_thermal_device mei typec intel_pch_thermal intel_soc_dts_iosf int3403_thermal soundcore rfkill int340x_thermal_zone int3400_thermal acpi_thermal_rel acpi_pad pcc_cpufreq dm_crypt i915 nouveau kvmgt mdev vfio kvm
[  290.639972]  mxm_wmi ttm nvme irqbypass i2c_algo_bit crct10dif_pclmul drm_kms_helper crc32_pclmul nvme_core crc32c_intel e1000e drm uas serio_raw ghash_clmulni_intel usb_storage hid_multitouch wmi video
[  290.639978] CPU: 0 PID: 500 Comm: plymouthd Tainted: G        W         5.0.5-200.fc29.x86_64 #1
[  290.639979] Hardware name: LENOVO 20L6S29D00/20L6S29D00, BIOS N24ET48W (1.23 ) 02/20/2019
[  290.640005] RIP: 0010:nvkm_pmu_reset+0x148/0x160 [nouveau]
[  290.640006] Code: 04 24 48 8b 40 10 48 8b 78 10 4c 8b 67 50 4d 85 e4 74 1e e8 aa 75 fb db 4c 89 e2 48 c7 c7 43 df 7b c0 48 89 c6 e8 d2 8f a7 db <0f> 0b e9 50 ff ff ff 4c 8b 67 10 eb dc 48 8b 5f 10 eb a3 e8 b0 8c
[  290.640006] RSP: 0000:ffffa54d03aa38b8 EFLAGS: 00010286
[  290.640007] RAX: 0000000000000000 RBX: ffff92b657d82c00 RCX: 0000000000000006
[  290.640008] RDX: 0000000000000007 RSI: 0000000000000086 RDI: ffff92b6614168c0
[  290.640008] RBP: ffff92b6583f7b40 R08: 000000000000046b R09: 0000000000000003
[  290.640009] R10: 0000000000000000 R11: 0000000000000001 R12: ffff92b65d2908b0
[  290.640009] R13: 0000004332f9ada6 R14: 000000432df68d37 R15: ffff92b65d3130b0
[  290.640010] FS:  00007f9b716ac300(0000) GS:ffff92b661400000(0000) knlGS:0000000000000000
[  290.640011] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  290.640011] CR2: 00007f56ec1877f8 CR3: 0000000856b9a003 CR4: 00000000003606f0
[  290.640012] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[  290.640013] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[  290.640013] Call Trace:
[  290.640041]  nvkm_pmu_init+0x16/0x40 [nouveau]
[  290.640054]  nvkm_subdev_init+0xb2/0x200 [nouveau]
[  290.640080]  nvkm_device_init+0x13b/0x280 [nouveau]
[  290.640106]  nvkm_udevice_init+0x41/0x60 [nouveau]
[  290.640119]  nvkm_object_init+0x3e/0x100 [nouveau]
[  290.640131]  nvkm_object_init+0x71/0x100 [nouveau]
[  290.640143]  nvkm_object_init+0x71/0x100 [nouveau]
[  290.640168]  nouveau_do_resume+0x28/0x150 [nouveau]
[  290.640193]  nouveau_pmops_runtime_resume+0x88/0x150 [nouveau]
[  290.640196]  pci_pm_runtime_resume+0x74/0xd0
[  290.640198]  ? pci_restore_standard_config+0x40/0x40
[  290.640200]  __rpm_callback+0xca/0x1d0
[  290.640201]  ? pci_restore_standard_config+0x40/0x40
[  290.640203]  rpm_callback+0x1f/0x70
[  290.640204]  ? pci_restore_standard_config+0x40/0x40
[  290.640205]  rpm_resume+0x5bc/0x7e0
[  290.640206]  __pm_runtime_resume+0x47/0x70
[  290.640231]  nouveau_drm_open+0x3b/0x1a0 [nouveau]
[  290.640241]  drm_file_alloc+0x15a/0x240 [drm]
[  290.640247]  drm_open+0xa7/0x210 [drm]
[  290.640253]  ? drm_dev_enter+0x19/0x50 [drm]
[  290.640259]  drm_stub_open+0xaf/0xf0 [drm]
[  290.640262]  chrdev_open+0xa2/0x1c0
[  290.640263]  ? cdev_put.part.0+0x20/0x20
[  290.640265]  do_dentry_open+0x12f/0x340
[  290.640266]  path_openat+0x2c5/0x1500
[  290.640269]  ? __alloc_pages_nodemask+0x160/0x300
[  290.640270]  do_filp_open+0x93/0x100
[  290.640272]  ? __check_object_size+0x15d/0x189
[  290.640274]  do_sys_open+0x186/0x220
[  290.640276]  do_syscall_64+0x5b/0x160
[  290.640278]  entry_SYSCALL_64_after_hwframe+0x44/0xa9
[  290.640279] RIP: 0033:0x7f9b71945ca2
[  290.640280] Code: 25 00 00 41 00 3d 00 00 41 00 74 4c 48 8d 05 85 7a 0d 00 8b 00 85 c0 75 6d 89 f2 b8 01 01 00 00 48 89 fe bf 9c ff ff ff 0f 05 <48> 3d 00 f0 ff ff 0f 87 a2 00 00 00 48 8b 4c 24 28 64 48 33 0c 25
[  290.640281] RSP: 002b:00007fff92e0f1e0 EFLAGS: 00000246 ORIG_RAX: 0000000000000101
[  290.640282] RAX: ffffffffffffffda RBX: 00007f9b716ac270 RCX:
[  290.640282] RDX: 0000000000000002 RSI: 000055d4899fae30 RDI: 00000000ffffff9c
[  290.640283] RBP: 000055d4899fada8 R08: 000055d4899fad70 R09: 0000000000000000
[  290.640283] R10: 0000000000000000 R11: 0000000000000246 R12: 000055d4899fada0
[  290.640284] R13: 0000000000000000 R14: 0000000000000000 R15: 000055d4899fad20
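
Read from the bottom up, the call trace above goes from an open of the DRM device, through a PCI runtime resume, into `nvkm_pmu_init`, with the warning raised in `nvkm_pmu_reset`. A minimal sketch for pulling the frames that belong to one module out of such a trace is shown below. The helper `module_frames` and its regular expression are assumptions made for illustration; frames prefixed with `?` are skipped because the kernel marks them as unreliable.

```python
import re

# Assumed frame shape, copied from the trace above:
#   [  290.640041]  nvkm_pmu_init+0x16/0x40 [nouveau]
#   [  290.640263]  ? cdev_put.part.0+0x20/0x20
FRAME_RE = re.compile(
    r"^\[\s*\d+\.\d+\]\s+(?P<unsure>\? )?(?P<func>[\w.]+)"
    r"\+0x[0-9a-f]+/0x[0-9a-f]+(?: \[(?P<module>\w+)\])?$"
)


def module_frames(trace_text, module="nouveau"):
    """List reliable frames from one kernel module, innermost first."""
    frames = []
    for line in trace_text.splitlines():
        match = FRAME_RE.match(line.strip())
        if match and not match.group("unsure") and match.group("module") == module:
            frames.append(match.group("func"))
    return frames


if __name__ == "__main__":
    sample = """\
[  290.640013] Call Trace:
[  290.640041]  nvkm_pmu_init+0x16/0x40 [nouveau]
[  290.640196]  pci_pm_runtime_resume+0x74/0xd0
[  290.640231]  nouveau_drm_open+0x3b/0x1a0 [nouveau]
"""
    print(module_frames(sample))
```

Frames without a bracketed module name belong to the core kernel image, so they are never returned for a module filter.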

Running Fedora 29

Linux dhcp-ip-deleted.fc29.x86_64 #1 SMP Wed Mar 27 20:58:04 UTC 2019 x86_64 x86_64 x86_64 GNU/Linux
Comment 8 c_norman 2019-04-02 18:16:01 UTC
I accidentally deleted the kernel version in my previous comment. It is:

 5.0.5-200.fc29.x86_64

