Summary:     Xorg takes 100% CPU when using a configuration with multiple independent screens
Product:     xorg
Component:   Driver/nouveau
Status:      RESOLVED MOVED
Severity:    normal
Priority:    medium
Version:     unspecified
Hardware:    x86-64 (AMD64)
OS:          Linux (All)
Reporter:    dura <dura>
Assignee:    Nouveau Project <nouveau>
QA Contact:  Xorg Project Team <xorg-team>
Attachments: my xorg.conf (attachment 115835), quick and dirty fix (attachment 115836)

Created attachment 115836 [details] [review]
quick and dirty fix

I also have a two-screen setup and have been facing the same problem since the beginning of April. strace shows the KOBJECT_UEVENT netlink socket in Xorg returning immediately from select(). The host gets around 9000 interrupts/sec.

cat /proc/interrupts:

           CPU0       CPU1       CPU2       CPU3
  0:        151          0          0          0   IO-APIC-edge      timer
  1:          1          0          0          0   IO-APIC-edge      i8042
  8:         20          0          0          0   IO-APIC-edge      rtc0
  9:          0          0          0          0   IO-APIC-fasteoi   acpi
 12:          1          0          0          0   IO-APIC-edge      i8042
 16:       8898          0          0          0   IO-APIC 16-fasteoi   ehci_hcd:usb1
 17:       1268          0          0          0   IO-APIC 17-fasteoi   snd_hda_intel
 18:          0          0          0          0   IO-APIC 18-fasteoi   0000:03:00.0, i801_smbus
 19:      70297          0          0          0   IO-APIC 19-fasteoi   pata_jmicron
 23:         33          0          0          0   IO-APIC 23-fasteoi   ehci_hcd:usb2
 24:    6912419          0          0          0   HPET_MSI-edge     hpet2
 25:          0    7491688          0          0   HPET_MSI-edge     hpet3
 26:          0          0    4552335          0   HPET_MSI-edge     hpet4
 27:          0          0          0    8398350   HPET_MSI-edge     hpet5
 34:     155556          0          0          0   PCI-MSI-edge      nouveau
 35:      37000          0          0          0   PCI-MSI-edge      0000:00:1f.2
 36:          0          0          0          0   PCI-MSI-edge      0000:09:00.0
 37:     120172          0          0          0   PCI-MSI-edge      enp5s0
 38:        325          0          0          0   PCI-MSI-edge      snd_hda_intel
NMI:      76929      95741      80757      78650   Non-maskable interrupts
LOC:          0        125        121        118   Local timer interrupts
SPU:          0          0          0          0   Spurious interrupts
PMI:      76928      95740      80756      78649   Performance monitoring interrupts
IWI:         11         10         11          5   IRQ work interrupts
RTR:          5          0          0          0   APIC ICR read retries
RES:      83545      76436      81720      73802   Rescheduling interrupts
CAL:        142        146        161         94   Function call interrupts
TLB:        276        150        213        227   TLB shootdowns
TRM:          0          0          0          0   Thermal event interrupts
THR:          0          0          0          0   Threshold APIC interrupts
MCE:          0          0          0          0   Machine check exceptions
MCP:        251        251        251        251   Machine check polls
ERR:          0
MIS:          0

Kernel function tracer (cat /sys/kernel/debug/tracing/trace):

  X-2956   [000] .... 74409.277687: __fget_light <-__fdget
  zsh-6138 [002] d... 74409.277688: __mod_zone_page_state <-get_page_from_freelist
  X-2956   [000] .... 74409.277688: sock_poll <-do_select
  zsh-6138 [002] d... 74409.277688: zone_statistics <-get_page_from_freelist
  X-2956   [000] .... 74409.277688: unix_poll <-sock_poll
  zsh-6138 [002] d... 74409.277688: __inc_zone_state <-zone_statistics
  X-2956   [000] .... 74409.277688: __fdget <-do_select
  zsh-6138 [002] d... 74409.277688: __inc_zone_state <-zone_statistics
  X-2956   [000] .... 74409.277688: __fget_light <-__fdget
  X-2956   [000] .... 74409.277688: sock_poll <-do_select
  X-2956   [000] .... 74409.277688: unix_poll <-sock_poll
  X-2956   [000] .... 74409.277688: __fdget <-do_select
  X-2956   [000] .... 74409.277688: __fget_light <-__fdget
  X-2956   [000] .... 74409.277688: sock_poll <-do_select
  X-2956   [000] .... 74409.277688: unix_poll <-sock_poll
  X-2956   [000] .... 74409.277688: __fdget <-do_select
  X-2956   [000] .... 74409.277688: __fget_light <-__fdget
  X-2956   [000] .... 74409.277689: sock_poll <-do_select
  X-2956   [000] .... 74409.277689: unix_poll <-sock_poll
  X-2956   [000] .... 74409.277689: __fdget <-do_select
  X-2956   [000] .... 74409.277689: __fget_light <-__fdget
  X-2956   [000] .... 74409.277689: sock_poll <-do_select
  X-2956   [000] .... 74409.277689: unix_poll <-sock_poll
  X-2956   [000] .... 74409.277689: __fdget <-do_select
  X-2956   [000] .... 74409.277689: __fget_light <-__fdget
  X-2956   [000] .... 74409.277689: sock_poll <-do_select
  X-2956   [000] .... 74409.277689: unix_poll <-sock_poll
  zsh-6138 [002] .... 74409.277689: kmemleak_alloc <-kmalloc_order
  X-2956   [000] .... 74409.277689: __fdget <-do_select
  X-2956   [000] .... 74409.277690: __fget_light <-__fdget
  X-2956   [000] .... 74409.277690: sock_poll <-do_select
Perf top:

   PerfTop:  9602 irqs/sec  kernel:86.0%  exact: 0.0% [4000Hz cycles],  (all, 4 CPUs)
------------------------------------------------------------------------------------------
    19.24%  [kernel]                 [k] do_select
    17.08%  [kernel]                 [k] __fget_light
    13.34%  [kernel]                 [k] unix_poll
    13.18%  [kernel]                 [k] sock_poll
     3.00%  [kernel]                 [k] do_raw_spin_lock
     1.81%  [kernel]                 [k] _raw_spin_unlock_irqrestore
     1.55%  Xorg                     [.] WaitForSomething
     1.45%  [kernel]                 [k] __fdget
     1.38%  [kernel]                 [k] datagram_poll
     1.37%  [kernel]                 [k] fput
     1.37%  [kernel]                 [k] _raw_spin_lock_irqsave
     1.37%  [kernel]                 [k] __pollwait
     1.31%  [vdso]                   [.] __vdso_clock_gettime
     1.23%  [kernel]                 [k] memset
     0.76%  libdrm_nouveau.so.2.0.0  [.] pushbuf_flush
     0.76%  libdrm_nouveau.so.2.0.0  [.] pushbuf_kref
     0.72%  nouveau_drv.so           [.] drmmode_wakeup_handler
     0.72%  [kernel]                 [k] copy_user_generic_string
     0.64%  [kernel]                 [k] do_raw_spin_unlock
     0.61%  [kernel]                 [k] hpet_msi_next_event
     0.57%  libdrm_nouveau.so.2.0.0  [.] nouveau_bo_ref
     0.55%  [kernel]                 [k] system_call
     0.54%  [kernel]                 [k] core_sys_select
     0.51%  [kernel]                 [k] select_estimate_accuracy
     0.45%  nouveau_drv.so           [.] NVBlockHandler
     0.44%  Xorg                     [.] GetTimeInMillis
     0.43%  [kernel]                 [k] poll_freewait
     0.43%  [kernel]                 [k] ktime_get_ts64
     0.42%  [kernel]                 [k] remove_wait_queue
     0.41%  Xorg                     [.] IdleTimeWakeupHandler
     0.40%  [kernel]                 [k] add_wait_queue
     0.39%  libexa.so                [.] ExaBlockHandler
     0.37%  Xorg                     [.] BlockHandler
     0.35%  Xorg                     [.] WakeupHandler
     0.32%  Xorg                     [.] xf86Wakeup
     0.31%  [kernel]                 [k] nouveau_barobj_wr32

The bug triggers when the display of a screen is turned on. Restarting X fixes the issue until a display is power-cycled again.

strace of Xorg (PID 2954):

2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 56000}) = 1 (in [12], left {264, 55981})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 56000}) = 1 (in [12], left {264, 55981})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 55000}) = 1 (in [12], left {264, 54982})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 55000}) = 1 (in [12], left {264, 54981})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 55000}) = 1 (in [12], left {264, 54981})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 54000}) = 1 (in [12], left {264, 53982})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 54000}) = 1 (in [12], left {264, 53982})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 54000}) = 1 (in [12], left {264, 53981})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 53000}) = 1 (in [12], left {264, 52981})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 53000}) = 1 (in [12], left {264, 52981})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 53000}) = 1 (in [12], left {264, 52981})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 52000}) = 1 (in [12], left {264, 51981})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 52000}) = 1 (in [12], left {264, 51981})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 51000}) = 1 (in [12], left {264, 50982})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 51000}) = 1 (in [12], left {264, 50983})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 51000}) = 1 (in [12], left {264, 50983})
2954 select(256, [1 3 5 10 11 12 18 19 20 21 22 23 24 25 26 27 28 29 30 31 32 33 34 35 36 37 38 39 40 41 42 43 44 45 46 47 48 49 50 51 52 53 54 55 56 57 58 59 60 61 62 63 64 65 66 67 68 69 70 71 72 73 74 75 76 77 78 79 80 81], NULL, NULL, {264, 51000}) = 1 (in [12], left {264, 50983})

cat /proc/2954/fd:

0 l-wx------ 1 root root 64 May 16 23:47 0  -> /var/log/Xorg.0.log
0 lrwx------ 1 root root 64 May 16 23:47 1  -> socket:[12196]
0 lrwx------ 1 root root 64 May 16 23:47 10 -> /dev/dri/card0
0 lrwx------ 1 root root 64 May 16 23:47 11 -> socket:[1845]
0 lrwx------ 1 root root 64 May 16 23:47 12 -> socket:[1858]
0 lrwx------ 1 root root 64 May 16 23:47 13 -> /dev/dri/card0
0 lrwx------ 1 root root 64 May 16 23:47 14 -> /dev/input/event1
0 lrwx------ 1 root root 64 May 16 23:47 15 -> /dev/input/event0
0 lrwx------ 1 root root 64 May 16 23:47 16 -> /dev/input/event2
0 lrwx------ 1 root root 64 May 16 23:47 17 -> /dev/input/event3
0 lrwx------ 1 root root 64 May 16 23:47 18 -> socket:[2848]
....etc

lsof -p 2954:

X 2954 root  0w  REG      8,2              164675     262153 /var/log/Xorg.0.log
X 2954 root  1u  unix     0xffff880214864840  0t0      12196 @/tmp/.X11-unix/X0
X 2954 root  2w  REG      8,2               56795     262596 /var/log/kdm.log
X 2954 root  3u  unix     0xffff880214864c80  0t0      12197 /tmp/.X11-unix/X0
X 2954 root  4r  REG      8,2               31614     524293 /usr/lib64/xorg/protocol.txt
X 2954 root  5u  netlink                      0t0       2823 KOBJECT_UEVENT
X 2954 root  6w  REG      0,3                   0 4026531961 /proc/mtrr
X 2954 root  7w  REG      0,3                   0 4026531961 /proc/mtrr
X 2954 root  8u  CHR      4,7                 0t0       2080 /dev/tty7
X 2954 root  9u  CHR      10,63               0t0       1026 /dev/vga_arbiter
X 2954 root 10u  CHR      226,0               0t0       1044 /dev/dri/card0
X 2954 root 11u  sock     0,6                 0t0       1845 protocol: NETLINK
X 2954 root 12u  netlink                      0t0       1858 KOBJECT_UEVENT
X 2954 root 13u  CHR      226,0               0t0       1044 /dev/dri/card0
X 2954 root 14u  CHR      13,65               0t0       1082 /dev/input/event1
X 2954 root 15u  CHR      13,64               0t0      10912 /dev/input/event0
X 2954 root 16u  CHR      13,66               0t0       2447 /dev/input/event2
X 2954 root 17u  CHR      13,67               0t0       1323 /dev/input/event3
X 2954 root 18u  unix     0xffff8800d9839100  0t0       2848 @/tmp/.X11-unix/X0

@dura Thank you for your quick and dirty fix! I am suffering from the same problem (reported here, by the way: https://bugs.freedesktop.org/show_bug.cgi?id=98383). With Xinerama disabled, your patch works great: no more 100% CPU usage! But when it is disabled and I power up my TV, I still get a server crash, though that could be another story (maybe related?). What is blocking this patch from being merged upstream?

-- GitLab Migration Automatic Message --

This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity. You can subscribe and participate further through the new bug via this link to our GitLab instance: https://gitlab.freedesktop.org/xorg/driver/xf86-video-nouveau/issues/190.
Created attachment 115835 [details]
my xorg.conf

My Xorg setup is based on two independent screens (see the attached xorg.conf). Most of the time the Xorg process eats 100% of one CPU core. I will try to explain what I think the problem is, and the fix (or workaround) I made to the nouveau code.

Basically, the problem is that nouveau adds a socket to the Xorg socket list but does not register a handler able to deal with it. When data arrives on this socket, the select() call in xorg-server/os/WaitFor.c returns, but the data is never read. When select() is called again it returns immediately because the data is still pending, and again, and again, resulting in one CPU core being consumed at 100%.

Now, some details (based on the xf86-video-nouveau-1.0.11 source code). To add a socket to the X server's list, AddGeneralSocket() is used. In src/drmmode_display.c, AddGeneralSocket() is called in two places: line 1605 (via drmmode_uevent_init() and line 1554) and line 1610. If the condition at line 1608 is not met, a socket is added without a handler, so drmmode_wakeup_handler() will never be called with correct data. My solution is to add a new handler that handles only the udev part when the line 1608 condition is not met. See the attached patch.
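To make the failure mode concrete, here is a minimal, self-contained demonstration (not driver code) of why an fd that is handed to select() but never read produces exactly the busy loop captured in the strace above: the data stays queued, so every call returns immediately with almost the full timeout left.

/* demo only: select() on an unread datagram socket never blocks */
#include <stdio.h>
#include <sys/select.h>
#include <sys/socket.h>
#include <unistd.h>

int main(void)
{
    int sv[2];
    char byte = 'x';

    socketpair(AF_UNIX, SOCK_DGRAM, 0, sv);
    write(sv[1], &byte, 1);            /* one pending datagram, never read */

    for (int i = 0; i < 5; i++) {
        fd_set rfds;
        struct timeval tv = { 10, 0 };
        FD_ZERO(&rfds);
        FD_SET(sv[0], &rfds);
        /* returns immediately every time because the data is still there */
        int n = select(sv[0] + 1, &rfds, NULL, NULL, &tv);
        printf("select returned %d, %ld.%06ld s of the timeout left\n",
               n, (long) tv.tv_sec, (long) tv.tv_usec);
    }
    return 0;
}

The shape of the workaround could then look roughly like the sketch below. This is only an illustration of the idea, not the attached patch: drmmode_udev_block_handler and drmmode_udev_wakeup_handler are made-up names, drmmode->uevent_monitor and drmmode_uevent_fired() are assumptions standing in for whatever src/drmmode_display.c really uses to hold the udev monitor and consume its events, and it targets the pre-1.19 RegisterBlockAndWakeupHandlers() server API. The point is simply that every fd passed to AddGeneralSocket() also needs a wakeup handler that reads it.

/* Sketch only, meant to live inside src/drmmode_display.c where the xorg
 * and libudev headers are already included. */
#ifdef HAVE_LIBUDEV
static void
drmmode_udev_block_handler(pointer data, OSTimePtr timeout, pointer read_mask)
{
    /* nothing to do before select(); only the wakeup side matters */
}

static void
drmmode_udev_wakeup_handler(pointer data, int err, pointer read_mask)
{
    drmmode_ptr drmmode = data;        /* assumed private passed at registration */
    int fd;

    if (!drmmode || !drmmode->uevent_monitor || err < 0)
        return;

    fd = udev_monitor_get_fd(drmmode->uevent_monitor);

    /* Drain the netlink socket so select() stops reporting it as readable. */
    if (FD_ISSET(fd, (fd_set *) read_mask))
        drmmode_uevent_fired(drmmode); /* hypothetical name for the udev consumer */
}
#endif

/* In drmmode_uevent_init(), right after AddGeneralSocket(fd), for the case
 * where the normal drmmode_wakeup_handler was not registered for this
 * screen (the line 1608 case): */
#ifdef HAVE_LIBUDEV
RegisterBlockAndWakeupHandlers(drmmode_udev_block_handler,
                               drmmode_udev_wakeup_handler,
                               drmmode);
#endif

With a handler like this registered even when the line 1608 condition is not met, the KOBJECT_UEVENT socket gets drained on every wakeup and select() goes back to sleeping for its full timeout.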