Bug 23787 - CPU soft lockups with KMS
Summary: CPU soft lockups with KMS
Status: RESOLVED FIXED
Alias: None
Product: xorg
Classification: Unclassified
Component: Driver/Radeon
Version: 7.4 (2008.09)
Hardware: Other Linux (All)
Importance: medium normal
Assignee: xf86-video-ati maintainers
QA Contact: Xorg Project Team
 
Reported: 2009-09-08 01:34 UTC by Mikko C.
Modified: 2009-10-21 04:46 UTC

Attachments
dmesg before the hang (76.97 KB, text/plain)
2009-09-08 02:05 UTC, Mikko C.

Description Mikko C. 2009-09-08 01:34:59 UTC
I'm not sure exactly when this happens, whether it is random, or whether it is related to my other long-standing bug report about random hard lockups ( http://bugs.freedesktop.org/show_bug.cgi?id=16198 ).

Anyway, with KMS enabled I got this in my log:

Sep  6 14:04:26 gentoo kernel: BUG: soft lockup - CPU#1 stuck for 61s! [gpm:5843]    
Sep  6 14:04:26 gentoo kernel: Modules linked in: vfat fat coretemp hwmon cpufreq_ondemand fan snd_hda_codec_idt snd_hda_intel radeon snd_hda_codec snd_pcm iwl3945 iwlcore snd_timer mac80211 ttm led_class drm cfg80211 ehci_hcd uhci_hcd snd i2c_algo_bit thermal battery ac evdev psmouse dell_laptop wmi button rfkill soundcore snd_page_alloc                                      
Sep  6 14:04:26 gentoo kernel: CPU 1:                                                                                                                                                        
Sep  6 14:04:26 gentoo kernel: Modules linked in: vfat fat coretemp hwmon cpufreq_ondemand fan snd_hda_codec_idt snd_hda_intel radeon snd_hda_codec snd_pcm iwl3945 iwlcore snd_timer mac80211 ttm led_class drm cfg80211 ehci_hcd uhci_hcd snd i2c_algo_bit thermal battery ac evdev psmouse dell_laptop wmi button rfkill soundcore snd_page_alloc
Sep  6 14:04:26 gentoo kernel: Pid: 5843, comm: gpm Not tainted 2.6.31-rc8 #1 MM061
Sep  6 14:04:26 gentoo kernel: RIP: 0010:[<ffffffffa01dd6a3>]  [<ffffffffa01dd6a3>] cail_reg_read+0x63/0x70 [radeon]
Sep  6 14:04:26 gentoo kernel: RSP: 0000:ffff8800764d3908  EFLAGS: 00000286
Sep  6 14:04:26 gentoo kernel: RAX: 0000000000010009 RBX: 0000000000001827 RCX: 0000000000001800
Sep  6 14:04:26 gentoo kernel: RDX: 0000000000001a27 RSI: ffffc90004ae689c RDI: ffffffffa0269240
Sep  6 14:04:26 gentoo kernel: RBP: ffffffff8100c29e R08: 0000000000000001 R09: 0000000000000000


And this with -rc9:

Sep  8 09:59:29 gentoo kernel: BUG: soft lockup - CPU#0 stuck for 61s! [gpm:4518]                                                                                                            
Sep  8 09:59:29 gentoo kernel: Modules linked in: vfat fat coretemp hwmon cpufreq_ondemand fan snd_hda_codec_idt radeon snd_hda_intel ttm snd_hda_codec snd_pcm snd_timer iwl3945 drm iwlcore mac80211 i2c_algo_bit uhci_hcd snd psmouse wmi dell_laptop led_class soundcore ehci_hcd evdev button thermal cfg80211 rfkill snd_page_alloc battery ac                                      
Sep  8 09:59:29 gentoo kernel: CPU 0:                                                                                                                                                        
Sep  8 09:59:29 gentoo kernel: Modules linked in: vfat fat coretemp hwmon cpufreq_ondemand fan snd_hda_codec_idt radeon snd_hda_intel ttm snd_hda_codec snd_pcm snd_timer iwl3945 drm iwlcore mac80211 i2c_algo_bit uhci_hcd snd psmouse wmi dell_laptop led_class soundcore ehci_hcd evdev button thermal cfg80211 rfkill snd_page_alloc battery ac                                      
Sep  8 09:59:29 gentoo kernel: Pid: 4518, comm: gpm Not tainted 2.6.31-rc9 #1 MM061                                                                                                          
Sep  8 09:59:29 gentoo kernel: RIP: 0010:[<ffffffffa01f459c>]  [<ffffffffa01f459c>] atom_get_src_int+0xbc/0x7f0 [radeon]                                                                     
Sep  8 09:59:29 gentoo kernel: RSP: 0018:ffff88007c2e9928  EFLAGS: 00000206                                                                                                                  
Sep  8 09:59:29 gentoo kernel: RAX: 0000000000010009 RBX: 0000000000000009 RCX: 0000000000000000                                                                                             
Sep  8 09:59:29 gentoo kernel: RDX: 0000000000000000 RSI: ffffc90004b0689c RDI: ffffffffa0276240                                                                                             
Sep  8 09:59:29 gentoo kernel: RBP: ffffffff8100c29e R08: 0000000000000001 R09: 0000000000000000                                                                                             
Sep  8 09:59:29 gentoo acpid: exiting                                                                                                                                                        
Sep  8 09:59:29 gentoo dhcpcd[4830]: wlan0: carrier lost
Comment 1 Mikko C. 2009-09-08 01:50:19 UTC
GPU is a Mobility Radeon X1400.
mesa master
libdrm master
xf86-video-ati master
xorg 1.6.3.901
Comment 2 Mikko C. 2009-09-08 02:05:33 UTC
Created attachment 29334
dmesg before the hang

After more debugging, it seems that the hang occurs when I move the mouse to wake the display from the blank screen. Until then, even though the screen is blank, my ssh session keeps working. Once I move the mouse, the ssh session freezes as well, although the machine still responds to ping.

This attachment is dmesg from before the hang.
Comment 3 Mikko C. 2009-09-08 03:12:53 UTC
A different backtrace with -rc6 (I'm trying to determine whether this is a regression or has always been there).


Sep  8 12:10:16 gentoo kernel: BUG: soft lockup - CPU#0 stuck for 61s! [gpm:4454]                                                                                                            
Sep  8 12:10:16 gentoo kernel: Modules linked in: vfat fat coretemp hwmon cpufreq_ondemand fan iwl3945 iwlcore mac80211 led_class snd_hda_codec_idt cfg80211 radeon snd_hda_intel dell_laptop snd_hda_codec rfkill wmi snd_pcm snd_timer ehci_hcd uhci_hcd snd soundcore snd_page_alloc ac psmouse ttm drm thermal button i2c_algo_bit evdev battery                                      
Sep  8 12:10:16 gentoo kernel: CPU 0:                                                                                                                                                        
Sep  8 12:10:16 gentoo kernel: Modules linked in: vfat fat coretemp hwmon cpufreq_ondemand fan iwl3945 iwlcore mac80211 led_class snd_hda_codec_idt cfg80211 radeon snd_hda_intel dell_laptop snd_hda_codec rfkill wmi snd_pcm snd_timer ehci_hcd uhci_hcd snd soundcore snd_page_alloc ac psmouse ttm drm thermal button i2c_algo_bit evdev battery                                      
Sep  8 12:10:16 gentoo kernel: Pid: 4454, comm: gpm Not tainted 2.6.31-rc6 #5 MM061                                                                                                          
Sep  8 12:10:16 gentoo kernel: RIP: 0010:[<ffffffffa0155cf3>]  [<ffffffffa0155cf3>] r100_mm_rreg+0x53/0x60 [radeon]                                                                          
Sep  8 12:10:16 gentoo kernel: RSP: 0018:ffff88007d159908  EFLAGS: 00000286                                                                                                                  
Sep  8 12:10:16 gentoo kernel: RAX: 0000000000010009 RBX: 0000000000001827 RCX: 0000000000001800                                                                                             
Sep  8 12:10:16 gentoo kernel: RDX: 0000000000001a27 RSI: ffffc900049c689c RDI: ffff88007eeca000                                                                                             
Sep  8 12:10:16 gentoo kernel: RBP: ffffffff8100c29e R08: 0000000000000001 R09: 0000000000000000
Comment 4 Alex Deucher 2009-09-08 06:29:43 UTC
This looks like bug 16781.  We probably need to pull commit 9a108f0a0b7203458673ce6221e747a166d39617 into the KMS code.
Comment 5 Mikko C. 2009-09-08 06:37:32 UTC
If you could make a patch that applies to 2.6.31-rc9, I will try it.
Anyway, it also occurs in -rc1, so it's not a regression.
Comment 6 Mikko C. 2009-09-12 06:25:35 UTC
If it's any help, this bug only occurs when X is not running, i.e. only when using the console login. If I'm logged in to KDE and the screen goes blank, it recovers just fine when I move the mouse.
Comment 7 Mikko C. 2009-10-21 04:46:47 UTC
This seems to be fixed in 2.6.32-rc3.
Thanks
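
The three soft-lockup traces quoted in this report (2.6.31-rc8, -rc9 and -rc6) stop at different symbols inside the radeon module, which is easier to see when the header and RIP lines are pulled out side by side. The following is only a minimal sketch of how that could be done from a syslog excerpt; the function name summarize_lockups, the regular expressions and the SAMPLE text are illustrative assumptions rather than part of any kernel or Xorg tooling, and the log lines are copied from the comments above.

```python
import re

# Header line printed by the kernel's soft-lockup watchdog.
LOCKUP_RE = re.compile(
    r"BUG: soft lockup - CPU#(?P<cpu>\d+) stuck for (?P<secs>\d+)s! "
    r"\[(?P<comm>[^:\]]+):(?P<pid>\d+)\]"
)

# RIP line: captures the symbol name and, if present, the owning module.
RIP_RE = re.compile(
    r"RIP: \S+\s+\[<[0-9a-f]+>\]\s+(?P<symbol>\w+)\+0x[0-9a-f]+/0x[0-9a-f]+"
    r"(?: \[(?P<module>\w+)\])?"
)


def summarize_lockups(lines):
    """Return one dict per soft-lockup report found in `lines`.

    Hypothetical helper: pairs each "BUG: soft lockup" header with the
    first RIP line that follows it.
    """
    reports = []
    current = None
    for line in lines:
        header = LOCKUP_RE.search(line)
        if header:
            current = {
                "cpu": int(header["cpu"]),
                "seconds": int(header["secs"]),
                "comm": header["comm"],
                "pid": int(header["pid"]),
                "symbol": None,
                "module": None,
            }
            reports.append(current)
            continue
        rip = RIP_RE.search(line)
        if rip and current is not None and current["symbol"] is None:
            current["symbol"] = rip["symbol"]
            current["module"] = rip["module"]
    return reports


# Header and RIP lines taken from the traces in this report.
SAMPLE = """\
Sep  6 14:04:26 gentoo kernel: BUG: soft lockup - CPU#1 stuck for 61s! [gpm:5843]
Sep  6 14:04:26 gentoo kernel: RIP: 0010:[<ffffffffa01dd6a3>]  [<ffffffffa01dd6a3>] cail_reg_read+0x63/0x70 [radeon]
Sep  8 09:59:29 gentoo kernel: BUG: soft lockup - CPU#0 stuck for 61s! [gpm:4518]
Sep  8 09:59:29 gentoo kernel: RIP: 0010:[<ffffffffa01f459c>]  [<ffffffffa01f459c>] atom_get_src_int+0xbc/0x7f0 [radeon]
Sep  8 12:10:16 gentoo kernel: BUG: soft lockup - CPU#0 stuck for 61s! [gpm:4454]
Sep  8 12:10:16 gentoo kernel: RIP: 0010:[<ffffffffa0155cf3>]  [<ffffffffa0155cf3>] r100_mm_rreg+0x53/0x60 [radeon]
"""

reports = summarize_lockups(SAMPLE.splitlines())
for r in reports:
    print(r["comm"], r["pid"], "CPU", r["cpu"], r["symbol"], r["module"])
```

In all three cases the stuck process is gpm and the instruction pointer is inside the radeon module, which fits the observation that the hang is triggered by moving the mouse on a blanked console.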

