Bug 26438

Summary: [CS checker code][r6xx][2.6.33-rc6 fc76be4 (git snapshot) + drm-radeon-next] playing Second Life causes GPU lockup.
Product: DRI
Reporter: Shawn Starr <shawn.starr>
Component: DRM/Radeon
Assignee: Default DRI bug account <dri-devel>
Status: RESOLVED FIXED
QA Contact:
Severity: normal
Priority: medium
CC: pedretti.fabio
Version: XOrg git
Hardware: Other
OS: All
Attachments:
Description                               Flags
kernel bootup output                      none
crashdump (easier to read than in bug)    none
crash dump from today                     none
kernel oops dump                          none
Add free flag for ib                      none

Description Shawn Starr 2010-02-05 00:31:08 UTC
Created attachment 33086 [details]
kernel bootup output

module options for radeon: modeset=1 audio=0 dynclks=1 dynpm=1

* Notes: While I was fiddling with VBOs in Second Life, the drm printed:
radeon 0000:01:00.0: vbo resource seems too big for the bo

This happens no matter what VRAM size is set: 256MB VRAM with 192MB of texture memory, or 128MB VRAM with 96MB of texture memory. (You can set what the game 'thinks' is the most memory available. It tries to read the VRAM info from the Xorg log file and fails, because the string changed and broke this code, so it falls back to 128MB.)

I don't think this had anything to do with triggering the crash, but that error is alarming nonetheless.

Crash below:

Feb  5 03:11:46 segfault kernel: [ 2603.824186] [drm] Requested: e: 60000 m: 70000 p: 16
Feb  5 03:11:47 segfault kernel: [ 2604.386373] ------------[ cut here ]------------
Feb  5 03:11:47 segfault kernel: [ 2604.386392] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:159 radeon_fence_signaled+0x6a/0x9c [radeon]()
Feb  5 03:11:47 segfault kernel: [ 2604.386395] Hardware name: 4058CTO
Feb  5 03:11:47 segfault kernel: [ 2604.386396] Querying an unemited fence : ffff88003af579c0 !
Feb  5 03:11:47 segfault kernel: [ 2604.386398] Modules linked in: aes_x86_64 aes_generic ipv6 radeon ttm drm_kms_helper drm i2c_algo_bit acpi_cpufreq cpufreq_conservative cpufreq_ondemand cpufreq_powersave cpufreq_stats freq_table uinput fuse coretemp firewire_sbp2 arc4 ecb iwlagn iwlcore mac80211 snd_hda_codec_conexant snd_seq_dummy thinkpad_acpi hwmon snd_usb_audio snd_hda_intel snd_hda_codec snd_hwdep snd_seq_oss snd_usb_lib snd_pcm_oss snd_seq_midi snd_mixer_oss snd_seq_midi_event snd_rawmidi snd_pcm snd_seq cfg80211 uvcvideo rfkill videodev i2c_i801 snd_timer snd_seq_device snd_page_alloc v4l1_compat snd wmi joydev v4l2_compat_ioctl32 soundcore i2c_core ata_generic sdhci_pci sdhci firewire_ohci firewire_core ricoh_mmc mmc_core crc_itu_t pata_acpi video output e1000e [last unloaded: scsi_wait_scan]
Feb  5 03:11:47 segfault kernel: [ 2604.386449] Pid: 11077, comm: do-not-directly Not tainted 2.6.33-rc6-custom-00237-g9b03279 #1
Feb  5 03:11:47 segfault kernel: [ 2604.386451] Call Trace:
Feb  5 03:11:47 segfault kernel: [ 2604.386457]  [<ffffffff8104493a>] warn_slowpath_common+0x77/0x8f
Feb  5 03:11:47 segfault kernel: [ 2604.386459]  [<ffffffff8104499f>] warn_slowpath_fmt+0x3c/0x3e
Feb  5 03:11:47 segfault kernel: [ 2604.386473]  [<ffffffffa032ffcf>] ? radeon_fence_destroy+0x69/0x6e [radeon]
Feb  5 03:11:47 segfault kernel: [ 2604.386486]  [<ffffffffa033005a>] radeon_fence_signaled+0x6a/0x9c [radeon]
Feb  5 03:11:47 segfault kernel: [ 2604.386499]  [<ffffffffa033091e>] radeon_sync_obj_signaled+0x9/0xb [radeon]
Feb  5 03:11:47 segfault kernel: [ 2604.386504]  [<ffffffffa02648d8>] ttm_bo_wait+0x56/0x14e [ttm]
Feb  5 03:11:47 segfault kernel: [ 2604.386520]  [<ffffffffa03403c5>] radeon_bo_wait+0xa8/0xc8 [radeon]
Feb  5 03:11:47 segfault kernel: [ 2604.386535]  [<ffffffffa03404b3>] radeon_gem_busy_ioctl+0x47/0x9c [radeon]
Feb  5 03:11:47 segfault kernel: [ 2604.386544]  [<ffffffffa02e822a>] drm_ioctl+0x254/0x366 [drm]
Feb  5 03:11:47 segfault kernel: [ 2604.386559]  [<ffffffffa034046c>] ? radeon_gem_busy_ioctl+0x0/0x9c [radeon]
Feb  5 03:11:47 segfault kernel: [ 2604.386563]  [<ffffffff811b1f9c>] ? inode_has_perm+0x75/0x8b
Feb  5 03:11:47 segfault kernel: [ 2604.386567]  [<ffffffff8100fad3>] ? __switch_to_xtra+0x11c/0x13c
Feb  5 03:11:47 segfault kernel: [ 2604.386571]  [<ffffffff810076d0>] ? __switch_to+0x215/0x227
Feb  5 03:11:47 segfault kernel: [ 2604.386574]  [<ffffffff810f9498>] vfs_ioctl+0x2d/0xa1
Feb  5 03:11:47 segfault kernel: [ 2604.386577]  [<ffffffff810f9a00>] do_vfs_ioctl+0x47d/0x4c3
Feb  5 03:11:47 segfault kernel: [ 2604.386579]  [<ffffffff810f9a97>] sys_ioctl+0x51/0x74
Feb  5 03:11:47 segfault kernel: [ 2604.386582]  [<ffffffff81008a82>] system_call_fastpath+0x16/0x1b
Feb  5 03:11:47 segfault kernel: [ 2604.386584] ---[ end trace b5271f8192af5e53 ]---
Comment 1 Shawn Starr 2010-02-05 00:34:51 UTC
Created attachment 33087 [details]
crashdump (easier to read than in bug)
Comment 2 Shawn Starr 2010-02-05 00:38:28 UTC
From Dave Airlie: That's the CS checker code from glisse. He'll need to test when he gets back from FOSDEM.
Comment 3 Shawn Starr 2010-02-11 19:56:18 UTC
Attaching new crash info: this is 2.6.33-rc7 plus any of Linus's later changes in linux-2.6, with drm-linus, drm-core-next, drm-next and drm-radeon-testing merged in.

See the attachment to this bug for an easier-to-read dump.

Feb 11 22:32:22 segfault kernel: [ 1495.696093] ------------[ cut here ]------------
Feb 11 22:32:22 segfault kernel: [ 1495.696105] WARNING: at lib/kref.c:43 kref_get+0x23/0x2d()
Feb 11 22:32:22 segfault kernel: [ 1495.696108] Hardware name: 4058CTO
Feb 11 22:32:22 segfault kernel: [ 1495.696110] Modules linked in: vboxnetadp vboxnetflt vboxdrv aes_x86_64 aes_generic ipv6 acpi_cpufreq cpufreq_conservative cpufreq_ondemand cpufreq_powersave cpufreq_stats freq_table uinput radeon ttm drm_kms_helper drm i2c_algo_bit fuse coretemp firewire_sbp2 snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_pcm_oss snd_mixer_oss snd_usb_audio snd_pcm thinkpad_acpi uvcvideo hwmon snd_seq_dummy snd_hwdep snd_usb_lib arc4 snd_seq_oss ecb videodev snd_seq_midi snd_rawmidi snd_seq_midi_event iwlagn v4l1_compat snd_seq v4l2_compat_ioctl32 snd_timer iwlcore snd_seq_device i2c_i801 wmi i2c_core mac80211 joydev snd cfg80211 rfkill soundcore snd_page_alloc sdhci_pci sdhci mmc_core firewire_ohci firewire_core crc_itu_t ricoh_mmc e1000e ata_generic pata_acpi video output [last unloaded: scsi_wait_scan]
Feb 11 22:32:22 segfault kernel: [ 1495.696210] Pid: 6846, comm: do-not-directly Not tainted 2.6.33-rc7-custom-00263-gd464e51 #1
Feb 11 22:32:22 segfault kernel: [ 1495.696214] Call Trace:
Feb 11 22:32:22 segfault kernel: [ 1495.696222]  [<ffffffff81049640>] warn_slowpath_common+0x7c/0x94
Feb 11 22:32:22 segfault kernel: [ 1495.696228]  [<ffffffff8104966c>] warn_slowpath_null+0x14/0x16
Feb 11 22:32:22 segfault kernel: [ 1495.696234]  [<ffffffff8121453a>] kref_get+0x23/0x2d
Feb 11 22:32:22 segfault kernel: [ 1495.696260]  [<ffffffffa033055e>] radeon_fence_ref+0x1a/0x21 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.696300]  [<ffffffffa033244b>] radeon_bo_list_validate+0xac/0xf4 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.696328]  [<ffffffffa03436c5>] radeon_cs_parser_relocs+0x19d/0x1fb [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.696355]  [<ffffffffa0343b14>] radeon_cs_ioctl+0xca/0x1a2 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.696371]  [<ffffffffa02c6512>] drm_ioctl+0x25f/0x371 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.696398]  [<ffffffffa0343a4a>] ? radeon_cs_ioctl+0x0/0x1a2 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.696405]  [<ffffffff8106a78d>] ? sched_clock_local+0x1c/0x82
Feb 11 22:32:22 segfault kernel: [ 1495.696412]  [<ffffffff8110b4cf>] ? rcu_read_unlock+0x0/0x23
Feb 11 22:32:22 segfault kernel: [ 1495.696418]  [<ffffffff8106a8b6>] ? sched_clock_cpu+0xc3/0xce
Feb 11 22:32:22 segfault kernel: [ 1495.696423]  [<ffffffff8106a904>] ? cpu_clock+0x43/0x5e
Feb 11 22:32:22 segfault kernel: [ 1495.696429]  [<ffffffff811174b0>] vfs_ioctl+0x32/0xa6
Feb 11 22:32:22 segfault kernel: [ 1495.696434]  [<ffffffff81117a30>] do_vfs_ioctl+0x490/0x4d6
Feb 11 22:32:22 segfault kernel: [ 1495.696440]  [<ffffffff8110b549>] ? fget_light+0x57/0xf2
Feb 11 22:32:22 segfault kernel: [ 1495.696445]  [<ffffffff81117acc>] sys_ioctl+0x56/0x79
Feb 11 22:32:22 segfault kernel: [ 1495.696452]  [<ffffffff81009c32>] system_call_fastpath+0x16/0x1b
Feb 11 22:32:22 segfault kernel: [ 1495.696457] ---[ end trace 5608d98ef4e9c78b ]---
Feb 11 22:32:22 segfault kernel: [ 1495.698709] BUG: unable to handle kernel paging request at 00000000000021d0
Feb 11 22:32:22 segfault kernel: [ 1495.698715] IP: [<ffffffffa0330582>] radeon_fence_signaled+0x1d/0xa1 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.698739] PGD 10dfde067 PUD 126421067 PMD 0
Feb 11 22:32:22 segfault kernel: [ 1495.698746] Oops: 0000 [#1] SMP
Feb 11 22:32:22 segfault kernel: [ 1495.698751] last sysfs file: /sys/devices/LNXSYSTM:00/LNXSYBUS:00/PNP0A08:00/device:00/PNP0C09:00/PNP0C0A:00/power_supply/BAT0/energy_full
Feb 11 22:32:22 segfault kernel: [ 1495.698756] CPU 0
Feb 11 22:32:22 segfault kernel: [ 1495.698762] Pid: 6846, comm: do-not-directly Tainted: G        W  2.6.33-rc7-custom-00263-gd464e51 #1 4058CTO/4058CTO
Feb 11 22:32:22 segfault kernel: [ 1495.698766] RIP: 0010:[<ffffffffa0330582>]  [<ffffffffa0330582>] radeon_fence_signaled+0x1d/0xa1 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.698790] RSP: 0018:ffff8801072cbc28  EFLAGS: 00010286
Feb 11 22:32:22 segfault kernel: [ 1495.698793] RAX: ffff8801348844c0 RBX: ffff8800ab057fc0 RCX: 0000000000000001
Feb 11 22:32:22 segfault kernel: [ 1495.698797] RDX: 0000000000000001 RSI: 0000000000000000 RDI: 0000000000000000
Feb 11 22:32:22 segfault kernel: [ 1495.698801] RBP: ffff8801072cbc48 R08: 0000000000000000 R09: 0000000000000000
Feb 11 22:32:22 segfault kernel: [ 1495.698804] R10: ffff8800b288b8f8 R11: 0000000000000000 R12: ffff8800b288b8e0
Feb 11 22:32:22 segfault kernel: [ 1495.698808] R13: ffffffffa0393480 R14: ffff8800b288b9b0 R15: ffff8800b288b8e0
Feb 11 22:32:22 segfault kernel: [ 1495.698812] FS:  00007fb6b0682830(0000) GS:ffff880006200000(0000) knlGS:0000000000000000
Feb 11 22:32:22 segfault kernel: [ 1495.698816] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
Feb 11 22:32:22 segfault kernel: [ 1495.698820] CR2: 00000000000021d0 CR3: 000000010df35000 CR4: 00000000000406f0
Feb 11 22:32:22 segfault kernel: [ 1495.698823] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
Feb 11 22:32:22 segfault kernel: [ 1495.698827] DR3: 0000000000000000 DR6: 00000000ffff0ff0 DR7: 0000000000000400
Feb 11 22:32:22 segfault kernel: [ 1495.698831] Process do-not-directly (pid: 6846, threadinfo ffff8801072ca000, task ffff88012ba38000)
Feb 11 22:32:22 segfault kernel: [ 1495.698835] Stack:
Feb 11 22:32:22 segfault kernel: [ 1495.698837]  0000000000000000 ffff8800b288b848 ffff8800b288b8e0 ffffffffa0393480
Feb 11 22:32:22 segfault kernel: [ 1495.698844] <0> ffff8801072cbc58 ffffffffa0330ef5 ffff8801072cbcc8 ffffffffa0300a3e
Feb 11 22:32:22 segfault kernel: [ 1495.698851] <0> ffff8800b288b88c ffff8800b288b8e0 00000001b288b8f8 01ff880000000001
Feb 11 22:32:22 segfault kernel: [ 1495.698859] Call Trace:
Feb 11 22:32:22 segfault kernel: [ 1495.698882]  [<ffffffffa0330ef5>] radeon_sync_obj_signaled+0xe/0x10 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.698892]  [<ffffffffa0300a3e>] ttm_bo_wait+0x5d/0x165 [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.698918]  [<ffffffffa0341c34>] radeon_bo_wait+0xaf/0xd4 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.698944]  [<ffffffffa0341d16>] radeon_gem_busy_ioctl+0x47/0x86 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.698951]  [<ffffffff811d6a00>] ? avc_has_perm+0x5c/0x6e
Feb 11 22:32:22 segfault kernel: [ 1495.698963]  [<ffffffffa02c6512>] drm_ioctl+0x25f/0x371 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.698989]  [<ffffffffa0341ccf>] ? radeon_gem_busy_ioctl+0x0/0x86 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.698995]  [<ffffffff811d7b6e>] ? inode_has_perm+0x7a/0x90
Feb 11 22:32:22 segfault kernel: [ 1495.699000]  [<ffffffff81010115>] ? sched_clock+0x9/0xd
Feb 11 22:32:22 segfault kernel: [ 1495.699005]  [<ffffffff8106a78d>] ? sched_clock_local+0x1c/0x82
Feb 11 22:32:22 segfault kernel: [ 1495.699006]  [<ffffffff8110b4cf>] ? rcu_read_unlock+0x0/0x23
Feb 11 22:32:22 segfault kernel: [ 1495.699006]  [<ffffffff8106a8b6>] ? sched_clock_cpu+0xc3/0xce
Feb 11 22:32:22 segfault kernel: [ 1495.699006]  [<ffffffff8106a904>] ? cpu_clock+0x43/0x5e
Feb 11 22:32:22 segfault kernel: [ 1495.699006]  [<ffffffff811174b0>] vfs_ioctl+0x32/0xa6
Feb 11 22:32:22 segfault kernel: [ 1495.699006]  [<ffffffff81117a30>] do_vfs_ioctl+0x490/0x4d6
Feb 11 22:32:22 segfault kernel: [ 1495.699006]  [<ffffffff8110b549>] ? fget_light+0x57/0xf2
Feb 11 22:32:22 segfault kernel: [ 1495.699006]  [<ffffffff81117acc>] sys_ioctl+0x56/0x79
Feb 11 22:32:22 segfault kernel: [ 1495.699006]  [<ffffffff81009c32>] system_call_fastpath+0x16/0x1b
Feb 11 22:32:22 segfault kernel: [ 1495.699006] Code: 7f 08 e8 b9 3f ee e0 48 89 d8 5a 5b c9 c3 55 48 89 e5 41 55 41 54 53 48 83 ec 08 0f 1f 44 00 00 48 89 fb 48 85 ff 74 79 48 8b 3f <80> bf d0 21 00 00 00 75 6d 48 81 c7 d8 0c 00 00 e8 00 b7 12 e1
Feb 11 22:32:22 segfault kernel: [ 1495.699006] RIP  [<ffffffffa0330582>] radeon_fence_signaled+0x1d/0xa1 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.699006]  RSP <ffff8801072cbc28>
Feb 11 22:32:22 segfault kernel: [ 1495.699006] CR2: 00000000000021d0
Feb 11 22:32:22 segfault kernel: [ 1495.699153] ---[ end trace 5608d98ef4e9c78c ]---
Feb 11 22:32:22 segfault kernel: [ 1495.739156] ------------[ cut here ]------------


Feb 11 22:32:22 segfault kernel: [ 1495.763394] ------------[ cut here ]------------
Feb 11 22:32:22 segfault kernel: [ 1495.763408] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:159 radeon_fence_signaled+0x6f/0xa1 [radeon]()
Feb 11 22:32:22 segfault kernel: [ 1495.763410] Hardware name: 4058CTO
Feb 11 22:32:22 segfault kernel: [ 1495.763412] Querying an unemited fence : ffff8800b9157540 !
Feb 11 22:32:22 segfault kernel: [ 1495.763413] Modules linked in: vboxnetadp vboxnetflt vboxdrv aes_x86_64 aes_generic ipv6 acpi_cpufreq cpufreq_conservative cpufreq_ondemand cpufreq_powersave cpufreq_stats freq_table uinput radeon ttm drm_kms_helper drm i2c_algo_bit fuse coretemp firewire_sbp2 snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_pcm_oss snd_mixer_oss snd_usb_audio snd_pcm thinkpad_acpi uvcvideo hwmon snd_seq_dummy snd_hwdep snd_usb_lib arc4 snd_seq_oss ecb videodev snd_seq_midi snd_rawmidi snd_seq_midi_event iwlagn v4l1_compat snd_seq v4l2_compat_ioctl32 snd_timer iwlcore snd_seq_device i2c_i801 wmi i2c_core mac80211 joydev snd cfg80211 rfkill soundcore snd_page_alloc sdhci_pci sdhci mmc_core firewire_ohci firewire_core crc_itu_t ricoh_mmc e1000e ata_generic pata_acpi video output [last unloaded: scsi_wait_scan]
Feb 11 22:32:22 segfault kernel: [ 1495.763471] Pid: 2307, comm: Xorg Tainted: G      D W  2.6.33-rc7-custom-00263-gd464e51 #1
Feb 11 22:32:22 segfault kernel: [ 1495.763472] Call Trace:
Feb 11 22:32:22 segfault kernel: [ 1495.763476]  [<ffffffff81049640>] warn_slowpath_common+0x7c/0x94
Feb 11 22:32:22 segfault kernel: [ 1495.763479]  [<ffffffff810496af>] warn_slowpath_fmt+0x41/0x43
Feb 11 22:32:22 segfault kernel: [ 1495.763493]  [<ffffffffa0330597>] ? radeon_fence_signaled+0x32/0xa1 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.763507]  [<ffffffffa03305d4>] radeon_fence_signaled+0x6f/0xa1 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.763521]  [<ffffffffa0330ef5>] radeon_sync_obj_signaled+0xe/0x10 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.763526]  [<ffffffffa0300a3e>] ttm_bo_wait+0x5d/0x165 [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763530]  [<ffffffff8145b80f>] ? _raw_spin_lock+0x62/0x69
Feb 11 22:32:22 segfault kernel: [ 1495.763535]  [<ffffffffa03008ad>] ? spin_lock+0xe/0x10 [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763540]  [<ffffffffa0301973>] ttm_bo_cleanup_refs+0x56/0x26b [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763546]  [<ffffffffa030343b>] ttm_bo_release+0x5f/0x7c [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763551]  [<ffffffffa03033dc>] ? ttm_bo_release+0x0/0x7c [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763555]  [<ffffffff8121450d>] kref_put+0x43/0x4d
Feb 11 22:32:22 segfault kernel: [ 1495.763560]  [<ffffffffa0301cf2>] ttm_bo_unref+0x38/0x45 [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763574]  [<ffffffffa033205c>] radeon_bo_unref+0x2a/0x3f [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.763591]  [<ffffffffa0341fcb>] radeon_gem_object_free+0x33/0x35 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.763599]  [<ffffffffa02c7a36>] drm_gem_object_free_unlocked+0x5b/0x73 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763607]  [<ffffffffa02c79db>] ? drm_gem_object_free_unlocked+0x0/0x73 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763610]  [<ffffffff8121450d>] kref_put+0x43/0x4d
Feb 11 22:32:22 segfault kernel: [ 1495.763618]  [<ffffffffa02c78b1>] drm_gem_object_unreference_unlocked+0x1a/0x1c [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763626]  [<ffffffffa02c78f8>] drm_gem_object_handle_unreference_unlocked+0x2e/0x32 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763634]  [<ffffffffa02c7bd8>] drm_gem_close_ioctl+0x7c/0x87 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763641]  [<ffffffffa02c6512>] drm_ioctl+0x25f/0x371 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763649]  [<ffffffffa02c7b5c>] ? drm_gem_close_ioctl+0x0/0x87 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763653]  [<ffffffff811d7b6e>] ? inode_has_perm+0x7a/0x90
Feb 11 22:32:22 segfault kernel: [ 1495.763656]  [<ffffffff8145c1ab>] ? _raw_spin_unlock+0x2b/0x2f
Feb 11 22:32:22 segfault kernel: [ 1495.763659]  [<ffffffff810faf05>] ? spin_unlock+0xe/0x10
Feb 11 22:32:22 segfault kernel: [ 1495.763662]  [<ffffffff810faf6e>] ? add_partial+0x67/0x70
Feb 11 22:32:22 segfault kernel: [ 1495.763667]  [<ffffffffa0301ce1>] ? ttm_bo_unref+0x27/0x45 [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763671]  [<ffffffff811174b0>] vfs_ioctl+0x32/0xa6
Feb 11 22:32:22 segfault kernel: [ 1495.763673]  [<ffffffff81117a30>] do_vfs_ioctl+0x490/0x4d6
Feb 11 22:32:22 segfault kernel: [ 1495.763677]  [<ffffffff81117acc>] sys_ioctl+0x56/0x79
Feb 11 22:32:22 segfault kernel: [ 1495.763680]  [<ffffffff81009c32>] system_call_fastpath+0x16/0x1b
Feb 11 22:32:22 segfault kernel: [ 1495.763683] ---[ end trace 5608d98ef4e9c798 ]---
Feb 11 22:32:22 segfault kernel: [ 1495.763717] ------------[ cut here ]------------

Feb 11 22:32:22 segfault kernel: [ 1495.763063] ------------[ cut here ]------------
Feb 11 22:32:22 segfault kernel: [ 1495.763077] WARNING: at drivers/gpu/drm/radeon/radeon_fence.c:159 radeon_fence_signaled+0x6f/0xa1 [radeon]()
Feb 11 22:32:22 segfault kernel: [ 1495.763079] Hardware name: 4058CTO
Feb 11 22:32:22 segfault kernel: [ 1495.763081] Querying an unemited fence : ffff8800b9157540 !
Feb 11 22:32:22 segfault kernel: [ 1495.763082] Modules linked in: vboxnetadp vboxnetflt vboxdrv aes_x86_64 aes_generic ipv6 acpi_cpufreq cpufreq_conservative cpufreq_ondemand cpufreq_powersave cpufreq_stats freq_table uinput radeon ttm drm_kms_helper drm i2c_algo_bit fuse coretemp firewire_sbp2 snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_pcm_oss snd_mixer_oss snd_usb_audio snd_pcm thinkpad_acpi uvcvideo hwmon snd_seq_dummy snd_hwdep snd_usb_lib arc4 snd_seq_oss ecb videodev snd_seq_midi snd_rawmidi snd_seq_midi_event iwlagn v4l1_compat snd_seq v4l2_compat_ioctl32 snd_timer iwlcore snd_seq_device i2c_i801 wmi i2c_core mac80211 joydev snd cfg80211 rfkill soundcore snd_page_alloc sdhci_pci sdhci mmc_core firewire_ohci firewire_core crc_itu_t ricoh_mmc e1000e ata_generic pata_acpi video output [last unloaded: scsi_wait_scan]
Feb 11 22:32:22 segfault kernel: [ 1495.763140] Pid: 2307, comm: Xorg Tainted: G      D W  2.6.33-rc7-custom-00263-gd464e51 #1
Feb 11 22:32:22 segfault kernel: [ 1495.763141] Call Trace:
Feb 11 22:32:22 segfault kernel: [ 1495.763145]  [<ffffffff81049640>] warn_slowpath_common+0x7c/0x94
Feb 11 22:32:22 segfault kernel: [ 1495.763148]  [<ffffffff810496af>] warn_slowpath_fmt+0x41/0x43
Feb 11 22:32:22 segfault kernel: [ 1495.763162]  [<ffffffffa0330597>] ? radeon_fence_signaled+0x32/0xa1 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.763176]  [<ffffffffa03305d4>] radeon_fence_signaled+0x6f/0xa1 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.763190]  [<ffffffffa0330ef5>] radeon_sync_obj_signaled+0xe/0x10 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.763195]  [<ffffffffa0300a3e>] ttm_bo_wait+0x5d/0x165 [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763198]  [<ffffffff8145b80f>] ? _raw_spin_lock+0x62/0x69
Feb 11 22:32:22 segfault kernel: [ 1495.763203]  [<ffffffffa03008ad>] ? spin_lock+0xe/0x10 [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763209]  [<ffffffffa0301973>] ttm_bo_cleanup_refs+0x56/0x26b [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763215]  [<ffffffffa030343b>] ttm_bo_release+0x5f/0x7c [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763220]  [<ffffffffa03033dc>] ? ttm_bo_release+0x0/0x7c [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763223]  [<ffffffff8121450d>] kref_put+0x43/0x4d
Feb 11 22:32:22 segfault kernel: [ 1495.763228]  [<ffffffffa0301cf2>] ttm_bo_unref+0x38/0x45 [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763243]  [<ffffffffa033205c>] radeon_bo_unref+0x2a/0x3f [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffffa0341fcb>] radeon_gem_object_free+0x33/0x35 [radeon]
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffffa02c7a36>] drm_gem_object_free_unlocked+0x5b/0x73 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffffa02c79db>] ? drm_gem_object_free_unlocked+0x0/0x73 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffff8121450d>] kref_put+0x43/0x4d
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffffa02c78b1>] drm_gem_object_unreference_unlocked+0x1a/0x1c [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffffa02c78f8>] drm_gem_object_handle_unreference_unlocked+0x2e/0x32 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffffa02c7bd8>] drm_gem_close_ioctl+0x7c/0x87 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffffa02c6512>] drm_ioctl+0x25f/0x371 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffffa02c7b5c>] ? drm_gem_close_ioctl+0x0/0x87 [drm]
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffff811d7b6e>] ? inode_has_perm+0x7a/0x90
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffff8145c1ab>] ? _raw_spin_unlock+0x2b/0x2f
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffff810faf05>] ? spin_unlock+0xe/0x10
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffff810faf6e>] ? add_partial+0x67/0x70
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffffa0301ce1>] ? ttm_bo_unref+0x27/0x45 [ttm]
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffff811174b0>] vfs_ioctl+0x32/0xa6
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffff81117a30>] do_vfs_ioctl+0x490/0x4d6
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffff81117acc>] sys_ioctl+0x56/0x79
Feb 11 22:32:22 segfault kernel: [ 1495.763254]  [<ffffffff81009c32>] system_call_fastpath+0x16/0x1b
Feb 11 22:32:22 segfault kernel: [ 1495.763254] ---[ end trace 5608d98ef4e9c797 ]---
Feb 11 22:32:22 segfault kernel: [ 1495.763394] ------------[ cut here ]------------
Comment 4 Shawn Starr 2010-02-11 19:56:48 UTC
Created attachment 33242 [details]
crash dump from today
Comment 5 Shawn Starr 2010-02-12 14:49:51 UTC
Today's crash information:

While playing Second Life, I managed to trigger a drm error that did not lock up the GPU. Trying the same steps a second time resulted in a GPU lockup with a kernel oops:

1) Started Second Life with VBOs off; it used the default 128MB for textures.
2) Set the texture memory size to 256MB (the max limit in drm right now) and enabled VBO support.

Results:

Feb 12 15:53:56 segfault kernel: [  838.349029] [drm:radeon_ib_get] *ERROR* IB 8 scheduled without a fence.
Feb 12 15:53:56 segfault kernel: [  838.349033] [drm:radeon_cs_ioctl] *ERROR* Failed to get ib !

I turned VBOs off for a bit, then re-enabled them, and the kernel panicked not long after:

Feb 12 17:42:09 segfault kernel: [ 7331.346783] ------------[ cut here ]------------
Feb 12 17:42:09 segfault kernel: [ 7331.346794] WARNING: at lib/kref.c:43 kref_get+0x23/0x2d()
Feb 12 17:42:09 segfault kernel: [ 7331.346797] Hardware name: 4058CTO
Feb 12 17:42:09 segfault kernel: [ 7331.346799] Modules linked in: aes_x86_64 aes_generic ipv6 radeon ttm drm_kms_helper drm i2c_algo_bit acpi_cpufreq cpufreq_conservative cpufreq_ondemand cpufreq_powersave cpufreq_stats freq_table uinput fuse coretemp firewire_sbp2 snd_usb_audio uvcvideo videodev v4l1_compat snd_usb_lib v4l2_compat_ioctl32 snd_hda_codec_conexant snd_hda_intel snd_hda_codec snd_hwdep snd_pcm_oss snd_mixer_oss snd_pcm snd_seq_dummy arc4 thinkpad_acpi hwmon snd_seq_oss ecb snd_seq_midi iwlagn snd_rawmidi iwlcore snd_seq_midi_event i2c_i801 mac80211 joydev snd_seq snd_timer i2c_core snd_seq_device snd cfg80211 rfkill wmi soundcore snd_page_alloc sdhci_pci sdhci firewire_ohci firewire_core crc_itu_t mmc_core ricoh_mmc ata_generic pata_acpi e1000e video output [last unloaded: scsi_wait_scan]
Feb 12 17:42:09 segfault kernel: [ 7331.346895] Pid: 4561, comm: do-not-directly Not tainted 2.6.33-rc7-custom-00280-g23e1ef0 #1
Feb 12 17:42:09 segfault kernel: [ 7331.346899] Call Trace:
Feb 12 17:42:09 segfault kernel: [ 7331.346907]  [<ffffffff81049640>] warn_slowpath_common+0x7c/0x94
Feb 12 17:42:09 segfault kernel: [ 7331.346913]  [<ffffffff8104966c>] warn_slowpath_null+0x14/0x16
Feb 12 17:42:09 segfault kernel: [ 7331.346918]  [<ffffffff8121453a>] kref_get+0x23/0x2d
Feb 12 17:42:09 segfault kernel: [ 7331.346945]  [<ffffffffa0333552>] radeon_fence_ref+0x1a/0x21 [radeon]
Feb 12 17:42:09 segfault kernel: [ 7331.346969]  [<ffffffffa033543f>] radeon_bo_list_validate+0xac/0xf4 [radeon]
Feb 12 17:42:09 segfault kernel: [ 7331.346996]  [<ffffffffa034661d>] radeon_cs_parser_relocs+0x19d/0x1fb [radeon]
Feb 12 17:42:09 segfault kernel: [ 7331.347023]  [<ffffffffa0346a6c>] radeon_cs_ioctl+0xca/0x1a2 [radeon]
Feb 12 17:42:09 segfault kernel: [ 7331.347050]  [<ffffffffa0344ca1>] ? radeon_gem_busy_ioctl+0x7a/0x86 [radeon]
Feb 12 17:42:09 segfault kernel: [ 7331.347066]  [<ffffffffa02c9512>] drm_ioctl+0x25f/0x371 [drm]
Feb 12 17:42:09 segfault kernel: [ 7331.347073]  [<ffffffff8106a8b6>] ? sched_clock_cpu+0xc3/0xce
Feb 12 17:42:09 segfault kernel: [ 7331.347099]  [<ffffffffa03469a2>] ? radeon_cs_ioctl+0x0/0x1a2 [radeon]
Feb 12 17:42:09 segfault kernel: [ 7331.347105]  [<ffffffff8106a78d>] ? sched_clock_local+0x1c/0x82
Feb 12 17:42:09 segfault kernel: [ 7331.347113]  [<ffffffff8110b4cf>] ? rcu_read_unlock+0x0/0x23
Feb 12 17:42:09 segfault kernel: [ 7331.347118]  [<ffffffff8106a8b6>] ? sched_clock_cpu+0xc3/0xce
Feb 12 17:42:09 segfault kernel: [ 7331.347124]  [<ffffffff8106a904>] ? cpu_clock+0x43/0x5e
Feb 12 17:42:09 segfault kernel: [ 7331.347130]  [<ffffffff811174b0>] vfs_ioctl+0x32/0xa6
Feb 12 17:42:09 segfault kernel: [ 7331.347135]  [<ffffffff81117a30>] do_vfs_ioctl+0x490/0x4d6
Feb 12 17:42:09 segfault kernel: [ 7331.347141]  [<ffffffff8110b4f0>] ? rcu_read_unlock+0x21/0x23
Feb 12 17:42:09 segfault kernel: [ 7331.347147]  [<ffffffff8110b5d6>] ? fget_light+0xe4/0xf2
Feb 12 17:42:09 segfault kernel: [ 7331.347153]  [<ffffffff8110b549>] ? fget_light+0x57/0xf2
Feb 12 17:42:09 segfault kernel: [ 7331.347158]  [<ffffffff81117acc>] sys_ioctl+0x56/0x79
Feb 12 17:42:09 segfault kernel: [ 7331.347165]  [<ffffffff81009c32>] system_call_fastpath+0x16/0x1b
Feb 12 17:42:09 segfault kernel: [ 7331.347170] ---[ end trace 19e4068e6daee7fc ]---
Comment 6 Shawn Starr 2010-02-12 14:51:02 UTC
Created attachment 33262 [details]
kernel oops dump

Easier-to-read view of the bug dump.
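
For anyone reading the traces inline instead of from the attachment, here is a minimal sketch of how the syslog prefix could be stripped from each pasted line. The function name and the regular expression are my own assumptions, based only on the line shape seen in this bug (date, hostname, `kernel:`, bracketed uptime); other syslog setups may format lines differently.

```python
import re

# Assumed line shape, taken from the logs pasted in this bug:
#   "Feb 11 22:32:22 segfault kernel: [ 1495.696105] WARNING: at ..."
PREFIX = re.compile(
    r"^\w{3}\s+\d+\s+\d{2}:\d{2}:\d{2}\s+\S+\s+kernel:\s+\[\s*\d+\.\d+\]\s?"
)


def strip_syslog_prefix(line: str) -> str:
    """Return the line without date, host, 'kernel:' and uptime stamp.

    Lines that do not match the assumed shape are returned unchanged.
    Only one space after the uptime stamp is removed, so the extra
    indentation of call-trace entries is kept.
    """
    return PREFIX.sub("", line, count=1)
```

Applying this to every line of a saved log leaves just the kernel message text, which is roughly what the attached dumps are meant to give.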
Comment 7 Jerome Glisse 2010-02-15 03:00:45 UTC
Created attachment 33310 [details] [review]
Add free flag for ib

Please test with this patch; hopefully it fixes the kref/ib issue.
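
As background on the kref warning in the traces: as far as I understand it, the check in kref_get() fires when a reference is taken on an object whose count has already dropped to zero, i.e. one that was already released. Below is a toy Python model of that rule only. It is not kernel code and not the attached patch; the class and method names are invented for illustration.

```python
import warnings


class ToyKref:
    """Toy stand-in for a reference counter; not the kernel's struct kref."""

    def __init__(self) -> None:
        self.count = 1  # a new object starts with one reference

    def get(self) -> None:
        # Taking a reference on an object whose count is already zero
        # means it was released earlier, so warn before incrementing.
        if self.count == 0:
            warnings.warn("get() on a released object", RuntimeWarning)
        self.count += 1

    def put(self) -> bool:
        """Drop one reference; return True when the last one is gone."""
        self.count -= 1
        return self.count == 0
```

In this model, a get() after the final put() produces the warning, which corresponds to the ordering the kref_get traces above point at.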
Comment 8 Fabio Pedretti 2010-02-26 01:17:03 UTC
Patch applied in 2.6.33.
