[37506.958095] ------------[ cut here ]------------
[37506.958119] kernel BUG at drivers/dma-buf/reservation.c:234!
[37506.958151] invalid opcode: 0000 [#1] SMP PTI
[37506.958154] Modules linked in: macvtap macvlan tap nls_utf8 isofs fuse rfcomm devlink nf_conntrack_netbios_ns nf_conntrack_broadcast xt_CT ip6t_rpfilter ip6t_REJECT nf_reject_ipv6 xt_conntrack ip_set nfnetlink ebtable_nat ebtable_broute bridge stp llc ip6table_nat nf_conntrack_ipv6 nf_defrag_ipv6 nf_nat_ipv6 ip6table_mangle ip6table_raw ip6table_security iptable_nat nf_conntrack_ipv4 nf_defrag_ipv4 nf_nat_ipv4 nf_nat nf_conntrack iptable_mangle iptable_raw iptable_security ebtable_filter ebtables ip6table_filter ip6_tables cmac bnep sunrpc xfs vfat fat libcrc32c intel_rapl x86_pkg_temp_thermal intel_powerclamp coretemp kvm_intel kvm irqbypass crct10dif_pclmul crc32_pclmul ghash_clmulni_intel intel_cstate intel_uncore intel_rapl_perf gpio_ich iTCO_wdt iTCO_vendor_support ppdev btusb btrtl btbcm btintel
[37506.958229] bluetooth i2c_i801 snd_hda_codec_realtek snd_hda_codec_generic snd_hda_codec_hdmi huawei_cdc_ncm ecdh_generic cdc_wdm snd_hda_intel cdc_ncm rfkill usbnet lpc_ich option snd_hda_codec usb_wwan joydev snd_hda_core snd_hwdep snd_seq snd_seq_device snd_pcm snd_timer snd mei_me soundcore mei shpchp parport_pc video parport hid_logitech_hidpp hid_logitech_dj uas usb_storage amdkfd amd_iommu_v2 amdgpu chash i2c_algo_bit gpu_sched drm_kms_helper ttm drm crc32c_intel r8169 mii
[37506.958282] CPU: 5 PID: 10897 Comm: totem Not tainted 4.17.0-0.rc3.git4.1.fc29.x86_64 #1
[37506.958284] Hardware name: Gigabyte Technology Co., Ltd. Z87M-D3H/Z87M-D3H, BIOS F11 08/12/2014
[37506.958289] RIP: 0010:reservation_object_add_shared_fence+0x4c2/0x4f0
[37506.958291] RSP: 0018:ffffa62260b77ae8 EFLAGS: 00010246
[37506.958294] RAX: 0000000000000004 RBX: ffff883106d0c258 RCX: 0000000000000003
[37506.958296] RDX: 0000000000000001 RSI: ffff883106d0c2c8 RDI: 0000000000000246
[37506.958298] RBP: ffff882ea9749680 R08: 000073e4152a8320 R09: 0000000000000001
[37506.958300] R10: 0000000000000001 R11: 0000000000000000 R12: 0000000000000000
[37506.958302] R13: 0000000000000170 R14: ffff882bbc989d20 R15: ffff882f57910000
[37506.958304] FS: 00007fc2e8effa80(0000) GS:ffff88337e000000(0000) knlGS:0000000000000000
[37506.958306] CS: 0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[37506.958308] CR2: 00007fc2502ed000 CR3: 0000000332e1c003 CR4: 00000000001626e0
[37506.958310] Call Trace:
[37506.958349] amdgpu_vm_update_directories+0x358/0x3d0 [amdgpu]
[37506.958376] ? amdgpu_vm_free_levels+0x100/0x100 [amdgpu]
[37506.958399] amdgpu_gem_va_ioctl+0x256/0x530 [amdgpu]
[37506.958426] ? amdgpu_gem_metadata_ioctl+0x1b0/0x1b0 [amdgpu]
[37506.958439] drm_ioctl_kernel+0x5b/0xb0 [drm]
[37506.958448] drm_ioctl+0x1b3/0x370 [drm]
[37506.958470] ? amdgpu_gem_metadata_ioctl+0x1b0/0x1b0 [amdgpu]
[37506.958476] ? trace_hardirqs_on_caller+0xed/0x180
[37506.958496] amdgpu_drm_ioctl+0x49/0x80 [amdgpu]
[37506.958517] ? entry_SYSCALL_64_after_hwframe+0x49/0xbe
[37506.958522] Code: 9b fb ff ff 4d 89 74 24 18 41 c7 44 24 10 01 00 00 00 e9 85 fc ff ff 4c 89 ef eb 9b ba 18 00 00 00 b8 01 00 00 00 e9 66 fc ff ff <0f> 0b 8d 50 01 48 8d 04 c5 18 00 00 00 48 01 e8 45 31 ed 4c 89
[37506.958582] RIP: reservation_object_add_shared_fence+0x4c2/0x4f0 RSP: ffffa62260b77ae8
[37506.958585] ---[ end trace 386de0e2732dc154 ]---
Created attachment 139387
Christian, could this be related to commit a35f2f34b5b4 "dma-buf: make returning the exclusive fence optional", or do you have other ideas? I'm now running into this BUG fairly often, roughly every other daily piglit run on amd-staging-drm-next.
(In reply to Michel Dänzer from comment #2)
> Christian, could this be related to commit a35f2f34b5b4 "dma-buf: make
> returning the exclusive fence optional", or do you have other ideas? I'm now
> running into this BUG fairly often, roughly every other daily piglit run on
Good to know, David and I actually already fixed one cause for this. So I assumed that this should do it, but that obviously isn't the case.
Any usable way to reproduce the issue?
(In reply to Christian König from comment #3)
> Good to know, David and I actually already fixed one cause for this. So I
> assumed that this should do it, but that obviously isn't the case.
FWIW, I hit it from the CS ioctl at least once, like in bug 106870. Unfortunately, the other recent cases didn't seem to get captured in the log files.
> Any usable way to reproduce the issue?
Not particularly, I'm afraid. My script just pulls LLVM, xserver, Mesa etc. from Git, builds them and runs piglit.
My name's Dave Panariti and I'm going to be taking a look at this.
You mention a script that grabs what's needed. If it doesn't build it, could you send instructions on how to do so?
(In reply to davep from comment #5)
> My name's Dave Panariti and I'm going to be taking a look at this.
Thanks for looking into this, Dave. After hitting the BUG_ON again this week, I got fed up, took a closer look, and I think I found the problem here. This patch should fix it:
-- GitLab Migration Automatic Message --
This bug has been migrated to freedesktop.org's GitLab instance and has been closed from further activity.
You can subscribe and participate further through the new bug through this link to our GitLab instance: https://gitlab.freedesktop.org/drm/amd/issues/373.